Security, networking and system integration: bgp

Wednesday, September 4, 2013

DDOS attack mitigation via remote black hole

How to black hole (stop) an attacker inside your network

Remotely triggered black hole (RTBH) filtering of spoofed subnets that are active in a DDoS attack is a great way to save router resources and keep the attacker from damaging your network.

A common DoS attack directed against a customer of a service provider involves generating a greater volume of attack traffic destined for the target than will fit down the links from the service provider(s) to the victim (customer). This traffic "starves out" legitimate traffic and often results in collateral damage or negative effects to other customers or the network infrastructure as well. Rather than having all destinations on their network be affected by the attack, the customer may ask their service provider to filter traffic destined to the target destination IP address(es), or the service provider may determine that this is necessary themselves, in order to preserve network availability.

However, with destination-based RTBH filtering, the impact of the attack on the target is complete. That is, destination-based RTBH filtering injects a discard route into the forwarding table for the target prefix. All packets towards that destination, attack traffic AND legitimate traffic, are then dropped by the participating routers, thereby taking the target completely offline. The benefit is that collateral damage to other systems or network availability at the customer location or in the ISP network is limited, but the negative impact to the target itself is arguably increased.

In this small scenario I will use an eBGP-speaking router that advertises the "spoofed DDoS subnet" 99.99.99.0/24. All of the iBGP routers inside AS100 will have this prefix installed in their BGP tables.




The iBGP router CX2 is used as the trigger device. It has a simple task: to advertise the DDoS prefix inside AS100 so that packets towards it are sent into the black hole. Let us look at the configuration of the trigger router.

CX2

interface Loopback1
 ip address 192.0.2.1 255.255.255.255

!

route-map BLACK-HOLE permit 10
 match tag 999
 set local-preference 200
 set origin igp
 set community no-export
 set ip next-hop 192.0.2.1
!
route-map BLACK-HOLE deny 20

!

router bgp 100
 no synchronization
 bgp log-neighbor-changes
 network 5.5.5.5 mask 255.255.255.255
 redistribute static route-map BLACK-HOLE
 neighbor 16.2.1.1 remote-as 100
 neighbor 18.1.1.2 remote-as 100
 no auto-summary

We have created a simple route map that matches static routes carrying tag 999 and sets their next hop to 192.0.2.1. This address comes from the 192.0.2.0/24 TEST-NET range, which is commonly used as the discard address. Every other iBGP router must have a static route for 192.0.2.1 that points those packets to the Null0 interface.

CX1(config)#ip route 192.0.2.1 255.255.255.255 Null0

Customer(config)#ip route 192.0.2.1 255.255.255.255 Null0



Now let us take a look at the BGP table of the CX1 router. We can see that the 99.99.99.0/24 prefix has been learned and installed, and that we have connectivity to the spoofed address.


CX1#sh ip bgp 99.99.99.0
BGP routing table entry for 99.99.99.0/24, version 18
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  200 65535
    16.1.1.2 from 16.1.1.2 (1.1.1.1)
      Origin IGP, metric 0, localpref 100, valid, internal, best




We can test the route by pinging the spoofed address. 


CX1#ping 99.99.99.1 source loopback 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 99.99.99.1, timeout is 2 seconds:
Packet sent with a source address of 4.4.4.4
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/74/96 ms








Now, to drop this traffic throughout AS100, we must create a static route for the prefix on the trigger router (carrying tag 999, so that the BLACK-HOLE route map matches it) and redistribute it into BGP so that it is distributed inside AS100. I will leave a long ping running from the CX1 router to demonstrate how the traffic stops inside AS100 as soon as I create the static route for the 99.99.99.0/24 subnet.









As soon as I typed in the static route, the ping stopped. If we look at the routing table of the CX1 router now, we can see that the route is being advertised by the trigger router and that the next hop is set to 192.0.2.1, the discard IP address.

CX1#sh ip route 99.99.99.0
Routing entry for 99.99.99.0/24
  Known via "bgp 100", distance 200, metric 0, type internal
  Last update from 192.0.2.1 00:02:23 ago
  Routing Descriptor Blocks:
  * 192.0.2.1, from 16.2.1.2, 00:02:23 ago
      Route metric is 0, traffic share count is 1
      AS Hops 0
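The forwarding effect shown in this output can be sketched in Python. This is only an illustration under stated assumptions, not how IOS implements it: the `ROUTES` table layout and the `longest_match` and `resolve` helpers are hypothetical, and the table holds just the two routes from this lab (the BGP route for 99.99.99.0/24 with next hop 192.0.2.1, and the static route for 192.0.2.1/32 to Null0).

```python
import ipaddress

# Hypothetical routing table for CX1: prefix -> next hop (an IP address or an interface name).
ROUTES = {
    ipaddress.ip_network("99.99.99.0/24"): "192.0.2.1",  # learned via iBGP from the trigger router
    ipaddress.ip_network("192.0.2.1/32"): "Null0",       # local static discard route
}

def longest_match(address):
    """Return the next hop of the most specific route covering address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return None
    return ROUTES[max(matches, key=lambda net: net.prefixlen)]

def resolve(address, max_depth=8):
    """Follow next hops recursively until an interface name or a dead end is reached."""
    target = address
    for _ in range(max_depth):
        next_hop = longest_match(target)
        if next_hop is None:
            return "unreachable"
        try:
            ipaddress.ip_address(next_hop)
        except ValueError:
            return next_hop   # not an IP address, so it is an interface such as Null0
        target = next_hop     # recursive lookup on the next-hop address
    return "unreachable"

print(resolve("99.99.99.1"))
```

Because the BGP next hop resolves through the local static route, every router that carries the Null0 route drops the traffic itself, without any per-router filter for the attacked prefix.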

This very simple DDoS mitigation mechanism can also be used in more complex scenarios, such as a large BGP domain with redundant route reflectors.

One can stop an attacker in a very short time. RFC 5635 explains the technique in more detail.

Feel free to comment.

Monday, September 2, 2013

BGP Communities - routes reside inside the local AS

Setting the NO-EXPORT BGP community

BGP communities are attributes that may be added to any prefix we choose. This is very useful if one wants to logically separate incoming and outgoing traffic. With communities we have more granular control over how routes are handled inside our autonomous system.

The communities attribute is a way to group destinations into communities and apply routing decisions based on the communities. This method simplifies the configuration of a BGP speaker that controls distribution of routing information.

The communities attribute is an optional, transitive, global attribute in the numerical range from 1 to 4,294,967,200. Along with Internet community, there are a few predefined, well-known communities, as follows:

internet—Advertise this route to the Internet community. All routers belong to it.
no-export—Do not advertise this route to eBGP peers.
no-advertise—Do not advertise this route to any peer (internal or external).
local-as—Do not advertise this route to peers outside the local autonomous system. This route will not be advertised to other autonomous systems or sub-autonomous systems when confederations are configured.
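The advertisement rules for the well-known communities listed above can be sketched as a small lookup. This is a simplified model, not router code: the peer-type labels and the `may_advertise` helper are assumptions made for the illustration, and the `internet` community is treated as placing no restriction.

```python
# Peer types used in this sketch (labels are assumptions, not IOS keywords).
IBGP = "ibgp"                # peer in the same AS (or the same confederation sub-AS)
CONFED_EBGP = "confed-ebgp"  # peer in another sub-AS of the same confederation
EBGP = "ebgp"                # peer in a different autonomous system

# Peer types that each well-known community blocks.
BLOCKED = {
    "no-advertise": {IBGP, CONFED_EBGP, EBGP},
    "local-as": {CONFED_EBGP, EBGP},
    "no-export": {EBGP},
}

def may_advertise(communities, peer_type):
    """Return True if a route carrying these communities may be sent to peer_type."""
    return all(peer_type not in BLOCKED.get(c, set()) for c in communities)

print(may_advertise({"no-export"}, IBGP))   # still sent inside the AS
print(may_advertise({"no-export"}, EBGP))   # stopped at the AS edge
```

The difference between no-export and local-as only shows up when confederations are configured; without them both simply keep the route inside the local AS.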

In this small scenario we have a customer, AS 100, that does not want some prefixes to be advertised outside its own AS. Maybe the prefixes are malicious, or there is no agreement with the ISPs, or there is some other reason. Either way, this can be done with the well-known BGP community NO-EXPORT. Let us take a look at the diagram.

In our small example we will be looking at the CX1 router and some of the ISP routing tables. First let us take a look at the BGP configs of the CX1 router, and the BGP table of the ISP2 router.

We can see that the ISP2 router has the two prefixes in its BGP table. Those are the prefixes we do not want to advertise outside AS100. This is achieved with a simple route-map and an ACL that matches the desired prefixes.

CX1#sh ip access-lists

Standard IP access list 1

10 permit 44.44.44.0, wildcard bits 0.0.1.255 (2 matches)

CX1#sh route-map

route-map NO-EXPORT, permit, sequence 10

Match clauses:

ip address (access-lists): 1

Set clauses:

community no-export

Policy routing matches: 0 packets, 0 bytes

CX1

neighbor 16.1.1.2 send-community both

neighbor 16.1.1.2 route-map NO-EXPORT out

We have an ACL that matches the loopback networks advertised as the prefixes 44.44.44.0/24 and 44.44.45.0/24. After that I created a route-map called NO-EXPORT that uses ACL 1 and sets the no-export community on those prefixes. Then we applied this route-map outbound to the neighbor inside AS100, together with send-community so that the community attribute is actually sent.
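To see why a single ACL entry covers both prefixes, the wildcard comparison can be sketched in Python. The `wildcard_match` helper is hypothetical and only models the bit comparison: a 0 bit in the wildcard means the bit must match, a 1 bit means it is ignored.

```python
import ipaddress

def wildcard_match(address, base, wildcard):
    """Return True if address equals base on every bit where the wildcard bit is 0."""
    addr = int(ipaddress.IPv4Address(address))
    base_int = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF  # bits that must match
    return (addr & care) == (base_int & care)

# ACL 1 from the post: permit 44.44.44.0, wildcard bits 0.0.1.255
for network in ("44.44.44.0", "44.44.45.0", "44.44.46.0"):
    print(network, wildcard_match(network, "44.44.44.0", "0.0.1.255"))
```

The wildcard 0.0.1.255 ignores the lowest bit of the third octet, so 44.44.44.0 and 44.44.45.0 both match while 44.44.46.0 does not.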

Now let us see the RIB table of the ISP2 router.

And it is working fine: we no longer see the prefixes we stopped advertising. We can verify this further on the edge router of AS100.

CUSTOMER#sh ip bgp neighbors 172.16.1.2 advertised-routes

BGP table version is 11, local router ID is 1.1.1.1

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 1.1.1.1/32 0.0.0.0 0 32768 i

*> 2.2.2.2/32 192.168.1.2 0 0 200 i

*> 3.3.3.3/32 172.16.1.2 0 0 300 i

*>i5.5.5.5/32 17.1.1.2 0 100 0 i

Total number of prefixes 4

We can see that the prefixes are not being advertised to the eBGP neighbor, as we intended. Below we can see the BGP table entry for one of them on the AS100 edge router.

CUSTOMER#sh ip bgp 44.44.44.0

BGP routing table entry for 44.44.44.0/24, version 10

Paths: (1 available, best #1, table Default-IP-Routing-Table, not advertised to EBGP peer)

Not advertised to any peer

Local

16.1.1.1 from 16.1.1.1 (4.4.4.4)

Origin IGP, metric 0, localpref 100, valid, internal, best

Community: no-export

The prefix 44.44.44.0 is in the routing table but it has the no-export community attached to the route. Thus the prefix is not being exported to the eBGP neighbors. Very simple and clean.

Feel free to comment.

Friday, August 30, 2013

MultiProtocol BGP meshed IPv6 and IPv4

Implementing MP-BGP in a SP IPv6 and IPv4 network

The multiprotocol BGP (MBGP) feature adds capabilities to BGP to enable multicast routing policy throughout the Internet and to connect multicast topologies within and between BGP autonomous systems. In other words, multiprotocol BGP (MBGP) is an enhanced BGP that carries IP multicast routes. BGP carries two sets of routes, one set for unicast routing and one set for multicast routing. The routes associated with multicast routing are used by the Protocol Independent Multicast (PIM) to build data distribution trees.

The only three pieces of information carried by BGP-4 that are IPv4 specific are (a) the NEXT_HOP attribute (expressed as an IPv4 address), (b) AGGREGATOR (contains an IPv4 address), and (c) NLRI (expressed as IPv4 address prefixes). Any BGP speaker, including MBGP speakers, has to have an IPv4 address, which will be used, among other things, in the AGGREGATOR attribute. To enable BGP-4 to support routing for multiple Network Layer protocols the only two things that have to be added to BGP-4 are (a) the ability to associate a particular Network Layer protocol with the next hop information, and (b) the ability to associate a particular Network Layer protocol with NLRI.

MP-BGP is an extension to the BGP protocol whose objective is to carry routing information for other protocols and address families, such as:

Multicast
MPLS VPN
IPv6
6PE
CLNS

Exchange of Multi-Protocol NLRI must be negotiated at session set up.
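As a rough sketch of what that negotiation carries, the snippet below encodes the Multiprotocol Extensions capability that each speaker places in its OPEN message, one per address family. The field layout (capability code 1, length 4, then AFI, a reserved byte and SAFI) follows RFC 4760; the function names and the `negotiated` helper are assumptions made for this illustration.

```python
import struct

AFI_IPV4, AFI_IPV6 = 1, 2
SAFI_UNICAST, SAFI_MULTICAST = 1, 2
CAP_MULTIPROTOCOL = 1  # capability code for Multiprotocol Extensions

def mp_capability(afi, safi):
    """Encode one capability: code, length, AFI (2 bytes), reserved byte, SAFI."""
    return struct.pack("!BBHBB", CAP_MULTIPROTOCOL, 4, afi, 0, safi)

def negotiated(local, remote):
    """Address families usable on the session: those advertised by both peers."""
    return set(local) & set(remote)

local = {(AFI_IPV4, SAFI_UNICAST), (AFI_IPV6, SAFI_UNICAST)}
remote = {(AFI_IPV6, SAFI_UNICAST)}
print(mp_capability(AFI_IPV6, SAFI_UNICAST).hex())
print(negotiated(local, remote))
```

Only the families both sides announce are used on the session, which is why a neighbor has to be activated under each address family it should exchange routes for.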

As a practical demonstration of MP-BGP I have created a small ISP lab with a couple of upstream providers that route IPv6 and IPv4 prefixes at the same time. This is common practice nowadays in an ISP environment.

We have a small ISP with two routers that are iBGP speakers and a couple of eBGP peers as upstream connections. For those who are familiar with IPv6 setup and addressing this will come easy. I am using /127 networks for the WAN links, so that each peer link has room for only two IPv6 addresses. On the same physical link I am also using IPv4 addresses to peer with the BGP-speaking router. Now let us look at the configs; I will try to clarify every command. For more on the MP-BGP protocol, one can read RFC 2858 (since obsoleted by RFC 4760).
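As a quick check of the /127 addressing, Python's standard ipaddress module can list what such a link contains. The prefix is the one used on the ISP1 to ISP2 link in this lab; the snippet itself is just an illustration.

```python
import ipaddress

link = ipaddress.ip_network("2005:1::/127")
addresses = [str(a) for a in link]  # every address inside the /127

print(link.num_addresses, addresses)
```

The two addresses, 2005:1:: and 2005:1::1, are exactly the ones configured on the two ends of that link in the configs that follow.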

ISP1

ipv6 unicast-routing >> important to turn on because by default IPV6 routing is disabled

interface Loopback0

ip address 1.1.1.1 255.255.255.255

interface Loopback1

no ip address

ipv6 address 2030:1::1/64 >> I have defined a couple of /64 networks to propagate to AS100

ipv6 address 2030:2::1/64

ipv6 address 2030:3::1/64

ipv6 enable

interface FastEthernet0/0 >> dual IP stack IPv4 and IPv6 address on the WAN link

ip address 10.0.0.2 255.255.255.252

duplex auto

speed auto

ipv6 address 2005:1::/127 << /127 networks allows only two IPv6 hosts

ipv6 enable << on some routers this is enabled after entering the IP address

router bgp 100

bgp router-id 1.1.1.1

no bgp default ipv4-unicast << I have disabled the default behaviour of BGP, as we are using the address family concept

bgp log-neighbor-changes

neighbor 10.0.0.1 remote-as 100

neighbor 2005:1::1 remote-as 100

address-family ipv4 << the address family model for IPV4

neighbor 10.0.0.1 activate

no auto-summary

no synchronization

exit-address-family

address-family ipv6

neighbor 2005:1::1 activate

network 2030:1::1/64 << advertising loopback 1 subnets into BGP

network 2030:2::1/64

network 2030:3::1/64

exit-address-family

The Cisco BGP address family identifier (AFI) model was introduced with multiprotocol BGP and is designed to be modular and scalable, and to support multiple AFI and subsequent address family identifier (SAFI) configurations.

As we can see, I have defined two address families, IPv4 and IPv6, for the BGP peerings. We must use the activate command for every neighbor in each family, or for a peer group to make it easier to manage. The peer address and AS number are configured under the global BGP process, and the neighbor is then activated under the address family.

Now let us look at the rest of the router configs; they are pretty much the same.

ISP2

ipv6 unicast-routing

interface Loopback0

ip address 2.2.2.2 255.255.255.255

interface Loopback1

no ip address

ipv6 address 2010:1::1/64

ipv6 address 2010:2::1/64

ipv6 address 2010:3::1/64

ipv6 enable

interface Loopback2

ip address 22.22.22.1 255.255.255.0 secondary

ip address 22.22.24.1 255.255.255.0

interface FastEthernet0/0

ip address 10.0.0.1 255.255.255.252

duplex auto

speed auto

ipv6 address 2005:1::1/127

ipv6 enable

interface FastEthernet1/0

ip address 172.16.1.1 255.255.255.252

duplex auto

speed auto

ipv6 address 2001:1::/127

ipv6 enable

interface FastEthernet2/0

ip address 173.16.1.1 255.255.255.252

duplex auto

speed auto

ipv6 address 2002:1::/127

ipv6 enable

router bgp 100

bgp router-id 2.2.2.2

no bgp default ipv4-unicast

bgp log-neighbor-changes

neighbor 10.0.0.2 remote-as 100

neighbor 2001:1::1 remote-as 200

neighbor 2002:1::1 remote-as 300

neighbor 2005:1:: remote-as 100

neighbor 172.16.1.2 remote-as 200

neighbor 173.16.1.2 remote-as 300

address-family ipv4

neighbor 10.0.0.2 activate

neighbor 172.16.1.2 activate

neighbor 173.16.1.2 activate

no auto-summary

no synchronization

network 22.22.22.0 mask 255.255.255.0

network 22.22.24.0 mask 255.255.255.0

exit-address-family

address-family ipv6

neighbor 2001:1::1 activate

neighbor 2002:1::1 activate

neighbor 2005:1:: activate

neighbor 2005:1:: next-hop-self

network 2010:1::1/64

network 2010:2::1/64

network 2010:3::1/64

exit-address-family

UPSTREAM1

ipv6 unicast-routing

interface Loopback0

ip address 5.5.5.5 255.255.255.255

interface Loopback1

no ip address

ipv6 address 2006:1::1/64

ipv6 address 2006:2::1/64

ipv6 address 2006:3::1/64

ipv6 address 2006:4::1/64

interface Loopback2

ip address 55.55.56.1 255.255.255.0 secondary

ip address 55.55.55.1 255.255.255.0

interface FastEthernet0/0

ip address 172.16.1.2 255.255.255.252

duplex auto

speed auto

ipv6 address 2001:1::1/127

ipv6 enable

router bgp 200

bgp router-id 5.5.5.5

no bgp default ipv4-unicast

bgp log-neighbor-changes

neighbor 2001:1:: remote-as 100

neighbor 172.16.1.1 remote-as 100

address-family ipv4

neighbor 172.16.1.1 activate

no auto-summary

no synchronization

network 55.55.55.0 mask 255.255.255.0

network 55.55.56.0 mask 255.255.255.0

exit-address-family

address-family ipv6

neighbor 2001:1:: activate

network 2006:1::1/64

network 2006:2::1/64

network 2006:3::1/64

network 2006:4::1/64

exit-address-family

UPSTREAM2

ipv6 unicast-routing

interface Loopback0

ip address 6.6.6.6 255.255.255.255

interface Loopback1

no ip address

ipv6 address 2020:1::1/64

ipv6 address 2020:2::1/64

ipv6 address 2020:3::1/64

ipv6 address 2020:4::1/64

ipv6 enable

interface Loopback2

ip address 66.66.67.1 255.255.255.0

interface FastEthernet0/0

ip address 173.16.1.2 255.255.255.252

duplex auto

speed auto

ipv6 address 2002:1::1/127

ipv6 enable

router bgp 300

bgp router-id 6.6.6.6

no bgp default ipv4-unicast

bgp log-neighbor-changes

neighbor 2002:1:: remote-as 100

neighbor 173.16.1.1 remote-as 100

address-family ipv4

neighbor 173.16.1.1 activate

no auto-summary

no synchronization

network 66.66.66.0 mask 255.255.255.0

network 66.66.67.0 mask 255.255.255.0

exit-address-family

address-family ipv6

neighbor 2002:1:: activate

network 2020:1::1/64

network 2020:2::1/64

network 2020:3::1/64

network 2020:4::1/64

exit-address-family

To see the BGP table we must use some different syntax on the IPV6 address family. First let us look at the BGP table on the ISP2 router, that interconnects every other router in our small topology.

ISP2#sh ip bgp

BGP table version is 6, local router ID is 2.2.2.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 22.22.22.0/24 0.0.0.0 0 32768 i

*> 22.22.24.0/24 0.0.0.0 0 32768 i

*> 55.55.55.0/24 172.16.1.2 0 0 200 i

*> 55.55.56.0/24 172.16.1.2 0 0 200 i

*> 66.66.67.0/24 173.16.1.2 0 0 300 i

The BGP table looks simple and clean. Our local networks and the routes from the external neighbors are correctly installed in the table. We can test the IPv4 data plane with a simple ping and verify that it is working fine.

ISP2#ping 66.66.67.1

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 66.66.67.1, timeout is 2 seconds:

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 20/24/40 ms

Now let us look at the IPv6 BGP table and the IPv6 address family neighbors. Cisco provides a separate command to verify IPv6 neighbor connectivity and the BGP table.

ISP2#sh ip bgp ipv6 unicast summary

BGP router identifier 2.2.2.2, local AS number 100

BGP table version is 15, main routing table version 15

14 network entries using 2086 bytes of memory

14 path entries using 1064 bytes of memory

5/4 BGP path/bestpath attribute entries using 620 bytes of memory

2 BGP AS-PATH entries using 48 bytes of memory

0 BGP route-map cache entries using 0 bytes of memory

0 BGP filter-list cache entries using 0 bytes of memory

BGP using 3818 total bytes of memory

BGP activity 23/4 prefixes, 23/4 paths, scan interval 60 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

2001:1::1 4 200 64 67 15 0 0 00:59:34 4

2002:1::1 4 300 47 51 15 0 0 00:42:28 4

2005:1:: 4 100 297 297 15 0 0 04:51:16 3

We can see that we have three IPv6 BGP neighbors: two external and one internal BGP-speaking router. Prefixes are being exchanged with all of them. Now let us see the IPv6 BGP routing table.

ISP2#sh ip bgp ipv6 unicast

BGP table version is 15, local router ID is 2.2.2.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 2006:1::/64 2001:1::1 0 0 200 i

*> 2006:2::/64 2001:1::1 0 0 200 i

*> 2006:3::/64 2001:1::1 0 0 200 i

*> 2006:4::/64 2001:1::1 0 0 200 i

*> 2010:1::1/64 :: 0 32768 i

*> 2010:2::1/64 :: 0 32768 i

*> 2010:3::1/64 :: 0 32768 i

*> 2020:1::/64 2002:1::1 0 0 300 i

*> 2020:2::/64 2002:1::1 0 0 300 i

*> 2020:3::/64 2002:1::1 0 0 300 i

*> 2020:4::/64 2002:1::1 0 0 300 i

*>i2030:1::/64 2005:1:: 0 100 0 i

*>i2030:2::/64 2005:1:: 0 100 0 i

*>i2030:3::/64 2005:1:: 0 100 0 i

We can see all the prefixes from the advertised IPv6 loopbacks, which are installed in the global IPv6 routing table. A simple ping verifies connectivity, and it is working fine.

ISP2#ping ipv6 2020:1::1 source loopback 1

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 2020:1::1, timeout is 2 seconds:

Packet sent with a source address of 2010:1::1

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 16/20/28 ms

There is much more to say about MP-BGP, and more posts will come on L3 MPLS VPN, where we will exchange VPNv4 and VPNv6 routes. For implementation details one can always consult the Cisco documentation on MP-BGP for IPv6.

Feel free to comment.

Thursday, August 29, 2013

Reduce BGP router utilization using ORF

Implementing outbound route filtering in BGP

The BGP Prefix Based Outbound Route Filtering feature uses Border Gateway Protocol (BGP) outbound route filter (ORF) send and receive capabilities to minimize the number of BGP updates that are sent between BGP peers. Configuring this feature can help reduce the amount of system resources required for generating and processing routing updates by filtering out unwanted routing updates at the source.

This cool feature can be very useful when a customer router receives and filters the full Internet routing table, which can weigh as much as 200 MB with over 300,000 prefixes. With ORF the router no longer has to process all the routes it would filter anyway, which frees up a lot of system resources. In our example we have a couple of routers in a simple ISP-customer PE-CE topology.

I will configure a simple BGP peering between the PE and the CE router. The CE router will receive the default route from the ISP router. The customer router does not need the full BGP routing table: maybe it is a stub router, or the default route is enough for everything the customer wants to reach on the Internet. To fulfill that scenario, a prefix list should be created to filter out the unnecessary routes. Before I apply the prefix list, let us look at the BGP table of the CE router. The routes installed are simulated with loopback addresses. In a production environment this could just as well be the full Internet routing table.

Now let us finish the peering and create a filter to choose only a couple of networks and a default route.

CE

router bgp 65535

no synchronization

bgp router-id 2.2.2.2

bgp log-neighbor-changes

neighbor 192.168.1.2 remote-as 999

neighbor 192.168.1.2 prefix-list ISP_IN in

no auto-summary

ip prefix-list ISP_IN seq 10 permit 0.0.0.0/0

ip prefix-list ISP_IN seq 20 permit 10.10.10.0/24

ip prefix-list ISP_IN seq 30 permit 20.20.20.0/24

PE

router bgp 999

no synchronization

bgp router-id 1.1.1.1

bgp log-neighbor-changes

redistribute connected

neighbor 192.168.1.1 remote-as 65535

neighbor 192.168.1.1 default-originate

no auto-summary

ip route 0.0.0.0 0.0.0.0 Null0

The filters are now working fine on the CE router. The BGP table is now much smaller, and the CE router has only the routes we have permitted.

CE#sh ip bgp

BGP table version is 25, local router ID is 2.2.2.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 0.0.0.0 192.168.1.2 0 0 999 i

*> 10.10.10.0/24 192.168.1.2 0 0 999 ?

*> 20.20.20.0/24 192.168.1.2 0 0 999 ?

But what happens under the hood can be seen in the debug BGP updates output. The router is denying all other routes from the PE router. In our case this is not a big problem because the table is small, but with the full Internet routing table this list could be very long and the filtering CPU intensive.

The CE router generates a DENIED message for every prefix that is not destined for the routing table. Receiving and then discarding all those updates is a very CPU-intensive task for the router, and this is why we should try outbound route filtering.

Outbound route filtering is a dynamic mechanism, which means it must be configured on both routers. As we have seen, the CE router is filtering the routes it receives from the PE router. With ORF in place, the CE router can send dynamic ORF messages to the PE router that tell it which updates should be sent over the peering. In other words, the CE router tells the PE router how to perform outbound filtering on its behalf.
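The saving can be illustrated with a small simulation. This is a sketch under stated assumptions, not the BGP message exchange itself: the `exchange` helper is hypothetical, the prefix list is the ISP_IN list from this post, and the two extra PE routes (30.30.30.0/24 and 40.40.40.0/24) are made up to stand in for unwanted prefixes.

```python
import ipaddress

# The CE's inbound prefix list (exact matches only, as in ISP_IN).
ISP_IN = [ipaddress.ip_network(p) for p in ("0.0.0.0/0", "10.10.10.0/24", "20.20.20.0/24")]

# Hypothetical routes the PE holds; the last two are not wanted by the CE.
PE_ROUTES = [ipaddress.ip_network(p) for p in
             ("0.0.0.0/0", "10.10.10.0/24", "20.20.20.0/24", "30.30.30.0/24", "40.40.40.0/24")]

def permitted(prefix, prefix_list):
    """Exact-match prefix-list check (no ge/le ranges in this sketch)."""
    return prefix in prefix_list

def exchange(pe_routes, ce_filter, orf_enabled):
    """Return (updates sent by the PE, updates denied on the CE, routes accepted by the CE)."""
    if orf_enabled:
        sent = [r for r in pe_routes if permitted(r, ce_filter)]  # PE filters outbound
    else:
        sent = list(pe_routes)                                    # PE sends everything
    accepted = [r for r in sent if permitted(r, ce_filter)]       # CE inbound filter
    return len(sent), len(sent) - len(accepted), len(accepted)

print("without ORF:", exchange(PE_ROUTES, ISP_IN, orf_enabled=False))
print("with ORF:   ", exchange(PE_ROUTES, ISP_IN, orf_enabled=True))
```

The CE accepts the same three routes in both cases; the difference is that with ORF the unwanted updates are never sent, so there is nothing left for the CE to deny.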

To implement it we can use two simple commands under the BGP process of the PE and CE routers.

CE(config-router)#neighbor 192.168.1.2 capability orf prefix-list send

PE(config-router)#neighbor 192.168.1.1 capability orf prefix-list receive

To verify the BGP neighbor capabilities of the CE router:

AF-dependant capabilities:

Outbound Route Filter (ORF) type (128) Prefix-list:

Send-mode: advertised

Receive-mode: received

Outbound Route Filter (ORF): sent;

Incoming update prefix filter list is ISP_IN

Sent Rcvd

Prefix activity: ---- ----

Prefixes Current: 0 3 (Consumes 156 bytes)

Prefixes Total: 0 4

Implicit Withdraw: 0 1

Explicit Withdraw: 0 0

Used as bestpath: n/a 3

Used as multipath: n/a 0

We did not create a prefix filter on the PE router for the 3 routes the CE is interested in, but if we display the prefix filter received from the CE router, we can verify that the PE holds the current prefix list.

PE#sh ip bgp neighbors 192.168.1.1 received prefix-filter

Address family: IPv4 Unicast

ip prefix-list 192.168.1.1: 3 entries

seq 10 permit 0.0.0.0/0

seq 20 permit 10.10.10.0/24

seq 30 permit 20.20.20.0/24

The final verification is to look at the debug on the CE router once more. We should verify that ORF reduces the number of DENIED messages on the CE router.

Looking at this debug, we can see that the CE router now receives only the prefixes that it requested. No extra BGP update traffic reaches the CE router. This greatly reduces convergence time and offloads the CPU.

If we need to add more routes to the BGP routing table of the CE router, we can update the prefix list and then use a route refresh that also sends the inbound prefix filter.

CE#clear ip bgp 999 in prefix-filter

For more granular use of ORF, one can look at the Cisco guide on the web.

http://www.cisco.com/en/US/docs/ios/12_2s/feature/guide/fsbgporf.html#wp1043332

Feel free to comment.

Tuesday, August 27, 2013

Configure BGP TTL Security

BGP TTL Security feature

By default, an eBGP speaker sends its BGP messages to a peer with a TTL value of 1. There are ways to change this behaviour with neighbor statements, but in this post we are interested in security.

This small scenario with a couple of eBGP peers will let us demonstrate this security feature.

If we send a BGP update with a TTL value of 1, the router needs to be directly connected to the peer it is peering with. It is very easy, however, for an attacker who is forging SYN packets to TCP port 179, where a BGP speaker is listening, to set the TTL value of those SYN requests so that they still reach the router. Many of these packets could bring down the peering and amount to a serious DoS attack. This could cause harm in a production environment.

We can prevent this by configuring a TTL security hop count for the eBGP neighbors (we must do this on both sides). This way it is very hard for an attacker to forge packets that arrive with the TTL value we have assigned in the neighbor statements.
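The check behind the hop count can be sketched as follows. The sketch assumes the behaviour described in RFC 5082 and in the Cisco documentation: packets are sent with a TTL of 255, and a neighbor configured with a hop count accepts only packets whose TTL is at least 255 minus that hop count. The function names are hypothetical.

```python
MAX_TTL = 255

def min_accepted_ttl(hops):
    """Lowest TTL accepted from a neighbor configured with ttl-security hops <hops>."""
    return MAX_TTL - hops

def ttl_on_arrival(sent_ttl, routers_crossed):
    """TTL left after the packet has been forwarded by routers_crossed routers."""
    return sent_ttl - routers_crossed

def accept(received_ttl, hops):
    """Return True if a packet arriving with received_ttl passes the TTL security check."""
    return received_ttl >= min_accepted_ttl(hops)

# Directly connected peer: no router in between, so the TTL is still 255.
print(accept(ttl_on_arrival(MAX_TTL, 0), hops=1))
# Remote attacker three routers away: even starting at 255, the TTL arrives as 252.
print(accept(ttl_on_arrival(MAX_TTL, 3), hops=1))
```

A sender cannot start above 255 and every router on the path decrements the TTL, so a distant attacker has no way to make a packet arrive with a high enough value.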

Let us try the configuration:

ISP1(config-router)#neighbor 192.168.1.2 ttl-security hops 1

If we now do a clear ip bgp *, we can see in the output that the peering does not come up yet.

ISP1#sh ip bgp summary

BGP router identifier 1.1.1.1, local AS number 500

BGP table version is 2, main routing table version 2

2 network entries using 234 bytes of memory

2 path entries using 104 bytes of memory

6/1 BGP path/bestpath attribute entries using 744 bytes of memory

2 BGP AS-PATH entries using 48 bytes of memory

0 BGP route-map cache entries using 0 bytes of memory

0 BGP filter-list cache entries using 0 bytes of memory

BGP using 1130 total bytes of memory

BGP activity 17/15 prefixes, 20/18 paths, scan interval 60 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

172.16.1.2 4 500 256 265 2 0 0 00:00:36 2

192.168.1.2 4 1 479 495 0 0 0 00:00:37 OpenSent

Now let us do it on the other side, on the customer router. We can see that the hold time has expired.

*Mar 1 04:08:10.142: %BGP-3-NOTIFICATION: sent to neighbor 192.168.1.1 4/0 (hold time expired) 0 bytes

So we must put in the exact same TTL hops in the customer neighbor statement.

CUSTOMER(config-router)#neighbor 192.168.1.1 ttl-security hops 1

And after a couple of seconds we have our peering up and running.

*Mar 1 04:10:39.186: %BGP-5-ADJCHANGE: neighbor 192.168.1.1 Up

So let us look at the final configs of the BGP processes.

CUSTOMER#sh running-config | section bgp

router bgp 1

no synchronization

bgp router-id 10.10.10.10

bgp log-neighbor-changes

bgp scan-time 20

network 50.50.50.0 mask 255.255.255.0

network 100.100.100.0 mask 255.255.255.0

redistribute connected

neighbor 192.168.1.1 remote-as 500

neighbor 192.168.1.1 ttl-security hops 1

neighbor 192.168.1.1 timers 20 60

neighbor 192.168.1.1 advertisement-interval 15

no auto-summary

ISP1#sh running-config | section bgp

router bgp 500

no synchronization

bgp router-id 1.1.1.1

bgp log-neighbor-changes

network 1.1.1.1 mask 255.255.255.255

neighbor 172.16.1.2 remote-as 500

neighbor 172.16.1.2 next-hop-self

neighbor 192.168.1.2 remote-as 1

neighbor 192.168.1.2 ttl-security hops 1

neighbor 192.168.1.2 soft-reconfiguration inbound

no auto-summary

This feature is explained in more detail in RFC 5082.

Feel free to comment.

BGP Soft Reconfiguration Inbound

Configure soft reconfiguration inbound

When a BGP-speaking router advertises routes, the receiving BGP router updates its BGP table with them. But there are situations where a network engineer wants to apply a new inbound policy to the routes the organization is receiving. Because of the BGP protocol design, the BGP UPDATE messages sent to peers are incremental, so to re-filter the complete set of prefixes one must use a hard reset or a route refresh (and some routers do not support route refresh). A hard reset of the BGP peering in a production environment is not a good thing.

That said, I will introduce a mechanism that allows us to store all the untouched NLRI (Network Layer Reachability Information) in a separate table that can be filtered later on. I have attached a small lab diagram to further elaborate the feature.

Let us take a look at the BGP table organization

Adj-RIBs-In ----> Loc-RIB ----> Adj-RIBs-Out

The Adj-RIBs-In stores UPDATE messages from other BGP speakers. These are un-edited routes received from our neighbor. Next, our inbound policy is applied, and routes that pass through the policy & have a valid/resolvable next hop, are put into the Loc-RIB. The rest of the routes in the Adj-RIBs-In are discarded.

The Adj-RIBs-Out stores routing information that the BGP speaker will advertise to its peers (i.e. routes that have passed through outbound policies & will be sent in the BGP UPDATE messages to other peers). This is actually just a pointer back to the record in the Loc-RIB.

Soft reconfiguration allows you to store a copy of the Adj-RIBs-In.
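The idea can be modeled in a few lines of Python. This is a simplified sketch, not the router's data structures: the `Neighbor` class, its method names and the string-based policies are all assumptions made for the illustration.

```python
class Neighbor:
    """Minimal model of one BGP neighbor with optional inbound soft reconfiguration."""

    def __init__(self, soft_reconfiguration=False):
        self.soft_reconfiguration = soft_reconfiguration
        self.adj_rib_in = set()  # unmodified routes as received (kept only if enabled)
        self.loc_rib = set()     # routes that passed the inbound policy

    def receive(self, prefix, policy):
        """Process one received prefix through the inbound policy."""
        if self.soft_reconfiguration:
            self.adj_rib_in.add(prefix)
        if policy(prefix):
            self.loc_rib.add(prefix)

    def apply_new_policy(self, policy):
        """Re-run the inbound policy; returns False if a reset or route refresh is needed."""
        if not self.soft_reconfiguration:
            return False
        self.loc_rib = {p for p in self.adj_rib_in if policy(p)}
        return True

peer = Neighbor(soft_reconfiguration=True)
for prefix in ("50.50.50.0/24", "15.15.15.0/24"):
    peer.receive(prefix, policy=lambda p: p.startswith("50."))
print(sorted(peer.loc_rib))
print(peer.apply_new_policy(lambda p: True), sorted(peer.loc_rib))
```

The stored copy is what makes a policy change possible without asking the neighbor to send its routes again, and it is also where the extra memory goes.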

We can configure the soft reconfiguration on the ISP router so one can filter the routes from the Customer.

First, let us confirm that inbound soft reconfiguration is not yet enabled, so we cannot see the unfiltered routes.

ISP1#sh ip bgp neighbors 192.168.1.2 received-routes

% Inbound soft reconfiguration not enabled on 192.168.1.2

We use a simple command to configure it.

ISP1(config-router)#neighbor 192.168.1.2 soft-reconfiguration inbound

Now if we try the show output for the received routes on the ISP we can see all the routes with their original NLRI data sent over the BGP peering.

ISP1#sh ip bgp neighbors 192.168.1.2 received-routes

BGP table version is 28, local router ID is 1.1.1.1

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 10.10.10.10/32 192.168.1.2 0 0 1 ?

*> 15.15.15.0/24 192.168.1.2 0 0 1 ?

*> 15.15.16.0/24 192.168.1.2 0 0 1 ?

*> 15.15.17.0/24 192.168.1.2 0 0 1 ?

*> 50.50.50.0/24 192.168.1.2 0 0 1 i

*> 100.100.100.0/24 192.168.1.2 0 0 1 i

r> 192.168.1.0/30 192.168.1.2 0 0 1 ?

Total number of prefixes 7

The last NLRI is flagged with an r> sign, which tells us that there is a RIB failure for that particular route: BGP could not install it in the routing table. If we run a show command for that route, we can see why: the routing table already holds the same prefix from a source with a lower administrative distance (AD). It is the directly connected interface on the ISP1 router.

ISP1#sh ip route 192.168.1.0

Routing entry for 192.168.1.0/30, 1 known subnets

Attached (1 connections)

C 192.168.1.0 is directly connected, FastEthernet2/0
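
The decision behind the r> flag can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name rib_winner is made up, and the table lists just a handful of the default Cisco administrative distances.

```python
# Default administrative distances on Cisco IOS (lower wins).
ADMIN_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20, "ospf": 110, "ibgp": 200}

def rib_winner(candidates):
    """Return the route source that gets installed for one prefix: lowest AD wins."""
    return min(candidates, key=lambda source: ADMIN_DISTANCE[source])

# 192.168.1.0/30 is directly connected on ISP1 and also learned from the Customer via eBGP.
candidates = ["ebgp", "connected"]
winner = rib_winner(candidates)

# BGP's copy was not installed, which is what the r> flag reports.
rib_failure = winner != "ebgp"
print(winner, rib_failure)
```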

Soft reconfiguration inbound uses a lot of memory on the router, because an extra, unfiltered copy of every received route is kept for each neighbor. So it is better not to use this feature on every router, in every scenario.

Feel free to comment.

Tune BGP Timers

How to configure BGP timers

BGP timers are used in the peering procedure between iBGP and eBGP speaking routers. In this post I will try to explain where to use them, how to tune them and what the benefit is. First let us see the basic timers:

KEEPALIVE and HOLD-DOWN
ADVERTISEMENT INTERVAL
SCAN-TIMER

Keepalive and hold-down timers are the most common in BGP peerings. With the default settings, the keepalive timer is 60 seconds and the hold-down timer is 3 x keepalive, or 180 seconds. Once the peering is established, the router counts up from 0, one second at a time. Every keepalive (or update) packet the router receives from the neighbor resets the timer, and the count starts again. If the neighbor misses three keepalives in a row, the count reaches the hold-down value, the hold-down timer expires and the peering goes down. The routes learned from that neighbor are then removed and no longer advertised. We do not want that to happen often in a production environment.
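
The counting described above can be modelled with a short Python sketch. It is a simplified illustration, not how a router implements it: the function name session_drop_time and its parameters are invented here, and time is measured in whole seconds since the session came up.

```python
def session_drop_time(message_times, hold_time, observed_until):
    """Return the second at which the hold timer expires, or None if it does not
    expire before observed_until.

    message_times: seconds at which a KEEPALIVE or UPDATE arrived from the neighbor.
    """
    last_heard = 0  # the timer starts when the session comes up
    for t in sorted(message_times):
        if t - last_heard >= hold_time:
            return last_heard + hold_time  # expired before this message arrived
        last_heard = t  # message received: the timer is reset
    if observed_until - last_heard >= hold_time:
        return last_heard + hold_time  # silence after the last message
    return None

# Defaults (keepalive 60, hold 180): the neighbor goes silent after the third keepalive.
print(session_drop_time([60, 120, 180], hold_time=180, observed_until=400))
# Same defaults, but the keepalives keep arriving.
print(session_drop_time([60, 120, 180, 240, 300, 360], hold_time=180, observed_until=400))
```

In the first call the last keepalive arrives at second 180, so the session is torn down three missed keepalives later, at second 360. With smaller timer values the same failure is detected sooner, which is the motivation for tuning them.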

Let us see a small lab diagram and test the timer tuning configs.

We can use the show command to see the output of the default BGP timer values on one of the eBGP peers. We can look at the customer router.

CUSTOMER#sh ip bgp neighbors 192.168.1.1

BGP neighbor is 192.168.1.1, remote AS 500, external link

BGP version 4, remote router ID 1.1.1.1

BGP state = Established, up for 01:35:20

Last read 00:00:20, last write 00:00:20, hold time is 180, keepalive interval is 60 seconds

Default minimum time between advertisement runs is 30 seconds

To keep the BGP table stable, a BGP speaking router enforces a minimum period between successive route advertisements to a neighboring router. This period is called the advertisement timer. The default timer is 0 seconds for iBGP neighbors and 30 seconds for eBGP neighbors.
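
As a rough illustration of what the advertisement timer does, the Python sketch below batches route changes into advertisement runs. It is a simplified model under my own assumptions (one neighbor, whole seconds, a run goes out immediately if the interval has already passed); the function name advertisement_runs is made up.

```python
def advertisement_runs(change_times, interval):
    """Return the times at which updates go out to one neighbor.

    change_times: seconds at which a route towards the neighbor changed.
    interval: minimum number of seconds between two advertisement runs.
    """
    runs = []
    for t in sorted(change_times):
        if not runs:
            runs.append(t)  # first change is advertised right away
        elif t <= runs[-1]:
            continue  # already covered by a run scheduled at or after this change
        else:
            # wait until the interval since the previous run has passed
            runs.append(max(t, runs[-1] + interval))
    return runs

changes = [0, 1, 2, 40]
print(advertisement_runs(changes, 30))  # eBGP default
print(advertisement_runs(changes, 0))   # no batching
```

With the 30 second default, the changes at seconds 1 and 2 are sent together in one run at second 30, so the neighbor sees fewer, larger updates at the cost of slower convergence.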

Service providers often agree on which BGP timer values to set on each side, depending on the services the router is carrying. In our situation we will change to smaller values to improve convergence in case of failures.

We will set the keepalive to 20 seconds and the hold-down timer to 60 seconds. Timer values are exchanged only when the session is being established, so to make the new settings take effect we have to reset the BGP peering with clear ip bgp * (note that a soft reset will not work in this case).

CUSTOMER(config-router)#neighbor 192.168.1.1 timers 20 60

CUSTOMER#sh running-config | section bgp

router bgp 1

no synchronization

bgp router-id 10.10.10.10

bgp log-neighbor-changes

network 50.50.50.0 mask 255.255.255.0

network 100.100.100.0 mask 255.255.255.0

neighbor 192.168.1.1 remote-as 500

neighbor 192.168.1.1 timers 20 60

no auto-summary

CUSTOMER#clear ip bgp *

*Mar 1 02:08:14.423: %BGP-5-ADJCHANGE: neighbor 192.168.1.1 Up

We can now verify that the settings took effect and that smaller timer values are in use on the current BGP peering with the eBGP router.

CUSTOMER#sh ip bgp neighbors 192.168.1.1

BGP neighbor is 192.168.1.1, remote AS 500, external link

BGP version 4, remote router ID 1.1.1.1

BGP state = Established, up for 00:00:33

Last read 00:00:13, last write 00:00:01, hold time is 60, keepalive interval is 20 seconds
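
Only the Customer was reconfigured, yet the session now runs with the smaller values. A minimal Python sketch of why: each side uses the smaller of the two hold times exchanged when the session is opened. The function name negotiate_timers is hypothetical, and the keepalive rule used here (the configured keepalive, capped at one third of the agreed hold time) is an assumption about how the router derives it, not something shown in the output above.

```python
def negotiate_timers(local, remote):
    """local and remote are (keepalive, hold) tuples in seconds, as configured on each peer.

    Returns the (keepalive, hold) pair the local router ends up using.
    """
    hold = min(local[1], remote[1])        # the smaller hold time wins
    keepalive = min(local[0], hold // 3)   # assumption: capped at one third of the hold time
    return keepalive, hold

customer = (20, 60)    # neighbor 192.168.1.1 timers 20 60
isp1 = (60, 180)       # defaults

print(negotiate_timers(customer, isp1))  # as seen from the Customer
print(negotiate_timers(isp1, customer))  # as seen from ISP1
```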

Now we should tweak the advertisement timer under the BGP process. We will set the minimum interval between advertisements to 15 seconds. This is more than enough for this type of connection.

CUSTOMER(config-router)#neighbor 192.168.1.1 advertisement-interval 15

We can see from the BGP neighbor output on the Customer router that we have some advertisement activity, and that the timer is now set to 15 seconds, as we told the router to do.

For address family: IPv4 Unicast

BGP table version 5, neighbor version 5/0

Output queue size : 0

Index 1, Offset 0, Mask 0x2

1 update-group member

Sent Rcvd

Prefix activity: ---- ----

Prefixes Current: 2 2 (Consumes 104 bytes)

Prefixes Total: 2 2

Implicit Withdraw: 0 0

Explicit Withdraw: 0 0

Used as bestpath: n/a 2

Used as multipath: n/a 0

Outbound Inbound

Local Policy Denied Prefixes: -------- -------

Bestpath from this peer: 2 n/a

Total: 2 0

Number of NLRIs in the update sent: max 2, min 2

Minimum time between advertisement runs is 15 seconds

One last timer we should include in this post is the BGP scanner. The BGP scan time defines how often the router walks through the complete BGP table; the default is 60 seconds. During each pass, the scan process checks the routing table for a missing or changed IGP route towards each next hop, and looks for a better alternative for a prefix based on the BGP attributes. The larger the table, the more work each pass takes (a complete Internet routing table could take up to 200 MB). We can lower the scan time if we have enough resources on the router to do this job more often.

CUSTOMER(config-router)#bgp scan-time 20

We can verify the scan time that is actually in use in the summary output.

CUSTOMER#sh ip bgp summary

BGP router identifier 10.10.10.10, local AS number 1

BGP table version is 5, main routing table version 5

4 network entries using 468 bytes of memory

4 path entries using 208 bytes of memory

4/3 BGP path/bestpath attribute entries using 496 bytes of memory

1 BGP AS-PATH entries using 24 bytes of memory

0 BGP route-map cache entries using 0 bytes of memory

0 BGP filter-list cache entries using 0 bytes of memory

BGP using 1196 total bytes of memory

BGP activity 16/12 prefixes, 16/12 paths, scan interval 20 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

192.168.1.1 4 500 178 165 5 0 0 00:11:07 2

So the final configs on our freshly tuned eBGP peer should look like this:

CUSTOMER#show running-config | section bgp

router bgp 1

no synchronization

bgp router-id 10.10.10.10

bgp log-neighbor-changes

bgp scan-time 20

network 50.50.50.0 mask 255.255.255.0

network 100.100.100.0 mask 255.255.255.0

neighbor 192.168.1.1 remote-as 500

neighbor 192.168.1.1 timers 20 60

neighbor 192.168.1.1 advertisement-interval 15

no auto-summary

Thanks for reading. Feel free to comment.

BGP Next-hop-Self explained

How and when to use BGP next-hop-self

I have created a small ISP scenario with a Customer and an Upstream router to demonstrate yet another command that is used often in BGP scenarios. As we all know, BGP advertises destinations, but to make those destinations usable, every BGP update message also carries a next-hop value.

For starters let us look at the diagram.

I have preconfigured the BGP process with the corresponding autonomous systems. In this scenario we are using the simulated WAN links between the BGP speaking routers as the neighbor addresses. We are not using eBGP multihop or any redistribution. This is how we will see the problem and the solution. Now let us take a look at the configs.

ISP1

interface Loopback0

ip address 1.1.1.1 255.255.255.255

interface FastEthernet0/0

ip address 172.16.1.1 255.255.255.252 << WAN link to iBGP router

duplex auto

speed auto

interface FastEthernet2/0

ip address 192.168.1.1 255.255.255.252 << link to Customer

duplex auto

speed auto

router ospf 1

router-id 1.1.1.1

log-adjacency-changes

passive-interface FastEthernet2/0

network 172.16.1.0 0.0.0.3 area 0

router bgp 500

no synchronization

bgp router-id 1.1.1.1

bgp log-neighbor-changes

network 1.1.1.1 mask 255.255.255.255

neighbor 172.16.1.2 remote-as 500

neighbor 192.168.1.2 remote-as 1

no auto-summary

ISP2

interface Loopback0

ip address 2.2.2.2 255.255.255.255

interface FastEthernet0/0

ip address 172.16.1.2 255.255.255.252 << link to iBGP neighbor

duplex auto

speed auto

interface FastEthernet1/0

ip address 192.168.3.1 255.255.255.252 << link to UPSTREAM

duplex auto

speed auto

router ospf 1

router-id 2.2.2.2

log-adjacency-changes

passive-interface FastEthernet1/0

network 172.16.1.0 0.0.0.3 area 0

router bgp 500

no synchronization

bgp router-id 2.2.2.2

bgp log-neighbor-changes

network 2.2.2.2 mask 255.255.255.255

neighbor 172.16.1.1 remote-as 500

neighbor 192.168.3.2 remote-as 100

no auto-summary

We can now verify the BGP neighborship between the iBGP routers inside the ISP domain. Everything is ok and the prefixes are being received.

ISP1#sh ip bgp summary

BGP router identifier 1.1.1.1, local AS number 500

BGP table version is 14, main routing table version 14

5 network entries using 585 bytes of memory

5 path entries using 260 bytes of memory

5/4 BGP path/bestpath attribute entries using 620 bytes of memory

2 BGP AS-PATH entries using 48 bytes of memory

0 BGP route-map cache entries using 0 bytes of memory

0 BGP filter-list cache entries using 0 bytes of memory

BGP using 1513 total bytes of memory

BGP activity 8/3 prefixes, 9/4 paths, scan interval 60 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd

172.16.1.2 4 500 56 57 14 0 0 00:24:42 2

The configs on the other two routers are very similar. We are only advertising the loopbacks.

CUSTOMER

interface Loopback0

ip address 10.10.10.10 255.255.255.255

interface Loopback1

ip address 100.100.100.1 255.255.255.0

interface Loopback2

ip address 50.50.50.1 255.255.255.0

interface FastEthernet0/0

ip address 192.168.1.2 255.255.255.252

duplex auto

speed auto

router bgp 1

no synchronization

bgp router-id 10.10.10.10

bgp log-neighbor-changes

network 50.50.50.0 mask 255.255.255.0

network 100.100.100.0 mask 255.255.255.0

neighbor 192.168.1.1 remote-as 500

no auto-summary

UPSTREAM

interface Loopback0

ip address 5.5.5.5 255.255.255.255

interface Loopback1

ip address 200.200.200.1 255.255.255.0

interface FastEthernet0/0

ip address 192.168.3.2 255.255.255.252

duplex auto

speed auto

router bgp 100

no synchronization

bgp router-id 5.5.5.5

bgp log-neighbor-changes

network 200.200.200.0

neighbor 192.168.3.1 remote-as 500

no auto-summary

We should end up with complete BGP routing tables on both routers inside the ISP autonomous system. As we will see, that is not the case. Let us see what networks the Customer is advertising.

CUSTOMER#sh ip bgp neighbors 192.168.1.1 advertised-routes

BGP table version is 13, local router ID is 10.10.10.10

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*> 50.50.50.0/24 0.0.0.0 0 32768 i

*> 100.100.100.0/24 0.0.0.0 0 32768 i

Total number of prefixes 2

We see two prefixes from the Customer. Now let us look at the BGP table on ISP2, the second router inside the ISP autonomous system.

ISP2#sh ip bgp

BGP table version is 8, local router ID is 2.2.2.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*>i1.1.1.1/32 172.16.1.1 0 100 0 i

*> 2.2.2.2/32 0.0.0.0 0 32768 i

* i50.50.50.0/24 192.168.1.2 0 100 0 1 i

* i100.100.100.0/24 192.168.1.2 0 100 0 1 i

*> 200.200.200.0 192.168.3.2 0 0 100 i

The two prefixes learned from the Customer (50.50.50.0/24 and 100.100.100.0/24) are inside the BGP table, but there is no > sign in front of them. This means that they were not selected as best paths, so they will not be installed in the routing table and will not be advertised to the eBGP neighbor Upstream. The main problem is that the ISP2 router cannot reach the NEXT HOP address.

ISP2#sh ip route 192.168.1.2

% Network not in table

To remedy this problem we will use the next-hop-self command under the BGP process of the ISP1 router. This command tells ISP1 to rewrite the BGP next-hop attribute to its own address, which ISP2 already knows, before advertising the routes to ISP2. By default, the next hop of a route learned from an eBGP neighbor is preserved when the route is passed on to iBGP peers, which is why ISP2 received a next hop that it could not reach.

ISP1(config-router)#neighbor 172.16.1.2 next-hop-self

Now we can take a look at the BGP table of the ISP2 router.

ISP2#sh ip bgp

BGP table version is 10, local router ID is 2.2.2.2

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*>i1.1.1.1/32 172.16.1.1 0 100 0 i

*> 2.2.2.2/32 0.0.0.0 0 32768 i

*>i50.50.50.0/24 172.16.1.1 0 100 0 1 i

*>i100.100.100.0/24 172.16.1.1 0 100 0 1 i

*> 200.200.200.0 192.168.3.2 0 0 100 i
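
The difference between the two ISP2 outputs comes down to one check: can the router resolve the next hop through its own routing table? The Python sketch below reproduces that check with the addresses from this lab. It is a simplified illustration: the function name next_hop_resolvable is made up, and the route list only contains the connected networks from the ISP2 config shown earlier.

```python
import ipaddress

# Networks ISP2 can resolve a next hop through: its loopback and its two links.
isp2_routes = [
    ipaddress.ip_network(p)
    for p in ("2.2.2.2/32", "172.16.1.0/30", "192.168.3.0/30")
]

def next_hop_resolvable(next_hop, routes):
    """True if some route in the table covers the next-hop address."""
    addr = ipaddress.ip_address(next_hop)
    return any(addr in net for net in routes)

# Next hop as learned from the Customer and passed on unchanged by ISP1.
print(next_hop_resolvable("192.168.1.2", isp2_routes))
# Next hop after next-hop-self on ISP1.
print(next_hop_resolvable("172.16.1.1", isp2_routes))
```

The first check fails because the 192.168.1.0/30 link exists only on ISP1 and is not advertised in OSPF; the second succeeds because 172.16.1.1 sits on the link that ISP2 shares with ISP1.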

Now that the routing plane is functioning, the data plane is working fine as well. We can use a simple ping to verify that.

ISP2#ping 100.100.100.1 source loopback 0

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 100.100.100.1, timeout is 2 seconds:

Packet sent with a source address of 2.2.2.2

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 28/48/64 ms

Thanks !!!
