Showing posts with label bgp.

Wednesday, September 4, 2013

DDOS attack mitigation via remote black hole

How to black hole (stop) an attacker inside your network

Remote triggered black hole (RTBH) filtering of spoofed or DDoS-active subnets is a great way to save router resources and prevent an attacker from damaging your network.
A common DoS attack directed against a customer of a service provider involves generating a greater volume of attack traffic destined for the target than will fit down the links from the service provider(s) to the victim (customer). This traffic "starves out" legitimate traffic and often results in collateral damage or negative effects to other customers or the network infrastructure as well.  Rather than having all destinations on their network be affected by the attack, the customer may ask their service provider to filter traffic destined to the target destination IP address(es), or the service provider may determine that this is necessary themselves, in order to preserve network availability.
However, with destination-based RTBH filtering, the impact of the attack on the target is complete.  That is, destination-based RTBH filtering injects a discard route into the forwarding table for the target prefix.  All packets towards that destination, attack traffic AND legitimate traffic, are then dropped by the participating routers, thereby taking the target completely offline.  The benefit is  that collateral damage to other systems or network availability at the customer location or in the ISP network is limited, but the negative impact to the target itself is arguably increased.
In this small scenario I will use an eBGP-speaking router that advertises the "spoofed DDoS subnet" 99.99.99.0/24. All of the iBGP routers inside the AS100 domain will have this prefix installed in their BGP tables.

The iBGP router CX2 is used as the trigger device. It has a simple task: advertise the DDoS prefix inside AS100 so that packets for it are sent to the black hole. Let us look at the configuration of the trigger router.
CX2
interface Loopback1
 ip address 192.0.2.1 255.255.255.255
!
route-map BLACK-HOLE permit 10
 match tag 999
 set local-preference 200
 set origin igp
 set community no-export
 set ip next-hop 192.0.2.1
!
route-map BLACK-HOLE deny 20
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 network 5.5.5.5 mask 255.255.255.255
 redistribute static route-map BLACK-HOLE
 neighbor 16.2.1.1 remote-as 100
 neighbor 18.1.1.2 remote-as 100
 no auto-summary
We have created a simple route map that matches static routes carrying tag 999 and sets their next hop to 192.0.2.1, so the static route for the attacked prefix must be configured with that tag. The address is taken from 192.0.2.0/24 (TEST-NET-1), a documentation range that should never carry real traffic, which makes it a convenient discard next hop. Every other iBGP router must have a static route for 192.0.2.1 that points those packets to the Null0 interface.
CX1(config)#ip route 192.0.2.1 255.255.255.255 Null0
Customer(config)#ip route 192.0.2.1 255.255.255.255 Null0
Now let us take a look at the BGP table of the CX1 router. We can see that the 99.99.99.0/24 prefix is present in the table, and we have connectivity to the spoofed address.
CX1#sh ip bgp 99.99.99.0
BGP routing table entry for 99.99.99.0/24, version 18
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  200 65535
    16.1.1.2 from 16.1.1.2 (1.1.1.1)
      Origin IGP, metric 0, localpref 100, valid, internal, best
We can test the route by pinging the spoofed address.
CX1#ping 99.99.99.1 source loopback 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 99.99.99.1, timeout is 2 seconds:
Packet sent with a source address of 4.4.4.4
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/74/96 ms
Now, to drop this traffic throughout AS100, we must create a static route for the prefix on the trigger router and redistribute it into BGP so that it is distributed inside AS100. I will leave a long ping running from the CX1 router to demonstrate how the traffic stops inside AS100 as soon as I create the static route for the 99.99.99.0/24 subnet.
As soon as I typed in the static route, the ping stopped. If we look at the routing table of the CX1 router now, we can see that the route is being advertised by the trigger router and that the next hop is set to 192.0.2.1, the discard IP address.

CX1#sh ip route 99.99.99.0
Routing entry for 99.99.99.0/24
  Known via "bgp 100", distance 200, metric 0, type internal
  Last update from 192.0.2.1 00:02:23 ago
  Routing Descriptor Blocks:
  * 192.0.2.1, from 16.2.1.2, 00:02:23 ago
      Route metric is 0, traffic share count is 1
      AS Hops 0
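
To make the mechanism concrete, here is a minimal Python sketch of the recursive lookup a router such as CX1 performs once the trigger route is present. It is only a model, not router code: the `ROUTES` table, the `/30` mask on the transit link and the helper names are assumptions made for illustration.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop.
# "Null0" means discard, "connected" means directly attached.
ROUTES = {
    "99.99.99.0/24": "192.0.2.1",  # BGP route injected by the trigger router
    "192.0.2.1/32": "Null0",       # static discard route on every router
    "16.1.1.0/30": "connected",    # assumed mask for the transit link
}

def longest_match(table, address):
    """Return the most specific prefix in the table that contains address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in map(ipaddress.ip_network, table) if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)

def resolve(table, destination, max_depth=8):
    """Follow next hops recursively until a final forwarding decision."""
    target = destination
    for _ in range(max_depth):
        best = longest_match(table, target)
        if best is None:
            return "unreachable"
        hop = table[str(best)]
        if hop in ("Null0", "connected"):
            return hop
        target = hop  # the next hop itself must be resolved
    return "loop"

print(resolve(ROUTES, "99.99.99.1"))  # the attacked prefix ends up at Null0
```

The point of the sketch is that the black hole needs no access list: the BGP next hop resolves through the static route, and the packet is discarded by ordinary forwarding.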

This very simple DDoS mitigation mechanism can be used in more complex scenarios, for example with redundant route reflectors inside a large BGP domain. One can stop an attacker in a very short time. There are more explanations in RFC 5635.

Feel free to comment.

Monday, September 2, 2013

BGP Communities - routes reside inside the local AS

Setting the NO-EXPORT BGP community 

BGP communities are attributes that may be added to any prefix we choose. This is very interesting if one wants to logically separate incoming and outgoing traffic. With communities we have more granular control over how traffic is routed inside our autonomous system.
The communities attribute is a way to group destinations into communities and apply routing decisions based on the communities. This method simplifies the configuration of a BGP speaker that controls distribution of routing information.
The communities attribute is an optional, transitive, global attribute in the numerical range from 1 to 4,294,967,200. Along with Internet community, there are a few predefined, well-known communities, as follows:
  • internet—Advertise this route to the Internet community. All routers belong to it.
  • no-export—Do not advertise this route to eBGP peers.
  • no-advertise—Do not advertise this route to any peer (internal or external).
  • local-as—Do not advertise this route to peers outside the local autonomous system. This route will not be advertised to other autonomous systems or sub-autonomous systems when confederations are configured.
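
The rules in this list can be sketched in a few lines of Python. This is a simplified model rather than any router's implementation: the numeric values are the well-known community values from RFC 1997, while the function names and the peer type labels (`ibgp`, `ebgp`, `confed-ebgp`) are assumptions chosen for the example.

```python
# Well-known community values from RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03  # shown as "local-as" on IOS

def encode(asn, value):
    """Pack the AS:value notation into the 32-bit community value."""
    return (asn << 16) | value

def decode(community):
    """Render a 32-bit community value in AS:value notation."""
    return f"{community >> 16}:{community & 0xFFFF}"

def may_advertise(communities, peer_type):
    """Decide whether a route with these communities may be sent to a peer.

    peer_type is one of "ibgp", "ebgp" or "confed-ebgp" (labels assumed here).
    """
    if NO_ADVERTISE in communities:
        return False
    if NO_EXPORT_SUBCONFED in communities:
        return peer_type == "ibgp"
    if NO_EXPORT in communities:
        return peer_type in ("ibgp", "confed-ebgp")
    return True

print(decode(encode(100, 999)))              # 100:999
print(may_advertise({NO_EXPORT}, "ebgp"))    # False
```

A route tagged no-export therefore still circulates between iBGP peers, which matches the behaviour we want for prefixes that must stay inside the AS.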

In this small scenario we have a customer, AS 100, that does not want some prefixes to be advertised outside its own AS. Maybe the prefixes are malicious, maybe there is no agreement with the ISPs, or there is some other reason. This can be done using the well-known BGP community NO-EXPORT. Let us take a look at the diagram.


In our small example we will be looking at the CX1 router and some of the ISP routing tables. First let us take a look at the BGP configs of the CX1 router, and the BGP table of the ISP2 router.


We can see that the ISP2 router has the two prefixes in its BGP table. Those are the prefixes we do not want to advertise outside AS100. This is achieved via a simple route-map and an ACL that matches the desired prefixes.

CX1#sh ip access-lists
Standard IP access list 1
    10 permit 44.44.44.0, wildcard bits 0.0.1.255 (2 matches)

CX1#sh route-map
route-map NO-EXPORT, permit, sequence 10
  Match clauses:
    ip address (access-lists): 1
  Set clauses:
    community no-export
  Policy routing matches: 0 packets, 0 bytes

CX1
 neighbor 16.1.1.2 send-community both
 neighbor 16.1.1.2 route-map NO-EXPORT out

We have an ACL that matches the loopback networks advertised as the prefixes 44.44.44.0/24 and 44.44.45.0/24. After that I created a route-map called NO-EXPORT that uses ACL 1 and sets the no-export community on those prefixes. Then we applied this route-map outbound to the neighbor inside AS100, together with send-community so that the attribute is actually sent.
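
A single ACL entry covers both /24s because of the wildcard bits. The short Python sketch below reproduces that comparison; the function name `wildcard_match` is made up for the example, and the logic is the usual "ignore every bit that is set in the wildcard" rule.

```python
import ipaddress

def wildcard_match(address, base, wildcard):
    """Compare address with base, ignoring the bits set in the wildcard."""
    addr = int(ipaddress.IPv4Address(address))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (ref & care)

# Entry from the ACL above: permit 44.44.44.0, wildcard bits 0.0.1.255
for network in ("44.44.44.0", "44.44.45.0", "44.44.46.0"):
    print(network, wildcard_match(network, "44.44.44.0", "0.0.1.255"))
```

The wildcard 0.0.1.255 ignores the lowest bit of the third octet, so 44 and 45 both match while 46 does not.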
Now let us see the RIB table of the ISP2 router.


And it is working fine: we no longer see the prefixes we stopped advertising. We can verify this further on the edge router of AS100.

CUSTOMER#sh ip bgp neighbors 172.16.1.2 advertised-routes
BGP table version is 11, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.1/32       0.0.0.0                  0         32768 i
*> 2.2.2.2/32       192.168.1.2              0             0 200 i
*> 3.3.3.3/32       172.16.1.2               0             0 300 i
*>i5.5.5.5/32       17.1.1.2                 0    100      0 i

Total number of prefixes 4

We can see that the prefixes are not being advertised to the eBGP neighbor, as we intended. Below we can see the BGP table entry for one of them on the AS100 edge router.

CUSTOMER#sh ip bgp 44.44.44.0
BGP routing table entry for 44.44.44.0/24, version 10
Paths: (1 available, best #1, table Default-IP-Routing-Table, not advertised to EBGP peer)
  Not advertised to any peer
  Local
    16.1.1.1 from 16.1.1.1 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, best
      Community: no-export

The prefix 44.44.44.0/24 is in the BGP table, but it has the no-export community attached. Thus the prefix is not exported to the eBGP neighbors. Very simple and clean.

Feel free to comment.

Friday, August 30, 2013

MultiProtocol BGP meshed IPv6 and IPv4

Implementing MP-BGP in a SP IPv6 and IPv4 network


The multiprotocol BGP (MBGP) feature adds capabilities to BGP to enable multicast routing policy throughout the Internet and to connect multicast topologies within and between BGP autonomous systems. In other words, multiprotocol BGP (MBGP) is an enhanced BGP that carries IP multicast routes. BGP carries two sets of routes, one set for unicast routing and one set for multicast routing. The routes associated with multicast routing are used by the Protocol Independent Multicast (PIM) to build data distribution trees.
The only three pieces of information carried by BGP-4 that are IPv4 specific are (a) the NEXT_HOP attribute (expressed as an IPv4 address), (b) AGGREGATOR (contains an IPv4 address), and (c) NLRI (expressed as IPv4 address prefixes). Any BGP speaker, including MBGP speakers, has to have an IPv4 address, which will be used, among other things, in the AGGREGATOR attribute. To enable BGP-4 to support routing for multiple Network Layer protocols the only two things that have to be added to BGP-4 are (a) the ability to associate a particular Network Layer protocol with the next hop information, and (b) the ability to associate a particular Network Layer protocol with NLRI.

MP-BGP is an extension to the BGP protocol whose objective is to carry routing information for other protocols and address families, such as:
  • Multicast
  • MPLS VPN
  • IPv6
  • 6PE
  • CLNS
Exchange of Multi-Protocol NLRI must be negotiated at session set up.
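
As a rough illustration of that negotiation, the Python sketch below builds the Multiprotocol Extensions capability that each speaker places in its OPEN message (capability code 1, length 4, then AFI, a reserved byte and SAFI) and computes which address families both sides agree on. The AFI/SAFI numbers are the registered values; the dictionary and function names are assumptions for the example.

```python
import struct

AFI = {"ipv4": 1, "ipv6": 2}
SAFI = {"unicast": 1, "multicast": 2, "mpls-vpn": 128}

def mp_capability(afi, safi):
    """Encode one Multiprotocol Extensions capability.

    Layout: code (1), length (4), AFI (2 bytes), reserved (1 byte), SAFI (1 byte).
    """
    return struct.pack("!BBHBB", 1, 4, afi, 0, safi)

def negotiated(local, remote):
    """An address family is usable only if both peers advertised it."""
    return sorted(set(local) & set(remote))

print(mp_capability(AFI["ipv6"], SAFI["unicast"]).hex())
print(negotiated([("ipv4", "unicast"), ("ipv6", "unicast")],
                 [("ipv6", "unicast")]))
```

This is also why, in the configurations that follow, each neighbor has to be activated per address family: a family that only one side advertises is not exchanged.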
For a practical presentation of the MP-BGP protocol I have created a small ISP lab with a couple of upstream providers that uses IPv6 and IPv4 prefix routing at the same time. This is common practice nowadays in the ISP environment.

We have a small ISP with two routers that are iBGP speakers and a couple of eBGP peers with upstream connections. For those who are familiar with IPv6 setup and addressing this will come easy. I am using /127 networks for the WAN links, so each point-to-point link has exactly two addresses. On the same physical link I am also using an IPv4 address to peer with the BGP-speaking router. Now let us look at the configs; I will try to clarify every command. For more on the MP-BGP protocol, one can read the RFC on the subject, RFC 2858 (since obsoleted by RFC 4760).
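
The /127 arithmetic is easy to check with Python's standard `ipaddress` module. The snippet is only a sanity check of the addressing plan; the helper name `same_link` is made up for the example.

```python
import ipaddress

# WAN link between ISP1 and ISP2 from the lab addressing plan.
link = ipaddress.ip_network("2005:1::/127")
addresses = [str(address) for address in link]

def same_link(first, second):
    """True if two interface addresses (with prefix length) share a network."""
    return (ipaddress.ip_interface(first).network
            == ipaddress.ip_interface(second).network)

print(link.num_addresses, addresses)
print(same_link("2005:1::/127", "2005:1::1/127"))
```

The network holds exactly the two addresses used on each side of the link, 2005:1:: and 2005:1::1, with nothing left over.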

ISP1
ipv6 unicast-routing    >> important to turn on because by default IPV6 routing is disabled 
!
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
interface Loopback1
 no ip address
 ipv6 address 2030:1::1/64   >> I have defined a couple of /64 networks to propagate to AS100
 ipv6 address 2030:2::1/64
 ipv6 address 2030:3::1/64
 ipv6 enable
!
interface FastEthernet0/0   >> dual IP stack  IPv4 and IPv6 address on the WAN link
 ip address 10.0.0.2 255.255.255.252  
 duplex auto
 speed auto
 ipv6 address 2005:1::/127   << a /127 network allows only two IPv6 addresses
 ipv6 enable          << on some routers this is enabled after entering the IP address
!
router bgp 100
 bgp router-id 1.1.1.1
 no bgp default ipv4-unicast  << I have disabled the default behaviour of BGP, as we are using the address family concept
 bgp log-neighbor-changes
 neighbor 10.0.0.1 remote-as 100
 neighbor 2005:1::1 remote-as 100
 ! 
 address-family ipv4          << the address family model for IPV4
  neighbor 10.0.0.1 activate  
  no auto-summary
  no synchronization
 exit-address-family
 !
 address-family ipv6
  neighbor 2005:1::1 activate
  network 2030:1::1/64          << advertising loopback 1 subnets into BGP
  network 2030:2::1/64
  network 2030:3::1/64
 exit-address-family

The Cisco BGP address family identifier (AFI) model was introduced with multiprotocol BGP and is designed to be modular and scalable, and to support multiple AFI and subsequent address family identifier (SAFI) configurations.
As we can see, I have defined two address families, IPv4 and IPv6, for the BGP peerings. We must use the activate command for every neighbor in the address family, or for a peer group to make it easier to manage. The peer address and AS number are defined under the global BGP process, and the neighbor is then activated under the address family.
Now let us look at the rest of the router configs; they are pretty much the same.

ISP2
ipv6 unicast-routing
!
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
interface Loopback1
 no ip address
 ipv6 address 2010:1::1/64
 ipv6 address 2010:2::1/64
 ipv6 address 2010:3::1/64
 ipv6 enable
!
interface Loopback2
 ip address 22.22.22.1 255.255.255.0 secondary
 ip address 22.22.24.1 255.255.255.0
!
interface FastEthernet0/0
 ip address 10.0.0.1 255.255.255.252
 duplex auto
 speed auto
 ipv6 address 2005:1::1/127
 ipv6 enable
!
interface FastEthernet1/0
 ip address 172.16.1.1 255.255.255.252
 duplex auto
 speed auto
 ipv6 address 2001:1::/127
 ipv6 enable
!
interface FastEthernet2/0
 ip address 173.16.1.1 255.255.255.252
 duplex auto
 speed auto
 ipv6 address 2002:1::/127
 ipv6 enable
!
router bgp 100
 bgp router-id 2.2.2.2
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 10.0.0.2 remote-as 100
 neighbor 2001:1::1 remote-as 200
 neighbor 2002:1::1 remote-as 300
 neighbor 2005:1:: remote-as 100
 neighbor 172.16.1.2 remote-as 200
 neighbor 173.16.1.2 remote-as 300
 !
 address-family ipv4
  neighbor 10.0.0.2 activate
  neighbor 172.16.1.2 activate
  neighbor 173.16.1.2 activate
  no auto-summary
  no synchronization
  network 22.22.22.0 mask 255.255.255.0
  network 22.22.24.0 mask 255.255.255.0
 exit-address-family
 !
 address-family ipv6
  neighbor 2001:1::1 activate
  neighbor 2002:1::1 activate
  neighbor 2005:1:: activate
  neighbor 2005:1:: next-hop-self
  network 2010:1::1/64
  network 2010:2::1/64
  network 2010:3::1/64
 exit-address-family

UPSTREAM1
ipv6 unicast-routing
!
interface Loopback0
 ip address 5.5.5.5 255.255.255.255
!
interface Loopback1
 no ip address
 ipv6 address 2006:1::1/64
 ipv6 address 2006:2::1/64
 ipv6 address 2006:3::1/64
 ipv6 address 2006:4::1/64
!
interface Loopback2
 ip address 55.55.56.1 255.255.255.0 secondary
 ip address 55.55.55.1 255.255.255.0
!
interface FastEthernet0/0
 ip address 172.16.1.2 255.255.255.252
 duplex auto
 speed auto
 ipv6 address 2001:1::1/127
 ipv6 enable
!
router bgp 200
 bgp router-id 5.5.5.5
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 2001:1:: remote-as 100
 neighbor 172.16.1.1 remote-as 100
 !
 address-family ipv4
  neighbor 172.16.1.1 activate
  no auto-summary
  no synchronization
  network 55.55.55.0 mask 255.255.255.0
  network 55.55.56.0 mask 255.255.255.0
 exit-address-family
 !
 address-family ipv6
  neighbor 2001:1:: activate
  network 2006:1::1/64
  network 2006:2::1/64
  network 2006:3::1/64
  network 2006:4::1/64
 exit-address-family

UPSTREAM2
ipv6 unicast-routing
!
interface Loopback0
 ip address 6.6.6.6 255.255.255.255
!
interface Loopback1
 no ip address
 ipv6 address 2020:1::1/64
 ipv6 address 2020:2::1/64
 ipv6 address 2020:3::1/64
 ipv6 address 2020:4::1/64
 ipv6 enable
!
interface Loopback2
 ip address 66.66.67.1 255.255.255.0
!
interface FastEthernet0/0
 ip address 173.16.1.2 255.255.255.252
 duplex auto
 speed auto
 ipv6 address 2002:1::1/127
 ipv6 enable
!
router bgp 300
 bgp router-id 6.6.6.6
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 2002:1:: remote-as 100
 neighbor 173.16.1.1 remote-as 100
 !
 address-family ipv4
  neighbor 173.16.1.1 activate
  no auto-summary
  no synchronization
  network 66.66.66.0 mask 255.255.255.0
  network 66.66.67.0 mask 255.255.255.0
 exit-address-family
 !
 address-family ipv6
  neighbor 2002:1:: activate
  network 2020:1::1/64
  network 2020:2::1/64
  network 2020:3::1/64
  network 2020:4::1/64
 exit-address-family

To see the BGP table we must use a different syntax for the IPv6 address family. First let us look at the IPv4 BGP table on the ISP2 router, which interconnects every other router in our small topology.

ISP2#sh ip bgp
BGP table version is 6, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 22.22.22.0/24    0.0.0.0                  0         32768 i
*> 22.22.24.0/24    0.0.0.0                  0         32768 i
*> 55.55.55.0/24    172.16.1.2               0             0 200 i
*> 55.55.56.0/24    172.16.1.2               0             0 200 i
*> 66.66.67.0/24    173.16.1.2               0             0 300 i

The BGP table looks simple and clean. We have our locally originated routes and the routes from the external neighbors correctly installed. We can test the IPv4 data plane with a simple ping and verify that it is working fine.

ISP2#ping 66.66.67.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 66.66.67.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/24/40 ms

Now, let us look at the IPv6 BGP table and the IPv6 address family neighbors. Cisco introduced a separate command to verify the IPv6 neighbor connectivity and the BGP table.


ISP2#sh ip bgp ipv6 unicast summary
BGP router identifier 2.2.2.2, local AS number 100
BGP table version is 15, main routing table version 15
14 network entries using 2086 bytes of memory
14 path entries using 1064 bytes of memory
5/4 BGP path/bestpath attribute entries using 620 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 3818 total bytes of memory
BGP activity 23/4 prefixes, 23/4 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2001:1::1       4   200      64      67       15    0    0 00:59:34        4
2002:1::1       4   300      47      51       15    0    0 00:42:28        4
2005:1::        4   100     297     297       15    0    0 04:51:16        3

We can see that we have three IPv6 BGP neighbors: two external and one internal BGP-speaking router. The prefixes are exchanged between them. Now let us see the IPv6 BGP routing table.

ISP2#sh ip bgp ipv6 unicast
BGP table version is 15, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 2006:1::/64      2001:1::1                0             0 200 i
*> 2006:2::/64      2001:1::1                0             0 200 i
*> 2006:3::/64      2001:1::1                0             0 200 i
*> 2006:4::/64      2001:1::1                0             0 200 i
*> 2010:1::1/64     ::                       0         32768 i
*> 2010:2::1/64     ::                       0         32768 i
*> 2010:3::1/64     ::                       0         32768 i
*> 2020:1::/64      2002:1::1                0             0 300 i
*> 2020:2::/64      2002:1::1                0             0 300 i
*> 2020:3::/64      2002:1::1                0             0 300 i
*> 2020:4::/64      2002:1::1                0             0 300 i
*>i2030:1::/64      2005:1::                 0    100      0 i
*>i2030:2::/64      2005:1::                 0    100      0 i
*>i2030:3::/64      2005:1::                 0    100      0 i

We can see all the prefixes from the advertised IPv6 loopbacks, which are installed in the global IPv6 routing table. We can do a simple ping to verify connectivity, and it is working fine.

ISP2#ping ipv6 2020:1::1 source loopback 1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2020:1::1, timeout is 2 seconds:
Packet sent with a source address of 2010:1::1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/20/28 ms

There is much more to say about the MP-BGP protocol, and more posts to come on L3 MPLS VPN, where we will exchange VPNv4 and VPNv6 routes. For a detailed implementation one can always consult the Cisco documentation on MP-BGP for IPv6.

Feel free to comment.

Thursday, August 29, 2013

Reduce BGP router utilization using ORF

Implementing outbound route filtering in BGP


The BGP Prefix Based Outbound Route Filtering feature uses Border Gateway Protocol (BGP) outbound route filter (ORF) send and receive capabilities to minimize the number of BGP updates that are sent between BGP peers. Configuring this feature can help reduce the amount of system resources required for generating and processing routing updates by filtering out unwanted routing updates at the source.

This cool feature can be very useful when a customer router is receiving and filtering the full Internet routing table, which can be as heavy as 200 MB with over 300,000 prefixes. With ORF the router no longer has to process all the routes it would filter out anyway, which frees up a lot of system resources. In our example we have a couple of routers in a simple ISP-customer PE-CE network topology.



I will configure a simple BGP peering between the PE and the CE router. The CE router will receive the default route from the ISP router. The customer router does not need the full BGP routing table; maybe it is a stub router, or the default route is all the Internet routing information the customer wants. To fulfill that scenario, a prefix list should be created to filter out the unnecessary routes. Before I apply the prefix list, let us look at the BGP table of the CE router. The routes installed are simulated from loopback addresses. This could also be a full Internet routing table in a production environment.


Now let us finish the peering and create a filter to choose only a couple of networks and a default route.

CE
router bgp 65535
 no synchronization
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 192.168.1.2 remote-as 999
 neighbor 192.168.1.2 prefix-list ISP_IN in
 no auto-summary

ip prefix-list ISP_IN seq 10 permit 0.0.0.0/0
ip prefix-list ISP_IN seq 20 permit 10.10.10.0/24
ip prefix-list ISP_IN seq 30 permit 20.20.20.0/24

PE
router bgp 999
 no synchronization
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
 redistribute connected
 neighbor 192.168.1.1 remote-as 65535
 neighbor 192.168.1.1 default-originate
 no auto-summary
!
ip route 0.0.0.0 0.0.0.0 Null0

The filters are now working fine on the CE router. The BGP table is now much smaller and the CE router has only the routes we selected for it.

CE#sh ip bgp
BGP table version is 25, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 0.0.0.0          192.168.1.2                   0             0 999 i
*> 10.10.10.0/24    192.168.1.2              0             0 999 ?
*> 20.20.20.0/24    192.168.1.2              0             0 999 ?

What happens under the hood can be seen in the debug BGP updates output. The router is denying all other routes from the PE router. In our case this is not a big problem because the table is small, but with the full Internet routing table this list would be very long and the filtering CPU intensive.


The CE router generates a DENIED message for every prefix that is not destined for the routing table. Receiving and rejecting all these updates is a CPU-intensive task for the router, and this is why we should try outbound route filtering.

Outbound route filtering is a dynamic mechanism, which means it must be configured on both routers. As we have seen, the CE router is filtering the routes it receives from the PE router. With ORF in place, the CE router can send dynamic ORF messages to the PE router that tell it which updates should be sent over the peering. In other words, the CE router tells the PE router how to perform outbound filtering on its behalf.
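
The difference between filtering on the CE and filtering on the PE can be sketched with a small Python model. It only counts updates and is not a BGP implementation; the two extra PE prefixes (30.30.30.0/24 and 40.40.40.0/24) and the function names are hypothetical, while `ISP_IN` mirrors the prefix list configured on the CE.

```python
import ipaddress

# Prefix list configured on the CE.
ISP_IN = ["0.0.0.0/0", "10.10.10.0/24", "20.20.20.0/24"]

# Routes the PE could send; the last two are hypothetical extras.
PE_ROUTES = ISP_IN + ["30.30.30.0/24", "40.40.40.0/24"]

def permitted(prefix, prefix_list):
    """An entry without ge/le matches only the exact prefix and length."""
    network = ipaddress.ip_network(prefix)
    return any(network == ipaddress.ip_network(entry) for entry in prefix_list)

def exchange(pe_routes, ce_filter, orf_enabled):
    """Count what crosses the link and what the CE still has to deny."""
    if orf_enabled:
        # The PE applies the filter it learned from the CE before sending.
        sent = [route for route in pe_routes if permitted(route, ce_filter)]
    else:
        sent = list(pe_routes)
    accepted = [route for route in sent if permitted(route, ce_filter)]
    return {"sent": len(sent), "accepted": len(accepted),
            "denied_at_ce": len(sent) - len(accepted)}

print(exchange(PE_ROUTES, ISP_IN, orf_enabled=False))
print(exchange(PE_ROUTES, ISP_IN, orf_enabled=True))
```

In both cases the CE ends up with the same three routes. What changes is where the discarding happens: with ORF the unwanted updates never leave the PE.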

To implement it we can use two simple commands under the BGP process of the PE and CE routers.

CE(config-router)#neighbor 192.168.1.2 capability orf prefix-list send
PE(config-router)#neighbor 192.168.1.1 capability orf prefix-list receive

To verify the BGP neighbor capabilities of the CE router:

 AF-dependant capabilities:
    Outbound Route Filter (ORF) type (128) Prefix-list:
      Send-mode: advertised
      Receive-mode: received
  Outbound Route Filter (ORF): sent;
  Incoming update prefix filter list is ISP_IN
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               0          3 (Consumes 156 bytes)
    Prefixes Total:                 0          4
    Implicit Withdraw:              0          1
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          3
    Used as multipath:            n/a          0

We did not create a prefix filter on the PE router for the 3 routes the CE is interested in, but if we display the prefix filter received from the CE router, we can verify that the PE has the current prefix list.

PE#sh ip bgp neighbors 192.168.1.1 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 192.168.1.1: 3 entries
   seq 10 permit 0.0.0.0/0
   seq 20 permit 10.10.10.0/24
   seq 30 permit 20.20.20.0/24

The final verification is to look at the debug on the CE router once more. We should verify that ORF has reduced the DENIED messages on the CE router.


At first look at this debug, we can see that the CE router now receives only the prefixes it requested. No unwanted BGP update traffic reaches the CE router. This reduces convergence time and offloads the CPU.
If we need to add more routes to the BGP table of the CE router, we can update the inbound prefix list and trigger a route refresh that also pushes the new prefix filter.

CE#clear ip bgp 999 in prefix-filter

For further, more granular use of ORF one can look into the Cisco guide on the web.

Feel free to comment.

Tuesday, August 27, 2013

Configure BGP TTL Security

BGP TTL Security feature


The default behaviour of eBGP speakers is to send BGP messages to the peer with a TTL value of 1. There are ways to change this with neighbor statements (ebgp-multihop, for example), but in this post we are interested in security.
This small scenario with a couple of eBGP peers will let us demonstrate the security feature.
If we send BGP packets with a TTL value of 1, the router needs to be directly connected to the peer it is peering with. However, the receiving side does not check how far a packet has travelled: an attacker several hops away who is spoofing SYN packets to TCP port 179, where a BGP speaker is listening, can easily raise the initial TTL so that the packets still arrive. Many of these packets could bring down the peering and amount to a serious DoS attack. This could cause harm in a production environment.

We can prevent this by configuring (on both sides) a TTL security hop count for the eBGP neighbors. The routers then send with a TTL of 255 and accept only packets whose TTL is at least 255 minus the configured hop count. Because every router in the path decrements the TTL, it is very hard for a remote attacker to deliver a packet with a TTL that high.
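
The check itself is simple arithmetic, sketched below in Python. The acceptance rule (TTL at least 255 minus the hop count) follows the behaviour Cisco documents for ttl-security as I understand it; the function names and the attacker's distance of five routers are assumptions for the example.

```python
MAX_TTL = 255

def arrival_ttl(initial_ttl, routers_in_path):
    """Each router that forwards the packet decrements the TTL by one."""
    return max(initial_ttl - routers_in_path, 0)

def accepts_plain_ebgp(ttl):
    """Without TTL security any packet that arrives is processed."""
    return ttl > 0

def accepts_ttl_security(ttl, hops):
    """With ttl-security only packets with TTL >= 255 - hops are accepted."""
    return ttl >= MAX_TTL - hops

neighbor = arrival_ttl(MAX_TTL, routers_in_path=0)  # directly connected peer
attacker = arrival_ttl(MAX_TTL, routers_in_path=5)  # assumed remote attacker

print(accepts_ttl_security(neighbor, hops=1))  # True
print(accepts_plain_ebgp(attacker))            # True
print(accepts_ttl_security(attacker, hops=1))  # False
```

The same rule suggests why both sides must be configured: a peer that still sends with the old TTL of 1 fails the check, so the session cannot establish until its configuration matches.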

Let us try the configuration:

ISP1(config-router)#neighbor 192.168.1.2 ttl-security hops 1

And if we do a clear ip bgp * we can see in the output that the peering to the customer is not established yet.

ISP1#sh ip bgp summary
BGP router identifier 1.1.1.1, local AS number 500
BGP table version is 2, main routing table version 2
2 network entries using 234 bytes of memory
2 path entries using 104 bytes of memory
6/1 BGP path/bestpath attribute entries using 744 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1130 total bytes of memory
BGP activity 17/15 prefixes, 20/18 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.16.1.2      4   500     256     265        2    0    0 00:00:36        2
192.168.1.2     4     1     479     495        0    0    0 00:00:37 OpenSent

Now let us do it on the other side for the customer router. We can see that the hold time has expired.
*Mar  1 04:08:10.142: %BGP-3-NOTIFICATION: sent to neighbor 192.168.1.1 4/0 (hold time expired) 0 bytes


So we must configure the same TTL hop count in the customer neighbor statement.

CUSTOMER(config-router)#neighbor 192.168.1.1 ttl-security hops 1


And after a couple of seconds we have our peering up and running.

*Mar  1 04:10:39.186: %BGP-5-ADJCHANGE: neighbor 192.168.1.1 Up


So let us look at the final configs of the BGP processes.

CUSTOMER#sh running-config | section bgp
router bgp 1
 no synchronization
 bgp router-id 10.10.10.10
 bgp log-neighbor-changes
 bgp scan-time 20
 network 50.50.50.0 mask 255.255.255.0
 network 100.100.100.0 mask 255.255.255.0
 redistribute connected
 neighbor 192.168.1.1 remote-as 500
 neighbor 192.168.1.1 ttl-security hops 1
 neighbor 192.168.1.1 timers 20 60
 neighbor 192.168.1.1 advertisement-interval 15
 no auto-summary


ISP1#sh running-config | section bgp
router bgp 500
 no synchronization
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
 network 1.1.1.1 mask 255.255.255.255
 neighbor 172.16.1.2 remote-as 500
 neighbor 172.16.1.2 next-hop-self
 neighbor 192.168.1.2 remote-as 1
 neighbor 192.168.1.2 ttl-security hops 1
 neighbor 192.168.1.2 soft-reconfiguration inbound
 no auto-summary

This feature is explained further in RFC 5082.

Feel free to comment.

BGP Soft Reconfiguration Inbound

Configure soft reconfiguration inbound


When a BGP-speaking router advertises routes, the receiving BGP router updates its BGP table with them. But there are situations where a network engineer wants to apply an inbound policy to the routes that the organization is receiving. Because of the BGP protocol design, the BGP Update messages sent to peers are incremental, so if one wants to re-filter the complete set of received prefixes, one must use a hard reset or a route refresh (which some routers do not support). A hard reset of the BGP peering in a production environment is not a good thing.
That said, I will introduce a mechanism that allows us to store all the untouched NLRI (Network Layer Reachability Information) in a separate table that can be filtered later on. I have attached a small lab diagram to further elaborate the feature.

Let us take a look at the BGP table organization

Adj-RIBs-In ----> Loc-RIB ----> Adj-RIBs-Out

The Adj-RIBs-In stores UPDATE messages from other BGP speakers. These are un-edited routes received from our neighbor.  Next, our inbound policy is applied, and routes that pass through the policy & have a valid/resolvable next hop, are put into the Loc-RIB. The rest of the routes in the Adj-RIBs-In are discarded.

The Adj-RIBs-Out stores routing information that the BGP speaker will advertise to its peers (i.e. routes that have passed through outbound policies & will be sent in the BGP UPDATE messages to other peers). This is actually just a pointer back to the record in the Loc-RIB.

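Soft reconfiguration allows you to store a copy of the unmodified Adj-RIBs-In for a neighbor, so a changed inbound policy can be re-applied to it without resetting the session.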
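The relationship between the stored copy and the inbound policy can be sketched in a few lines of Python. This is only a toy model and not router code: the prefixes come from the lab, while the function names and the deny_15_net policy are made up for illustration.

```python
# Toy model of inbound soft reconfiguration (illustration only, not router code).
# Function names and the example policy are hypothetical.

def apply_inbound_policy(adj_rib_in, policy):
    """Return the routes from the stored copy that the inbound policy permits."""
    return {prefix: attrs for prefix, attrs in adj_rib_in.items()
            if policy(prefix, attrs)}

# Unmodified routes as received from neighbor 192.168.1.2 (the stored copy).
adj_rib_in = {
    "50.50.50.0/24": {"next_hop": "192.168.1.2", "as_path": [1]},
    "100.100.100.0/24": {"next_hop": "192.168.1.2", "as_path": [1]},
    "15.15.15.0/24": {"next_hop": "192.168.1.2", "as_path": [1]},
}

def permit_all(prefix, attrs):
    return True

def deny_15_net(prefix, attrs):
    # Hypothetical new policy: drop every prefix that starts with "15.".
    return not prefix.startswith("15.")

# Initial policy: everything is accepted.
loc_rib = apply_inbound_policy(adj_rib_in, permit_all)

# Policy change: re-run the filter against the stored copy.
# The neighbor does not have to resend anything and the session stays up.
loc_rib = apply_inbound_policy(adj_rib_in, deny_15_net)
print(sorted(loc_rib))
```

The stored copy is never modified by the policy, which is exactly why it costs extra memory.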

We can configure soft reconfiguration on the ISP router so that it can filter the routes received from the Customer.
First, let us confirm that inbound soft reconfiguration is not yet configured, so we cannot see the unfiltered routes.

ISP1#sh ip bgp neighbors 192.168.1.2 received-routes
% Inbound soft reconfiguration not enabled on 192.168.1.2

We use a simple command to configure it.

ISP1(config-router)#neighbor 192.168.1.2 soft-reconfiguration inbound


Now, if we run the show command for the received routes on the ISP router, we can see all the routes with their original NLRI data as sent over the BGP peering.


ISP1#sh ip bgp neighbors 192.168.1.2 received-routes
BGP table version is 28, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal, r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.10.10.10/32   192.168.1.2              0             0 1 ?
*> 15.15.15.0/24    192.168.1.2              0             0 1 ?
*> 15.15.16.0/24    192.168.1.2              0             0 1 ?
*> 15.15.17.0/24    192.168.1.2              0             0 1 ?
*> 50.50.50.0/24    192.168.1.2              0             0 1 i
*> 100.100.100.0/24 192.168.1.2              0             0 1 i
r> 192.168.1.0/30   192.168.1.2              0             0 1 ?

Total number of prefixes 7

The last NLRI has an r> sign in front of it. That tells us there is a RIB failure for that particular route: BGP could not install it in the routing table. If we run a show command for that route, we can see that the routing table already holds a route with a lower administrative distance (AD). This is the directly connected interface on the ISP1 router.

ISP1#sh ip route 192.168.1.0
Routing entry for 192.168.1.0/30, 1 known subnets
  Attached (1 connections)
C       192.168.1.0 is directly connected, FastEthernet2/0

Soft reconfiguration inbound uses a lot of memory on the router, because a second, unfiltered copy of every received route is stored. So it is better not to use this feature on every router, in every scenario.

Feel free to comment.

Tune BGP Timers

How to configure BGP timers

BGP timers are used in the peering procedure between iBGP and eBGP speaking routers. In this post I will try to explain where to use BGP timers, how to tune them, and what the benefit is. First let us see the basic timers:
  • KEEPALIVE and HOLD TIME
  • ADVERTISEMENT INTERVAL
  • SCAN-TIMER
Keepalive and hold timers are the most common in BGP peerings. With the default settings, the keepalive timer is 60 seconds and the hold timer is 3 x keepalive, or 180 seconds. Once the peering is established, the router counts up from 0 in one-second steps. Every keepalive packet the router receives from the neighbor resets the timer and the count starts again. If three keepalives in a row fail to arrive, the hold timer expires and the peering goes down. The routes learned from that neighbor are then removed and no longer advertised. We do not want that to happen often in a production environment.
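The interaction of the two timers can be sketched as follows. This is a minimal model under simple assumptions (time in whole seconds, session established at t=0); the function name session_up is made up for this example.

```python
def session_up(keepalive_arrivals, hold_time, now):
    """Return True if the hold timer has not expired at time `now`.

    keepalive_arrivals: times (seconds since the session came up at t=0)
    at which a keepalive from the neighbor was received.
    """
    last_heard = 0
    for t in sorted(keepalive_arrivals):
        if t > now:
            break
        if t - last_heard >= hold_time:
            return False  # the timer already expired before this keepalive
        last_heard = t    # each keepalive restarts the count
    return now - last_heard < hold_time

# Defaults (60/180): the neighbor goes silent after t=120.
print(session_up([60, 120], hold_time=180, now=250))  # still inside the hold time
print(session_up([60, 120], hold_time=180, now=300))  # 180 s of silence: down

# Tuned values (20/60): the same failure is detected much sooner.
print(session_up([20, 40], hold_time=60, now=100))
```

With the default timers the dead neighbor is noticed 180 seconds after its last keepalive, with the tuned timers after 60 seconds.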

Let us see a small lab diagram and test the timer tuning configs.

We can use the show command to see the output of the default BGP timer values on one of the eBGP peers. We can look at the customer router.

CUSTOMER#sh ip bgp neighbors 192.168.1.1
BGP neighbor is 192.168.1.1,  remote AS 500, external link
  BGP version 4, remote router ID 1.1.1.1
  BGP state = Established, up for 01:35:20
  Last read 00:00:20, last write 00:00:20, hold time is 180, keepalive interval is 60 seconds
  Default minimum time between advertisement runs is 30 seconds

To keep the BGP table stable, a BGP speaking router waits a minimum period between successive rounds of advertising routes to a neighbor. This period is called the advertisement interval. The default is 0 seconds for iBGP peers and 30 seconds for eBGP peers.
Service providers often agree on the BGP timers to set on both sides, depending on what services the router is carrying. In our situation we will change to smaller values to improve convergence in case of failures.
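The effect of the advertisement interval can be illustrated with a small sketch. It is a deliberately simplified model, assuming that every route change waits for the next allowed advertisement run and that runs are at least one interval apart; the function name and the sample change times are made up.

```python
def advertisement_runs(change_times, interval):
    """Return the times at which batched updates are sent to a neighbor.

    change_times: times (seconds) at which a route changed locally.
    interval: minimum number of seconds between two advertisement runs.
    """
    runs = []
    for t in sorted(change_times):
        if runs and t <= runs[-1]:
            continue  # this change rides along with the run already scheduled
        if runs:
            runs.append(max(t, runs[-1] + interval))
        else:
            runs.append(t)
    return runs

changes = [0, 1, 2, 40]
print(advertisement_runs(changes, 30))  # eBGP default interval
print(advertisement_runs(changes, 15))  # shorter interval
```

In this model the change at t=40 leaves the router at t=60 with the 30 second interval, but right away at t=40 with a 15 second interval, at the cost of more frequent updates.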

We will set the keepalive to 20 seconds and the hold timer to 60 seconds. The timers are negotiated when the session is opened, so to apply the new values we must reset the BGP peering with clear ip bgp * (note that a soft reset will not work in this case).

CUSTOMER(config-router)#neighbor 192.168.1.1 timers 20 60

CUSTOMER#sh running-config | section bgp
router bgp 1
 no synchronization
 bgp router-id 10.10.10.10
 bgp log-neighbor-changes
 network 50.50.50.0 mask 255.255.255.0
 network 100.100.100.0 mask 255.255.255.0
 neighbor 192.168.1.1 remote-as 500
 neighbor 192.168.1.1 timers 20 60
 no auto-summary

CUSTOMER#clear ip bgp *
*Mar  1 02:08:14.423: %BGP-5-ADJCHANGE: neighbor 192.168.1.1 Up

We can now verify that the settings took effect and that the current BGP peering with the eBGP router uses the shorter timers.

CUSTOMER#sh ip bgp neighbors 192.168.1.1
BGP neighbor is 192.168.1.1,  remote AS 500, external link
  BGP version 4, remote router ID 1.1.1.1
  BGP state = Established, up for 00:00:33
  Last read 00:00:13, last write 00:00:01, hold time is 60, keepalive interval is 20 seconds

Now we should tweak the advertisement timer under the BGP process. We will set the advertisement interval to 15 seconds. This is more than enough for this type of connection.

CUSTOMER(config-router)#neighbor 192.168.1.1 advertisement-interval 15

We can see from the BGP neighbor output on the Customer router that there is some advertisement activity, and that the minimum time between advertisement runs is now 15 seconds, as we told the router.

 For address family: IPv4 Unicast
  BGP table version 5, neighbor version 5/0
 Output queue size : 0
  Index 1, Offset 0, Mask 0x2
  1 update-group member
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2          2 (Consumes 104 bytes)
    Prefixes Total:                 2          2
    Implicit Withdraw:              0          0
    Explicit Withdraw:              0          0
    Used as bestpath:             n/a          2
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Bestpath from this peer:              2        n/a
    Total:                                2          0
  Number of NLRIs in the update sent: max 2, min 2
  Minimum time between advertisement runs is 15 seconds


One last timer we should include in this post is the BGP scanner. The BGP scan time defines how often the router scans the complete BGP table. The default is 60 seconds, which matters as the table grows larger: a complete Internet routing table could take up to 200 MB. The scan process checks the entries against the routing table and finds next hops whose IGP route is missing or wrong, or a better alternative for a prefix based on the BGP attributes. We can lower this time if we have enough resources on the router to do this job for us.

CUSTOMER(config-router)#bgp scan-time 20


To verify the actual scanning time we can see it in the summary output.

CUSTOMER#sh ip bgp summary
BGP router identifier 10.10.10.10, local AS number 1
BGP table version is 5, main routing table version 5
4 network entries using 468 bytes of memory
4 path entries using 208 bytes of memory
4/3 BGP path/bestpath attribute entries using 496 bytes of memory
1 BGP AS-PATH entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1196 total bytes of memory
BGP activity 16/12 prefixes, 16/12 paths, scan interval 20 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.168.1.1     4   500     178     165        5    0    0 00:11:07        2


So the final configs on our freshly tuned eBGP peer should look like this:

CUSTOMER#show running-config | section bgp
router bgp 1
 no synchronization
 bgp router-id 10.10.10.10
 bgp log-neighbor-changes
 bgp scan-time 20
 network 50.50.50.0 mask 255.255.255.0
 network 100.100.100.0 mask 255.255.255.0
 neighbor 192.168.1.1 remote-as 500
 neighbor 192.168.1.1 timers 20 60
 neighbor 192.168.1.1 advertisement-interval 15
 no auto-summary


Thanks for reading. Feel free to comment.

BGP Next-hop-Self explained

How and when to use BGP next-hop-self

I have created a small ISP scenario with a Customer and an Upstream router to demonstrate yet another command that is used often in BGP scenarios. As we all know, BGP advertises destinations, but to use those destinations BGP also relies on a next-hop value carried inside the BGP update message.
For starters let us look at the diagram. 

I have preconfigured the BGP process with the corresponding autonomous systems. In this scenario the neighbor addresses are the addresses on the simulated WAN links between the BGP speaking routers. We are not using eBGP multihop or any redistribution. This is how we will see the problem and the solution. Now let us take a look at the configs.

ISP1
interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
interface FastEthernet0/0
 ! WAN link to iBGP router
 ip address 172.16.1.1 255.255.255.252
 duplex auto
 speed auto
!
interface FastEthernet2/0
 ! Link to Customer
 ip address 192.168.1.1 255.255.255.252
 duplex auto
 speed auto
!
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
 passive-interface FastEthernet2/0
 network 172.16.1.0 0.0.0.3 area 0
!
router bgp 500
 no synchronization
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
 network 1.1.1.1 mask 255.255.255.255
 neighbor 172.16.1.2 remote-as 500
 neighbor 192.168.1.2 remote-as 1
 no auto-summary

ISP2
interface Loopback0
 ip address 2.2.2.2 255.255.255.255
!
interface FastEthernet0/0
 ! Link to iBGP neighbor
 ip address 172.16.1.2 255.255.255.252
 duplex auto
 speed auto
!
interface FastEthernet1/0
 ! Link to UPSTREAM
 ip address 192.168.3.1 255.255.255.252
 duplex auto
 speed auto
!
router ospf 1
 router-id 2.2.2.2
 log-adjacency-changes
 passive-interface FastEthernet1/0
 network 172.16.1.0 0.0.0.3 area 0
!
router bgp 500
 no synchronization
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 network 2.2.2.2 mask 255.255.255.255
 neighbor 172.16.1.1 remote-as 500
 neighbor 192.168.3.2 remote-as 100
 no auto-summary

We can now verify the bgp neighborships between the iBGP routers inside the ISP domain. Everything is ok and the prefixes are being received.

ISP1#sh ip bgp summary
BGP router identifier 1.1.1.1, local AS number 500
BGP table version is 14, main routing table version 14
5 network entries using 585 bytes of memory
5 path entries using 260 bytes of memory
5/4 BGP path/bestpath attribute entries using 620 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1513 total bytes of memory
BGP activity 8/3 prefixes, 9/4 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.16.1.2      4   500      56      57       14    0    0 00:24:42        2

The configs on the other two routers are very similar. We are only advertising the loopbacks.

CUSTOMER
interface Loopback0
 ip address 10.10.10.10 255.255.255.255
!
interface Loopback1
 ip address 100.100.100.1 255.255.255.0
!
interface Loopback2
 ip address 50.50.50.1 255.255.255.0
!
interface FastEthernet0/0
 ip address 192.168.1.2 255.255.255.252
 duplex auto
 speed auto
!
router bgp 1
 no synchronization
 bgp router-id 10.10.10.10
 bgp log-neighbor-changes
 network 50.50.50.0 mask 255.255.255.0
 network 100.100.100.0 mask 255.255.255.0
 neighbor 192.168.1.1 remote-as 500
 no auto-summary

UPSTREAM
interface Loopback0
 ip address 5.5.5.5 255.255.255.255
!
interface Loopback1
 ip address 200.200.200.1 255.255.255.0
!
interface FastEthernet0/0
 ip address 192.168.3.2 255.255.255.252
 duplex auto
 speed auto
!
router bgp 100
 no synchronization
 bgp router-id 5.5.5.5
 bgp log-neighbor-changes
 network 200.200.200.0
 neighbor 192.168.3.1 remote-as 500
 no auto-summary

We should end up with complete BGP tables on every router inside the ISP autonomous system, but that is not the case. Let us see what networks the Customer is advertising.

CUSTOMER#sh ip bgp neighbors 192.168.1.1 advertised-routes
BGP table version is 13, local router ID is 10.10.10.10
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 50.50.50.0/24    0.0.0.0                  0         32768 i
*> 100.100.100.0/24 0.0.0.0                  0         32768 i

Total number of prefixes 2

We see two prefixes from the customer. We shall now see the output from the BGP RIB table inside the second router of the ISP Autonomous system.

ISP2#sh ip bgp
BGP table version is 8, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i1.1.1.1/32       172.16.1.1               0    100      0 i
*> 2.2.2.2/32       0.0.0.0                  0         32768 i
* i50.50.50.0/24    192.168.1.2              0    100      0 1 i
* i100.100.100.0/24 192.168.1.2              0    100      0 1 i
*> 200.200.200.0    192.168.3.2              0             0 100 i

The two prefixes learned from the customer (50.50.50.0/24 and 100.100.100.0/24) are in the BGP table, but they have no > sign. This means they are not selected as best paths, so they will not be installed in the routing table and will not be advertised to the eBGP neighbor Upstream. The main problem is that the ISP2 router cannot reach the NEXT HOP address.

ISP2#sh ip route 192.168.1.2
% Network not in table

To remedy this problem we will use the next-hop-self command under the BGP process of the ISP1 router. This command tells ISP1 to rewrite the BGP next-hop attribute to its own address (172.16.1.1), which ISP2 can reach, when it advertises routes to ISP2. By default the next hop of an eBGP-learned route is preserved when the route is passed on to an iBGP peer, which is why ISP2 sees a next hop (192.168.1.2) that it cannot resolve.

ISP1(config-router)#neighbor 172.16.1.2 next-hop-self
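The failing next-hop check and the effect of next-hop-self can be modelled in a few lines of Python. This is only a sketch: the function names are made up, and the ISP2 routing table is assumed from the configs above (its connected networks only, since OSPF carries nothing but the 172.16.1.0/30 link).

```python
import ipaddress

def resolvable(next_hop, routing_table):
    """True if some prefix in the routing table covers the next hop."""
    hop = ipaddress.ip_address(next_hop)
    return any(hop in ipaddress.ip_network(prefix) for prefix in routing_table)

def advertise_to_ibgp(route, local_address, next_hop_self=False):
    """Return the route as the iBGP peer receives it (hypothetical helper)."""
    sent = dict(route)
    if next_hop_self:
        sent["next_hop"] = local_address  # rewrite the next hop to ourselves
    return sent

# Assumed ISP2 routing table: its connected networks.
isp2_table = ["2.2.2.2/32", "172.16.1.0/30", "192.168.3.0/30"]

# Route as ISP1 learned it from the Customer over eBGP.
learned = {"prefix": "50.50.50.0/24", "next_hop": "192.168.1.2"}

before = advertise_to_ibgp(learned, "172.16.1.1")
after = advertise_to_ibgp(learned, "172.16.1.1", next_hop_self=True)

print(resolvable(before["next_hop"], isp2_table))  # False: 192.168.1.2 is unknown
print(resolvable(after["next_hop"], isp2_table))   # True: 172.16.1.1 is connected
```

Only a route whose next hop resolves can become a best path, which matches the missing and the present > sign in the two ISP2 outputs.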

Now we can take a look at the BGP table of the ISP2 router.

ISP2#sh ip bgp
BGP table version is 10, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i1.1.1.1/32       172.16.1.1               0    100      0 i
*> 2.2.2.2/32       0.0.0.0                  0         32768 i
*>i50.50.50.0/24    172.16.1.1               0    100      0 1 i
*>i100.100.100.0/24 172.16.1.1               0    100      0 1 i
*> 200.200.200.0    192.168.3.2              0             0 100 i

With the control plane in place, the data plane works as well. We can use a simple ping to verify that.

ISP2#ping 100.100.100.1 source loopback 0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 100.100.100.1, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 28/48/64 ms

Thanks !!!