In the previous post I labbed a VXLAN Flood and Learn solution to let two devices in the same subnet communicate across a routed network. Today I'll upgrade that scenario by introducing a control plane: BGP EVPN.
With a control plane I no longer need Flood and Learn to discover remote MAC addresses. When a VTEP “discovers” a client, it sends the reachability info (e.g. client MAC address, advertising VTEP IP) to the other VTEPs using BGP and its AFI/SAFI “l2vpn evpn”. BUM traffic still has to be delivered, and there are two ways to manage it:
- Ingress or Head-end Replication
- Multicast
Ingress replication is the simplest option: the ingress VTEP makes a unicast copy of each BUM frame for every remote VTEP in the VNI, so the replication load and the bandwidth used grow with the number of VTEPs. To do that, every VTEP needs an up-to-date list of the remote VTEPs for each VNI. EVPN builds that list with a dedicated route type, Route-Type 3 (RT3), while MAC reachability travels in Route-Type 2 (RT2). The BGP updates themselves are reflected to every VTEP; a VTEP that doesn't have that VNI configured still receives them and simply ignores them. This mode is very simple to configure and manage (no multicast configuration is needed), but it is not very efficient!
Multicast is the most efficient way to manage BUM traffic, but it requires more configuration and some knowledge of PIM.
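To make the ingress-replication idea a bit more concrete, here is a minimal Python sketch. It is a toy model of my own, not how NX-OS implements it: the `Vtep` class, its method names and the third VTEP address 10.0.0.7 are all hypothetical. It only shows that RT3 advertisements feed a per-VNI flood list and that one unicast copy of a BUM frame is produced per remote VTEP.

```python
from dataclasses import dataclass, field


@dataclass
class Vtep:
    """Toy model of a VTEP doing ingress (head-end) replication."""
    ip: str
    # VNI -> set of remote VTEP IPs learned from Route-Type 3 advertisements
    flood_list: dict = field(default_factory=dict)

    def receive_rt3(self, vni: int, originator_ip: str) -> None:
        """Add the originator of an RT3 to the flood list of that VNI."""
        if originator_ip != self.ip:  # never replicate to ourselves
            self.flood_list.setdefault(vni, set()).add(originator_ip)

    def replicate_bum(self, vni: int, frame: bytes) -> list:
        """Return one (remote_vtep_ip, frame) unicast copy per remote VTEP."""
        return [(peer, frame) for peer in sorted(self.flood_list.get(vni, ()))]


leaf1 = Vtep(ip="10.0.0.3")
leaf1.receive_rt3(30010, "10.0.0.5")  # remote VTEP used in this lab
leaf1.receive_rt3(30010, "10.0.0.7")  # hypothetical third VTEP
copies = leaf1.replicate_bum(30010, b"arp-request")
print(len(copies))  # one copy per remote VTEP in the VNI
```

The number of copies grows linearly with the number of remote VTEPs, which is exactly why this mode is easy to run but not very efficient.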
Lab topology is:
In this lab, the Spine nodes act as BGP route reflectors (in cluster configuration) to simplify iBGP session management:
e.g. Spine1
feature bgp
router bgp 65000
  cluster-id 1
  address-family l2vpn evpn
    maximum-paths 16
  neighbor 10.0.0.3
    remote-as 65000
    update-source loopback1
    address-family ipv4 unicast
      soft-reconfiguration inbound
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
  neighbor 10.0.0.5
    remote-as 65000
    update-source loopback1
    address-family ipv4 unicast
      soft-reconfiguration inbound
    address-family l2vpn evpn
      send-community
      send-community extended
      route-reflector-client
The Leaf nodes are where I need to activate the EVPN control plane, configure BGP and adjust the NVE interface configuration for ingress replication.
EVPN activation is very straightforward: you only need to enable the necessary NX-OS features and activate it!
feature vn-segment-vlan-based
feature nv overlay
nv overlay evpn
BGP configuration is very simple but, as you can also see in the Spine configuration, you MUST enable extended communities in the l2vpn evpn address-family because they carry some key VXLAN info:
e.g. Leaf1
feature bgp
interface loopback1
  description VTEP
  ip address 10.0.0.3/32
  ip router ospf UNDERLAY area 0.0.0.0
router bgp 65000
  address-family l2vpn evpn
    maximum-paths 16
  neighbor 10.0.0.1
    remote-as 65000
    update-source loopback1
    address-family ipv4 unicast
      soft-reconfiguration inbound always
    address-family l2vpn evpn
      send-community
      send-community extended
  neighbor 10.0.0.2
    remote-as 65000
    update-source loopback1
    address-family ipv4 unicast
      soft-reconfiguration inbound always
    address-family l2vpn evpn
      send-community
      send-community extended
Finally, you need to instruct the NVE interface to use BGP to transport host reachability information and to use ingress replication for BUM traffic:
interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  global ingress-replication protocol bgp
  member vni 30010
Once the BGP sessions are established, every VTEP starts updating all the other VTEPs, sending its locally known MAC addresses and receiving the remote ones.
As you can see, there are Route-Type 2 entries carrying MAC address info and Route-Type 3 entries identifying the ingress-replication peers.
Leaf1# show bgp l2vpn evpn vni-id 30010
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 81, Local Router ID is 10.0.0.3
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup, 2 - best2
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 10.0.0.3:32777 (L2VNI 30010)
*>l[2]:[0]:[0]:[48]:[aabb.cc00.6000]:[0]:[0.0.0.0]/216
10.0.0.3 100 32768 i
*>i[2]:[0]:[0]:[48]:[aabb.cc00.7000]:[0]:[0.0.0.0]/216
10.0.0.5 100 0 i
*>l[3]:[0]:[32]:[10.0.0.3]/88
10.0.0.3 100 32768 i
*>i[3]:[0]:[32]:[10.0.0.5]/88
10.0.0.5 100 0 i
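The bracketed route keys in this output are easier to read once you split them into fields. Below is a small Python sketch that does it; the function name and the field names are my own labels, assuming the usual NX-OS layout of [type]:[ESI]:[Ethernet tag]:[MAC length]:[MAC]:[IP length]:[IP] for RT2 and [type]:[Ethernet tag]:[IP length]:[originator IP] for RT3.

```python
import re


def parse_evpn_key(key: str) -> dict:
    """Split an NX-OS style EVPN route key into named fields (assumed layout)."""
    body, _, length = key.partition("/")
    fields = re.findall(r"\[([^\]]*)\]", body)
    route_type = int(fields[0])
    if route_type == 2:
        names = ["esi", "eth_tag", "mac_len", "mac", "ip_len", "ip"]
    elif route_type == 3:
        names = ["eth_tag", "ip_len", "originator_ip"]
    else:
        raise ValueError(f"unsupported route type {route_type}")
    if len(fields) - 1 != len(names):
        raise ValueError(f"unexpected field count in {key!r}")
    parsed = dict(zip(names, fields[1:]))
    parsed["route_type"] = route_type
    parsed["prefix_len"] = int(length)
    return parsed


# Keys copied from the Leaf1 output above
rt2 = parse_evpn_key("[2]:[0]:[0]:[48]:[aabb.cc00.7000]:[0]:[0.0.0.0]/216")
rt3 = parse_evpn_key("[3]:[0]:[32]:[10.0.0.5]/88")
print(rt2["mac"], rt3["originator_ip"])
```

In this lab the RT2 carries only a MAC address (IP length 0), which is what you would expect from a pure Layer 2 VNI.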
Digging into the specific remote MAC address entry, you can see a lot of useful information, such as:
- L2VNI value
- Route-type
- Mac-address
- Available path with the proper next-hop
- Spine that sent the NLRI
- Received label (VNID)
- Extended-community: L2VNI Route-target (65000:30010) and encapsulation type (8 = VXLAN)
Note: the VTEP also receives “from the fabric” info about host mobility between VTEPs through the MAC mobility sequence number, carried as an extended community (the highest number identifies the most recent advertisement). When a host moves, the source VTEP is therefore always kept up to date on where a remote client is located and how to reach it.
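The sequence-number logic in the note can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `MacAdvert` type and the move of the client to Leaf1 are hypothetical, and the tie-break on the lowest VTEP IP is how I read RFC 7432, not something verified in this lab.

```python
import ipaddress
from typing import NamedTuple


class MacAdvert(NamedTuple):
    mac: str
    vtep_ip: str
    seq: int  # MAC mobility sequence number (0 when the community is absent)


def pick_current_location(adverts: list) -> MacAdvert:
    """Prefer the highest sequence number; break ties on the lowest VTEP IP."""
    return min(
        adverts,
        key=lambda a: (-a.seq, int(ipaddress.ip_address(a.vtep_ip))),
    )


adverts = [
    MacAdvert("aabb.cc00.7000", "10.0.0.5", 0),  # original location in this lab
    MacAdvert("aabb.cc00.7000", "10.0.0.3", 1),  # hypothetical move to Leaf1
]
best = pick_current_location(adverts)
print(best.vtep_ip)
```

A real implementation also has to deal with duplicate MAC detection when a host appears to bounce between VTEPs too often; that part is left out here.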
Leaf1# show bgp l2vpn evpn aabb.cc00.7000
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 10.0.0.3:32777 (L2VNI 30010)
BGP routing table entry for [2]:[0]:[0]:[48]:[aabb.cc00.7000]:[0]:[0.0.0.0]/216, version 77
Paths: (1 available, best #1)
Flags: (0x000212) (high32 00000000) on xmit-list, is in l2rib/evpn, is not in HW
Multipath: eBGP
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop, in rib
Imported from 10.0.0.5:32777:[2]:[0]:[0]:[48]:[aabb.cc00.7000]:[0]:[0.0.0.0]/216
AS-Path: NONE, path sourced internal to AS
10.0.0.5 (metric 81) from 10.0.0.1 (10.0.0.1)
Origin IGP, MED not set, localpref 100, weight 0
Received label 30010
Extcommunity: RT:65000:30010 ENCAP:8
Originator: 10.0.0.5 Cluster list: 0.0.0.1
Path-id 1 not advertised to any peer
Route Distinguisher: 10.0.0.5:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[aabb.cc00.7000]:[0]:[0.0.0.0]/216, version 80
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
Multipath: eBGP
Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
AS-Path: NONE, path sourced internal to AS
10.0.0.5 (metric 81) from 10.0.0.2 (10.0.0.2)
Origin IGP, MED not set, localpref 100, weight 0
Received label 30010
Extcommunity: RT:65000:30010 ENCAP:8
Originator: 10.0.0.5 Cluster list: 0.0.0.1
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop
Imported to 1 destination(s)
Imported paths list: L2-30010
AS-Path: NONE, path sourced internal to AS
10.0.0.5 (metric 81) from 10.0.0.1 (10.0.0.1)
Origin IGP, MED not set, localpref 100, weight 0
Received label 30010
Extcommunity: RT:65000:30010 ENCAP:8
Originator: 10.0.0.5 Cluster list: 0.0.0.1
Path-id 1 not advertised to any peer
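The Route Distinguisher and Route-Target values in this output are not random. Assuming the L2VNI uses `rd auto` and `route-target auto` (a configuration this post doesn't show), my understanding is that NX-OS derives the RD as router-id:(32767 + VLAN ID) and the RT as ASN:VNI. The helper names below are hypothetical; the sketch just checks that rule against the values seen above.

```python
def auto_rd(router_id: str, vlan_id: int) -> str:
    """Assumed 'rd auto' rule for an L2VNI: router-id:(32767 + VLAN ID)."""
    return f"{router_id}:{32767 + vlan_id}"


def auto_rt(asn: int, vni: int) -> str:
    """Assumed 'route-target auto' rule: ASN:VNI."""
    return f"{asn}:{vni}"


print(auto_rd("10.0.0.3", 10))  # Leaf1, VLAN 10
print(auto_rd("10.0.0.5", 10))  # remote leaf, VLAN 10
print(auto_rt(65000, 30010))    # AS 65000, VNI 30010
```

The results match the `10.0.0.3:32777`, `10.0.0.5:32777` and `RT:65000:30010` values in the output, which is consistent with the assumption. Because every leaf derives the same RT for the same VNI, the import on Leaf1 works without any manual route-target configuration.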
Once the VTEP receives the reachability info, it populates its own l2route table (aka the VXLAN MAC address table) with both local and remote entries and then updates the MAC address table:
Leaf1# show l2route evpn mac evi 10
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete (D):Del Pending
(S):Stale (C):Clear, (Ps):Peer Sync (O):Re-Originated (Nho):NH-Override
(Pf):Permanently-Frozen, (Orp): Orphan
Topology Mac Address Prod Flags Seq No Next-Hops
----------- -------------- ------ ------------- ---------- ---------------------------------------
10 aabb.cc00.6000 Local L, 0 Eth1/3
10 aabb.cc00.7000 BGP Rcv 0 10.0.0.5 (Label: 30010)
Leaf1# show mac address-table
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False, C - ControlPlane MAC, ~ - vsan
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 10 aabb.cc00.6000 dynamic 0 F F Eth1/3
C 10 aabb.cc00.7000 dynamic 0 F F nve1(10.0.0.5)
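The relationship between the two tables can be pictured with a short Python sketch. It is a simplification of my own (the dictionary layout and the `build_mac_table` name are hypothetical): locally learned entries point to a physical port, while BGP-learned entries point to the NVE interface with the remote VTEP as next hop.

```python
def build_mac_table(l2rib: list) -> dict:
    """Map (vlan, mac) to an egress port from simplified L2RIB entries."""
    table = {}
    for entry in l2rib:
        if entry["producer"] == "Local":
            port = entry["next_hop"]              # physical interface
        else:
            port = f"nve1({entry['next_hop']})"   # remote VTEP behind the NVE
        table[(entry["vlan"], entry["mac"])] = port
    return table


# Entries transcribed from the 'show l2route evpn mac evi 10' output above
l2rib = [
    {"vlan": 10, "mac": "aabb.cc00.6000", "producer": "Local", "next_hop": "Eth1/3"},
    {"vlan": 10, "mac": "aabb.cc00.7000", "producer": "BGP", "next_hop": "10.0.0.5"},
]
mac_table = build_mac_table(l2rib)
print(mac_table[(10, "aabb.cc00.7000")])
```

The lookup for the remote client returns the same `nve1(10.0.0.5)` port that the switch shows in its MAC address table.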
Finally, the magic happens:
Client1-Vlan10#ping 192.168.10.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.10.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 9/10/12 ms
Client2-Vlan10#ping 192.168.10.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.10.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 9/10/14 ms
In the next post I will cover VXLAN BGP EVPN with multicast for BUM traffic management, and I also want to introduce the concept of vPC in the lab. See you!