In the last few weeks I have worked on a sample project about a VXLAN EVPN DCI, since a customer asked to dismiss their legacy L2 trunk DCI (Hurray!). So I started by collecting requirements from the customer, something like:
- Redundancy
- Fast failover
- Simplicity (after all, they need to manage the solution provided)
and after talking together about some design considerations, I finally decided to create a sample lab in order to test the possible final solution.
What are the main design considerations that drove me to the final target?
1 - Redundancy
In my opinion, redundancy in a datacenter is almost synonymous with vPC (virtual port-channel), so I decided to use two different Nexus switches peered in a vPC domain per site. This gives me hardware redundancy and data-plane redundancy; in fact, the next design step is to use distributed LACP link-aggregation (one link per Nexus) to connect the border gateways to the LAN devices (e.g. the core switch). With this solution in mind I can achieve physical redundancy (if one link goes down, the other keeps forwarding) and increase the available bandwidth.
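To give an idea, here is a minimal vPC sketch for one site; all the numbers (domain ID, port-channels, interfaces, keepalive addresses) are assumptions for the lab, not a final configuration:

```
feature vpc
feature lacp

! vPC domain between the two border gateways of the site
vpc domain 1
  peer-keepalive destination 10.0.0.2 source 10.0.0.1 vrf management
  peer-gateway

! vPC peer-link between the two Nexus
interface port-channel10
  switchport mode trunk
  vpc peer-link

! distributed LACP leg towards the LAN core (one member link per Nexus)
interface port-channel20
  switchport mode trunk
  vpc 20

interface Ethernet1/1
  description to LAN core
  switchport mode trunk
  channel-group 20 mode active
```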
On the routing side, since I have more than one link between the sites, I can use ECMP (equal-cost multipath) to route the traffic (underlay and overlay) across multiple links.
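Assuming, for example, an OSPF underlay (the underlay choice will be covered in the next post), ECMP only needs the maximum-paths knob; the process tag and value below are placeholders:

```
! allow up to 4 equal-cost paths in the underlay routing table
router ospf UNDERLAY
  maximum-paths 4
```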
Depending on your LAN deployment, take care of the spanning-tree configuration. If the new border gateways can create an L2 loop, they should be set as root bridge; otherwise one "leg" from your LAN to the border gateways could be put in blocking state by the protocol.
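A minimal sketch of forcing the border gateways to win the root-bridge election; the VLAN range and priority are assumptions to adapt to your LAN:

```
! lowest priority wins the root-bridge election
spanning-tree vlan 1-3967 priority 0
```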
2 - Fast failover
Using LACP for the physical connections, I can speed up link-fault detection with LACP fast rate (LACP queries the peer every 1 second instead of the default 30s); this way a single physical fault is detected quickly and the traffic is redirected to the healthy link(s).
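On NX-OS the fast rate is set on the physical member interfaces; the interface and channel-group below follow the hypothetical numbering of the earlier sketch:

```
interface Ethernet1/1
  channel-group 20 mode active
  ! request LACPDUs from the peer every 1 second instead of every 30
  lacp rate fast
```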
Within the vPC domain, IP ARP synchronization also helps with fast failover because it maintains a synchronized ARP table between the vPC peers. In case of a hardware failure on the primary Nexus, the secondary one doesn't need to re-learn anything.
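Enabling it is a one-liner inside the vPC domain (domain 1 follows the assumed numbering used so far):

```
vpc domain 1
  ! keep the ARP table synchronized between the two vPC peers
  ip arp synchronize
```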
Regarding routing protocol failover, I decided to dramatically improve it by using BFD, without reducing the protocol timers and consequently burdening the CPU.
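A minimal BFD sketch, assuming an OSPF underlay and a BGP EVPN overlay; process tag, AS number and neighbor address are placeholders:

```
feature bfd

! underlay: enable BFD on all OSPF-enabled interfaces
router ospf UNDERLAY
  bfd

! overlay: enable BFD towards the remote EVPN peer
router bgp 65001
  neighbor 10.255.255.2
    bfd
```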
3 - Simplicity
Since I need to provide the simplest solution possible, I decided to use ingress replication as the BUM traffic replication method. In this mode I can reduce complexity by avoiding multicast configuration and RP placement; in addition, I only have a few remote VTEPs and high-capacity links between the sites, so it doesn't matter that this solution is less efficient and generates some extra traffic (every ingress VTEP replicates the packet and forwards a copy to all the peer VTEPs).
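On the NVE interface this translates to choosing BGP-based ingress replication per VNI; the interface, loopback and VNI numbers are assumptions for the lab:

```
interface nve1
  host-reachability protocol bgp
  source-interface loopback1
  ! replicate BUM traffic to the remote VTEPs learned via BGP EVPN
  member vni 10100
    ingress-replication protocol bgp
```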
Once the border gateways are online, creating a new VLAN, propagating it to the correct uplinks and adding the L2VNI will be the only manual configuration the network admin has to do!
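That day-2 workflow could look like the sketch below (VLAN 200 and VNI 10200 are just examples):

```
! map the new VLAN to its L2VNI
vlan 200
  vn-segment 10200

! extend it over the DCI
interface nve1
  member vni 10200
    ingress-replication protocol bgp

! advertise it in EVPN
evpn
  vni 10200 l2
    rd auto
    route-target import auto
    route-target export auto
```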
N.B: VXLAN permits DC isolation since you run an L3 protocol between the border gateways, but L2 loops are still possible! In order to protect your deployment from broadcast storms you must use the storm-control feature inside your VXLAN EVPN configuration.
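For example, classic storm control can be applied on the LAN-facing vPC leg; the 1% thresholds below are arbitrary and must be tuned to your traffic profile:

```
interface port-channel20
  ! drop broadcast/multicast traffic exceeding 1% of the link bandwidth
  storm-control broadcast level 1.00
  storm-control multicast level 1.00
```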
Here is the final topology I achieved:
In the next post I will focus on the underlay configuration and the vPC domain.