2014年3月3日星期一

Why the Virtual Switches Need Bpdu Guard

I have receive the question as below:
What happens if a VM running within a vSphere host sends a BPDU? Will it get dropped by the vSwitch or will it be sent to the physical switch (potentially triggering BPDU guard)?

Last, I got the answer from visibly harassed Kurt (@networkjanitor) Bales during the Networking Tech Field Day; one of his customers has managed to do just that.


Here’s a sketchy overview of what was going on: they were running a Windows VM inside his VMware infrastructure, decided to configure bridging between a vNIC and a VPN link, and the VM started to send BPDUs through the vNIC. vSwitch ignored them, but the physical switch didn’t – it shut down the port, cutting a number of VMs off the network.
Best case, BPDU guard on the physical switch blocks but doesn’t shut down the port – all VMs pinned to that link get blackholed, but the damage stops there. More often BPDU guard shuts down the physical port (the reaction of BPDU guard is vendor/switch-specific), VMs using that port get pinned to another port, and the misconfigured VM triggers BPDU guard on yet another port, until the whole vSphere host is cut off from the rest of the network. Absolutely worst case, you’re running VMware High Availability, the vSphere host triggers isolation response, and the offending VM is restarted on another host (eventually bringing down the whole cluster).

There is only one good solution to this problem: implement BPDU guard on the virtual switch. Unfortunately, no virtual switch running in VMware environment implements BPDU guard.

Nexus 1000V seems to offer a viable alternative. It has an implicit BPDU filter (you cannot configure it) that would block the BPDUs coming from a VM, but that only hides the problem – you could still get forwarding loops if a VM bridges between two vNICs connected to the same LAN. However, you can reject forged transmits (source-MAC-based filter, a standard vSphere feature) to block bridged packets coming from a VM.

Lacking Nexus 1000V, you can use a virtual firewall (example: vShield App) that can filter layer-2 packets based on ethertype. Yet again, you should combine that with rejection of forged transmits.

In theory, an interesting approach might be to use VM-FEX. A VM using VM-FEX is connected directly to a logical interface in the upstream switch and the BPDU guard configured on that interface would shut down just the offending VM. Unfortunately, I can’t find a way to configure BPDU guard in UCS Manager.

Other alternatives to BPDU guard in a vSwitch range from bad to worse:

Disable BPDU guard on the physical Cisco Switch. You’ve just moved the problem from access to distribution layer (if you use BPDU guard there) ... or you’ve made the whole layer-2 domain totally unstable, as any VM could cause STP topology change.

Enable BPDU filter on the physical switch. Even worse – if someone actually manages to configure bridging between two vNICs (or two physical NICs in a Hyper-V host), you’re toast; BPDU filter causes the physical switch to pretend the problem doesn’t exist.

Enable BPDU filter on the physical switch and reject forged transmits in vSwitch. This one protects against bridging within a VM, but not against physical server misconfiguration. If you’re absolutely utterly positive all your physical servers are vSphere hosts, you can use this approach (vSwitch has built-in loop prevention); if there’s a minimal chance someone might connect bare-metal server or a Hyper-V/XenServer host to your network, don’t even think about using BPDU filter on the physical switch.


Summary

BPDU filter available in Nexus 1000V or ethertype-based filters available in virtual firewalls can stop the BPDUs within the vSphere host (and thus protect the physical links). If you combine it with forged transmit rejection, you get a solution that protects the network from almost any VM-level misconfiguration.

However, I would still prefer a proper BPDU guard with logging in the virtual switches for several reasons:

BPDU filter just masks the configuration problem;
If the vSwitch accepts forged transmits, you could get an actual forwarding loop;
While the solution described above does protect the network on Catalyst 2960 , it also makes troubleshooting a lot more obscure – a clear logging message in vCenter telling the virtualization admin that BPDU guard has shut down a vNIC would be way better;
Last but definitely not least, someone just might decide to change the settings and accept forged transmits (with potentially disastrous results) while troubleshooting a customer problem @ 2AM.

Notes: There are more comments of discussing Virtual Switches Need Bpdu Guard in the original page http://blog.ipspace.net/2011/11/virtual-switches-need-bpdu-guard.html

没有评论:

发表评论