After upgrading our OpenVPN server VM from Debian 7 to Debian 8 (which took us from OpenVPN 2.2 to 2.3 and from Linux kernel 3.2 to 3.16), upgrading our virtualization from VMware ESXi 5.5 to ESXi 6.0, and moving the VM to a different host, the VPN became really unreliable. The VPN connection itself worked fine, but any connection made across the VPN was very slow to establish. Once a connection was up, everything worked fine, and new connections to the same host across the VPN were even established quickly.
I wasn’t sure which of the many changes caused the issue, but luckily Wireshark quickly revealed the problem. As we are using OpenVPN in layer 2 mode (i.e. with tap interfaces), ARP packets are quite important. While I could see the ARP requests making it across the bridge from tap0 to eth0, the ARP replies arrived on eth0 but never made it to tap0. The server-side fix is easy: disable the MAC table on the bridge completely so that it simply lets all packets pass:
brctl setageing br0 0
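To confirm that the change took effect, the ageing time can be read back from sysfs. This is a minimal sketch, assuming the bridge is named br0 as above. The function name `bridge_ageing` and the `SYSFS_NET` override are made up for this example and are not part of bridge-utils:

```shell
# Print a bridge's ageing time as the kernel reports it.
# SYSFS_NET is a hypothetical override (it defaults to the real sysfs path)
# so the function can be pointed at a test directory instead.
bridge_ageing() {
    cat "${SYSFS_NET:-/sys/devices/virtual/net}/$1/bridge/ageing_time"
}

# Usage on the VPN server: bridge_ageing br0
```

After the brctl command above, this should report 0.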
Now that ARP was working, I noticed that VPN clients also did not get IPv6 addresses. Evidently, the ICMPv6 multicasts weren’t making it across the bridge either. To fix that, enable the multicast querier on the bridge:
echo 1 > /sys/devices/virtual/net/br0/bridge/multicast_querier
Update March 2016: A recent kernel update in Debian Jessie appears to have changed the multicast bridging behavior. I now need to switch this setting back off:
echo 0 > /sys/devices/virtual/net/br0/bridge/multicast_querier
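Since the right value evidently depends on the kernel version, it can help to look at what the bridge is currently doing before flipping anything. The sketch below prints the two multicast switches the bridge exposes in sysfs, `multicast_snooping` and `multicast_querier`. The function name `bridge_mcast_state` and the `SYSFS_NET` override are hypothetical and exist only for illustration:

```shell
# Print the multicast snooping and querier flags of a bridge (1 = on, 0 = off).
# SYSFS_NET is a hypothetical override; it defaults to the real sysfs path.
bridge_mcast_state() {
    dir="${SYSFS_NET:-/sys/devices/virtual/net}/$1/bridge"
    for flag in multicast_snooping multicast_querier; do
        printf '%s=%s\n' "$flag" "$(cat "$dir/$flag")"
    done
}

# Usage on the VPN server: bridge_mcast_state br0
```

Running it before and after a kernel update makes it easier to tell whether a default changed or only the forwarding behavior did.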