When the router for a FreeBSD guest on KVM is itself a FreeBSD guest on
the same KVM host and uses the virtio network driver from
virtio_kmod, ping works between guests on different subnets, but
no userland network daemons respond. If I switch to the e1000
driver on the router and change nothing else, everything works correctly.
How-To-Repeat:
I created three FreeBSD guests on one Linux KVM host. I am using bridged
networking on the KVM host, as br0 and br1. One of the guests has two
network interfaces and acts as a router between two subnets, as follows:
router1: br0, 192.168.1.1; br1, 192.168.2.1
client1: br0, 192.168.1.100; default route 192.168.1.1
client2: br1, 192.168.2.100; default route 192.168.2.1
I configured virtio network interfaces on all three guests. I enabled
forwarding on router1, but no packet filtering. No NAT is in use.
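For anyone reproducing this, the router1 setup described above might look roughly like the following /etc/rc.conf fragment. This is a sketch only: the interface names vtnet0 and vtnet1 and the /24 netmasks are assumptions, since the report gives only the addresses and bridges.

```sh
# /etc/rc.conf on router1 (sketch; vtnet0/vtnet1 and the /24 masks are assumed)
ifconfig_vtnet0="inet 192.168.1.1 netmask 255.255.255.0"   # attached to br0
ifconfig_vtnet1="inet 192.168.2.1 netmask 255.255.255.0"   # attached to br1
gateway_enable="YES"   # turns on IPv4 forwarding at boot; no pf/ipfw, no NAT
```

The clients only need their own address and a defaultrouter entry pointing at the matching router1 address.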
* client1 can ping client2, and vice versa.
* ssh works from router1 to client1 and vice versa, and from router1
to client2 and vice versa.
* ssh from client1 to client2 will fail (and vice versa); the client
simply hangs indefinitely while trying to connect.
* tcpdump on client2 will show that the SYN is arriving at client2
port 22, but client2 never replies, nor generates any debug or log
output that suggests it ever saw the connection attempt.
* any other userland network service I try (both tcp and udp) will
show the same thing -- packets arrive at client2 from client1, but
the daemon seems to never see them. Since ping works, I know the
kernel is getting them.
* If I switch back to the e1000 driver on router1, but make no other
changes, and make no changes at all to client1 and client2, then
ssh will work properly from client1 to client2 and the problem is resolved.
* If I let router1 continue to use virtio interfaces, but move router1
onto a different KVM host -- so that the traffic from client1 to client2
must leave the KVM host via the bridged interface and then return on a
different interface -- then ssh will work properly from client1 to
client2 and the problem is resolved.
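One way to narrow down the symptoms above is to let tcpdump verify checksums on client2: with -vv it prints "cksum 0x... (incorrect -> 0x...)" for segments whose TCP checksum does not match. The helper below is a hypothetical convenience (count_bad_cksum is not an existing tool) that counts such lines in captured output; the interface name in the usage comment is an assumption.

```sh
# count_bad_cksum: read `tcpdump -vv` text on stdin and print the number of
# lines that tcpdump flagged with an incorrect checksum.
# Usage on client2 (vtnet0 is an assumed interface name):
#   tcpdump -ni vtnet0 -vv -c 20 'tcp port 22' | count_bad_cksum
count_bad_cksum() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status clean in that case.
    grep -c 'cksum 0x[0-9a-f]* (incorrect' || true
}
```

A non-zero count for the forwarded SYNs, together with working ping, would be consistent with the kernel discarding the segments before any daemon sees them.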
KVM guests: FreeBSD 9
KVM host: Ubuntu 11.10
Over to maintainer (via the GNATS Auto Assign Tool)
Is this something you may be acquainted with, as the virtio maintainer?
Do you have any recommendations?
Please accept my apologies if this isn't something for you...
I wanted to confirm that this bug is present in FreeBSD 8.3-RELEASE_p9
running on SmartOS KVM (a different implementation from the Linux KVM).
Fantastic, thanks for your quick response.
Jeffrey, does disabling checksum offloading work for you?
This did not resolve my issue.
Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
Guest: FreeBSD 9.2-R amd64
Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both
interfaces (this is a router) solves the issue, but performance becomes
terrible (150 KB/sec uses 100% CPU on host).
vtnet interfaces are, Host side, bridged to VLANs.
Problem does not appear if the traffic is to/from the router itself. Only
forwarded traffic is a problem.
Can provide more info/feedback if needed.
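To make the workaround above survive a reboot, the flags can be appended to the interface lines in /etc/rc.conf. The sketch below reuses the addresses from the original report purely as placeholders; the interface names, addresses and netmasks are assumptions and need to match the actual router.

```sh
# /etc/rc.conf on the router (sketch; names and addresses are placeholders)
ifconfig_vtnet0="inet 192.168.1.1 netmask 255.255.255.0 -rxcsum -txcsum"
ifconfig_vtnet1="inet 192.168.2.1 netmask 255.255.255.0 -rxcsum -txcsum"

# Alternative, set in /boot/loader.conf instead of rc.conf: vtnet(4)
# documents a tunable that disables checksum offload for all vtnet
# interfaces. Check the man page on your release before relying on it.
#   hw.vtnet.csum_disable="1"
```

Either form carries the same performance cost described above, so it is a workaround rather than a fix.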
Phil Regnauld (regnauld) writes:
> Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
> KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
> Guest: FreeBSD 9.2-R amd64
> Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both
> interfaces (this is a router) solves the issue, but performance becomes
> terrible (150 KB/sec uses 100% CPU on host).
> vtnet interfaces are, Host side, bridged to VLANs.
> Problem does not appear if the traffic is to/from the router itself. Only
> forwarded traffic is a problem.
> Can provide more info/feedback if needed.
Same problem has been observed with 10.0-RC4.
kern/166645 may be related.
This is causing FreeBSD (and pfSense) to be unusable as a network
appliance / router on KVM platforms.
Still present on 10.1.
Ubuntu 14.04 KVM hypervisor
A FreeBSD 10.1 gateway sits between the outside world and three networks.
The FreeBSD gateway has a VYOS router as its default gateway, which NATs its traffic.
When I remove txcsum and rxcsum from the interface, the packets do not get corrupted.
The VYOS router blocks INVALID packets. ICMP packets are not malformed, while TCP packets are.
I had the same issue with OpenBSD 5.7, and it was fixed in -current (5.8).
Examples of the setup rc.conf and pf rules at:
(In reply to elico from comment #9)
My testing topology:
I've seen the same issue on a Linux KVM guest (with a FreeBSD virtio guest as the router, with TSO), worked around by:
pre-up /sbin/ethtool --offload eth0 tx off
so I am curious as to how this is identified as a FreeBSD bug. It seems more like something within the KVM stack.
Ubuntu 14.04.3 LTS
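To confirm that a workaround like the ethtool line above took effect on a Linux guest, the offload state can be read back with `ethtool --show-offload eth0`. The small filter below is a hypothetical helper (tx_csum_state is not part of ethtool) that pulls out the top-level tx-checksumming value from that output; eth0 is the interface name used in the comment above.

```sh
# tx_csum_state: read `ethtool --show-offload <if>` output on stdin and
# print the value of the top-level "tx-checksumming" line (e.g. on/off).
# Usage:
#   ethtool --show-offload eth0 | tx_csum_state
tx_csum_state() {
    # Sub-features such as tx-checksum-ipv4 are tab-indented in ethtool's
    # output, so an exact match on the first field skips them.
    awk -F': *' '$1 == "tx-checksumming" { print $2 }'
}
```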
Seems to also be the same issue here:
However it's not a PF issue as ipfw kernel nat also did the same.
(In reply to amvandemore from comment #11)
My assumption is that if it works with Linux and with many versions of OpenBSD, then it's not a hypervisor issue.
I do not know exactly what changed or how, but the OpenBSD virtio changes may help to explain it.
Note that it affects only routing/gateway mode and not regular traffic, so it is something specific: it is not related to the firewall at all, but to the gateway/routing code.
I do not "need" this since my systems work fine with e1000 and with Linux hosts, but I was wondering if there is any progress with it?
(In reply to elico from comment #12)
I also doubt that this is a KVM issue, as there seem to be similar problems (forwarding TCP packets) on Xen.
(In reply to Sydney Meyer from comment #14)
I had a chance to verify the subject with a special setup that includes Debian, and found that the issue occurs only in a specific scenario:
The KVM hypervisor hosts two VMs that share the same interface, such as a bridge.
The bug is that the hypervisor virtio driver doesn't write a checksum for packets which are directed towards an internal interface.
The hypervisor should either write the checksum or the VM should not check it.
The issue is partly in the driver and partly in the hypervisor.
I am now running on both CentOS and Ubuntu KVM hypervisors, which share the same issue.
The solution in my case was to use the iptables checksum-fill option on the gateway machine.
The first step is to allow DHCP traffic to pass between VMs, so that the DHCP client (ISC) does not drop the packets, using:
iptables -A POSTROUTING -t mangle -p udp --dport 68 -j CHECKSUM --checksum-fill
I will try to test with FreeBSD 11, since with OpenBSD 5.X it didn't work but with 5.Y (tip) it was working fine.
This has been a long-standing and unfortunate issue. My memory is somewhat fuzzy, but generally speaking the host doesn't need to compute a checksum because delivery is basically just a memory copy into the guest. However, FreeBSD doesn't have a flag (at least it didn't at the time I was originally working on the VirtIO drivers) to denote "recompute this checksum if forwarding the packet".
(In reply to Bryan Venteicher from comment #16)
OK, so the test environment should be as follows:
- 1 KVM hypervisor (CentOS 7)
- 1 VYOS EDGE GW(eth0=192.168.89.200/24, eth1=192.168.7.254/24, GW=192.168.89.1)
- 1 FreeBSD (10.3+11.0) GW for 2 networks(eth0=192.168.7.1/24,eth1=192.168.6.254/24,eth2=192.168.7.254/24, GW=192.168.7.254/23)
- Windows+Linux+FreeBSD clients on networks 192.168.6.0/24+192.168.7.0/24 with GW 192.168.6.254 or 192.168.7.254
The expected result should be a working connection from the end user(Win\Lin\BSD) to the local networks and Internet.