When the router for a FreeBSD guest on KVM is also a FreeBSD guest on the same KVM host, and which is using the virtio network driver from virtio_kmod, ping will work between guests on different subnets, but no userland network daemons will respond. If I switch to the e1000 driver on the router, but change nothing else, everything works correctly.

Fix: Unknown.

How-To-Repeat:

I created three FreeBSD guests on one Linux KVM host. I am using bridged networking on the KVM host, as br0 and br1. One of the guests has two network interfaces and acts as a router between two subnets, as follows:

  router1: br0, 192.168.1.1; br1, 192.168.2.1
  client1: br0, 192.168.1.100; default route 192.168.1.1
  client2: br1, 192.168.2.100; default route 192.168.2.1

I configured virtio network interfaces on all three hosts. I enabled forwarding on router1, but no packet filtering. No NAT is in use.

Result:

* client1 can ping client2, and vice versa.
* ssh works from router1 to client1 and vice versa, and from router1 to client2 and vice versa.
* ssh from client1 to client2 will fail (and vice versa); the client simply hangs indefinitely while trying to connect.
* tcpdump on client2 will show that the SYN is arriving at client2 port 22, but client2 never replies, nor generates any debug or log output that suggests it ever saw the connection attempt.
* any other userland network service I try (both tcp and udp) will show the same thing -- packets arrive at client2 from client1, but the daemon seems to never see them. Since ping works, I know the kernel is getting them.
* If I switch back to the e1000 driver on router1, but make no other changes, and make no changes at all to client1 and client2, then ssh will work properly from client1 to client2 and the problem is resolved.
* If I let router1 continue to use virtio interfaces, but move router1 onto a different KVM host -- so that the traffic from client1 to client2 must leave the KVM host via the bridged interface and then return on a different interface -- then ssh will work properly from client1 to client2 and the problem is resolved.

KVM guests: FreeBSD 9
virtio-kmod: 0.228301
KVM host: Ubuntu 11.10
qemu-kvm: 0.14.1
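For anyone re-testing the symptom described in the report above, the following is a minimal sketch of a TCP probe that distinguishes "connected", "refused", and "hangs until timeout". The function name `probe_tcp` and the 5-second default timeout are illustrative assumptions, not part of the original report; the addresses in the usage comment are the ones from the How-To-Repeat section.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify one TCP connect attempt as 'open', 'refused', 'timeout', or an error."""
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with RST: the packet was seen, nothing is listening.
        return "refused"
    except socket.timeout:
        # No answer at all: matches the "client simply hangs" symptom.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, unreachable network, ...).
        return f"error: {exc}"


# Hypothetical usage from client1 in the reported topology:
#   print(probe_tcp("192.168.2.100", 22))
```

If the behaviour matches the report, a probe run on client1 against client2 would be expected to return "timeout" with virtio on router1 and "open" with e1000, while a probe run on router1 itself returns "open" in both cases.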
Responsible Changed From-To: freebsd-bugs->freebsd-ports-bugs reclassify.
Responsible Changed From-To: freebsd-ports-bugs->kuriyama Over to maintainer (via the GNATS Auto Assign Tool)
Responsible Changed From-To: kuriyama->bryanv Hi Bryan, Is this something you may be acquainted with, as the virtio maintainer? Do you have any recommendations? Please accept my apologies if this isn't something for you... Chris
I wanted to confirm that this bug is present in FreeBSD 8.3-RELEASE_p9 running on SmartOS KVM (a different implementation from the Linux KVM).
State Changed From-To: open->feedback Fantastic, thanks for your quick response. Jeffrey, does disabling checksum offloading work for you?
This did not resolve my issue. Thanks, Jeff
Env:
  Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
  KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
  Guest: FreeBSD 9.2-R amd64

Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both interfaces (this is a router) solves the issue, but performance becomes terrible (150 KB/sec uses 100% CPU on host).

vtnet interfaces are, Host side, bridged to VLANs.

Problem does not appear if the traffic is to/from the router itself. Only forwarded traffic is a problem.

Can provide more info/feedback if needed.
Phil Regnauld (regnauld) writes:
> Env:
> Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
> KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
> Guest: FreeBSD 9.2-R amd64
>
> Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both
> interfaces (this is a router) solves the issue, but performance becomes
> terrible (150 KB/sec uses 100% CPU on host).
>
> vtnet interfaces are, Host side, bridged to VLANs.
>
> Problem does not appear if the traffic is to/from the router itself. Only
> forwarded traffic is a problem.
>
> Can provide more info/feedback if needed.

Same problem has been observed with 10.0-RC4.

kern/166645 may be related.

This is causing FreeBSD (and pfSense) to be unusable as a network appliance / router on KVM platforms.

Phil
Still present on 10.1.

Environment:
  Ubuntu 14.04 KVM hypervisor
  A FreeBSD 10.1 gateway between the outside world and three networks.
  The FreeBSD gateway uses a VyOS router as its default gateway, and the VyOS router NATs its traffic.

When I disable txcsum and rxcsum on the interface, the packets no longer get corrupted. The VyOS router blocks INVALID packets. ICMP packets are not malformed, while TCP packets are.

I had the same issue with OpenBSD 5.7 and it got fixed in -current (5.8).

Examples of the setup rc.conf and pf rules at: http://wiki.squid-cache.org/ConfigExamples/Intercept/PfPolicyRoute#rc.conf_example_for_a_router
(In reply to elico from comment #9) My testing topology: http://ngtech.co.il/squidblocker/topology1.png
I've seen the same issue on a Linux KVM guest (with a FreeBSD router virtio guest with TSO), worked around by:

  pre-up /sbin/ethtool --offload eth0 tx off

so I am curious as to how this is identified as a FreeBSD bug. It seems more like something within the KVM stack.

Ubuntu 14.04.3 LTS
qemu-kvm 2.0.0+dfsg-2ubuntu1.16

Seems to also be the same issue here: https://forum.pfsense.org/index.php?topic=88467.0

However, it's not a PF issue, as ipfw kernel NAT also did the same.
(In reply to amvandemore from comment #11)
My assumption is that if it works with Linux and with many versions of OpenBSD, then it's not a hypervisor issue. I do not know exactly what changed or how, but maybe the OpenBSD virtio changes can help to understand it. Notice that it affects only routing/gateway mode and not regular traffic, so it's something special: it's not related to the firewall at all, but to the gateway/routing-related code.
I do not "need" this since my systems work fine with e1000 and with Linux hosts, but I was wondering if there is any progress with it?
(In reply to elico from comment #12)
I also doubt that this is a KVM issue, as there seem to be similar problems (forwarding TCP packets) on Xen. E.g.:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188261
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202199
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197344
(In reply to Sydney Meyer from comment #14)
I had a chance to verify the subject with a special setup which includes Debian, and found out that the issue occurs only in a specific scenario: the KVM hypervisor hosting two VMs that share the same interface, such as a bridge.

The bug is that the hypervisor virtio driver doesn't write a checksum for packets which are directed towards an internal interface. The hypervisor should either write the checksum or the VM should not check it. It's an issue that is partially in the driver and partially in the hypervisor.

I am now running on both CentOS and Ubuntu KVM hypervisors, which share the same issue. The solution for my case was to use the iptables checksum fill option on the gateway machine. The first step is to let DHCP traffic pass between VMs so that the DHCP client (ISC) does not drop the packets, using:

  iptables -A POSTROUTING -t mangle -p udp --dport 68 -j CHECKSUM --checksum-fill

I will try to test with FreeBSD 11, since with OpenBSD 5.X it didn't work but with 5.Y (tip) it was working fine.
This has been a long standing and unfortunate issue. My memory is somewhat fuzzy, but generally speaking the host doesn't need to compute a checksum because it is basically just a memory copy into the guest, but FreeBSD doesn't have a flag (at least at the time I was originally working on the VirtIO drivers) to denote "recompute this checksum if forwarding" the packet.
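To make the explanation above concrete, here is a small sketch of why a receiver that validates checksums silently drops a segment whose checksum was never completed. It uses the standard one's-complement Internet checksum over the IPv4 pseudo-header plus a bare TCP SYN. All function names are hypothetical, and modelling the "not yet computed" checksum field as holding only the pseudo-header sum is an assumption about the offload convention, not something stated in this thread.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Fold 16-bit big-endian words into a 16-bit one's-complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header for TCP: src, dst, zero byte, protocol 6, TCP length."""
    return socket.inet_aton(src) + socket.inet_aton(dst) + struct.pack("!BBH", 0, 6, tcp_len)


def build_syn(src_port: int, dst_port: int, checksum: int) -> bytes:
    """A bare 20-byte TCP SYN header carrying the given checksum field."""
    # seq=1, ack=0, data offset=5 words, flags=SYN, window=65535, urgent=0
    return struct.pack("!HHIIBBHHH", src_port, dst_port, 1, 0, 5 << 4, 0x02, 65535, checksum, 0)


def verifies(src: str, dst: str, segment: bytes) -> bool:
    """Receiver-side check: the sum over pseudo-header and segment must be 0xFFFF."""
    return ones_complement_sum(pseudo_header(src, dst, len(segment)) + segment) == 0xFFFF


SRC, DST = "192.168.1.100", "192.168.2.100"  # client1 and client2 from the report

blank = build_syn(40000, 22, 0)
pseudo_sum = ones_complement_sum(pseudo_header(SRC, DST, len(blank)))
full = 0xFFFF - ones_complement_sum(pseudo_header(SRC, DST, len(blank)) + blank)

offloaded = build_syn(40000, 22, pseudo_sum)  # ASSUMPTION: field left for "hardware" to finish
completed = build_syn(40000, 22, full)        # checksum fully computed in software

print("offloaded segment verifies:", verifies(SRC, DST, offloaded))
print("completed segment verifies:", verifies(SRC, DST, completed))
```

The incomplete segment fails verification and the completed one passes. A segment delivered directly to a guest can skip this check because the host marks it as trusted, but once the router forwards it to another guest, nothing on the path completes the checksum, which would be consistent with tcpdump showing the SYN while the TCP stack never answers.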
(In reply to Bryan Venteicher from comment #16)
OK, so the test environment should be as follows:
- 1 KVM hypervisor (CentOS 7)
- 1 VyOS EDGE GW (eth0=192.168.89.200/24, eth1=192.168.7.254/24, GW=192.168.89.1)
- 1 FreeBSD (10.3+11.0) GW for 2 networks (eth0=192.168.7.1/24, eth1=192.168.6.254/24, eth2=192.168.7.254/24, GW=192.168.7.254/23)
- Windows+Linux+FreeBSD clients on networks 192.168.6.0/24+192.168.7.0/24 with GW 192.168.6.254 or 192.168.7.254

The expected result should be a working connection from the end user (Win/Lin/BSD) to the local networks and the Internet.
batch change: For bugs that match the following - Status Is In progress AND - Untouched since 2018-01-01. AND - Affects Base System OR Documentation DO: Reset to open status. Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
(In reply to Eitan Adler from comment #18) With what version of FreeBSD? Latest stable?
I have probably hit a similar issue with a Gentoo Linux KVM host and FreeBSD 9~12 guests with a virtio network adapter. My environment is:

  Linux bacztwo 4.14.101-gentoo
  libvirt-4.9.0
  qemu-3.1.0

Under the default settings, with the e1000 series the guest sometimes raises an interrupt storm; with virtio it won't receive any packets and is unable to communicate over the network.

The following qemu option, which just disables all checksum offload for virtio-net, works on my FreeBSD guest:

  -device virtio-net-pci,csum=off,guest_csum=off
(In reply to andcycle-bugs.freebsd.org from comment #20)
Here are the options for "virsh edit" on the KVM host:

  <interface type='network'>
    <source network='private'/>
    <model type='virtio'/>
    <driver>
      <host csum='off'/>
      <guest csum='off'/>
    </driver>
  </interface>

E.g. "virsh edit freebsd-host" and put this into the config.

Jakub
MARKED AS SPAM
(In reply to zain david from comment #22) Why Spam?
Hello FreeBSD maintainers,

This bug is still present on 11.3-RELEASE and 12.0-RELEASE.

Maybe it would be a good idea to update the version in the ticket description. Maybe somebody will consider it :)

It's so sad that FreeBSD runs better on VMware than on Linux KVM :/

Thanks in advance for the good job on FreeBSD.

BR,
Grégory
(In reply to Greg A. from comment #24)

FreeBSD 12.1 workaround

# Gateway Host

ifconfig for WAN interface vtnet0 (vtnet uplink 10g)
----
ifconfig_vtnet0="inet WAN_IP netmask WAN_MASK vlanmtu vlanhwtag vlanhwfilter vlanhwcsum vlanhwtso -rxcsum -txcsum -rxcsum6 -txcsum6 tso6 tso4 lro"
----
disable only -rxcsum -txcsum -rxcsum6 -txcsum6

ifconfig for LAN interface vtnet1 (vtnet uplink 10g)
---
ifconfig_vtnet1="inet 192.168.1.1 netmask 255.255.255.0 vlanmtu vlanhwtag vlanhwfilter vlanhwcsum vlanhwtso rxcsum -txcsum rxcsum6 -txcsum6 tso6 tso4 lro"
---
disable only -txcsum -txcsum6

pf.conf - simple nat rule for LAN (scrub rules not needed)
---
nat on vtnet0 from 192.168.1.0/24 to any -> WAN_IP
---

######################################################################
# LAN Client Host (vtnet uplink 10g)

ifconfig for LAN interface vtnet0
---
ifconfig_vtnet0="inet 10.66.1.2 netmask 255.255.255.0 vlanmtu vlanhwtag vlanhwfilter vlanhwcsum vlanhwtso rxcsum -txcsum rxcsum6 -txcsum6 tso6 tso4 lro"
---
disable only -txcsum -txcsum6
######################################################################

All traffic tests passed normally.
1) Gateway LAN -> Client LAN - iperf3 result 14Gbit/s
2) Client LAN -> Gateway LAN - iperf result 14Gbit/s
3) Client LAN download Gateway LAN (NAT) from external source - result max download speed
4) External iperf client -> to Gateway WAN (iperf port) -> redirect to LAN Client iperf server - result max external iperf client speed

Please check the workaround.
^Triage: This issue needs a reproduction on currently supported FreeBSD versions and steps to reproduce (minimum test case).

Ideally, reproduction would be confirmed against CURRENT, 13.1 and 12.4. (Re)confirmation that disabling RX/TX checksumming works around the issue, or not, would also be great.
(In reply to Kubilay Kocak from comment #26) I will try to test later on. Since Ubuntu 14.04 is not supported anymore I will try to reproduce on later versions of Ubuntu 20.04/22.04 and Oracle Linux 8. I will try to verify with 12.3 and 12.4
(In reply to elico from comment #27) OK so just to mention the NAT related documents are at: http://draft.scyphus.co.jp/freebsd/nat.html and ontop of Oracle Enterprise Linux 8 KVM host the issue exists on 12.3 the setup is very simple: * Alpine 3.16 with ip 192.168.111.1/24 gw 192.168.111.254 DNS 8.8.8.8 * FreeBSD 12.3 with two interfaces: vtnet0 192.168.110.1/24 GW 192.168.110.254 vtnet1 192.168.111.254/24 pf rules to nat on $ext_inf * VyOS 1.3.2 with two interfaces: eth0 192.168.122.183/24 GW 192.168.122.1 eth1: 192.168.110.254/24 ping from Alpine to 8.8.8.8 via FreeBSD (NAT) -> VyOS (NAT) = works (ICMP_ wget from Alpine to 8.8.8.8 via FreeBSD (NAT) -> VyOS (NAT) = doesn't work (TCP) When I am running the next on the vtnet0 and vtnet1 interfaces the TCP works: ifconfig vtnet0 -rxcsum ifconfig vtnet1 -rxcsum It was resolved long ago in OpenBSD so now there only should be a fix and a text.
(In reply to elico from comment #28)
Hello,

I created a FreeBSD 13.1 STABLE virtual router in a Proxmox 7.1 testing environment and was encountering this issue with my FreeBSD VM. I can confirm that running the following helped me to resolve the issue:

  ifconfig vtnet0 -rxcsum
  ifconfig vtnet1 -rxcsum

Thank you for helping with my virtual router's NAT not working. Previous to resolution, I was able to ping things, which tells me that ICMP was working fine (I could also see this in my pflogging) - however, I could not say the same for TCP.
Hello,

I created a FreeBSD 12.3 STABLE virtual router in a Proxmox 7.1 environment and was encountering this issue with my FreeBSD VM. I can confirm that running the following helped to work around the issue, but the performance is terrible:

  ifconfig vtnet0 -rxcsum
  ifconfig vtnet1 -rxcsum
Hello,

I came across this bug report while troubleshooting an OPNsense throughput issue when using a VirtIO network card on Proxmox 8. The only way to achieve 10Gbps speed is to enable the HW Checksum Offloading option. Unfortunately, after activating it, access to servers on the same Proxmox server stops working, and not only to them but also, for example, to the TrueNAS Core server (NOT on the same Proxmox server).

I see that this issue has not been resolved since 2012. Is there a plan to fix this? This behavior is still present on OPNsense 23.7.9 and therefore FreeBSD 13.2.

Thanks
Hello,

Adding these parameters to /boot/loader.conf solved the problem:

  hw.vtnet.X.tso_disable="1"
  hw.vtnet.tso_disable="1"
  hw.vtnet.lro_disable="1"
  hw.vtnet.X.lro_disable="1"
  hw.vtnet.csum_disable="1"
  hw.vtnet.X.csum_disable="1"
^Triage: clear unneeded flags. Nothing has yet been committed to be merged.