Bug 165059 - [kvm] virtio-kmod: networking breaks with a router using virtio net driver on KVM host
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 9.0-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: Bryan Venteicher
Reported: 2012-02-12 21:20 UTC by t42
Modified: 2017-05-21 07:55 UTC

Description t42 2012-02-12 21:20:11 UTC
When the router for a FreeBSD guest on KVM is itself a FreeBSD guest on
the same KVM host and uses the virtio network driver from virtio-kmod,
ping works between guests on different subnets, but no userland network
daemon responds. If I switch the router to the e1000 driver and change
nothing else, everything works correctly.

Fix: Unknown.

How-To-Repeat:

I created three FreeBSD guests on one Linux KVM host. I am using bridged
networking on the KVM host, as br0 and br1. One of the guests has two
network interfaces and acts as a router between two subnets, as follows:

router1: br0, 192.168.1.1; br1, 192.168.2.1
client1: br0, 192.168.1.100; default route 192.168.1.1
client2: br1, 192.168.2.100; default route 192.168.2.1

I configured virtio network interfaces on all three hosts. I enabled
forwarding on router1, but no packet filtering. No NAT is in use.
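
For reference, the router1 setup described above might look roughly like the following /etc/rc.conf fragment. This is only a sketch: the interface names vtnet0 and vtnet1 and the /24 netmasks are assumptions, since the report gives only the addresses and the bridges.

```sh
# /etc/rc.conf on router1 (sketch; vtnet0/vtnet1 and the /24 masks are assumed)
ifconfig_vtnet0="inet 192.168.1.1 netmask 255.255.255.0"   # attached to br0
ifconfig_vtnet1="inet 192.168.2.1 netmask 255.255.255.0"   # attached to br1
gateway_enable="YES"   # enable IPv4 forwarding; no packet filter, no NAT
```

Under the same assumptions, client1 would carry defaultrouter="192.168.1.1" and client2 defaultrouter="192.168.2.1" in their own rc.conf.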

Result:

    * client1 can ping client2, and vice versa.
    * ssh works from router1 to client1 and vice versa, and from router1
      to client2 and vice versa.
    * ssh from client1 to client2 will fail (and vice versa); the client
      simply hangs indefinitely while trying to connect. 
    * tcpdump on client2 will show that the SYN is arriving at client2
      port 22, but client2 never replies, nor generates any debug or log
      output that suggests it ever saw the connection attempt.
    * any other userland network service I try (both tcp and udp) will
      show the same thing -- packets arrive at client2 from client1, but
      the daemon seems to never see them. Since ping works, I know the
      kernel is getting them.
    * If I switch back to the e1000 driver on router1, but make no other
      changes, and make no changes at all to client1 and client2, then
      ssh will work properly from client1 to client2 and the problem is resolved.
    * If I let router1 continue to use virtio interfaces, but move router1
      onto a different KVM host -- so that the traffic from client1 to client2
      must leave the KVM host via the bridged interface and then return on a
      different interface - then ssh will work properly from client1 to
      client2 and the problem is resolved.
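
One way to narrow down the tcpdump observation above is to capture the arriving SYN with verbose output, which makes tcpdump print whether each TCP checksum verifies. The helper below is a hypothetical sketch, not part of the original report: it only prints the command so it can be reviewed before being run as root on client2, and the interface name vtnet0 is an assumption.

```sh
#!/bin/sh
# Print a tcpdump command that captures incoming SYNs and shows checksum status.
# Arguments: interface (assumed default vtnet0) and TCP port (default 22).
syn_capture_cmd() {
    iface=${1:-vtnet0}
    port=${2:-22}
    # -n: no name resolution; -vv: verbose, includes "cksum ... (correct|incorrect)"
    printf 'tcpdump -n -vv -i %s "tcp port %s and tcp[tcpflags] & tcp-syn != 0"\n' \
        "$iface" "$port"
}

syn_capture_cmd vtnet0 22
```

If the printed command is run on client2 and the SYNs forwarded by router1 show an incorrect checksum, that would be consistent with the kernel discarding them before sshd sees them. Note that tcpdump can also report incorrect checksums for locally generated outbound packets when transmit offload is enabled, so only the inbound direction is informative here.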

KVM guests: FreeBSD 9
virtio-kmod: 0.228301
KVM host: Ubuntu 11.10
qemu-kvm: 0.14.1
Comment 1 Mark Linimon freebsd_committer 2012-04-05 02:43:24 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-ports-bugs

reclassify.
Comment 2 Edwin Groothuis freebsd_committer 2012-05-28 03:17:30 UTC
Responsible Changed
From-To: freebsd-ports-bugs->kuriyama

Over to maintainer (via the GNATS Auto Assign Tool)
Comment 3 Chris Rees freebsd_committer 2013-08-07 20:33:18 UTC
Responsible Changed
From-To: kuriyama->bryanv

Hi Bryan, 

Is this something you may be acquainted with, as the virtio maintainer? 
Do you have any recommendations? 

Please accept my apologies if this isn't something for you... 

Chris
Comment 4 jmealo 2013-08-07 20:46:18 UTC
I can confirm that this bug is present in FreeBSD 8.3-RELEASE_p9
running on SmartOS KVM (a different implementation from Linux KVM).
Comment 5 Chris Rees freebsd_committer 2013-08-07 21:22:58 UTC
State Changed
From-To: open->feedback

Fantastic, thanks for your quick response. 

Jeffrey, does disabling checksum offloading work for you?
Comment 6 jmealo 2013-08-13 01:30:53 UTC
This did not resolve my issue.

Thanks,
Jeff
Comment 7 regnauld 2013-10-22 00:22:30 UTC
Env:
	Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
	KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
	Guest: FreeBSD 9.2-R amd64

Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both
interfaces (this is a router) solves the issue, but performance becomes
terrible (150 KB/sec uses 100% CPU on host).

vtnet interfaces are, Host side, bridged to VLANs.

Problem does not appear if the traffic is to/from the router itself. Only
forwarded traffic is a problem.

Can provide more info/feedback if needed.
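
The workaround above can be sketched for a router with several interfaces as follows. The helper name and the interface names vtnet0 and vtnet1 are assumptions for illustration; the script only prints the ifconfig commands so they can be checked before being run as root on the FreeBSD router.

```sh
#!/bin/sh
# Print the commands that disable RX/TX checksum offload on each given interface.
offload_off_cmds() {
    for ifn in "$@"; do
        printf 'ifconfig %s -rxcsum -txcsum\n' "$ifn"
    done
}

# Assumed interface names for a two-legged router.
offload_off_cmds vtnet0 vtnet1
```

To keep the setting across reboots, the same flags can presumably be appended to the matching ifconfig_vtnetN line in /etc/rc.conf. As noted above, this trades correctness for a significant performance cost.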
Comment 8 regnauld 2014-01-07 14:54:53 UTC
Phil Regnauld (regnauld) writes:
> Env:
> 	Host OS: Debian 7.1 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
> 	KVMM: QEMU emulator version 1.1.2 (qemu-kvm-1.1.2+dfsg-6, Debian)
> 	Guest: FreeBSD 9.2-R amd64
> 
> Disabling checksum offload with ifconfig vtnetX -rxcsum -txcsum on both
> interfaces (this is a router) solves the issue, but performance becomes
> terrible (150 KB/sec uses 100% CPU on host).
> 
> vtnet interfaces are, Host side, bridged to VLANs.
> 
> Problem does not appear if the traffic is to/from the router itself. Only
> forwarded traffic is a problem.
> 
> Can provide more info/feedback if needed.

	Same problem has been observed with 10.0-RC4.

	kern/166645 may be related.

	This is causing FreeBSD (and pfSense) to be unusable as a network
	appliance / router on KVM platforms.

	Phil
Comment 9 elico 2015-08-27 21:06:53 UTC
Still present on 10.1.
Environment:
Ubuntu 14.04 KVM hypervisor
A FreeBSD 10.1 gateway between the outside world and three networks.
The FreeBSD gateway uses a VyOS router as its default gateway, and that router NATs its traffic.
When I disable txcsum and rxcsum on the interface, the packets do not get corrupted.

The VyOS router blocks INVALID packets. ICMP packets are not malformed, while TCP packets are.

I had the same issue with OpenBSD 5.7, and it was fixed in -current (5.8).

Examples of the setup's rc.conf and pf rules are at:
http://wiki.squid-cache.org/ConfigExamples/Intercept/PfPolicyRoute#rc.conf_example_for_a_router
Comment 10 elico 2015-08-28 15:13:29 UTC
(In reply to elico from comment #9)
My testing topology:
http://ngtech.co.il/squidblocker/topology1.png
Comment 11 amvandemore 2015-09-13 14:25:29 UTC
I've seen the same issue on a Linux KVM guest (with a FreeBSD virtio guest as the router, with TSO), worked around by:

pre-up /sbin/ethtool --offload eth0 tx off

So I am curious how this was identified as a FreeBSD bug. It seems more like something within the KVM stack.

Ubuntu 14.04.3 LTS
qemu-kvm                            2.0.0+dfsg-2ubuntu1.16

Seems to also be the same issue here:

https://forum.pfsense.org/index.php?topic=88467.0

However, it's not a pf issue, as ipfw kernel NAT showed the same behavior.
Comment 12 elico 2015-09-13 14:31:02 UTC
(In reply to amvandemore from comment #11)
My assumption is that if it works with Linux and with many versions of OpenBSD, then it's not a hypervisor issue.
I do not know exactly what the cause is, but the OpenBSD virtio changes may help in understanding what was changed.

Note that it affects only routing/gateway mode and not regular traffic, so it is something specific: it is not related to the firewall at all, but to the gateway/routing code.
Comment 13 elico 2015-09-21 01:13:30 UTC
I do not "need" this, since my systems work fine with e1000 and with Linux hosts, but I was wondering whether there has been any progress on it.
Comment 14 Sydney Meyer 2016-01-24 00:59:12 UTC
(In reply to elico from comment #12)

I also doubt that this is a KVM issue, as there seem to be similar problems (forwarding TCP packets) on Xen.

E.g.: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188261
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202199
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197344