Bug 237441 - Virtio net consistently truncates last byte of a fetch xfer with > 8956 bytes of payload
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization mailing list
Reported: 2019-04-21 16:06 UTC by Guest
Modified: 2019-07-13 15:05 UTC

Description Guest 2019-04-21 16:06:25 UTC
After reading bug 215737 carefully, I couldn't immediately tell whether this was the same problem, but ultimately decided it wasn't.

Environment:  OSX High Sierra running QEMU and the 12.0 release qcow2 image published on the FreeBSD site.

Qemu command line:  qemu-system-x86_64 -m 2048 -hda FreeBSD-12.0-RELEASE-amd64.qcow2  -netdev user,id=mynet0,hostfwd=tcp:127.0.0.1:7722-:22 -device virtio-net-pci,netdev=mynet0

Trying to install pkg fails.  If you do the following command:

fetch http://pkg.freebsd.org/FreeBSD:12:amd64/latest/Latest/pkg.txz

you will consistently get the following (Note:  with or without [TR]XCSUM enabled):

fetch: pkg.txz appears to be truncated: 3395051/3395052 bytes

If you download the full package and use dd to grab all but the last byte, the SHA256 sums match, so the data's not corrupted, just missing the final byte (a 'Z' character).  Furthermore, if you run tcpdump in the guest against the vtnet0 interface while it's transferring, you can see the final 'Z' byte in the final packet, so qemu is getting the data to the guest.  If you then ktrace the fetch process, you'll see that its final read *doesn't* have the 'Z', which rules out a bug in fetch/libfetch.
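
The dd/SHA256 comparison described above can be sketched in Python. This is a minimal illustration and not the reporter's actual commands; the function names are hypothetical, and it assumes both the complete file and the truncated download are available as bytes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def missing_only_last_byte(full: bytes, truncated: bytes) -> bool:
    """Return True if `truncated` is exactly `full` without its final byte.

    Mirrors the manual check: drop the last byte of the complete file,
    then compare SHA256 sums with the truncated download.
    """
    if len(truncated) != len(full) - 1:
        return False
    return sha256_hex(full[:-1]) == sha256_hex(truncated)
```

To apply it to real files, the two byte strings could be read with something like `pathlib.Path("pkg.txz").read_bytes()`; the file name for the known-good copy is up to you.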

To find the size threshold, I used fetch to download packages with sizes around the jumbo frame boundary and found that packages of <= 8956 bytes work, while those of >= 8960 bytes exhibit the failure.
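
The size sweep above could be automated along these lines. This is only a sketch under stated assumptions: `probe_sizes` and `simulated_fetch_len` are hypothetical names, and the simulated fetcher merely mimics the behavior reported here (one byte lost at 8960 bytes and above) so that the example runs without a network. In a real test, `fetch_len` would download a payload of the given size inside the guest and return the number of bytes actually received.

```python
from typing import Callable, Iterable, List, Tuple


def probe_sizes(
    sizes: Iterable[int], fetch_len: Callable[[int], int]
) -> List[Tuple[int, int]]:
    """Return (expected, received) for every size whose byte count differed."""
    mismatches = []
    for size in sizes:
        received = fetch_len(size)
        if received != size:
            mismatches.append((size, received))
    return mismatches


def simulated_fetch_len(size: int) -> int:
    """Hypothetical stand-in for a real download: models the reported symptom."""
    return size - 1 if size >= 8960 else size


print(probe_sizes([8952, 8956, 8960, 8964], simulated_fetch_len))
# prints [(8960, 8959), (8964, 8963)]
```

Keeping the fetcher as a parameter makes it easy to swap in fetch, curl, or wget wrappers and compare which clients lose the final byte.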
Comment 1 Guest 2019-04-21 16:06:55 UTC
qemu was installed via brew.
Comment 2 Guest 2019-04-21 17:34:53 UTC
Additional information:  OpenBSD using virtio has almost exactly the same problem: a one-byte truncation when trying to download packages (down to the tcpdump output showing a complete final payload packet but ktrace showing the ftp utility not receiving the final byte).  Bizarrely, OpenBSD did download and install packages over the network using virtio, so it's unclear why this works only intermittently.

In any case, with OpenBSD having almost identical behavior, I am unconvinced this is a FreeBSD issue.
Comment 3 Rodney W. Grimes freebsd_committer 2019-05-25 09:38:05 UTC
Are jumbo frames in use anywhere along the path?
Comment 4 Christoph Kliemann 2019-06-27 23:19:25 UTC
I can reproduce this on macOS 10.13.6 (17G7024) High Sierra with qemu 4.0.0 and a FreeBSD 12.0-RELEASE (p1-p6) guest.

My Packer FreeBSD builder failed because of this issue.
I have tested this for a while with the same template.

In most cases, the builder fails (truncated base.txz or truncated pkgng packages).
Occasionally, the download and installation are successful.

I booted one of these successfully created images with qemu and ran additional tests.

Test #1: fetch http://www.google.de
The last byte is missing.

Test #2: ping google.de
PING google.de (172.217.23.163): 56 data bytes
64 bytes from 172.217.23.163: icmp_seq=0 ttl=255 time=622018725671.832 ms
wrong data byte #8 should be 0x8 but was 0xc0
[...]

Test #3: pkg install curl wget
One successful attempt after many truncated downloads.

Test #4: curl http://www.google.de
No issues

Test #5: wget http://www.google.de
No issues
Comment 5 Christoph Kliemann 2019-06-27 23:36:28 UTC
(In reply to Rodney W. Grimes from comment #3)
I haven't changed the MTU on any interface.
The host's external interface is at 1500, the host's gateway uses 1500, and the guest's vtnet0 is at 1500.
Comment 6 Christoph Kliemann 2019-07-13 13:50:36 UTC
I think this is not a FreeBSD issue.

Can't reproduce immediately after a host reboot.
The issue occurs after the first sleep/wake cycle of the host and persists until reboot.
This seems to be a macOS and/or qemu issue.
Comment 7 Christoph Kliemann 2019-07-13 15:05:57 UTC
(In reply to Christoph Kliemann from comment #6)

Please disregard. Managed to reproduce after a reboot. Sorry for the noise.