Bug 231416 - dhcp / dhclient: bad udp checksums if running on a vlan on an Intel I211 / Broadcom interfaces
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Stephen Hurd
URL: https://reviews.freebsd.org/D17404
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-17 04:17 UTC by Kurt Jaeger
Modified: 2020-01-15 02:52 UTC
CC List: 11 users

See Also:
koobs: mfc-stable11+


Description Kurt Jaeger freebsd_committer 2018-09-17 04:17:42 UTC
Workaround: ifconfig igb0 -vlanhwtag

See

https://lists.freebsd.org/pipermail/freebsd-net/2018-September/051592.html

and followups.
Comment 1 Eric van Gyzen freebsd_committer 2018-09-20 18:28:40 UTC
Since it only affects DHCP traffic (bpf injected traffic?), I wonder if it's more related to iflib than to igb.  Adding marius@ and shurd@.
Comment 2 Eric van Gyzen freebsd_committer 2018-09-20 18:34:06 UTC
kp@ reported:

I had a similar issue, where -vlanhwtag also fixed it.
That was on an I210 (igb) card (in a FreeNAS Mini XL).
Comment 3 Eric van Gyzen freebsd_committer 2018-09-20 19:54:47 UTC
This also affects some Broadcom NICs:

https://lists.freebsd.org/pipermail/freebsd-net/2018-September/051642.html
Comment 4 Lev A. Serebryakov freebsd_committer 2018-09-29 20:50:20 UTC
I have the same problem on r339012.
Comment 5 Lev A. Serebryakov freebsd_committer 2018-09-30 12:55:06 UTC
I can add one more data point: the 82574L Gigabit Network Connection (which shows up as "em") doesn't have this problem on the same FreeBSD revision.
Comment 6 Stephen Hurd freebsd_committer 2018-10-03 15:18:40 UTC
The packet captures are consistent with the checksum being calculated with the pseudo header having the correct checksum, not zero.  Forcing the UDP checksum to zero before passing to the card would likely work around the issue, but I want to figure out what changed to cause this issue to crop up first.
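
To illustrate why a checksum field that already holds a finished checksum matters, here is a minimal sketch in C of the RFC 1071 Internet checksum. It is not the igb hardware's algorithm and it leaves out the UDP pseudo header for brevity; the names inet_cksum, put_be16, cksum_demo and CKSUM_OFF are made up for this example. It only shows the general property that summing again over a segment whose checksum field is already filled in produces a different value than summing over a zeroed field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the checksum field inside a UDP header. */
#define CKSUM_OFF 6

/* RFC 1071 Internet checksum over a buffer of big-endian 16-bit words. */
static uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Store a 16-bit value in network byte order. */
static void
put_be16(uint8_t *buf, size_t off, uint16_t v)
{
	buf[off] = (uint8_t)(v >> 8);
	buf[off + 1] = (uint8_t)(v & 0xff);
}

/* Toy UDP segment (pseudo header omitted): checksum once, then again. */
static void
cksum_demo(void)
{
	uint8_t seg[12] = {
		0x00, 0x44, 0x00, 0x43,	/* source port 68, destination port 67 */
		0x00, 0x0c, 0x00, 0x00,	/* length 12, checksum field zeroed */
		0xde, 0xad, 0xbe, 0xef	/* payload */
	};
	uint16_t first, second;

	first = inet_cksum(seg, sizeof(seg));
	put_be16(seg, CKSUM_OFF, first);
	second = inet_cksum(seg, sizeof(seg));

	/* The second pass sees the stored checksum and yields a new value. */
	assert(first != 0);
	assert(second != first);
}
```

In this toy case the second pass comes out as zero, because a buffer that contains its own correct checksum sums to all ones. What the hardware actually seeds the field with and which bytes it covers is not shown here.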
Comment 7 Stephen Hurd freebsd_committer 2018-10-03 18:10:20 UTC
Can you test with the patch in the review here:

https://reviews.freebsd.org/D17393

I'm not confident this will fix the problem, but it's the only obvious issue I saw tracing the send path:

bpf.c:bpfwrite()
if_ethersubr.c:ether_output()
if_ethersubr.c:ether_output_frame()
if_vlan.c:vlan_transmit()
iflib.c:iflib_if_transmit()
...
igb_txrx.c:igb_isc_txd_encap()
igb_txrx.c:igb_tx_ctx_setup()

If this fixes the issue, the bce(4) issue is a different issue.
Comment 8 Eric van Gyzen freebsd_committer 2018-10-03 18:23:35 UTC
I'll test it in the next day or two.  I think I can only test it on my home firewall, so I have to schedule downtime with the family.  ;)

I would be grateful if other people tested it, too.
Comment 9 Lev A. Serebryakov freebsd_committer 2018-10-03 18:40:25 UTC
I'll check it in 3 or 4 hours and will report back.
Comment 10 Lev A. Serebryakov freebsd_committer 2018-10-03 20:21:55 UTC
Nope. It makes things worse.
I've explained the problems in more detail on Phabricator.

I don't know, should we duplicate all comments in two places?
Comment 11 Stephen Hurd freebsd_committer 2018-10-03 20:37:07 UTC
(In reply to Lev A. Serebryakov from comment #10)

I think that comments regarding the patch can/should go on Phabricator only and comments about the bug should go here (and likely not on Phabricator).
Comment 12 commit-hook freebsd_committer 2018-10-05 20:17:15 UTC
A commit references this bug:

Author: shurd
Date: Fri Oct  5 20:16:20 UTC 2018
New revision: 339207
URL: https://svnweb.freebsd.org/changeset/base/339207

Log:
  Fix igb corrupting checksums with BPF and VLAN

  When using a vlan with igb and the vlanhwcsum option, any mbufs which
  already had the TCP, UDP, or SCTP checksum calculated and therefore don't
  have the CSUM_[IP|IP6]_[TCP|UDP|SCTP] bits set in the csum_flags field would
  have the L4 checksum corrupted by the hardware.

  This was caused by the driver setting E1000_TXD_POPTS_TXSM any time a
  checksum bit was set OR a vlan tag was present.

  The patched driver only sets E1000_TXD_POPTS_TXSM when an offload is
  requested.

  PR:		231416
  Reported by:	pi
  Approved by:	re (gjb)
  Sponsored by:	Limelight Networks
  Differential Revision:	https://reviews.freebsd.org/D17404

Changes:
  head/sys/dev/e1000/igb_txrx.c
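
The condition change described in the log above can be sketched roughly as follows. This is not the code from igb_txrx.c: every SK_* constant and sk_* function is a hypothetical stand-in invented for this example, and the flag values are illustrative rather than copied from the FreeBSD or e1000 headers. Only the shape of the decision (offload requested OR VLAN tag present, versus offload requested only) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are NOT taken from the real headers. */
#define SK_CSUM_UDP	0x01u	/* stack asks for UDP checksum offload */
#define SK_CSUM_TCP	0x02u	/* stack asks for TCP checksum offload */
#define SK_CSUM_SCTP	0x04u	/* stack asks for SCTP checksum offload */
#define SK_CSUM_L4	(SK_CSUM_UDP | SK_CSUM_TCP | SK_CSUM_SCTP)
#define SK_POPTS_TXSM	0x02u	/* "insert L4 checksum" descriptor option */

/* Behavior described as buggy: a VLAN tag alone turns on L4 checksumming. */
static uint32_t
sk_popts_before(uint32_t csum_flags, bool has_vlan_tag)
{
	uint32_t popts = 0;

	if ((csum_flags & SK_CSUM_L4) != 0 || has_vlan_tag)
		popts |= SK_POPTS_TXSM;
	return (popts);
}

/* Behavior described as fixed: only an explicit offload request counts. */
static uint32_t
sk_popts_after(uint32_t csum_flags, bool has_vlan_tag)
{
	uint32_t popts = 0;

	(void)has_vlan_tag;	/* the tag no longer influences this option */
	if ((csum_flags & SK_CSUM_L4) != 0)
		popts |= SK_POPTS_TXSM;
	return (popts);
}

/* A BPF-injected frame on a VLAN: checksum already final, no flags set. */
static void
sk_demo(void)
{
	assert(sk_popts_before(0, true) == SK_POPTS_TXSM);	/* corrupts */
	assert(sk_popts_after(0, true) == 0);			/* left alone */
	assert(sk_popts_after(SK_CSUM_UDP, true) == SK_POPTS_TXSM);
}
```

The case exercised in sk_demo matches the dhclient traffic in this report: the packet is written through BPF with its UDP checksum already filled in, so no offload flag is set, yet it goes out over a VLAN.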
Comment 13 Kubilay Kocak freebsd_committer freebsd_triage 2020-01-15 02:52:35 UTC
^Triage:

 - Assign to committer that resolved
 - Track MFC's 
   - HEAD was 12.x base r339207
   - MFC'd to stable/11 in base r342789 by marius