Summary: | Intel e1000 network link drops under high network load | ||
---|---|---|---|
Product: | Base System | Reporter: | Naveen Nathan <freebsd> |
Component: | kern | Assignee: | freebsd-net (Nobody) <net> |
Status: | Closed Feedback Timeout | ||
Severity: | Affects Only Me | CC: | freebsd, kaho, marius, sbruno |
Priority: | --- | Keywords: | IntelNetworking, needs-qa |
Version: | 11.0-RELEASE | Flags: | koobs: mfc-stable11?, koobs: mfc-stable10? |
Hardware: | amd64 | ||
OS: | Any | ||
See Also: | https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200221 |
Description
Naveen Nathan
2017-04-14 10:46:17 UTC
On further investigation, this happens during any kind of network load, usually when traffic goes beyond 10 Mbps. So it happens when using portsnap, pkg install, etc.

I have also disabled tso4 and vlanhwtso. I think it made things a little more bearable, but the issue still persists.

    # ifconfig em0
    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
            ether 00:30:48:8b:55:de
            inet 104.149.6.19 netmask 0xfffffff0 broadcast 104.149.6.31
            nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active

    [root@armakuni ~]# netstat -I em0
    Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
    em0    1500 <Link#1>      00:30:48:8b:55:de    88189   476     0    49789     5     0
    em0       - 104.x.x.16/   xxx.xxx              86581     -     -    50046     -     -

Apologies, I forgot to mention: I was able to upgrade to 11.0-RELEASE after running freebsd-update about 30 or so times. I ended up getting lucky on a run where the network connection didn't drop, and was able to continue with the upgrade. The comments above about disabling tso4/vlanhwtso apply to the 11.0 release. Therefore the em0 watchdog timer issue under network load seems to persist, even though bug 200221 resolved it for 10.3.

(In reply to nn from comment #1)
I think the link drop itself is caused by a Tx error, but the Ierrs column of the netstat output shows many Rx errors, and you should first investigate which errors are occurring. Can you post the output of `sysctl dev.em.0`? Can you tell which error-related counters are increasing? For example, does rx_overrun or crc_errs have a non-zero value?

Feedback timeout (2 months)
@nn please re-open this issue if you can provide additional or updated information, isolation or reproduction steps.
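The counter check requested in the reply above (which `sysctl dev.em.0` error counters are increasing) could be done by taking two snapshots a few seconds apart and comparing them. The following is only a rough sketch, not something posted in this report: the helper names `parse_sysctl`, `increasing`, and `snapshot` are made up for illustration, the 10-second interval is arbitrary, and it assumes `sysctl` prints one `name: value` pair per line.

```python
import subprocess
import time


def parse_sysctl(text):
    """Parse 'name: value' lines, keeping only integer-valued entries."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # not a 'name: value' line
        try:
            counters[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip descriptive, non-numeric entries
    return counters


def increasing(before, after):
    """Return {name: delta} for counters that grew between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


def snapshot(node="dev.em.0"):
    """Run sysctl on the given node (assumed to exist) and parse its output."""
    result = subprocess.run(
        ["sysctl", node], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)


if __name__ == "__main__":
    first = snapshot()
    time.sleep(10)  # arbitrary interval; generate network load meanwhile
    second = snapshot()
    for name, delta in sorted(increasing(first, second).items()):
        print(f"{name}: +{delta}")
```

Running this while reproducing the load (for example during a pkg install) would list only the counters that changed, which makes it easier to see whether entries such as the overrun or CRC error counters mentioned above are the ones growing.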