My em0 interface repeatedly hangs with a watchdog timeout when communicating with a Windows host at MTU 9K.

[sobomax@pioneer ~]$ grep em0 /var/run/dmesg.boot
em0: <Intel(R) PRO/1000 Network Connection 6.9.6> port 0xecc0-0xecdf mem 0xfe6e0000-0xfe6fffff,0xfe6d9000-0xfe6d9fff irq 21 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:22:19:32:87:2f
[sobomax@pioneer ~]$ uname -a
FreeBSD pioneer.sippysoft.com 7.2-RELEASE-p4 FreeBSD 7.2-RELEASE-p4 #0: Sun Oct 4 03:08:04 PDT 2009 root@pioneer.sippysoft.com:/usr/obj/usr/src/sys/PIONEER amd64
[sobomax@pioneer ~]$ ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=98<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:22:19:32:87:2f
        inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
        inet 192.168.2.1 netmask 0xffffff00 broadcast 192.168.2.255
        inet6 fec0::1 prefixlen 64
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
[sobomax@pioneer ~]$ dmesg | grep watchd
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting

I have managed to make a packet capture right at the time the hang happens. It appears that either a "MAC Pause" frame or a "TCP Segment of reassembled PDU" is the last packet that goes through before the interface hangs.

Here is the screenshot; if somebody wants to take a closer look at the actual packets, please let me know.

http://sobomax.sippysoft.com/~sobomax/ScreenShot527.png

Turning off TSO and TXCSUM/RXCSUM has not helped. Bringing the MTU down to 1,500 resolved the issue.

I have had the same problem several times in the past (although I initially attributed it to a bad cable or something like that), so it's definitely not a one-off issue.

Given the popularity of Intel PRO chips in today's computers, it looks like quite a serious issue to me.

Any help is greatly appreciated.
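To quantify how often the interface resets, the watchdog lines in the `dmesg` output above can be tallied per interface. The following is a minimal Python sketch, assuming the message format shown in the report (`<ifname>: watchdog timeout -- resetting`); the function name `count_watchdog_resets` is hypothetical and not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches lines such as "em0: watchdog timeout -- resetting".
# The format is taken from the dmesg output in the report (assumption:
# other driver versions may word the message differently).
WATCHDOG_RE = re.compile(r"^(\w+): watchdog timeout -- resetting$")


def count_watchdog_resets(dmesg_text):
    """Return a Counter mapping interface name to number of watchdog resets."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = WATCHDOG_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input reproducing the five lines from the report.
sample = "\n".join(["em0: watchdog timeout -- resetting"] * 5)
print(count_watchdog_resets(sample))
```

In practice the text would come from the output of `dmesg`; comparing counts before and after a large transfer gives a rough reset rate.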
Can't do much unless you adequately identify the hardware on BOTH sides; believe it or not, "windows" is not a sufficient description :)

I need to know what the E1000 hardware is, using pciconf -l, and I also need to know what is on the Windows side before having a clue on how to repro or help you.

Cheers,

Jack

On Thu, Nov 5, 2009 at 5:18 PM, Maksym Sobolyev <sobomax@freebsd.org> wrote:

> >Number:         140326
> >Category:       kern
> >Synopsis:       em0: watchdog timeout when communicating to windows using 9K MTU
> >Confidential:   no
> >Severity:       serious
> >Priority:       high
> >Responsible:    freebsd-bugs
> >State:          open
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class:          sw-bug
> >Submitter-Id:   current-users
> >Arrival-Date:   Fri Nov 06 01:20:01 UTC 2009
> >Closed-Date:
> >Last-Modified:
> >Originator:     Maksym Sobolyev
> >Release:        7.2-p4
> >Organization:
> Sippy Software, Inc.
> >Environment:
> FreeBSD pioneer.sippysoft.com 7.2-RELEASE-p4 FreeBSD 7.2-RELEASE-p4 #0:
> Sun Oct 4 03:08:04 PDT 2009 root@pioneer.sippysoft.com:/usr/obj/usr/src/sys/PIONEER
> amd64
> >Description:
> My em0 interface repeatedly hangs up with watchdog timeout when
> communicating to the windows host at MTU 9K.
Jack Vogel wrote:
> Can't do much unless you adequately identify hardware, on BOTH sides,
> believe it or not "windows" is not a sufficient description :)
>
> I need to know what the E1000 hardware is, using pciconf -l, and I also
> need to know what is on the Windows side before having a clue on how to
> repro or help you.

Jack,

Thank you for the amazingly fast reply.

Sure, the FreeBSD side is this:

em0@pci0:0:25:0: class=0x020000 card=0x02761028 chip=0x10de8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    class      = network
    subclass   = ethernet

On the Windows side it's a Realtek GigE card. The system itself is Windows 7 Ultimate 64-bit edition:

PCI\VEN_10EC&DEV_8168&SUBSYS_02C01028&REV_03

Please let me know if any other information is necessary.

Regards,
-- 
Maksym Sobolyev
Sippy Software, Inc.
Internet Telephony (VoIP) Experts
T/F: +1-646-651-1110
Web: http://www.sippysoft.com
MSN: sales@sippysoft.com
Skype: SippySoft
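The two hardware identifiers above pack PCI vendor and device IDs differently. As a hedged illustration, the Python sketch below splits them apart: in `pciconf -l` output the 32-bit `chip` and `card` fields carry the vendor ID in the low 16 bits, while the Windows hardware ID spells the IDs out as `VEN_xxxx` and `DEV_xxxx`. The helper names `decode_pciconf_id` and `decode_windows_hwid` are hypothetical.

```python
import re


def decode_pciconf_id(value):
    """Split a 32-bit pciconf field (chip= or card=) into (vendor, device).

    The low 16 bits hold the vendor ID, the high 16 bits the device ID.
    """
    word = int(value, 16)
    return word & 0xFFFF, word >> 16


def decode_windows_hwid(hwid):
    """Extract (vendor, device) from a Windows PCI hardware ID string."""
    match = re.search(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})", hwid)
    if match is None:
        raise ValueError("not a PCI hardware ID: %r" % hwid)
    return int(match.group(1), 16), int(match.group(2), 16)


# Values copied from the message above.
vendor, device = decode_pciconf_id("0x10de8086")
print("FreeBSD side: vendor=0x%04x device=0x%04x" % (vendor, device))

vendor, device = decode_windows_hwid(
    r"PCI\VEN_10EC&DEV_8168&SUBSYS_02C01028&REV_03")
print("Windows side: vendor=0x%04x device=0x%04x" % (vendor, device))
```

Vendor 0x8086 is Intel and 0x10ec is Realtek, which matches the description given; the subsystem vendor 0x1028 in both `card=` and `SUBSYS_` is Dell.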
Responsible Changed From-To: freebsd-bugs->freebsd-net Over to maintainer(s).
Good, that's a start. Now, is there a switch of some sort involved, or are you going back to back? Some switches have problems with jumbo frames, and some vendors' interfaces (including ours) do not support jumbo frames, so you need to check on that as well (I mean the Realtek).

I will check on the Intel adapter tomorrow.

Jack
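One common way to check whether every hop passes jumbo frames is to ping with the Don't Fragment bit set and a payload sized to fill the MTU exactly. The Python sketch below only does the arithmetic and builds the command lines; it assumes IPv4 without IP options (20-byte header) plus the 8-byte ICMP header, and the peer address `192.168.1.2` is a hypothetical placeholder, not taken from the report. `ping -D -s` is the FreeBSD form and `ping -f -l` the Windows form.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_commands(mtu, host):
    """Build don't-fragment ping command lines for FreeBSD and Windows."""
    size = icmp_payload_for_mtu(mtu)
    return {
        "freebsd": "ping -D -s %d %s" % (size, host),
        "windows": "ping -f -l %d %s" % (size, host),
    }


# Hypothetical peer address; substitute the real Windows host.
for system, command in ping_commands(9000, "192.168.1.2").items():
    print("%s: %s" % (system, command))
```

If the full-size ping fails while a smaller one succeeds, some device on the path is not passing 9000-byte frames.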
Jack Vogel wrote:
> Good, that's a start. Now, is there a switch of some sort involved or
> are you going back to back? Some switches have problems with jumbo
> frames, there are also some vendors (including our's) interfaces that
> do not support jumbo frames, so you need to check on that also (I mean
> the RT).
>
> I will check on the Intel adapter tomorrow.

Yes, there is a switch involved (Cisco/Linksys EG008W ver.3), but I don't think it's related. The problem really escalated when I installed Windows 7 on this machine yesterday. Before that, the same machine with the Realtek was running Vista, and the problem had happened to me only once or twice in two weeks with the same MTU on both ends. And from the capture it seems that a very specific condition causes this.

Unfortunately this box is a gateway for a network, so I cannot replace the hub and try to reproduce the issue.

-Maxim
Jack,

Here is some additional info you might find useful: I have replaced the Linksys switch with a more "professional" rack-mountable 3Com Baseline 2816 switch and reproduced the issue just as easily by copying a large file via SMB from FreeBSD to Windows 7. To me that pretty much rules out any problem with the switch.

Hope it helps.

-Maxim
Responsible Changed From-To: freebsd-net->jfv Over to maintainer.
Reassign to erj@ for triage. To submitter: is this issue still relevant?
batch change:

For bugs that match the following
- Status Is In progress
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.

Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Close for now.