Bug 183390

Summary: [ixgbe] 10gigabit networking problems
Product: Base System Reporter: pataki.antal
Component: kern Assignee: jfv
Status: Closed Overcome By Events    
Severity: Affects Only Me CC: adrian, kevin, sbruno
Priority: Normal Keywords: IntelNetworking
Version: Unspecified   
Hardware: Any   
OS: Any   

Description pataki.antal 2013-10-28 11:10:00 UTC
Hardware: IBM x3500 m4 (2x E5-2620, 16GB RAM)
Intel X520 DA2 10Gbit NIC (PCI-Express x8)
IBM ServeRAID M1115 with 8x600GB 15k rpm SAS disk.

System setup:
The system is installed into a geli'ed zpool.

The Intel 10Gbit NIC is directly connected to another IBM x3500 m4 (with the
same Intel card), which is running VMware ESXi 5.5.

The system provides an NFS share to the ESXi system through the 10 gigabit
connection.

The problem:

With no load, if I ping the other machine through the 10 gigabit
connection, the ping output looks like this:

root@storagex:~ # ping 10.3.3.2
PING 10.3.3.2 (10.3.3.2): 56 data bytes
(...cutoff...)
64 bytes from 10.3.3.2: icmp_seq=89 ttl=64 time=0.106ms
ping: sendto: File too large
64 bytes from 10.3.3.2: icmp_seq=91 ttl=64 time=0.092ms
..etc..etc.

Sometimes the "ping: sendto: File too large" message doesn't appear for
many hours; sometimes it floods the console!

When this starts to happen, the logs on the other end (the ESXi machine) show
that the StorageApdHandler process starts a timer for the NFS share,
because it did not receive the NFS heartbeat back.

After a few seconds, the ESXi machine starts to show this in the log:

NFSLock: xxx: Stop accessing fd 0xxxxxxx x

A few seconds later, on the ESXi machine, the StorageApdHandler
puts the NFS share into the All Paths Down state and drops the NFS connection.

After this, if I try to ping the FreeBSD machine from the ESXi machine,
ESXi shows "host is down", and on the FreeBSD machine ping keeps repeating
the "ping: sendto: File too large" message.

The only thing that resolves this is running ifconfig ix1 down followed by
ifconfig ix1 up.

After resetting the interface like this, the connection and
ping sometimes work for minutes, sometimes for hours, and then
the situation described above starts again.

I have screenshots of the "ping: sendto: File too large" message.

We tried the default ixgbe driver and the newest one from Intel's
website.  The issue is the same with both drivers.

We observed that if the transfer rate over the 10Gbit connection exceeds
5Gbit/sec, the problem appears sooner: typically within 20-40 minutes,
sometimes after 5 minutes.

If we leave the machines only pinging each other, the problem sometimes
does not appear for days, but it does eventually appear.

How-To-Repeat: Install an Intel X520 10gbit NIC into a FreeBSD 9.2 system.

Connect it to another host via 10gbit ethernet. (We tried with ESXi
5.1 and 5.5.)

Start pinging the other end and leave it running for hours.

Generate some high traffic (utilise more than 5Gbit/sec of the connection),
preferably via NFS to an ESXi 5.5 host on the other side.

Wait some hours.
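
A long-running reproduction is easier to judge if the failures are counted rather than watched on the console. The following is only a minimal sketch: the helper name count_efbig is hypothetical, and the sample lines are copied from the ping output earlier in this report.

```shell
#!/bin/sh
# Count "File too large" (EFBIG) send failures in ping output read from stdin.
# count_efbig is an illustrative helper name, not part of any FreeBSD tool.
count_efbig() {
    # grep -c prints 0 and exits non-zero when nothing matches; ignore that status.
    grep -c 'sendto: File too large' || true
}

# Sample taken from the ping output quoted in this report.
sample='64 bytes from 10.3.3.2: icmp_seq=89 ttl=64 time=0.106ms
ping: sendto: File too large
64 bytes from 10.3.3.2: icmp_seq=91 ttl=64 time=0.092ms'

printf '%s\n' "$sample" | count_efbig
```

Against a live link, the same helper could be fed with something like `ping -c 3600 10.3.3.2 2>&1 | count_efbig`; the 2>&1 matters because ping reports the sendto error on stderr.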
Comment 1 pataki.antal 2013-10-30 21:03:30 UTC
Why is this non-critical?
The other side drops the connection because of this. This is very
critical, for example, if the faulty system is a storage system...
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2013-10-31 02:43:11 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Comment 3 Xin LI freebsd_committer freebsd_triage 2014-03-17 22:41:15 UTC
Responsible Changed
From-To: freebsd-net->jfv

Hi, Jack, 

Some FreeNAS users [1] have encountered a similar issue too, can you take 
a look at this one? 

Thanks in advance! 

[1] https://bugs.freenas.org/issues/4560
Comment 4 csforgeron 2014-03-21 03:08:56 UTC
To keep you in the loop;

I'm having a very similar problem in 10.0-RELEASE

We've made some headway - Disabling TSO (ifconfig ix0 -tso) seems to avoid
the symptom, but of course that's just a temporary fix.

Try it, and see if you have stability again.


The discussion is on the freebsd-net mailing list, at
http://lists.freebsd.org/pipermail/freebsd-net/2014-March/038061.html

It's a bit long, but follow along as it may help your situation. I hope to
test changes to the TSO code tomorrow.
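
The temporary workaround mentioned above can be wrapped so that it is reviewed before it touches a production interface. This is a sketch only: IFACE, DRY_RUN, run and disable_tso are hypothetical names introduced for illustration, and the only real command used is the ifconfig invocation already quoted in this comment.

```shell
#!/bin/sh
# Sketch: disable TSO on an interface, defaulting to a dry run.
# IFACE and DRY_RUN are illustrative variables, not from the thread.
IFACE="${IFACE:-ix0}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Turn off TCP segmentation offload on interface $1.
disable_tso() {
    run ifconfig "$1" -tso
}

disable_tso "$IFACE"
```

Running it with DRY_RUN=0 as root would apply the change; the setting does not survive a reboot unless it is also added to the interface configuration.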
Comment 5 John Hickey 2014-04-28 06:58:40 UTC
I am seeing this too on 10.0-RELEASE.  Disabling TSO doesn't seem to 
help it either.  The server was undergoing fairly heavy load related to 
ZFS at the time.  Network was fairly quiet since the NFS connections I 
did have ended up hanging.

System specs:

FreeBSD 10.0-RELEASE-p1 #3 r264309: Wed Apr  9 17:01:09 PDT 2014
2x Opteron 6128 (16 total cores)
128GB RAM
Intel X520 NIC
~22TB ZFS filesystem
Comment 6 kevin 2014-08-01 17:49:52 UTC
I'm seeing this too on one of our busier boxes running 10.0-RELEASE. The ix device worked okay at first, but under heavy load we'd see things like:

# ping x.x.x.x
PING x.x.x.x (x.x.x.x): 56 data bytes
64 bytes from x.x.x.x: icmp_seq=0 ttl=61 time=55.950 ms
ping: sendto: File too large
64 bytes from x.x.x.x: icmp_seq=2 ttl=61 time=55.972 ms
ping: sendto: File too large
64 bytes from x.x.x.x: icmp_seq=4 ttl=61 time=55.944 ms
ping: sendto: File too large

TCP traffic seemed unaffected, but things that used UDP like NFS or NTP got it too:

ntpd[46659]: sendto(204.9.54.119) (fd=26): File too large
lldpd_FreeBSD_amd64[1407]: unable to send packet on real device for ix0: File too large

I set -tso and -vlanhwtso, and that didn't immediately help. I then set the interface down/up and that seemed to immediately fix it. Not sure yet if it's a permanent fix or if it'll return after a while.

Any debugging we can do to help with this if it returns?


dev.ix.0.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 2.5.15
dev.ix.0.%driver: ix
dev.ix.0.%location: slot=0 function=0
dev.ix.0.%pnpinfo: vendor=0x8086 device=0x10fb subvendor=0x8086 subdevice=0x0006 class=0x020000
Comment 7 Adrian Chadd freebsd_committer freebsd_triage 2014-08-14 22:06:40 UTC
Hi,

This should've been fixed in -HEAD, -10 and -9 for at least the ixgbe NICs.

Log:
  MFC: r264630
  For NFS mounts using rsize,wsize=65536 over TSO enabled
  network interfaces limited to 32 transmit segments, there
  are two known issues.
  The more serious one is that for an I/O of slightly less than 64K,
  the net device driver prepends an ethernet header, resulting in a
  TSO segment slightly larger than 64K. Since m_defrag() copies this
  into 33 mbuf clusters, the transmit fails with EFBIG.
  A tester indicated observing a similar failure using iSCSI.

  The second less critical problem is that the network
  device driver must copy the mbuf chain via m_defrag()
  (m_collapse() is not sufficient), resulting in measurable overhead.

  This patch reduces the default size of if_hw_tsomax
  slightly, so that the first issue is avoided.
  Fixing the second issue will require a way for the
  network device driver to inform tcp_output() that it
  is limited to 32 transmit segments.

HEAD: r264630
-10: r265414
-9: r265292

Additionally, there were some issues with the way mbufs were repacked, resulting in EFBIG being returned. I'm not sure where/when that was fixed - search for 'NFS client READ performance on -current' on the freebsd-net mailing list.
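
The arithmetic behind the first issue in the commit log can be sketched as follows. The 32-segment limit comes from the log; the 2048-byte cluster size, the 14-byte Ethernet header, the 65535-byte payload and the 18-byte reduction are assumptions made for illustration and are not stated in this thread.

```shell
#!/bin/sh
# Assumptions (not stated in the thread): 2048-byte mbuf clusters and a
# 14-byte Ethernet header. The 32-segment limit is from the commit log.
CLUSTER_BYTES=2048
ETHER_HDR=14
MAX_SEGS=32

# Number of clusters needed to hold a frame of $1 bytes (ceiling division).
clusters_needed() {
    echo $(( ($1 + CLUSTER_BYTES - 1) / CLUSTER_BYTES ))
}

# Report whether a TSO payload of $1 bytes still fits in MAX_SEGS clusters
# once the Ethernet header has been prepended.
check_tso() {
    frame=$(( $1 + ETHER_HDR ))
    n=$(clusters_needed "$frame")
    if [ "$n" -gt "$MAX_SEGS" ]; then
        echo "payload=$1 frame=$frame clusters=$n -> EFBIG"
    else
        echo "payload=$1 frame=$frame clusters=$n -> ok"
    fi
}

check_tso 65535                # assumed payload just under 64K
check_tso $(( 65535 - 18 ))    # hypothetical slightly reduced limit
```

Under these assumptions the first call needs 33 clusters and fails, while the second fits in 32. That is the effect the log describes for reducing the default if_hw_tsomax slightly.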
Comment 8 Sean Bruno freebsd_committer freebsd_triage 2015-08-03 17:39:51 UTC
This looks to be fixed in 10.2r and head.  If not, please reopen the ticket.