Bug 213232

Summary: [tcp] [panic] tcp_output() Panic String: tcp_output: len > IP_MAXPACKET on -head and stable/11
Product: Base System Reporter: Hiren Panchasara <hiren>
Component: kern Assignee: freebsd-net (Nobody) <net>
Status: Closed Overcome By Events    
Severity: Affects Only Me CC: ae, glebius, kbowling, sbruno, tuexen
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   

Description Hiren Panchasara freebsd_committer freebsd_triage 2016-10-05 18:46:04 UTC
(kgdb) bt
#0  __curthread () at ./machine/pcpu.h:221
#1  doadump (textdump=492866688) at /d0/hiren/freebsd/sys/kern/kern_shutdown.c:298
#2  0xffffffff80395066 in db_fncall_generic (nargs=0, addr=<optimized out>, rv=<optimized out>, args=<optimized out
    at /d0/hiren/freebsd/sys/ddb/db_command.c:581
#3  db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
    at /d0/hiren/freebsd/sys/ddb/db_command.c:629
#4  0xffffffff80394bc9 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=<optimized out>
    at /d0/hiren/freebsd/sys/ddb/db_command.c:453
#5  0xffffffff80394924 in db_command_loop () at /d0/hiren/freebsd/sys/ddb/db_command.c:506
#6  0xffffffff803979df in db_trap (type=<optimized out>, code=<optimized out>) at /d0/hiren/freebsd/sys/ddb/db_main
#7  0xffffffff80a90003 in kdb_trap (type=<optimized out>, code=<optimized out>, tf=<optimized out>) at /d0/hiren/fr
#8  0xffffffff80ec5ce4 in trap (frame=0xfffffe201d6090f0) at /d0/hiren/freebsd/sys/amd64/amd64/trap.c:559
#9  <signal handler called>
#10 kdb_enter (why=0xffffffff813f7eb8 "panic", msg=0x80 <error: Cannot access memory at address 0x80>) at /d0/hiren
#11 0xffffffff80a4e41f in vpanic (fmt=<optimized out>, ap=0xfffffe201d609280) at /d0/hiren/freebsd/sys/kern/kern_sh
#12 0xffffffff80a4e276 in kassert_panic (fmt=0xffffffff8142a88d "%s: len > IP_MAXPACKET") at /d0/hiren/freebsd/sys/
#13 0xffffffff80c3660a in tcp_output (tp=<optimized out>) at /d0/hiren/freebsd/sys/netinet/tcp_output.c:987
#14 0xffffffff80c333a2 in tcp_do_segment (m=<optimized out>, th=<optimized out>, so=<optimized out>, tp=<optimized 
    tlen=<optimized out>, iptos=<optimized out>, ti_locked=<error reading variable: Cannot access memory at address
    at /d0/hiren/freebsd/sys/netinet/tcp_input.c:3169
#15 0xffffffff80c2f690 in tcp_input (mp=<optimized out>, offp=<optimized out>, proto=<optimized out>) at /d0/hiren/
#16 0xffffffff80bbc901 in ip_input (m=0x4) at /d0/hiren/freebsd/sys/netinet/ip_input.c:809
#17 0xffffffff80b57630 in netisr_dispatch_src (proto=1, source=0, m=0xfffff807bf186300) at /d0/hiren/freebsd/sys/ne
#18 0xffffffff80b4183a in ether_demux (ifp=<optimized out>, m=0x80) at /d0/hiren/freebsd/sys/net/if_ethersubr.c:848
#19 0xffffffff80b42637 in ether_input_internal (ifp=<optimized out>, m=0x80) at /d0/hiren/freebsd/sys/net/if_ethers
#20 ether_nh_input (m=<optimized out>) at /d0/hiren/freebsd/sys/net/if_ethersubr.c:667
#21 0xffffffff80b57630 in netisr_dispatch_src (proto=5, source=0, m=0xfffff807bf186300) at /d0/hiren/freebsd/sys/ne
#22 0xffffffff80b41ba2 in ether_input (ifp=<optimized out>, m=0x0) at /d0/hiren/freebsd/sys/net/if_ethersubr.c:757
#23 0xffffffff805e2dcb in ixgbe_rx_input (rxr=<optimized out>, ifp=<optimized out>, m=0xfffff807bf186300, ptype=<op
    at /d0/hiren/freebsd/sys/dev/ixgbe/ix_txrx.c:1704
#24 ixgbe_rxeof (que=<optimized out>) at /d0/hiren/freebsd/sys/dev/ixgbe/ix_txrx.c:1985
#25 0xffffffff805da70c in ixgbe_msix_que (arg=0xfffff80114812870) at /d0/hiren/freebsd/sys/dev/ixgbe/if_ix.c:1572
#26 0xffffffff80a147a6 in intr_event_execute_handlers (p=<optimized out>, ie=<optimized out>) at /d0/hiren/freebsd/
#27 0xffffffff80a14e26 in ithread_execute_handlers (ie=<optimized out>, p=<optimized out>) at /d0/hiren/freebsd/sys
#28 ithread_loop (arg=<optimized out>) at /d0/hiren/freebsd/sys/kern/kern_intr.c:1356
#29 0xffffffff80a11eb4 in fork_exit (callout=0xffffffff80a14d80 <ithread_loop>, arg=0xfffff801147babe0, frame=0xfff
    at /d0/hiren/freebsd/sys/kern/kern_fork.c:1038
#30 <signal handler called>
(kgdb) 

This box panicked while sitting idle. It is just running -head with no special configs.

ix0@pci0:4:0:0: class=0x020000 card=0x061115d9 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82599ES 10-Gigabit SFI/SFP+ Network Connection'
    class      = network
    subclass   = ethernet

TSO is on.

http://svn.freebsd.org/changeset/base/211317 added the check where it's panicking.

r211317 was committed in response to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=132832

I've not looked into it in much detail.
Comment 1 Hiren Panchasara freebsd_committer freebsd_triage 2016-10-06 19:02:31 UTC
Just hit this again on the same box.
Comment 2 Andrey V. Elsukov freebsd_committer freebsd_triage 2016-10-07 04:01:43 UTC
(In reply to Hiren Panchasara from comment #1)
> Just hit this again on the same box.

What values do len, hdrlen and ipoptlen have in frame 13? Or can you show `info locals` from the same frame?
Comment 3 Hiren Panchasara freebsd_committer freebsd_triage 2016-10-07 05:21:48 UTC
(In reply to Andrey V. Elsukov from comment #2)
All optimized out, but because it has happened (at least) twice, I can add some instrumentation to catch those values.

I'll update when I have some more info.
Comment 4 Hiren Panchasara freebsd_committer freebsd_triage 2016-10-13 16:31:03 UTC
Just hit this on an up-to-date stable/11 system.
Comment 5 Hiren Panchasara freebsd_committer freebsd_triage 2016-10-13 19:07:35 UTC
r306769 | jtl | 2016-10-06 09:28:34 -0700 (Thu, 06 Oct 2016) | 8 lines

Remove "long" variables from the TCP stack (not including the modular
congestion control framework).

may help. I'm updating my box and will see whether I get the same crash again.
Comment 6 Hiren Panchasara freebsd_committer freebsd_triage 2016-10-16 06:16:51 UTC
Now with r306769, len is int32_t, so it can actually hold a negative value. I am seeing a panic at:


	/*
	 * This KASSERT is here to catch edge cases at a well defined place.
	 * Before, those had triggered (random) panic conditions further down.
	 */
	KASSERT(len >= 0, ("[%s:%d]: len < 0", __func__, __LINE__));

After adding some debug output above that, I found that len becomes negative in the 'else' branch of the following path:

		if (tso) {

			<blah>

		} else {
			len = tp->t_maxseg - optlen - ipoptlen;
			sendalot = 1;
		}

I found that tp->t_maxseg = 2, optlen = 12 and ipoptlen = 0, resulting in len = -10.

Clearly, a t_maxseg of 2 looks wrong, since it is supposed to represent the MSS.

I wonder if something in the changes in https://svnweb.freebsd.org/base?view=revision&revision=293284 caused this.

Still investigating.
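
The arithmetic above may also explain why the same condition first showed up as "len > IP_MAXPACKET" and later as "len < 0". The following is only a sketch under assumptions not confirmed in this report: that t_maxseg, optlen and ipoptlen are unsigned 32-bit values, and that len was a 64-bit long on amd64 before r306769. The helper names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define IP_MAXPACKET 65535	/* maximum IP packet size, as in <netinet/ip.h> */

/*
 * ASSUMPTION: models the code before r306769, with unsigned 32-bit
 * operands and a 64-bit signed len (a long on amd64).
 */
static int64_t
len_as_long(uint32_t t_maxseg, uint32_t optlen, uint32_t ipoptlen)
{
	/* The subtraction wraps in unsigned 32-bit arithmetic first. */
	uint32_t wrapped = t_maxseg - optlen - ipoptlen;

	return ((int64_t)wrapped);
}

/* ASSUMPTION: models the code after r306769, where len is int32_t. */
static int32_t
len_as_int32(uint32_t t_maxseg, uint32_t optlen, uint32_t ipoptlen)
{
	uint32_t wrapped = t_maxseg - optlen - ipoptlen;

	/* Reinterpreting the wrapped value as signed yields a negative len. */
	return ((int32_t)wrapped);
}

/* Self-check using the values reported in this comment. */
static void
check_reported_values(void)
{
	assert(len_as_long(2, 12, 0) == 4294967286LL);
	assert(len_as_long(2, 12, 0) > IP_MAXPACKET);
	assert(len_as_int32(2, 12, 0) == -10);
}
```

Under those assumptions, t_maxseg = 2 with optlen = 12 gives a huge positive len in the older code and -10 in the newer code, so both panics would point at the same bogus t_maxseg.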
Comment 7 commit-hook freebsd_committer freebsd_triage 2016-10-18 02:40:48 UTC
A commit references this bug:

Author: hiren
Date: Tue Oct 18 02:40:25 UTC 2016
New revision: 307545
URL: https://svnweb.freebsd.org/changeset/base/307545

Log:
  Make sure tcp_mss() has the same check as tcp_mss_update() to have t_maxseg set
  to at least 64.

  This is still just a coverup to avoid kernel panic and not an actual fix.

  PR:			213232
  Reviewed by:		glebius
  MFC after:		1 week
  Sponsored by:		Limelight Networks
  Differential Revision:	https://reviews.freebsd.org/D8272

Changes:
  head/sys/netinet/tcp_input.c
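
For illustration, the kind of lower bound the log describes could look roughly like the sketch below. This is not the committed change: MIN_MAXSEG, clamp_maxseg() and payload_len() are hypothetical names, and only the floor value of 64 comes from the commit log.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_MAXSEG 64	/* hypothetical name; the value 64 is from the commit log */

/* Never let the segment size drop below the floor. */
static uint32_t
clamp_maxseg(uint32_t mss)
{
	return (mss < MIN_MAXSEG ? MIN_MAXSEG : mss);
}

/* Same expression as the 'else' branch quoted in comment 6. */
static int32_t
payload_len(uint32_t t_maxseg, uint32_t optlen, uint32_t ipoptlen)
{
	return ((int32_t)(t_maxseg - optlen - ipoptlen));
}

/* Self-check with the values from comment 6 (t_maxseg = 2, optlen = 12). */
static void
check_clamp(void)
{
	assert(clamp_maxseg(2) == 64);
	assert(clamp_maxseg(1460) == 1460);
	assert(payload_len(clamp_maxseg(2), 12, 0) == 52);
}
```

With the values from comment 6, the clamped t_maxseg yields len = 52 instead of -10. As the log says, this only avoids the panic; it does not explain how t_maxseg ended up at 2.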
Comment 8 commit-hook freebsd_committer freebsd_triage 2016-11-01 22:41:04 UTC
A commit references this bug:

Author: hiren
Date: Tue Nov  1 22:40:25 UTC 2016
New revision: 308184
URL: https://svnweb.freebsd.org/changeset/base/308184

Log:
  MFC r307545
  Make sure tcp_mss() has the same check as tcp_mss_update() to have t_maxseg set
  to at least 64.

  This is still just a coverup to avoid kernel panic and not an actual fix.

  PR:			213232
  Sponsored by:		Limelight Networks

Changes:
_U  stable/11/
  stable/11/sys/netinet/tcp_input.c
Comment 9 Michael Tuexen freebsd_committer freebsd_triage 2021-03-02 16:50:35 UTC
I think a lot has changed in this area. Please re-open if the problem still exists.