Bug 155597 - [panic] Kernel panics with "sbdrop" message
Summary: [panic] Kernel panics with "sbdrop" message
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: Unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-net (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-16 12:30 UTC by Vladimir Kutakov
Modified: 2017-01-05 23:04 UTC
See Also:
Description Vladimir Kutakov 2011-03-16 12:30:12 UTC
The machine works as a web server and panics with "sbdrop" after a few hours of operation. The network load is substantial but not extreme (about 20-30 kpps inbound).

Please let me know if any additional information is needed. I would very much like to get this problem resolved.

Here is the backtrace from the dumped core:
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xffffff000706d3a0 in ?? ()
#2  0xffffffff8054d4da in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:421
#3  0xffffffff8054d8f2 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:576
#4  0xffffffff805a9976 in sbdrop_internal (sb=Variable "sb" is not available.
) at /usr/src/sys/kern/uipc_sockbuf.c:843
#5  0xffffffff8069d53c in tcp_do_segment (m=0xffffff01b5618400, th=0xffffff01573e4022, so=0xffffff013eecd2d0, tp=0xffffff016f9f8000, drop_hdrlen=40, tlen=0) at /usr/src/sys/netinet/tcp_input.c:2043
#6  0xffffffff8069ed48 in tcp_input (m=0xffffff01b5618400, off0=Variable "off0" is not available.
) at /usr/src/sys/netinet/tcp_input.c:847
#7  0xffffffff8063a2eb in ip_input (m=0xffffff01b5618400) at /usr/src/sys/netinet/ip_input.c:663
#8  0xffffffff805f2ee1 in ether_demux (ifp=0xffffff00070f5000, m=0xffffff01b5618400) at /usr/src/sys/net/if_ethersubr.c:834
#9  0xffffffff805f315e in ether_input (ifp=0xffffff00070f5000, m=0xffffff01b5618400) at /usr/src/sys/net/if_ethersubr.c:692
#10 0xffffffff8031ddb9 in igb_rxeof (que=Variable "que" is not available.
) at /usr/src/sys/dev/e1000/if_igb.c:4097
#11 0xffffffff8031e1a8 in igb_msix_que (arg=Variable "arg" is not available.
) at /usr/src/sys/dev/e1000/if_igb.c:1309
#12 0xffffffff8052b0e5 in ithread_loop (arg=0xffffff00070fe100) at /usr/src/sys/kern/kern_intr.c:1181
#13 0xffffffff80527b43 in fork_exit (callout=0xffffffff8052af70 <ithread_loop>, arg=0xffffff00070fe100, frame=0xffffff800013fc80) at /usr/src/sys/kern/kern_fork.c:811
#14 0xffffffff80800f3e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:554
#15 0x0000000000000000 in ?? ()
#16 0x0000000000000000 in ?? ()

How-To-Repeat: Direct network traffic at the machine and wait a few hours.
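
For context on frame #4 of the backtrace: sbdrop_internal() removes a given number of bytes from the front of a socket buffer, and, as far as the 8.x-era source goes, it panics with "sbdrop" when the mbuf chain runs out before that many bytes have been removed. The following is a minimal user-space sketch of that invariant, not the kernel code. struct buf, struct sockbuf_model, and model_sbdrop() are hypothetical names introduced only for illustration, and the function returns -1 at the point where the kernel would panic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only a length and a chain link. */
struct buf {
	int len;
	struct buf *next;
};

/* Hypothetical stand-in for a socket buffer. */
struct sockbuf_model {
	struct buf *head;	/* chain of queued data */
	int cc;			/* byte count the buffer claims to hold */
};

/*
 * Drop len bytes from the front of the chain.
 * Returns 0 on success, or -1 where the kernel would call
 * panic("sbdrop"): the chain ended before len bytes were dropped.
 */
int
model_sbdrop(struct sockbuf_model *sb, int len)
{
	struct buf *m;

	assert(sb != NULL);
	m = sb->head;
	while (len > 0) {
		if (m == NULL)
			return (-1);	/* asked to drop more than is queued */
		if (m->len > len) {
			/* Partial drop: trim the front of this buffer. */
			m->len -= len;
			sb->cc -= len;
			len = 0;
			break;
		}
		/* Whole buffer consumed: move on to the next one. */
		len -= m->len;
		sb->cc -= m->len;
		m = m->next;
	}
	sb->head = m;
	return (0);
}
```

In this model the failure path is reached whenever the requested length and the actual chain contents disagree, which is the kind of inconsistency the trace suggests (the caller in frame #5 is the TCP ACK processing in tcp_do_segment()). The sketch does not say how the two got out of sync on the reporter's machine.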
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2011-03-16 12:33:10 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-net

Over to maintainer(s).
Comment 2 Vladimir Kutakov 2011-03-24 09:35:34 UTC
We have looked into this problem a bit. The panic is easy to reproduce with heavy HTTP traffic. We ran ApacheBench from 3 neighboring servers: ab -c40 -n20000000 'http://thehost/somesmallfile'. The panic occurs after just a few minutes.

However, after decreasing hw.igb.num_queues to 4 (the default value is 8), it occurs only after 2 hours.

Finally, a server with hw.igb.num_queues=1 has now been running fine for 7 days.

It seems that the problem occurs during parallel TCP processing.


-- 
WBR,
Vladimir
mailto:vova@ashmanov.com
Comment 3 Arnaud 2011-08-16 21:43:59 UTC
Hi,

Does this still happen with 9.0-BETA?

If so, could this be a use-after-free, where an mbuf is freed (during
an m_pullup() or similar) while the old reference is still kept around
and gets added to the sockbuf, after which the mbuf is reallocated and
corrupts the sockbuf?
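
The stale-reference hazard raised in this comment can be sketched in user space. This is only an illustration of the hypothesis, not a confirmed cause of the panic and not kernel code. What it relies on from m_pullup(9) is that the call may return a different mbuf than the one passed in, and that on failure it frees the chain and returns NULL. struct chunk, chunk_new(), pullup_model(), and ensure_contiguous() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf with inline storage. */
struct chunk {
	size_t cap;		/* bytes of storage available */
	size_t len;		/* bytes of valid data */
	unsigned char data[];	/* storage follows the header */
};

/* Allocate an empty chunk with cap bytes of zeroed storage. */
struct chunk *
chunk_new(size_t cap)
{
	struct chunk *c = calloc(1, sizeof(*c) + cap);

	if (c != NULL)
		c->cap = cap;
	return (c);
}

/*
 * Model of the m_pullup() contract: make sure at least need bytes of
 * storage are contiguous. May free the old chunk and return a new one;
 * on failure the old chunk is freed and NULL is returned. Either way,
 * the pointer passed in must not be used again by the caller.
 */
struct chunk *
pullup_model(struct chunk *c, size_t need)
{
	struct chunk *n;

	if (need <= c->cap)
		return (c);
	n = chunk_new(need);
	if (n == NULL) {
		free(c);
		return (NULL);
	}
	n->len = c->len;
	memcpy(n->data, c->data, c->len);
	free(c);		/* any saved copy of the old pointer is now stale */
	return (n);
}

/*
 * Safe calling pattern: overwrite the caller's pointer with the result
 * so that no stale reference survives the call.
 */
int
ensure_contiguous(struct chunk **cp, size_t need)
{
	*cp = pullup_model(*cp, need);
	return (*cp == NULL ? -1 : 0);
}
```

A caller that kept its own copy of the old pointer and later queued it (for example, onto a socket buffer) would be referencing freed memory, which is the scenario the question above describes.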
Comment 4 Vladimir Kutakov 2012-01-11 13:55:41 UTC
We have tried RELENG_8_2 and the panic doesn't happen anymore. Many thanks to the FreeBSD team.

On Aug 17, 2011, at 1:43 AM, Arnaud Lacombe wrote:

> Hi,
>
> Does this still happen with 9.0-BETA ?
>
> If so, could this be a use-after-free, where an mbuf is freed (during
> an m_pullup() or alike), but the old reference is still kept around,
> gets added to the sockbuf, then the mbuf is re-allocated and corrupt
> the sockbuf ?


--
Vladimir Kutakov
Technical Director
ZAO "Поисковые технологии"
vova@ashmanov.com