Summary: | [route] [panic] Panic when inject routes in kernel
---|---
Product: | Base System
Reporter: | Eduardo Schoedler <eschoedler>
Component: | kern
Assignee: | freebsd-bugs (Nobody) <bugs>
Status: | Open
Severity: | Affects Only Me
Keywords: | crash
Priority: | Normal
Version: | 8.2-STABLE
Hardware: | Any
OS: | Any
Description
Eduardo Schoedler
2011-03-02 05:20:10 UTC
Responsible Changed From-To: freebsd-bugs->freebsd-net
Attempt to classify and reassign.

Hello,

The culprit here is RADIX_MPATH. When the kernel is built with it, it crashes with the following backtrace (missing on PR):

#0 doadump () at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:224
#1 0xffffffff803c8bee in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:419
#2 0xffffffff803c9021 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:592
#3 0xffffffff8049cc15 in rtfree (rt=Variable "rt" is not available.
) at /usr/src/sys/net/route.c:446
#4 0xffffffff804a0856 in route_output (m=0xffffff006f14ab00,
so=0xffffff004dfbd7f8) at /usr/src/sys/net/rtsock.c:863
#5 0xffffffff804321e1 in sosend_generic (so=0xffffff004dfbd7f8, addr=0x0,
uio=0xffffff824413ca90, top=0xffffff006f14ab00, control=0x0, flags=0,
td=0xffffff00062a6460) at /usr/src/sys/kern/uipc_socket.c:1260
#6 0xffffffff804126c2 in soo_write (fp=Variable "fp" is not available.
) at /usr/src/sys/kern/sys_socket.c:102
#7 0xffffffff8040b23b in dofilewrite (td=0xffffff00062a6460, fd=4,
fp=0xffffff00063fe2d0, auio=0xffffff824413ca90, offset=Variable "offset" is not available.
) at file.h:239
#8 0xffffffff8040b550 in kern_writev (td=0xffffff00062a6460, fd=4,
auio=0xffffff824413ca90) at /usr/src/sys/kern/sys_generic.c:447
#9 0xffffffff8040b5d5 in write (td=Variable "td" is not available.
) at /usr/src/sys/kern/sys_generic.c:363
#10 0xffffffff804077a5 in syscallenter (td=0xffffff00062a6460,
sa=0xffffff824413cba0) at /usr/src/sys/kern/subr_trap.c:315
#11 0xffffffff8064a6ab in syscall (frame=0xffffff824413cc40)
at /usr/src/sys/amd64/amd64/trap.c:944
#12 0xffffffff80632c52 in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:381
#13 0x0000000800bc5b3c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Looks like it is leaking the 'rt->rt_refcnt' and as result it crashes/panic at RTFREE() on the end of route_output().

I don't have access to this live system to dig further (i.e. reduce the test case).

Cheers,
Luiz

Hello,

I've found another (easy) way to reproduce the problem with two scripts: routes-add.sh and routes-remove.sh.
First run routes-add.sh for a while; then execute routes-remove.sh.
Cancel with CTRL+C and execute routes-remove.sh again.

Scripts:
========

    # cat routes-add.sh
    #!/usr/local/bin/bash
    for a in {11..16}; do
      for b in {1..255}; do
        for c in {1..255}; do
          echo -n Adding route $a.$b.$c.0/24...
          route -q delete -net $a.$b.$c.0/24
          echo OK.
        done
      done
    done

    # cat routes-remove.sh
    #!/usr/local/bin/bash
    for a in {11..16}; do
      for b in {1..255}; do
        for c in {1..255}; do
          echo -n Removing route $a.$b.$c.0/24...
          route -q delete -net $a.$b.$c.0/24
          echo OK.
        done
      done
    done

Backtrace:
==========

# cat /var/crash/core.txt.1
<snip>
Unread portion of the kernel message buffer:
panic: rtfree 2
cpuid = 4
KDB: stack backtrace:
#0 0xffffffff80416e43 at kdb_backtrace+0x5e
#1 0xffffffff803e68a8 at panic+0x182
#2 0xffffffff804b2274 at rtalloc1_fib+0
#3 0xffffffff804b5b92 at route_output+0x304
#4 0xffffffff8044b776 at sosend_generic+0x366
#5 0xffffffff8042cd5c at soo_write+0x54
#6 0xffffffff80425bee at dofilewrite+0x7a
#7 0xffffffff80425ec1 at kern_writev+0x52
#8 0xffffffff80425f3f at write+0x4e
#9 0xffffffff80422408 at syscallenter+0x186
#10 0xffffffff8065b4f7 at syscall+0x40
#11 0xffffffff806449f2 at Xfast_syscall+0xe2
Uptime: 37m16s
Physical memory: 4084 MB
Dumping 497 MB:VOP_STRATEGY: bp is not locked but should be
482 466 450 434 418 402 386 370 354 338 322 306 290 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2

#0 doadump () at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:224
#1 0xffffffff803e6425 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:419
#2 0xffffffff803e6892 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:592
#3 0xffffffff804b2274 in rtfree (rt=Variable "rt" is not available.
) at /usr/src/sys/net/route.c:446
#4 0xffffffff804b5b92 in route_output (m=0xffffff0004790700, so=0xffffff00b07ead48)
at /usr/src/sys/net/rtsock.c:863
#5 0xffffffff8044b776 in sosend_generic (so=0xffffff00b07ead48, addr=0x0,
uio=0xffffff830ff98a90, top=0xffffff0004790700, control=0x0, flags=0,
td=0xffffff0004a13000) at /usr/src/sys/kern/uipc_socket.c:1260
#6 0xffffffff8042cd5c in soo_write (fp=Variable "fp" is not available.
) at /usr/src/sys/kern/sys_socket.c:102
#7 0xffffffff80425bee in dofilewrite (td=0xffffff0004a13000, fd=3,
fp=0xffffff0004977af0, auio=0xffffff830ff98a90, offset=Variable "offset" is not available.
) at file.h:239
#8 0xffffffff80425ec1 in kern_writev (td=0xffffff0004a13000, fd=3,
auio=0xffffff830ff98a90) at /usr/src/sys/kern/sys_generic.c:447
#9 0xffffffff80425f3f in write (td=Variable "td" is not available.
) at /usr/src/sys/kern/sys_generic.c:363
#10 0xffffffff80422408 in syscallenter (td=0xffffff0004a13000, sa=0xffffff830ff98ba0)
at /usr/src/sys/kern/subr_trap.c:315
#11 0xffffffff8065b4f7 in syscall (frame=0xffffff830ff98c40)
at /usr/src/sys/amd64/amd64/trap.c:944
#12 0xffffffff806449f2 in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:381
#13 0x0000000800735afc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
<snip>

Again, removing RADIX_MPATH from kernel, it's working fine.

Regards,

--
Eduardo Schoedler

On Mar 4, 2011, at 9:10 AM, Eduardo Schoedler wrote:
> Hello,
>
> I've found another (easy) way to reproduce the problem with two scripts:
> routes-add.sh and routes-remove.sh.
> First run routes-add.sh for a while; then execute routes-remove.sh.
> Cancel with CTRL+C and execute routes-remove.sh again.
>
<snip>
Hi Eduardo,
I've found another problem while trying something like what you proposed, but it can be easily reproduced by just trying to remove a network route that is not in the table (probably what your script does when you press CTRL+C and restart it).
The problem I've found produces the following backtrace:
#0 doadump () at pcpu.h:244
244 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump () at pcpu.h:244
#1 0xc04d7de9 in db_fncall (dummy1=1, dummy2=0, dummy3=-1056933504,
dummy4=0xe69ee798 "") at /usr/src/sys/ddb/db_command.c:548
#2 0xc04d81e1 in db_command (last_cmdp=0xc0e303dc, cmd_table=0x0, dopager=1)
at /usr/src/sys/ddb/db_command.c:445
#3 0xc04d833a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4 0xc04da25d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
#5 0xc0902672 in kdb_trap (type=3, code=0, tf=0xe69ee948)
at /usr/src/sys/kern/subr_kdb.c:533
#6 0xc0c137bb in trap (frame=0xe69ee948) at /usr/src/sys/i386/i386/trap.c:717
#7 0xc0bfc7ec in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#8 0xc09024fa in kdb_enter (why=0xc0ce86fa "panic", msg=0xc0ce86fa "panic")
at cpufunc.h:71
#9 0xc08cea24 in panic (fmt=0xc0cfedcb "radix node disappeared")
at /usr/src/sys/kern/kern_shutdown.c:574
#10 0xc0996900 in rtrequest1_fib (req=2, info=0xe69eea50, ret_nrt=0xe69eea84,
fibnum=Variable "fibnum" is not available.
) at /usr/src/sys/net/route.c:968
#11 0xc099abbd in route_output (m=0xc43a6b00, so=0xc48b0000)
at /usr/src/sys/net/rtsock.c:630
#12 0xc09959da in raw_usend (so=0xc48b0000, flags=Variable "flags" is not available.
)
at /usr/src/sys/net/raw_usrreq.c:228
#13 0xc0999275 in rts_send (so=0xc48b0000, flags=0, m=0xc43a6b00, nam=0x0,
control=0x0, td=0xc49d18a0) at /usr/src/sys/net/rtsock.c:354
#14 0xc093ceed in sosend_generic (so=0xc48b0000, addr=0x0, uio=0xe69eec28,
top=0xc43a6b00, control=0x0, flags=0, td=0xc49d18a0)
at /usr/src/sys/kern/uipc_socket.c:1301
#15 0xc0938ddf in sosend (so=0xc48b0000, addr=0x0, uio=0xe69eec28, top=0x0,
control=0x0, flags=0, td=0xc49d18a0)
at /usr/src/sys/kern/uipc_socket.c:1345
#16 0xc0920ae3 in soo_write (fp=0xc4690d58, uio=0xe69eec28,
active_cred=0xc47e8e00, flags=0, td=0xc49d18a0)
at /usr/src/sys/kern/sys_socket.c:100
#17 0xc0919a65 in dofilewrite (td=0xc49d18a0, fd=3, fp=0xc4690d58,
auio=0xe69eec28, offset=-1, flags=0) at file.h:238
#18 0xc091b208 in kern_writev (td=0xc49d18a0, fd=3, auio=0xe69eec28)
at /usr/src/sys/kern/sys_generic.c:447
#19 0xc091b31f in write (td=0xc49d18a0, uap=0xe69eecec)
at /usr/src/sys/kern/sys_generic.c:363
#20 0xc090fda3 in syscallenter (td=0xc49d18a0, sa=0xe69eece4)
at /usr/src/sys/kern/subr_trap.c:344
#21 0xc0c13064 in syscall (frame=0xe69eed28)
at /usr/src/sys/i386/i386/trap.c:1080
#22 0xc0bfc851 in Xint0x80_syscall ()
at /usr/src/sys/i386/i386/exception.s:266
#23 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
Are you sure that your scripts produce the backtrace you posted? I cannot reproduce that here...
The problem I found ("radix node disappeared") occurs when removing a nonexistent route (route delete x.y.w.z/24, where x.y.w.z/24 is _not_ in the route table). It was related to the code that checks for a gateway when there are multiple gateways for a route, which clearly was not the case here.
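To illustrate why restarting routes-remove.sh after CTRL+C would end up deleting routes that are no longer in the table, here is a rough dry-run sketch of that sequence. It is only a model under stated assumptions: TABLE, add_route, del_route and simulate are hypothetical names, the four prefixes are a cut-down stand-in for the ranges in the scripts, and nothing here calls route(8) or touches the kernel.

```shell
#!/bin/sh
# Dry-run model only: TABLE is a plain string standing in for the routing
# table, so no real routes are added or deleted.

TABLE=""

# add_route PREFIX: mark PREFIX as present in the pretend table.
add_route() {
    TABLE="$TABLE $1"
}

# del_route PREFIX: remove PREFIX if present, otherwise report that the
# delete targets a route that is not in the table.
del_route() {
    case " $TABLE " in
        *" $1 "*)
            TABLE=$(printf '%s\n' $TABLE | grep -v -x -F -- "$1" | tr '\n' ' ')
            echo "delete $1: ok"
            ;;
        *)
            echo "delete $1: not in table"
            ;;
    esac
}

# simulate: add four prefixes, run a remove pass that stops after two
# deletes (standing in for CTRL+C), then run the remove pass again in full.
simulate() {
    TABLE=""
    prefixes="11.1.1.0/24 11.1.2.0/24 11.1.3.0/24 11.1.4.0/24"

    for p in $prefixes; do
        add_route "$p"
    done

    # First remove pass, interrupted after two prefixes.
    n=0
    for p in $prefixes; do
        if [ "$n" -ge 2 ]; then
            break
        fi
        del_route "$p"
        n=$((n + 1))
    done

    # Second remove pass restarts from the first prefix.
    for p in $prefixes; do
        del_route "$p"
    done
}

simulate
```

In this model the second pass reports "not in table" for the two prefixes the interrupted pass had already removed, which is the "delete a route that is not there" case described above. Whether the real scripts reach the panic this way is still only a guess.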
After some thought I've crafted the following patch, which fixes the "radix node disappeared" problem (for me, obviously...). Can you try your scripts with this patch? I'm not sure yet whether this is related to the first problem you reported.
Thanks,
Luiz
For bugs matching the following criteria:

Status: In Progress
Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped

Keyword: crash – in lieu of summary line prefix: [panic]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk).

Keyword descriptions and search interface: <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>