| Summary: | [linprocfs] panic: sleeping thread (Sleeping thread ... owns a non-sleepable lock) | | |
|---|---|---|---|
| Product: | Base System | Reporter: | Eirik Oeverby <ltning-freebsd> |
| Component: | kern | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed FIXED | | |
| Severity: | Affects Only Me | | |
| Priority: | Normal | | |
| Version: | 6.1-STABLE | | |
| Hardware: | Any | | |
| OS: | Any | | |
Description
Eirik Oeverby
2006-06-18 10:20:19 UTC
You need to get a stack trace of the thread mentioned in the message that actually misbehaved. It actually should have been printed on the console when it panic'd, since you have DDB in the kernel.

--
John Baldwin

Hm, I thought I had that in my report?

I have to find a way to automate this. I've just moved the installation to a newly partitioned array, to make sure I have room for crash dumps, and I have the following in rc.conf:

dumpdev="AUTO"
dumpdir="/usr/crash"

From my reading, that should be enough. In addition I added KDB_UNATTENDED to the kernel config, as I cannot risk the box being down for hours before I have a chance to get to the debugger console every time. The question is: will it actually do an automatic dump before rebooting, or will a dump *always* require manual intervention? And will a dump contain all necessary information?

Thanks,
/Eirik

On Jun 29, 2006, at 1:26 PM, John Baldwin wrote:
> You need to get a stack trace of the thread mentioned in the message
> that actually misbehaved. It actually should have been printed on
> the console when it panic'd since you have DDB in the kernel.
>
> --
> John Baldwin

On Friday 30 June 2006 10:56, Eirik Øverby wrote:
> Hm, I thought I had that in my report?

From the messages:

Sleeping thread (tid 100082, pid 84236) owns a non-sleepable lock
panic: sleeping thread
cpuid = 0
KDB: enter: panic
[thread pid 84235 tid 100474 ]

This means pid 84236 misbehaved, and pid 84235 found that it had misbehaved. The sole stack trace is of pid 84235:

db> bt
Tracing pid 84235 tid 100474 td 0xffffff006f759000
            ^^^^^ :)

> I have to find a way to automate this. I've just moved the
> installation to a newly partitioned array, to make sure I have room
> for crash dumps, and I have the following in rc.conf:
>
> dumpdev="AUTO"
> dumpdir="/usr/crash"
>
> from my reading, that should be enough.
> In addition I added KDB_UNATTENDED to the kernel config, as I cannot
> risk that the box is down for hours before I have a chance to get to
> the debugger console every time. The question is: Will it actually do
> an automatic dump before rebooting, or will a dump *always* require
> manual intervention? And will a dump contain all necessary information?

You can get a stack trace from the dump using kgdb. You'll have to use 'info threads' in gdb to find the thread with the corresponding pid, then use the gdb 'thread' command to switch to that thread (using the gdb thread number, not the tid or pid), and then do a 'bt' or 'where' to get the trace.

--
John Baldwin

Hi again,

I now have WITNESS and INVARIANTS in the kernel, and today it hung again. It looks somewhat different than before, but I am fairly certain it's the same error. I've tried to include everything asked for, except that I was silly enough to do a "call boot()" before dumping, so it only started dumping after the boot() call failed. Nevertheless, I do have a dump now, if it's of any use. Below you'll find the panic message, a bt, a ps, then the output of a "c" (which is exactly the same as the first message, except it's not chopped off due to terminal size), and finally the panic resulting from the boot() call.
/Eirik

Expensive timeout(9) function: 0xffffffff802944a0(0xffffffff86ca7000) 0.016565232 s
Expensive timeout(9) function: 0xffffffff804f3b00(0) 0.021799841 s
Expensive timeout(9) function: 0xffffffff804455e0(0xffffff007ad7b000) 0.047916735 s
malloc(M_WAITOK) of "1024", forcing M_NOWAIT with the following non-sleepable locks held:
exclusive sleep mutex vm object (standard object) r = 0 (0xffffff0018f3fe00) locked @ /usr/src/sys/compat/linprocfs/lin9
KDB: enter: witness_warn
[thread pid 77487 tid 100323 ]
Stopped at kdb_enter+0x2f: nop
db>
db> bt
Tracing pid 77487 tid 100323 td 0xffffff00531794c0
kdb_enter() at kdb_enter+0x2f
witness_warn() at witness_warn+0x2e0
uma_zalloc_arg() at uma_zalloc_arg+0x1ee
malloc() at malloc+0xab
vn_fullpath() at vn_fullpath+0x56
linprocfs_doprocmaps() at linprocfs_doprocmaps+0x31e
pfs_read() at pfs_read+0x2a7
VOP_READ_APV() at VOP_READ_APV+0x74
vn_read() at vn_read+0x232
dofileread() at dofileread+0x94
kern_readv() at kern_readv+0x60
read() at read+0x4a
ia32_syscall() at ia32_syscall+0x167
Xint0x80_syscall() at Xint0x80_syscall+0x5d
db> ps
pid proc uid ppid pgrp flag stat wmesg wchan cmd
77510 ffffff0027630340 2 77485 77462 0004000 [RUNQ] mv 77509 ffffff0036d2a000 2 77466 77441 0006000 [RUNQ] dd 77502 ffffff00342d0340 0 77457 77457 0004000 [RUNQ] perl 77492 ffffff0072508680 90 77454 77454 0004000 [SLPQ sbwait 0xffffff000d7513a0][SLP] fetchmail 77489 ffffff004d91c9c0 91 77470 77470 0004000 [RUNQ] python2.4 77488 ffffff006a4e0680 91 77469 77469 0004000 [RUNQ] python2.4 77487 ffffff0034673680 0 77451 77451 0004000 [CPU 0] bdc 77485 ffffff0050a969c0 2 77462 77462 0004000 [SLPQ wait 0xffffff0050a969c0][SLP] sh 77480 ffffff004d2b7000 0 77452 77452 0004000 [CPU 1] perl5.8.8 77479 ffffff006a4e0000 0 77453 77453 0004000 [SLPQ vnread 0xffffffff9fe13b08][SLP] antivir 77470 ffffff00506f0680 91 77465 77470 0004000 [SLPQ wait 0xffffff00506f0680][SLP] sh 77469 ffffff0072508340 91 77459 77469 0004000 [SLPQ wait 0xffffff0072508340][SLP] sh 77466 
ffffff0061b3c680 2 77441 77441 0004000 [SLPQ wait 0xffffff0061b3c680][SLP] sh 77465 ffffff00215f3000 0 1026 1026 0000000 [SLPQ piperd 0xffffff00209e4000][SLP] cron 77462 ffffff0014911680 2 77456 77462 0004000 [SLPQ wait 0xffffff0014911680][SLP] sh 77459 ffffff0036d2a680 0 1026 1026 0000000 [SLPQ piperd 0xffffff003c30c900][SLP] cron 77457 ffffff0036028000 0 77449 77457 0004000 [SLPQ wait 0xffffff0036028000][SLP] sh 77456 ffffff0037a4a9c0 0 1026 1026 0000000 [SLPQ piperd 0xffffff0036b82c00][SLP] cron 77454 ffffff004a2c2680 90 77448 77454 0004000 [SLPQ wait 0xffffff004a2c2680][SLP] sh 77453 ffffff0038ef2000 0 77446 77453 0004000 [SLPQ wait 0xffffff0038ef2000][SLP] sh 77452 ffffff0037a4a000 0 77447 77452 0004000 [SLPQ wait 0xffffff0037a4a000][SLP] sh 77451 ffffff004d91c680 0 77445 77451 0004000 [SLPQ wait 0xffffff004d91c680][SLP] sh 77449 ffffff00274119c0 0 1026 1026 0000000 [SLPQ piperd 0xffffff0060f41900][SLP] cron 77448 ffffff002f74f000 0 1026 1026 0000000 [SLPQ piperd 0xffffff0011eb3600][SLP] cron 77447 ffffff0049c40000 0 1026 1026 0000000 [SLPQ piperd 0xffffff0001a8f300][SLP] cron 77446 ffffff004b9c29c0 0 1026 1026 0000000 [SLPQ piperd 0xffffff0008eedc00][SLP] cron 77445 ffffff0036028340 0 1026 1026 0000000 [SLPQ piperd 0xffffff002c885300][SLP] cron 77441 ffffff0038ef2340 2 77438 77441 0004000 [SLPQ wait 0xffffff0038ef2340][SLP] sh 77438 ffffff005c5f0680 0 2842 2842 0000000 [SLPQ piperd 0xffffff0008547000][SLP] cron 76840 ffffff0059fc79c0 80 2553 2553 0000100 [RUNQ] httpd 76832 ffffff0035d0b9c0 80 2553 2553 0000100 [RUNQ] httpd 76831 ffffff00346739c0 80 2553 2553 0000100 [SLPQ select 0xffffffff809c6310][SLP] httpd 76830 ffffff0038ef29c0 80 2553 2553 0000100 [SLPQ select 0xffffffff809c6310][SLP] httpd 76829 ffffff005d0ce680 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 76828 ffffff006053e680 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 76827 ffffff00739e9340 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 
76823 ffffff0032f879c0 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 76821 ffffff0061fd8000 80 2553 2553 0000100 [RUNQ] httpd 76810 ffffff00342d09c0 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 76798 ffffff005342b340 80 2553 2553 0000100 [SLPQ accept 0xffffff004c6b6c66][SLP] httpd 76782 ffffff0032f87000 1000 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 76304 ffffff005115b340 125 2598 2598 0004100 [SLPQ select 0xffffffff809c6310][SLP] pickup 76163 ffffff0032f87680 0 812 812 0000000 [SLPQ select 0xffffffff809c6310][SLP] perl5.8.8 73976 ffffff005ff7d680 1000 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 72957 ffffff00342d0680 0 812 812 0000000 [SLPQ select 0xffffffff809c6310][SLP] perl5.8.8 72944 ffffff005d0ce000 1001 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 72008 ffffff00360289c0 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 72007 ffffff0074950680 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 71893 ffffff006c2ce000 6676 71892 71893 0004002 [SLPQ ttyin 0xffffff0060795810][SLP] bash 71892 ffffff002456d680 6676 71890 71890 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 71890 ffffff004b361000 0 1021 71890 0004100 [SLPQ sbwait 0xffffff002c38c870][SLP] sshd 71889 ffffff004a2c2340 90 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 66142 ffffff0074950000 1000 66141 66142 0004102 [SLPQ pause 0xffffff0074950068][SLP] screen 66141 ffffff0035ae9680 1000 66140 66141 0004002 [SLPQ wait 0xffffff0035ae9680][SLP] bash 66140 ffffff004db7a340 1000 66135 66135 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 66135 ffffff002c60b680 0 1021 66135 0004100 [SLPQ sbwait 0xffffff0049d3d3a0][SLP] sshd 58735 ffffff0051e7d9c0 10002 1 58735 0000000 [SLPQ select 0xffffffff809c6310][SLP] ezb 56610 ffffff0051cfc9c0 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35196 ffffff005fe7f000 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 
35192 ffffff006a4e0340 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35191 ffffff0061b3c000 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35190 ffffff00506f09c0 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35189 ffffff004e009000 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35188 ffffff002f74f340 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 35185 ffffff002c60b340 80 742 742 0000100 [SLPQ accept 0xffffff005d10c9fe][SLP] httpd 14960 ffffff0035ae99c0 6694 14958 14960 0004002 [SLPQ select 0xffffffff809c6310][SLP] irssi 14958 ffffff0034673000 6694 1 14958 0000100 [SLPQ select 0xffffffff809c6310][SLP] screen 12267 ffffff0074950340 2003 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 11765 ffffff00739e9000 2003 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 11500 ffffff006143e340 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 10483 ffffff0047119340 80 2353 2353 0000100 [SLPQ accept 0xffffff004c4d52c6][SLP] httpd 10481 ffffff003be99680 80 2353 2353 0000100 [SLPQ accept 0xffffff004c4d52c6][SLP] httpd 3244 ffffff0035b71000 1017 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 3231 ffffff0047f369c0 90 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 3228 ffffff005ba74000 1006 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 3227 ffffff0061b3c340 90 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 3100 ffffff006143e000 80 2353 2353 0000100 [SLPQ accept 0xffffff004c4d52c6][SLP] httpd 3029 ffffff003bec4340 1026 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 3007 ffffff004b9c2680 1017 979 979 0004000 [SLPQ select 0xffffffff809c6310][SLP] imapd 2962 ffffff0051cfc000 0 1497 2962 0004100 [SLPQ select 0xffffffff809c6310][SLP] fam 2945 ffffff004db7a9c0 0 2944 2945 0004002 [SLPQ ttyin 0xffffff0066152410][SLP] bash 2944 ffffff0032f87340 0 2941 2944 0004102 [SLPQ wait 0xffffff0032f87340][SLP] su 2941 
ffffff0050f1d000 1000 2939 2941 0004002 [SLPQ wait 0xffffff0050f1d000][SLP] bash 2939 ffffff004b864340 1000 1 2939 0000100 [SLPQ select 0xffffffff809c6310][SLP] screen 2911 ffffff004a2c29c0 0 1 2911 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2901 ffffff0050a96000 0 1 2901 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2885 ffffff003ceda680 25 1 2885 0000100 [SLPQ pause 0xffffff003ceda6e8][SLP] sendmail 2879 ffffff0051264680 0 1 2879 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2868 ffffff0052df3340 0 1 2868 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2842 ffffff003be99000 0 1 2842 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2838 ffffff004d91c340 25 1 2838 0000100 [SLPQ pause 0xffffff004d91c3a8][SLP] sendmail 2834 ffffff0048118340 0 1 2834 0000100 [SLPQ select 0xffffffff809c6310][SLP] sendmail 2824 ffffff0049c40680 0 1 2824 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2818 ffffff003c4f8000 0 1 2818 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 2805 ffffff005366a340 0 1 2805 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 2801 ffffff0050f31340 25 1 2801 0000100 [SLPQ pause 0xffffff0050f313a8][SLP] sendmail 2797 ffffff003c4f8680 0 1 2797 0000100 [SLPQ select 0xffffffff809c6310][SLP] sendmail 2790 ffffff003be999c0 0 1 2790 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 2784 ffffff003ca17340 0 1 2784 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2772 ffffff0050f319c0 0 1 2772 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2768 ffffff003ca179c0 88 2708 2707 000c080 (threaded) mysqld thread 0xffffff004809c720 ksegrp 0xffffff004de38240 [SLPQ kserel 0xffffff004de38298][SLP] thread 0xffffff000ac4d000 ksegrp 0xffffff004de38240 [SLPQ kserel 0xffffff004de38298][SLP] thread 0xffffff00380434c0 ksegrp 0xffffff004de38240 [SLPQ select 0xffffffff809c6310][SLP] thread 0xffffff00397e0720 ksegrp 0xffffff004ccb9480 [SLPQ sigwait 0xffffffffb52c5a38][SLP] thread 0xffffff00397e0980 ksegrp 0xffffff003b203360 [SLPQ ksesigwait 
0xffffff003ca17ba8][SLP] 2715 ffffff003ceda340 25 1 2715 0000100 [SLPQ pause 0xffffff003ceda3a8][SLP] sendmail 2711 ffffff003ceda9c0 0 1 2711 0000100 [SLPQ pause 0xffffff003cedaa28][SLP] sendmail 2708 ffffff003bec4680 88 1 2707 0004000 [SLPQ wait 0xffffff003bec4680][SLP] sh 2689 ffffff004cf9b9c0 125 2598 2598 0004100 [SLPQ select 0xffffffff809c6310][SLP] qmgr 2674 ffffff0050817680 0 1 2674 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 2664 ffffff005ff7d340 1001 1 2664 0000000 [SLPQ accept 0xffffff0048155c66][SLP] svnserve 2662 ffffff00512649c0 88 2614 2603 000c080 (threaded) mysqld thread 0xffffff0046cbd4c0 ksegrp 0xffffff0050c43870 [SLPQ kserel 0xffffff0050c438c8][SLP] thread 0xffffff0038043be0 ksegrp 0xffffff0050c43870 [SLPQ select 0xffffffff809c6310][SLP] thread 0xffffff004a031720 ksegrp 0xffffff0050c43870 [SLPQ kserel 0xffffff0050c438c8][SLP] thread 0xffffff0061935720 ksegrp 0xffffff003b203900 [SLPQ kserel 0xffffff003b203958][SLP] thread 0xffffff00474c5260 ksegrp 0xffffff004ccb9090 [SLPQ kserel 0xffffff004ccb90e8][SLP] thread 0xffffff003c17bbe0 ksegrp 0xffffff004ccb9120 [SLPQ sbwait 0xffffff003d154d40][SLP] thread 0xffffff003c121260 ksegrp 0xffffff004ccb91b0 [SLPQ sigwait 0xffffffffb52d9a38][SLP] thread 0xffffff004cf7e260 ksegrp 0xffffff004ccb9240 [SLPQ ksesigwait 0xffffff0051264ba8][SLP] 2631 ffffff0047119000 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 2630 ffffff0050f31000 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 2629 ffffff004d2b7680 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 2628 ffffff0047f36000 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 2627 ffffff00507159c0 80 2556 2556 0000100 [SLPQ accept 0xffffff00451b59fe][SLP] httpd 2614 ffffff0051264340 88 1 2603 0004000 [SLPQ wait 0xffffff0051264340][SLP] sh 2607 ffffff004a2c2000 80 1 2606 0000100 [SLPQ kqread 0xffffff004c3d5600][SLP] lighttpd 2598 ffffff005fe7f9c0 0 1 2598 0004100 [SLPQ select 0xffffffff809c6310][SLP] 
master 2556 ffffff0050817340 0 1 2556 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] httpd 2553 ffffff0050a96340 0 1 2553 0000000 [SLPQ select 0xffffffff809c6310][SLP] httpd 2552 ffffff004d2b79c0 80 2353 2353 0000100 [SLPQ accept 0xffffff004c4d52c6][SLP] httpd 2548 ffffff006253c340 80 2353 2353 0000100 [SLPQ accept 0xffffff004c4d52c6][SLP] httpd 2513 ffffff0051cfc680 88 2417 2330 000c080 (threaded) mysqld thread 0xffffff001455d260 ksegrp 0xffffff005bfb5480 [SLPQ kserel 0xffffff005bfb54d8][SLP] thread 0xffffff00563734c0 ksegrp 0xffffff002d4d4480 [SLPQ kserel 0xffffff002d4d44d8][SLP] thread 0xffffff00160c0720 ksegrp 0xffffff005bfb5480 [SLPQ kserel 0xffffff005bfb54d8][SLP] thread 0xffffff0013e554c0 ksegrp 0xffffff005bfb5480 [SLPQ select 0xffffffff809c6310][SLP] thread 0xffffff004cf7e4c0 ksegrp 0xffffff004de38480 [SLPQ sigwait 0xffffffffb51b2a38][SLP] thread 0xffffff00474c5000 ksegrp 0xffffff004ccb92d0 [SLPQ ksesigwait 0xffffff0051cfc868][SLP] 2507 ffffff0048118000 0 1 2507 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2492 ffffff004b9c2340 0 1 2492 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2460 ffffff004b8649c0 194 1 2460 0000000 [SLPQ kqread 0xffffff0050da5a00][SLP] ircd 2457 ffffff0052df3680 0 1 2457 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2452 ffffff004db7a680 25 1 2452 0000100 [SLPQ pause 0xffffff004db7a6e8][SLP] sendmail 2437 ffffff0047e57000 0 1 2437 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2431 ffffff0047f36340 0 1 2431 0000100 [SLPQ select 0xffffffff809c6310][SLP] sendmail 2417 ffffff004b361680 88 1 2330 0004000 [SLPQ wait 0xffffff004b361680][SLP] sh 2396 ffffff0049c409c0 25 1 2396 0000100 [SLPQ pause 0xffffff0049c40a28][SLP] sendmail 2365 ffffff004cf9b000 0 1 2365 0000100 [SLPQ select 0xffffffff809c6310][SLP] sendmail 2353 ffffff0062199340 0 1 2353 0000000 [SLPQ select 0xffffffff809c6310][SLP] httpd 2338 ffffff005d0ce9c0 0 1 2338 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 2301 ffffff004b9c2000 0 1 2301 0000100 [SLPQ 
select 0xffffffff809c6310][SLP] sshd 2296 ffffff004b864000 0 1 2296 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 2275 ffffff0050f31680 0 1 2275 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 2241 ffffff004cf9b680 25 1 2241 0000100 [SLPQ pause 0xffffff004cf9b6e8][SLP] sendmail 2233 ffffff005ccef340 0 1 2233 0000100 [SLPQ select 0xffffffff809c6310][SLP] sendmail 2208 ffffff0050f1d340 88 2113 1989 000c080 (threaded) mysqld thread 0xffffff00214ec260 ksegrp 0xffffff00625be480 [SLPQ kserel 0xffffff00625be4d8][SLP] thread 0xffffff001afcabe0 ksegrp 0xffffff00625be480 [SLPQ kserel 0xffffff00625be4d8][SLP] thread 0xffffff000f979000 ksegrp 0xffffff00625be480 [SLPQ select 0xffffffff809c6310][SLP] thread 0xffffff0057bd9720 ksegrp 0xffffff004ccb93f0 [SLPQ sigwait 0xffffffffb4fe6a38][SLP] thread 0xffffff005c10b720 ksegrp 0xffffff004de38510 [SLPQ ksesigwait 0xffffff0050f1d528][SLP] 2113 ffffff004d2b7340 88 1 1989 0004000 [SLPQ wait 0xffffff004d2b7340][SLP] sh 2093 ffffff005366a680 0 1 2093 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 1991 ffffff005115b680 0 1 1991 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1921 ffffff004db7a000 0 1 1921 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1874 ffffff005ba749c0 0 1 1874 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1821 ffffff006053e9c0 0 1 1821 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1677 ffffff0050715000 0 1 1677 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1669 ffffff005d0ce340 0 1 1669 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1583 ffffff005115b9c0 0 1 1583 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 1527 ffffff0052df39c0 0 1 1527 0004002 [SLPQ ttyin 0xffffff0079a09810][SLP] getty 1526 ffffff0050817000 0 1 1526 0004002 [SLPQ ttyin 0xffffff0079368410][SLP] getty 1525 ffffff0061834680 0 1 1525 0004002 [SLPQ ttyin 0xffffff0079368c10][SLP] getty 1524 ffffff0050f1d9c0 0 1 1524 0004002 [SLPQ ttyin 0xffffff0079e74c10][SLP] getty 1523 ffffff0050f1d680 0 1 
1523 0004002 [SLPQ ttyin 0xffffff0078447010][SLP] getty 1522 ffffff006257d9c0 0 1 1522 0004002 [SLPQ ttyin 0xffffff0079e6a410][SLP] getty 1521 ffffff0050715340 0 1 1521 0004002 [SLPQ ttyin 0xffffff0079e74010][SLP] getty 1520 ffffff0059fc7680 0 1 1520 0004002 [SLPQ ttyin 0xffffff0079a49010][SLP] getty 1519 ffffff005b1b3340 0 1 1519 0004002 [SLPQ ttyin 0xffffff0079a49810][SLP] getty 1497 ffffff0051cfc340 0 1 1497 0000000 [SLPQ select 0xffffffff809c6310][SLP] inetd 1464 ffffff0051264000 0 1 46 0000002 [SLPQ select 0xffffffff809c6310][SLP] rxstack 1064 ffffff007ac01680 0 1063 1063 0000000 [SLPQ select 0xffffffff809c6310][SLP] fsavd 1063 ffffff005b8ca9c0 0 1 1063 0000000 [SLPQ select 0xffffffff809c6310][SLP] fsavd 1026 ffffff0059fc7000 0 1 1026 0000000 [SLPQ nanslp 0xffffffff8093c780][SLP] cron 1021 ffffff0052ea9680 0 1 1021 0000100 [SLPQ select 0xffffffff809c6310][SLP] sshd 1008 ffffff005c5f09c0 70 1007 1004 0000000 [SLPQ select 0xffffffff809c6310][SLP] postgres 1007 ffffff0052ea9000 70 1004 1004 0000000 [SLPQ select 0xffffffff809c6310][SLP] postgres 1006 ffffff0052ea99c0 70 1004 1004 0000000 [SLPQ select 0xffffffff809c6310][SLP] postgres 1004 ffffff005fe7f340 70 1 1004 0000000 [SLPQ select 0xffffffff809c6310][SLP] postgres 989 ffffff005342b9c0 0 988 989 0004002 [SLPQ select 0xffffffff809c6310][SLP] couriertcpd 988 ffffff005342b680 0 1 988 0000003 [SLPQ piperd 0xffffff00613a9600][SLP] courierlogger 979 ffffff005ff7d9c0 0 978 979 0004002 [SLPQ select 0xffffffff809c6310][SLP] couriertcpd 978 ffffff005ccef9c0 0 1 978 0000003 [SLPQ piperd 0xffffff005a1e8900][SLP] courierlogger 967 ffffff005ba74680 0 966 967 0004002 [SLPQ select 0xffffffff809c6310][SLP] couriertcpd 966 ffffff005b1b39c0 0 1 966 0000003 [SLPQ piperd 0xffffff005a1e8600][SLP] courierlogger 936 ffffff005366a9c0 389 1 936 0008181 (threaded) slapd thread 0xffffff003b892260 ksegrp 0xffffff0061c57d80 [SLPQ kserel 0xffffff0061c57dd8][SLP] thread 0xffffff00372904c0 ksegrp 0xffffff0061c57d80 [SLPQ select 
0xffffffff809c6310][SLP] thread 0xffffff0054338720 ksegrp 0xffffff0061c57d80 [SLPQ kserel 0xffffff0061c57dd8][SLP] thread 0xffffff0056a8d720 ksegrp 0xffffff00625be3f0 [SLPQ ksesigwait 0xffffff005366aba8][SLP] 861 ffffff00600fa000 0 1 860 0000000 [SLPQ select 0xffffffff809c6310][SLP] snmpd 817 ffffff0061fd8680 26 1 817 0000100 [SLPQ select 0xffffffff809c6310][SLP] exim-4.62-0 812 ffffff005ba74340 0 1 812 0000000 [SLPQ select 0xffffffff809c6310][SLP] perl5.8.8 809 ffffff005c5f0340 0 793 789 0000002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 808 ffffff005b1b3000 0 793 789 0000002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 807 ffffff006053e340 0 793 789 0000002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 806 ffffff006257d000 0 793 789 0000002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 805 ffffff006257d340 0 793 789 0000002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 804 ffffff005ccef000 88 774 773 000c082 (threaded) mysqld thread 0xffffff002e921980 ksegrp 0xffffff007ba04bd0 [SLPQ kserel 0xffffff007ba04c28][SLP] thread 0xffffff006419e260 ksegrp 0xffffff007ba04bd0 [SLPQ kserel 0xffffff007ba04c28][SLP] thread 0xffffff000c50c720 ksegrp 0xffffff007ba04bd0 [SLPQ select 0xffffffff809c6310][SLP] thread 0xffffff0057bd9000 ksegrp 0xffffff005bfb56c0 [SLPQ sigwait 0xffffffffb4fd7a38][SLP] thread 0xffffff005c10bbe0 ksegrp 0xffffff005bfb5750 [SLPQ ksesigwait 0xffffff005ccef1e8][SLP] 793 ffffff0062199680 0 789 789 0004002 [SLPQ select 0xffffffff809c6310][SLP] authdaemond 789 ffffff005fe7f680 0 1 789 0000003 [SLPQ piperd 0xffffff0060f41600][SLP] courierlogger 774 ffffff005c5f0000 88 1 773 0004002 [SLPQ wait 0xffffff005c5f0000][SLP] sh 765 ffffff005ccef680 106 1 765 0000100 [SLPQ pause 0xffffff005ccef6e8][SLP] freshclam 760 ffffff006257d680 106 1 760 0008180 (threaded) clamd thread 0xffffff004a031000 ksegrp 0xffffff007ba18090 [SLPQ kserel 0xffffff007ba180e8][SLP] thread 0xffffff005153a000 ksegrp 0xffffff007ba18090 [SLPQ kserel 0xffffff007ba180e8][SLP] 
thread 0xffffff00292da4c0 ksegrp 0xffffff007ba18090 [SLPQ accept 0xffffff005c77d796][SLP] thread 0xffffff0057bd94c0 ksegrp 0xffffff005bfb55a0 [SLPQ ksesigwait 0xffffff006257d868][SLP] 742 ffffff005b1b3680 0 1 742 0000000 [SLPQ select 0xffffffff809c6310][SLP] httpd 720 ffffff005b8ca680 0 1 720 0000000 [SLPQ select 0xffffffff809c6310][SLP] ntpd 643 ffffff00621999c0 0 638 638 0000000 [SLPQ - 0xffffff005d435400][SLP] nfsd 642 ffffff0061834340 0 638 638 0000000 [SLPQ - 0xffffff005fc5e200][SLP] nfsd 640 ffffff005ff7d000 0 638 638 0000000 [SLPQ - 0xffffff005fc5e400][SLP] nfsd 639 ffffff0061834000 0 638 638 0000000 [SLPQ - 0xffffff005f83f600][SLP] nfsd 638 ffffff0062199000 0 1 638 0000000 [SLPQ select 0xffffffff809c6310][SLP] nfsd 636 ffffff0061fd8340 0 1 636 0000000 [SLPQ select 0xffffffff809c6310][SLP] mountd 633 ffffff00600fa9c0 0 1 633 0000000 [SLPQ select 0xffffffff809c6310][SLP] amd 567 ffffff006253c9c0 0 1 567 0000000 [SLPQ select 0xffffffff809c6310][SLP] rpcbind 532 ffffff007ac019c0 53 1 532 0000100 [SLPQ select 0xffffffff809c6310][SLP] named 465 ffffff00600fa340 0 1 465 0000000 [SLPQ select 0xffffffff809c6310][SLP] syslogd 411 ffffff007ac01340 0 1 411 0000000 [SLPQ select 0xffffffff809c6310][SLP] devd 230 ffffff006253c000 0 0 0 0000204 [SLPQ mdwait 0xffffff00619fc000][SLP] md0 212 ffffff0061fd89c0 0 1 212 0000000 [SLPQ pause 0xffffff0061fd8a28][SLP] adjkerntz 45 ffffff007acf99c0 0 0 0 0000204 [SLPQ m:w1 0xffffff000109ec00][SLP] g_mirror boot0s1 44 ffffff007aafd000 0 0 0 0000204 [SLPQ - 0xffffffffb2adebe4][SLP] schedcpu 43 ffffff007aafd340 0 0 0 0000204 [SLPQ sdflush 0xffffffff809e2e00][SLP] softdepflush 42 ffffff007aafd680 0 0 0 0000204 [SLPQ vlruwt 0xffffff007aafd680][SLP] vnlru 41 ffffff007aafd9c0 0 0 0 0000204 [SLPQ syncer 0xffffffff8093c360][SLP] syncer 40 ffffff007ac00000 0 0 0 0000204 [SLPQ psleep 0xffffffff809c6bd8][SLP] bufdaemon 39 ffffff007ac00340 0 0 0 000020c [SLPQ pgzero 0xffffffff809e4780][SLP] pagezero 38 ffffff007ac00680 0 0 0 0000204 [SLPQ psleep 
0xffffffff809e3e4c][SLP] vmdaemon 37 ffffff007ac009c0 0 0 0 0000204 [SLPQ psleep 0xffffffff809e3dfc][SLP] pagedaemon 36 ffffff007ac01000 0 0 0 0000204 [IWAIT] irq7: ppc0 35 ffffff007b9e1680 0 0 0 0000204 [SLPQ - 0xffffff007ab0e848][SLP] fdc0 34 ffffff007b9e19c0 0 0 0 0000204 [IWAIT] swi0: sio 33 ffffff007ab56000 0 0 0 0000204 [IWAIT] irq1: atkbd0 32 ffffff007ab56340 0 0 0 0000204 [SLPQ idle 0xffffffff86ca3000][SLP] aic_recovery1 31 ffffff007ab56680 0 0 0 0000204 [IWAIT] irq25: bge1 ahd1 30 ffffff007ab569c0 0 0 0 0000204 [SLPQ idle 0xffffffff86c9f000][SLP] aic_recovery0 29 ffffff007acf9000 0 0 0 0000204 [LOCK Giant ffffff000123fb00] irq24: bge0 ahd0 28 ffffff007acf9340 0 0 0 0000204 [IWAIT] irq15: ata1 27 ffffff007acf9680 0 0 0 0000204 [IWAIT] irq14: ata0 26 ffffff007ba0a680 0 0 0 0000204 [IWAIT] irq18: fxp0 25 ffffff007ba0a9c0 0 0 0 0000204 [SLPQ usbevt 0xffffffff86c9b420][SLP] usb1 24 ffffff007ba5d000 0 0 0 0000204 [SLPQ usbtsk 0xffffffff80937510][SLP] usbtask 23 ffffff007ba5d340 0 0 0 0000204 [SLPQ usbevt 0xffffffff86c99420][SLP] usb0 22 ffffff007ba5d680 0 0 0 0000204 [IWAIT] irq19: ohci0 ohci+ 21 ffffff007ba5d9c0 0 0 0 0000204 [IWAIT] irq9: acpi0 9 ffffff007b9e1000 0 0 0 0000204 [SLPQ - 0xffffff0000e85500][SLP] kqueue taskq 8 ffffff007b9e1340 0 0 0 0000204 [SLPQ - 0xffffff0000d71400][SLP] acpi_task2 7 ffffff007ba039c0 0 0 0 0000204 [SLPQ - 0xffffff0000d71400][SLP] acpi_task1 6 ffffff007ba28000 0 0 0 0000204 [SLPQ - 0xffffff0000d71400][SLP] acpi_task0 20 ffffff007ba28340 0 0 0 0000204 [IWAIT] swi6: task queue 19 ffffff007ba28680 0 0 0 0000204 [IWAIT] swi6: + 5 ffffff007ba289c0 0 0 0 0000204 [SLPQ - 0xffffff0000d71900][SLP] thread taskq 18 ffffff007ba0a000 0 0 0 0000204 [IWAIT] swi5: + 17 ffffff007ba0a340 0 0 0 0000204 [IWAIT] swi2: cambio 16 ffffff007ba5a340 0 0 0 0000204 [SLPQ - 0xffffffff809350c0][SLP] yarrow 4 ffffff007ba5a680 0 0 0 0000204 [SLPQ - 0xffffffff80937e08][SLP] g_down 3 ffffff007ba5a9c0 0 0 0 0000204 [SLPQ - 0xffffffff80937e00][SLP] g_up 2 
ffffff007ba03000 0 0 0 0000204 [SLPQ - 0xffffffff80937df0][SLP] g_event 15 ffffff007ba03340 0 0 0 0000204 [IWAIT] swi3: vm 14 ffffff007ba03680 0 0 0 000020c [LOCK Giant ffffff000123fb00] swi4: clock sio 13 ffffff007ba33000 0 0 0 0000204 [IWAIT] swi1: net 12 ffffff007ba33340 0 0 0 000020c [Can run] idle: cpu0 11 ffffff007ba33680 0 0 0 000020c [Can run] idle: cpu1 1 ffffff007ba339c0 0 0 1 0004200 [SLPQ wait 0xffffff007ba339c0][SLP] init 10 ffffff007ba5a000 0 0 0 0000204 [SLPQ ktrace 0xffffffff80938f00][SLP] ktrace 0 ffffffff80937f60 0 0 0 0000200 [IWAIT] swapper 11503 ffffff0061b3c9c0 80 2607 2606 0006000 zomb[INACTIVE] perl 2663 ffffff004b864680 80 2607 2606 0006000 zomb[INACTIVE] perl 2632 ffffff0049c40340 80 2607 2606 0006000 zomb[INACTIVE] perl

db> c
malloc(M_WAITOK) of "1024", forcing M_NOWAIT with the following non-sleepable locks held:
exclusive sleep mutex vm object (standard object) r = 0 (0xffffff0018f3fe00) locked @ /usr/src/sys/compat/linprocfs/linprocfs.c:879
KDB: enter: witness_warn
[thread pid 77487 tid 100323 ]
Stopped at kdb_enter+0x2f: nop

db> call boot()
panic: blockable sleep lock (sleep mutex) eventhandler @ /usr/src/sys/kern/subr_eventhandler.c:212
cpuid = 0
KDB: stack backtrace:
panic() at panic+0x253
witness_checkorder() at witness_checkorder+0x5c3
_mtx_lock_flags() at _mtx_lock_flags+0x4a
eventhandler_find_list() at eventhandler_find_list+0x32
boot() at boot+0x96
db_fncall() at db_fncall+0xb1
db_command_loop() at db_command_loop+0x3b5
db_trap() at db_trap+0x63
kdb_trap() at kdb_trap+0xa9
trap() at trap+0x1b4
calltrap() at calltrap+0x5
--- trap 0x3, rip = 0xffffffff8043d68f, rsp = 0xffffffffb5374390, rbp = 0xffffffffb53743a0 ---
kdb_enter() at kdb_enter+0x2f
witness_warn() at witness_warn+0x2e0
uma_zalloc_arg() at uma_zalloc_arg+0x1ee
malloc() at malloc+0xab
vn_fullpath() at vn_fullpath+0x56
linprocfs_doprocmaps() at linprocfs_doprocmaps+0x31e
pfs_read() at pfs_read+0x2a7
VOP_READ_APV() at VOP_READ_APV+0x74
vn_read() at vn_read+0x232
dofileread() at dofileread+0x94
kern_readv() at kern_readv+0x60
read() at read+0x4a
ia32_syscall() at ia32_syscall+0x167
Xint0x80_syscall() at Xint0x80_syscall+0x5d
Uptime: 17h2m19s
Dumping 2047 MB (2 chunks)
chunk 0: 1MB (156 pages) ... ok

Well, the problem is in linprocfs. It is trying to do some very expensive things while holding a mutex. Here's the code excerpt:

    if (lobj) {
        vp = lobj->handle;
        VM_OBJECT_LOCK(lobj);
        off = IDX_TO_OFF(lobj->size);
        if (lobj->type == OBJT_VNODE && lobj->handle) {
            vn_fullpath(td, vp, &name, &freename);
            VOP_GETATTR(vp, &vat, td->td_ucred, td);
            ino = vat.va_fileid;
        }
        flags = obj->flags;
        ref_count = obj->ref_count;
        shadow_count = obj->shadow_count;
        VM_OBJECT_UNLOCK(lobj);

VM_OBJECT_LOCK() is a mutex, and the code can't hold a mutex while calling things like vn_fullpath() and VOP_GETATTR(), as those can block. It probably needs to be reordered: grab copies of the object fields under the object lock, take a reference on the vnode (via vref()), then do the vn_fullpath() and VOP_GETATTR() after dropping the vm object lock, and finally do a vrele() to drop the vnode reference. I'm cc'ing des@ as he's the linprocfs maintainer and should be able to help with this further.

--
John Baldwin

Brilliant! Thanks! I have already disabled linprocfs, so I should expect the system to be stable now.

PS: This was introduced between some of the RCs for 6.1, as far as I can remember.

/Eirik

State Changed
From-To: open->feedback

So this is really a linprocfs bug and you're not using linprocfs any more?

Yes, it seems to have been a linprocfs bug. I no longer require linprocfs on any of the relevant systems, and after disabling it I haven't seen the problem again.

Thanks,
/Eirik

State Changed
From-To: feedback->open

Feedback received. The problem seems to be in linprocfs.

State Changed
From-To: open->closed

Fixed by r161094.
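For reference, the reordering John describes could be sketched roughly as below. This is only an illustrative sketch against the 6.x-era interfaces used in the excerpt above, not the actual change committed in r161094; the surrounding variables (vp, off, flags, ino, name, freename, etc.) are assumed to be declared as in linprocfs_doprocmaps().

```c
if (lobj) {
        VM_OBJECT_LOCK(lobj);
        off = IDX_TO_OFF(lobj->size);
        /* Copy everything we need out of the object while the mutex is held. */
        flags = obj->flags;
        ref_count = obj->ref_count;
        shadow_count = obj->shadow_count;
        vp = NULL;
        if (lobj->type == OBJT_VNODE && lobj->handle != NULL) {
                vp = lobj->handle;
                vref(vp);               /* pin the vnode across the unlock */
        }
        VM_OBJECT_UNLOCK(lobj);
        if (vp != NULL) {
                /* The mutex is dropped, so sleepable calls are now safe. */
                vn_fullpath(td, vp, &name, &freename);
                VOP_GETATTR(vp, &vat, td->td_ucred, td);
                ino = vat.va_fileid;
                vrele(vp);              /* drop the reference taken above */
        }
}
```

The key point is that vref() keeps the vnode from being recycled once the vm object mutex is released, so the sleepable vn_fullpath() and VOP_GETATTR() calls can run without any non-sleepable lock held.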