Bug 245414 - panic: softdep_setup_blkfree: not free
Summary: panic: softdep_setup_blkfree: not free
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
URL:
Keywords: panic, patch
Depends on:
Blocks:
 
Reported: 2020-04-07 01:32 UTC by sigsys
Modified: 2020-04-13 19:48 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description sigsys 2020-04-07 01:32:44 UTC
I'm getting reproducible panics on up-to-date -CURRENT (r359682) whenever a process coredumps on a UFS partition.  I tried to disable TRIM and it doesn't prevent the panic.

On a VM with a single UFS partition, the panic occurs a second or so after a coredump.  Sending CTRL+\ to any process triggers it every time so far.  On another non-VM system I've tested on it takes a few times before it panics.



<6>[46] pid 4472 (cat), jid 0, uid 1001: exited on signal 3 (core dumped)
[202] WARNING: /mnt: TRIM flag on fs but disk does not support TRIM
<6>[237] pid 23049 (cat), jid 0, uid 1001: exited on signal 3 (core dumped)
<6>[326] pid 26313 (cat), jid 0, uid 1001: exited on signal 3 (core dumped)
<6>[327] pid 26399 (cat), jid 0, uid 1001: exited on signal 3 (core dumped)
<6>[328] pid 26504 (cat), jid 0, uid 1001: exited on signal 3 (core dumped)
[329] panic: softdep_setup_blkfree: not free
[329] cpuid = 2
[329] time = 1586221947
[329] KDB: stack backtrace:
[329] db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe00ff8a6830
[329] vpanic() at vpanic+0x186/frame 0xfffffe00ff8a6880
[329] panic() at panic+0x43/frame 0xfffffe00ff8a68e0
[329] softdep_setup_blkfree() at softdep_setup_blkfree+0x292/frame 0xfffffe00ff8a6940
[329] ffs_blkfree_cg() at ffs_blkfree_cg+0x421/frame 0xfffffe00ff8a69f0
[329] ffs_blkfree() at ffs_blkfree+0xa4/frame 0xfffffe00ff8a6a50
[329] handle_workitem_freefrag() at handle_workitem_freefrag+0xbf/frame 0xfffffe00ff8a6aa0
[329] process_worklist_item() at process_worklist_item+0x272/frame 0xfffffe00ff8a6b40
[329] softdep_process_worklist() at softdep_process_worklist+0xc6/frame 0xfffffe00ff8a6b80
[329] softdep_flush() at softdep_flush+0xef/frame 0xfffffe00ff8a6bb0
[329] fork_exit() at fork_exit+0x7f/frame 0xfffffe00ff8a6bf0
[329] fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00ff8a6bf0
[329] --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
[329] KDB: enter: panic



Unread portion of the kernel message buffer:
panic: softdep_setup_blkfree: not free
cpuid = 5
time = 1586219503
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003c7d3910
vpanic() at vpanic+0x182/frame 0xfffffe003c7d3960
panic() at panic+0x43/frame 0xfffffe003c7d39c0
softdep_setup_blkfree() at softdep_setup_blkfree+0x282/frame 0xfffffe003c7d3a20
ffs_blkfree_cg() at ffs_blkfree_cg+0x440/frame 0xfffffe003c7d3ac0
ffs_blkfree_trim_task() at ffs_blkfree_trim_task+0x54/frame 0xfffffe003c7d3b00
taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe003c7d3b80
taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe003c7d3bb0
fork_exit() at fork_exit+0x80/frame 0xfffffe003c7d3bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003c7d3bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:394
#2  0xffffffff8049b43a in db_dump (dummy=<optimized out>,
    dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>)
    at /usr/src/sys/ddb/db_command.c:575
#3  0xffffffff8049b1fc in db_command (last_cmdp=<optimized out>,
    cmd_table=<optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:482
#4  0xffffffff8049af6d in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:535
#5  0xffffffff8049e168 in db_trap (type=<optimized out>, code=<optimized out>)
    at /usr/src/sys/ddb/db_main.c:253
#6  0xffffffff80c07dd4 in kdb_trap (type=3, code=0, tf=<optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:699
#7  0xffffffff8105cc08 in trap (frame=0xfffffe003c7d3840)
    at /usr/src/sys/amd64/amd64/trap.c:578
#8  <signal handler called>
#9  kdb_enter (why=0xffffffff811eb48a "panic", msg=<optimized out>)
    at /usr/src/sys/kern/subr_kdb.c:486
#10 0xffffffff80bbd8ae in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:902
#11 0xffffffff80bbd643 in panic (
    fmt=0xffffffff81c7c198 <cnputs_mtx> "\340\367\032\201\377\377\377\377")
    at /usr/src/sys/kern/kern_shutdown.c:839
#12 0xffffffff80ec9882 in softdep_setup_blkfree (mp=0xfffffe0034596100,
    bp=<optimized out>, blkno=7563456, frags=1, wkhd=0xfffff80004916b28)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:10917
#13 0xffffffff80eaa950 in ffs_blkfree_cg (ump=<optimized out>,
    fs=0xfffffe0034599000, devvp=0x12, bno=<optimized out>,
    size=<optimized out>, inum=<optimized out>, dephd=0xfffff80004916b28)
    at /usr/src/sys/ufs/ffs/ffs_alloc.c:2335
#14 0xffffffff80eabbd4 in ffs_blkfree_trim_task (ctx=0xfffff800048dcb00,
    pending=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_alloc.c:2420
#15 0xffffffff80c1bffa in taskqueue_run_locked (queue=0xfffff800032e1500)
    at /usr/src/sys/kern/subr_taskqueue.c:476
#16 0xffffffff80c1d084 in taskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:793
#17 0xffffffff80b7b440 in fork_exit (
    callout=0xffffffff80c1cff0 <taskqueue_thread_loop>,
    arg=0xfffff800032f5560, frame=0xfffffe003c7d3c00)
    at /usr/src/sys/kern/kern_fork.c:1051
#18 <signal handler called>
Comment 1 sigsys 2020-04-09 20:32:06 UTC
This does seem to have been caused by r359612.  It apparently fixed a bug in sys/ufs/ffs/ffs_inode.c (the MNT_SUJ flag being ignored), but it must also have exposed other bugs.

By forcing the old behavior, there are no more panics during my tests.  There's still something wrong going on there but way too complicated for me to figure out.

Index: sys/ufs/ffs/ffs_inode.c
===================================================================
--- sys/ufs/ffs/ffs_inode.c	(revision 359751)
+++ sys/ufs/ffs/ffs_inode.c	(working copy)
@@ -243,7 +243,11 @@
 	allerror = 0;
 	needextclean = 0;
 	softdeptrunc = 0;
+#if 0
 	journaltrunc = DOINGSUJ(vp);
+#else
+	journaltrunc = 0;
+#endif
 	if (journaltrunc == 0 && DOINGSOFTDEP(vp) && length == 0)
 		softdeptrunc = !softdep_slowdown(vp);
 	extblocks = 0;
Comment 2 Mark Johnston freebsd_committer 2020-04-13 19:48:19 UTC
Kirk committed the workaround in r359760, so the panics you were seeing ought to be hidden again for now.