Created attachment 164272 [details] kernel.log
I run 10.2-STABLE/amd64 revision 291061 on my workstation. It has three disks: two are mirrored and one is standalone. There are several small and large UFS file systems; the large ones are gjournaled, for a total of four geom_journal devices. The system is generally very stable, but I recently switched to an INVARIANTS/WITNESS-enabled kernel, and today it panicked due to a KASSERT triggered in the geom_journal code. I have a kernel dump. The kernel log and kgdb backtrace are attached.
Created attachment 164273 [details] kgdb.log
I've upgraded this system to 11.0-STABLE/amd64 r313803, and eventually the same panic occurred again:

Unread portion of the kernel message buffer:
panic: poffset=59478637056 plength=6144 coffset=59478573056
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe023885e6f0
vpanic() at vpanic+0x186/frame 0xfffffe023885e770
kassert_panic() at kassert_panic+0x126/frame 0xfffffe023885e7e0
g_journal_optimize() at g_journal_optimize+0x33/frame 0xfffffe023885e820
g_journal_flush_send() at g_journal_flush_send+0xfd/frame 0xfffffe023885e860
g_journal_worker() at g_journal_worker+0x888/frame 0xfffffe023885eb70
fork_exit() at fork_exit+0x84/frame 0xfffffe023885ebb0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe023885ebb0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 21h55m19s
Dumping 887 out of 8161 MB:..2%..11%..22%..31% (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) (CTRL-C to abort) ..42%..51%..62%..71%..82%..91%

(kgdb) bt
#0 doadump (textdump=1) at pcpu.h:222
#1 0xffffffff80590ac5 in kern_reboot (howto=<value optimized out>) at /data2/src/sys/kern/kern_shutdown.c:366
#2 0xffffffff805910a0 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /data2/src/sys/kern/kern_shutdown.c:759
#3 0xffffffff80590ed6 in kassert_panic (fmt=<value optimized out>) at /data2/src/sys/kern/kern_shutdown.c:649
#4 0xffffffff8051a003 in g_journal_optimize (head=<value optimized out>) at /data2/src/sys/geom/journal/g_journal.c:1034
#5 0xffffffff8051902d in g_journal_flush_send (sc=<value optimized out>) at /data2/src/sys/geom/journal/g_journal.c:1395
#6 0xffffffff80516578 in g_journal_worker (arg=<value optimized out>) at /data2/src/sys/geom/journal/g_journal.c:2186
#7 0xffffffff8055cc94 in fork_exit (callout=0xffffffff80515cf0 <g_journal_worker>, arg=0xfffff8000699b800, frame=0xfffffe023885ebc0) at /data2/src/sys/kern/kern_fork.c:1040
#8 0xffffffff80814abe in fork_trampoline () at /data2/src/sys/amd64/amd64/exception.S:611
#9 0x0000000000000000 in ?? ()
(kgdb) frame 4
#0 0x0000000000000000 in ?? ()
(kgdb) frame 4
#4 0xffffffff8051a003 in g_journal_optimize (head=<value optimized out>) at /data2/src/sys/geom/journal/g_journal.c:1034
1034 KASSERT(pbp->bio_offset + pbp->bio_length < cbp->bio_offset,
(kgdb) l
1029 continue;
1030 }
1031 /* Is this a neighbour bio? */
1032 if (pbp->bio_offset + pbp->bio_length != cbp->bio_offset) {
1033 /* Be sure that bios queue is sorted. */
1034 KASSERT(pbp->bio_offset + pbp->bio_length < cbp->bio_offset,
1035 ("poffset=%jd plength=%jd coffset=%jd",
1036 (intmax_t)pbp->bio_offset,
1037 (intmax_t)pbp->bio_length,
1038 (intmax_t)cbp->bio_offset));
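For readers following along, here is a minimal userland sketch of the check at g_journal.c:1032-1034, fed with the values from the panic message. It is not the kernel code: `struct bio_range`, `range_end()`, `is_neighbour()` and `sorted_with_gap()` are hypothetical names made up for illustration, and only the two comparisons are taken from the listing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two struct bio fields that the check reads.
 * The real struct bio in the kernel has many more members.
 */
struct bio_range {
	int64_t bio_offset;	/* start of the request, in bytes */
	int64_t bio_length;	/* size of the request, in bytes */
};

/* First byte offset past the end of the request. */
static int64_t
range_end(const struct bio_range *bp)
{
	assert(bp->bio_length >= 0);
	return (bp->bio_offset + bp->bio_length);
}

/* Mirrors the test on line 1032: does cbp start exactly where pbp ends? */
static bool
is_neighbour(const struct bio_range *pbp, const struct bio_range *cbp)
{
	return (range_end(pbp) == cbp->bio_offset);
}

/*
 * Mirrors the KASSERT condition on line 1034: for a non-neighbour pair,
 * pbp must end strictly before cbp starts, i.e. the queue is sorted.
 */
static bool
sorted_with_gap(const struct bio_range *pbp, const struct bio_range *cbp)
{
	return (range_end(pbp) < cbp->bio_offset);
}
```

With poffset=59478637056 and plength=6144, the previous bio ends at 59478643200, which is past coffset=59478573056. The pair is therefore not a neighbour, and `sorted_with_gap()` is false, which is the condition under which the KASSERT fires. Note that coffset is smaller than poffset itself, so the current bio starts before the previous one: the queue was out of order, not merely overlapping.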
Can you please check whether the problem still exists after updating your system to include commit r322179?
Andreas Longwitz
(In reply to longwitz from comment #3) I run stable/11 here, not head. Should I manually apply r322179 to stable/11 and test?
Yes, commit r322179 for HEAD also works on stable/10 and should work on stable/11 too.
(In reply to longwitz from comment #5) r322179 applied just fine to stable/11 r322529. I will rebuild and run it and report back eventually.
(In reply to longwitz from comment #3) A month has passed since I started running this patch on my 11.1-STABLE workstation, which has multiple gjournals and an INVARIANTS-enabled kernel, and the problem seems to be gone.
batch change:
For bugs that match the following
- Status Is In progress AND
- Untouched since 2018-01-01. AND
- Affects Base System OR Documentation
DO: Reset to open status.
Note: I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Fixed and MFC'd 9 months ago.