Bug 205343 - [panic] [geom] [gjournal] g_journal KASSERT triggers for stable/10
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.2-STABLE
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-geom (Nobody)
Reported: 2015-12-15 16:02 UTC by Eugene Grosbein
Modified: 2018-05-28 20:49 UTC

Attachments
kernel.log (1.75 KB, text/plain)
2015-12-15 16:02 UTC, Eugene Grosbein
kgdb.log (5.33 KB, text/plain)
2015-12-15 16:03 UTC, Eugene Grosbein

Description Eugene Grosbein 2015-12-15 16:02:23 UTC
Created attachment 164272
kernel.log

I run 10.2-STABLE/amd64 revision 291061 on my workstation.
It has three disks: two of them are mirrored and the third is a single disk. There are several small and large UFS file systems, and the large ones are gjournalled, for a total of four geom_journals.

It generally runs very stably, but I recently switched to an INVARIANTS/WITNESS-enabled kernel, and today it panicked due to a KASSERT triggered in the geom_journal code (g_journal.c). I have a kernel dump.

The kernel log and kgdb backtrace are attached.
Comment 1 Eugene Grosbein 2015-12-15 16:03:20 UTC
Created attachment 164273
kgdb.log
Comment 2 Eugene Grosbein 2017-02-28 10:59:48 UTC
I've upgraded this system to 11.0-STABLE/amd64 r313803, and eventually the same panic occurred again:

Unread portion of the kernel message buffer:
panic: poffset=59478637056 plength=6144 coffset=59478573056
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe023885e6f0
vpanic() at vpanic+0x186/frame 0xfffffe023885e770
kassert_panic() at kassert_panic+0x126/frame 0xfffffe023885e7e0
g_journal_optimize() at g_journal_optimize+0x33/frame 0xfffffe023885e820
g_journal_flush_send() at g_journal_flush_send+0xfd/frame 0xfffffe023885e860
g_journal_worker() at g_journal_worker+0x888/frame 0xfffffe023885eb70
fork_exit() at fork_exit+0x84/frame 0xfffffe023885ebb0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe023885ebb0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 21h55m19s
Dumping 887 out of 8161 MB:..2%..11%..22%..31% (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort)  (CTRL-C to abort) ..42%..51%..62%..71%..82%..91%

(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:222
#1  0xffffffff80590ac5 in kern_reboot (howto=<value optimized out>)
    at /data2/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff805910a0 in vpanic (fmt=<value optimized out>, 
    ap=<value optimized out>) at /data2/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80590ed6 in kassert_panic (fmt=<value optimized out>)
    at /data2/src/sys/kern/kern_shutdown.c:649
#4  0xffffffff8051a003 in g_journal_optimize (head=<value optimized out>)
    at /data2/src/sys/geom/journal/g_journal.c:1034
#5  0xffffffff8051902d in g_journal_flush_send (sc=<value optimized out>)
    at /data2/src/sys/geom/journal/g_journal.c:1395
#6  0xffffffff80516578 in g_journal_worker (arg=<value optimized out>)
    at /data2/src/sys/geom/journal/g_journal.c:2186
#7  0xffffffff8055cc94 in fork_exit (
    callout=0xffffffff80515cf0 <g_journal_worker>, arg=0xfffff8000699b800, 
    frame=0xfffffe023885ebc0) at /data2/src/sys/kern/kern_fork.c:1040
#8  0xffffffff80814abe in fork_trampoline ()
    at /data2/src/sys/amd64/amd64/exception.S:611
#9  0x0000000000000000 in ?? ()
(kgdb) frame 4
#0  0x0000000000000000 in ?? ()
(kgdb) frame 4
#4  0xffffffff8051a003 in g_journal_optimize (head=<value optimized out>)
    at /data2/src/sys/geom/journal/g_journal.c:1034
1034                            KASSERT(pbp->bio_offset + pbp->bio_length < cbp->bio_offset,
(kgdb) l
1029                            continue;
1030                    }
1031                    /* Is this a neighbour bio? */
1032                    if (pbp->bio_offset + pbp->bio_length != cbp->bio_offset) {
1033                            /* Be sure that bios queue is sorted. */
1034                            KASSERT(pbp->bio_offset + pbp->bio_length < cbp->bio_offset,
1035                                ("poffset=%jd plength=%jd coffset=%jd",
1036                                (intmax_t)pbp->bio_offset,
1037                                (intmax_t)pbp->bio_length,
1038                                (intmax_t)cbp->bio_offset));
Comment 3 longwitz 2017-08-14 10:31:10 UTC
Could you please check whether the problem still exists after updating your system to include commit r322179?

Andreas Longwitz
Comment 4 Eugene Grosbein freebsd_committer freebsd_triage 2017-08-14 10:43:23 UTC
(In reply to longwitz from comment #3)

I run stable/11 here, not head. Should I manually apply r322179 to stable/11 and test?
Comment 5 longwitz 2017-08-14 13:35:52 UTC
Yes, commit r322179 for HEAD also works for stable/10 and should work for stable/11 too.
Comment 6 Eugene Grosbein freebsd_committer freebsd_triage 2017-08-15 13:28:18 UTC
(In reply to longwitz from comment #5)

r322179 applied just fine to stable/11 r322529. I will rebuild and run it and report back eventually.
Comment 7 Eugene Grosbein freebsd_committer freebsd_triage 2017-09-14 17:00:13 UTC
(In reply to longwitz from comment #3)

A month has passed since I started running this patch on my 11.1-STABLE workstation, which has multiple gjournals and an INVARIANTS-enabled kernel, and the problem seems to be gone.
Comment 8 Eitan Adler freebsd_committer freebsd_triage 2018-05-28 19:44:19 UTC
batch change:

For bugs that match the following
-  Status Is In progress 
AND
- Untouched since 2018-01-01.
AND
- Affects Base System OR Documentation

DO:

Reset to open status.


Note:
I did a quick pass but if you are getting this email it might be worthwhile to double check to see if this bug ought to be closed.
Comment 9 Eugene Grosbein freebsd_committer freebsd_triage 2018-05-28 20:49:14 UTC
Fixed and MFC'd 9 months ago.