Bug 194681 - [geom] a race in gmirror cause kernel panic during shutdown
Summary: [geom] a race in gmirror cause kernel panic during shutdown
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 9.2-RELEASE
Hardware: i386 Any
: --- Affects Only Me
Assignee: freebsd-geom (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-10-29 11:32 UTC by alexandre.martins
Modified: 2023-05-23 20:09 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description alexandre.martins 2014-10-29 11:32:58 UTC
There is a race in the shutdown process of gmirror.
Time to time, a synchronization completion event rise after the gmirror device was destroyed


[1510] Waiting (max 60 seconds) for system process `vnlru' to stop...done
[1510] Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
[1510] 
[1510] Waiting (max 60 seconds) for system process `syncer' to stop...Syncing disks, vnodes remaining...0 0 0 done
[1512] All buffers synced.
[1513] Uptime: 25m13s
[1513] GEOM_MIRROR: Device gm0: rebuilding provider ad0 stopped.
[1513] GEOM_MIRROR: Device gm0: provider mirror/gm0 destroyed.
[1513] kernel trap 12 with interrupts disabled
[1513] 
[1513] 
[1513] Fatal trap 12: page fault while in kernel mode
[1513] cpuid = 2; apic id = 12
[1513] fault virtual address    = 0xf000f858
[1513] fault code               = supervisor read, page not present
[1513] instruction pointer      = 0x20:0x407574af
[1513] stack pointer            = 0x28:0x86e2da98
[1513] frame pointer            = 0x28:0x86e2daa0
[1513] code segment             = base rx0, limit 0xfffff, type 0x1b
[1513]                  = DPL 0, pres 1, def32 1, gran 1
[1513] processor eflags = resume, IOPL = 0
[1513] current process          = 13 (g_up)
[1513] Physical memory: 3050 MB
[1513] Dumping 1256 MB: 1241 1225 1209 1193 1177 1161 1145 1129 1113 1097 1081 1065 1049 1033 1017 1001 985 969 953 937 921 905 889 873 857 841 825 809 793 777 761 745 729 713 697 681 665 649 633 617 601 585 569 553 537 521 505 489 473 457 441 425 409 393 377 361 345 329 313 297 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41 25 9


gdb$ bt
#0  doadump (textdump=0x86e2d638) at pcpu.h:249
#1  0x404b7e19 in db_fncall (dummy1=0x407574af, dummy2=0x0, dummy3=0xfffffffe, dummy4=0x86e2d6cc "���\206") at ../../../ddb/db_command.c:573
#2  0x404b824f in db_command (last_cmdp=0x40a1865c, cmd_table=0x0, dopager=0x0) at ../../../ddb/db_command.c:449
#3  0x404b8304 in db_command_script (command=0x40a19585 "call doadump()") at ../../../ddb/db_command.c:520
#4  0x404bc740 in db_script_exec (scriptname=0x4091f3aa "kdb.enter.default", warnifnotfound=<value optimized out>) at ../../../ddb/db_script.c:302
#5  0x404bc83b in db_script_kdbenter (eventname=0x409a480e "unknown") at ../../../ddb/db_script.c:325
#6  0x404ba358 in db_trap (type=0xc, code=0x0) at ../../../ddb/db_main.c:230
#7  0x406d8af6 in kdb_trap (type=0xc, code=0x0, tf=0x86e2da58) at ../../../kern/subr_kdb.c:649
#8  0x408d75af in trap_fatal (frame=0x86e2da58, eva=0xf000f858) at ../../../i386/i386/trap.c:1035
#9  0x408d768d in trap_pfault (frame=0x86e2da58, usermode=0x0, eva=0xf000f858) at ../../../i386/i386/trap.c:859
#10 0x408d872b in trap (frame=0x86e2da58) at ../../../i386/i386/trap.c:555
#11 0x408c1fbc in calltrap () at ../../../i386/i386/exception.s:170
#12 0x407574af in strlen (str=0xf000f859 <Address 0xf000f859 out of bounds>) at ../../../libkern/strlen.c:98
#13 0x406dcd2e in kvprintf (fmt=0x4095dd45 " @ %s:%d", func=0x406dbff0 <snprintf_func>, arg=0x86e2dbdc, radix=0xa, ap=0x86e2dc20 "L8\225@�\003") at ../../../kern/subr_prf.c:823
#14 0x406dd5bb in vsnprintf (str=0x40c85100 "mtx_lock() of spin mutex ", size=0x100, format=0x4095dd2a "mtx_lock() of spin mutex %s @ %s:%d", ap=0x86e2dc1c "Y�") at ../../../kern/subr_prf.c:556
#15 0x406a075e in panic (fmt=0x4095dd2a "mtx_lock() of spin mutex %s @ %s:%d") at ../../../kern/kern_shutdown.c:713
#16 0x4068e638 in _mtx_lock_flags (m=0x54, opts=0x0, file=0x4095384c "../../../geom/mirror/g_mirror.c", line=0x3dc) at ../../../kern/kern_mutex.c:204
#17 0x406315eb in g_mirror_sync_done (bp=0x8d865854) at ../../../geom/mirror/g_mirror.c:988
#18 0x40723adc in biodone (bp=0x8d865854) at ../../../kern/vfs_bio.c:3650
#19 0x40626f08 in g_io_schedule_up (tp=0x8ce9b2f0) at ../../../geom/geom_io.c:805
#20 0x40627acd in g_up_procbody (arg=0x0) at ../../../geom/geom_kern.c:97
#21 0x40670548 in fork_exit (callout=0x40627a30 <g_up_procbody>, arg=0x0, frame=0x86e2dd08) at ../../../kern/kern_fork.c:992
#22 0x408c2064 in fork_trampoline () at ../../../i386/i386/exception.s:279

gdb$ f 17
#17 0x406315eb in g_mirror_sync_done (bp=0x8d865854) at ../../../geom/mirror/g_mirror.c:988
988     ../../../geom/mirror/g_mirror.c: No such file or directory.
        in ../../../geom/mirror/g_mirror.c

gdb$ p bp->bio_from->geom->softc
$3 = (void *) 0x0


 980 static void
 981 g_mirror_sync_done(struct bio *bp)
 982 {
 983         struct g_mirror_softc *sc;
 984 
 985         G_MIRROR_LOGREQ(3, bp, "Synchronization request delivered.");
 986         sc = bp->bio_from->geom->softc;
 987         bp->bio_cflags = G_MIRROR_BIO_FLAG_SYNC;
 988 *crash* mtx_lock(&sc->sc_queue_mtx);
 989         bioq_disksort(&sc->sc_queue, bp);
 990         mtx_unlock(&sc->sc_queue_mtx);
 991         wakeup(sc);
 992 }



This problem look like : https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=162997
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2023-05-23 20:09:34 UTC
Sorry that this report didn't get any attention.  A large number of races relating to gmirror and synchronization in particular have been fixed since this was filed.  Please re-open if the panic is still reproducible in supported versions of FreeBSD.