r227015 (sys/geom/geom_vfs.c) causes a kernel panic when geom providers are destroyed. A system with only one gmirror provider is not affected by this problem. Machines that mirror at the slice level and have multiple gmirror names panic during shutdown. It occurs on both i386 and amd64 systems. An i386 system shows:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x10
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc052e601
stack pointer           = 0x28:0xc1b0ec80
frame pointer           = 0x28:0xc1b0eca0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 13 (g_event)

(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:244
#1  0xc04a2bb3 in db_dump (dummy=-1068308991, dummy2=0, dummy3=-1, dummy4=0xc1b0ea0c "") at /usr/src/sys/ddb/db_command.c:537
#2  0xc04a25f1 in db_command (last_cmdp=0xc08b003c, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:448
#3  0xc04a2755 in db_command_loop () at /usr/src/sys/ddb/db_command.c:501
#4  0xc04a47dc in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229
#5  0xc05c1bfd in kdb_trap (type=12, code=0, tf=0xc1b0ec40) at /usr/src/sys/kern/subr_kdb.c:625
#6  0xc07e6f9f in trap_fatal (frame=0xc1b0ec40, eva=16) at /usr/src/sys/i386/i386/trap.c:966
#7  0xc07e7099 in trap_pfault (frame=0xc1b0ec40, usermode=0, eva=16) at /usr/src/sys/i386/i386/trap.c:839
#8  0xc07e7ec7 in trap (frame=0xc1b0ec40) at /usr/src/sys/i386/i386/trap.c:558
#9  0xc07d3a6c in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#10 0xc052e601 in g_vfs_orphan (cp=0xc1e79440) at atomic.h:246
#11 0xc052905d in g_run_events () at /usr/src/sys/geom/geom_event.c:211
#12 0xc052a648 in g_event_procbody (arg=0x0) at /usr/src/sys/geom/geom_kern.c:122
#13 0xc05631a2 in fork_exit (callout=0xc052a5e0 <g_event_procbody>, arg=0x0, frame=0xc1b0ed28) at /usr/src/sys/kern/kern_fork.c:995
#14 0xc07d3ae4 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275

Fix:
I don't have a solution, but a kernel built with `svn sys/geom/geom_vfs.c -r 227014` does not panic.

How-To-Repeat:
Set up more than 2 gmirror names and shut down.
Responsible Changed From-To: freebsd-bugs->freebsd-geom Over to maintainer(s).
I was tracking down a similar problem. My sparc64 machine with multiple gmirrors stopped rebooting: it hangs after the first mirror is destroyed and never recovers. I have to reset it via LOM. I can confirm the hang is caused by r227015.

Syncing disks, vnodes remaining...1 1 0 0 done
GEOM_MIRROR: Device var: provider mirror/var destroyed.
GEOM_MIRROR: Device var destroyed.
*hang*

Florian
Author: mav
Date: Fri Dec 2 17:09:48 2011
New Revision: 228204
URL: http://svn.freebsd.org/changeset/base/228204

Log:
  Close race between geom destruction on g_vfs_close() when softc destroyed
  and g_vfs_orphan() call that tries to access softc, introduced at r227015.

  PR: kern/162997

Modified:
  head/sys/geom/geom_vfs.c

Modified: head/sys/geom/geom_vfs.c
==============================================================================
--- head/sys/geom/geom_vfs.c	Fri Dec 2 15:47:05 2011	(r228203)
+++ head/sys/geom/geom_vfs.c	Fri Dec 2 17:09:48 2011	(r228204)
@@ -169,8 +169,10 @@ g_vfs_orphan(struct g_consumer *cp)
 	g_topology_assert();

 	gp = cp->geom;
-	sc = gp->softc;
 	g_trace(G_T_TOPOLOGY, "g_vfs_orphan(%p(%s))", cp, gp->name);
+	sc = gp->softc;
+	if (sc == NULL)
+		return;
 	mtx_lock(&sc->sc_mtx);
 	sc->sc_orphaned = 1;
 	destroy = (sc->sc_active == 0);
Hello,

I did not want to crash a real machine many times, so I built a test environment on qemu. md0p2a is labeled gm0 and md0p2h is labeled gm1, and both are mounted as UFS2. After setting sysctl kern.geom.debugflags=7, the machine was rebooted. The console output just before the panic is:

open delta:[r-1w-1e-3] old:[r2w2e6] provider:[r2w2e6] 0xc14eac00(md0)
g_post_event_x(0xc052c830, 0xc166c300, 2, 0)
ref 0xc166c300
g_post_event_x(0xc0a03e40, 0xc1446b00, 2, 0)
g_wither_geom(0xc17ffa80(gm1.sync))
GEOM_MIRROR: Device gm1 destroyed.
g_wither_geom(0xc17ffb00(gm1))
g_orphan_register(mirror/gm1)
g_vfs_orphan(0xc1800400(ffs.mirror/gm1))
kernel trap 12 with interrupts disabled

The situation looks like this: gm1 was destroyed in g_vfs_close(), and then g_vfs_orphan() was called to manipulate gm1. g_vfs_close() had already freed the softc, so g_vfs_orphan() tried to use a freed softc, and that causes the panic.

I think that pairing malloc() in g_vfs_open() with free() in g_vfs_close() for the memory used by mtx_lock() is not a valid method. Either malloc() should not be used, or free() should be called in another function. Alternatively, the other code should be corrected so that it never calls into a destroyed provider.

--
Kaho Toshikazu
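The ordering described above (close first, orphan event afterwards) can be modelled in user space. This is only a minimal sketch, not the kernel code: struct softc, struct geom, and the model_* functions are hypothetical stand-ins, and it assumes the close path clears the softc pointer after freeing it, which is what the NULL check added in r228204 relies on.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real GEOM types. */
struct softc {
	int sc_active;
	int sc_orphaned;
};

struct geom {
	const char *name;
	struct softc *softc;	/* assumed to be NULL once close has run */
};

/* Models the open path: allocate the per-geom state. */
static int
model_open(struct geom *gp)
{
	gp->softc = calloc(1, sizeof(*gp->softc));
	return (gp->softc != NULL ? 0 : -1);
}

/* Models the close path: free the state and clear the pointer. */
static void
model_close(struct geom *gp)
{
	free(gp->softc);
	gp->softc = NULL;
}

/*
 * Models an orphan event that may arrive after close.
 * Returns 1 if the state was marked orphaned, 0 if it was already gone.
 */
static int
model_orphan(struct geom *gp)
{
	struct softc *sc = gp->softc;

	if (sc == NULL)		/* same kind of guard as in r228204 */
		return (0);
	sc->sc_orphaned = 1;
	return (1);
}

/* Replays the order seen in the console trace: close, then orphan. */
static int
replay_close_then_orphan(void)
{
	struct geom gm1 = { "ffs.mirror/gm1", NULL };
	int handled;

	if (model_open(&gm1) != 0)
		return (-1);
	model_close(&gm1);
	handled = model_orphan(&gm1);
	assert(gm1.softc == NULL);
	return (handled);
}
```

Calling replay_close_then_orphan() returns 0 because the late orphan event finds no softc and backs out. Without the NULL check, model_orphan() would write through a null pointer, which is the user-space analogue of the page fault in the report.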
I had missed r228204; with it, the machine reboots without a panic. Thanks.

--
Kaho Toshikazu
State Changed From-To: open->closed Problem fixed by r228204.