Host: 14.1-RELEASE
VM: FreeBSD 15.0-CURRENT (GENERIC) #0 main-n270847-42c545c5f8a2: Thu Jun 20 03:41:06 UTC 2024
(First snapshot build to include the 9pfs client)

Steps to reproduce:

1. Download the UFS or ZFS VM-IMAGE from 2024-06-20
2. Create a directory to share such as /root/9p
3. Boot a VM with that directory:
   -s 3,virtio-9p,sharename=/root/9p/
4. VM /boot/loader.conf
   virtio_p9fs_load=YES
   #vfs.root.mountfrom="p9fs:sharename"
   VM /etc/fstab
   sharename /mnt p9fs rw 0 0
5. Enjoy the VM
   dd(1) to the 9pfs share works reliably
6. VM: shutdown -r now or reboot

Result:

root@freebsd:~ # mount
/dev/gpt/rootfs on / (ufs, local, soft-updates)
devfs on /dev (devfs)
/dev/gpt/efiboot0 on /boot/efi (msdosfs, local)
sharename on /mnt (p9fs, local)

root@freebsd:~ # reboot
Jun 21 05:47:45 freebsd reboot[697]: rebooted by root
Jun 21 05:47:45 freebsd syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 3 0 0 done
All buffers synced.
Kernel page fault with the following non-sleepable locks held:
exclusive rw vfs hash (vfs hash) r = 0 (0xffffffff81803400) locked @ /usr/src/sys/kern/vfs_hash.c:146
stack backtrace:
#0 0xffffffff80bb9c9c at witness_debugger+0x6c
#1 0xffffffff80bbae79 at witness_warn+0x3e9
#2 0xffffffff8105fdb0 at trap_pfault+0x80
#3 0xffffffff81031448 at calltrap+0x8
#4 0xffffffff82173e5b at p9fs_cleanup+0x12b
#5 0xffffffff82175fd7 at p9fs_reclaim+0x37
#6 0xffffffff81129192 at VOP_RECLAIM_APV+0x32
#7 0xffffffff80c3ecb9 at vgonel+0x3a9
#8 0xffffffff80c3e28a at vflush+0x34a
#9 0xffffffff821739c3 at p9fs_unmount+0x73
#10 0xffffffff80c32ae5 at dounmount+0x7b5
#11 0xffffffff80c3f66a at vfs_unmountall+0x6a
#12 0xffffffff80c0ff93 at bufshutdown+0x323
#13 0xffffffff80b446a3 at kern_reboot+0x703
#14 0xffffffff80b43f46 at sys_reboot+0x396
#15 0xffffffff810606d8 at amd64_syscall+0x158
#16 0xffffffff81031d5b at fast_syscall_common+0xf8

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x47
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80c2ba5e
stack pointer = 0x28:0xfffffe00689c99f0
frame pointer = 0x28:0xfffffe00689c9a00
code segment = base rx0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 697 (reboot)
rdi: ffffffff81803400 rsi: 0000000000000008 rdx: 0000000000000000
rcx: fffff80001fd7040  r8: 0000000000000003  r9: ffffffffffffffff
rax: ffffffffffffffff rbx: fffff80001fd7000 rbp: fffffe00689c9a00
r10: 0000000000000000 r11: 0000000000000004 r12: fffff80001d9ec00
r13: 0000000000000000 r14: fffff80001fd7000 r15: fffff80001d9ed80
trap number = 12
panic: page fault
cpuid = 0
time = 1718948875
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00689c96c0
vpanic() at vpanic+0x13f/frame 0xfffffe00689c97f0
panic() at panic+0x43/frame 0xfffffe00689c9850
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00689c98b0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00689c9920
calltrap() at calltrap+0x8/frame 0xfffffe00689c9920
--- trap 0xc, rip = 0xffffffff80c2ba5e, rsp = 0xfffffe00689c99f0, rbp = 0xfffffe00689c9a00 ---
vfs_hash_remove() at vfs_hash_remove+0x2e/frame 0xfffffe00689c9a00
p9fs_cleanup() at p9fs_cleanup+0x12b/frame 0xfffffe00689c9a40
p9fs_reclaim() at p9fs_reclaim+0x37/frame 0xfffffe00689c9a60
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x32/frame 0xfffffe00689c9a80
vgonel() at vgonel+0x3a9/frame 0xfffffe00689c9af0
vflush() at vflush+0x34a/frame 0xfffffe00689c9c30
p9fs_unmount() at p9fs_unmount+0x73/frame 0xfffffe00689c9c80
dounmount() at dounmount+0x7b5/frame 0xfffffe00689c9cf0
vfs_unmountall() at vfs_unmountall+0x6a/frame 0xfffffe00689c9d20
bufshutdown() at bufshutdown+0x323/frame 0xfffffe00689c9d70
kern_reboot() at kern_reboot+0x703/frame 0xfffffe00689c9db0
sys_reboot() at sys_reboot+0x396/frame 0xfffffe00689c9e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00689c9f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00689c9f30
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x337c8919491a, rsp = 0x337c85f24f28, rbp = 0x337c85f25cf0 ---
KDB: enter: panic
[ thread pid 697 tid 100078 ]
Stopped at kdb_enter+0x33: movq $0,0x105f692(%rip)
db> bt
Tracing pid 697 tid 100078 td 0xfffff80001fb7000
kdb_enter() at kdb_enter+0x33/frame 0xfffffe00689c97f0
panic() at panic+0x43/frame 0xfffffe00689c9850
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00689c98b0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00689c9920
calltrap() at calltrap+0x8/frame 0xfffffe00689c9920
--- trap 0xc, rip = 0xffffffff80c2ba5e, rsp = 0xfffffe00689c99f0, rbp = 0xfffffe00689c9a00 ---
vfs_hash_remove() at vfs_hash_remove+0x2e/frame 0xfffffe00689c9a00
p9fs_cleanup() at p9fs_cleanup+0x12b/frame 0xfffffe00689c9a40
p9fs_reclaim() at p9fs_reclaim+0x37/frame 0xfffffe00689c9a60
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x32/frame 0xfffffe00689c9a80
vgonel() at vgonel+0x3a9/frame 0xfffffe00689c9af0
vflush() at vflush+0x34a/frame 0xfffffe00689c9c30
p9fs_unmount() at p9fs_unmount+0x73/frame 0xfffffe00689c9c80
dounmount() at dounmount+0x7b5/frame 0xfffffe00689c9cf0
vfs_unmountall() at vfs_unmountall+0x6a/frame 0xfffffe00689c9d20
bufshutdown() at bufshutdown+0x323/frame 0xfffffe00689c9d70
kern_reboot() at kern_reboot+0x703/frame 0xfffffe00689c9db0
sys_reboot() at sys_reboot+0x396/frame 0xfffffe00689c9e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00689c9f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00689c9f30
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x337c8919491a, rsp = 0x337c85f24f28, rbp = 0x337c85f25cf0 ---
Created attachment 251902
p9fs_vnops
There appears to be some duplicated code in p9fs_cleanup(), possibly left over from a bad merge. The attached patch fixed the panic for me.
Judging by what I believe is the original implementation, that code really is duplicated:

https://github.com/Juniper/virtfs/blob/jnpr/virtfs/sys/dev/virtio/9pfs/virtfs_vnops.c#L97

I'll create a PR on GitHub today if nobody pushes a fix first.

Michael, did you have a chance to test the fix?
(In reply to Danilo Egea Gondolfo from comment #3)

File to patch: /usr/src/sys/fs/p9fs/p9fs_vnops.c
Patching file /usr/src/sys/fs/p9fs/p9fs_vnops.c using Plan A...
Hunk #1 succeeded at 125.
done

(buildkernel/installkernel)

(boot)

root@freebsd:~ # uname -a
FreeBSD freebsd 15.0-CURRENT FreeBSD 15.0-CURRENT #0: Sat Jul 13 01:43:28 UTC 2024 root@t14:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64

root@freebsd:~ # reboot
Jul 13 01:46:23 freebsd syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 0 0 done
All buffers synced.
Uptime: 54s
root@t14:~/imagine-work # sh: turning off NDELAY mode

Yes! No page fault on shutdown! Hopefully that's it. Good work!
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a6ca6dfd60b66eec563bd473d96b31f0be1de80a

commit a6ca6dfd60b66eec563bd473d96b31f0be1de80a
Author:     Danilo Egea Gondolfo <danilo@FreeBSD.org>
AuthorDate: 2024-07-09 19:07:18 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2024-07-13 03:40:09 +0000

    p9fs: remove duplicated code

    This code is using the vnode after it has been released and causing a
    panic when a p9fs shared volume is unmounted.

    In fact, it seems like it's just duplicated code left behind from a bad
    merge.

    PR:             279887
    Reported by:    Michael Dexter
    Reviewed by:    imp
    Pull Request:   https://github.com/freebsd/freebsd-src/pull/1323

 sys/fs/p9fs/p9fs_vnops.c | 10 ----------
 1 file changed, 10 deletions(-)
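For readers who want to see why a duplicated cleanup block is dangerous, here is a minimal userspace sketch of the general pattern. It is not the p9fs code: `struct node`, `node_link()`, `node_unlink()`, `cleanup_once()` and `cleanup_duplicated()` are all hypothetical names invented for this illustration, and the `linked` flag exists only so the second removal can be detected safely instead of faulting the way the kernel did in vfs_hash_remove().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object on a hash chain (not struct vnode). */
struct node {
	struct node *next;	/* next node on the chain */
	struct node **prevp;	/* address of the pointer that points at us */
	int linked;		/* 1 while the node is on a chain */
};

/* Insert n at the head of the chain. */
static void
node_link(struct node **head, struct node *n)
{
	assert(head != NULL && n != NULL);
	n->next = *head;
	if (n->next != NULL)
		n->next->prevp = &n->next;
	n->prevp = head;
	*head = n;
	n->linked = 1;
}

/*
 * Remove n from its chain. Returns 0 on success, or -1 if n was already
 * removed. A list without this check would follow stale pointers here.
 */
static int
node_unlink(struct node *n)
{
	assert(n != NULL);
	if (!n->linked)
		return (-1);
	if (n->next != NULL)
		n->next->prevp = n->prevp;
	*n->prevp = n->next;
	n->next = NULL;
	n->prevp = NULL;
	n->linked = 0;
	return (0);
}

/* Cleanup as intended: the removal block runs exactly once. */
static int
cleanup_once(struct node *n)
{
	return (node_unlink(n) != 0 ? 1 : 0);
}

/*
 * Cleanup with the removal block pasted twice, as a bad merge might leave
 * it. Returns how many removals hit an already-removed node.
 */
static int
cleanup_duplicated(struct node *n)
{
	int stale = 0;

	if (node_unlink(n) != 0)
		stale++;
	/* Duplicated block: the node is no longer on the chain here. */
	if (node_unlink(n) != 0)
		stale++;
	return (stale);
}
```

In this sketch cleanup_once() reports 0 stale removals and cleanup_duplicated() reports 1 for a node that was linked beforehand. The committed fix takes the simpler route of deleting the duplicated lines rather than adding a guard like the one used here.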
The proposed patch appears to work and has been committed. Thank you Danilo and Warner! Closing as FIXED until further notice.