Bug 267468 - bhyve: resuming after ^Z panics kernel: vcpu already halted
Summary: bhyve: resuming after ^Z panics kernel: vcpu already halted
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bhyve
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Konstantin Belousov
URL: https://reviews.freebsd.org/D37227
Keywords: bhyve, crash
Depends on:
Blocks:
 
Reported: 2022-10-31 14:26 UTC by Bjoern A. Zeeb
Modified: 2023-10-04 19:13 UTC
CC List: 5 users

See Also:
Flags: koobs: mfc-stable13?
       koobs: mfc-stable12?


Description Bjoern A. Zeeb 2022-10-31 14:26:56 UTC
I had a bhyve instance suspended via shell job control (^Z), and on `fg` the kernel panicked.

Unread portion of the kernel message buffer:
panic: vcpu already halted
cpuid = 11
time = 1667225968
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe011507f8f0
vpanic() at vpanic+0x151/frame 0xfffffe011507f940
panic() at panic+0x43/frame 0xfffffe011507f9a0
vm_run() at vm_run+0xc9a/frame 0xfffffe011507faa0
vmmdev_ioctl() at vmmdev_ioctl+0x507/frame 0xfffffe011507fb40
devfs_ioctl() at devfs_ioctl+0xcd/frame 0xfffffe011507fb90
vn_ioctl() at vn_ioctl+0x131/frame 0xfffffe011507fca0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe011507fcc0
kern_ioctl() at kern_ioctl+0x202/frame 0xfffffe011507fd30
sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe011507fe00
amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe011507ff30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe011507ff30
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0xd976740e25a, rsp = 0xd97f5ce5e58, rbp = 0xd97f5ce5f10 ---



#13 0xffffffff80beb283 in panic (
    fmt=0xffffffff81e8de70 <cnputs_mtx> "\006\264)\201\377\377\377\377")
    at /usr/home/bz/Sources/worktrees/test/sys/kern/kern_shutdown.c:903
#14 0xffffffff82557c9a in vm_handle_hlt (vm=0xfffffe00e04ee000,
    vcpuid=<optimized out>, intr_disabled=<optimized out>,
    retu=<optimized out>)
    at /usr/home/bz/Sources/worktrees/test/sys/amd64/vmm/vmm.c:1357
#15 vm_run (vm=0xfffffe00e04ee000, vmrun=vmrun@entry=0xfffff8000ab25a00)
    at /usr/home/bz/Sources/worktrees/test/sys/amd64/vmm/vmm.c:1798
#16 0xffffffff8255a917 in vmmdev_ioctl (cdev=<optimized out>,
    cmd=<optimized out>, data=0xfffff8000ab25a00 "\001",
    fflag=<optimized out>, td=<optimized out>)
    at /usr/home/bz/Sources/worktrees/test/sys/amd64/vmm/vmm_dev.c:504
#17 0xffffffff80a7b25d in devfs_ioctl (ap=0xfffffe011507fba8)
    at /usr/home/bz/Sources/worktrees/test/sys/fs/devfs/devfs_vnops.c:933
#18 0xffffffff80cf56e1 in vn_ioctl (fp=0xfffff8000a4d6730,
    com=<optimized out>, data=0xfffff8000ab25a00,
    active_cred=0xfffff802d0ffb500, td=0x0)
    at /usr/home/bz/Sources/worktrees/test/sys/kern/vfs_vnops.c:1699
#19 0xffffffff80a7b90e in devfs_ioctl_f (fp=0xffffffff81e8de70 <cnputs_mtx>,
    com=0, data=0xffffffff81250ecd, cred=0x1, td=0x0)
    at /usr/home/bz/Sources/worktrees/test/sys/fs/devfs/devfs_vnops.c:864
#20 0xffffffff80c63502 in fo_ioctl (fp=0xfffff8000a4d6730, com=3230692865,
    data=0x1c200001, active_cred=0x1, td=<optimized out>)
    at /usr/home/bz/Sources/worktrees/test/sys/sys/file.h:365
#21 kern_ioctl (td=td@entry=0xfffffe0114b951e0, fd=<optimized out>,
    com=com@entry=3230692865,
    data=0x1c200001 <error: Cannot access memory at address 0x1c200001>,
    data@entry=0xfffff8000ab25a00 "\001")
    at /usr/home/bz/Sources/worktrees/test/sys/kern/sys_generic.c:803
#22 0xffffffff80c6324a in sys_ioctl (td=0xfffffe0114b951e0,
    uap=0xfffffe0114b955d8)
    at /usr/home/bz/Sources/worktrees/test/sys/kern/sys_generic.c:711
#23 0xffffffff810d13be in syscallenter (td=<optimized out>)
    at /usr/home/bz/Sources/worktrees/test/sys/amd64/amd64/../../kern/subr_syscall.c:189
#24 amd64_syscall (td=0xfffffe0114b951e0, traced=0)
    at /usr/home/bz/Sources/worktrees/test/sys/amd64/amd64/trap.c:1200
#25 <signal handler called>
#26 0x00000d976740e25a in ?? ()
Backtrace stopped: Cannot access memory at address 0xd97f5ce5e58
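
For context: the panic string matches an assertion at the top of vm_handle_hlt() (frame #14 above), which expects the vcpu's bit in the VM's halted_cpus set to be clear whenever the HLT handler is entered.  If a VM_RUN ioctl returns to userspace with that bit still set (for example so the process can stop on SIGTSTP), the retried VM_RUN after `fg` re-enters the handler and trips the assertion.  The toy user-space program below only models that suspected sequence; the names halted_cpus, vm_handle_hlt and VM_RUN mirror the kernel, everything else is made up for illustration and none of it is the actual vmm(4) code.

/*
 * Toy model of the "vcpu already halted" panic.  NOT the vmm(4) code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned long halted_cpus;		/* models vm->halted_cpus */

static void
model_vm_handle_hlt(int vcpuid, bool interrupted)
{
	/* models the KASSERT(..., ("vcpu already halted")) at the top */
	assert(!(halted_cpus & (1UL << vcpuid)) && "vcpu already halted");

	halted_cpus |= 1UL << vcpuid;		/* vcpu parks in HLT */

	if (interrupted)
		return;				/* bit leaks back to userspace */

	halted_cpus &= ~(1UL << vcpuid);	/* normal wakeup path */
}

int
main(void)
{
	/* First VM_RUN: the sleep is cut short, bhyve stops on ^Z. */
	model_vm_handle_hlt(0, true);

	/* After `fg`, bhyve retries VM_RUN and the assertion fires. */
	printf("retrying VM_RUN with a stale halted_cpus bit\n");
	model_vm_handle_hlt(0, false);
	return (0);
}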


Sources were from around Oct 17.  I cannot say which head revision exactly, as this was a local tree that has since been integrated.
Comment 1 Konstantin Belousov 2022-10-31 23:33:21 UTC
Please try https://reviews.freebsd.org/D37227
Comment 2 commit-hook 2022-11-01 21:01:01 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=4d447b30f7be761b0c2877513e79f484511a00a5

commit 4d447b30f7be761b0c2877513e79f484511a00a5
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-10-31 23:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-11-01 18:44:42 +0000

    vmm: do not leak halted_cpus bit after suspension

    Reported by:    bz
    PR:     267468
    Reviewed by:    markj
    Sponsored by:   The FreeBSD Foundation
    MFC after:      1 week
    Differential revision:  https://reviews.freebsd.org/D37227

 sys/amd64/vmm/vmm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
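
The commit only shows the diffstat here.  Going by its title, the change makes vm_handle_hlt() drop the vcpu's bit from halted_cpus on the early-return paths as well, so a retried VM_RUN no longer finds a stale entry.  The sketch below restates that idea in terms of the toy model from the description; it is an illustration under that assumption, not the actual change to sys/amd64/vmm/vmm.c.

/*
 * Sketch of the shape of the fix, reusing the toy model from the
 * description: always drop the bit before returning, whether the
 * sleep completed normally or was cut short.  Illustrative only.
 */
#include <assert.h>
#include <stdbool.h>

static unsigned long halted_cpus;		/* models vm->halted_cpus */

static void
model_vm_handle_hlt_fixed(int vcpuid, bool interrupted)
{
	assert(!(halted_cpus & (1UL << vcpuid)) && "vcpu already halted");

	halted_cpus |= 1UL << vcpuid;		/* vcpu parks in HLT, as before */

	if (interrupted) {
		/* sleep cut short, e.g. so bhyve can stop on ^Z */
	} else {
		/* normal wakeup */
	}

	/* The essence of the fix: the bit never outlives this call. */
	halted_cpus &= ~(1UL << vcpuid);
}

int
main(void)
{
	model_vm_handle_hlt_fixed(0, true);	/* interrupted run (^Z) */
	model_vm_handle_hlt_fixed(0, false);	/* retried VM_RUN after fg: no panic */
	return (0);
}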
Comment 3 Kubilay Kocak 2022-11-02 02:01:24 UTC
^Triage: Assign to committer resolving. Does this affect stable/{13,12}?
Comment 4 commit-hook 2022-11-08 22:09:34 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=5aab578e3efe209f280d163e4370ce018fc8d619

commit 5aab578e3efe209f280d163e4370ce018fc8d619
Author:     Konstantin Belousov <kib@FreeBSD.org>
AuthorDate: 2022-10-31 23:30:55 +0000
Commit:     Konstantin Belousov <kib@FreeBSD.org>
CommitDate: 2022-11-08 22:08:25 +0000

    vmm: do not leak halted_cpus bit after suspension

    PR:     267468

    (cherry picked from commit 4d447b30f7be761b0c2877513e79f484511a00a5)

 sys/amd64/vmm/vmm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
Comment 5 Bjoern A. Zeeb 2022-11-15 12:29:53 UTC
I've not been able to reproduce this problem since, though admittedly I am not exercising it for long or very often.  I'd suggest closing this once all MFCs are handled; should I hit it again I'll follow up and re-open.