Bug 209392

Summary: Panic on FreeBSD guest on bhyve
Product: Base System
Reporter: Piotr Kubaj <pkubaj>
Component: misc
Assignee: freebsd-virtualization (Nobody) <virtualization>
Status: Closed Overcome By Events
Severity: Affects Only Me
CC: gonzo, grehan
Priority: ---    
Version: 10.3-RELEASE   
Hardware: amd64   
OS: Any   
Attachments:
Description     Flags
info file
core.txt file   none

Description Piotr Kubaj freebsd_committer 2016-05-09 10:18:58 UTC
I've got a bhyve host (10.3-RELEASE-p2) running a bhyve guest with FreeBSD 10.3-RELEASE-p2.

The guest is supposed to be a test box for ports. When I run 'portsnap fetch extract', only part of the ports tree is extracted. Then it stops, and after some time the guest panics (the host is unaffected).

Backtrace of vmcore is:
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff80950cc2 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:486
#2  0xffffffff809510a5 in vpanic (fmt=<value optimized out>, 
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:889
#3  0xffffffff80950f33 in panic (fmt=0x0)
    at /usr/src/sys/kern/kern_shutdown.c:818
#4  0xffffffff80937097 in _mtx_lock_spin_cookie (c=<value optimized out>, 
    tid=<value optimized out>, opts=<value optimized out>, 
    file=<value optimized out>, line=<value optimized out>)
    at /usr/src/sys/kern/kern_mutex.c:568
#5  0xffffffff80d44cd1 in smp_tlb_shootdown (vector=246, 
    pmap=0xffffffff816cd5e0, addr1=18446741876820213760, 
    addr2=18446741876820217856) at /usr/src/sys/amd64/amd64/mp_machdep.c:1164
#6  0xffffffff80d46b7c in pmap_invalidate_range (pmap=<value optimized out>, 
    sva=<value optimized out>, eva=<value optimized out>)
    at /usr/src/sys/amd64/amd64/pmap.c:1515
#7  0xffffffff809df4c0 in allocbuf (bp=0xfffffe007b9f9848, size=0)
    at /usr/src/sys/kern/vfs_bio.c:4334
#8  0xffffffff809e0360 in getnewbuf (maxsize=<value optimized out>, 
    gbflags=<value optimized out>) at /usr/src/sys/kern/vfs_bio.c:2198
#9  0xffffffff809dd6b1 in getblk (vp=0xfffff800029eace8, blkno=3847040, 
    size=32768, slpflag=0, slptimeo=0, flags=<value optimized out>)
    at /usr/src/sys/kern/vfs_bio.c:3275
#10 0xffffffff809de19d in breadn_flags (vp=0xfffff800029eace8, blkno=0, 
    size=0, rablkno=0x0, rabsize=0x0, cnt=0, cred=0xfffffe0093505868, flags=0, 
    bpp=0xfffffe0093505868) at /usr/src/sys/kern/vfs_bio.c:1130
#11 0xffffffff80b860f1 in ffs_update (vp=0xfffff80002f47938, waitfor=0)
    at /usr/src/sys/ufs/ffs/ffs_inode.c:111
#12 0xffffffff80baef67 in ffs_sync (mp=0xfffff800029d1330, waitfor=Cannot access memory at address 0x1
    at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1461
#13 0xffffffff809fb426 in sync_fsync (ap=<value optimized out>)
    at /usr/src/sys/kern/vfs_subr.c:3857
#14 0xffffffff80e81af7 in VOP_FSYNC_APV (vop=<value optimized out>, 
    a=<value optimized out>) at vnode_if.c:1330
#15 0xffffffff809fbe1b in sched_sync () at vnode_if.h:549
#16 0xffffffff8091a4ea in fork_exit (callout=0xffffffff809fba70 <sched_sync>, 
    arg=0x0, frame=0xfffffe0093505ac0) at /usr/src/sys/kern/kern_fork.c:1027
#17 0xffffffff80d3be0e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:611
#18 0x0000000000000000 in ?? ()

My CPUs are two AMD Opteron(tm) 6262 HE processors.

Other OSes work just fine on bhyve. I use CentOS to compile some stuff and there are no issues. The same can be said about OpenBSD, so only FreeBSD has problems.
Comment 1 Piotr Kubaj freebsd_committer 2016-05-09 10:21:52 UTC
Created attachment 170140
info file
Comment 2 Piotr Kubaj freebsd_committer 2016-05-09 10:23:02 UTC
Created attachment 170141
core.txt file
Comment 3 Peter Grehan freebsd_committer 2016-05-22 00:30:57 UTC
Thanks for the report.

The panic was "spin lock held too long", which usually indicates that one of the vCPUs wasn't able to run for some amount of time.

Other than the 'portsnap extract' in the guest, was there much happening on the host system at the same time, e.g. lots of other processes running, high memory utilization, etc.?

(A simple interim workaround is to use a single-vCPU guest).
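
As an illustration of the "spin lock held too long" condition mentioned above, here is a minimal Python sketch of a bounded spin-wait. It is not the FreeBSD kernel code: the names `spin_acquire`, `SpinLockHeldTooLong`, and `holder_that_releases_after`, as well as the `max_spins` limit, are hypothetical and chosen only to show the idea. A waiter polls the lock a limited number of times and treats running out of that budget as fatal, which is what happens if the lock holder (for example, a vCPU that is not being scheduled) never gets to run and release it.

```python
class SpinLockHeldTooLong(Exception):
    """Raised when the waiter gives up (stands in for a kernel panic)."""


def spin_acquire(is_free, max_spins=1000):
    """Poll is_free() until it returns True.

    Returns the number of failed polls before the lock was obtained.
    Raises SpinLockHeldTooLong once max_spins polls have all failed.
    max_spins is an arbitrary illustrative limit, not a kernel value.
    """
    for spins in range(max_spins):
        if is_free():
            return spins
    raise SpinLockHeldTooLong(f"lock still held after {max_spins} polls")


def holder_that_releases_after(n):
    """Simulate a lock holder that releases the lock after n polls.

    A stalled holder (e.g. a descheduled vCPU) corresponds to a large n.
    """
    polls = 0

    def is_free():
        nonlocal polls
        polls += 1
        return polls > n

    return is_free
```

With a holder that releases quickly, `spin_acquire(holder_that_releases_after(3), max_spins=10)` succeeds; with a holder that stalls longer than the budget, such as `holder_that_releases_after(50)` with the same limit, it raises `SpinLockHeldTooLong`. With only one vCPU there is no second CPU to spin against, which is why a single-vCPU guest sidesteps the problem.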
Comment 4 Piotr Kubaj freebsd_committer 2016-07-10 10:32:08 UTC
(In reply to Peter Grehan from comment #3)
Sorry for the (very) late answer; I guess I just accidentally deleted the email from BZ.

Nope, there was no other process running in the background. It was a VM set up specifically as a testing environment for my ports, so it didn't have anything besides the base system. Also, I've even compiled Android on a Debian VM (with 16 vCPUs), so at least Linux doesn't have this problem.

Also, I have since upgraded my server to head (now running 11.0-ALPHA5), and the problem seems to be gone.