Bug 215740

Summary: [bhyve] utilizing passthru breaks raw device usage with virtio-blk | ahci-hd
Product: Base System
Component: misc
Status: New
Severity: Affects Some People
Priority: ---
Version: 11.0-STABLE
Hardware: amd64
OS: Any
Reporter: Harald Schmalzbauer <bugzilla.freebsd>
Assignee: freebsd-virtualization (Nobody) <virtualization>
CC: grehan
Attachments:
Verbose boot log part 1, listing ACPI+CPU messages
Verbose boot log part 2, listing device probe messages
Verbose boot log part 3, listing rest (msi assignment + consumer attaching messages)
Verbose boot of ppt corrupting /dev/ada via bhyve-ahci

Description Harald Schmalzbauer 2017-01-03 16:47:31 UTC
Using a passthru device with bhyve(8) corrupts guest disk access when the guest's storage backend is a physical device, regardless of whether it is attached through virtio-blk or ahci-hd.
File-backed ahci-hd (or virtio-blk) storage doesn't show that problem with passthru.

Steps to reproduce:

Use any hard drive containing an installed OS.
On the host: 'hd /dev/ada6 | less'
See MBR/PMBR code.

Use the same device (ada6 in this example) and connect it to a FreeBSD live DVD guest with a passthru device involved, e.g.:
bhyveload -d ./releases/ISO-IMAGES/11.0/FreeBSD-11.0-RELEASE-amd64-disc1.iso -S -m 2G ppttest && bhyve -u -A -H -P -s 0,hostbridge -s 3,ahci,cd:./releases/ISO-IMAGES/11.0/FreeBSD-11.0-RELEASE-amd64-disc1.iso,hd:/dev/ada6 -s 5,passthru,0/25/0 -s 31,lpc -l com1,stdio -S -m 2G -c 4 ppttest

Inside the guest, 'hd /dev/ada0 | less' no longer works (endless I/O).
Using 'dd if=/dev/ada6 count=1 | hd' shows only 0x0 instead of the output you saw on the host!

Repeating this without the passthru device in place avoids the problem: you see exactly the same bytes inside the guest as on the host.
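
The check from the steps above (does the first sector read back as all zeros?) could be scripted roughly as follows. This is only a sketch: first_sector_is_zero is a hypothetical helper name, not part of bhyve or FreeBSD, and a 512-byte sector size is assumed.

```sh
#!/bin/sh
# Hypothetical helper, not part of bhyve or FreeBSD.
# Succeeds (exit status 0) when the first 512 bytes read from "$1"
# contain no non-zero byte. Assumes 512-byte sectors.
# Note: a read that returns no data at all also counts as "all zeros".
first_sector_is_zero() {
    # Delete every NUL byte from the sector and count what is left.
    nonzero=$(dd if="$1" bs=512 count=1 2>/dev/null | tr -d '\000' | wc -c)
    [ $((nonzero)) -eq 0 ]
}
```

Run inside the guest against the disk device, e.g. 'first_sector_is_zero /dev/ada0 && echo "first sector reads back as zeros"'; on the host the same call against the physical disk should print nothing if an MBR/PMBR is present.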
Comment 1 Peter Grehan freebsd_committer 2017-01-04 20:37:20 UTC
Would you be able to post a verbose dmesg (boot -v)?
Comment 2 Harald Schmalzbauer 2017-01-05 09:25:49 UTC
Created attachment 178537 [details]
Verbose boot log part 1, listing ACPI+CPU messages
Comment 3 Harald Schmalzbauer 2017-01-05 09:26:38 UTC
Created attachment 178538 [details]
Verbose boot log part 2, listing device probe messages
Comment 4 Harald Schmalzbauer 2017-01-05 09:28:00 UTC
Created attachment 178539 [details]
Verbose boot log part 3, listing rest (msi assignment + consumer attaching messages)
Comment 5 Harald Schmalzbauer 2017-01-05 09:28:53 UTC
(In reply to Peter Grehan from comment #1)

Thanks for your attention!
Please find them attached, I hope my 3-part separation doesn't confuse anybody...

Comment 6 Harald Schmalzbauer 2017-05-24 18:46:25 UTC
Created attachment 182869 [details]
Verbose boot of ppt corrupting /dev/ada via bhyve-ahci

I tried to investigate further.
I can confirm that the same procedure also breaks UEFI booting:
X64 Exception Type - 000000000000000D     CPU Apic ID - 00000000 !!!!
RIP  - 000000007FB00FF5, CS  - 0000000000000028, RFLAGS - 0000000000010002
ExceptionData - 0000000000000000
RAX  - 0000000000000000, RCX - 0000000000000008, RDX - 0000000000000408
RBX  - 0000000000000001, RSP - 000000007FBEF468, RBP - 000000007FBEF7C8
RSI  - 000000007E549B2E, RDI - 000000007FBEF468
R8   - 000000007FBEF97C, R9  - 000000007FC16A9F, R10 - 00000000000003F8
R11  - 0000000000000040, R12 - 0000000000000000, R13 - 0000000000000000
R14  - 0000000000000000, R15 - 0000000000000000
DS   - 0000000000000008, ES  - 0000000000000008, FS  - 0000000000000008
GS   - 0000000000000008, SS  - 0000000000000008
CR0  - 0000000080000033, CR2 - 0000000000000000, CR3 - 000000007FB8E000
CR4  - 0000000000000668, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000007FB78E98 000000000000003F, LDTR - 0000000000000000
IDTR - 000000007F711018 0000000000000FFF,   TR - 0000000000000000

This happens as soon as I add a passthru device.
Attached is a verbose boot of an install ISO with bhyve-ahci. The guest is responsive, and dd to /dev/null leads to _real_ disk activity, but unfortunately it returns NULs only, not the disk's data.
One thing I noticed is that I always get the message "pcib0: no PRT entry for 0.5.INTA" for any passthru device, regardless of which slot I use.

Any help is highly appreciated! How do others use passthru?

Comment 7 Harald Schmalzbauer 2017-06-11 10:56:48 UTC
Is there anybody who has checked whether the steps to reproduce show the reported results? Meaning, is there anybody who can confirm correct behaviour in that case?

I have observed many more strange errors that seem completely unrelated at first sight, but all of them show up as soon as one condition is true: shutting down a bhyve guest which had ppt in use.

Latest example:
panic: Memory modified after free 0xfffff8002486a030(48) val=0 @ 0xfffff8002486a030

cpuid = 5
KDB: stack backtrace:
#0 0xffffffff805bf327 at kdb_backtrace+0x67
#1 0xffffffff8057f266 at vpanic+0x186
#2 0xffffffff8057f2e3 at panic+0x43
#3 0xffffffff8082eaeb at trash_ctor+0x4b
#4 0xffffffff8082aaec at uma_zalloc_arg+0x52c
#5 0xffffffff813b54a6 at zio_add_child+0x26
#6 0xffffffff813b5a05 at zio_create+0x385
#7 0xffffffff813b6de2 at zio_vdev_child_io+0x232
#8 0xffffffff81396be0 at vdev_mirror_io_start+0x370
#9 0xffffffff813bc629 at zio_vdev_io_start+0x4a9
#10 0xffffffff813b76bc at zio_execute+0x36c
#11 0xffffffff813b6868 at zio_nowait+0xb8
#12 0xffffffff81396bec at vdev_mirror_io_start+0x37c
#13 0xffffffff813bc383 at zio_vdev_io_start+0x203
#14 0xffffffff813b76bc at zio_execute+0x36c
#15 0xffffffff805d10dd at taskqueue_run_locked+0x13d
#16 0xffffffff805d1e78 at taskqueue_thread_loop+0x88
#17 0xffffffff80543844 at fork_exit+0x84

#0  doadump (textdump=<value optimized out>) at pcpu.h:222
#1  0xffffffff8057ece0 in kern_reboot (howto=260) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff8057f2a0 in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff8057f2e3 in panic (fmt=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff8082eaeb in trash_ctor (mem=<value optimized out>, size=<value optimized out>, arg=<value optimized out>, flags=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/vm/uma_dbg.c:80
#5  0xffffffff8082aaec in uma_zalloc_arg (zone=0xfffff8001febc680, udata=0xfffff8001ad5f340, flags=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/vm/uma_core.c:2152
#6  0xffffffff813b54a6 in zio_add_child (pio=0xfffff8026f350b88, cio=0xfffff8002478b7b0)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:460
#7  0xffffffff813b5a05 in zio_create (pio=0xfffff8026f350b88, spa=<value optimized out>, txg=433989, bp=<value optimized out>, data=0xfffffe0058afa000, 
    size=1024, type=<value optimized out>, priority=ZIO_PRIORITY_ASYNC_WRITE, flags=<value optimized out>, vd=<value optimized out>, 
    offset=<value optimized out>, zb=<value optimized out>, pipeline=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:690
#8  0xffffffff813b6de2 in zio_vdev_child_io (pio=0xfffff8026f350b88, bp=<value optimized out>, vd=<value optimized out>, offset=325398016, 
    data=<value optimized out>, size=1024, type=<value optimized out>, flags=1048704, done=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1141
#9  0xffffffff81396be0 in vdev_mirror_io_start (zio=0xfffff8026f350b88)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:488
#10 0xffffffff813bc629 in zio_vdev_io_start (zio=0xfffff8026f350b88)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3143
#11 0xffffffff813b76bc in zio_execute (zio=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1681
#12 0xffffffff813b6868 in zio_nowait (zio=0xfffff8026f350b88)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1739
#13 0xffffffff81396bec in vdev_mirror_io_start (zio=0xfffff8026f7a7b88)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:488
#14 0xffffffff813bc383 in zio_vdev_io_start (zio=0xfffff8026f7a7b88)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3021
#15 0xffffffff813b76bc in zio_execute (zio=<value optimized out>)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1681
#16 0xffffffff805d10dd in taskqueue_run_locked (queue=0xfffff8001ab5a700) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:454
#17 0xffffffff805d1e78 in taskqueue_thread_loop (arg=<value optimized out>) at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:741
#18 0xffffffff80543844 in fork_exit (callout=0xffffffff805d1df0 <taskqueue_thread_loop>, arg=0xfffff8001aa90720, frame=0xfffffe043f609ac0)
    at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_fork.c:1042
#19 0xffffffff808598ae in fork_trampoline () at /usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:611
#20 0x0000000000000000 in ?? ()

I consider this a severe problem, which shouldn't exist in 11.1-RELEASE.
If nobody can prove my findings wrong, passthru should be disabled in RELENG_11_1 until it can be ruled out as the source of these strange problems (some form of memory corruption).