Bug 256317

Summary: [bhyve] Assertion failure: (cq->qbase != NULL) in function pci_nvme_cq_update, file /usr/src/usr.sbin/bhyve/pci_nvme.c
Product: Base System Reporter: Cheolwoo Myung <cwmyung>
Component: bhyveAssignee: Chuck Tuffli <chuck>
Status: Closed FIXED    
Severity: Affects Only Me CC: chuck, jhb
Priority: ---    
Version: 13.0-RELEASE   
Hardware: Any   
OS: Any   
Bug Depends on:    
Bug Blocks: 256319, 256320, 256322    
Attachments:
Description Flags
add checks to avoid assertion none

Description Cheolwoo Myung 2021-06-01 04:46:43 UTC
To reproduce the bug, please follow the command (bhyve version 13.0):


```

$ bhyvectl --vm=reproVM --destroy

$ bhyve -s 2:0,ahci-hd,hyfuzz.img -s 3,nvme,./nvme.img -s 0:0,hostbridge -s 1:0,lpc -c 1 -m 512M -l bootrom,./BHYVE_UEFI.fd -HPA reproVM

```

File: https://drive.google.com/file/d/1irW37yqP1cR44v4BkWX5jMp_XhsRS6kn/view?usp=sharing
Comment 1 Chuck Tuffli freebsd_committer freebsd_triage 2021-06-01 19:41:06 UTC
Would you mind adding whatever content is in the link directly to this bug?
Comment 2 Cheolwoo Myung 2021-06-02 06:30:00 UTC
(In reply to Chuck Tuffli from comment #1)
The test case for the bug is too large to upload directly. Instead, can I write down the crash log and the input that triggers the bug?

# Crash Log
=====================================================================
Assertion failed: (cq->qbase != NULL), function pci_nvme_cq_update, file /usr/src/usr.sbin/bhyve/pci_nvme.c, line 858.

#0  0x0000000800a202ea in thr_kill () from /lib/libc.so.7
#1  0x0000000800995064 in raise () from /lib/libc.so.7
#2  0x0000000800a49f29 in abort () from /lib/libc.so.7
#3  0x0000000800977f81 in __assert () from /lib/libc.so.7
#4  0x00000000003ca3de in pci_nvme_cq_update (sc=<optimized out>, cq=0x616000001b80, cdw0=<optimized out>, cid=<optimized out>, sqid=0, status=2) at /usr/src/usr.sbin/bhyve/pci_nvme.c:858
#5  0x00000000003c1294 in pci_nvme_handle_admin_cmd (sc=<optimized out>, value=<optimized out>) at /usr/src/usr.sbin/bhyve/pci_nvme.c:1641
#6  0x00000000003bad51 in pci_nvme_handle_doorbell (ctx=<optimized out>, sc=0x0, idx=0, is_sq=<optimized out>, value=<optimized out>) at /usr/src/usr.sbin/bhyve/pci_nvme.c:2330
#7  0x00000000003b636d in pci_nvme_write_bar_0 (ctx=0x606000000140, sc=<optimized out>, offset=<optimized out>, size=4, value=<optimized out>) at /usr/src/usr.sbin/bhyve/pci_nvme.c:2423
#8  pci_nvme_write (ctx=0x606000000140, vcpu=<optimized out>, pi=<optimized out>, baridx=<optimized out>, offset=<optimized out>, size=4, value=3932165) at /usr/src/usr.sbin/bhyve/pci_nvme.c:2541
#9  0x0000000000397356 in pci_emul_mem_handler (ctx=0x606000000140, vcpu=0, dir=<optimized out>, addr=<optimized out>, size=0, val=<optimized out>, arg1=0x616000001580, arg2=0) at /usr/src/usr.sbin/bhyve/pci_emul.c:420
#10 0x0000000000345968 in mem_write (ctx=<optimized out>, vcpu=<optimized out>, gpa=<optimized out>, wval=<optimized out>, size=<optimized out>, arg=0x6080000052b8) at /usr/src/usr.sbin/bhyve/mem.c:162
#11 0x000000000043ff1e in emulate_mov (vm=0x606000000140, vcpuid=<optimized out>, gpa=<optimized out>, vie=<optimized out>, memread=<optimized out>, memwrite=<optimized out>, arg=<optimized out>)
    at /usr/src/sys/amd64/vmm/vmm_instruction_emul.c:615
#12 vmm_emulate_instruction (vm=<optimized out>, vcpuid=<optimized out>, gpa=<optimized out>, vie=<optimized out>, paging=<optimized out>, memread=<optimized out>, memwrite=0x345810 <mem_write>, memarg=0x6080000052b8)
    at /usr/src/sys/amd64/vmm/vmm_instruction_emul.c:1789
#13 0x000000000034418a in emulate_mem_cb (ctx=0x72656, vcpu=6, paddr=0, mr=0x0, arg=<optimized out>) at /usr/src/usr.sbin/bhyve/mem.c:238
#14 0x0000000000343ca9 in access_memory (ctx=<optimized out>, vcpu=0, paddr=<optimized out>, cb=<optimized out>, arg=<optimized out>) at /usr/src/usr.sbin/bhyve/mem.c:215
#15 0x000000000034357d in emulate_mem (ctx=<optimized out>, vcpu=<optimized out>, paddr=<optimized out>, vie=<optimized out>, paging=<optimized out>) at /usr/src/usr.sbin/bhyve/mem.c:251
#16 0x000000000030c240 in vmexit_inst_emul (ctx=<optimized out>, vmexit=0xd1df20 <vmexit>, pvcpu=0x7fffdcbe3de0) at /usr/src/usr.sbin/bhyve/bhyverun.c:784
#17 0x000000000030a566 in vm_loop (ctx=<optimized out>, vcpu=<optimized out>, startrip=<optimized out>) at /usr/src/usr.sbin/bhyve/bhyverun.c:924
#18 0x0000000000305899 in fbsdrun_start_thread (param=<optimized out>) at /usr/src/usr.sbin/bhyve/bhyverun.c:473
#19 0x000000080086e82b in ?? () from /lib/libthr.so.3
#20 0x0000000000000000 in ?? ()


# Crash Input
=====================================================================

```c
// bar_addr[0] : BAR0 of NVMe

writel(0x0, bar_addr[0]+0x14);
writel(0x180000f0, bar_addr[0]+0x34);
writel(0x460071, bar_addr[0]+0x14);
writel(0x3c0005, bar_addr[0]+0x1000);
writel(0x5, bar_addr[0]+0x1004);

```
Comment 3 Chuck Tuffli freebsd_committer freebsd_triage 2021-06-03 03:20:52 UTC
Thank you for providing the test case. It has shown me 4 places where error checking is missing. Is it possible to get the source code for this test case or the entire suite?

Also, once I have fixes, how would you like to test them? E.g. are you comfortable building from sources etc.?
Comment 4 Cheolwoo Myung 2021-06-07 16:48:52 UTC
(In reply to Chuck Tuffli from comment #3)
It's hard to share right now, but I'll share the source in the near future.

And if you provide me with a patch, I'll test them with my fuzzing tool and let you know.
Comment 5 Chuck Tuffli freebsd_committer freebsd_triage 2021-06-13 23:50:32 UTC
I now have a stand-alone reproduction:

login: root
Jun 13 22:08:28 freebsd login[674]: ROOT LOGIN (root) ON ttyu0
root@freebsd:~ # uname -a
FreeBSD freebsd 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n247271-597cc550e7b: Thu Jun 10 06:50
:55 UTC 2021     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64root@freebsd:~ # devctl detach pci0:0:4:1
nvd0: detached
nvme0: detached
root@freebsd:~ # ./bz256317
VID=0xfb5d DID=0xa0a
BARs
        [0] 0xc0004000 : 16384
        [1] 0 : 0
        [2] 0 : 0
        [3] 0 : 0
        [4] 0xc0008000 : 8192
        [5] 0 : 0
Mapped 0xc0004000 to 0x1000
VER=1.3.0
writel(0, 0x14)
writel(0x180000f0, 0x34)
writel(0x460071, 0x14)
RDY 0->1 in 0 seconds
writel(0xAssertion failed: (cq->qbase != NULL), function pci_nvme_cq_update, file /build/903526542ac/src/usr.sbin/bhyve/pci_nvme.c, line 860.
Abort
Comment 6 Chuck Tuffli freebsd_committer freebsd_triage 2021-06-13 23:55:58 UTC
Created attachment 225787 [details]
add checks to avoid assertion

Add checks for
 - bad AQ sizes
 - bad AQ vm_map_gpa
 - doorbell writes prior to RDY
 - doorbell writes to uninitialized queue
 - CSTS.RDY if CFS set
Comment 7 Chuck Tuffli freebsd_committer freebsd_triage 2021-06-13 23:56:38 UTC
Please see if the attached patch fixes the issue you see.
Comment 8 Cheolwoo Myung 2021-06-14 06:27:48 UTC
(In reply to Chuck Tuffli from comment #7)
I checked your attached patch using reproducer, and it avoids assert failure.

Thanks.
Comment 9 John Baldwin freebsd_committer freebsd_triage 2022-08-12 00:32:52 UTC
Chuck, are you planning to merge this fix to main?
Comment 10 Chuck Tuffli freebsd_committer freebsd_triage 2022-08-12 16:15:23 UTC
(In reply to John Baldwin from comment #9)
Yes, I was blocked on some unrelated bhyve issues that now are fixed. This should get committed in the next several days.
Comment 11 commit-hook freebsd_committer freebsd_triage 2022-08-14 15:04:58 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d7d1becad4b692b97dd1f32706947aae5118294b

commit d7d1becad4b692b97dd1f32706947aae5118294b
Author:     Chuck Tuffli <chuck@FreeBSD.org>
AuthorDate: 2022-08-14 14:47:34 +0000
Commit:     Chuck Tuffli <chuck@FreeBSD.org>
CommitDate: 2022-08-14 14:47:34 +0000

    bhyve nvme: Fix Controller init error cases

    Fuzzing of bhyve uncovered an assertion failure in the NVMe emulation.
    Investigation uncovered several corner cases the code did not handle.
    This change handles several Controller initialization errors, including
     - bad AQ sizes
     - bad AQ vm_map_gpa
     - doorbell writes prior to RDY
     - doorbell writes to uninitialized queue
     - CSTS.RDY if CFS set

    PR:             256317,256319,256320,256322
    Reported by:    Cheolwoo Myung <cwmyung@snu.ac.kr>
    Reviewed by:    jhb
    Differential Revision:  https://reviews.freebsd.org/D35453

 usr.sbin/bhyve/pci_nvme.c | 55 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 5 deletions(-)
Comment 12 commit-hook freebsd_committer freebsd_triage 2022-11-19 18:46:27 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=f3a69bc7223ad3fc04e417a88e6bb878aa3bfaf2

commit f3a69bc7223ad3fc04e417a88e6bb878aa3bfaf2
Author:     Chuck Tuffli <chuck@FreeBSD.org>
AuthorDate: 2022-08-14 14:45:21 +0000
Commit:     Chuck Tuffli <chuck@FreeBSD.org>
CommitDate: 2022-11-20 02:21:32 +0000

    bhyve nvme: Check return value of mapped memory

    Fuzzing of bhyve using hyfuzz discovered a way to cause a segmentation
    fault in the NVMe emulation. If a guest specifies a physical address in
    either the PRP1 or PRP2 field of a command that cannot be mapped from
    guest to host, the function paddr_guest2host() returns a NULL pointer.
    The NVMe emulation did not check for this error case, which allowed for
    the segmentation fault to occur.

    Fix is to check for a return value of NULL and indicate an error back to
    the guest (Data Transfer error). While in the area, slightly refactor
    the write/read blockif function to use a common error exit path.

    PR:             256317,256319,256320,256321,256322

    (cherry picked from commit 3d3678627c3112c94d174a8c51d8c058d02befb3)

 usr.sbin/bhyve/pci_nvme.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)
Comment 13 commit-hook freebsd_committer freebsd_triage 2023-01-26 19:48:34 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6c708e91f447d6e080125bfaff85e5507c9445fb

commit 6c708e91f447d6e080125bfaff85e5507c9445fb
Author:     Chuck Tuffli <chuck@FreeBSD.org>
AuthorDate: 2022-08-14 14:47:34 +0000
Commit:     John Baldwin <jhb@FreeBSD.org>
CommitDate: 2023-01-26 19:29:51 +0000

    bhyve nvme: Fix Controller init error cases

    Fuzzing of bhyve uncovered an assertion failure in the NVMe emulation.
    Investigation uncovered several corner cases the code did not handle.
    This change handles several Controller initialization errors, including
     - bad AQ sizes
     - bad AQ vm_map_gpa
     - doorbell writes prior to RDY
     - doorbell writes to uninitialized queue
     - CSTS.RDY if CFS set

    PR:             256317,256319,256320,256322
    Reported by:    Cheolwoo Myung <cwmyung@snu.ac.kr>
    Reviewed by:    jhb
    Differential Revision:  https://reviews.freebsd.org/D35453

    (cherry picked from commit d7d1becad4b692b97dd1f32706947aae5118294b)

 usr.sbin/bhyve/pci_nvme.c | 55 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 5 deletions(-)