Bug 285681 - [Hyper-V] i386 panic during storvsc_xferbuf_prepare()
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 15.0-CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-virtualization (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2025-03-26 21:51 UTC by Dimitry Andric
Modified: 2025-03-31 11:20 UTC
5 users

See Also:


Attachments
possible patch (590 bytes, patch)
2025-03-27 08:53 UTC, Mark Johnston
possible patch (1.39 KB, patch)
2025-03-29 08:38 UTC, Mark Johnston

Description Dimitry Andric freebsd_committer freebsd_triage 2025-03-26 21:51:40 UTC
Follow-up from bug 285415, where a panic occurred under Hyper-V, in storvsc_xferbuf_prepare(). Applying the patch from bug 285415 comment 17, this shows:

cd0 at ata1 bus 0 scbus1 target 0 lun 0
cd0: <Msft Virtual CD/ROM 1.0> Removable CD-ROM SPC-3 SCSI device
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: 349MB (178896 2048 byte sectors)
da0 at blkvsc0 bus 0 scbus2 target 0 lun 0
da0: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 102400MB (209715200 512 byte sectors)
da1 at blkvsc1 bus 0 scbus3 target 1 lun 0
da1: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-3 SCSI device
da1: 300.000MB/s transfers
da1: Command Queueing enabled
da1: 8192MB (16777216 512 byte sectors)
segs[0]: ofs 0xf7f45000, len 4096
segs[1]: ofs 0xf7f46000, len 4096
segs[2]: ofs 0xf7f47000, len 4096
segs[3]: ofs 0xf7f48000, len 4096
panic: invalid 1st page, ofs 0xdeadc0de, len 3735929054
cpuid = 3
time = 1743025553
KDB: stack backtrace:
db_trace_self_wrapper(f7,1521eb40,4,4,2a9a8c00,...) at db_trace_self_wrapper+0x28/frame 0x150db174
vpanic(141534a,150db1b0,150db1b0,150db1d0,132bb8c,...) at vpanic+0xf4/frame 0x150db190
panic(141534a,deadc0de,0,deadc0de,356ea000,...) at panic+0x14/frame 0x150db1a4
storvsc_xferbuf_prepare(2b136200,2a9a8c00,4,0) at storvsc_xferbuf_prepare+0x14c/frame 0x150db1d0
bus_dmamap_load_mem(26c30f40,2b132e80,150db224,132ba40,2b136200,1) at bus_dmamap_load_mem+0x2f2/frame 0x150db204
bus_dmamap_load_ccb(26c30f40,2b132e80,356ea000,132ba40,2b136200,1) at bus_dmamap_load_ccb+0x4a/frame 0x150db244
storvsc_action(2b0d5140,356ea000) at storvsc_action+0x3a7/frame 0x150db290
xpt_run_devq(2b0d1040,35676000,128b1890,2b0d1050,356ea000,...) at xpt_run_devq+0x287/frame 0x150db2cc
xpt_action_default(356ea000) at xpt_action_default+0x3c6/frame 0x150db2f0
scsi_action(356ea000) at scsi_action+0x19/frame 0x150db308
dastart(2b1cd900,356ea000) at dastart+0x30d/frame 0x150db344
xpt_run_allocq(2b1cd900,480) at xpt_run_allocq+0x8b/frame 0x150db36c
cam_iosched_schedule(2fd33c80,2b1cd900) at cam_iosched_schedule+0x21/frame 0x150db380
dastrategy(356e3d9c) at dastrategy+0x64/frame 0x150db39c
g_disk_start(356e4a78,150db420,2fd33980,4000,0,...) at g_disk_start+0x469/frame 0x150db3fc
g_io_request(356e4a78,356bc000) at g_io_request+0x26b/frame 0x150db424
g_read_data(356bc000,400,0,4000,0,150db464) at g_read_data+0x99/frame 0x150db444
gpt_read_tbl(1,356da200,0,0,ffffff,...) at gpt_read_tbl+0x10e/frame 0x150db490
g_part_gpt_read(35704c00,356bc000) at g_part_gpt_read+0x96/frame 0x150db4c8
G_PART_READ(3568af00,356f5150,3568af00,2fd33980,18186d8,...) at G_PART_READ+0x39/frame 0x150db4e0
g_part_taste(18186d8,2fd33980,0) at g_part_taste+0x14f/frame 0x150db500
g_new_provider_event(2fd33980,0) at g_new_provider_event+0x96/frame 0x150db51c
g_run_events(0,150db568) at g_run_events+0x10c/frame 0x150db538
fork_exit(ebde80,0,150db568,0,0,...) at fork_exit+0x6b/frame 0x150db554
fork_trampoline() at 0xffc0348e/frame 0x150db554

So "ofs 0xdeadc0de" is pretty bad, and len 3735929054 is also 0xdeadc0de.

This is different from the original panic, which said:

panic: invalid 1st page, ofs 0x3985000, len 2048

But that might also have been some sort of garbage value?
Comment 1 Dimitry Andric freebsd_committer freebsd_triage 2025-03-26 21:55:19 UTC
Oh duh, I see:

--- a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
+++ b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
@@ -1831,6 +1831,10 @@ storvsc_xferbuf_prepare(void *arg, bus_dma_segment_t *segs, int nsegs, int error
 #if !defined(__aarch64__)
                if (nsegs > 1) {
                        if (i == 0) {
+                               for (i = 0; i < nsegs; i++)
+                                       printf("segs[%d]: ofs 0x%jx, len %zu\n",
+                                           i, (uintmax_t)segs[i].ds_addr,
+                                           segs[i].ds_len);
                                KASSERT((segs[i].ds_addr & PAGE_MASK) +
                                    segs[i].ds_len == PAGE_SIZE,
                                    ("invalid 1st page, ofs 0x%jx, len %zu",

The inner for loop advances 'i' to nsegs, while the subsequent KASSERT expects it to still be zero. The inner loop should obviously use a separate variable.
Comment 2 Dimitry Andric freebsd_committer freebsd_triage 2025-03-26 22:00:54 UTC
With this patch instead:

--- a/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
+++ b/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
@@ -1820,7 +1820,7 @@ storvsc_xferbuf_prepare(void *arg, bus_dma_segment_t *segs, int nsegs, int error
        union ccb *ccb = reqp->ccb;
        struct ccb_scsiio *csio = &ccb->csio;
        struct storvsc_gpa_range *prplist;
-       int i;
+       int i, j;

        prplist = &reqp->prp_list;
        prplist->gpa_range.gpa_len = csio->dxfer_len;
@@ -1831,6 +1831,10 @@ storvsc_xferbuf_prepare(void *arg, bus_dma_segment_t *segs, int nsegs, int error
 #if !defined(__aarch64__)
                if (nsegs > 1) {
                        if (i == 0) {
+                               for (j = 0; j < nsegs; j++)
+                                       printf("segs[%d]: ofs 0x%jx, len %zu\n",
+                                           j, (uintmax_t)segs[j].ds_addr,
+                                           segs[j].ds_len);
                                KASSERT((segs[i].ds_addr & PAGE_MASK) +
                                    segs[i].ds_len == PAGE_SIZE,
                                    ("invalid 1st page, ofs 0x%jx, len %zu",

The output is:

... lots of segs[], all 4096 bytes long
segs[0]: ofs 0x39cc000, len 4096
segs[1]: ofs 0x39cd000, len 4096
segs[0]: ofs 0x39cf000, len 2048
segs[1]: ofs 0x39d0000, len 2048
panic: invalid 1st page, ofs 0x39cf000, len 2048
cpuid = 1
time = 1743026282
KDB: stack backtrace:
db_trace_self_wrapper(fd,152da780,0,2,24baf780,...) at db_trace_self_wrapper+0x28/frame 0x36cb3020
vpanic(141534a,36cb305c,36cb305c,36cb3080,132bb2d,...) at vpanic+0xf4/frame 0x36cb303c
panic(141534a,39cf000,0,800,0,...) at panic+0x14/frame 0x36cb3050
storvsc_xferbuf_prepare(26c34000,24baf780,2,0) at storvsc_xferbuf_prepare+0xed/frame 0x36cb3080
bus_dmamap_load_mem(24ba6100,26c39100,36cb30d4,132ba40,26c34000,1) at bus_dmamap_load_mem+0x2f2/frame 0x36cb30b4
bus_dmamap_load_ccb(24ba6100,26c39100,37e94bec,132ba40,26c34000,1) at bus_dmamap_load_ccb+0x4a/frame 0x36cb30f4
storvsc_action(2b104180,37e94bec) at storvsc_action+0x3a7/frame 0x36cb3140
xpt_run_devq(2b100080,36e59000,1cef7030,2b100090,37e94bec,...) at xpt_run_devq+0x287/frame 0x36cb317c
xpt_action_default(37e94bec) at xpt_action_default+0x3c6/frame 0x36cb31a0
scsi_action(37e94bec) at scsi_action+0x19/frame 0x36cb31b8
dastart(36e29100,37e94bec) at dastart+0x30d/frame 0x36cb31f4
xpt_run_allocq(36e29100,480) at xpt_run_allocq+0x8b/frame 0x36cb321c
cam_iosched_schedule(1cad0b80,36e29100) at cam_iosched_schedule+0x21/frame 0x36cb3230
dastrategy(36f00a78) at dastrategy+0x64/frame 0x36cb324c
g_disk_start(36ebe860,36ef496c,2d1abb00,1000,0,...) at g_disk_start+0x469/frame 0x36cb32ac
g_io_request(36ebe860,36e8df40,200,0,36ed0e00,...) at g_io_request+0x26b/frame 0x36cb32d4
g_part_start(36ef496c,396c9b14,2d1ab680,1000,0,...) at g_part_start+0x114/frame 0x36cb334c
g_io_request(36ef496c,2499aec0,36cb3398,20b3513,4e,...) at g_io_request+0x26b/frame 0x36cb3374
vdev_geom_io_start(37076b40,7,36c8ca7b,0,41,...) at vdev_geom_io_start+0x26f/frame 0x36cb33a0
zio_vdev_io_start(37076b40,7,36c8c503,1df16d7,1e14d3b,...) at zio_vdev_io_start+0x559/frame 0x36cb33e0
zio_nowait(37076b40,36cb3440,1ffc930,2,2032ca0,...) at zio_nowait+0x143/frame 0x36cb3420
vdev_mirror_io_start(370a0b40,36f87000,370a0b40,2157040,36cb347c,...) at vdev_mirror_io_start+0x13b/frame 0x36cb344c
zio_vdev_io_start(370a0b40,1df16d7,1e14d3b,1e185c7,1e185a1,...) at zio_vdev_io_start+0x559/frame 0x36cb348c
zio_execute(370a0b40,1,36cb3524,fcf452,370a0ea0,...) at zio_execute+0x93/frame 0x36cb34c4
taskq_run_ent(370a0ea0,1) at taskq_run_ent+0x1f/frame 0x36cb34d4
taskqueue_run_locked(152da780,1a55074,36cb3568,36cb3554,f2e09b,...) at taskqueue_run_locked+0x192/frame 0x36cb3524
taskqueue_thread_loop(36f94f90,36cb3568) at taskqueue_thread_loop+0xae/frame 0x36cb3538
fork_exit(fd0090,36f94f90,36cb3568,0,0,...) at fork_exit+0x6b/frame 0x36cb3554
fork_trampoline() at 0xffc0348e/frame 0x36cb3554

So the question seems to become: what initiated these two requests for 2048 bytes?
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2025-03-27 08:53:47 UTC
Created attachment 259079 [details]
possible patch

How much RAM does the instance have?

Does the attached patch help?
Comment 4 Dimitry Andric freebsd_committer freebsd_triage 2025-03-27 18:05:52 UTC
(In reply to Mark Johnston from comment #3)
Initially I started with 3G, to stay below the 4G barrier, but since i386 seems to have PAE I increased it to 8G. I can configure it any way you need.

Interestingly the patch seems to fix the issue! It boots to multi-user just fine now, and survives a "zpool scrub zroot".
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2025-03-28 08:50:14 UTC
(In reply to Dimitry Andric from comment #4)
I'm wondering if the panic is reproducible with less than 4GB of RAM.  In particular, I guess we're bouncing the I/O pages, and since the driver doesn't keep track of the 1st page offset, this flag fixes the problem.  Could you please try, say, 3GB of RAM, and without the patch?  In particular I note that the page address passed by storvsc_xferbuf_prepare() is a uint64_t, so maybe we can avoid bouncing entirely on i386.

If that works, it still doesn't explain why we need to disable the assertion on arm64.  Can we try the same test from comment 2 on arm64?
Comment 6 Dimitry Andric freebsd_committer freebsd_triage 2025-03-28 20:01:19 UTC
(In reply to Mark Johnston from comment #5)
> Could you please try, say, 3GB of RAM, and without the patch?

With 3G of RAM it boots fine, without the patch!

If I increase to 4G, it panics with "storvsc recvbuf is not large enough". If I increase to 5G or larger, it panics with "invalid 1st page, ofs 0x378f000, len 2048", so I suppose it then switches to PAE mode?
Comment 7 Mark Johnston freebsd_committer freebsd_triage 2025-03-29 08:38:53 UTC
Created attachment 259144 [details]
possible patch

(In reply to Dimitry Andric from comment #6)
> If I increase to 4G, it panics with "storvsc recvbuf is not large enough".

Weird, I have no idea about that.  I guess we should look at the packet length in vmbus_chan_recv().  A stack trace would also be useful.  But, does it happen at all with the patch in comment 3?

I think that patch is the correct solution.  The attached patch should also eliminate unnecessary bouncing on i386 systems with physical RAM above 4GB.  I think what's happening is that we are getting a 4KB I/O across two pages above 4GB, and busdma is bouncing it into the first 2KB of each of two pages below 4GB.  The storvsc driver isn't able to cope with that.

Again, it would be really nice to understand why the assertion fails on arm64, and whether the attached patch fixes that as well.  A storage driver like this really shouldn't be bouncing on 64-bit systems.
Comment 8 Dimitry Andric freebsd_committer freebsd_triage 2025-03-29 12:43:07 UTC
(In reply to Mark Johnston from comment #7)
> Weird, I have no idea about that.  I guess we should look at the packet length in vmbus_chan_recv().  A stack trace would also be useful.

Annoyingly, I cannot reproduce it anymore. Must have been a glitch.


> But, does it happen at all with the patch in comment 3?

No, that fixes it for all memory sizes.


> The attached patch should also eliminate unnecessary bouncing on i386 systems with physical RAM above 4GB.

I applied this latest one instead, and it also boots fine with different memory sizes.
Comment 9 Mark Millard 2025-03-29 21:12:14 UTC
(In reply to Mark Johnston from comment #7)

I've never been able to get FreeBSD to complete much
of the boot sequence under Hyper-V on the Windows
DevKit 2023 (aarch64) that has Windows 11 Pro.

The console output stops after the masks line of the
EFI framebuffer information or somewhat later. The
farthest I've seen is the Event Timer line from the
kernel output; otherwise it has stopped between those
points. No failure notices when it stops: just no
more output.

It does not get far enough for Hyper-V to be able to
do a shutdown. Hyper-V indicates the VM was still not
ready, even with a long wait first.

arm64 Windows 11 only supports v2 VMs. I converted a
downloaded FreeBSD .vhd to .vhdx in Hyper-V to have
something official to try. (V2 only supports .vhdx.)
(My personal builds have historically behaved
similarly to the above.)

I've never figured out how to get a serial console
and named pipe configuration to work. So I'm
dependent on the monitor being operational.

Anyway, my retry at this got no farther and I was
unable to get anywhere near figuring out how to test.
I've no clue how specific the Hyper-V problems may be
to the Windows DevKit 2023 type of context. So I'm not
sure if anyone else would be able to test. I do not
know if any of it is tied to Warner's UEFI console
related adjustments that he made.
Comment 10 Mark Johnston freebsd_committer freebsd_triage 2025-03-30 12:29:49 UTC
(In reply to Dimitry Andric from comment #8)
Thanks for testing.  Are you able to verify that the patch doesn't regress anything on amd64 and/or arm64?
Comment 11 Wei Hu 2025-03-30 13:25:27 UTC
(In reply to Mark Millard from comment #9)
You are likely hitting the problem introduced by the following commit:

https://cgit.freebsd.org/src/commit/?h=1b9096cd1d2fce1edb7077aebd3512cc61c54371

It causes arm64 boot hangs on Hyper-V. I heard Andrew Turner has fixed this recently. But I just tried boot from latest main on arm64 VM in Azure and it is hitting a different panic with stack like:

panic: vm_fault failed: 0xffff000000627fa8 error 1
cpuid = 17
time = 1743320704
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
vpanic() at vpanic+0x1a0
panic() at panic+0x48
data_abort() at data_abort+0x28c
handle_el1h_sync() at handle_el1h_sync+0x18
--- exception, esr 0x96000004
rib_notify() at rib_notify+0x38
add_route() at add_route+0xc8
add_route_flags() at add_route_flags+0x1bc
rib_add_route() at rib_add_route+0x33c
ifa_maintain_loopback_route() at ifa_maintain_loopback_route+0xfc
in6_update_ifa() at in6_update_ifa+0xa44
in6_ifattach() at in6_ifattach+0x1c0
in6_if_up() at in6_if_up+0x98
if_up() at if_up+0xd4
ifhwioctl() at ifhwioctl+0xc18
ifioctl() at ifioctl+0x8bc
kern_ioctl() at kern_ioctl+0x2e4
sys_ioctl() at sys_ioctl+0x140
do_el0_sync() at do_el0_sync+0x608
handle_el0_sync() at handle_el0_sync+0x4c
--- exception, esr 0x56000000
KDB: enter: panic
[ thread pid 202 tid 100285 ]
Stopped at      kdb_enter+0x48: str     xzr, [x19, #2048]

Anyway, 14.0 and 14.1 arm64 image should boot up fine on Hyper-V if you like to try Mark's patch.
Comment 12 Mark Millard 2025-03-30 16:04:29 UTC
(In reply to Wei Hu from comment #11)

What I was trying to boot was from the main [so: 15]
2025-Mar-27 VM image that I downloaded yesterday from:

https://download.freebsd.org/ftp/snapshots/VM-IMAGES/15.0-CURRENT/aarch64/Latest/FreeBSD-15.0-CURRENT-arm64-aarch64-zfs.vhd.xz

My pictures show main-n276101-fd52a9e11c52
with a 2025-Mar-27 date. So it is recent, for sure.

But the observed behavior is not new in my context. I
normally run main, either official builds or a
personal variant. If I use an alternate it is normally
an officially built stable/* (stable/14 at this point).

I converted the .vhd to .vhdx for use with Hyper-V v2.

FYI: aarch64 no longer has console=comconsole as a
possibility in the loader: just efi and eficom . So
instructions on https://wiki.freebsd.org/HyperV about
named pipe use for serial console use for v2 are out
of date and cannot be followed.

There is no 14.0 any more listed in:

https://download.freebsd.org/ftp/releases/VM-IMAGES/

There is no 14.0 or 14.1 any more listed in:

https://download.freebsd.org/ftp/snapshots/VM-IMAGES/

So I'm guessing you are recommending testing:

https://download.freebsd.org/ftp/releases/VM-IMAGES/14.1-RELEASE/aarch64/Latest/FreeBSD-14.1-RELEASE-arm64-aarch64-ufs.vhd.xz

for which the web page shows 2024-Jun-04 22:30 in the
Date column. But that is far from recent.

But maybe you mean:

FreeBSD-14.2-STABLE-arm64-aarch64-ufs.vhd.xz

which has the Date column showing 2025-Mar-27 10:59;
that would be recent?

Anyway, I'm unclear on just which available VM download
you are suggesting that I try just to get to the point of
some example operational Hyper-V session.
Comment 13 Mark Millard 2025-03-30 16:30:39 UTC
(In reply to Mark Millard from comment #12)
(In reply to Wei Hu from comment #11)

14.1-RELEASE from:

https://download.freebsd.org/ftp/releases/VM-IMAGES/14.1-RELEASE/aarch64/Latest/FreeBSD-14.1-RELEASE-arm64-aarch64-ufs.vhd.xz

(converted to .vhdx) did work for having an operational
Hyper-V session on the Windows DevKit 2023.

Thanks.
Comment 14 Wei Hu 2025-03-30 16:36:58 UTC
(In reply to Mark Millard from comment #12)
I see. Maybe adding the following at boot will just work for you:

console="comconsole efi vidconsole"
comconsole_speed="115200"
boot_multicons="YES"
boot_serial="YES"

These are directly copied from the FreeBSD images in the Azure Marketplace. I don't have an on-prem environment; everything I tested is from VMs in Azure. Now FreeBSD works on Gen2 Hyper-V for both amd64 and arm64.

There seem to be some issues on larger arm64 VMs (more than 4 vCPUs), due to too much parallelization breaking dependencies at boot time. If you keep your VM size under 4 vCPUs, you are less likely to hit the panic mentioned in comment 11.
Comment 15 Mark Millard 2025-03-30 19:00:32 UTC
(In reply to Mark Millard from comment #13)
(In reply to Wei Hu from comment #11)

Also working for being operational in my context
is 14.2-RELEASE from:

https://download.freebsd.org/ftp/releases/VM-IMAGES/14.2-RELEASE/aarch64/Latest/FreeBSD-14.2-RELEASE-arm64-aarch64-ufs.vhd.xz

The web page shows date: 2024-Nov-29 11:11

That is about a year after the 2023-11-21 10:02:58 +0000
commit that you referenced:

https://cgit.freebsd.org/src/commit/?h=1b9096cd1d2fce1edb7077aebd3512cc61c54371

Maybe the change was not MFC'd?
Comment 16 Dimitry Andric freebsd_committer freebsd_triage 2025-03-30 19:19:07 UTC
(In reply to Mark Johnston from comment #10)
As far as I can see, this code only affects i386 and aarch64? I haven't got access to an aarch64 machine that can do virtualization, unfortunately. Running Windows 11 for arm64 in VMware on my Mac M1 works, but it doesn't support the nested virtualization feature required for Hyper-V.
Comment 17 Mark Millard 2025-03-30 19:21:54 UTC
(In reply to Mark Millard from comment #15)
(In reply to Wei Hu from comment #11)

14.2-STABLE: Also working for being operational
in my context is 14.2-STABLE from:

https://download.freebsd.org/ftp/snapshots/VM-IMAGES/14.2-STABLE/aarch64/Latest/FreeBSD-14.2-STABLE-arm64-aarch64-ufs.vhd.xz

The web page shows date: 2025-Mar-27 10:59

That is a few days ago.
Comment 18 Mark Millard 2025-03-30 19:36:04 UTC
(In reply to Wei Hu from comment #14)

Going back to trying the VM for main's [15's] .vhdx . . .

FYI: at the loader prompt:

OK set console="comconsole efi vidconsole"

reports:

console comconsole is unavailable
console vidconsole is unavailable

Also doing the other lines:

OK set comconsole_speed="115200"
OK set boot_multicons="YES"
OK set boot_serial="YES"

and then:

OK boot

still has the same problem that I've reported.

At least 14.* does not have the problem.
Comment 19 Mark Millard 2025-03-30 20:00:44 UTC
(In reply to Mark Millard from comment #18)

FYI: comconsole for aarch64 was removed for 15+
by the following commit, but the "shim" the commit
references is only in place when:

defined(__aarch64__) && __FreeBSD_version < 1500000

The commit is from 2023-May-11:

From: Warner Losh <imp_at_FreeBSD.org>
Date: Thu, 11 May 2023 20:06:47 UTC
The branch main has been updated by imp:

URL: https://cgit.FreeBSD.org/src/commit/?id=f93416d677432f3a713c71b79fb68e89162baca9

commit f93416d677432f3a713c71b79fb68e89162baca9
Author:     Warner Losh <imp@FreeBSD.org>
AuthorDate: 2023-05-11 20:03:30 +0000
Commit:     Warner Losh <imp@FreeBSD.org>
CommitDate: 2023-05-11 20:06:03 +0000

    stand: add comconsole backwards compatibility shim for aarch64
    
    Add a compat shim for the "comconsole" name so that people with a
    "console=comconsole" in their loader.conf on aarch64 will continue to
    work (though with a warning).
    
    This is only aarch64: it will never be there for amd64 (where comconsole
    always means talk to the hardware directly). To do that is too hard.
    
    Sponsored by:           Netflix
    Differential Revision:  https://reviews.freebsd.org/D39983
---
 stand/efi/loader/conf.c        |  7 +++++++
 stand/efi/loader/efiserialio.c | 25 +++++++++++++++++++++++++
 2 files changed, 32 insertions(+)
. . .
Comment 20 Mark Millard 2025-03-30 23:07:09 UTC
(In reply to Mark Millard from comment #19)

Context: the problem with main booting that
is preventing getting far enough to help test.

FYI: substitution of the bootaa64.efi from the
stable/14 based VM into the main based VM is not
enough to avoid the problem in main, even with
the likes of using:

OK set console="comconsole"
OK set comconsole_speed="115200"
OK set boot_multicons="YES"
OK set boot_serial="YES"
OK boot

Somehow main's FreeBSD kernel code is involved.
Maybe there is common code between the loader
and the kernel, so that the substitution is
incomplete?
Comment 21 Mark Johnston freebsd_committer freebsd_triage 2025-03-31 00:55:21 UTC
(In reply to Dimitry Andric from comment #16)
The patch affects all platforms which support the storvsc driver.  Some testing on at least amd64 would be appreciated.

(In reply to Wei Hu from comment #11)
(In reply to Mark Millard from comment #9)
These problems are surely different bugs that deserve their own PRs?  For the routing panic, please include the revision you are testing.

The boot hang will not be easy to debug without being able to attach a debugger to the hypervisor.
Comment 22 Mark Millard 2025-03-31 02:07:47 UTC
(In reply to Mark Johnston from comment #21)

I recognized up front that what I was reporting
was likely not a related problem. I reported it
here because it makes it problematic for anyone
to answer your requests for testing if the testing
is to involve main and Windows 11 Pro Hyper-V
(and not Azure?). (This presumes others can
replicate my problems with the official VM files.)

Also, I had hoped that someone here would know
what the issue was and a way to avoid it. That
did not work out.

As stands, the notes might help someone from
having to repeat some of my explorations if
they look to try to help for aarch64.

I did send out a separate note to the lists
after finding that 14.* booted okay and only
main [so: 15] does not. I do not plan on more
here.
Comment 23 Mark Johnston freebsd_committer freebsd_triage 2025-03-31 02:13:27 UTC
(In reply to Mark Millard from comment #22)
Testing arm64 in Azure would be useful, I believe the storvsc driver would be in use there as well.
Comment 24 Mark Millard 2025-03-31 03:03:04 UTC
(In reply to Mark Johnston from comment #23)

I have Windows 10/11 Pro usage background,
with their Hyper-V's involved --and an
aarch64 Windows 11 Pro context available
with its Hyper-V available.

But I do not have an Azure or its Hyper-V
usage background. I've no established
Azure context.
Comment 25 Wei Hu 2025-03-31 09:53:36 UTC
(In reply to Mark Johnston from comment #23)
I have tested the patch on amd64 and arm64 on Azure VMs. It works just fine, though I was able to reproduce the assertion failure on arm64 on the most recent current build.

Thanks for fixing this!
Comment 26 Mark Johnston freebsd_committer freebsd_triage 2025-03-31 09:55:04 UTC
(In reply to Wei Hu from comment #25)
Sorry, which assertion failure do you mean?  The "invalid 1st page" one?  And it still occurs with the patch applied?
Comment 27 Wei Hu 2025-03-31 10:27:02 UTC
(In reply to Mark Johnston from comment #26)
(In reply to Wei Hu from comment #25)
Sorry I had a typo in my Comment #25...  

"though I was able to reproduce the assertion failure on arm64" should be "though I was NOT able to reproduce the assertion failure on arm64". 

If I just remove "#if !defined(__aarch64__)" in storvsc_xferbuf_prepare() without fully applying your patch, it doesn't hit the assertion on the latest current. I guess maybe the following commit has helped to avoid bounce buffering?

https://cgit.freebsd.org/src/commit/?id=e7a9817b8d328dda04069b65944ce2ed6f54c6f0

Anyway, the patch looks good and it doesn't cause any regression on amd64 or arm64 according to my test.
Comment 28 commit-hook freebsd_committer freebsd_triage 2025-03-31 11:16:23 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=54a3920dc9b3b5a47cdaaa3132b4fcf1c448a737

commit 54a3920dc9b3b5a47cdaaa3132b4fcf1c448a737
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-03-31 10:45:55 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-03-31 11:15:45 +0000

    hyperv/storvsc: Avoid conditional asserts in storvsc_xferbuf_prepare()

    whu@ cannot reproduce the assertion failure which led to these ifdefs
    being added in the first place, and since they appear wrong, i.e., the
    assertions ought to apply to all platforms, let's remove them.

    This reverts commits 0af5a0cd2788efce9f444f4f781357d317bb0bb1 and
    6f7b1310b6fe36f9bb653d3e97bc257adced3a2b.

    PR:             285681
    Tested by:      whu
    MFC after:      2 weeks

 sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c | 2 --
 1 file changed, 2 deletions(-)
Comment 29 commit-hook freebsd_committer freebsd_triage 2025-03-31 11:16:24 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=a319ba694538a38429115aaaf1d4b3946ea3a8b5

commit a319ba694538a38429115aaaf1d4b3946ea3a8b5
Author:     Mark Johnston <markj@FreeBSD.org>
AuthorDate: 2025-03-31 10:45:14 +0000
Commit:     Mark Johnston <markj@FreeBSD.org>
CommitDate: 2025-03-31 11:14:42 +0000

    hyperv/storvsc: Fix busdma constraints

    - The BUS_DMA_KEEP_PG_OFFSET flag is needed, since
      storvsc_xferbuf_prepare() assumes that only the first segment may have
      a non-zero offset, and that all following segments are page-sized and
      -aligned.
    - storvsc_xferbuf_prepare() handles 64-bit bus addresses, so avoid
      unneeded bouncing on i386.

    PR:             285681
    Reported by:    dim
    Tested by:      dim, whu
    MFC after:      2 weeks

 sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
Comment 30 Mark Johnston freebsd_committer freebsd_triage 2025-03-31 11:20:07 UTC
(In reply to Wei Hu from comment #27)
Ok, good enough for me. :) Thanks for testing.