I am running FreeBSD 13.2 on my Windows 2023 Dev Kit. I had to apply D37765 and cherry pick D38031 to get this to work. hw.pac.enable=0 is needed in /boot/loader.conf to make the kernel boot. I have installed FreeBSD on ZFS on the internal NVMe SSD. Now I have noticed the following problem: if I have any USB storage device attached during boot, the loader crashes with a synchronous exception after displaying the beastie menu. I have unfortunately not managed to capture the address shown. With no USB storage device attached, the device boots just fine. Due to the device now being in a data center I will not be able to do any boot loader testing. I can however provide you with any other information you might need and give you copies of the binaries involved.
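For anyone reproducing this setup, the loader tunable mentioned above could be put in place idempotently with something like the following sketch. The helper name `ensure_tunable` is hypothetical, and the file is passed as an argument so the function can be tried on a scratch copy before touching the real /boot/loader.conf.

```shell
# Hypothetical helper: append name=value to a loader.conf-style file
# unless some line already starts with "name=".
ensure_tunable() {
    file=$1
    name=$2
    value=$3
    # index() does a literal prefix match, so dots in the name are not
    # treated as regex wildcards. A missing file counts as "not present".
    if ! awk -v k="${name}=" 'index($0, k) == 1 { f = 1 } END { exit !f }' \
        "$file" 2>/dev/null; then
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}

# Example against a scratch file rather than the real /boot/loader.conf:
# ensure_tunable /tmp/loader.conf.test hw.pac.enable 0
```

Calling it twice leaves a single line in the file, so it is safe to rerun.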
Is it any USB device? Or one particular one? I boot 20 times a day with USB storage devices attached to my UEFI FreeBSD test machine sometimes... So there's something different about your setup we need to understand.
(In reply to Warner Losh from comment #1) It happened with two USB sticks and with a SATA M.2 SSD in an M.2-SATA to USB adapter. A USB keyboard worked fine. The SATA M.2 SSD had no partition table (gpart destroy had been run on its GPT partition table); the USB sticks had GPT partition tables and various file systems (FAT32 and UFS, I think).
Confirmed with main FreeBSD USBC boot media attached. This is media I use to boot 3 types of Cortex-A72 systems, 2 by UEFI/ACPI (HoneyComb, MACCHIATObin Double Shot), and 1 by U-Boot (RPi4Bs). Also: 2 Cortex-A53 U-Boot based systems, a RPi3B and RPi2B v1.2. Also: ZFS root media. The media has main [so: 14] and has the commits for D37765 and D38031. I had put in place hw.pac.enable=0 as well. Result on the Windows Dev Kit 2023 was a quick:
Synchronous Exception 0x0000000092F922FC
The sequence for the first Dev Kit Power On was:
1st: Power on with UEFI button and set the UEFI to use no secure boot keys. Set USB as the first context to try to boot from/via.
2nd: Try to boot with the FreeBSD media attached to USBC. Result: the exception. (So Windows 11 Pro had never been started yet and the internal media was undisturbed.)
Everything I've tried to boot that USBC media has gotten the same result. I've not found any UEFI settings that make a difference. Later I'll try some blank USBC media and possibly other variations to see if something more specific about the media content leads to the specific failure report vs. possibly other failures. I'll note that I've no plan to remove Windows 11 Pro from the internal media. I want to be able to swap external media and boot, such as switching between ZFS and UFS based systems. I'll also probably try FreeBSD in Hyper-V at some point.
(In reply to Mark Millard from comment #3) I took an as-factory-shipped example of the type of USBC media (but 1 TiByte instead of 2 TiByte) and tried to boot with the media attached (but no OS or such present on the media). Result: no Synchronous Exception; eventually it booted to the internal Windows 11 Pro (despite UEFI having the internal media unchecked). So it appears that the Synchronous Exception that I've gotten on the original media is a response to something on/from that media. Merely having non-boot media connected did not cause a problem.
(In reply to Mark Millard from comment #4) I booted the media in question on another machine and did:
# mv /boot/efi/EFI /boot/efi/EFI-disabled
Then I tried booting the Dev Kit machine with the media connected: no Synchronous Exception; eventually it booted to the internal Windows 11 Pro. So it appears that the FreeBSD boot loader is involved in the problem.
Side note: I've now also tried a USB3-A style port instead of USB3-C. Both types of ports get the issue. (Given some Microsoft wording I was not sure that USB3-A ports were involved in potential booting. They are.)
FYI:
# ls -Tld /boot/efi/EFI/*/*
-r-xr-xr-x 1 root wheel 865292 Mar 15 21:30:46 2023 /boot/efi/EFI/BOOT/bootaa64.efi
-rwxr-xr-x 1 root wheel 865292 Mar 15 21:30:46 2023 /boot/efi/EFI/FREEBSD/loader.efi
They still match what is on my normal environment that still predates the openzfs import disaster.
(In reply to Mark Millard from comment #5) Well, "has the commits for D37765 and D38031": built/installed only. I needed to update the .efi files on the msdosfs. (Done now.) That gets things to where the FreeBSD kernel activity gets to the point of the root file system mount, no more exceptions. But it ends up complaining that it can not find the pool label for 'zroot'. (Lots of times.) So it fails to mount the root file system. Earlier there are a few ACPI errors/warnings I see when I scroll back (only on screen, no serial console): ACPI Error: AE_NOT_FOUND, While resolving a named reference package element -\_SB_.UBF0.PRT0 (20221020/dspkginit-605) ACPI Error: AE_NOT_FOUND, While resolving a named reference package element -\_SB_.UBF0.PRT1 (20221020/dspkginit-605) ACPI Warning: \_SB.GPU._CLS: Return Package is too small - found 1 element, expected 3 (20221020/nsprepkg-511) can't fetch resource for \_SB|.ADC1 - AE_AML_INVALID_RESOURCE_TYPE
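Since the root cause here was stale .efi files on the msdosfs, a quick check along the following lines might catch it earlier. This is only a sketch: `stale_efi_copies` is a hypothetical helper name, and using /boot/loader.efi as the reference copy is an assumption about where the freshly installed loader lives.

```shell
# Hypothetical helper: print every candidate file whose contents differ
# from the reference loader (a missing candidate is also printed).
# $1 is the reference; the remaining arguments are the copies to check.
stale_efi_copies() {
    ref=$1
    shift
    for f in "$@"; do
        # cmp -s is silent and exits non-zero on any difference or error.
        cmp -s "$ref" "$f" || printf '%s\n' "$f"
    done
}

# ESP paths as they appear in this report; the reference path is assumed.
# stale_efi_copies /boot/loader.efi \
#     /boot/efi/EFI/BOOT/bootaa64.efi /boot/efi/EFI/FREEBSD/loader.efi
```

No output would mean every listed copy is byte-identical to the reference.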
(In reply to Mark Millard from comment #6) I found the distinction that controls failure vs. success in booting via the USB3 ports: USB3-C ugen0.5: <GenesysLogic USB3.2 Hub> at usbus0 ports: ZFS and UFS boots fail. USB3-A ugen0.1: <Generic XHCI root HUB> at usbus0 ports: ZFS and UFS boots work. Looks like the FreeBSD kernel does not handle USB3.2 (but the UEFI/ACPI does for the FreeBSD loader). This may make the Windows Dev Kit 2023 a useful context for development work on handling more modern USB3.*'s. I'll note that https://learn.microsoft.com/en-us/windows/arm/dev-kit/ reports: QUOTE When connecting an external keyboard or mouse, use the USB-A ports, not USB-C. Using USB-C to connect a keyboard or mouse will only work intermittently. END QUOTE (It is unclear if that is a Windows specific issue, UEFI issue, both, or more.) For reference for the UFS USB3-C boot failures, the messages are: Mounting from ufs:/dev/gpt/CA72USBufs failed with error 22; retrying for 10 more seconds Mounting from ufs:/dev/gpt/CA72USBufs failed with error 22; invalid fstype.
Just FYI: A problem that I've noticed is: # date Wed Dec 31 16:50:41 PST 1969 despite /etc/rc.conf having: ntpd_enable="YES" ntpd_sync_on_start="YES" and it working booting other machines.
Another FYI of an oddity (during a buildworld):
# sysctl -a | grep "temp.*[0-9]C$"
hw.acpi.thermal.tz31.temperature: -273.1C
hw.acpi.thermal.tz30.temperature: -273.1C
hw.acpi.thermal.tz29.temperature: -273.1C
hw.acpi.thermal.tz28.temperature: -273.1C
hw.acpi.thermal.tz27.temperature: -273.1C
hw.acpi.thermal.tz26.temperature: -273.1C
hw.acpi.thermal.tz25.temperature: -273.1C
hw.acpi.thermal.tz24.temperature: -273.1C
hw.acpi.thermal.tz23.temperature: -273.1C
hw.acpi.thermal.tz22.temperature: -273.1C
hw.acpi.thermal.tz21.temperature: -273.1C
hw.acpi.thermal.tz20.temperature: -273.1C
hw.acpi.thermal.tz19.temperature: -273.1C
hw.acpi.thermal.tz18.temperature: -273.1C
hw.acpi.thermal.tz17.temperature: -273.1C
hw.acpi.thermal.tz16.temperature: -273.1C
hw.acpi.thermal.tz15.temperature: -273.1C
hw.acpi.thermal.tz14.temperature: -273.1C
hw.acpi.thermal.tz13.temperature: -273.1C
hw.acpi.thermal.tz12.temperature: -273.1C
hw.acpi.thermal.tz11.temperature: -273.1C
hw.acpi.thermal.tz10.temperature: -273.1C
hw.acpi.thermal.tz9.temperature: -273.1C
hw.acpi.thermal.tz8.temperature: -273.1C
hw.acpi.thermal.tz7.temperature: -273.1C
hw.acpi.thermal.tz6.temperature: -273.1C
hw.acpi.thermal.tz5.temperature: -273.1C
hw.acpi.thermal.tz4.temperature: -273.1C
hw.acpi.thermal.tz3.temperature: -273.1C
hw.acpi.thermal.tz2.temperature: -273.1C
hw.acpi.thermal.tz1.temperature: -273.1C
hw.acpi.thermal.tz0.temperature: -273.1C
(In reply to Mark Millard from comment #9) The thermal zones for some reason do not have registers to read the temperature from. Hence some internal interface returns -1, which is converted into slightly below absolute zero. Your information about USB 2 vs USB 3 is interesting. I did my previous testing with a SATA HDD attached to an USB to SATA adapter, which should be using USB 3. Will test with that one again next week.
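To illustrate why a bogus raw reading shows up as roughly absolute zero: ACPI thermal zones report temperature in tenths of a kelvin, so any raw value at or just below zero converts to about -273 C. The sketch below uses integer arithmetic only; `dk_to_centi_c` is a hypothetical name, and the exact raw value and rounding that produce the "-273.1C" shown by sysctl are not confirmed here.

```shell
# Hypothetical helper: convert tenths of a kelvin to hundredths of a
# degree Celsius using integer arithmetic only (0 C = 273.15 K).
dk_to_centi_c() {
    printf '%d\n' "$(( $1 * 10 - 27315 ))"
}

dk_to_centi_c 2982   # 298.2 K: prints 2505, i.e. 25.05 C
dk_to_centi_c 0      # 0 K: prints -27315, i.e. -273.15 C
dk_to_centi_c -1     # invalid reading: prints -27325, i.e. -273.25 C
```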
(In reply to Robert Clausecker from comment #10) Previous testing, as in, before I finally set up the machine. After setting it up, I tried attaching a variety of other USB disks that may have all supported a later protocol level. However, in one of my previous attempts (http://fuz.su/~fuz/files/volterra-dmesg-7.log), you can clearly see that the boot disk is attached via the integrated USB 3.2 hub (however, it was a USB A port): ugen0.4: <GenesysLogic USB3.2 Hub> at usbus0 uhub2 on uhub0 uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 3> on usbus0 uhub2: 4 ports with 3 removable, self powered (...) ugen0.6: <ASMedia AS2115> at usbus0 umass0 on uhub2 umass0: <ASMedia AS2115, class 0/0, rev 3.00/0.01, addr 5> on usbus0 umass0: SCSI over Bulk-Only; quirks = 0x0100 umass0:1:0: Attached to scbus1 da0 at umass-sim0 bus 0 scbus1 target 0 lun 0 da0: <ASMT 2115 0> Fixed Direct Access SPC-4 SCSI device da0: Serial Number 00000000000000000000 da0: 400.000MB/s transfers da0: 152627MB (312581808 512 byte sectors) da0: quirks=0x2<NO_6_BYTE> da0: Delete methods: <NONE(*),ZERO> GEOM: new disk da0 The 1969 date is because FreeBSD does not detect an RTC clock. I'm not sure if the machine has one; there's no battery inside, so how would it keep the date? See dmesg log: Warning: no time-of-day clock registered, system time will not be set accurately
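As a stopgap on machines without a registered time-of-day clock, a build script could refuse to start while the clock is obviously unset. A minimal sketch, assuming an arbitrary floor of 2023-01-01 00:00:00 UTC; `clock_looks_unset` is a hypothetical name:

```shell
# Hypothetical helper: succeed when the given epoch-seconds value is
# earlier than a chosen floor, suggesting the clock was never set.
# The floor (2023-01-01 00:00:00 UTC) is an arbitrary assumption.
clock_looks_unset() {
    floor=1672531200
    [ "$1" -lt "$floor" ]
}

# The date shown in this report is 3041 seconds after the epoch:
# 16:50:41 PST is 00:50:41 UTC, and 50 * 60 + 41 = 3041.
if clock_looks_unset "$(date +%s)"; then
    echo "clock looks unset; set the date before building" >&2
fi
```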
(In reply to Robert Clausecker from comment #10) I did not write anything about USB2, only USB3.? . The issue is USB3.0 vs USB3.2 for the hardware in the Windows Dev Kit 2023 (WDK23) hubs/ports and its handling by the FreeBSD kernel. I used the exact same drive connected to different places on the WDK23.
(In reply to Mark Millard from comment #12) Weird. I only ever tried to connect to the USB A ports. Maybe the USB A ports are not all the same?
(In reply to Robert Clausecker from comment #11) All my ZFS testing was with the same drive in different ports. All my UFS testing was with the same drive in different ports. (ZFS drive vs. UFS drive: same type but distinct instances.) The 2 drives are USB3.2 capable but are compatible/capable with USB3.0 (and with USB2). In this context, the WDK23 internal hubs and ports are a mix of USB3.0 and USB3.2. As I understand it, even for a USB3.0 device, when attached to a USB 3.2 hub/port the kernel has somewhat different activity to do. The hub/port is not fully transparent of itself. (Maybe you were referencing my keyboard/mouse note, where I did not reference USB2 explicitly. I do not expect that any keyboards/mice issues are relevant to the storage media issues.) As for the time: RPi4B's do not have an RTC but the ntpd startup I use deals with setting up the time anyway. That did not happen here. I'm unsure why. I ended up manually setting the date in order to allow my buildworld buildkernel test. (Again, I sometimes boot the RPi4B's with the same drives that I used for the Windows Dev Kit 2023 testing.) As for temperature: If what you report is true, it is odd that the UEFI/ACPI implementation supplies definitions for non-existing sensors.
(In reply to Robert Clausecker from comment #13) Looking again at the log for the successful boot that I was referencing, it is not as I said (from grep for usb/uhub references): Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0 on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0: 6 ports with 6 removable, self powered Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2 on uhub0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 4> on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2: 4 ports with 3 removable, self powered Dec 31 16:00:24 CA72_4c8G_ZFS kernel: umass0 on uhub2 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 6> on usbus0 (That was a USB-A connection.) (Unfortunately, no logs from the failing contexts. Tomorrow I can likely scroll back on screen and find and record where it reports umass0 as being when I use USB-C, tracing back to where the XHCI root HUB is.) But your log's subsequence is the same: uhub0 on usbus0 uhub0: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 usbus0: 5.0Gbps Super Speed USB v3.0 uhub0: 6 ports with 6 removable, self powered uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 3> on usbus0 uhub2: 4 ports with 3 removable, self powered uhub2 on uhub0 uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 3> on usbus0 umass0 on uhub2 umass0: <ASMedia AS2115, class 0/0, rev 3.00/0.01, addr 5> on usbus0
(In reply to Robert Clausecker from comment #13) Looking again at the log for the successful boot that I was referencing, it is not as I said (from grep for usb/uhub references) and what varies between your log and mine is something else: Mine was a USB3.2 device on the USB3.2 hub but yours was a USB3.0 device on the USB3.2 hub. My backtrace from umass0 to Generic XHCI root HUB: Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0 on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub0: 6 ports with 6 removable, self powered Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2 on uhub0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 4> on usbus0 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: uhub2: 4 ports with 3 removable, self powered Dec 31 16:00:24 CA72_4c8G_ZFS kernel: umass0 on uhub2 Dec 31 16:00:24 CA72_4c8G_ZFS kernel: umass0: <Samsung PSSD T7 Touch, class 0/0, rev 3.20/1.00, addr 6> on usbus0 (Note that last "rev 3.20".) Your backtrace from umass0 to Generic XHCI root HUB: uhub0 on usbus0 uhub0: <Generic XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0 usbus0: 5.0Gbps Super Speed USB v3.0 uhub0: 6 ports with 6 removable, self powered uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 3> on usbus0 uhub2: 4 ports with 3 removable, self powered uhub2 on uhub0 uhub2: <GenesysLogic USB3.2 Hub, class 9/0, rev 3.20/61.24, addr 3> on usbus0 umass0 on uhub2 umass0: <ASMedia AS2115, class 0/0, rev 3.00/0.01, addr 5> on usbus0 (Note that last "rev 3.00".)
(In reply to Mark Millard from comment #16) FYI: The 2 "backtraces" are presented in forward-time order, so backtracing is reading each bottom-to-top.
(In reply to Mark Millard from comment #17) I set up the USB-C connection context again and am looking on screen at the scroll back for the failure: No "umass*" ever shows up for my failing context (USB-C in use). (But the FreeBSD loader loaded the kernel from the drive just fine via UEFI's drive I/O support.) This does not match your failure's log. So there may be 2 distinct problems for our 2 failures. After sleeping, I may try to set up a USB3.0/USB-A context to see if I can replicate your failure. Probably using UFS. By the way, in the UEFI, what is the boot order you have it using for finding a boot media? Did you disable any of the options? Which? I moved USB to the top (first) and disabled the others. (It still eventually boots Windows 11 Pro if no USB EFI loader is found.)
(In reply to Mark Millard from comment #18) I tried various boot orders and none of them changed the result. I believe that once the boot loader is successfully loaded by UEFI, the boot order ceases to be of importance.
(In reply to Robert Clausecker from comment #0) Ultimately, using main [so: 14], I've not been able to reproduce any "after displaying the beastie menu" crashes based on USB storage having been connected during the boot. USB3.2 and USB3.0 devices. I cover 3 of the 4 combinations relative to port types for my test context: USB3.2 in USB-C port (no "umass0" or no "umass1" or . . .) USB3.2 in USB-A port (works) USB3.0 in USB-A port (works) Unfortunately, I do not have a way to form a USB3.0/USB-C combination. The USB3.2 in USB-C port case has differing consequences for boot media (no root mount or the like) vs. having the same result as not plugging the drive into a port at all: not detected. (In my context, the FreeBSD boot media is always a "umass*".) I no longer maintain an environment for building stable/* or releng/* variants. So it may be a main vs. releng distinction compared to your results. I've not checked. I'll also note that I do not have access to a variety of media of the types listed. It could be some other distinction is involved that happens to correlate with USB3.2 in my context and that some other USB3.2 storage media would work. I've no way to know from what I've available to test. (Of course, a bunch of my comments ended up being the process of figuring my own operator error: All those reporting a Synchronous Exception.)
(In reply to Robert Clausecker from comment #19) Note: If you get ahold of a main [so: 14] loader.efi copy that has the 2 required commits, you could try substituting that loader.efi content into your 13.2-RELEASE media and see if you then end up with what I've reported. I've not checked the latest snapshot for those commits but at some point extracting a loader.efi from a main snapshot would allow the experiment (before stable or release had such).
(In reply to Mark Millard from comment #21) Unfortunately the device is colocated now and hard for me to access. It's also busy 24/7 building ports. I'll see if I can find an opportunity to perform these tests.
(In reply to Mark Millard from comment #20) I've submitted: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271012 against main's kernel for the failure context that I ran into (since it does not match yours as far as we can tell so far).
(In reply to Robert Clausecker from comment #22) FYI: I have tested one of the latest aarch64 snapshots, with /boot/loader.conf adjusted, and it worked fine for booting the Windows Dev Kit 2023 as the snapshot's first boot. So the loader.efi and kernel involved look to be appropriate. The media was one of the USB3.0/USB-A devices.
(In reply to Mark Millard from comment #20) In a few days I should have an adapter to connect the USB3.0 SSDs that have a USB-A connector to USB-C ports, such as on the Windows Dev Kit 2023. So I should then be able to test USB3.0 devices on the USB-C ports instead of only USB3.2 devices on those ports, including for being present during booting. (Still only one type of USB3.0 SSD device, not a variety. But one type is more than zero types.) Of course, my test would not be likely to duplicate the partitioning or content of the drives that got the crashes. Duplication of some aspects may be required to see the problem and if my tests do not get the crash, such would be suggested as a possibility.
(In reply to Mark Millard from comment #25) Interestingly, using the adaptor to USB-C for plugging in media after booting FreeBSD, I get different results on different systems (the only FreeBSD USB-C contexts that I've access to):
ThreadRipper 1950X: media is detected.
Windows Dev Kit 2023: same media is not detected.
The ThreadRipper is a USB 3.1 context for the USB-C connector, not a USB 3.2 context. The WDK23 has its note about keyboards and mice via its USB-C connector which may indicate something relevant. The media here is a USB3.0 SSD. As for having the USB-C connection present during boot loader activity: that did not cause the loader any problems. The only way that I've ever found to have a synchronous exception during loader activity is from having plugged in FreeBSD boot media with too old of a UEFI loader, and for the UEFI to also have picked that media to get the UEFI loader from. Then that loader leads to the synchronous exception. The UEFI does not give a way to explicitly pick which USB device to boot from when more than one "bootable" USB media is present. Maybe the port scan is in a fixed order or some such. I've not tested for such. Overall: I'm unable to reproduce the problem that was reported.