Bug 282789

Summary: iwlwifi hangs on boot with request_module_nowait("iwlmvm") -- broken after 2ac644317e671b001d3fb8fd924a1ae808a0bf32
Product: Base System
Reporter: Ruslan Makhmatkhanov <rm>
Component: wireless
Assignee: Bjoern A. Zeeb <bz>
Status: In Progress
Severity: Affects Only Me
CC: bz, emaste, junchoon, markj, pat, pete, wireless
Priority: ---
Keywords: regression
Version: CURRENT
Flags: bz: mfc-stable14?, bz: mfc-stable13?
Hardware: amd64
OS: Any
Bug Depends on:
Bug Blocks: 273620
Attachments:
  Description     Flags
  boot hang       none
  ctrl+t          none
  verbose boot    none

Description Ruslan Makhmatkhanov freebsd_committer freebsd_triage 2024-11-15 22:59:25 UTC
Created attachment 255206 [details]
boot hang

Hello, this is the last revision at which my wireless connection worked:

==========
commit 65691b2dafda23691c3989749def755a98e731ec (HEAD)
Author: Robert William Vesterman <bob@vesterman.com>
Date:   Thu Oct 17 22:54:39 2024 -0400

	libexec/rc/rc.d/netif: Typo fix
==========

If I fetch the next couple of wifi-related commits, the system just stops booting at the iwlwifi initialization stage. See the attached screenshot.

The earliest commit I tested at which wifi is broken:

==========
commit 878ede1a0d0f10f851b2bc54be1e28f512bfc016
Author: Mark Johnston <markj@FreeBSD.org>
Date:   Mon Oct 28 13:51:58 2024 +0000

    fstyp: Fix some memory safety bugs
==========


The last commit I tested, to no avail:
==========
commit 12fc79619acaa3041f9c032bc9afe7d2005b942e (HEAD -> main, origin/main, origin/HEAD)
Author: Randall Stewart <rrs@FreeBSD.org>
Date:   Fri Nov 15 12:37:05 2024 -0500

    Change the SOCKBUF_LOCK calls to use the more refined SOCK_XXXBUF_LOCK/UNLOCK.
==========

I also tried to install wifi-firmware-iwlwifi-kmod-20241017_1.

That's my adapter:
==========
iwlwifi0@pci0:0:20:3:	class=0x028000 rev=0x00 hdr=0x00 vendor=0x8086 device=0x02f0 subvendor=0x8086 subdevice=0x0070
    vendor     = 'Intel Corporation'
    device     = 'Comet Lake PCH-LP CNVi WiFi'
    class      = network
==========

Since I don't see any complaints on the mailing lists about this issue, it looks like something specific to my particular chipset, so please help me figure out what's wrong.
Comment 1 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-11-15 23:14:35 UTC
(a) Can you power off your machine entirely and try again (do you do any dual-boot or wifi passthru)? I had one (private) report which was exactly like yours and was solved that way by the user.

(b) If that doesn't help, please do a boot -v (bootverbose); if the hang persists, it would be interesting to know where it hangs and on what. Does ^T reveal anything? If not, can you use the debugger and check?
Comment 2 Ruslan Makhmatkhanov freebsd_committer freebsd_triage 2024-11-15 23:29:43 UTC
(In reply to Bjoern A. Zeeb from comment #1)
a) Yes, I have Windows as a second system, but there are no issues with that on kernels prior to 2ac644317e671. I tried powering off twice, but nothing changed. Enter and ^T are working, but ctrl+alt+del is not, so the only thing I can do is power off with the button.

b) bootverbose doesn't shed any light as far as I can tell. Nor does ^T: it shows just the load average entry. I attached screenshots for these. I can use the debugger if you tell me what to do. I have no experience with that.
Comment 3 Ruslan Makhmatkhanov freebsd_committer freebsd_triage 2024-11-15 23:30:08 UTC
Created attachment 255208 [details]
ctrl+t
Comment 4 Ruslan Makhmatkhanov freebsd_committer freebsd_triage 2024-11-15 23:30:28 UTC
Created attachment 255209 [details]
verbose boot
Comment 5 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-11-16 01:10:42 UTC
Do you know how to get past this issue (boot -s, mount -uw /, edit /etc/rc.conf and set devmatch_blocklist="if_iwlwifi" to avoid it being loaded automatically, exit)?

I'll drop you some debugging and other changes to try privately.
Comment 6 Tomoaki AOKI 2024-11-16 13:07:26 UTC
The same happens after updating main on amd64
 from: commit 439fa16e1fd35181898b61264b205bf3b7103a41
 to:   commit 566c039d1e7555343fcf6439a10e56f5a632c0fe

For me, even ctrl-alt-del doesn't work, and a forced power-off was needed.
Blocklisting if_iwlwifi allowed me to boot sanely.


An old hardware probe of exactly the same computer (except for the attached USB
devices; no NIC is attached via USB) is at the URL below.

http://bsd-hardware.info/?probe=676f16ac86
Comment 7 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-11-16 15:54:11 UTC
Ruslan sent me debug information and I now understand where the problem comes from, but not yet why it happens only for some chipsets.

Or possibly it is not the chipsets but something else; the entire order is triggered from SYSINITs via module_init; in the cases that hang, iwl_mvm_init() has not run (or has not run to completion).

For the cards here it looks like this (with XXX-BZ printfs added to show the problem):

------
Autoloading module: if_iwlwifi                                                                                                                               
Intel(R) Wireless WiFi based driver for FreeBSD                                                                                                              
XXX-BZ iwl_opmode_register:1933: name 'iwlmvm' ops 0xffffffff82ba6d38                                                                                        
XXX-BZ iwl_opmode_register:1940: name 'iwlmvm' ops 0xffffffff82ba6d38 op 0xffffffff82baf0e0

^^^^ this is missing for the people with "hangs"
                                                                  
pci0: driver added                                                                                                                                           
found-> vendor=0x8086, dev=0x2725, revid=0x1a                                                                                                                
        domain=0, bus=0, slot=5, func=0                                                                                                                      
        class=02-80-00, hdrtype=0x00, mfdev=0                                                                                                                
        cmdreg=0x0407, statreg=0x0010, cachelnsz=0 (dwords)                                                                                                  
        lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)                                                                                         
        powerspec 3  supports D0 D3  current D0                                                                                                              
        MSI supports 1 message, 64 bit                                                                                                                       
        MSI-X supports 16 messages in map 0x10                                                                                                               
pci0:0:5:0: reprobing on driver added                                                                                                                        
iwlwifi0: <iwlwifi> mem 0x800010000-0x800013fff at device 5.0 on pci0                                                                                        
iwlwifi0: attempting to allocate 6 MSI-X vectors (16 supported)                                                                                              
msi: routing MSI-X IRQ 60 to local APIC 0 vector 57                                                                                                          
msi: routing MSI-X IRQ 61 to local APIC 1 vector 51                                                                                                          
msi: routing MSI-X IRQ 62 to local APIC 2 vector 52                                                                                                          
msi: routing MSI-X IRQ 63 to local APIC 3 vector 51                                                                                                          
msi: routing MSI-X IRQ 64 to local APIC 0 vector 58                        
msi: routing MSI-X IRQ 65 to local APIC 1 vector 52                                                                                                          
iwlwifi0: using IRQs 60-65 for MSI-X                                                                                                                         
iwlwifi0: Detected crf-id 0x400410, cnv-id 0x400410 wfpm id 0x80000000                                                                                       
iwlwifi0: PCI dev 2725/0024, rev=0x420, rfid=0x10d000                                                                                                        
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX210 160MHz                                                                                                             
...
------

So, for the people for whom it does not work:

in iwl_req_fw_callback() op->ops is not set, and it tries to load the iwlmvm module (which we don't have, because we never split the driver, because we do not have dvm support).

I'll go and investigate why we never make it to the second SYSINIT.
Comment 8 Bjoern A. Zeeb freebsd_committer freebsd_triage 2024-12-08 20:35:24 UTC
Can you try the patch from:

https://reviews.freebsd.org/D47994

It currently leaves a printf on boot. As mentioned in the review, I cannot test this, so there may be secondary issues. It would be good to know whether it works, and to have the output from loading the driver for the archives.
Comment 9 pete 2024-12-09 05:29:54 UTC
Hi, I can confirm that this patch now allows me to successfully load the iwlwifi kernel module on my CNVi device:

$ dmesg|grep iwlwifi
iwlwifi0: <iwlwifi> mem 0xdd338000-0xdd33bfff at device 20.3 on pci0
iwlwifi0: Detected crf-id 0x2816, cnv-id 0x1000100 wfpm id 0x80000000
iwlwifi0: PCI dev 9df0/0030, rev=0x312, rfid=0x105110
iwlwifi0: Detected Intel(R) Wireless-AC 9560 160MHz
iwlwifi0: successfully loaded firmware image 'iwlwifi-9000-pu-b0-jf-b0-46.ucode'
iwlwifi0: WRT: Overriding region id 0
iwlwifi0: WRT: Overriding region id 1
iwlwifi0: WRT: Overriding region id 2
iwlwifi0: WRT: Overriding region id 3
iwlwifi0: WRT: Overriding region id 4
iwlwifi0: WRT: Overriding region id 6
iwlwifi0: WRT: Overriding region id 8
iwlwifi0: WRT: Overriding region id 9
iwlwifi0: WRT: Overriding region id 10
iwlwifi0: WRT: Overriding region id 11
iwlwifi0: WRT: Overriding region id 15
iwlwifi0: WRT: Overriding region id 16
iwlwifi0: WRT: Overriding region id 18
iwlwifi0: WRT: Overriding region id 19
iwlwifi0: WRT: Overriding region id 20
iwlwifi0: WRT: Overriding region id 21
iwlwifi0: WRT: Overriding region id 28
iwlwifi0: loaded firmware version 46.ff18e32a.0 9000-pu-b0-jf-b0-46.ucode op_mode iwlmvm
iwlwifi0: base HW address: XX:XX:XX:XX:XX:XX, OTP minor version: 0x4
Comment 10 Ruslan Makhmatkhanov freebsd_committer freebsd_triage 2024-12-09 09:25:39 UTC
(In reply to Bjoern A. Zeeb from comment #8)

works here too! Thank you.

Intel(R) Wireless WiFi based driver for FreeBSD
iwlwifi0: <iwlwifi> mem 0xef738000-0xef73bfff at device 20.3 on pci0
iwlwifi0: Detected crf-id 0x3617, cnv-id 0x20000302 wfpm id 0x80000000
iwlwifi0: PCI dev 02f0/0070, rev=0x351, rfid=0x10a100
iwlwifi0: Detected Intel(R) Wi-Fi 6 AX201 160MHz
iwlwifi0: successfully loaded firmware image 'iwlwifi-QuZ-a0-hr-b0-77.ucode'
iwlwifi0: TLV_FW_FSEQ_VERSION: FSEQ Version: 89.3.35.37
iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
iwl-debug-yoyo.bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo.bin either
iwl-debug-yoyo_bin: could not load binary firmware /boot/firmware/iwl-debug-yoyo_bin either
iwl_debug_yoyo_bin: could not load binary firmware /boot/firmware/iwl_debug_yoyo_bin either
iwlwifi0: loaded firmware version 77.2df8986f.0 QuZ-a0-hr-b0-77.ucode op_mode iwlmvm
iwl_req_fw_callback: module 'iwlmvm' not yet available; will beinitialized in a moment
iwlwifi0: Detected RF HR B5, rfid=0x10a100
iwlwifi0: base HW address: 6c:94:66:d7:e1:cd
Comment 11 commit-hook freebsd_committer freebsd_triage 2024-12-09 14:48:01 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=87e140a5c6f89eea7ea6320d1ae34566492abfc0

commit 87e140a5c6f89eea7ea6320d1ae34566492abfc0
Author:     Bjoern A. Zeeb <bz@FreeBSD.org>
AuthorDate: 2024-12-08 20:24:10 +0000
Commit:     Bjoern A. Zeeb <bz@FreeBSD.org>
CommitDate: 2024-12-09 14:45:24 +0000

    iwlwifi: avoid (hard) hang on loading module

    For certain users or chipsets (reports were for CNVi devices but
    we are not sure if this is limited or specific to them) loading
    if_iwlwifi hangs.

    The reason for this is that a SYSINIT (module_load_order()) has not
    yet run in this case and the Linux driver tries to load the
    chipsets-specific module.  On FreeBSD all supported sub-modules are
    part of if_iwlwifi so we do not have to load them separately but
    calling into kern_kldload via LinuxKPI request_module while loading
    the module gives us a hard hang.

    iwlwifi calls request_module_nowait() so we can simply skip over this
    and continue and the SYSINIT will do the job later if no other
    dependencies fail.

    Sponsored by:   The FreeBSD Foundation
    MFC after:      3 days
    PR:             282789
    Tested by:      Ruslan Makhmatkhanov, Pete Wright
    Differential Revision: https://reviews.freebsd.org/D47994

 sys/contrib/dev/iwlwifi/iwl-drv.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
Comment 12 Mark Johnston freebsd_committer freebsd_triage 2025-01-17 18:09:19 UTC
(In reply to commit-hook from comment #11)
Bjoern, is there any reason not to MFC this change?  The timeout expired a while ago.
Comment 13 Bjoern A. Zeeb freebsd_committer freebsd_triage 2025-01-19 21:31:48 UTC
(In reply to Mark Johnston from comment #12)

I also wanted to MFC the entire driver update first, as stable/14 is behind main, but not while on vacation.