Bug 206801 - iwn(4) page fault on netif restart
Summary: iwn(4) page fault on netif restart
Status: Closed DUPLICATE of bug 195433
Alias: None
Product: Base System
Classification: Unclassified
Component: wireless
Version: CURRENT
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-wireless (Nobody)
URL:
Keywords: crash, needs-qa
Depends on:
Blocks:
 
Reported: 2016-01-31 16:53 UTC by Devin Teske
Modified: 2019-02-02 19:51 UTC

See Also:
avos: mfc-stable10-


Description Devin Teske freebsd_committer freebsd_triage 2016-01-31 16:53:33 UTC
Woke up this morning to find that the network wasn't working. A quick "service netif restart" hung while trying to bring down wpa_supplicant. A "kill -9" of wpa_supplicant had no effect. The ppid of this wpa_supplicant was 1. Eventually the system hit a page fault.

I was able to extract a fair bit of precious information.


$ uname -a
FreeBSD lent.shxd.cx 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r293338: Wed Jan 13 21:26:36 PST 2016     root@lent.shxd.cx:/usr/obj/home/dteske/src/freebsd_svn/base/head/sys/GENERIC  amd64

$ ident -q kern/subr_firmware.c dev/iwn/if_iwn.c 
kern/subr_firmware.c:
     $FreeBSD: head/sys/kern/subr_firmware.c 285391 2015-07-11 16:22:48Z mjg $
dev/iwn/if_iwn.c:
     $FreeBSD: head/sys/dev/iwn/if_iwn.c 293716 2016-01-12 00:24:40Z avos $

db> show msgbuf
msgbufp = 0xfffff8023bffffb8
magic = 63062, size = 98232, r= 97305, ptr = 0xfffff8023bfe8000, cksum= 7937404
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex firmware table (firmware table) r = 0 (0xffffffff81ba8c80) locked @ /home/dteske/src/freebsd_svn/base/head/sys/kern/subr_firmware.c:367
exclusive sleep mutex iwn0 (network driver) r = 0 (0xfffffe0000ed4018) locked @ /home/dteske/src/freebsd_svn/base/head/sys/dev/iwn/if_iwn.c:8197
stack backtrace:
#0 0xffffffff80a79b10 at witness_debugger+0x70
#1 0xffffffff80a7ae27 at witness_warn+0x3d7
#2 0xffffffff80e6a9e7 at trap_pfault+0x57
#3 0xffffffff80e6a2bf at trap+0x4bf
#4 0xffffffff80e4a1d7 at calltrap+0x8
#5 0xffffffff805b56d7 at iwn_init_locked+0x567
#6 0xffffffff805ad93b at iwn_radio_on+0x3b
#7 0xffffffff80a6d340 at taskqueue_run_locked+0xf0
#8 0xffffffff80a6de68 at taskqueue_thread_loop+0x88
#9 0xffffffff809e5c14 at fork_exit+0x84
#10 0xffffffff80e4a70e at fork_trampoline+0xe


Fatal trap 12: page fault while in kernel mode 
cpuid = 3; apic id = 05
fault virtual address   = 0xffffffffffffffe0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff80a5aee7
stack pointer           = 0x28:0xfffffe022a544aa0
frame pointer           = 0x28:0xfffffe022a544ac0
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (iwn0 net80211 taskq)
255217) 20151225

db> bt
Tracing pid 0 tid 100039 td 0xfffff8000482b000
firmware_put() at firmware_put+0x27/frame 0xfffffe022a544ac0
iwn_init_locked() at iwn_init_locked+0x567/frame 0xfffffe022a544af0
iwn_radio_on() at iwn_radio_on+0x3b/frame 0xfffffe022a544b20
taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xfffffe022a544b80
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe022a544bb0
fork_exit() at fork_exit+0x84/frame 0xfffffe022a544bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe022a544bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Comment 1 Andriy Voskoboinyk freebsd_committer freebsd_triage 2019-02-02 17:09:40 UTC
Fixed in base r314234 - iwn_init() was called multiple times and, due to a race during firmware upload, firmware_put() was called on the same pointer several times.

*** This bug has been marked as a duplicate of bug 195433 ***
Comment 2 Andriy Voskoboinyk freebsd_committer freebsd_triage 2019-02-02 19:51:55 UTC
A bit of clarification: it wasn't a double free; NULL was passed to firmware_put() on the second (parallel) iwn_init_locked() run.
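
For anyone reading the trace later: the "fault virtual address = 0xffffffffffffffe0" in the description is consistent with a NULL argument. It is the value a NULL pointer turns into when 0x20 is subtracted from it, which is what a container-of style lookup from the public firmware image to its private bookkeeping would do. Below is a minimal userland sketch of that arithmetic and of a NULL-guarded release. It is not the change made in r314234 and not the real subr_firmware.c code: struct fw_image, struct fw_priv, fw_priv_addr() and fw_put_checked() are hypothetical stand-ins, and the 0x20 offset is an assumption chosen to match the fault address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical public part of a firmware image (what a driver holds). */
struct fw_image {
	const char *name;
};

/* Hypothetical private wrapper; layout is assumed, not the kernel's. */
struct fw_priv {
	uint64_t refcnt;        /* reference count, offset 0 */
	uint64_t reserved[3];   /* filler so that fw lands at offset 0x20 */
	struct fw_image fw;     /* public part handed out to callers */
};

static_assert(offsetof(struct fw_priv, fw) == 0x20,
    "sketch assumes the public part sits 0x20 bytes into the wrapper");

/* Container-of style lookup: public pointer -> address of the wrapper. */
uintptr_t
fw_priv_addr(const struct fw_image *p)
{
	/* With p == NULL this wraps around to (uintptr_t)-0x20. */
	return ((uintptr_t)p - offsetof(struct fw_priv, fw));
}

/*
 * Release the image held in *slot, at most once.
 * Returns 1 if a reference was dropped, 0 if the slot was already empty.
 * The caller is assumed to hold whatever lock protects *slot.
 */
int
fw_put_checked(struct fw_image **slot)
{
	struct fw_image *p = *slot;
	struct fw_priv *priv;

	if (p == NULL)
		return (0);     /* second caller: nothing left to release */

	priv = (struct fw_priv *)fw_priv_addr(p);
	priv->refcnt--;
	*slot = NULL;           /* make the release idempotent */
	return (1);
}
```

On amd64, fw_priv_addr(NULL) evaluates to 0xffffffffffffffe0, so an unguarded write through that address would fault the way the trap frame shows. The guard only helps if both iwn_init_locked() runs check and clear the pointer under the same lock.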