Bug 252608 - [panic] [usb] [if_ure] Kernel fault on ure disconnect
Summary: [panic] [usb] [if_ure] Kernel fault on ure disconnect
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: usb (show other bugs)
Version: CURRENT
Hardware: amd64 Any
: --- Affects Only Me
Assignee: freebsd-usb (Nobody)
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2021-01-12 12:20 UTC by Ali Abdallah
Modified: 2021-01-13 21:14 UTC (History)
5 users (show)

See Also:


Attachments
Fix race (3.37 KB, patch)
2021-01-12 13:10 UTC, Hans Petter Selasky
no flags Details | Diff
Patch for second issue - please test (3.29 KB, patch)
2021-01-12 18:03 UTC, Hans Petter Selasky
no flags Details | Diff
Patch for second issue - please test (4.89 KB, patch)
2021-01-12 19:47 UTC, Hans Petter Selasky
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ali Abdallah 2021-01-12 12:20:58 UTC
My USB-C dock with Realtek NIC losts power for a second, but that was enough to panic the entire system...

Jan 10 11:01:11 Fryzen495 kernel: ugen1.2: <VIA Labs, Inc. USB2.0 Hub> at usbus1 (disconnected)
Jan 10 11:01:11 Fryzen495 kernel: uhub3: at uhub1, port 1, addr 1 (disconnected)
Jan 10 11:01:11 Fryzen495 kernel: ugen1.3: <Realtek USB-C Dock Ethernet> at usbus1 (disconnected)
Jan 10 11:01:11 Fryzen495 kernel: ure0: at uhub3, port 1, addr 2 (disconnected)
Jan 10 11:02:58 Fryzen495 syslogd: kernel boot file is /boot/kernel/kernel
Jan 10 11:02:58 Fryzen495 kernel: panic: vm_fault_lookup: fault on nofault entry, addr: 0xfffffe00fa8a1000
Jan 10 11:02:58 Fryzen495 kernel: cpuid = 3
Jan 10 11:02:58 Fryzen495 kernel: time = 1610272871
Jan 10 11:02:58 Fryzen495 kernel: KDB: stack backtrace:
Jan 10 11:02:58 Fryzen495 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b4b635a0
Jan 10 11:02:58 Fryzen495 kernel: vpanic() at vpanic+0x181/frame 0xfffffe00b4b635f0
Jan 10 11:02:58 Fryzen495 kernel: panic() at panic+0x43/frame 0xfffffe00b4b63650
Jan 10 11:02:58 Fryzen495 kernel: vm_fault() at vm_fault+0x1331/frame 0xfffffe00b4b63750
Jan 10 11:02:58 Fryzen495 kernel: vm_fault_trap() at vm_fault_trap+0xb1/frame 0xfffffe00b4b637a0
Jan 10 11:02:58 Fryzen495 kernel: trap_pfault() at trap_pfault+0x1f6/frame 0xfffffe00b4b63800
Jan 10 11:02:58 Fryzen495 kernel: trap() at trap+0x27d/frame 0xfffffe00b4b63910
Jan 10 11:02:58 Fryzen495 kernel: calltrap() at calltrap+0x8/frame 0xfffffe00b4b63910
Jan 10 11:02:58 Fryzen495 kernel: --- trap 0xc, rip = 0xffffffff80686ecd, rsp = 0xfffffe00b4b639e0, rbp = 0xfffffe00b4b639e0 ---
Jan 10 11:02:58 Fryzen495 kernel: memcpy_std() at memcpy_std+0x9d/frame 0xfffffe00b4b639e0
Jan 10 11:02:58 Fryzen495 kernel: usbd_copy_in() at usbd_copy_in+0x4d/frame 0xfffffe00b4b63a20
Jan 10 11:02:58 Fryzen495 kernel: ure_bulk_write_callback() at ure_bulk_write_callback+0x43f/frame 0xfffffe00b4b63ad0
Jan 10 11:02:58 Fryzen495 kernel: usbd_callback_wrapper() at usbd_callback_wrapper+0x6df/frame 0xfffffe00b4b63b30
Jan 10 11:02:58 Fryzen495 kernel: usb_command_wrapper() at usb_command_wrapper+0xb5/frame 0xfffffe00b4b63b50
Jan 10 11:02:58 Fryzen495 kernel: usb_callback_proc() at usb_callback_proc+0xb9/frame 0xfffffe00b4b63b70
Jan 10 11:02:58 Fryzen495 kernel: usb_process() at usb_process+0x106/frame 0xfffffe00b4b63bb0
Jan 10 11:02:58 Fryzen495 kernel: fork_exit() at fork_exit+0x7d/frame 0xfffffe00b4b63bf0
Jan 10 11:02:58 Fryzen495 kernel: fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00b4b63bf0
Jan 10 11:02:58 Fryzen495 kernel: --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Jan 10 11:02:58 Fryzen495 kernel: KDB: enter: panic

As you can see, the usb process is still running can called ure_bulk_write_callback, but the ure0 device was detached previously. I got a crash dump, please let me know if you need further data.
Comment 1 Hans Petter Selasky freebsd_committer 2021-01-12 12:55:51 UTC
There is likely a missing drain of the USB transfers.
Comment 2 Ali Abdallah 2021-01-12 13:00:12 UTC
Do you think there is an easy way to avoid the panic?
Comment 3 Hans Petter Selasky freebsd_committer 2021-01-12 13:04:10 UTC
I suspect "sc->sc_txavail" is not properly managed at shutdown and that freed USB transfers are kicked alive again.

Adding jmg@

Likely it is just more safe to kick all the transfers when there is a packet for TX and not have some many USB TX transfers.

--HPS
Comment 4 Hans Petter Selasky freebsd_committer 2021-01-12 13:09:55 UTC
> Do you think there is an easy way to avoid the panic?

Yes,

Try the attached patch.
Comment 5 Hans Petter Selasky freebsd_committer 2021-01-12 13:10:20 UTC
Created attachment 221491 [details]
Fix race
Comment 6 Ali Abdallah 2021-01-12 13:30:24 UTC
(In reply to Hans Petter Selasky from comment #5)
Thanks very much for the patch, will give it a try and report back.
Comment 7 Ali Abdallah 2021-01-12 17:17:21 UTC
It crashes as well with the patch from comment 5

--- trap 0xc, rip = 0xffffffff803b28f1, rsp = 0xfffffe00c83604c0, rbp = 0xfffffe00c8360540 ---                                                                                                                      
__mtx_lock_sleep() at __mtx_lock_sleep+0xd1/frame 0xfffffe00c8360540                                                                                                                                                
usbd_do_request_flags() at usbd_do_request_flags+0x740/frame 0xfffffe00c83605d0                                                                                                                                     
usbd_do_request_proc() at usbd_do_request_proc+0x60/frame 0xfffffe00c8360630                                                                                                                                        
ure_miibus_readreg() at ure_miibus_readreg+0x11c/frame 0xfffffe00c83606a0                                                                                                                                           
rgephy_status() at rgephy_status+0x80/frame 0xfffffe00c83606f0                                                                                                                                                      
rgephy_service() at rgephy_service+0x417/frame 0xfffffe00c8360750                                                                                                                                                   
mii_pollstat() at mii_pollstat+0x46/frame 0xfffffe00c8360780                                                                                                                                                        
ure_ifmedia_sts() at ure_ifmedia_sts+0x41/frame 0xfffffe00c83607b0                                                                                                                                                  
ifmedia_ioctl() at ifmedia_ioctl+0x16a/frame 0xfffffe00c83607e0                                                                                                                                                     
ifhwioctl() at ifhwioctl+0x2af/frame 0xfffffe00c83608e0                                                                                                                                                             
ifioctl() at ifioctl+0x2eb/frame 0xfffffe00c8360990                                                                                                                                                                 
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe00c8360a00                                       
sys_ioctl() at sys_ioctl+0xf2/frame 0xfffffe00c8360ac0

Use after free of the if_ure mutex.
Comment 8 Hans Petter Selasky freebsd_committer 2021-01-12 17:20:49 UTC
Ali,

This is another issue, and is expected. Can we solve that separately?

Do you see any more TX-path panics?

--HPS
Comment 9 Ali Abdallah 2021-01-12 17:26:23 UTC
> This is another issue, and is expected. Can we solve that separately?

Yes absolutely, thanks.

> Do you see any more TX-path panics?

Not anymore (Will report here if I get another tx-path panic).
Comment 10 Hans Petter Selasky freebsd_committer 2021-01-12 17:30:46 UTC
@bz: ether_ifdetach() nor if_free() is draining if_ioctl. We should have a solution in place for all kernel drivers!

--HPS
Comment 11 commit-hook freebsd_committer 2021-01-12 17:34:41 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6e5baec33c1032f4fbf713d601a34b4658b918a2

commit 6e5baec33c1032f4fbf713d601a34b4658b918a2
Author:     Hans Petter Selasky <hselasky@FreeBSD.org>
AuthorDate: 2021-01-12 13:13:14 +0000
Commit:     Hans Petter Selasky <hselasky@FreeBSD.org>
CommitDate: 2021-01-12 16:57:58 +0000

    Fix for use-after-free in if_ure(4) driver.

    When detaching the if_ure(4) driver, the TX active USB transfer array may
    point to freed USB transfers. Given that the number of USB transfers is
    very low, simply start all transfers every time there is a packet to
    keep safe from use-after-free.

    PR: 252608
    MFC after: 1 week
    Sponsored by: Mellanox Technologies // NVIDIA Networking

 sys/dev/usb/net/if_ure.c    | 49 ++++-----------------------------------------
 sys/dev/usb/net/if_urereg.h |  9 ---------
 2 files changed, 4 insertions(+), 54 deletions(-)
Comment 12 Hans Petter Selasky freebsd_committer 2021-01-12 18:03:35 UTC
Created attachment 221501 [details]
Patch for second issue - please test

Ali,

Please test this patch!

--HPS
Comment 13 Ali Abdallah 2021-01-12 18:49:59 UTC
(In reply to Hans Petter Selasky from comment #12)
Thanks, really appreciated your quick responses. 

With the patch from comment 12, it is still panicking with the following trace

--- trap 0xc, rip = 0xffffffff80e85515, rsp = 0xfffffe00eaf936b0, rbp = 0xfffffe00eaf936f0 ---
rgephy_status() at rgephy_status+0x95/frame 0xfffffe00eaf936f0
rgephy_service() at rgephy_service+0x417/frame 0xfffffe00eaf93750
mii_pollstat() at mii_pollstat+0x46/frame 0xfffffe00eaf93780
ure_ifmedia_sts() at ure_ifmedia_sts+0x44/frame 0xfffffe00eaf937b0
ifmedia_ioctl() at ifmedia_ioctl+0x16a/frame 0xfffffe00eaf937e0
ifhwioctl() at ifhwioctl+0x2af/frame 0xfffffe00eaf938e0
ifioctl() at ifioctl+0x2eb/frame 0xfffffe00eaf93990
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe00eaf93a00
sys_ioctl() at sys_ioctl+0xf2/frame 0xfffffe00eaf93ac0
Comment 14 Hans Petter Selasky freebsd_committer 2021-01-12 19:47:33 UTC
Created attachment 221506 [details]
Patch for second issue - please test

Please test updated patch.
Comment 15 Ali Abdallah 2021-01-13 07:06:27 UTC
(In reply to Hans Petter Selasky from comment #14)
I can confirm that my system is no longer panicking when the dock is unplugged. Thanks!!!