Created attachment 230290 [details] main-n251848-3e01ee76f20 Panic The protection for probe/attach/detach appears to be the 'Giant' mutex lock. This is insufficient to protect against multiple thread attaching/detaching at the same time. Some attach/detach actions require "sleeps", this releases the 'Giant' lock, allowing other threads to also attach/detach. There are no checks for `DS_ATTACHING` state in device_attach() and there is currently no `DS_DETACHING` state to check in device_detach(). Steps to Reproduce: This is easily reproduced with devices that "sleep" during attach/detach, as Giant lock is dropped in these cases. https://www.freebsd.org/cgi/man.cgi?locking(9). "Giant is dropped during unbounded sleeps and reacquired after wakeup." igb devices enter e1000_get_cfg_done_i210()->safe_pause_sbt()->pause_sbt() during detach, thus releasing the Giant lock. As show in the attached backtrace. root@FBSDCURRENT:/ # devinfo -v | grep igb0 igb0 pnpinfo vendor=0x8086 device=0x1533 subvendor=0x8086 subdevice=0x0002 class=0x020000 at slot=0 function=0 dbsf=pci0:6:0:0 handle=\_SB_.PCI0.RP05.PXSX root@FBSDCURRENT:/ # Terminal #1: # sh root@FBSDCURRENT:/ # while [ true ] ; do devctl attach pci0:6:0:0;devctl detach pci0:6:0:0;done Terminal #2: # sh root@FBSDCURRENT:/ # while [ true ] ; do devctl attach pci0:6:0:0;devctl detach pci0:6:0:0;done Actual Results: This causes an immediate panic when Terminal #2 starts loop. A backtrace of the 2 threads both running the devctl's devctl2_ioctl() is attached (locally build kernel). `vmcore` file from "13.0-RELEASE" can also be provided. Expected Results: The kernel should not panic/crash. First thread should complete. The second thread should return an error (or wait until the other thread is complete). Build Date & Hardware: Locally compiled with git updated Dec 21 2021: FreeBSD FBSDCURRENT 14.0-CURRENT FreeBSD 14.0-CURRENT #3 main-n251848-3e01ee76f20: Tue Dec 21 00:33:45 EST 2021 root@FBSDCURRENT:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64 Also reproducible on 13.0-RELEASE (Downloaded/Updated): FreeBSD TrafficHammerTwoHanded 13.0-RELEASE FreeBSD 13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f: Fri Apr 9 04:24:09 UTC 2021 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Some recent work [1] with Giant replacement in subr_bus and portentially relatged work with respect to [2], which mentions "patch works around the real issue, which is a locking deficiency of the busses. Request feedback from those committers accordingly. [1] https://reviews.freebsd.org/D31833 [2] https://reviews.freebsd.org/D19375 ^Triage: Set version to earliest supported and confirmed reproduction.
This is a know bug since FreeBSD 4.0. devctl just makes it a lot easier to tickle. As you note, the only real way to fix it is to move away from Giant, but even that exposes a few holes w/o additional work.
Is there a time frame for a fix?
The general fix is not simple. Simple hacks might limit the blast radius.. loops with devctl won't work for some time because it does unnatural things that normal drivers never do. I'll be concentrating on the typical scenarios first...
^Triage: clear unneeded flags. Nothing has yet been committed to be merged.