My WiFi USB dongle, using the ralink driver, works reproducibly and reliably in hostap mode in 10.3, but never works in CURRENT, giving an error of "run0: could not load 8051 microcode" when hostapd starts, and becoming invisible to WiFi clients. I've confirmed it works well in GitHub commit 74ee552c5dacc20b6dde64cbb8a44e8c8ce975d0 (from around the 10.3 release), and have begun bisecting for the regression. In CURRENT it does work in authmode "open", but not with WPA2. The microcode loads the first time, but doesn't load the second time when hostapd starts. It loads successfully both times in 74ee552c5dacc20b6dde64cbb8a44e8c8ce975d0. How I test it: # ifconfig wlan0 create wlandev run0 wlanmode hostap # ifconfig wlan0 inet 192.168.0.1 netmask 255.255.255.0 ssid home # service hostapd start /etc/rc.conf: hostapd_enable="YES" /etc/hostapd.conf: interface=wlan0 debug=1 ctrl_interface=/var/run/hostapd ctrl_interface_group=wheel ssid=home wpa=2 wpa_passphrase=abc123 wpa_key_mgmt=WPA-PSK wpa_pairwise=CCMP My USB dongle: # usbconfig dump_device_desc ugen0.5: <Ralink 802.11 n WLAN> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (450mA) bLength = 0x0012 bDescriptorType = 0x0001 bcdUSB = 0x0200 bDeviceClass = 0x0000 <Probed by interface class> bDeviceSubClass = 0x0000 bDeviceProtocol = 0x0000 bMaxPacketSize0 = 0x0040 idVendor = 0x148f idProduct = 0x3070 bcdDevice = 0x0101 iManufacturer = 0x0001 <Ralink> iProduct = 0x0002 <802.11 n WLAN> iSerialNumber = 0x0003 <1.0> bNumConfigurations = 0x0001
I bisected this regression to SVN commit 304629, which is this commit in GitHub: b33c688c747fa7cc20a9f21b3e6f4e137647a4cc is the first bad commit commit b33c688c747fa7cc20a9f21b3e6f4e137647a4cc Author: hselasky <hselasky@FreeBSD.org> Date: Mon Aug 22 19:32:50 2016 +0000 Don't separate the status stage of the XHCI USB control transfers into its own job because this breaks the simplified QEMU XHCI TRB parser, which expects the complete USB control transfer as a series of back to back TRBs. The old behaviour is kept under #ifdef in case this change breaks enumeration of any USB devices. PR: 212021 MFC after: 1 week :040000 040000 8fc5db306b826d39d5b25322b5f8f4aae71cd450 05ffa3294c4cbbfb29a04bcf29a53f305313a937 M sys Upgrading importance and adding author to CC.
Undoing the bad commit (304629) on top of CURRENT gets my WiFi dongle working again. Can someone please find a permanent fix for this bug though? It must affect other USB devices. It also affects the 11.1 release, if not more of them.
Hi, Can you collect "usbdump -i usbusX -f Y -w log.pcap" traces when plugging your device in both cases, w/ and w/o reverting r304629 ? Plug your device one time, and record X.Y numbers after ugen. Unplug your device and start usbdump as shown above. Then plug device again, and save the log after a few seconds. Upload binary logs to this PR. --HPS
Beware that this might be a bug in your USB device, that it only works with EHCI and not XHCI.
Created attachment 192839 [details] usbdump log with r304629 Thank you so much. Here is the usbdump taken at commit r304629 itself, where I got that "could not load microcode" error.
Created attachment 192840 [details] usbdump log without r304629 Here is the usbdump taken on CURRENT with commit r304629 reverted, where hostapd is working and my phone successfully connected. I think only the 2 blue USB ports on the back of my PC are USB 3. The dongle was only tested with a black USB port on the front.
Hi, In the log-with-r304629.pcap I clearly see ERR=IOERROR which is not present in the other log file. I suspect that your XHCI controller needs a quirk, I.E. the old behaviour. If I were to guess, the problem is that your XHCI controller executes the status stage instantly after the so-called USB data stage, because the patch make the DMA jobs back to back. The ralink dongle doesn't handle this and IOERROR usually means no response. Typically this patch makes the status stage execute after the next SOF. This should be handled automagically by the XHCI controller. Can you post the output of "pciconf -lv" for your XHCI PCI device? I guess we can make the #ifdef into a system tunable, and then you can set it in /boot/loader.conf ? --HPS
(In reply to Hans Petter Selasky from comment #7) Thank you for your help. Here is my pciconf -lv: hostb0@pci0:0:0:0: class=0x060000 card=0x50001458 chip=0x0c008086 rev=0x06 hdr=0x00 vendor = 'Intel Corporation' device = '4th Gen Core Processor DRAM Controller' class = bridge subclass = HOST-PCI pcib1@pci0:0:1:0: class=0x060400 card=0x50001458 chip=0x0c018086 rev=0x06 hdr=0x01 vendor = 'Intel Corporation' device = 'Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller' class = bridge subclass = PCI-PCI xhci0@pci0:0:20:0: class=0x0c0330 card=0x50071458 chip=0x8c318086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family USB xHCI' class = serial bus subclass = USB none0@pci0:0:22:0: class=0x078000 card=0x1c3a1458 chip=0x8c3a8086 rev=0x04 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family MEI Controller' class = simple comms ehci0@pci0:0:26:0: class=0x0c0320 card=0x50061458 chip=0x8c2d8086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family USB EHCI' class = serial bus subclass = USB hdac1@pci0:0:27:0: class=0x040300 card=0xa0021458 chip=0x8c208086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset High Definition Audio Controller' class = multimedia subclass = HDA pcib2@pci0:0:28:0: class=0x060400 card=0x50011458 chip=0x8c108086 rev=0xd5 hdr=0x01 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family PCI Express Root Port' class = bridge subclass = PCI-PCI pcib3@pci0:0:28:2: class=0x060400 card=0x50011458 chip=0x8c148086 rev=0xd5 hdr=0x01 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family PCI Express Root Port' class = bridge subclass = PCI-PCI pcib4@pci0:0:28:3: class=0x060401 card=0x50011458 chip=0x244e8086 rev=0xd5 hdr=0x01 vendor = 'Intel Corporation' device = '82801 PCI Bridge' class = bridge subclass = PCI-PCI ehci1@pci0:0:29:0: class=0x0c0320 card=0x50061458 chip=0x8c268086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family USB EHCI' class = serial bus subclass = USB isab0@pci0:0:31:0: class=0x060100 card=0x50011458 chip=0x8c508086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = 'B85 Express LPC Controller' class = bridge subclass = PCI-ISA ahci0@pci0:0:31:2: class=0x010601 card=0xb0051458 chip=0x8c028086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]' class = mass storage subclass = SATA none1@pci0:0:31:3: class=0x0c0500 card=0x50011458 chip=0x8c228086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' device = '8 Series/C220 Series Chipset Family SMBus Controller' class = serial bus subclass = SMBus vgapci0@pci0:1:0:0: class=0x030000 card=0x00000000 chip=0x0f0110de rev=0xa1 hdr=0x00 vendor = 'NVIDIA Corporation' device = 'GF108 [GeForce GT 620]' class = display subclass = VGA hdac0@pci0:1:0:1: class=0x040300 card=0x00000000 chip=0x0e0810de rev=0xa1 hdr=0x00 vendor = 'NVIDIA Corporation' device = 'GF119 HDMI Audio Controller' class = multimedia subclass = HDA re0@pci0:3:0:0: class=0x020000 card=0xe0001458 chip=0x816810ec rev=0x06 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller' class = network subclass = ethernet pcib5@pci0:4:0:0: class=0x060401 card=0x88921458 chip=0x244e8086 rev=0x41 hdr=0x01 vendor = 'Intel Corporation' device = '82801 PCI Bridge' class = bridge subclass = PCI-PCI
Apparently r304629 was added because qemu needed these back to back DMA jobs for its emulated XHCI controller (in bug 212021). But qemu is cross platform, just like my USB dongle. How does Windows get both of them to work? How does Linux? Other BSDs? Do they all use these back to back DMA jobs, or does only FreeBSD use them? Also I can - very rarely - get the WiFi dongle to work even with r304629 applied. It only works 19.67% of the time. Could there be some kind of timing issue or race condition, that is being exacerbated by r304629? Let me try get a usbdump of one of these rare cases. I think it should be very revealing.
Created attachment 192894 [details] usbdump log of the rare case when it works with r304629 Here is the usbdump of the rare case when it fully works at commit r304629 itself. Does it reveal anything useful, compared to the broken case in attachment 192839 [details]?
> How does Linux? Linux is more good at adding quirks when needed. This does not only affect QEMU, but also VMWARE and all kinds of hypervisors. --HPS
Hi, The XHCI controller you have already has some quirks in FreeBSD, if you look at sys/dev/usb/controller/xhci_pci.c: case 0x8c318086: /* Lynx Point */ I'll add another one if I find some time later today for this issue. --HPS
A commit references this bug: Author: hselasky Date: Mon Apr 30 07:30:38 UTC 2018 New revision: 333100 URL: https://svnweb.freebsd.org/changeset/base/333100 Log: Improve fix in r304629 by allowing configuration of the behaviour through a SYSCTL instead of a compile time define. Add quirk by default for all LynxPoint XHCI controllers. PR: 227602 MFC after: 3 days Sponsored by: Mellanox Technologies Changes: head/sys/dev/usb/controller/xhci.c head/sys/dev/usb/controller/xhci.h head/sys/dev/usb/controller/xhci_pci.c
Please try the patch submitted to head. Let me know if it doesn't work.
A commit references this bug: Author: hselasky Date: Thu May 3 07:29:08 UTC 2018 New revision: 333199 URL: https://svnweb.freebsd.org/changeset/base/333199 Log: MFC r333100: Improve fix in r304629 by allowing configuration of the behaviour through a SYSCTL instead of a compile time define. Add quirk by default for all LynxPoint XHCI controllers. PR: 227602 Sponsored by: Mellanox Technologies Changes: _U stable/11/ stable/11/sys/dev/usb/controller/xhci.c stable/11/sys/dev/usb/controller/xhci.h stable/11/sys/dev/usb/controller/xhci_pci.c
A commit references this bug: Author: hselasky Date: Thu May 3 07:38:46 UTC 2018 New revision: 333203 URL: https://svnweb.freebsd.org/changeset/base/333203 Log: MFC r333100: Improve fix in r304629 by allowing configuration of the behaviour through a SYSCTL instead of a compile time define. Add quirk by default for all LynxPoint XHCI controllers. PR: 227602 Sponsored by: Mellanox Technologies Changes: _U stable/10/ stable/10/sys/dev/usb/controller/xhci.c stable/10/sys/dev/usb/controller/xhci.h stable/10/sys/dev/usb/controller/xhci_pci.c
It works with that commit. Thank you so much! .-'"""""'-. .' `. / . . \ : : | | : \ / : \ `.____.' / `. .' `-._____.-'