Summary: | [ehci] ehci_interrupt: unrecoverable error, controller halted | ||||||
---|---|---|---|---|---|---|---|
Product: | Base System | Reporter: | Samy Mahmoudi <samy.mahmoudi> | ||||
Component: | usb | Assignee: | freebsd-usb (Nobody) <usb> | ||||
Status: | Closed FIXED | ||||||
Severity: | Affects Only Me | CC: | hselasky | ||||
Priority: | --- | ||||||
Version: | 11.2-RELEASE | ||||||
Hardware: | amd64 | ||||||
OS: | Any | ||||||
Attachments: |
|
Hi,

Did you try setting any of the EHCI quirks in the loader?

    hw.usb.ehci.lostintrbug: 0
    hw.usb.ehci.iaadbug: 0

What does "pciconf -lv" say about your device?

--HPS

Hi,

Thank you for your prompt reply. No, I did not try to set any of these. dmesg showed the ehci controller is identified as "Intel Cougar Point USB 2.0 controller", which is the result of this revision: https://svnweb.freebsd.org/base?view=revision&revision=316412.

Output of "pciconf -lv":

    hostb0@pci0:0:0:0: class=0x060000 card=0x21cf17aa chip=0x01048086 rev=0x09 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '2nd Generation Core Processor Family DRAM Controller'
        class      = bridge
        subclass   = HOST-PCI
    vgapci0@pci0:0:2:0: class=0x030000 card=0x21cf17aa chip=0x01268086 rev=0x09 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '2nd Generation Core Processor Family Integrated Graphics Controller'
        class      = display
        subclass   = VGA
    none0@pci0:0:22:0: class=0x078000 card=0x21cf17aa chip=0x1c3a8086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family MEI Controller'
        class      = simple comms
    em0@pci0:0:25:0: class=0x020000 card=0x21ce17aa chip=0x15028086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '82579LM Gigabit Network Connection (Lewisville)'
        class      = network
        subclass   = ethernet
    none1@pci0:0:26:0: class=0x0c0320 card=0x21cf17aa chip=0x1c2d8086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family USB Enhanced Host Controller'
        class      = serial bus
        subclass   = USB
    hdac0@pci0:0:27:0: class=0x040300 card=0x21cf17aa chip=0x1c208086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family High Definition Audio Controller'
        class      = multimedia
        subclass   = HDA
    pcib1@pci0:0:28:0: class=0x060400 card=0x21cf17aa chip=0x1c108086 rev=0xb4 hdr=0x01
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family PCI Express Root Port 1'
        class      = bridge
        subclass   = PCI-PCI
    pcib2@pci0:0:28:1: class=0x060400 card=0x21cf17aa chip=0x1c128086 rev=0xb4 hdr=0x01
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family PCI Express Root Port 2'
        class      = bridge
        subclass   = PCI-PCI
    pcib3@pci0:0:28:3: class=0x060400 card=0x21cf17aa chip=0x1c168086 rev=0xb4 hdr=0x01
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family PCI Express Root Port 4'
        class      = bridge
        subclass   = PCI-PCI
    pcib4@pci0:0:28:4: class=0x060400 card=0x21cf17aa chip=0x1c188086 rev=0xb4 hdr=0x01
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family PCI Express Root Port 5'
        class      = bridge
        subclass   = PCI-PCI
    ehci0@pci0:0:29:0: class=0x0c0320 card=0x21cf17aa chip=0x1c268086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family USB Enhanced Host Controller'
        class      = serial bus
        subclass   = USB
    isab0@pci0:0:31:0: class=0x060100 card=0x21cf17aa chip=0x1c4f8086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = 'QM67 Express Chipset Family LPC Controller'
        class      = bridge
        subclass   = PCI-ISA
    ahci0@pci0:0:31:2: class=0x010601 card=0x21cf17aa chip=0x1c038086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family 6 port Mobile SATA AHCI Controller'
        class      = mass storage
        subclass   = SATA
    none2@pci0:0:31:3: class=0x0c0500 card=0x21cf17aa chip=0x1c228086 rev=0x04 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = '6 Series/C200 Series Chipset Family SMBus Controller'
        class      = serial bus
        subclass   = SMBus
    iwn0@pci0:3:0:0: class=0x028000 card=0x13118086 chip=0x00858086 rev=0x34 hdr=0x00
        vendor     = 'Intel Corporation'
        device     = 'Centrino Advanced-N 6205 [Taylor Peak]'
        class      = network
    sdhci_pci0@pci0:13:0:0: class=0x088001 card=0x21cf17aa chip=0xe8221180 rev=0x08 hdr=0x00
        vendor     = 'Ricoh Co Ltd'
        device     = 'MMC/SD Host Controller'
        class      = base peripheral
    none3@pci0:13:0:3: class=0x0c0010 card=0x21cf17aa chip=0xe8321180 rev=0x04 hdr=0x00
        vendor     = 'Ricoh Co Ltd'
        device     = 'R5C832 PCIe IEEE 1394 Controller'
        class      = serial bus
        subclass   = FireWire

N.B. I just built the generic kernel from the latest source tree (releng/11.2) and the problem disappeared. To investigate further, as I was not satisfied with that explanation, I ran freebsd-update fetch and freebsd-update install, which (unexpectedly) gave rise to a kernel reinstall. After rebooting, the problem came back, so I can confirm the problem is present with the distributed generic kernel and absent with a home-made kernel.

I doubt my /etc/make.conf has anything to do with this:

    CPUTYPE?=sandybridge
    MAKE_JOBS_NUMBER=4
    OPTIONS_UNSET+=DOCS EXAMPLES IPV6 LPR
    OPTIONS_SET+=CUPS
    CUPS_OVERWRITE_BASE=YES
    DEVELOPER=YES

I have set hw.usb.ehci.lostintrbug and hw.usb.ehci.iaadbug to 0 and it seems to solve the problem. I will now try to isolate which one is relevant to the issue, if not both.

It might be that a quirk has already been added for your device in 11-stable, or that the issue was found and fixed. Is it a problem to run an 11-stable kernel?

--HPS

It is absolutely not a problem to run an 11-STABLE kernel (especially because I use ZFS with a beadm-compatible layout), nor is it a problem to build the generic kernel on my own. The problem is this kernel trap occurring after an upgrade to 11.2-RELEASE.

Unfortunately, I cannot reproduce the problem right now, as I have not made a backup of the distributed generic kernel. Moreover, I cannot confirm what I wrote in comment 3. I will try to reproduce the upgrade with a rollback as soon as possible.

Could you please elaborate on your hypothesis?

I have been able to reproduce the issue, even with my home-built generic kernel and/or the tunables hw.usb.ehci.(lostintrbug|iaadbug) set to 0. At least, it now makes more sense. I even encountered crashes. I will try to obtain a crash dump as soon as possible.

I did not obtain a crash dump since the relevant machine does not have a regular swap partition on its drive. When I partitioned this drive, I did not think I could get involved in any form of kernel debugging...

Using a zvol as a swap device would have been useless for debugging, so I wrote something like 'dumpdev="/dev/gpt/usbswap"' to /etc/rc.conf. Then I thought it would have been too hazardous to dump the crash to a USB swap device, as the crash was precisely related to USB, so I dropped that idea.

Anyway, thank you for your help, Hans Petter. As you said, this bug has probably been fixed since then.
|
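The "pciconf -lv" listing in this thread contains two EHCI functions (class=0x0c0320): one at pci0:0:26:0, shown as none1, and one at pci0:0:29:0, shown as ehci0. A minimal sketch for pulling those entries out of a long listing follows. It assumes that a "noneN" name means no driver is currently attached; `list_ehci` is a hypothetical helper name, not a FreeBSD tool.

```sh
#!/bin/sh
# list_ehci: hypothetical helper, not part of FreeBSD.
# Reads `pciconf -lv` output on stdin and prints one line per EHCI function:
#   <selector> <driver+unit> <attached|unattached>
list_ehci() {
    awk '
        # 0x0c0320 = serial bus (0x0c), USB (0x03), EHCI programming interface (0x20)
        /class=0x0c0320/ {
            split($1, id, "@")        # id[1] is driver+unit, id[2] is the selector
            sub(/:$/, "", id[2])      # drop the trailing colon of the selector
            # Assumption: a name of the form noneN means no driver is attached
            state = (id[1] ~ /^none[0-9]+$/) ? "unattached" : "attached"
            print id[2], id[1], state
        }
    '
}
```

On the affected machine this could be run as `pciconf -lv | list_ehci`. For the listing above it would report pci0:0:26:0 as unattached, which is consistent with the failed attach at device 26.0 in the reporter's dmesg output.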
Created attachment 196155
dmesg output

Hello,

Upgrading from 11.1-RELEASE to 11.2-RELEASE broke two of my USB ports. dmesg gives me:

    ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xf252a000-0xf252a3ff at device 26.0 on pci0
    usbus0: EHCI version 1.0
    ehci_interrupt: unrecoverable error, controller halted
    cmd=0x00010030 EHCI_CMD_ITC_1 EHCI_CMD_ASE EHCI_CMD_PSE
    sts=0x0000d004 EHCI_STS_ASS EHCI_STS_PSS EHCI_STS_HCH EHCI_STS_PCD
    ien=0x00000037
    frindex=0x00000000 ctrdsegm=0x00000000 periodic=0x03c2f000 async=0xd3427000
    port 1 status=0x00001803
    port 2 status=0x00001000
    port 3 status=0x00001000
    ehci_dump_isoc: isochronous dump from frame 0x000:
    ITD(0xfffffe01139f5000) at 0x03c58000
      next=0x20a86004
      status[0]=0x00000000; <>
      status[1]=0x00000000; <>
      status[2]=0x00000000; <>
      status[3]=0x00000000; <>
      status[4]=0x00000000; <>
      status[5]=0x00000000; <>
      status[6]=0x00000000; <>
      status[7]=0x00000000; <>
      bp[0]=0x00000000
      addr=0x00; endpt=0x0
      bp[1]=0x00000000
      dir=out; mpl=0x00
      bp[2..6]=0x00000000,0x00000000,0x00000000,0x00000000,0x00000000
      bp_hi=0x00000000,0x00000000,0x00000000,0x00000000,
            0x00000000,0x00000000,0x00000000
    SITD(0xfffffe00ef486000) at 0x20a86000
      next=0xd3458002
      portaddr=0x00000000 dir=out addr=0 endpt=0x0 port=0x0 huba=0x0
      mask=0x00000000
      status=0x00000000 <> len=0x0
      back=0x00000001, bp=0x00000000,0x00000000,0x00000000,0x00000000
    ehci_interrupt: blocking interrupts 0x10
    usbus0: run timeout
    ehci0: USB init failed err=18
    device_attach: ehci0 attach returned 6

I have attached the complete output of dmesg.
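The sts value in the register dump can be cross-checked against the flag names the driver prints next to it. The following is a minimal sketch in POSIX sh, assuming the USBSTS bit layout of the EHCI 1.0 specification. The masks should be verified against the specification or sys/dev/usb/controller/ehcireg.h before relying on them. `decode_ehci_sts` is a hypothetical helper name, not part of FreeBSD.

```sh
#!/bin/sh
# decode_ehci_sts: hypothetical helper, not part of FreeBSD.
# Prints the EHCI_STS_* flag names set in a USBSTS value.
# Masks are assumed from the EHCI 1.0 USBSTS register layout.
decode_ehci_sts() {
    sts=$(( $1 ))
    out=""
    for pair in \
        0x8000:ASS 0x4000:PSS 0x2000:REC 0x1000:HCH \
        0x0020:IAA 0x0010:HSE 0x0008:FLR 0x0004:PCD \
        0x0002:ERRINT 0x0001:INT
    do
        mask=${pair%%:*}    # text before the colon, e.g. 0x8000
        name=${pair#*:}     # text after the colon, e.g. ASS
        if [ $(( sts & $mask )) -ne 0 ]; then
            # Append the flag, separated by a space after the first one
            out="${out}${out:+ }EHCI_STS_${name}"
        fi
    done
    printf '%s\n' "$out"
}
```

Under these assumptions, `decode_ehci_sts 0x0000d004` yields the same four flags as the dump (ASS, PSS, HCH, PCD), and `decode_ehci_sts 0x10` yields EHCI_STS_HSE. That would suggest the "blocking interrupts 0x10" line refers to the host system error bit, although this reading has not been checked against the driver source here.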