Bug 246119

Summary: ahci: MFC of r359499 in 12.1-STABLE r359970 breaks JMicron JMB362
Product: Base System
Reporter: rk <rk>
Component: kern
Assignee: Alexander Motin <mav>
Status: New ---
Severity: Affects Only Me
CC: markj, mav, rk
Priority: ---
Keywords: regression
Version: 12.1-STABLE   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
Patch adds AHCI_Q_NOFORCE quirk to JMB362 none

Description rk 2020-05-02 11:01:11 UTC
The following MFC in r359972 breaks the detection of cd0 on the
JMB362 ahci controller on 12.1-STABLE:

  MFC r359499: Add ID for JMicron JMB582/JMB585 AHCI controller.

  JMB582 has 2 6Gbps SATA ports and PCIe 3.0 x1.
  JMB585 has 5 6Gbps SATA ports and PCIe 3.0 x2.

  Both chips support AHCI v1.31, Port Multiplier with FBS and 8 MSI vectors.

Before that change (e.g. r359957):
==================================

pci3: <ACPI PCI bus> on pcib3
atapci1: <JMicron JMB362 SATA300 controller> port 0xc040-0xc047,0xc030-0xc033,0xc020-0xc027,0xc010-0xc013,0xc000-0xc00f mem 0xfe510000-0xfe5101ff irq 46 at device 0.0 on pci3
ahci1: <JMicron JMB362 AHCI SATA controller> at channel -1 on atapci1
ahci1: AHCI v1.10 with 2 3Gbps ports, Port Multiplier supported
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
...
cd0 at ahcich3 bus 0 scbus3 target 0 lun 0
cd0: <PLEXTOR BD-R  PX-B950SA 1.02> Removable CD-ROM SCSI device
cd0: Serial Number 2512075 216211500893
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed

-> cd0 is detected, all is well

With r359972:
=============

pci3: <ACPI PCI bus> on pcib3
ahci1: <JMicron JMB362 AHCI SATA controller> port 0xc040-0xc047,0xc030-0xc033,0xc020-0xc027,0xc010-0xc013,0xc000-0xc00f mem 0xfe510000-0xfe5101ff irq 46 at device 0.0 on pci3
ahci1: AHCI v1.10 with 2 3Gbps ports, Port Multiplier supported
ahcich2: <AHCI channel> at channel 0 on ahci1
ahcich3: <AHCI channel> at channel 1 on ahci1
...
Root mount waiting for: CAM
  [repeated for multiple seconds]
...
ahcich3: Poll timeout on slot 1 port 15
ahcich3: is 00000000 cs 00000002 ss 00000000 rs 00000002 tfd 77 serr 00000000 cmd 0004c117
Root mount waiting for:(aprobe1:ahcich3:0:15:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
CAM(aprobe1:ahcich3:0:15:0): CAM status: Command timeout
(aprobe1:ahcich3:0:15:0): Error 5, Retries exhausted
 
Root mount waiting for: CAM
last message repeated 15 times
ahcich3: Poll timeout on slot 2 port 0
ahcich3: is 00000000 cs 00000004 ss 00000000 rs 00000004 tfd 77 serr 00000000 cmd 0004c217
(aprobe0:ahcich3:0:0:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted

--> no cd0 detected, hence unusable
Comment 1 rk 2020-05-02 15:50:10 UTC
Due to a cut-and-paste error, I omitted the following message from the previous comment:

After

Root mount waiting for: CAM
  [repeated for multiple seconds]

there also was

ahcich3: AHCI reset: device not ready after 31000ms (tfd = 000000ff)
Root mount waiting for: CAM
...
Comment 2 rk 2020-05-09 13:42:13 UTC
Created attachment 214310 [details]
Patch adds AHCI_Q_NOFORCE quirk to JMB362
Comment 3 rk 2020-05-09 13:46:19 UTC
The following change from r359970 appears to be the culprit:

--- stable/12/sys/dev/ahci/ahci_pci.c	(revision 359969)
+++ stable/12/sys/dev/ahci/ahci_pci.c	(revision 359970)
@@ -247,6 +247,7 @@
 	{0x2365197b, 0x00, "JMicron JMB365",	AHCI_Q_NOFORCE},
 	{0x2366197b, 0x00, "JMicron JMB366",	AHCI_Q_NOFORCE},
 	{0x2368197b, 0x00, "JMicron JMB368",	AHCI_Q_NOFORCE},
+	{0x0585197b, 0x00, "JMicron JMB58x",	0},
 	{0x611111ab, 0x00, "Marvell 88SE6111",	AHCI_Q_NOFORCE | AHCI_Q_NOPMP |
 	    AHCI_Q_1CH | AHCI_Q_EDGEIS},
 	{0x612111ab, 0x00, "Marvell 88SE6121",	AHCI_Q_NOFORCE | AHCI_Q_NOPMP |
@@ -399,6 +400,7 @@
 		     !(ahci_ids[i].quirks & AHCI_Q_NOFORCE)))) {
 			/* Do not attach JMicrons with single PCI function. */
 			if (pci_get_vendor(dev) == 0x197b &&
+			    (ahci_ids[i].quirks & AHCI_Q_NOFORCE) &&
 			    (pci_read_config(dev, 0xdf, 1) & 0x40) == 0)
 				return (ENXIO);
 			snprintf(buf, sizeof(buf), "%s AHCI SATA controller",


This breaks JMB362. I've added the AHCI_Q_NOFORCE quirk to JMB362 and with
that change it works again. Tested on r360840 with the following patch:

Index: sys/dev/ahci/ahci_pci.c
===================================================================
--- sys/dev/ahci/ahci_pci.c     (revision 360840)
+++ sys/dev/ahci/ahci_pci.c     (working copy)
@@ -242,7 +242,7 @@
        {0x23238086, 0x00, "Intel DH89xxCC",    0},
        {0x2360197b, 0x00, "JMicron JMB360",    0},
        {0x2361197b, 0x00, "JMicron JMB361",    AHCI_Q_NOFORCE | AHCI_Q_1CH},
-       {0x2362197b, 0x00, "JMicron JMB362",    0},
+       {0x2362197b, 0x00, "JMicron JMB362",    AHCI_Q_NOFORCE},
        {0x2363197b, 0x00, "JMicron JMB363",    AHCI_Q_NOFORCE},
        {0x2365197b, 0x00, "JMicron JMB365",    AHCI_Q_NOFORCE},
        {0x2366197b, 0x00, "JMicron JMB366",    AHCI_Q_NOFORCE},

Now JMB362 works again (and cd0 is detected). I don't know if this is by
accident due to the code from r359970 or if JMB362 really needs AHCI_Q_NOFORCE.
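
To make the effect of the second hunk concrete, here is a small user-space model of the JMicron "single PCI function" check before and after r359970. It is only a sketch under stated assumptions: `jmicron_check`, `JMICRON_VENDOR`, and `Q_NOFORCE` (including its numeric value) are hypothetical stand-ins rather than the kernel's definitions, the surrounding ID-match condition is left out, and it assumes that bit 0x40 of config register 0xdf is clear on the affected board, which the old "on atapci1" attachment suggests but which is not confirmed in this report.

```c
#include <errno.h>
#include <stdint.h>

#define JMICRON_VENDOR	0x197bu	/* vendor ID tested in the diff above */
#define Q_NOFORCE	0x1u	/* stand-in for AHCI_Q_NOFORCE; value is illustrative */

/*
 * Model of the inner JMicron check only.
 * Returns ENXIO when the modeled probe would decline to attach, else 0.
 * post_r359970 selects the condition with the added quirk test.
 */
static int
jmicron_check(uint16_t vendor, uint32_t quirks, uint8_t cfg_df,
    int post_r359970)
{
	if (vendor != JMICRON_VENDOR)
		return (0);
	/* r359970: entries without the quirk skip the check entirely. */
	if (post_r359970 && (quirks & Q_NOFORCE) == 0)
		return (0);
	/* Bit 0x40 of config register 0xdf clear: do not attach. */
	if ((cfg_df & 0x40) == 0)
		return (ENXIO);
	return (0);
}
```

Under these assumptions, `jmicron_check(0x197b, 0, 0x00, 0)` returns ENXIO (old code: ahci declines the direct attach), `jmicron_check(0x197b, 0, 0x00, 1)` returns 0 (r359970: ahci attaches directly to the PCI device), and `jmicron_check(0x197b, Q_NOFORCE, 0x00, 1)` returns ENXIO again, which matches the behaviour seen with the patch applied.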
Comment 4 Mark Linimon freebsd_committer freebsd_triage 2020-07-12 16:27:54 UTC
^Triage: assign to committer of r359972.
Comment 5 Alexander Motin freebsd_committer freebsd_triage 2020-07-13 13:31:35 UTC
As far as I can see, the JMB362 is a SATA-only chip with no PATA ports, so attaching on top of atapci does not make much sense to me.  What that attachment may do is disable MSI interrupts, since the ata(4) driver does not seem to enable them by default.  Instead of this patch, could you try setting the loader tunable `hint.ahci.1.msi=0`?
Comment 6 rk 2020-07-13 17:53:45 UTC
I removed the patch, installed a new kernel and enabled
hint.ahci.1.msi=0 in /boot/loader.conf and rebooted.

Unfortunately it did not help. The same symptoms are seen as before:

# kenv | grep ahci
hint.ahci.1.msi="0"


Jul 13 19:21:03 magrathea kernel: Root mount waiting for: CAM
Jul 13 19:21:03 magrathea syslogd: last message repeated 24 times
Jul 13 19:21:03 magrathea kernel: ahcich3: AHCI reset: device not ready after 31000ms (tfd = 000000ff)
Jul 13 19:21:03 magrathea kernel: Root mount waiting for: CAM
Jul 13 19:21:03 magrathea syslogd: last message repeated 15 times
Jul 13 19:21:03 magrathea kernel: ahcich3: Poll timeout on slot 1 port 15
Jul 13 19:21:03 magrathea kernel: ahcich3: is 00000000 cs 00000002 ss 00000000 rs 00000002 tfd 77 serr 00000000 cmd 0004c117
Jul 13 19:21:03 magrathea kernel: (aprobe1:ahcich3:0:15:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
Jul 13 19:21:03 magrathea kernel: (aprobe1:ahcich3:0:15:0): CAM status: Command timeout
Jul 13 19:21:03 magrathea kernel: (aprobe1:ahcich3:0:15:0): Error 5, Retries exhausted
Jul 13 19:21:03 magrathea kernel: Root mount waiting for: CAM
Jul 13 19:21:03 magrathea syslogd: last message repeated 16 times
Jul 13 19:21:03 magrathea kernel: ahcich3: Poll timeout on slot 2 port 0
Jul 13 19:21:03 magrathea kernel: ahcich3: is 00000000 cs 00000004 ss 00000000 rs 00000004 tfd 77 serr 00000000 cmd 0004c217
Jul 13 19:21:03 magrathea kernel: (aprobe0:ahcich3:0:0:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
Jul 13 19:21:03 magrathea kernel: (aprobe0:ahcich3:0:0:0): CAM status: Command timeout
Jul 13 19:21:03 magrathea kernel: (aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted

No devices are attached on ahcich3 (ahci1).
(In reply to Alexander Motin from comment #5)