Bug 209468 - aacraid run_interrupt_driven_hooks: still waiting after 60-300 seconds for xpt_config panic (with patch and suggested errata)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: amd64 Any
Importance: Normal Affects Many People
Assignee: Maxim Sobolev
URL: https://reviews.freebsd.org/D18408
Keywords: patch, regression
Duplicates: 222448
Depends on:
Blocks:
 
Reported: 2016-05-12 17:32 UTC by Steven Peterson
Modified: 2021-10-03 23:24 UTC
CC List: 20 users

See Also:
sobomax: mfc-stable12+
sobomax: mfc-stable11+


Attachments
Screen shot of last full screen (157.18 KB, image/jpeg)
2016-05-12 22:12 UTC, Steven Peterson
a panic screen on 11.x (65.61 KB, image/png)
2016-06-14 11:52 UTC, emz
Free BSD 10.x aacraid code Fixes to work with new controller firmware (10.95 KB, patch)
2016-08-31 19:04 UTC, Prasad B M
Adaptec ASR8405 fix bug drivers with new firmware (11.00 KB, patch)
2017-09-22 21:43 UTC, Maxim
errata-template.txt (3.86 KB, text/plain)
2019-12-01 13:27 UTC, ev

Description Steven Peterson 2016-05-12 17:32:31 UTC
Hi, I am having an issue (presumably) with the aacraid driver.
I have a system with an HBA-1000-16E. The system panics in late boot after "run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config".
With the card removed, it boots normally. I replaced the card, thinking it was faulty, but got the same message.
Prior to the xpt_config message I get "(probe0:umass-sim0:0:0:0): Down reving Protocol Version from 2 to 0?"; I get this for 0:0:0:1,2,3,4 as well.
This issue is replicated on the 11-CURRENT ISO, the 10.3 ISO, and FreeNAS 9.10.
Comment 1 Steven Peterson 2016-05-12 17:51:45 UTC
Here is a link to a video of the boot from beginning to end, captured by the BMC.
There are two files in the folder: one is DVC, the native video format of the BMC; the other is converted to QuickTime from the native file.
https://www.dropbox.com/sh/f4186cg0x2z9flj/AACa4pGqCQtGjTtcN8t6fHWZa?dl=0
Comment 2 Steven Peterson 2016-05-12 22:12:45 UTC
Created attachment 170237 [details]
Screen shot of last full screen
Comment 3 zerofreebsd 2016-05-19 19:13:31 UTC
My system has been encountering the same issue since I installed an HBA 1000 16i, so it does seem to point in the general direction of the aacraid driver at least.
Comment 4 emz 2016-06-14 09:09:31 UTC
Exactly same stuff here, ASR8402 / 10.3-R.
Comment 5 emz 2016-06-14 09:27:03 UTC
(In reply to emz from comment #4)

ASR8405, sorry.
Comment 6 emz 2016-06-14 11:41:57 UTC
Reproducible on 11.0-ALPHA3.
Comment 7 emz 2016-06-14 11:52:05 UTC
Created attachment 171420 [details]
a panic screen on 11.x

Got a panic screen on 11.x, however, due to a timeout that ran out. No USB keyboard though, so I got only the initial screen.
Comment 8 Alan Somers freebsd_committer freebsd_triage 2016-06-14 14:22:48 UTC
How many disks do you have attached to that controller?  I saw the same panic when I had ~ 100 disks on a controller and its (different) driver decided to spin those disks up one at a time.
Comment 9 Steven Peterson 2016-06-14 15:00:25 UTC
In my case this can be recreated with nothing attached to the controller. 
It was intended to drive a 48 drive disk shelf. (Only 24 slots filled)
Comment 10 emz 2016-06-15 05:04:33 UTC
I have 24 disks attached.
Comment 11 Prasad B M 2016-06-23 21:43:53 UTC
Hi All,
I am able to recreate the issue on FreeBSD 10.3. It looks like FreeBSD 10.3 has an older version of the inbox aacraid driver, which is loaded during boot, and the new HBA1000-16e, HBA1000-16i and ASR 8405 controllers are not compatible with that older driver.
We are working on a solution to fix the issue.

Regards,
Prasad
Comment 12 Prasad B M 2016-08-31 19:04:55 UTC
Created attachment 174268 [details]
Free BSD 10.x aacraid code Fixes to work with new controller firmware

aacraid driver code fixes for FreeBSD 10.1 & 10.3. The fixes are required to make the aacraid driver work with the new Microsemi (PMC/Adaptec) ARC RAID controllers.

A user who runs the latest ARC controller with the new firmware will see this issue.
Comment 13 Prasad B M 2016-12-21 19:46:34 UTC
Hi All,

What is the next step on this issue? I have identified the problem and submitted a patch to fix it.

Please let me know whom to contact in order to change the status of this bug.

Thanks,
Prasad
Comment 14 emz 2016-12-22 08:50:31 UTC
I guess someone now has to test this patch, and upon successful testing it will be committed to the tree. Unfortunately, I am unable to test it at the moment since the controllers are shelved. Hopefully, the two people who were also having this issue can perform a test. If neither of them does so in about two weeks, I will try to put the controllers back into production and test it myself.
Comment 15 Maxim 2017-09-20 06:45:17 UTC
Hi!

Is there any news on fixing the driver? We have two servers with Adaptec ASR8405 controllers that do not work with the new firmware: boot freezes at "run_interrupt_driven_hooks: still waiting after 60-300 seconds for xpt_config" and then panics. I tried FreeBSD 10.3, 10.4 RC, and 11.1 :(

Thanks,
Maxim.
Comment 16 Prasad B M 2017-09-20 17:50:48 UTC
Hi Maxim,

The patch has been submitted for acceptance, but I did not receive any response.

The link to the patch is below.
https://bz-attachments.freebsd.org/attachment.cgi?id=174268

regards,
Prasad

(In reply to Maxim from comment #15)
Comment 17 Steven Peterson 2017-09-20 17:58:21 UTC
At this point, the controllers are shelved for me as well. I should have a server free that I can do some initial testing with, but I will need a bootable install CD, as we never got the system installed with the card.
Comment 18 Maxim 2017-09-20 18:35:58 UTC
(In reply to Prasad B M from comment #16)

Can you please create a bootable image for a USB flash drive with a kernel that includes the patched driver?

We will test the patch by booting from the USB flash drive.
Comment 19 Prasad B M 2017-09-20 18:50:29 UTC
(In reply to Maxim from comment #18)

I have never created a bootable FreeBSD image for a USB flash drive with the driver in it. Can you please let me know the steps to create a bootable FreeBSD image with the aacraid driver included in it?

regards,
Prasad
Comment 20 Maxim 2017-09-20 19:01:41 UTC
I will try to install FreeBSD 10.3 without the controller, on a separate SATA disk, and then rebuild the system with the patch.

Then I'll install the controller in the same server and see how it works.

What do you think: if it works, can we hope that the patch will be included in FreeBSD 10.4?


(In reply to Prasad B M from comment #19)
Comment 21 Prasad B M 2017-09-20 21:13:07 UTC
(In reply to Maxim from comment #20)

OK, that sounds good.

Regarding the patch and fix submission for 10.4: currently the fixes are available only through the out-of-box driver.

Regarding fix submission to the inbox driver, I need to check with Microsemi whether they currently support inbox patch submission for FreeBSD.

regards,
Prasad
Comment 22 Maxim 2017-09-22 21:41:41 UTC
Hi All!

I checked the patch on FreeBSD 10.3R: the bug is fixed and the controller works fine with the new firmware. The previously proposed patch had inconsistencies, because of which it did not apply cleanly:
Hunk #11 failed at 2921.
Hunk #12 succeeded at 3702 (offset -10 lines).
Hunk #13 succeeded at 3746 (offset -10 lines).
Hunk #14 succeeded at 3812 (offset -10 lines).

I corrected the patch; the file is attached.
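When a patch applies only partially, as in the output quoted above, it can help to summarise which hunks failed and which were applied at a shifted position before fixing the diff by hand. The following is a minimal sketch, assuming the output has the same line format as the four lines quoted in this comment; the function name `summarize_hunks` is hypothetical and not part of any tool.

```python
import re

# Matches lines such as "Hunk #11 failed at 2921." and
# "Hunk #12 succeeded at 3702 (offset -10 lines)."
# Assumption: the output looks like the lines quoted in this comment.
HUNK_RE = re.compile(
    r"Hunk #(?P<num>\d+) (?P<status>failed|succeeded) at (?P<line>\d+)"
    r"(?: \(offset (?P<offset>-?\d+) lines?\))?"
)


def summarize_hunks(output):
    """Return (failed, shifted): failed hunk numbers and {hunk: offset}."""
    failed, shifted = [], {}
    for match in HUNK_RE.finditer(output):
        num = int(match.group("num"))
        if match.group("status") == "failed":
            failed.append(num)
        elif match.group("offset") is not None:
            shifted[num] = int(match.group("offset"))
    return failed, shifted


# The four lines reported in this comment.
sample = """\
Hunk #11 failed at 2921.
Hunk #12 succeeded at 3702 (offset -10 lines).
Hunk #13 succeeded at 3746 (offset -10 lines).
Hunk #14 succeeded at 3812 (offset -10 lines).
"""

failed, shifted = summarize_hunks(sample)
print("failed hunks:", failed)
print("shifted hunks:", shifted)
```

A failed hunk has to be merged by hand; hunks that succeeded with an offset were applied, but the consistent shift suggests the patch was generated against a slightly different revision of the file.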

   Test hardware:
Adaptec ASR8805
   BIOS                                     : 7.11-0 (33173)
   Firmware                                 : 7.11-0 (33173)
   Driver                                   : 3.2-10 (1)
   Boot Flash                               : 7.11-0 (33173)
   CPLD (Load version/ Flash version)       : 6/ 11
   SEEPROM (Load version/ Flash version)    : 1/ 1

Driver:
aacraid0: <Adaptec RAID Controller> port 0xe000-0xe0ff mem 0xfb100000-0xfb1fffff,0xfb280000-0xfb2803ff irq 42 at device 0.0 on pci4
aacraid0: Enable Raw I/O
aacraid0: Enable 64-bit array
aacraid0: using MSI-X interrupts (32 vectors)
aacraid0: New comm. interface type2 enabled
aacraid0: ASR8805, aacraid driver 3.2.10-1
aacraidp0 on aacraid0
aacraidp1 on aacraid0
aacraidp2 on aacraid0
aacraidp3 on aacraid0
Comment 23 Maxim 2017-09-22 21:43:12 UTC
Created attachment 186624 [details]
Adaptec ASR8405 fix bug drivers with new firmware
Comment 24 Maxim 2017-09-22 22:55:09 UTC
I additionally checked the patch on FreeBSD 10.4-RC2; it also works fine.

https://bz-attachments.freebsd.org/attachment.cgi?id=174268
Comment 25 Prasad B M 2017-09-22 23:10:27 UTC
(In reply to Maxim from comment #24)

Thanks Maxim for testing the code change patch.
Comment 26 Maxim 2017-09-23 00:02:08 UTC
(In reply to Prasad B M from comment #25)

I checked the patch on FreeBSD 11.1; it works fine too :)

FreeBSD 11.1-RELEASE-p1

aacraid0: <Adaptec RAID Controller> port 0xe000-0xe0ff mem 0xfb100000-0xfb1fffff,0xfb280000-0xfb2803ff irq 42 at device 0.0 on pci5
aacraid0: Enable Raw I/O
aacraid0: Enable 64-bit array
aacraid0: using MSI-X interrupts (32 vectors)
aacraid0: New comm. interface type2 enabled
aacraid0: ASR8805, aacraid driver 3.2.10-1
aacraidp0 on aacraid0
aacraidp1 on aacraid0
aacraidp2 on aacraid0
aacraidp3 on aacraid0
Comment 27 Ed Maste freebsd_committer freebsd_triage 2017-10-02 10:18:00 UTC
*** Bug 222448 has been marked as a duplicate of this bug. ***
Comment 28 Ed Maste freebsd_committer freebsd_triage 2017-10-02 16:29:33 UTC
(In reply to Prasad B M from comment #21)
PMC had a FreeBSD committer on staff but I understand he is no longer working there. I'll help getting this patch committed to FreeBSD and can discuss w/ Microsemi steps on getting changes integrated into FreeBSD on an ongoing basis.

Is this patch tested with the earlier controllers also supported by aacraid, e.g. ASR-6405(T|E), ASR-7085 etc.?
Comment 29 Maxim 2017-10-10 08:01:03 UTC
I have one more question. With driver 3.3.2 from adaptec.com we see "New comm. interface type3 enabled":

aacraidu0: <Adaptec RAID Controller> port 0x6000-0x60ff mem 0xc7100000-0xc71fffff,0xc7280000-0xc72803ff irq 32 at device 0.0 on pci2
aacraidu0: Enable Raw I/O
aacraidu0: Enable 64-bit array
aacraidu0: using MSI-X interrupts (32 vectors)
aacraidu0: New comm. interface type3 enabled
aacraidu0: ASR8405, aacraid driver 3.3.2-52013

but with the in-tree driver we see "New comm. interface type2 enabled":

aacraid0: <Adaptec RAID Controller> port 0xe000-0xe0ff mem 0xfb100000-0xfb1fffff,0xfb280000-0xfb2803ff irq 42 at device 0.0 on pci5
aacraid0: Enable Raw I/O
aacraid0: Enable 64-bit array
aacraid0: using MSI-X interrupts (32 vectors)
aacraid0: New comm. interface type2 enabled
aacraid0: ASR8805, aacraid driver 3.2.10-1

Does this affect performance?

Maybe that needs a fix?
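To compare what the two drivers report without reading the boot messages by eye, the relevant fields can be pulled out of the dmesg text. This is a minimal sketch, assuming the messages keep the exact wording shown in the two excerpts above; the function name `describe_controller` is hypothetical. It only extracts what the driver prints and says nothing about whether the interface type affects performance.

```python
import re

# Assumption: message wording is exactly as in the excerpts quoted above.
COMM_RE = re.compile(
    r"^(?P<dev>\w+): New comm\. interface type(?P<type>\d+) enabled$"
)
DRIVER_RE = re.compile(
    r"^(?P<dev>\w+): (?P<model>[^,\s]+), aacraid driver (?P<version>\S+)$"
)


def describe_controller(dmesg_text):
    """Collect device name, comm. interface type, model and driver version."""
    info = {}
    for line in dmesg_text.splitlines():
        line = line.strip()
        match = COMM_RE.match(line)
        if match:
            info["device"] = match.group("dev")
            info["comm_type"] = int(match.group("type"))
            continue
        match = DRIVER_RE.match(line)
        if match:
            info["device"] = match.group("dev")
            info["model"] = match.group("model")
            info["driver"] = match.group("version")
    return info


# Lines copied from the vendor-driver and in-tree-driver excerpts above.
vendor = """\
aacraidu0: using MSI-X interrupts (32 vectors)
aacraidu0: New comm. interface type3 enabled
aacraidu0: ASR8405, aacraid driver 3.3.2-52013
"""
in_tree = """\
aacraid0: using MSI-X interrupts (32 vectors)
aacraid0: New comm. interface type2 enabled
aacraid0: ASR8805, aacraid driver 3.2.10-1
"""

print(describe_controller(vendor))
print(describe_controller(in_tree))
```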
Comment 30 Maxim 2017-10-10 08:05:30 UTC
(In reply to Ed Maste from comment #28)

Since the 7th-generation and earlier models differ from the 8th generation, does that just call for different drivers?

Do you think it is possible to contact Adaptec to run the tests?
Comment 31 Maxim 2017-10-26 10:39:02 UTC
Hi!

Is there any news about this bug?

What are the plans for including the patches in HEAD?

Thanks.
Comment 32 Sam 2017-11-22 14:15:22 UTC
I can confirm same bug on FreeBSD 11.1-RELEASE-p4, using HBA-1000-16i on ASRock Mainboard EPC612D8.

The same controller on the same motherboard works on a CentOS install, so the hardware side is fine.

Is the working patch available in FreeBSD 11-STABLE or 12.0-CURRENT?
Comment 33 Maxim 2018-06-20 12:31:59 UTC
Hi!

This issue still exists, including in version 11.2-RC3. It makes it impossible to install FreeBSD on disk space located on an ASR8405 or ASR8805 controller. There are patches, which were previously posted and tested. There is also driver source code on microsemi.com, which works when compiled into the kernel, but that makes installing a new OS difficult. Is a solution planned for this problem?


Best regards,
Maxim.
Comment 34 Andrew 2018-07-06 11:20:13 UTC
Hi!

I've got the same issue with an HBA-1000-8i8e and the smartpqi driver (still waiting ... for xpt_config).
The controller firmware was flashed to the latest release, version 4.02, from Microsemi.

Also, it seems that the smartpqi driver is not loaded properly (I haven't caught the output yet).

I think this problem could be similar for Adaptec RAID and HBA boards.

--
Andrew
Comment 35 Maxim 2018-07-13 12:15:31 UTC
Hi!

I think that an adjustment to the current driver is needed, because at some point Microsemi made changes to the firmware, after which this issue began.

This thread contains a patch (driver version 3.2.10) that solves this issue.

There is also a 3.3.2 driver on the vendor site, which you could try to port to solve this issue. I now use driver version 3.3.2 from the Microsemi website.
https://storage.microsemi.com/en-us/downloads/unix/freebsd/productid=asr-8805&dn=adaptec+raid+8805.php

But there is a problem if you need to reinstall the system: you have to install the OS on an SSD drive and then transfer it to a RAID array.

I also wrote an email to Microsemi describing a similar problem; so far no answer has followed :(

I think this is a serious problem that Microsemi's developers should pay attention to, because it has a negative effect on the choice of Adaptec equipment. For example, we ordered new servers with LSI controllers for this reason.

Kind regards,
Maxim.
Comment 36 Maxim 2018-07-13 12:22:06 UTC
I really hope that the fix will be in FreeBSD 11.3 and 12.0 :)
Comment 37 Maxim 2018-08-17 06:21:43 UTC
Hi!

Is there any news on this issue?

As of today, Adaptec controllers do not work under FreeBSD. Is it planned to fix this issue?

Thank you.
Comment 38 Maxim 2018-09-24 10:46:41 UTC
Hi!

Maybe somebody has news about this issue?

Is there a plan to include the driver patch in FreeBSD 12.0 or 11-STABLE?

Thank you.
Comment 39 commit-hook freebsd_committer freebsd_triage 2019-05-22 04:51:20 UTC
A commit references this bug:

Author: sobomax
Date: Wed May 22 04:51:09 UTC 2019
New revision: 348091
URL: https://svnweb.freebsd.org/changeset/base/348091

Log:
  Make aacraid(4) working on ASR8805 & ASR8402 in particular. This patch
  has been in the PR system for 5 months and then on reviews for another 5.
  Nobody came with any cases where it fails, while many people cried for
  it to be commited & merged.

  PR:		209468
  Submitted by:	Prasad B M <prasad.munirathnam@microsemi.com>
  Reported by:	Steven Peterson <scp@mainstream.net>
  Approved by:	scottl
  MFC after:	1 month
  Differential Revision:	https://reviews.freebsd.org/D18408

Changes:
  head/sys/dev/aacraid/aacraid.c
  head/sys/dev/aacraid/aacraid_cam.c
  head/sys/dev/aacraid/aacraid_reg.h
  head/sys/dev/aacraid/aacraid_var.h
Comment 40 Maxim Sobolev freebsd_committer freebsd_triage 2019-05-22 04:55:52 UTC
The patch has landed in -CURRENT. Unless there are any issues, I plan to merge it into rel.11 and rel.12 in about a month.
Comment 41 ev 2019-06-22 09:22:03 UTC
11.3-RC2 is available.
Is there a plan to include the driver patch in FreeBSD 11.3-RELEASE?
Comment 42 ev 2019-08-25 10:33:10 UTC
Is there a plan to include the driver patch in FreeBSD 12.1-RELEASE?
Comment 43 ev 2019-09-22 08:09:22 UTC
12.1-BETA1 is available.
Will the next version of FreeBSD not work on these controllers? :(
Comment 44 Gert Doering 2019-11-14 20:56:56 UTC
Running into this as well right now: Supermicro mainboard, ASR8805 controller with two SAS drives attached. The 11.2, 12.0 and 12.1 disc1 images boot, recognize the card, then hang with the dreaded "xpt_config" timeout (specifically, the boot detects card+drives, then proceeds to USB devices, and after detecting the virtual USB CD-ROM, "umass0:14:0: Attached to scbus14", it stops, proceeding only with "run_interrupt_driven_hooks: still waiting after ... seconds for xpt_config").

So should the fix from Maxim in -CURRENT be in any of these RELEASEs?
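For anyone triaging a captured console log (for example from a BMC or serial console), the stall described above can be spotted mechanically. This is a minimal sketch, assuming the message wording quoted in this report; the sample text is illustrative rather than a real capture, and the function name `longest_wait` is hypothetical.

```python
import re

# Assumption: wording as quoted in this report, e.g.
# "run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config"
WAIT_RE = re.compile(
    r"run_interrupt_driven_hooks: still waiting after (\d+) seconds? for (\w+)"
)


def longest_wait(console_text):
    """Map each stalled hook name to the longest wait reported, in seconds."""
    waits = {}
    for seconds, hook in WAIT_RE.findall(console_text):
        waits[hook] = max(waits.get(hook, 0), int(seconds))
    return waits


# Illustrative console text assembled from messages quoted in this thread.
sample = """\
umass0:14:0: Attached to scbus14
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
"""

print(longest_wait(sample))
```

An empty result means the log contains no such message; it does not by itself prove that the driver attached correctly.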
Comment 45 commit-hook freebsd_committer freebsd_triage 2019-11-21 14:55:11 UTC
A commit references this bug:

Author: emaste
Date: Thu Nov 21 14:54:21 UTC 2019
New revision: 354964
URL: https://svnweb.freebsd.org/changeset/base/354964

Log:
  MFC r348091 by sobomax: update aacraid driver to 3.2.10

  PR:		209468

Changes:
_U  stable/12/
  stable/12/sys/dev/aacraid/aacraid.c
  stable/12/sys/dev/aacraid/aacraid_cam.c
  stable/12/sys/dev/aacraid/aacraid_reg.h
  stable/12/sys/dev/aacraid/aacraid_var.h
Comment 46 commit-hook freebsd_committer freebsd_triage 2019-11-21 14:56:18 UTC
A commit references this bug:

Author: emaste
Date: Thu Nov 21 14:55:28 UTC 2019
New revision: 354965
URL: https://svnweb.freebsd.org/changeset/base/354965

Log:
  MFC r348091 by sobomax: update aacraid driver to 3.2.10

  PR:		209468

Changes:
_U  stable/11/
  stable/11/sys/dev/aacraid/aacraid.c
  stable/11/sys/dev/aacraid/aacraid_cam.c
  stable/11/sys/dev/aacraid/aacraid_reg.h
  stable/11/sys/dev/aacraid/aacraid_var.h
Comment 47 Ed Maste freebsd_committer freebsd_triage 2019-11-29 21:16:53 UTC
I think this is a reasonable candidate for an errata update. It would help if someone who'd like to see this included in an EN would fill out the errata advisory template at https://www.freebsd.org/security/errata-template.txt
Comment 48 ev 2019-12-01 13:27:07 UTC
Created attachment 209572 [details]
errata-template.txt
Comment 49 ev 2019-12-01 13:28:55 UTC
(In reply to Ed Maste from comment #47)
Is this filled out correctly?
What are the next steps?
Comment 50 Piotr Pawel Stefaniak freebsd_committer freebsd_triage 2021-10-03 21:57:33 UTC
I think all supported releases contain this fix.
Comment 51 Warner Losh freebsd_committer freebsd_triage 2021-10-03 23:24:23 UTC
Fixed ages ago.