Bug 213371 - panic: spin lock held too long
Summary: panic: spin lock held too long
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 11.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2016-10-11 09:54 UTC by Wojciech Giel
Modified: 2018-07-20 03:33 UTC
CC List: 3 users

See Also:


Attachments
panic screenshot (292.17 KB, image/png), 2016-10-11 09:54 UTC, Wojciech Giel
panic with 12.0-CURRENT (315.91 KB, image/png), 2016-10-13 13:26 UTC, Wojciech Giel
bt tid 1 (389.60 KB, image/png), 2016-10-13 14:28 UTC, Wojciech Giel
bt tid 2 (440.18 KB, image/png), 2016-10-13 14:29 UTC, Wojciech Giel
bt tid 3 (12.50 KB, image/png), 2016-10-14 09:23 UTC, Wojciech Giel
bt tid 4 (13.32 KB, image/png), 2016-10-14 09:24 UTC, Wojciech Giel
bt thread (165.71 KB, image/png), 2016-10-14 12:46 UTC, Wojciech Giel
bt (12.79 KB, image/png), 2016-10-14 13:14 UTC, Wojciech Giel
thread (1.54 KB, image/png), 2016-10-14 13:15 UTC, Wojciech Giel
bt second time (12.91 KB, image/png), 2016-10-14 13:15 UTC, Wojciech Giel
01.panic (380.86 KB, image/png), 2016-10-14 15:57 UTC, Wojciech Giel
02.bt (307.26 KB, image/png), 2016-10-14 15:58 UTC, Wojciech Giel
03.thread (32.02 KB, image/png), 2016-10-14 15:58 UTC, Wojciech Giel
04.bt (306.16 KB, image/png), 2016-10-14 15:59 UTC, Wojciech Giel
01.panic (548.51 KB, image/png), 2016-10-14 16:35 UTC, Wojciech Giel
02.bt (563.07 KB, image/png), 2016-10-14 16:35 UTC, Wojciech Giel
03.thread (479.04 KB, image/png), 2016-10-14 16:36 UTC, Wojciech Giel
04.thread cd (59.95 KB, image/png), 2016-10-14 16:36 UTC, Wojciech Giel
act01 (383.83 KB, image/png), 2016-11-07 12:47 UTC, Wojciech Giel
act02 (386.87 KB, image/png), 2016-11-07 12:47 UTC, Wojciech Giel
act03 (407.21 KB, image/png), 2016-11-07 12:47 UTC, Wojciech Giel
act04 (422.75 KB, image/png), 2016-11-07 12:48 UTC, Wojciech Giel
act05 (448.25 KB, image/png), 2016-11-07 12:48 UTC, Wojciech Giel
act06 (422.19 KB, image/png), 2016-11-07 12:49 UTC, Wojciech Giel
act06 (423.99 KB, image/png), 2016-11-07 12:49 UTC, Wojciech Giel
act07 (423.99 KB, image/png), 2016-11-07 12:49 UTC, Wojciech Giel
act08 (444.70 KB, image/png), 2016-11-07 12:50 UTC, Wojciech Giel
act09 (422.25 KB, image/png), 2016-11-07 12:50 UTC, Wojciech Giel
act10 (425.81 KB, image/png), 2016-11-07 12:50 UTC, Wojciech Giel
act11 (447.06 KB, image/png), 2016-11-07 12:51 UTC, Wojciech Giel
act12 (423.17 KB, image/png), 2016-11-07 12:51 UTC, Wojciech Giel
act13 (425.30 KB, image/png), 2016-11-07 12:51 UTC, Wojciech Giel
act14 (446.92 KB, image/png), 2016-11-07 12:51 UTC, Wojciech Giel
act15 (423.12 KB, image/png), 2016-11-07 12:51 UTC, Wojciech Giel
act16 (423.85 KB, image/png), 2016-11-07 12:52 UTC, Wojciech Giel
act17 (446.63 KB, image/png), 2016-11-07 12:52 UTC, Wojciech Giel
act18 (421.69 KB, image/png), 2016-11-07 12:52 UTC, Wojciech Giel
act19 (423.69 KB, image/png), 2016-11-07 12:52 UTC, Wojciech Giel
act20 (445.22 KB, image/png), 2016-11-07 12:52 UTC, Wojciech Giel
act21 (421.35 KB, image/png), 2016-11-07 12:53 UTC, Wojciech Giel
act22 (423.12 KB, image/png), 2016-11-07 12:53 UTC, Wojciech Giel
act23 (444.10 KB, image/png), 2016-11-07 12:53 UTC, Wojciech Giel
act24 (422.46 KB, image/png), 2016-11-07 12:53 UTC, Wojciech Giel
act25 (425.09 KB, image/png), 2016-11-07 12:54 UTC, Wojciech Giel
act26 (436.56 KB, image/png), 2016-11-07 12:54 UTC, Wojciech Giel

Description Wojciech Giel 2016-10-11 09:54:34 UTC
Created attachment 175614 [details]
panic screenshot

Hello,
I'm trying to install FreeBSD 11.0-RELEASE on a Dell PowerEdge R620 with a PERC H310 Mini RAID controller. The controller is currently configured in JBOD mode. I'm getting panic consistently whenever installer tried to access hard drives. I have tried ZFS and UFS, with and without RAID. The problem doesn't exist when I install FreeBSD 10.3. I have this machine for testing for a while, so I can help with debugging.
cheers
Wojciech
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2016-10-12 15:36:30 UTC
Would it be possible for you to try the latest snapshot from ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/12.0/ in a suitable format?
A kernel in the snapshot should have more debugging facilities compiled in, so that might help to get more information.
Comment 2 Wojciech Giel 2016-10-13 13:26:28 UTC
Created attachment 175706 [details]
panic with 12.0-CURRENT
Comment 3 Wojciech Giel 2016-10-13 13:28:30 UTC
I have tried installing 12.0-CURRENT and got the same kernel panic; see the attachment. I replaced the H310 controller with an H710, but the behaviour is the same.
Comment 4 Andriy Gapon freebsd_committer freebsd_triage 2016-10-13 14:17:00 UTC
(In reply to Wojciech Giel from comment #3)
Could you please reproduce this again?  And once you are at the 'db>' prompt please execute the following commands:
db> bt
db> tid 10186 [*]
db> bt
where '10186' is a placeholder for the id reported in the "too long" message.
Please capture the output.
Thank you.
Comment 5 Wojciech Giel 2016-10-13 14:28:48 UTC
Created attachment 175708 [details]
bt tid 1
Comment 6 Wojciech Giel 2016-10-13 14:29:14 UTC
Created attachment 175709 [details]
bt tid 2
Comment 7 Andriy Gapon freebsd_committer freebsd_triage 2016-10-13 17:00:35 UTC
Apologies for my conventions, but please do not enter '[*]'; it should be just 'tid <number>'.  I used '[*]' only to draw your attention to the fact that '10186' should not be entered verbatim. Sorry.
Could you please get the stack traces again?
Comment 8 Andriy Gapon freebsd_committer freebsd_triage 2016-10-13 17:02:50 UTC
Also, it's not 'bt tid xxxx' on one line.  Those are 3 separate commands: 'bt', 'tid xxxx', 'bt'. You hit enter after typing each.
Comment 9 Wojciech Giel 2016-10-14 09:23:10 UTC
Created attachment 175735 [details]
bt tid 3
Comment 10 Wojciech Giel 2016-10-14 09:24:10 UTC
Created attachment 175736 [details]
bt tid 4
Comment 11 Wojciech Giel 2016-10-14 09:26:04 UTC
tid 100245 returns No such command?
Comment 12 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 12:21:55 UTC
(In reply to Wojciech Giel from comment #11)
I apologise again, I confused a kgdb command with a ddb command.
So, instead of 'tid' it should be 'thread'.
Could you please obtain the outputs again?
Comment 13 Wojciech Giel 2016-10-14 12:46:24 UTC
Created attachment 175739 [details]
bt thread

not much there
Comment 14 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 13:10:06 UTC
(In reply to Wojciech Giel from comment #13)
I am a little bit confused.
Could you please do the following?
<get the panic>
<take a picture>
bt
<take a picture>
thread xxxx
bt
<take a picture>

So, I want to get 3 pictures of the same panic.
Thank you.
Comment 15 Wojciech Giel 2016-10-14 13:14:50 UTC
Created attachment 175742 [details]
bt
Comment 16 Wojciech Giel 2016-10-14 13:15:11 UTC
Created attachment 175743 [details]
thread
Comment 17 Wojciech Giel 2016-10-14 13:15:59 UTC
Created attachment 175744 [details]
bt second time

got three separate screenshots
Comment 18 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 13:54:35 UTC
(In reply to Wojciech Giel from comment #17)
Well, my instructions started with
<get the panic>
<take a picture>
bt

So, the first picture should be taken before any commands.
Please take another 3 pictures according to the instructions in comment #14.
Comment 19 Wojciech Giel 2016-10-14 15:57:42 UTC
Created attachment 175747 [details]
01.panic
Comment 20 Wojciech Giel 2016-10-14 15:58:04 UTC
Created attachment 175748 [details]
02.bt
Comment 21 Wojciech Giel 2016-10-14 15:58:41 UTC
Created attachment 175749 [details]
03.thread
Comment 22 Wojciech Giel 2016-10-14 15:59:01 UTC
Created attachment 175750 [details]
04.bt
Comment 23 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 16:21:04 UTC
Using your latest pictures as an example, you did 'thread 100186', but what I asked you to do was 'thread 100346'.  Remember, I said the id reported in the "too long" message.  That message is at the top of the first picture.
So, could you please do this once again using a correct thread number?
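
For readers following along, the corrections from comments 7, 8, 12 and 23 combine into roughly the following ddb session. This is only a sketch: 100346 is the thread id from the "too long" message in this particular panic and will differ between runs, and each command is entered on its own line at the 'db>' prompt, with a picture taken after each step.

```
db> bt
db> thread 100346
db> bt
```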
Comment 24 Wojciech Giel 2016-10-14 16:35:02 UTC
Created attachment 175752 [details]
01.panic
Comment 25 Wojciech Giel 2016-10-14 16:35:40 UTC
Created attachment 175753 [details]
02.bt
Comment 26 Wojciech Giel 2016-10-14 16:36:07 UTC
Created attachment 175754 [details]
03.thread
Comment 27 Wojciech Giel 2016-10-14 16:36:55 UTC
Created attachment 175755 [details]
04.thread cd

sorry. it was a bit confusing.
Comment 28 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 20:25:50 UTC
(In reply to Wojciech Giel from comment #27)
No worries.  Thank you very much!  The latest information is something to chew on.
Comment 29 Andriy Gapon freebsd_committer freebsd_triage 2016-10-14 20:38:58 UTC
A fellow developer suggests that the following command could provide more interesting information:
show active trace
Could you please reproduce the panic and run that command?
It should result in several screenfuls of output on your system; it's important to capture them all.
Thank you.
Comment 30 Wojciech Giel 2016-10-17 09:47:36 UTC
show active trace gives: "No such command"
Comment 31 Andriy Gapon freebsd_committer freebsd_triage 2016-11-05 08:20:49 UTC
There are newer snapshots available now, they should have the command.
If you still have access to the hardware and are interested in debugging this issue, could you please try a newer snapshot?
Thank you.
Comment 32 Wojciech Giel 2016-11-07 12:47:14 UTC
Created attachment 176718 [details]
act01
Comment 33 Wojciech Giel 2016-11-07 12:47:39 UTC
Created attachment 176719 [details]
act02
Comment 34 Wojciech Giel 2016-11-07 12:47:59 UTC
Created attachment 176720 [details]
act03
Comment 35 Wojciech Giel 2016-11-07 12:48:22 UTC
Created attachment 176721 [details]
act04
Comment 36 Wojciech Giel 2016-11-07 12:48:45 UTC
Created attachment 176722 [details]
act05
Comment 37 Wojciech Giel 2016-11-07 12:49:04 UTC
Created attachment 176723 [details]
act06
Comment 38 Wojciech Giel 2016-11-07 12:49:26 UTC
Created attachment 176724 [details]
act06
Comment 39 Wojciech Giel 2016-11-07 12:49:49 UTC
Created attachment 176725 [details]
act07
Comment 40 Wojciech Giel 2016-11-07 12:50:11 UTC
Created attachment 176726 [details]
act08
Comment 41 Wojciech Giel 2016-11-07 12:50:21 UTC
Created attachment 176727 [details]
act09
Comment 42 Wojciech Giel 2016-11-07 12:50:35 UTC
Created attachment 176728 [details]
act10
Comment 43 Wojciech Giel 2016-11-07 12:51:01 UTC
Created attachment 176729 [details]
act11
Comment 44 Wojciech Giel 2016-11-07 12:51:11 UTC
Created attachment 176730 [details]
act12
Comment 45 Wojciech Giel 2016-11-07 12:51:23 UTC
Created attachment 176731 [details]
act13
Comment 46 Wojciech Giel 2016-11-07 12:51:33 UTC
Created attachment 176732 [details]
act14
Comment 47 Wojciech Giel 2016-11-07 12:51:59 UTC
Created attachment 176733 [details]
act15
Comment 48 Wojciech Giel 2016-11-07 12:52:09 UTC
Created attachment 176734 [details]
act16
Comment 49 Wojciech Giel 2016-11-07 12:52:20 UTC
Created attachment 176735 [details]
act17
Comment 50 Wojciech Giel 2016-11-07 12:52:30 UTC
Created attachment 176736 [details]
act18
Comment 51 Wojciech Giel 2016-11-07 12:52:42 UTC
Created attachment 176738 [details]
act19
Comment 52 Wojciech Giel 2016-11-07 12:52:52 UTC
Created attachment 176739 [details]
act20
Comment 53 Wojciech Giel 2016-11-07 12:53:19 UTC
Created attachment 176740 [details]
act21
Comment 54 Wojciech Giel 2016-11-07 12:53:29 UTC
Created attachment 176741 [details]
act22
Comment 55 Wojciech Giel 2016-11-07 12:53:40 UTC
Created attachment 176742 [details]
act23
Comment 56 Wojciech Giel 2016-11-07 12:53:50 UTC
Created attachment 176743 [details]
act24
Comment 57 Wojciech Giel 2016-11-07 12:54:01 UTC
Created attachment 176744 [details]
act25
Comment 58 Wojciech Giel 2016-11-07 12:54:21 UTC
Created attachment 176745 [details]
act26
Comment 59 Wojciech Giel 2016-11-07 12:55:16 UTC
(In reply to Andriy Gapon from comment #29)
uploaded "several screenshots" :-).
Comment 60 Andriy Gapon freebsd_committer freebsd_triage 2016-11-07 14:47:06 UTC
(In reply to Wojciech Giel from comment #59)
Thank you!

TL;DR version of the screenshots for anyone else interested in the problem:
one thread panics because it waits "too long" to acquire the IPI spin lock;
the lock is held by a thread waiting for a targeted TLB shootdown to be executed by other CPUs; the rest of the CPUs are idle, and acpi_cpu_idle_mwait is used for that.
Interesting attachments:
https://bz-attachments.freebsd.org/attachment.cgi?id=176718
https://bz-attachments.freebsd.org/attachment.cgi?id=176719
https://bz-attachments.freebsd.org/attachment.cgi?id=176720

Wojciech,
could you please try setting the following at the loader prompt before booting the kernel?

debug.acpi.disabled=mwait
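
As a minimal sketch of how that tunable might be entered, assuming the standard FreeBSD loader prompt reached by escaping from the boot menu of the install media:

```
set debug.acpi.disabled=mwait
boot
```

A setting made this way applies only to the current boot. On an installed system the same tunable could instead be placed in /boot/loader.conf as debug.acpi.disabled="mwait".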
Comment 61 Wojciech Giel 2016-11-07 17:35:03 UTC
(In reply to Andriy Gapon from comment #60)

I set that tunable but still got "panic: spin lock held too long".
Comment 62 Konstantin Belousov freebsd_committer freebsd_triage 2016-11-07 18:40:03 UTC
(In reply to Wojciech Giel from comment #61)
I have no good suggestion about this case.

First, is it really true that all other CPUs are executing idle threads?  There might be one, besides the two IPI callers, that is not.

Second, look for an update of your BIOS and reflash it.  Update the PERC firmware as well.

Third, try to boot from, e.g., a USB disk. Does it work?  If the system boots, try to access the drives on the PERC controller.

Show the verbose dmesg of the successful boot.
Comment 63 Andriy Gapon freebsd_committer freebsd_triage 2016-11-08 06:57:01 UTC
(In reply to Konstantin Belousov from comment #62)
As far as I understand there has never been a successful boot on that system.
And looking through screenshots of "show active trace" output I do not see any other running thread.
Comment 64 Konstantin Belousov freebsd_committer freebsd_triage 2016-11-08 10:28:59 UTC
(In reply to Andriy Gapon from comment #63)
The sentence in the original report which makes me wonder is 'I'm getting panic consistently whenever installer tried to access hard drives.'  I am not sure whether it means that the installer program started, or that the install media panics outright on boot.

Anyway, the information I requested is what is needed to define the next steps.
Comment 65 Wojciech Giel 2016-11-08 13:38:11 UTC
(In reply to Andriy Gapon from comment #63)
This machine had FreeBSD 10.0 installed. We decided to rebuild it with 11.0. It fails at the stage of spinning up the drives: sometimes while booting the install CD, sometimes when I accept the disk layout in the installer. The BIOS and RAID firmware are up to date.
Comment 66 Andriy Gapon freebsd_committer freebsd_triage 2016-11-10 18:44:15 UTC
(In reply to Wojciech Giel from comment #65)
If that's so, could you please fulfil Konstantin's request?
That is, boot FreeBSD 10.x in verbose mode, capture dmesg output (as text) and attach it to this bug?