Created attachment 175614 [details] panic screenshot
Hello, I'm trying to install FreeBSD 11.0-RELEASE on a Dell PowerEdge R620 with a PERC H310 Mini RAID controller. The controller is currently configured in JBOD mode. I get a panic consistently whenever the installer tries to access the hard drives. I have tried ZFS and UFS, with and without RAID. The problem doesn't exist when I install FreeBSD 10.3. I have this machine for testing for a while, so I can help with debugging. Cheers, Wojciech
Would it be possible for you to try the latest snapshot from ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/12.0/ in a suitable format? The kernel in the snapshot should have more debugging facilities compiled in, which might help to get more information.
Created attachment 175706 [details] panic with 12.0-CURRENT
I tried installing 12-CURRENT and got the same kernel panic; see the attachment. I replaced the H310 controller with an H710, but the behaviour is the same.
(In reply to Wojciech Giel from comment #3)
Could you please reproduce this again? Once you are at the 'db>' prompt, please execute the following commands:
db> bt
db> tid 10186 [*]
db> bt
where '10186' is a placeholder for the id reported in the "too long" message. Please capture the output. Thank you.
Created attachment 175708 [details] bt tid 1
Created attachment 175709 [details] bt tid 2
Apologies for my conventions, but please do not enter '[*]'; it should be just 'tid <number>'. I used '[*]' only to draw your attention to the fact that '10186' should not be entered verbatim. Sorry. Could you please get the stack traces again?
Also, it's not 'bt tid xxxx' on one line. Those are 3 separate commands: 'bt', 'tid xxxx', 'bt'. You hit enter after typing each.
Created attachment 175735 [details] bt tid 3
Created attachment 175736 [details] bt tid 4
tid 100245 returns No such command?
(In reply to Wojciech Giel from comment #11) I apologise again, I confused a kgdb command with a ddb command. So, instead of 'tid' it should be 'thread'. Could you please obtain the outputs again?
Created attachment 175739 [details] bt thread not much there
(In reply to Wojciech Giel from comment #13)
I am a little bit confused. Could you please do the following?
<get the panic>
<take a picture>
bt
<take a picture>
thread xxxx
bt
<take a picture>
So, I want to get 3 pictures of the same panic. Thank you.
Created attachment 175742 [details] bt
Created attachment 175743 [details] thread
Created attachment 175744 [details] bt second time got three separate screenshots
(In reply to Wojciech Giel from comment #17) Well, my instructions started with <get the panic> <take a picture> bt So, the first picture should be taken before any commands. Please take another 3 pictures according to the instructions in comment #14.
Created attachment 175747 [details] 01.panic
Created attachment 175748 [details] 02.bt
Created attachment 175749 [details] 03.thread
Created attachment 175750 [details] 04.bt
Using your latest pictures as an example, you did 'thread 100186', but what I asked you to do was 'thread 100346'. Remember, I said the id reported in the "too long" message. That message is at the top of the first picture. So, could you please do this once again using the correct thread number?
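For anyone following along, here is a small sketch of how the thread number maps to the three ddb commands. It is only an illustration: the helper name `ddb_commands` is made up, the sample message uses placeholders for everything except the tid, and the only assumption about the message format is that the id appears as "tid <number>".

```python
import re
from typing import List


def ddb_commands(too_long_message: str) -> List[str]:
    """Return the three ddb commands to type for a given "too long" message."""
    # Assumption: the thread id appears in the message as "tid <number>".
    match = re.search(r"\btid (\d+)", too_long_message)
    if match is None:
        raise ValueError("no 'tid <number>' found in the message")
    # bt for the current thread, switch to the thread named in the message,
    # then bt again for that thread.
    return ["bt", "thread " + match.group(1), "bt"]


if __name__ == "__main__":
    # Placeholder message: only the tid (taken from the first picture) is real.
    sample = "spin lock <addr> (<name>) held by <addr> (tid 100346) too long"
    for command in ddb_commands(sample):
        print("db> " + command)
```

Each command is entered on its own line at the 'db>' prompt, with a picture taken after each backtrace.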
Created attachment 175752 [details] 01.panic
Created attachment 175753 [details] 02.bt
Created attachment 175754 [details] 03.thread
Created attachment 175755 [details] 04.thread cd sorry. it was a bit confusing.
(In reply to Wojciech Giel from comment #27) No worries. Thank you very much! The latest information is something to chew on.
A fellow developer suggests that the following command could provide more interesting information:
show active trace
Could you please reproduce the panic and run that command? It should produce several screenfuls of output on your system, and it's important to capture them all. Thank you.
show active trace gives: "No such command"
There are newer snapshots available now, and they should have the command. If you still have access to the hardware and are interested in debugging this issue, could you please try a newer snapshot? Thank you.
Created attachment 176718 [details] act01
Created attachment 176719 [details] act02
Created attachment 176720 [details] act03
Created attachment 176721 [details] act04
Created attachment 176722 [details] act05
Created attachment 176723 [details] act06
Created attachment 176724 [details] act06
Created attachment 176725 [details] act07
Created attachment 176726 [details] act08
Created attachment 176727 [details] act09
Created attachment 176728 [details] act10
Created attachment 176729 [details] act11
Created attachment 176730 [details] act12
Created attachment 176731 [details] act13
Created attachment 176732 [details] act14
Created attachment 176733 [details] act15
Created attachment 176734 [details] act16
Created attachment 176735 [details] act17
Created attachment 176736 [details] act18
Created attachment 176738 [details] act19
Created attachment 176739 [details] act20
Created attachment 176740 [details] act21
Created attachment 176741 [details] act22
Created attachment 176742 [details] act23
Created attachment 176743 [details] act24
Created attachment 176744 [details] act25
Created attachment 176745 [details] act26
(In reply to Andriy Gapon from comment #29) uploaded "several screenshot" :-).
(In reply to Wojciech Giel from comment #59)
Thank you!
TLDR version of the screenshots for anyone else interested in the problem: one thread panics because it waits "too long" to acquire the IPI spin lock; the lock is held by a thread waiting for the targeted TLB shootdown to be executed by the other CPUs; the rest of the CPUs are idle, and acpi_cpu_idle_mwait is used for that.
Interesting attachments:
https://bz-attachments.freebsd.org/attachment.cgi?id=176718
https://bz-attachments.freebsd.org/attachment.cgi?id=176719
https://bz-attachments.freebsd.org/attachment.cgi?id=176720
Wojciech, could you please try setting the following at the loader prompt before booting the kernel?
debug.acpi.disabled=mwait
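To make the TLDR easier to picture, here is a rough user-space model of the scenario in Python. It is not kernel code and only a sketch under stated assumptions: `simulate`, `other_cpus_respond` and `spin_budget_s` are hypothetical names, a `threading.Lock` stands in for the IPI spin lock, an `Event` stands in for the other CPUs acknowledging the TLB shootdown, and the 0.2 second budget is arbitrary and unrelated to whatever limit the kernel uses.

```python
import threading
import time


def simulate(other_cpus_respond: bool, spin_budget_s: float = 0.2) -> str:
    """Model one thread spinning on a lock whose holder waits for an ack."""
    ipi_lock = threading.Lock()          # stands in for the IPI spin lock
    lock_taken = threading.Event()       # the holder has acquired the lock
    shootdown_acked = threading.Event()  # other CPUs executed the shootdown

    def shootdown_initiator() -> None:
        with ipi_lock:
            lock_taken.set()
            shootdown_acked.wait()       # lock stays held until the ack arrives

    holder = threading.Thread(target=shootdown_initiator)
    holder.start()
    lock_taken.wait()

    if other_cpus_respond:
        shootdown_acked.set()

    # The second thread polls the lock until its (hypothetical) budget runs out.
    result = "spin lock held too long"
    deadline = time.monotonic() + spin_budget_s
    while time.monotonic() < deadline:
        if ipi_lock.acquire(blocking=False):
            ipi_lock.release()
            result = "lock acquired"
            break
        time.sleep(0.001)

    # Cleanup for the simulation only; a real kernel would have panicked here.
    shootdown_acked.set()
    holder.join()
    return result


if __name__ == "__main__":
    print(simulate(other_cpus_respond=True))
    print(simulate(other_cpus_respond=False))
```

When the acknowledgement never arrives, the waiting thread exhausts its budget while the holder is still blocked, which mirrors the situation visible in the "show active trace" screenshots.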
(In reply to Andriy Gapon from comment #60) I set that tunable but still got the panic: spin lock held too long
(In reply to Wojciech Giel from comment #61) I have no good suggestion about this case. First, is it really true that all other CPUs are executing idle threads? There might be one, besides the two IPI callers, which is not. Second, look for an update of your BIOS and reflash it. Update the PERC firmware as well. Third, try to boot e.g. from a USB disk; does it work? If the system boots, try to access the drives on the PERC controller. Show the verbose dmesg of the successful boot.
(In reply to Konstantin Belousov from comment #62) As far as I understand there has never been a successful boot on that system. And looking through screenshots of "show active trace" output I do not see any other running thread.
(In reply to Andriy Gapon from comment #63) The sentence in the original report which makes me wonder is 'I'm getting panic consistently whenever installer tried to access hard drives.' I am not sure whether it means that the installer program started, or that the install media panics outright on boot. Anyway, the information I requested is what is needed to define the next steps.
(In reply to Andriy Gapon from comment #63) This machine had FreeBSD 10.0 installed. We decided to rebuild it with 11.0. It fails at the stage of spinning up the drives: sometimes while booting the install CD, sometimes when I accept the disk layout in the installer. The BIOS and RAID firmware are up to date.
(In reply to Wojciech Giel from comment #65) If that's so, could you please fulfil Konstantin's request? That is, boot FreeBSD 10.x in verbose mode, capture dmesg output (as text) and attach it to this bug?