Bug 71800

Summary: 5.3-RELEASE crash (infinite IRQ list dump) (SMP-related)
Product: Base System Reporter: Vick Khera <vivek>
Component: kern Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 5.3-BETA3   
Hardware: Any   
OS: Any   

Description Vick Khera 2004-09-16 18:50:28 UTC
	

I was updating to BETA4 to test it out.  On the reboot to single user after
"make installkernel", the console went into the loop shown below.  The same
thing happened about 4 or 5 hours after BETA3 was first installed last
Thursday, while the machine was idle.

I updated a 5.2.1 system via cvsup.  The only recourse is to hit the reset
button (a serial port break to the debugger gets no response).  The loop just
keeps going and going.


System shutdown time has arrived
Shutting down daemon processes:postfix/postfix-script: stopping the Postfix
mail system
 pgsqlpg_ctl: could not find /u/data/postgres/postmaster.pid
Is postmaster running?
.
Stopping cron.
Shutting down local daemons:.
Writing entropy file:.
Terminated
.
Sep SSWaiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...7 2 0 0 0 interrupt                   total
irq0: clk                       53320822
irq1: atkbd0                           6
irq4: sio0                          2982
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320318
interrupt                   total
irq0: clk                       53320823
irq1: atkbd0                           6
irq4: sio0                          2983
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320320
interrupt                   total
irq0: clk                       53320824
irq1: atkbd0                           6
irq4: sio0                          2984
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320322
interrupt                   total
irq0: clk                       53320825
irq1: atkbd0                           6
irq4: sio0                          2985
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320324
interrupt                   total
irq0: clk                       53320826
irq1: atkbd0                           6
irq4: sio0                          2986
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320326
interrupt                   total
irq0: clk                       53320827
irq1: atkbd0                           6
irq4: sio0                          2987
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320328
interrupt                   total
irq0: clk                       53320828
irq1: atkbd0                           6
irq4: sio0                          2988
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320330
interrupt                   total
irq0: clk                       53320829
irq1: atkbd0                           6
irq4: sio0                          2989
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320332
interrupt                   total
irq0: clk                       53320830
irq1: atkbd0                           6
irq4: sio0                          2990
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320334
interrupt                   total
irq0: clk                       53320831
irq1: atkbd0                           6
irq4: sio0                          2991
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320336
interrupt                   total
irq0: clk                       53320832
irq1: atkbd0                           6
irq4: sio0                          2992
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320338
interrupt                   total
irq0: clk                       53320833
irq1: atkbd0                           6
irq4: sio0                          2993
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320340
interrupt                   total
irq0: clk                       53320834
irq1: atkbd0                           6
irq4: sio0                          2994
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320342
interrupt                   total
irq0: clk                       53320835
irq1: atkbd0                           6
irq4: sio0                          2995
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320344
interrupt                   total
irq0: clk                       53320836
irq1: atkbd0                           6
irq4: sio0                          2996
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320346
interrupt                   total
irq0: clk                       53320837
irq1: atkbd0                           6
irq4: sio0                          2997
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320348
interrupt                   total
irq0: clk                       53320838
irq1: atkbd0                           6
irq4: sio0                          2998
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320350
interrupt                   total
irq0: clk                       53320839
irq1: atkbd0                           6
irq4: sio0                          2999
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320352
interrupt                   total
irq0: clk                       53320840
irq1: atkbd0                           6
irq4: sio0                          3000
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320354
interrupt                   total
irq0: clk                       53320841
irq1: atkbd0                           6
irq4: sio0                          3001
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320356
interrupt                   total
irq0: clk                       53320842
irq1: atkbd0                           6
irq4: sio0                          3002
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320358
interrupt                   total
irq0: clk                       53320843
irq1: atkbd0                           6
irq4: sio0                          3003
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320360
interrupt                   total
irq0: clk                       53320844
irq1: atkbd0                           6
irq4: sio0                          3004
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320362
interrupt                   total
irq0: clk                       53320845
irq1: atkbd0                           6
irq4: sio0                          3005
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320364
interrupt                   total
irq0: clk                       53320846
irq1: atkbd0                           6
irq4: sio0                          3006
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320366
interrupt                   total
irq0: clk                       53320847
irq1: atkbd0                           6
irq4: sio0                          3007
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320368
interrupt                   total
irq0: clk                       53320848
irq1: atkbd0                           6
irq4: sio0                          3008
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320370
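
Comparing successive dumps shows which counters move between iterations.
The following is a minimal Python sketch, not part of the original report.
It uses the first two dumps from the log above; the helper names parse_dump
and changed_counters are made up for illustration.

```python
import re

# First two interrupt dumps, copied from the console log above.
DUMP_A = """\
irq0: clk                       53320822
irq1: atkbd0                           6
irq4: sio0                          2982
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320318
"""

DUMP_B = """\
irq0: clk                       53320823
irq1: atkbd0                           6
irq4: sio0                          2983
irq6: fdc0                             4
irq8: rtc                       68240433
irq10: fxp0                       184517
irq11: atapci1                  23942596
irq13: npx0                            2
irq14: ata0                      3628908
irq15: ata1                           48
Total                   149320320
"""

# A counter line is a name, some whitespace, then a decimal count.
LINE_RE = re.compile(r"^(.+?)\s+(\d+)\s*$")


def parse_dump(text):
    """Map each counter name (e.g. 'irq0: clk') to its integer count."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def changed_counters(before, after):
    """Return {name: delta} for counters whose value differs between dumps."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


if __name__ == "__main__":
    a, b = parse_dump(DUMP_A), parse_dump(DUMP_B)
    for name, delta in changed_counters(a, b).items():
        print(f"{name}: +{delta}")
```

For these two dumps only irq0 (clk), irq4 (sio0) and Total change, by 1, 1
and 2 respectively.  The later dumps in the log follow the same pattern,
while the rtc, disk and network counters stay fixed.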

Fix: 

Don't know.  Hopefully BETA4 won't crash thusly.
How-To-Repeat:
Don't know.  It just happened twice.  FreeBSD 5.2.1 has been otherwise rock
solid stable on this machine under pretty stressful postgres database
pre-production testing.
Comment 1 Vick Khera 2004-09-23 18:49:48 UTC
FWIW, this exact same thing happened again when rebooting to update to 
BETA5.  Otherwise it was pretty stable during some pretty heavy 
database testing.
Comment 2 Vick Khera 2004-09-27 19:08:14 UTC
I think I may have more of a clue now.  If I unmount my ccd partition 
(2 SATA disks in a stripe) before shutdown, it tends not to happen.  On 
BETA6 it happened once, and did not happen the one time I unmounted 
first.

My ccd.conf is this:

ccd0            126     0       /dev/ad4s1a /dev/ad6s1a
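
For reference, the fields of that line follow the ccd.conf layout described
in ccdconfig(8): ccd device, interleave, flags, then the component devices.
Below is a small Python sketch, not from the original report, that splits
the line into those fields.  parse_ccd_line is a hypothetical helper, and
the 512-byte DEV_BSIZE used to convert the interleave to bytes is an
assumption.

```python
# Assumed block size; ccdconfig(8) expresses the interleave in DEV_BSIZE units.
DEV_BSIZE = 512


def parse_ccd_line(line):
    """Split one ccd.conf line into its documented fields."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected: ccd ileave flags component [component ...]")
    return {
        "device": fields[0],
        "interleave": int(fields[1]),  # in DEV_BSIZE blocks
        "flags": fields[2],            # kept as text; may be numeric or symbolic
        "components": fields[3:],
    }


if __name__ == "__main__":
    entry = parse_ccd_line("ccd0            126     0       /dev/ad4s1a /dev/ad6s1a")
    print(entry["device"], entry["components"])
    print("interleave bytes:", entry["interleave"] * DEV_BSIZE)
```

Under that assumption the interleave of 126 blocks is 64512 bytes, striped
across the two components /dev/ad4s1a and /dev/ad6s1a.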
Comment 3 Vick Khera 2004-12-21 16:44:10 UTC
Unfortunately, I am observing this with 5-STABLE (as of December 19, 
2004) on an Intel Xeon processor running FreeBSD-amd and a custom 
kernel.  The 5.3-RELEASE generic kernel did not exhibit this the few 
times I rebooted, but it also didn't support the NIC on the 
motherboard, so I had to upgrade.  This is on a brand new Dell PE800 
tower (Dell's diags report no errors on extended testing).  This box 
has a SATA RAID (aac device) and no ccd.

Essentially, I cannot reboot these boxes without having to manually hit 
the reset/power button after shutdown.
Comment 4 Vick Khera 2005-01-18 04:01:26 UTC
FWIW, I just put 5.3-STABLE as of Jan 17 on a dual Opteron system with 
4GB ram, and it never exhibits this problem.  It always reboots 
successfully.  It has an amr RAID controller device and 8 disks.
Comment 5 John Baldwin freebsd_committer freebsd_triage 2005-11-23 14:26:55 UTC
Can you test 5.4 and 6.0 to see if they exhibit the same problem?  The
interrupt output you are seeing is from the DDB command 'show intrcnt', and I
have no idea why that function would be called during shutdown.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Comment 6 Vick Khera 2005-11-23 15:23:24 UTC
On Nov 23, 2005, at 9:26 AM, John Baldwin wrote:

> Can you test 5.4 and 6.0 to see if they exhibit the same problem?  The
> interrupt output you are seeing is from the DDB command 'show  
> intrcnt', and I
> have no idea why that function would be called during shutdown.
>
> -- 
> John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

Man, I forgot all about this PR....

Apparently 5.4 and/or 6.0 solve the problem, since I no longer see 
this happen.

I suppose we can close this PR now.
Comment 7 John Baldwin freebsd_committer freebsd_triage 2005-11-23 19:01:25 UTC
State Changed
From-To: open->closed

Submitter reports it is fixed in 5.4 and 6.0.