Bug 66314 - SMP kernel panic: ipi_send: couldn't send ipi
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: sparc64
Version: 5.2-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-sparc64 (Nobody)
Reported: 2004-05-06 07:50 UTC by bel
Modified: 2004-10-01 04:32 UTC (History)

Attachments
ipi_send.patch (1.08 KB, patch)
2004-05-06 07:50 UTC, bel

Description bel 2004-05-06 07:50:20 UTC
	I have two Ultra 60 machines. Periodically they lock up hard or panic.
	Example:
======================================================================
panic: ipi_send: couldn't send ipi
at line 455 in file /usr/src/sys/sparc64/sparc64/mp_machdep.c
cpuid = 0;
Debugger("panic")
======================================================================

There is no "db>" prompt afterwards. Sometimes the kernel does enter DDB,
but even then it is not possible to save a core dump.

Fix: Increase IPI_RETRIES to a large value?
How-To-Repeat: 	I added some debug code (see attached patch) to the kernel
and increased IPI_RETRIES from 100 to 1000000.
	The debug code keeps a cyclic array that stores the last 32 iteration
counts. It also has a max_ipi_retries variable that stores the largest
iteration count seen so far.
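
The instrumentation described above can be sketched in userland C roughly as
follows. This is not the attached ipi_send.patch, only an illustration of the
idea: the names ipi_retry_history, ipi_record_retries, ipi_send_instrumented
and stub_try_send are hypothetical, and the real kernel code polls hardware
registers instead of calling a callback. Only IPI_RETRIES, the history length
of 32 and max_ipi_retries come from the report.

```c
#include <stddef.h>

#define IPI_RETRIES	1000000	/* raised from the default of 100 */
#define IPI_HISTORY_LEN	32	/* number of recent iteration counts kept */

/* Cyclic array holding the iteration counts of the last 32 sends. */
static unsigned long ipi_retry_history[IPI_HISTORY_LEN];
static size_t ipi_history_next;

/* Largest iteration count seen so far; inspected from kgdb. */
unsigned long max_ipi_retries;

/* Record how many iterations one IPI send needed. */
static void
ipi_record_retries(unsigned long iterations)
{
	ipi_retry_history[ipi_history_next] = iterations;
	ipi_history_next = (ipi_history_next + 1) % IPI_HISTORY_LEN;
	if (iterations > max_ipi_retries)
		max_ipi_retries = iterations;
}

/*
 * Bounded retry loop. try_send is a hypothetical stand-in for the
 * hardware dispatch; it returns nonzero once the send is accepted.
 * Returns 0 on success, -1 when the limit is exhausted (the point
 * where the kernel would panic).
 */
static int
ipi_send_instrumented(int (*try_send)(void *), void *arg)
{
	unsigned long i;

	for (i = 1; i <= IPI_RETRIES; i++) {
		if (try_send(arg)) {
			ipi_record_retries(i);
			return (0);
		}
	}
	ipi_record_retries(IPI_RETRIES);
	return (-1);
}

/* Demonstration stub: fails until the counter in arg reaches zero. */
static int
stub_try_send(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (0);
	}
	return (1);
}
```

With a scheme like this, a send that succeeds on the first attempt records 1,
and a send that fails 2021 times before succeeding records 2022.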

	Boot the machine and start kgdb:

root@bel# gdb -k /usr/obj/usr/src/sys/SUNC3D/kernel.debug /dev/mem
GNU gdb 5.3 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc64-portbld-freebsd5.2"...
panic messages:
---
---
#0  sched_switch (td=0xc03c1288) at /usr/src/sys/kern/sched_ule.c:1186
1186                    cpu_switch(td, newtd);
(kgdb) p max_ipi_retries
$1 = 1
(kgdb) quit

root@bel# ls -laR /
[10 seconds of output skipped]
^C

root@bel# gdb -k /usr/obj/usr/src/sys/SUNC3D/kernel.debug /dev/mem
[...]
(kgdb) p max_ipi_retries
$1 = 2022

Wow! max_ipi_retries is about 20 times higher than the default limit (100).
Comment 1 Hannes Mehnert 2004-09-14 04:13:17 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I got the same panic on a sun enterprise 450 with 2 * 300 MHz, with
a FreeBSD 5.2-CURRENT from July, 27. Will test the patch.

I also get a stack backtrace:

KDB: stack backtrace:
cpu_ipi_send() at cpu_ipi_send+0xb8
cpu_ipi_selected() at cpu_ipi_selected+0x38
sleepq_resume_thread() at sleepq_resume_thread+0x8c
softclock() at softclock+0x218
ithread_loop() at ithread_loop+0x250
fork_exit() at fork_exit+0x9c
fork_trampoline() at fork_trampoline+0x8
KDB: enter: panic

Best Regards,

Hannes Mehnert
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (FreeBSD)

iD8DBQFBRmHKRcuNlziBjRwRAgGFAKC7qmdP7VGnF5cyupUDKzH3Iu2dPACeMWLT
D9nXwGkFOeGw1q6RMYLGeYA=
=DKTz
-----END PGP SIGNATURE-----
Comment 3 kensmith freebsd_committer freebsd_triage 2004-10-01 04:27:23 UTC
State Changed
From-To: open->closed

Hopefully this will be taken care of by src/sys/sparc64/include/smp.h v1.17.