Bug 197756

Summary: System stops booting with "ACPI APIC Table: <SECCSD LH43STAR>"
Product: Base System
Reporter: Beeblebrox <zaphod>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED
Severity: Affects Only Me
CC: jhb
Priority: ---
Version: CURRENT
Hardware: amd64
OS: Any
See Also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196542

Description Beeblebrox 2015-02-17 09:02:37 UTC
* I'm PXE/diskless booting a Samsung RS40 laptop (Intel i3 CPU, Radeon Manhattan Mobility HD-5400 GPU, 3GB RAM).
* I built a new kernel yesterday and got this error. The kernel I was using from Feb.04.2015 did not have this problem, so the regression is recent.
* Same kernel on my other hardware (not laptops) does not encounter this problem.
* Problem possibly similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196542
* Booting halts and system hangs with:

ACPI APIC Table: <SECCSD LH43STAR>
panic: Failed to deliver first STARTUP IPI to APIC 1
cpuid=0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff81bd4af0
vpanic() at vpanic+0x189/frame 0xffffffff81bd4b70
panic() at panic+0x43/frame 0xffffffff81bd4bd0
ipi_startup() at ipi_startup+0x5b/frame 0xffffffff81bd4bf0
native_start_all_aps() at native_start_all_aps+0x232/frame 0xffffffff81bd4c40
cpu_mp_start() at cpu_mp_start+0x342/frame 0xffffffff81bd4c70
mp_start() at mp_start+0x3d/frame 0xffffffff81bd4c90
mi_startup() at mi_startup+0x118/frame 0xffffffff81bd4cb0
btext() at btext+0x2c
KDB: enter: panic  /EOF
Comment 1 John Baldwin freebsd_committer freebsd_triage 2015-02-18 15:17:23 UTC
Can you try increasing the lengths of the delays in ipi_startup() in sys/amd64/amd64/mp_machdep.c?  (The lapic_ipi_wait() calls)
Comment 2 Beeblebrox 2015-02-23 11:28:19 UTC
Hi. I increased all of them from 20 to 40: "lapic_ipi_wait(40);"
This kernel is not DEBUG enabled, but I get the same end result:
(ACPI APIC Table: <SECCSD LH43STAR>
panic: Failed to deliver first STARTUP IPI to APIC 1)
Comment 3 Beeblebrox 2015-03-01 13:37:06 UTC
The NIC driver on this laptop is msk, so the problem is probably the issue described here: http://freebsd.1045724.n5.nabble.com/msk0-watchdog-timeout-Marvel-88E8071-td5990857.html
Comment 4 Beeblebrox 2015-03-01 13:52:02 UTC
Can't edit the last message, so I have to post a new one.

* As stated in the first post, the kernel from Feb.04.2015 did not have this problem, while the kernel from Feb.16.2015 did. I hope that narrows down the time frame for you.
* This is not the first incident with if_msk: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=166442
Comment 5 John Baldwin freebsd_committer freebsd_triage 2015-03-11 14:26:41 UTC
Can you keep increasing the value passed to lapic_ipi_wait() until it works?  This is not going to be caused by msk(4).
Comment 6 Beeblebrox 2015-03-26 14:29:33 UTC
Hi. I bumped the lapic_ipi_wait() value to 60 and it worked. I then lowered it to 50 and, strangely, got mixed results (intermittent and unpredictable success/failure).
A value of 60 does not seem to cause further ACPI problems.

Regards.
Comment 7 John Baldwin freebsd_committer freebsd_triage 2015-03-30 17:20:10 UTC
To be clear, something like this works for you?  (I'm just going to bump it up to 100 to be extra safe):

Index: mp_machdep.c
===================================================================
--- mp_machdep.c	(revision 280857)
+++ mp_machdep.c	(working copy)
@@ -1084,7 +1084,7 @@ ipi_startup(int apic_id, int vector)
 	 */
 	lapic_ipi_raw(APIC_DEST_DESTFLD | APIC_TRIGMOD_LEVEL |
 	    APIC_LEVEL_ASSERT | APIC_DESTMODE_PHY | APIC_DELMODE_INIT, apic_id);
-	lapic_ipi_wait(20);
+	lapic_ipi_wait(100);
 
 	/* Explicitly deassert the INIT IPI. */
 	lapic_ipi_raw(APIC_DEST_DESTFLD | APIC_TRIGMOD_LEVEL |
@@ -1104,7 +1104,7 @@ ipi_startup(int apic_id, int vector)
 	lapic_ipi_raw(APIC_DEST_DESTFLD | APIC_TRIGMOD_EDGE |
 	    APIC_LEVEL_ASSERT | APIC_DESTMODE_PHY | APIC_DELMODE_STARTUP |
 	    vector, apic_id);
-	if (!lapic_ipi_wait(20))
+	if (!lapic_ipi_wait(100))
 		panic("Failed to deliver first STARTUP IPI to APIC %d",
 		    apic_id);
 	DELAY(200);		/* wait ~200uS */
@@ -1118,7 +1118,7 @@ ipi_startup(int apic_id, int vector)
 	lapic_ipi_raw(APIC_DEST_DESTFLD | APIC_TRIGMOD_EDGE |
 	    APIC_LEVEL_ASSERT | APIC_DESTMODE_PHY | APIC_DELMODE_STARTUP |
 	    vector, apic_id);
-	if (!lapic_ipi_wait(20))
+	if (!lapic_ipi_wait(100))
 		panic("Failed to deliver second STARTUP IPI to APIC %d",
 		    apic_id);
Comment 8 Beeblebrox 2015-03-30 17:44:48 UTC
Hi John,
Yes, that solves the problem for me (at a value of 60).
Regards.
Comment 9 commit-hook freebsd_committer freebsd_triage 2015-03-30 20:14:21 UTC
A commit references this bug:

Author: jhb
Date: Mon Mar 30 20:13:24 UTC 2015
New revision: 280866
URL: https://svnweb.freebsd.org/changeset/base/280866

Log:
  Wait 100 microseconds for a local APIC to dispatch each startup-related IPI
  rather than 20.  The MP 1.4 specification states in Appendix B.2:

    "A period of 20 microseconds should be sufficient for IPI dispatch to
     complete under normal operating conditions".

  (Note that this appears to be separate from the 10 millisecond (INIT) and
  200 microsecond (STARTUP) waits after the IPIs are dispatched.)  The
  Intel SDM is silent on this issue as far as I can tell.

  At least some hardware requires 60 microseconds as noted in the PR, so
  bump this to 100 to be on the safe side.

  PR:		197756
  Reported by:	zaphod@berentweb.com
  MFC after:	1 week

Changes:
  head/sys/amd64/amd64/mp_machdep.c
  head/sys/i386/i386/mp_machdep.c
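
For readers following along, here is a minimal sketch of the idea behind a microsecond-bounded dispatch wait, and of why a 20 us budget fails on hardware that needs about 60 us while 100 us succeeds. It is not the kernel's lapic_ipi_wait() implementation. The names fake_lapic, fake_delay_1us(), fake_ipi_pending() and ipi_wait_sketch() are hypothetical, and the local APIC is replaced by a simple countdown so the code can run in userland.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a local APIC with an IPI still in flight. */
struct fake_lapic {
	unsigned busy_for_us;	/* microseconds until the IPI is dispatched */
};

/* Simulated 1 us delay: time only advances for the fake APIC. */
static void
fake_delay_1us(struct fake_lapic *la)
{
	if (la->busy_for_us > 0)
		la->busy_for_us--;
}

/* True while the simulated delivery-status bit would still read "pending". */
static bool
fake_ipi_pending(const struct fake_lapic *la)
{
	return (la->busy_for_us > 0);
}

/*
 * Poll once per simulated microsecond.  Returns true if the IPI was
 * dispatched within timeout_us, false if the budget ran out first.
 */
static bool
ipi_wait_sketch(struct fake_lapic *la, unsigned timeout_us)
{
	for (unsigned waited = 0; ; waited++) {
		if (!fake_ipi_pending(la))
			return (true);
		if (waited >= timeout_us)
			return (false);
		fake_delay_1us(la);
	}
}
```

With a simulated APIC that needs 60 us, ipi_wait_sketch(&la, 20) returns false (the case that panicked in this PR), while a budget of 100 returns true.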
Comment 10 commit-hook freebsd_committer freebsd_triage 2015-04-15 16:52:54 UTC
A commit references this bug:

Author: jhb
Date: Wed Apr 15 16:52:36 UTC 2015
New revision: 281560
URL: https://svnweb.freebsd.org/changeset/base/281560

Log:
  MFC 278325,280866:
  278325:
  Revert the IPI startup sequence to match what is described in the
  Intel Multiprocessor Specification v1.4.  The Intel SDM claims that
  the INIT IPIs here are invalid, but other systems follow the MP
  spec instead.

  While here, fix the IPI wait routine to accept a timeout in microseconds
  instead of a raw spin count, and don't spin forever during AP startup.
  Instead, panic if a STARTUP IPI is not delivered after 20 us.

  280866:
  Wait 100 microseconds for a local APIC to dispatch each startup-related IPI
  rather than 20.  The MP 1.4 specification states in Appendix B.2:

    "A period of 20 microseconds should be sufficient for IPI dispatch to
     complete under normal operating conditions".

  (Note that this appears to be separate from the 10 millisecond (INIT) and
  200 microsecond (STARTUP) waits after the IPIs are dispatched.)  The
  Intel SDM is silent on this issue as far as I can tell.

  At least some hardware requires 60 microseconds as noted in the PR, so
  bump this to 100 to be on the safe side.

  PR:		196542, 197756

Changes:
_U  stable/10/
  stable/10/sys/amd64/amd64/mp_machdep.c
  stable/10/sys/i386/i386/mp_machdep.c
  stable/10/sys/x86/x86/local_apic.c