Created attachment 183349 [details] SMP kernel panic fix based on no-SMP kernel implementation atpic_assign_cpu() function calls panic() and kernel fails to boot. After reading the comment of atpic_assig_cpu(), I wondered if non-SMP kernel boots. The result was good - non-SMP kernel booted. I adjusted atpic_assign_cpu() to mimic non-SMP path. sys/x86/x86/intr_machdep.c: static int intr_assign_cpu(void *arg, int cpu) { #ifdef SMP struct intsrc *isrc; int error; #ifdef EARLY_AP_STARTUP MPASS(mp_ncpus == 1 || smp_started); if (cpu != NOCPU) { #else /* * Don't do anything during early boot. We will pick up the * assignment once the APs are started. */ if (assign_cpu && cpu != NOCPU) { #endif isrc = arg; sx_xlock(&intrsrc_lock); error = isrc->is_pic->pic_assign_cpu(isrc, cpu_apic_ids[cpu]); sx_xunlock(&intrsrc_lock); } else error = 0; return (error); #else return (EOPNOTSUPP); #endif } ones in the following files also return EOPNOTSUPP for non-SMP option, too. sys/x86/xen/xen_intr.c sys/sparc64/sparc64/intr_machdep.c dmsg of the machine after applying the patch: FreeBSD 11.1-PRERELEASE #1065 r319639M: Thu Jun 8 01:05:28 EDT 2017 ota@localhost:/usr/obj/usr/src/sys/GENERIC i386 FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0 ) VT(vga): text 80x25 CPU: Mobile Intel(R) Celeron(R) CPU 2.40GHz (2394.05-MHz 686-class CPU) Origin="GenuineIntel" Id=0xf29 Family=0xf Model=0x2 Stepping=9 Features=0xbfebf9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,P AT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0x4400<CNXT-ID,xTPR> real memory = 536870912 (512 MB) avail memory = 477016064 (454 MB) random: unblocking device. Timecounter "TSC-low" frequency 1197026036 Hz quality 800 taskqgroup_adjust failed cnt: 1 stride: 1 mp_ncpus: 1 smp_started: 0 taskqgroup_adjust failed cnt: 1 stride: 1 mp_ncpus: 1 smp_started: 0
atpic_assign_cpu implementation in CURRENT is also the same as 11.1-PRERELEASE. Interestingly the panic was added 10 years ago. A call path must have changed in recent change set, though, since 11.0-RELEASE. Ref - https://svnweb.freebsd.org/base/head/sys/x86/isa/atpic.c?r1=153242&r2=156124
John, would you be able to comment on this?
Created attachment 183451 [details] noassign_on_up.patch Please try this patch instead. In the !EARLY_AP_STARTUP case, the 'assign_cpu' variable is never set on a UP system, so it results in never trying to assign CPUs. This patch restores that by checking for an SMP system explicitly before trying to assign a CPU. I think the panic in the atpic method is still a good canary for ensuring the calling code is doing the right thing.
(In reply to John Baldwin from comment #3) This patch also worked.
A commit references this bug: Author: jhb Date: Wed Jun 14 13:34:10 UTC 2017 New revision: 319942 URL: https://svnweb.freebsd.org/changeset/base/319942 Log: Don't try to assign interrupts to a CPU on single-CPU systems. All interrupts are routed to the sole CPU in that case implicitly. This is a regression in EARLY_AP_STARTUP. Previously the 'assign_cpu' variable was only set when a multi-CPU system finished booting, so it's value both meant that interrupts could be assigned and that there was more than one CPU. PR: 219882 Reported by: ota@j.email.ne.jp MFC after: 3 days Changes: head/sys/x86/x86/intr_machdep.c
A commit references this bug: Author: gjb Date: Mon Jun 19 13:26:39 UTC 2017 New revision: 320097 URL: https://svnweb.freebsd.org/changeset/base/320097 Log: MFC r319942 (jhb): Don't try to assign interrupts to a CPU on single-CPU systems. All interrupts are routed to the sole CPU in that case implicitly. This is a regression in EARLY_AP_STARTUP. Previously the 'assign_cpu' variable was only set when a multi-CPU system finished booting, so its value both meant that interrupts could be assigned and that there was more than one CPU. PR: 219882 Approved by: re (kib) Sponsored by: The FreeBSD Foundation Changes: _U stable/11/ stable/11/sys/x86/x86/intr_machdep.c
Thank you for the report, and for testing the fix.