Bug 243534 - Kernel panics with "panic: invalid count 2" early during boot
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: sparc64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-01-23 06:43 UTC by Michael Reim
Modified: 2020-01-23 17:19 UTC

See Also:


Attachments
dmesg.boot contents from 12.1 (3.75 KB, text/plain)
2020-01-23 06:43 UTC, Michael Reim

Description Michael Reim 2020-01-23 06:43:06 UTC
Created attachment 210979
dmesg.boot contents from 12.1

A proposed fix for unwinding on SPARC64 (r356552) is looking for testers. I'd like to give it a try, but I cannot get any -CURRENT kernel to boot on my machine. Cross-compiled and natively built kernels both hit the same problem, so it's likely not a GCC9 issue.

I'll attach a dmesg.boot file from a natively built 12-STABLE system to show what hardware the machine has. The newest kernel that I tested is a cross-built r356986. I re-read AF3e's chapter on crash dumps, trying to provide something useful, but I guess that the crash happens too early for the system to dump anything yet. So here's the serial output that I get:

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...                                   
jumping to kernel entry at 0xc00b8020.  
GDB: no debug ports present                                                                                                                      
KDB: debugger backends: ddb   
KDB: current backend: ddb                                               
Copyright (c) 1992-2020 The FreeBSD Project.                       
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.                        
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #0 r356986: Wed Jan 22 16:54:54 CET 2020                                                                                    
    root@fbsdtest.omc.net:/usr/obj/usr/src/sparc64.sparc64/sys/GENERIC sparc64
gcc version 9.2.0 (FreeBSD Ports Collection for sparc64)                                                                                         
WARNING: WITNESS option enabled, expect reduced performance.
real memory  = 1073741824 (1024 MB)                                     
avail memory = 1024761856 (977 MB)
cpu0: Sun Microsystems UltraSparc-IIe Processor (548.00 MHz CPU)
random: unblocking device.
random: entropy device external interface
[ath_hal] loaded
WARNING: Device "kbd" is Giant locked and may be deleted before FreeBSD 13.0.
kbd0 at kbdmux0
WARNING: Device "openfirm" is Giant locked and may be deleted before FreeBSD 13.0.
WARNING: Device "openprom" is Giant locked and may be deleted before FreeBSD 13.0.
nexus0: <Open Firmware Nexus device>
pcib0: <U2P UPA-PCI bridge> mem 0x1fe00000000-0x1fe0000ffff,0x1fe01000000-0x1fe010000ff irq 2032,2030,2031,2021 on nexus0
pcib0: Sabre, impl 0, version 0, IGN 0x1f, bus A, 66MHz
pcib0: DVMA map: 0x60000000 to 0x63ffffff 8192 entries
pci0: <OFW PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <old, non-VGA display device> at device 3.0 (no driver attached)
dc0: <Davicom DM9102A 10/100BaseTX> port 0x10000-0x100ff mem 0-0xff at device 12.0 on pci0
miibus0: <MII bus> on dc0
amphy0: <DM9102 10/100 media interface> PHY 1 on miibus0           
amphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:03:ba:4e:55:e6
dc1: <Davicom DM9102A 10/100BaseTX> port 0x10100-0x101ff mem 0x2000-0x20ff at device 5.0 on pci0
miibus1: <MII bus> on dc1
amphy1: <DM9102 10/100 media interface> PHY 1 on miibus1
amphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: Ethernet address: 00:03:ba:4e:55:e6
ohci0: <AcerLabs M5237 (Aladdin-V) USB controller> mem 0x1000000-0x1000fff at device 10.0 on pci0
usbus0 on ohci0
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
atapci0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
ata2: <ATA channel> at channel 0 on atapci0
ata3: <ATA channel> at channel 1 on atapci0
cryptosoft0: <software crypto> on nexus0
nexus0: <syscons> type unknown (no driver attached)
rtc0: <Real-Time Clock> at port 0x70-0x71 pnpid PNP0b00 on isa0
rtc0: registered as a time-of-day clock, resolution 1.000000s
uart0: console (9600,n,8,1)> at port 0x3f8-0x3ff irq 43 pnpid PNP0501 on isa0
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 pnpid PNP0501 on isa0
Timecounter "tick" frequency 548000000 Hz quality 1000
Event timer "tick" frequency 548000000 Hz quality 1000
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
Obsolete code will be removed soon: random(9) is the obsolete Park-Miller LCG from 1988
panic: invalid count 2
cpuid = 0
time = 1
KDB: stack backtrace:
_end() at 0xc1416fb8
vpanic() at vpanic+0x31c
panic() at panic+0x20
sched_switch() at sched_switch+0x8ac
mi_switch() at mi_switch+0x1dc
critical_exit_preempt() at critical_exit_preempt+0x88
spinlock_exit() at spinlock_exit+0x70
__mtx_unlock_spin_flags() at __mtx_unlock_spin_flags+0xb0
sched_add() at sched_add+0x2e8
gtaskqueue_start_threads() at gtaskqueue_start_threads+0x254
taskqgroup_cpu_create() at taskqgroup_cpu_create+0x124
taskqgroup_adjust() at taskqgroup_adjust+0x280
taskqgroup_adjust_softirq() at taskqgroup_adjust_softirq+0x34
mi_startup() at mi_startup+0x32c
btext() at btext+0x28
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x80: ta              %xcc, 1
db>

I'll gladly provide additional information if required. (BTW for those who care: The binutils fix for SPARC64 was accepted upstream.)
Comment 1 Ed Maste freebsd_committer freebsd_triage 2020-01-23 14:11:06 UTC
This happened on other architectures after r355784; see the thread at https://lists.freebsd.org/pipermail/svn-src-all/2019-December/191362.html

It was fixed for others in r355819.  I'll apply the same change to sparc64.
Comment 2 commit-hook freebsd_committer freebsd_triage 2020-01-23 14:11:35 UTC
A commit references this bug:

Author: emaste
Date: Thu Jan 23 14:11:03 UTC 2020
New revision: 357045
URL: https://svnweb.freebsd.org/changeset/base/357045

Log:
  Apply r355819 to sparc64 - fix assertion failure after r355784

  From r355819:
  Repeat the spinlock_enter/exit pattern from amd64 on other architectures
  to fix an assert violation introduced in r355784.  Without this
  spinlock_exit() may see owepreempt and switch before reducing the
  spinlock count.  amd64 had been optimized to do a single critical
  enter/exit regardless of the number of spinlocks which avoided the
  problem and this optimization had not been applied elsewhere.

  This is completely untested - I have no obsolete Sparc hardware - but
  someone did try testing recent changes on sparc64 (PR 243534).

  PR:		243534

Changes:
  head/sys/sparc64/sparc64/machdep.c
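
The ordering problem described in the commit log above can be illustrated with a small userland model. This is only a sketch, not the kernel code: struct fake_thread and every function below are hypothetical stand-ins, and the model assumes that the preemption path takes one spin lock of its own (the thread lock) before sched_switch() checks the spinlock count. Under that assumption, the old pattern lets the switch observe a count of 2, which matches the "invalid count 2" panic, while the amd64-style pattern lets it observe 1.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state involved; not the kernel's struct thread. */
struct fake_thread {
	int  spinlock_count;   /* models td_md.md_spinlock_count */
	int  critnest;         /* models the critical-section nesting level */
	bool owepreempt;       /* models a pending preemption request */
	int  count_at_switch;  /* count the modeled switch observed, -1 if no switch happened */
};

/* Leaving the outermost critical section with a preemption owed triggers a switch. */
static void
fake_critical_exit(struct fake_thread *td)
{
	td->critnest--;
	if (td->critnest == 0 && td->owepreempt) {
		td->owepreempt = false;
		/* Assumption: the switch path holds exactly one spin lock of its own. */
		td->spinlock_count++;
		td->count_at_switch = td->spinlock_count;
		td->spinlock_count--;
	}
}

/* Old pattern: one critical section per spin lock. */
static void
old_spinlock_enter(struct fake_thread *td)
{
	td->spinlock_count++;
	td->critnest++;
}

static void
old_spinlock_exit(struct fake_thread *td)
{
	fake_critical_exit(td);   /* may switch while the count still includes this lock */
	td->spinlock_count--;
}

/* amd64-style pattern: a single critical section around all held spin locks. */
static void
new_spinlock_enter(struct fake_thread *td)
{
	if (td->spinlock_count == 0)
		td->critnest++;
	td->spinlock_count++;
}

static void
new_spinlock_exit(struct fake_thread *td)
{
	td->spinlock_count--;     /* reduce the count first ... */
	if (td->spinlock_count == 0)
		fake_critical_exit(td);   /* ... and only then allow a switch */
}

/* Take and release one spin lock with a preemption pending; report what the switch saw. */
static int
count_seen_by_switch(bool use_new_pattern)
{
	struct fake_thread td = { 0, 0, false, -1 };

	if (use_new_pattern)
		new_spinlock_enter(&td);
	else
		old_spinlock_enter(&td);
	td.owepreempt = true;
	if (use_new_pattern)
		new_spinlock_exit(&td);
	else
		old_spinlock_exit(&td);
	return (td.count_at_switch);
}
```

Calling count_seen_by_switch(false) exercises the old ordering and count_seen_by_switch(true) the new one; the only difference between the two runs is whether the count is reduced before or after the critical section is left.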
Comment 3 Ed Maste freebsd_committer freebsd_triage 2020-01-23 14:36:37 UTC
FYI I have the GCC removal changes staged in a Git branch on GitHub at https://github.com/emaste/freebsd/tree/deorbit-gcc (which includes the change I just committed).
Comment 4 Michael Reim 2020-01-23 15:57:16 UTC
(In reply to Ed Maste from comment #1)

Thanks for the quick fix! It solved that problem and let the machine boot further. With r357045 I'm getting a new panic, though (this time obviously from the VM system):

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...                                              
jumping to kernel entry at 0xc00b8020.                                         
GDB: no debug ports present                                                    
KDB: debugger backends: ddb                                                    
KDB: current backend: ddb                                                      
Copyright (c) 1992-2020 The FreeBSD Project.                            
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994       
        The Regents of the University of California. All rights reserved.      
FreeBSD is a registered trademark of The FreeBSD Foundation.                   
FreeBSD 13.0-CURRENT #0 r357046: Thu Jan 23 15:54:21 CET 2020                  
    root@fbsdtest.omc.net:/usr/obj/usr/src/sparc64.sparc64/sys/GENERIC sparc64 
gcc version 9.2.0 (FreeBSD Ports Collection for sparc64)                       
WARNING: WITNESS option enabled, expect reduced performance.                   
real memory  = 1073741824 (1024 MB)                                            
avail memory = 1024761856 (977 MB)                                             
cpu0: Sun Microsystems UltraSparc-IIe Processor (548.00 MHz CPU)               
random: unblocking device.                                                     
random: entropy device external interface             
[ath_hal] loaded                                                               
WARNING: Device "kbd" is Giant locked and may be deleted before FreeBSD 13.0.  
kbd0 at kbdmux0                                                                
WARNING: Device "openfirm" is Giant locked and may be deleted before FreeBSD 13.0.
WARNING: Device "openprom" is Giant locked and may be deleted before FreeBSD 13.0.
nexus0: <Open Firmware Nexus device>
pcib0: <U2P UPA-PCI bridge> mem 0x1fe00000000-0x1fe0000ffff,0x1fe01000000-0x1fe010000ff irq 2032,2030,2031,2021 on nexus0
pcib0: Sabre, impl 0, version 0, IGN 0x1f, bus A, 66MHz
pcib0: DVMA map: 0x60000000 to 0x63ffffff 8192 entries
pci0: <OFW PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
pci0: <old, non-VGA display device> at device 3.0 (no driver attached)
dc0: <Davicom DM9102A 10/100BaseTX> port 0x10000-0x100ff mem 0-0xff at device 12.0 on pci0
miibus0: <MII bus> on dc0
amphy0: <DM9102 10/100 media interface> PHY 1 on miibus0
amphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc0: Ethernet address: 00:03:ba:4e:55:e6
dc1: <Davicom DM9102A 10/100BaseTX> port 0x10100-0x101ff mem 0x2000-0x20ff at device 5.0 on pci0
miibus1: <MII bus> on dc1
amphy1: <DM9102 10/100 media interface> PHY 1 on miibus1
amphy1:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: Ethernet address: 00:03:ba:4e:55:e6
ohci0: <AcerLabs M5237 (Aladdin-V) USB controller> mem 0x1000000-0x1000fff at device 10.0 on pci0
usbus0 on ohci0
atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
atapci0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
ata2: <ATA channel> at channel 0 on atapci0
ata3: <ATA channel> at channel 1 on atapci0
cryptosoft0: <software crypto> on nexus0
nexus0: <syscons> type unknown (no driver attached)
rtc0: <Real-Time Clock> at port 0x70-0x71 pnpid PNP0b00 on isa0
rtc0: registered as a time-of-day clock, resolution 1.000000s
uart0: console (9600,n,8,1)> at port 0x3f8-0x3ff irq 43 pnpid PNP0501 on isa0
uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 pnpid PNP0501 on isa0
Timecounter "tick" frequency 548000000 Hz quality 1000
Event timer "tick" frequency 548000000 Hz quality 1000
Timecounters tick every 1.000 msec
usbus0: 12Mbps Full Speed USB v1.0
Obsolete code will be removed soon: random(9) is the obsolete Park-Miller LCG from 1988
WARNING: WITNESS option enabled, expect reduced performance.
ugen0.1: <AcerLabs OHCI root HUB> at usbus0
uhub0 on usbus0
uhub0: <AcerLabs OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
Trying to mount root from ufs:/dev/ada1a [rw]...
Root mount waiting for: usbus0 CAM
cd0 at ata3 bus 0 scbus1 target 1 lun 0
cd0: <TEAC CD-224E P.9A> Removable CD-ROM SCSI device
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present
uhub0: 2 ports with 2 removable, self powered
ada0 at ata2 bus 0 scbus0 target 0 lun 0
ada0: <IC35L060AVER07-0 ER6OA46A> ATA-5 device
ada0: Serial Number SZPTZ202544
ada0: 66.700MB/s transfers (UDMA4, PIO 8192bytes)
ada0: 58644MB (120103200 512 byte sectors)
ada1 at ata3 bus 0 scbus1 target 0 lun 0
ada1: <IBM-DTLA-307015 TX2OA60A> ATA-5 device
ada1: Serial Number YFEYFML4312
ada1: 66.700MB/s transfers (UDMA4, PIO 8192bytes)
ada1: 14649MB (30003120 512 byte sectors)
mountroot: waiting for device /dev/ada1a...
panic: vm_page_assert_xbusied: page 0xfffff8009f65cb90 not exclusive busy @ /usr/src/sys/vm/vm_page.c:1555
cpuid = 0
time = 1579793596
KDB: stack backtrace:
_end() at 0xc92e90f8
vpanic() at vpanic+0x31c
panic() at panic+0x20
vm_page_object_remove() at vm_page_object_remove+0x16c
vm_page_free_prep() at vm_page_free_prep+0xe4
vm_page_free_toq() at vm_page_free_toq+0x4
vm_page_free_zero() at vm_page_free_zero+0x10
pmap_release() at pmap_release+0xcc
vmspace_free() at vmspace_free+0x9c
start_init() at start_init+0x36c
fork_exit() at fork_exit+0x6c
fork_trampoline() at fork_trampoline+0x8
KDB: enter: panic
[ thread pid 1 tid 100002 ]
Stopped at      kdb_enter+0x80: ta              %xcc, 1
db>

Again, I'll gladly provide additional information as needed.
Comment 5 Mark Johnston freebsd_committer freebsd_triage 2020-01-23 16:05:04 UTC
sparc64's pmap_release() is freeing pages belonging to the TSB object, and the new vm_page_free() contract requires the caller to busy the page.

diff --git a/sys/sparc64/sparc64/pmap.c b/sys/sparc64/sparc64/pmap.c
index 46454795ad26..753bd6af5aa1 100644
--- a/sys/sparc64/sparc64/pmap.c
+++ b/sys/sparc64/sparc64/pmap.c
@@ -1301,6 +1301,7 @@ pmap_release(pmap_t pm)
        while (!TAILQ_EMPTY(&obj->memq)) {
                m = TAILQ_FIRST(&obj->memq);
                m->md.pmap = NULL;
+               vm_page_xbusy(m);
                vm_page_unwire_noq(m);
                vm_page_free_zero(m);
        }
Comment 6 Michael Reim 2020-01-23 16:56:07 UTC
(In reply to Mark Johnston from comment #5)
Thank you, that fixed the second issue! I re-built the kernel after applying your patch and was able to successfully boot it up.

So now we have a working SPARC64 -CURRENT kernel built using the xtoolchain. Next, I'll see if I can build the userland, too.
Comment 7 commit-hook freebsd_committer freebsd_triage 2020-01-23 17:19:58 UTC
A commit references this bug:

Author: markj
Date: Thu Jan 23 17:18:59 UTC 2020
New revision: 357055
URL: https://svnweb.freebsd.org/changeset/base/357055

Log:
  sparc64: Busy the TSB page before freeing it in pmap_release().

  This is now required by vm_page_free().

  PR:	243534
  Reported and tested by:	Michael Reim <kraileth@elderlinux.org>

Changes:
  head/sys/sparc64/sparc64/pmap.c