Bug 243837 - i386 boot panics after r357314
Summary: i386 boot panics after r357314
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: i386 Any
Importance: --- Affects Only Me
Assignee: Mark Johnston
Keywords: patch
Reported: 2020-02-03 09:14 UTC by Li-Wen Hsu
Modified: 2020-02-03 19:32 UTC
Description Li-Wen Hsu freebsd_committer 2020-02-03 09:14:57 UTC
i386 boot panics after r357314:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex kernel arena (kernel arena) r = 0 (0x1d99d80)
locked @ /usr/src/sys/kern/subr_vmem.c:1344

Full backtrace is available at
https://ci.freebsd.org/job/FreeBSD-head-i386-test/8275/console
Comment 1 Mark Millard 2020-02-03 10:08:33 UTC
(In reply to Li-Wen Hsu from comment #0)

FYI:

On the freebsd-arm and freebsd-ppc lists I've reported
very early boot failures for armv7 FreeBSD and 32-bit
powerpc FreeBSD, both at head -r357419. (But I had jumped
there directly from -r356426.)

I later replicated the armv7 failure with a head -r357419
kernel from artifact.ci.freebsd.org. So both non-debug and
debug kernels hit the problem, at least on armv7.

An example armv7 backtrace is included below:

---<<BOOT>>---
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
       The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #23 r357419M: Sun Feb  2 18:27:00 PST 2020
   markmi@FBSDFHUGE:/usr/obj/armv7_clang/arm.armv7/usr/src/arm.armv7/sys/GENERIC-NODBG arm
FreeBSD clang version 9.0.1 (git@github.com:llvm/llvm-project.git c1a0a213378a458fbea1a5c77b315c7dce08fd05) (based on LLVM 9.0.1)
VT: init without driver.
Fatal kernel mode data abort: 'Translation Fault (L1)' on read
trapframe: 0xc0e14cf0
FSR=00000005, FAR=00000150, spsr=000000d3
r0 =00000000, r1 =00000000, r2 =00000001, r3 =c0b55520
r4 =00000150, r5 =00000002, r6 =c510dca0, r7 =c096dc60
r8 =00000001, r9 =c0970fc0, r10=00000000, r11=c0e14da0
r12=00a00010, ssp=c0e14d80, slr=c0631f90, pc =c06313b8

panic: Fatal abort
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
        pc = 0xc06789fc  lr = 0xc007f710 (db_trace_self_wrapper+0x30)
        sp = 0xc0e14ac8  fp = 0xc0e14be0
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
        pc = 0xc007f710  lr = 0xc02e6e4c (vpanic+0x174)
        sp = 0xc0e14be8  fp = 0xc0e14c08
        r4 = 0x00000100  r5 = 0xc0b55520
        r6 = 0xc07cb7f7  r7 = 0x00000000
vpanic() at vpanic+0x174
        pc = 0xc02e6e4c  lr = 0xc02e6cd8 (vpanic)
        sp = 0xc0e14c10  fp = 0xc0e14c14
        r4 = 0xc0e14cf0  r5 = 0x00000013
        r6 = 0x00000150  r7 = 0x00000005
        r8 = 0x00000005  r9 = 0xc0b55520
       r10 = 0x00000150
vpanic() at vpanic
        pc = 0xc02e6cd8  lr = 0xc069c5a0 (abort_align)
        sp = 0xc0e14c1c  fp = 0xc0e14c48
        r4 = 0x00000005  r5 = 0x00000005
        r6 = 0xc0b55520  r7 = 0x00000150
        r8 = 0xc0e14c14  r9 = 0xc02e6cd8
       r10 = 0xc0e14c1c
abort_align() at abort_align
        pc = 0xc069c5a0  lr = 0xc069c14c (abort_handler+0x2f8)
        sp = 0xc0e14c50  fp = 0xc0e14ce8
        r4 = 0x00000013  r5 = 0x00000150
abort_handler() at abort_handler+0x2f8
        pc = 0xc069c14c  lr = 0xc067b348 (exception_exit)
        sp = 0xc0e14cf0  fp = 0xc0e14da0
        r4 = 0x00000150  r5 = 0x00000002
        r6 = 0xc510dca0  r7 = 0xc096dc60
        r8 = 0x00000001  r9 = 0xc0970fc0
       r10 = 0x00000000
exception_exit() at exception_exit
        pc = 0xc067b348  lr = 0xc0631f90 (cache_alloc+0x5c4)
        sp = 0xc0e14d80  fp = 0xc0e14da0
        r0 = 0x00000000  r1 = 0x00000000
        r2 = 0x00000001  r3 = 0xc0b55520
        r4 = 0x00000150  r5 = 0x00000002
        r6 = 0xc510dca0  r7 = 0xc096dc60
        r8 = 0x00000001  r9 = 0xc0970fc0
       r10 = 0x00000000 r12 = 0x00a00010
uma_zalloc_arg() at uma_zalloc_arg+0x50
        pc = 0xc06313b8  lr = 0xc0631f90 (cache_alloc+0x5c4)
        sp = 0xc0e14da8  fp = 0xc0e14df0
        r4 = 0xc510dc80  r5 = 0x00000002
        r6 = 0xc510dca0  r7 = 0xc096dc60
        r8 = 0x00000000  r9 = 0xc510dca0
       r10 = 0xffffffff
cache_alloc() at cache_alloc+0x5c4
        pc = 0xc0631f90  lr = 0xc06313dc (uma_zalloc_arg+0x74)
        sp = 0xc0e14df8  fp = 0xc0e14e18
        r4 = 0x00000150  r5 = 0x00000000
        r6 = 0xc510dc80  r7 = 0x000050b0
        r8 = 0x00000002  r9 = 0xc0970fc0
       r10 = 0xc510dc80
uma_zalloc_arg() at uma_zalloc_arg+0x74
        pc = 0xc06313dc  lr = 0xc02c117c (malloc+0x70)
        sp = 0xc0e14e20  fp = 0xc0e14e48
        r4 = 0xc0b554f0  r5 = 0xc0995b6c
        r6 = 0xc510dc80  r7 = 0x000050b0
        r8 = 0xc0945088  r9 = 0x00000002
       r10 = 0x0000000b
malloc() at malloc+0x70
        pc = 0xc02c117c  lr = 0xc02c6dd4 (mtx_pool_setup_dynamic+0x1c)
        sp = 0xc0e14e50  fp = 0xc0e14e60
        r4 = 0xc0b554f0  r5 = 0xc0995b6c
        r6 = 0x00000000  r7 = 0x00800001
        r8 = 0xc0b554fc  r9 = 0x01ac0000
       r10 = 0xc0b55500
mtx_pool_setup_dynamic() at mtx_pool_setup_dynamic+0x1c
        pc = 0xc02c6dd4  lr = 0xc026f490 (mi_startup+0x2a0)
        sp = 0xc0e14e68  fp = 0xc0e14e90
        r4 = 0xc0b554f0  r5 = 0xc0995b6c
        r6 = 0x00000000  r7 = 0x00800001
mi_startup() at mi_startup+0x2a0
        pc = 0xc026f490  lr = 0xc00002c4 (_start+0x144)
        sp = 0xc0e14e98  fp = 0x00000000
        r4 = 0xc00003f8  r5 = 0xc0bb0000
        r6 = 0x00000000  r7 = 0x00c52078
        r8 = 0xc0d83000  r9 = 0x00000000
       r10 = 0x0000000a
_start() at _start+0x144
        pc = 0xc00002c4  lr = 0xc00002c4 (_start+0x144)
        sp = 0xc0e14e98  fp = 0x00000000
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      kdb_enter+0x58: ldrb    r15, [r15, r15, ror r15]!

(I do not have access to any 32-bit FreeBSD machines
with a strong memory model.)
Comment 2 Mark Johnston freebsd_committer 2020-02-03 14:21:58 UTC
There is a regression on all 32-bit platforms.  With the addition of the sequence number field to UMA buckets, one of the built-in bucket sizes becomes 0, and the bucket zone selection code doesn't expect this.

This should fix the problem; I can't really see a better solution:

diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 28b959d66b4a..0551ec1ade48 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -239,7 +239,9 @@ struct uma_bucket_zone {
 #define        BUCKET_MIN      BUCKET_SIZE(4)
 
 struct uma_bucket_zone bucket_zones[] = {
+#ifndef __ILP32__
        { NULL, "4 Bucket", BUCKET_SIZE(4), 4096 },
+#endif
        { NULL, "6 Bucket", BUCKET_SIZE(6), 3072 },
        { NULL, "8 Bucket", BUCKET_SIZE(8), 2048 },
        { NULL, "12 Bucket", BUCKET_SIZE(12), 1536 },
Comment 3 commit-hook freebsd_committer 2020-02-03 19:29:14 UTC
A commit references this bug:

Author: markj
Date: Mon Feb  3 19:29:02 UTC 2020
New revision: 357463
URL: https://svnweb.freebsd.org/changeset/base/357463

Log:
  Disable the smallest UMA bucket size on 32-bit platforms.

  With r357314, sizeof(struct uma_bucket) grew to 16 bytes on 32-bit
  platforms, so BUCKET_SIZE(4) is 0.  This resulted in the creation of a
  bucket zone for buckets with zero capacity.  A more general fix is
  planned, but for now this bandaid allows 32-bit platforms to boot again.

  PR:		243837
  Discussed with:	jeff
  Reported by:	pho, Jenkins via lwhsu
  Tested by:	pho
  Sponsored by:	The FreeBSD Foundation

Changes:
  head/sys/vm/uma_core.c