Bug 253177 - buildkernel buildworld will crash system (lx2160a)
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm
Version: 13.0-STABLE
Hardware: arm64 Any
Importance: --- Affects Only Me
Assignee: freebsd-arm (Nobody)
Reported: 2021-02-02 14:05 UTC by yarshure
Modified: 2021-04-01 14:55 UTC

Description yarshure 2021-02-02 14:05:20 UTC
  x0: ffff0000433f3d98
  x1: ffffa00028f2f580
  x2:           100000
  x3:                0
  x4:               60
  x5:                0
  x6:                0
  x7:                0
  x8:                1
  x9:                1
 x10: ffffa00028f2f580
 x11:                0
 x12:                0
 x13:                9
 x14:                3
 x15:                0
 x16:                0
 x17:               18
 x18: ffff0001938186f0
 x19: ffffa00026197000
 x20: ffffa00028f2f580
 x21:                0
 x22: ffffa0000008dc00
 x23: ffffa0000008dc28
 x24:                0
 x25: ffff000000b7b000
 x26: ffff000000b7b000
 x27:           232000
 x28:           232000
 x29: ffff0001938186f0
  sp: ffff0001938186f0
  lr: ffff0000004caabc
 elr: ffff0000004caa20
spsr:         60000145
 far:         402c4829
panic: Unknown kernel exception 0 esr_el1 2000000

cpuid = 1
time = 1612259959
KDB: stack backtrace:
#0 0xffff000000506834 at kdb_backtrace+0x60
#1 0xffff0000004b0a94 at vpanic+0x184
#2 0xffff0000004b090c at panic+0x44
#3 0xffff00000081bc28 at do_el1h_sync+0x140
#4 0xffff0000007fd878 at handle_el1h_sync+0x78
#5 0xffff0000004caab8 at tidhash_remove+0x174
#6 0xffff000000459100 at exit1+0x94c
#7 0xffff0000004587b0 at sys_sys_exit+0x10
#8 0xffff00000081c1b8 at do_el0_sync+0x448
#9 0xffff0000007fda24 at handle_el0_sync+0x90

Default kernel dmesg: https://gist.github.com/yarshure/1cc3350b4cbd86d7514514b57987b9d7
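
For readers triaging the panic line, the esr_el1 value can be decoded by hand. The following is a minimal sketch, assuming the standard ARMv8-A ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The helper name decode_esr is illustrative and is not part of FreeBSD.

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR_ELx value into its EC, IL and ISS fields.

    Assumes the ARMv8-A layout: EC = bits 31:26, IL = bit 25, ISS = bits 24:0.
    """
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class
        "il": (esr >> 25) & 0x1,    # instruction length bit
        "iss": esr & 0x1FFFFFF,     # instruction specific syndrome
    }


# Value taken from the panic message: "esr_el1 2000000" (hexadecimal).
fields = decode_esr(0x2000000)
print(fields)  # {'ec': 0, 'il': 1, 'iss': 0}
```

An EC of 0 is the "unknown reason" class, which appears to match the "Unknown kernel exception 0" text in the panic. With ISS also 0, the syndrome register carries no further detail, so the register dump and backtrace are the main evidence here.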
Comment 1 yarshure 2021-02-03 07:55:33 UTC
With make buildkernel -j1 the system does not crash. Maybe this is an SMP issue?
Comment 2 Mark Millard 2021-02-03 08:08:17 UTC
You might want to mention tidhash_remove in the one line description. It
would give some folks more context in a simple way when they are looking
at defects.

https://cgit.freebsd.org/src/commit/sys/kern/kern_thread.c?id=26007fe37c06
by mjg@FreeBSD.org seems to be the last time the area was touched. (But the
caller(s) could instead be at issue so this could be a bad guess for who
might want to look.)

tidhash_remove looks to be a thread ID hash table handling routine.
Comment 3 yarshure 2021-02-08 02:49:32 UTC
On 13.0-BETA1, make buildkernel -DWITHOUT_CLEAN also crashes the system:

  x0: ffff000000e4c000
  x1:                0
  x2:                1
  x3:             8040
  x4: ffffa02f28718170
  x5: ffff000191ddf6c0
  x6: ffff000191ddf689
  x7: ffff000191ddf694
  x8:             52b2
  x9: ffff000000e4c380
 x10: ffff000000b7b000
 x11:                0
 x12: ffffffffffffffff
 x13: ffff000000b7b000
 x14: ffffa02073a4e6c0
 x15:                1
 x16:             197d
 x17:                f
 x18: ffff000191ddf420
 x19: ffff000000e4c000
 x20:           eec1e6
 x21:             8040
 x22: ffff000000b7b000
 x23: ffffa02efb3a17e0
 x24: ffffa02f2870f068
 x25:                0
 x26: ffff000000b7b000
 x27:             7766
 x28: ffff000000e4c758
 x29: ffff000191ddf420
  sp: ffff000191ddf420
  lr: ffff0000007a30b8
 elr: ffff000000791610
spsr:         20000145
 far:         47766000
panic: Unknown kernel exception 0 esr_el1 2000000

cpuid = 1
time = 1612746964
KDB: stack backtrace:
#0 0xffff000000506774 at kdb_backtrace+0x60
#1 0xffff0000004b09d4 at vpanic+0x184
#2 0xffff0000004b084c at panic+0x44
#3 0xffff00000081bc28 at do_el1h_sync+0x140
#4 0xffff0000007fd878 at handle_el1h_sync+0x78
#5 0xffff0000007a30b4 at vm_reserv_alloc_page+0x384
#6 0xffff0000007a30b4 at vm_reserv_alloc_page+0x384
#7 0xffff00000079100c at vm_page_alloc_domain_after+0xb0
#8 0xffff000000790e40 at vm_page_alloc+0x6c
#9 0xffff000000777f34 at vm_fault_allocate+0x1b8
#10 0xffff0000007767ec at vm_fault+0x3dc
#11 0xffff0000007762f0 at vm_fault_trap+0x60
#12 0xffff00000081ca88 at data_abort+0xf4
#13 0xffff0000007fda24 at handle_el0_sync+0x90
Comment 4 Mark Millard 2021-02-09 03:37:58 UTC
(In reply to yarshure from comment #3)

Well, that makes my suggestion in #2 no longer fit
the evidence. Sorry for the noise.
Comment 5 yarshure 2021-04-01 14:55:11 UTC
I switched to a newer UEFI version (https://solid-run-images.sos-de-fra-1.exo.io/LX2k/lx2160a_uefi/lx2160acex7_2000_700_2600_8_5_2_sd_4a89463.img.xz).

With it, the system is mostly stable under high CPU load on 13.0-RC4.