[https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222314 is a different issue, as noted there.]

A powerpc64 head -r339076 based context running

# kyua test -k /usr/tests/Kyuafile

reliably crashes (so far) during kyua, displaying:

sys/netinet/reuseport_lb:basic_ipv4  ->  failed: /usr/src/tests/sys/netinet/reuseport_lb.c:165: bind() failed: Address already in use  [0.013s]
sys/netinet/reuseport_lb:basic_ipv6  ->  failed: /usr/src/tests/sys/netinet/reuseport_lb.c:221: bind() failed: Address already in use  [0.013s]
sys/netipsec/tunnel/aes_cbc_128_hmac_sha1:v4  ->

Example details based on a debug kernel (invariants, witness, and diagnostics) follow. Note that the LOR backtrace and the crash backtrace are the same for the call chain that calls vnet_sysinit.

. . .
epair3a: Ethernet address: 02:60:27:70:4b:0a
epair3b: Ethernet address: 02:60:27:70:4b:0b
epair3a: link state changed to UP
epair3b: link state changed to UP
lock order reversal:
 1st 0x13be260 allprison (allprison) @ /usr/src/sys/kern/kern_jail.c:960
 2nd 0x15964a0 vnet_sysinit_sxlock (vnet_sysinit_sxlock) @ /usr/src/sys/net/vnet.c:575
stack backtrace:
#0 0x6f6520 at witness_debugger+0xf4
#1 0x6f8440 at witness_checkorder+0xa1c
#2 0x675690 at _sx_slock_int+0x70
#3 0x675810 at _sx_slock+0x1c
#4 0x7f4338 at vnet_sysinit+0x38
#5 0x7f44dc at vnet_alloc+0x118
#6 0x62ab84 at kern_jail_set+0x3274
#7 0x62b62c at sys_jail_set+0x8c
#8 0xa8a798 at trap+0x9a0
#9 0xa7e660 at powerpc_interrupt+0x140

fatal kernel trap:

  exception       = 0x300 (data storage interrupt)
  virtual address = 0xc00000008df1df30
  dsisr           = 0x42000000
  srr0            = 0xe000000047854e98 (0xe000000047854e98)
  srr1            = 0x9000000000009032
  current msr     = 0x9000000000009032
  lr              = 0xe000000047854e90 (0xe000000047854e90)
  curthread       = 0xc0000000206b6000
         pid = 9464, comm = jail

(Hand transcribed from here on:)

[ thread pid 9464 tid 100296 ]
Stopped at  vnet_epair_init+0x78:  stdx r3,r29,r30
db:0:kdb.enter.default> bt
Tracing pid 9464 tid 100296 td 0xc0000000206b6000
0xe000000047274240: at vnet_sysinit+0x70
0xe000000047274270: at vnet_alloc+0x118
0xe000000047274300: at kern_jail_set+0x3274
0xe000000047274610: at sys_jail_set+0x8c
0xe000000047274660: at trap+0x9a0
0xe000000047274790: at powerpc_interrupt+0x140
0xe000000047274820: user sc trap by 0x81016a888
  srr1 = 0x900000000000f032
  r1 = 0x3fffffffffffd080  cr = 0x28002482
  xer = 0x20000000  ctr = 0x81016a880  r2 = 0x810322300

There are past reports of the lock order reversal, such as:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210907

but that report did not mention any crash.

Notes:

The powerpc64 -r339076 based system was built via devel/powerpc64-xtoolchain-gcc (the created system's cc is clang) and is using base/binutils as well. kyua is as of ports -r480180 and was built by system clang (as were other things). I am experimenting with what the issues are with using fairly modern compiler toolchains for powerpc64 instead of gcc 4.2.1. At this point I do not see that as likely to be responsible for the above crash.

I'll see about adding an objdump or kgdb disassembly of vnet_epair_init in a bit.
vnet_epair_init extraction from "objdump --prefix-addresses -d" for if_epair.ko:

<vnet_epair_init> addis r2,r12,2
<vnet_epair_init+0x4> addi r2,r2,-20768
<vnet_epair_init+0x8> mflr r0
<vnet_epair_init+0xc> std r31,-8(r1)
<vnet_epair_init+0x10> std r29,-24(r1)
<vnet_epair_init+0x14> std r30,-16(r1)
<vnet_epair_init+0x18> std r0,16(r1)
<vnet_epair_init+0x1c> stdu r1,-64(r1)
<vnet_epair_init+0x20> mr r31,r1
<vnet_epair_init+0x24> mr r9,r13
<vnet_epair_init+0x28> ld r9,1240(r9)
<vnet_epair_init+0x2c> addis r7,r2,-2
<vnet_epair_init+0x30> addis r6,r2,-2
<vnet_epair_init+0x34> addis r5,r2,-2
<vnet_epair_init+0x38> addis r3,r2,-2
<vnet_epair_init+0x3c> addi r7,r7,20960
<vnet_epair_init+0x40> addi r6,r6,22096
<vnet_epair_init+0x44> addi r5,r5,24896
<vnet_epair_init+0x48> li r4,0
<vnet_epair_init+0x4c> ld r30,40(r9)
<vnet_epair_init+0x50> nop
<vnet_epair_init+0x54> std r2,24(r1)
<vnet_epair_init+0x58> addi r3,r3,31640
<vnet_epair_init+0x5c> nop
<vnet_epair_init+0x60> ld r12,-32704(r2)
<vnet_epair_init+0x64> addi r29,r2,-31440
<vnet_epair_init+0x68> mtctr r12
<vnet_epair_init+0x6c> bctrl
<vnet_epair_init+0x70> ld r2,24(r1)
<vnet_epair_init+0x74> nop
<vnet_epair_init+0x78> stdx r3,r29,r30
<vnet_epair_init+0x7c> nop
<vnet_epair_init+0x80> std r2,24(r1)
<vnet_epair_init+0x84> ld r12,-32696(r2)
<vnet_epair_init+0x88> addi r3,r2,-32312
<vnet_epair_init+0x8c> mtctr r12
<vnet_epair_init+0x90> bctrl
<vnet_epair_init+0x94> ld r2,24(r1)
<vnet_epair_init+0x98> addi r1,r31,64
<vnet_epair_init+0x9c> ld r0,16(r1)
<vnet_epair_init+0xa0> ld r29,-24(r1)
<vnet_epair_init+0xa4> ld r30,-16(r1)
<vnet_epair_init+0xa8> ld r31,-8(r1)
<vnet_epair_init+0xac> mtlr r0
<vnet_epair_init+0xb0> blr
<vnet_epair_init+0xb4> .long 0x0
<vnet_epair_init+0xb8> .long 0x1
<vnet_epair_init+0xbc> lwz r0,0(r3)

The crash reported:

Stopped at  vnet_epair_init+0x78:  stdx r3,r29,r30
Ok, we probably also need registers. Given some of the things we have seen in the past were indeed toolchain related, before I punt this to powerpc people, can you confirm that this also happens with a default FreeBSD build (with whatever the default toolchain is for powerpc)?
(In reply to Bjoern A. Zeeb from comment #2)

show reg will require crashing again. But you might be initially more interested in an official kernel build's behavior. The official powerpc64 and powerpc builds are via gcc 4.2.1 toolchain builds.

For powerpc64 I can try substituting kernel materials from, say, somewhere like:

https://artifact.ci.freebsd.org/snapshot/stable-11/r*/powerpc/powerpc64/kernel*.txz

or:

https://download.freebsd.org/ftp/snapshots/powerpc/powerpc64/12.0-*/kernel*.txz

(assuming that you do not care about the buildworld/ports side of things). It will be a while before I have such results.
(In reply to Bjoern A. Zeeb from comment #2) stable-11 was the wrong path. More like: https://artifact.ci.freebsd.org/snapshot/head/r*/powerpc/powerpc64/kernel*.txz But, unfortunately, -r339076 vintage official builds do not boot old PowerMac G5 so-called "Quad Core"s. I'm trying to find some official vintage that works via just a kernel substitution instead of needing to set up a private gcc 4.2.1 based build with patches.
(In reply to Bjoern A. Zeeb from comment #2)

-r339269 (the last version before the openssl update and bump of __FreeBSD_version) does not boot the old PowerMac G5 "Quad Core". In my context, simply setting up a test of an officially-built kernel seems problematic. So it looks like I'll just build from my test environment's sources via gcc 4.2.1 . . .

And the result was no crash, but the test failed:

sys/netipsec/tunnel/aes_cbc_128_hmac_sha1:v4  ->  failed: atf-check failed; see the output of the test for details  [0.652s]

But for all I know the failure could come before the activity that was leading to crashes, possibly thereby skipping the internal step that has the problem. So I view this as inconclusive.

I'm not sure that you want to be looking into devel/powerpc64-xtoolchain-gcc styles of builds if that is the only context in which I get the crashes.
(In reply to Bjoern A. Zeeb from comment #2)

I did a "svnlite update -r339341 /usr/src/sys/kern/link_elf.c" and rebuilt, reinstalled, rebooted, and retested, based on using devel/powerpc64-xtoolchain-gcc materials.

Result: still crashed. The below has show reg as well.

. . .
epair3a: Ethernet address: 02:60:27:70:4b:0a
epair3b: Ethernet address: 02:60:27:70:4b:0b
epair3a: link state changed to UP
epair3b: link state changed to UP
lock order reversal:
 1st 0x13be260 allprison (allprison) @ /usr/src/sys/kern/kern_jail.c:960
 2nd 0x15964a0 vnet_sysinit_sxlock (vnet_sysinit_sxlock) @ /usr/src/sys/net/vnet.c:575
stack backtrace:
#0 0x6f6520 at witness_debugger+0xf4
#1 0x6f8440 at witness_checkorder+0xa1c
#2 0x675690 at _sx_slock_int+0x70
#3 0x675810 at _sx_slock+0x1c
#4 0x7f4338 at vnet_sysinit+0x38
#5 0x7f44dc at vnet_alloc+0x118
#6 0x62ab84 at kern_jail_set+0x3274
#7 0x62b62c at sys_jail_set+0x8c
#8 0xa8a798 at trap+0x9a0
#9 0xa7e660 at powerpc_interrupt+0x140

fatal kernel trap:

  exception       = 0x300 (data storage interrupt)
  virtual address = 0xc00000008df1df30
  dsisr           = 0x42000000
  srr0            = 0xe00000004784ce98 (0xe00000004784ce98)
  srr1            = 0x9000000000009032
  current msr     = 0x9000000000009032
  lr              = 0xe00000004784ce90 (0xe00000004784ce90)
  curthread       = 0xc00000000b8c8560
         pid = 9536, comm = jail

(Hand transcribed from here on:)

[ thread pid 9536 tid 100174 ]
Stopped at  vnet_epair_init+0x78:  stdx r3,r29,r30
db:0:kdb.enter.default> bt
Tracing pid 9536 tid 100174 td 0xc00000000b8c8560
0xe000000047274240: at vnet_sysinit+0x70
0xe000000047274270: at vnet_alloc+0x118
0xe000000047274300: at kern_jail_set+0x3274
0xe000000047274610: at sys_jail_set+0x8c
0xe000000047274660: at trap+0x9a0
0xe000000047274790: at powerpc_interrupt+0x140
0xe000000047274820: user sc trap by 0x81016a888
  srr1 = 0x900000000000f032
  r1 = 0x3fffffffffffd080  cr = 0x28002482
  xer = 0x20000000  ctr = 0x81016a880  r2 = 0x810322300

db> show reg
r0  = 0xe00000004784ce90  vnet_epair_init+0x70
r1  = 0xe000000047050200
r2  = 0xe00000004784ce90  .TOC.
r3  = 0xc000000027a3fc00
r4  = 0x4
r5  = 0xe04008
r6  = 0x119
r7  = 0
r8  = 0xc00000000b8c8560
r9  = 0x2
r10 = 0xc00000000b8c8560
r11 = 0xc00000000b8c86a0
r12 = 0x152f1e8
r13 = 0xc00000000b8c8560
r14 = 0
r15 = 0
r16 = 0xc00000000763d960
r17 = 0xc000000008145090
r18 = 0
r19 = 0xc000000027a45058
r20 = 0x10e3690  prison0
r21 = 0
r22 = 0
r23 = 0
r24 = 0
r25 = 0xc000000027a45000
r26 = 0
r27 = 0x11e71b0  __start_set_compressors
r28 = 0xe000000047871000  .TOC.+0x9300
r29 = 0xe000000047860230  vnet_entry_epair_cloner
r30 = 0xe0000000466bdd00
r31 = 0xe000000047050200
srr0 = 0xe00000004784ce98  vnet_epair_init+0x78
srr1 = 0x9000000000009032
lr   = 0xe00000004784ce90  vnet_epair_init+0x70
ctr  = 0x1
cr   = 0x20222234
xer  = 0
dar  = 0xe00000008df1df30
dsisr = 0x42000000
vnet_epair_init+0x70:  stdx r3,r29,r30
(In reply to Mark Millard from comment #6)

I messed up the r2 line:

r2 = 0xe00000004784ce90  .TOC.

which should have been:

r2 = 0xe000000047867d00  .TOC.

And I messed up the r12 line:

r12 = 0x152f1e8

which should have been:

r12 = 0x152f138
(In reply to Mark Millard from comment #6)

I also messed up the dar line:

dar = 0xe00000008df1df30

which should have been:

dar = 0xc00000008df1df30
(In reply to Mark Millard from comment #6)

Looks to me like r29 is the value of (uintptr_t)&VNET_NAME(n) <from _VNET_PTR>. But r30 is supposed to be:

(curthread->td_vnet)->vnet_data_base

where:

/*
 * All use of vnet-specific data will immediately subtract VNET_START
 * from the base memory pointer, so pre-calculate that now to avoid
 * it on each use.
 */
vnet->vnet_data_base = (uintptr_t)vnet->vnet_data_mem - VNET_START;

and:

/*
 * Location of the kernel's 'set_vnet' linker set.
 */
extern uintptr_t *__start_set_vnet;
__GLOBL(__start_set_vnet);
extern uintptr_t *__stop_set_vnet;
__GLOBL(__stop_set_vnet);

#define VNET_START (uintptr_t)&__start_set_vnet
#define VNET_STOP  (uintptr_t)&__stop_set_vnet

This traces back to differing definitions for working vs. crashing builds.

Working (gcc 4.2.1 toolchain):

/boot/kernel/kernel:
00000000013dba00 g *ABS* 0000000000000000 __start_set_vnet
/boot/kernel/if_epair.ko:
0000000000015448 g *ABS* 0000000000000000 __start_set_vnet

devel/powerpc64-xtoolchain-gcc based kernel:

/boot/kergoo/kernel:
0000000001223800 g set_vnet 0000000000000000 .protected __start_set_vnet
/boot/kergoo/if_epair.ko:
0000000000014d30 g set_vnet 0000000000000000 .protected __start_set_vnet

This apparently leads to VNET_START being some small value, possibly 0. That in turn means that ?->vnet_data_base ends up not being the intended offset but directly an address. This makes adding:

(uintptr_t)&VNET_NAME(n) <from _VNET_PTR>

and:

(curthread->td_vnet)->vnet_data_base

produce junk that makes the indexed store stdx r3,r29,r30 fail with a bad target address.
(In reply to Mark Millard from comment #9)

In other terms, the gcc 4.2.1 WITH_BINUTILS_BOOTSTRAP= tool chain based build generates (elfdump output):

entry: 72
  st_name: __start_set_vnet
  st_value: 0x15448
  st_size: 0
  st_info: STT_NOTYPE STB_GLOBAL
  st_shndx: 65521

where 65521 is SHN_ABS, i.e., 0xfff1.

By contrast, the devel/powerpc64-gcc + base/binutils (or devel/powerpc64-binutils) toolchain context generates:

entry: 73
  st_name: __start_set_vnet
  st_value: 0x14d30
  st_size: 0
  st_info: STT_NOTYPE STB_GLOBAL
  st_shndx: 17

The 17 looks odd to me because:

Sections:
Idx Name               Size     VMA              LMA              File off Algn
  0 .note.gnu.build-id 00000048 0000000000000158 0000000000000158 00000158 2**2
                       CONTENTS, ALLOC, LOAD, READONLY, DATA
. . .
 16 set_vnet           00000008 0000000000014d30 0000000000014d30 00004d30 2**3
                       CONTENTS, ALLOC, LOAD, DATA
 17 .bss               00000008 0000000000014d38 0000000000014d38 00004d38 2**2
                       ALLOC
. . .
(In reply to Mark Millard from comment #10)

Looks like objdump and elfdump do not match in how section headers are numbered: 17 matches what elfdump shows:

section header:

entry: 0
  sh_name:
  sh_type: SHT_NULL
  sh_flags:
  sh_addr: 0
  sh_offset: 0
  sh_size: 0
  sh_link: 0
  sh_info: 0
  sh_addralign: 0
  sh_entsize: 0

entry: 1
  sh_name: .note.gnu.build-id
  sh_type: SHT_NOTE
  sh_flags:
  sh_addr: 0x158
  sh_offset: 344
  sh_size: 72
  sh_link: 0
  sh_info: 0
  sh_addralign: 4
  sh_entsize: 0

. . .

entry: 16
  sh_name: set_sysuninit_set
  sh_type: SHT_PROGBITS
  sh_flags:
  sh_addr: 0x14d10
  sh_offset: 19728
  sh_size: 32
  sh_link: 0
  sh_info: 0
  sh_addralign: 8
  sh_entsize: 0

entry: 17
  sh_name: set_vnet
  sh_type: SHT_PROGBITS
  sh_flags:
  sh_addr: 0x14d30
  sh_offset: 19760
  sh_size: 8
  sh_link: 0
  sh_info: 0
  sh_addralign: 8
  sh_entsize: 0

entry: 18
  sh_name: .bss
  sh_type: SHT_NOBITS
  sh_flags:
  sh_addr: 0x14d38
  sh_offset: 19768
  sh_size: 8
  sh_link: 0
  sh_info: 0
  sh_addralign: 4
  sh_entsize: 0

. . .
(In reply to Mark Millard from comment #10)

It turns out that my amd64 kernel builds are like the devel/powerpc64-xtoolchain-gcc ones in that, instead of SHN_ABS, __start_set_vnet has a normal st_shndx value (these are looking at /boot/kernel/kernel):

entry: 16765
  st_name: __start_set_vnet
  st_value: 0xffffffff824dc200
  st_size: 0
  st_info: STT_NOTYPE STB_GLOBAL
  st_shndx: 50

This might suggest powerpc64's kernel having incomplete coverage for handling such things?

For the amd64 context, that 50 is:

entry: 50
  sh_name: set_vnet
  sh_type: SHT_PROGBITS
  sh_flags: SHF_WRITE|SHF_ALLOC
  sh_addr: 0xffffffff824dc200
  sh_offset: 36553216
  sh_size: 220872
  sh_link: 0
  sh_info: 0
  sh_addralign: 16
  sh_entsize: 0

A contrasting devel/powerpc64-xtoolchain-gcc based kernel has entries (note the set_vnet sh_flags differences vs. amd64; these are also looking at /boot/kernel/kernel):

entry: 40
  sh_name: set_vnet
  sh_type: SHT_PROGBITS
  sh_flags:
  sh_addr: 0x1223800
  sh_offset: 18036736
  sh_size: 212656
  sh_link: 0
  sh_info: 0
  sh_addralign: 8
  sh_entsize: 0

and for __start_set_vnet:

entry: 1490
  st_name: __start_set_vnet
  st_value: 0x1223800
  st_size: 0
  st_info: STT_NOTYPE STB_GLOBAL
  st_shndx: 40
(In reply to Mark Millard from comment #12) Well, on powerpc64 objdump for the sections shows set_vnet as ALLOC (without listing READONLY). elfdump and objdump do not seem to agree about what the flags are for the same file in the powerpc64 context. elfdump shows all the sh_flags as empty. So the difference in reported flags between amd64 and powerpc64 may be just a tools issue.
I finally have a system-clang + devel/powerpc64-binutils based build that I've tested (head -r345558 based, ELFv1 ABI). Again, this is from my experimenting with possible future toolchains. This was with a non-debug kernel.

This too fails. Running just:

kyua test -k /usr/tests/Kyuafile sys/netipsec/tunnel/aes_cbc_128_hmac_sha1

is sufficient for the crash to happen; a full kyua run also fails when it gets there. Again a data storage interrupt (be warned: hand typed from a picture):

  exception       = 0x300 (data storage interrupt)
  virtual address = 0x860ce198
  dsisr           = 0x42000000
  srr0            = 0xc0000000007b51a4 (0x7b51a4)
  srr1            = 0x9000000000009032
  current msr     = 0x9000000000009032
  lr              = 0xc0000000007184a8 (0x7184a8)
  frame           = 0xe00000008ecb0dd0
  curthread       = 0xc00000000ad6c580
         pid = 28966, comm = ifconfig
panic: data storage interrupt trap
cpuid = 2
time = 1553886317

The frame, curthread, pid, cpuid, and time can vary.

The backtrace information is better this time. Be warned: hand typed from a picture. The example is from a full kyua run.

KDB: stack backtrace:
0xe000000008eb0b00: at vpanic+0x1d8
0xe000000008eb0bb0: at panic+0x44
0xe000000008eb0be0: at trap_fatal+0x2f4
0xe000000008eb0c70: at trap+0x698
0xe000000008eb0d50: at powerpc_interrupt+0x1a0
0xe000000008eb0da0: at kernel DSI write trap @ 0x860ce198 by lock_init+0x140:
  srr1 = 0x9000000000009032
  r1 = 0xe00000008ecb1050  cr = 0x22880f488
  xer = 0  ctr = 0xc000000000718438  r2 = 0xc000000001370000
  sr = 0x42000000  frame = 0xe00000008ecb0dd0
0xe00000008ecb1050: at __set_sysctl_set_sym_sysctl___net_link_epair_netisr_maxqlen+0x4
0xe00000008ecb1080: at epair_modevent+0xbc
0xe00000008ecb1140: at module_register_init+0x130
0xe00000008ecb11f0: at linker_load_module+0xd88
0xe00000008ecb1620: at kern_kldload+0x8c
0xe00000008ecb1690: at sys_kldload+0x8c
0xe00000008ecb16e0: at trap+0xb28
0xe00000008ecb17c0: at powerpc_interrupt+0x1a0
0xe00000008ecb1810: at user SC trap by 0x810182da8:
  srr1 = 0x900000000200f032
  r1 = 0x3fffffffffffcf10  cr = 0x24002200
  xer = 0x20000000  ctr = 0x810182da0  r2 = 0x81033b900
  frame = 0xe00000008ecb1840

Note: the 'at __set_sysctl_set_sym_sysctl___net_link_epair_netisr_maxqlen+0x4' line at other times has shown text such as 'at 0xffffffc'. The kernel stack addresses (0xe000 prefixes) can vary. Otherwise the backtraces agree so far as I've noticed.
(In reply to Mark Millard from comment #14)

Turns out that just doing:

# kldload if_epair

is sufficient to cause the crash. By contrast, geom_uzip loads without crashing (including loading xz).

I'd expect the same for a devel/powerpc64-xtoolchain-gcc based buildworld/buildkernel that is installed and booted, but I've not tried it. (The original report was for that context.) devel/powerpc64-binutils is common to both types of builds but the compilers are not.
(In reply to Mark Millard from comment #15)

I added printing of &DPCPU_NAME(epair_dpcpu) to:

static void
epair_dpcpu_init(void)

This and ddb use gives the following information for the use of:

#define _DPCPU_PTR(b, n)						\
    (__typeof(DPCPU_NAME(n))*)((b) + (uintptr_t)&DPCPU_NAME(n))
. . .
#define DPCPU_ID_PTR(i, n)	_DPCPU_PTR(dpcpu_off[(i)], n)

(Typed from a picture:)

&DPCPU_NAME(epair_dpcpu)=0xe00000008fcee810

show dpcpu_off in ddb shows:

dpcpu_off[0]=0x1ffffffffecf6980
dpcpu_off[1]=0x2000000002adc980
dpcpu_off[2]=0x2000000002ada980
dpcpu_off[3]=0x2000000002ad8980

The failing virtual address was reported as:

virtual address = 0x8e9e5198
. . .
cpuid = 0

Then, checking:

0x1ffffffffecf6980 + 0xe00000008fcee810 == 0x8e9e5190

so 0x8 short. But:

<epair_modevent+0xd0> addi r3,r22,24
<epair_modevent+0xd4> bl <0000001b.plt_call._mtx_init>

and:

<_mtx_init+0x20> mr r30,r3
. . .
<_mtx_init+0x60> addi r3,r30,-24
<_mtx_init+0x64> clrldi r7,r6,32
<_mtx_init+0x68> mr r6,r8
<_mtx_init+0x6c> bl <lock_init+0x8>

and:

<lock_init+0x140> stw r4,8(r3)

So 0x8e9e5190+24-24+8 == 0x8e9e5198 (the failure address).

It appears to me that the dpcpu_off[i] figures are expected to convert 0xc???_????_????_???? type (direct-map) addresses to 0xe???_????_????_???? addresses, but &DPCPU_NAME(epair_dpcpu) already was a 0xe???_????_????_???? type of address. The result overflowed/wrapped/truncated and was invalid.

This looks likely to be a problem for any kldload'd .ko that uses a DPCPU_DEFINE and DPCPU_ID_PTR similarly to the following (showing my printf addition as well):

struct epair_dpcpu {
	struct mtx	if_epair_mtx;		/* Per-CPU locking. */
	int		epair_drv_flags;	/* Per-CPU ``hw'' drv flags. */
	struct eid_list	epair_ifp_drain_list;	/* Per-CPU list of ifps with
						 * data in the ifq. */
};
DPCPU_DEFINE(struct epair_dpcpu, epair_dpcpu);

static void
epair_dpcpu_init(void)
{
	struct epair_dpcpu *epair_dpcpu;
	struct eid_list *s;
	u_int cpuid;

	printf("epair_dpcpu_init: &DPCPU_NAME(epair_dpcpu)=%p\n",
	    &DPCPU_NAME(epair_dpcpu));
	CPU_FOREACH(cpuid) {
		epair_dpcpu = DPCPU_ID_PTR(cpuid, epair_dpcpu);

		/* Initialize per-cpu lock. */
		EPAIR_LOCK_INIT(epair_dpcpu);
. . .
(In reply to Mark Millard from comment #16)

Looking around, it appears the .ko files with such pcpu_entry_ content are:

/boot/kernel/if_epair.ko:
                 U dpcpu_off
0000000000014810 d pcpu_entry_epair_dpcpu

/boot/kernel/ipfw.ko:
                 U dpcpu_off
00000000000454c8 d pcpu_entry_dyn_hp

/boot/kernel/linuxkpi.ko:
                 U dpcpu_off
0000000000032400 d pcpu_entry_linux_epoch_record
0000000000032380 d pcpu_entry_linux_idr_cache
0000000000032580 d pcpu_entry_tasklet_worker

/boot/kernel/siftr.ko:
                 U dpcpu_off
0000000000015450 d pcpu_entry_ss

I expect the following might be okay:

/boot/kernel/hwpmc.ko:
                 U dpcpu_off
                 U pcpu_entry_pmc_sampled

based on /boot/kernel/kernel having pcpu_entry_pmc_sampled in its list:

00000000016f1450 B dpcpu_off
000000000132b778 d pcpu_entry_decr_state
000000000132a6a8 D pcpu_entry_epoch_cb_count
000000000132a6b0 D pcpu_entry_epoch_cb_task
000000000132a688 d pcpu_entry_exec_args_kva
000000000132b6f0 D pcpu_entry_hardclocktime
000000000132a728 d pcpu_entry_modspace
000000000132af80 D pcpu_entry_nws
000000000132a680 d pcpu_entry_pcputicks
000000000132a690 D pcpu_entry_pmc_sampled
000000000132b480 d pcpu_entry_pqbatch
000000000132a6a4 d pcpu_entry_randomval
000000000132a698 d pcpu_entry_tc_cpu_ticks_base
000000000132a6a0 d pcpu_entry_tc_cpu_ticks_last
000000000132b680 d pcpu_entry_timerstate
000000000132b6f8 d pcpu_entry_xive_cpu_data

I'll not try to list the .ko's with vnet_entry_ prefixed names. But that mechanism seems to operate via curthread->td_vnet instead of something like dpcpu_off[cpuid]. I've not checked what curthread->td_vnet values are like.
(In reply to Mark Millard from comment #15)

For if_epair.ko's problem, adding:

device epair

to the kernel configuration, to avoid the dynamic load of if_epair.ko, works:

# kyua test -k /usr/tests/Kyuafile sys/netipsec/tunnel/aes_cbc_128_hmac_sha1
sys/netipsec/tunnel/aes_cbc_128_hmac_sha1:v4  ->  passed  [0.626s]
sys/netipsec/tunnel/aes_cbc_128_hmac_sha1:v6  ->  passed  [0.631s]

Results file id is usr_tests.20190331-020517-933050
Results saved to /root/.kyua/store/results.usr_tests.20190331-020517-933050.db

2/2 passed (0 failed)

I'll eventually see how far kyua gets this way.
(In reply to Mark Millard from comment #18)

For the first time I've had a run of:

kyua test -k /usr/tests/Kyuafile

complete on a powerpc64 system that had no part built via gcc 4.2.1. It reported:

===> Summary
Results read from /root/.kyua/store/results.usr_tests.20190331-021208-373603.db
Test cases: 7504 total, 227 skipped, 37 expected failures, 121 broken, 54 failed
Total time: 6552.974s
From what I've seen:

/boot/kernel/hwpmc.ko:
                 U dpcpu_off
                 U pcpu_entry_pmc_sampled

does not cause problems (because pcpu_entry_pmc_sampled is defined in /boot/kernel/kernel instead of being dynamically loaded). So I've removed hwpmc.ko from the summary.
It's reproducible on the PowerPC64 LLVM+ELFv2 experimental ISO. I'm going to investigate more, but I found something interesting after I:

- modified 'options VERBOSE_SYSINIT=0' to '=1' in GENERIC64
- added 'options DIAGNOSTIC' to GENERIC64
- added '#define EPAIR_DEBUG' at the top of if_epair.c

->> Module loads, no panics.
(In reply to Alfredo Dal'Ava Júnior from comment #21) > ->> Module loads, no panics. I'm referring to 'if_epair'
Regarding comments #21 and #22: I found out that the issue isn't reproducible in that test because the VIMAGE configuration flag is not set when compiling the module directly in the tree.

The following change should fix the if_epair.ko load and unblock the FreeBSD test suite: https://reviews.freebsd.org/D20461
A commit references this bug:

Author: luporl
Date: Tue Jun 25 17:15:45 UTC 2019
New revision: 349377
URL: https://svnweb.freebsd.org/changeset/base/349377

Log:
  [PowerPC64] Don't mark module data as static

  Fixes panic when loading ipfw.ko and if_epair.ko built with modern
  compiler.

  Similar to arm64 and riscv, when using a modern compiler (!gcc4.2),
  code generated tries to access data in the wrong location, causing
  kernel panic (data storage interrupt trap) when loading if_epair
  and ipfw. Issue was reproduced with kernel/module compiled using
  gcc8 and clang8. It affects both ELFv1 and ELFv2 ABI environments.

  PR:		232387
  Submitted by:	alfredo.junior_eldorado.org.br
  Reported by:	Mark Millard
  Reviewed by:	jhibbits
  Differential Revision:	https://reviews.freebsd.org/D20461

Changes:
  head/sys/net/vnet.h
  head/sys/sys/pcpu.h