https://ci.freebsd.org/job/FreeBSD-main-riscv64-test/14988/consoleFull

sys/netgraph/hub:loop ->

panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffc0afd93000
cpuid = 0
time = 1633913993
KDB: stack backtrace:
db_trace_self() at db_trace_self
KDB: enter: panic
[ thread pid 42865 tid 100055 ]
Stopped at      kdb_enter+0x4c: sd      zero,0(a0)

db:0:kdb.enter.panic> show pcpu
cpuid        = 0
dynamic pcpu = 0x2b6a80
curthread    = 0xffffffc0afe37680: pid 42865 tid 100055 critnest 1 "hub"
curpcb       = 0xffffffc0afd97d68
fpcurthread  = none
idlethread   = 0xffffffc001c1eb80: tid 100003 "idle: cpu0"
curvnet      = 0xffffffd00105fd00
spin locks held:
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d5fd5cdc063857132de19a94f63d1adbc581494e

commit d5fd5cdc063857132de19a94f63d1adbc581494e
Author:     Li-Wen Hsu <lwhsu@FreeBSD.org>
AuthorDate: 2021-10-13 21:31:22 +0000
Commit:     Li-Wen Hsu <lwhsu@FreeBSD.org>
CommitDate: 2021-10-13 21:31:22 +0000

    Temporarily skip sys.netgraph.hub.loop on RISC-V in CI

    This case panics kernel.

    PR:             259157
    Sponsored by:   The FreeBSD Foundation

 tests/sys/netgraph/hub.c | 5 +++++
 1 file changed, 5 insertions(+)
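For context, a CI skip of this sort in an ATF C test is usually just a guarded atf_tc_skip() call near the top of the test body. Below is a minimal sketch of the idea, not the actual tests/sys/netgraph/hub.c change; the test body and skip message are placeholders.

#include <atf-c.h>

ATF_TC_WITHOUT_HEAD(loop);
ATF_TC_BODY(loop, tc)
{
#ifdef __riscv
	/* Hypothetical skip, mirroring the "temporarily skip on RISC-V" idea. */
	atf_tc_skip("test panics the kernel on riscv (bug 259157)");
#endif
	/* ... the real looping-traffic test would follow here ... */
}

ATF_TP_ADD_TCS(tp)
{
	ATF_TP_ADD_TC(tp, loop);
	return (atf_no_error());
}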
With this build, the netgraph test suite was activated for this architecture for the first time: https://ci.freebsd.org/job/FreeBSD-main-riscv64-test/14974/

The netgraph/hub:loop test has been failing ever since, and it appears to have never worked on the RISC-V architecture. Do we have somebody with RISC-V experience who is able to investigate this issue?
Hi, I looked into this issue a little bit. It appears to be a kernel stack overflow, since we can see that the faulting address lies just beyond the thread's stack, in the guard page (a small sketch of this check follows the ktr trace below):

panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffc21fbab000

db> show thread
Thread 100239 at 0xffffffc220f230e0:
 proc (pid 822): 0xffffffd01c2d7000
 name: hub
 pcb: 0xffffffc21fbb3d68
 stack: 0xffffffc21fbac000-0xffffffc21fbb3fff   <------ stack bounds
 flags: 0x4 pflags: 0x100
 state: RUNNING (CPU 0)
 priority: 172
 container lock: sched lock 0 (0xffffffc00156e0c0)
 last voluntary switch: 6.246 s ago
 last involuntary switch: 0.000 s ago

I tried a build of the kernel bumping KSTACK_PAGES from 4 to 8, but it failed in the same way. For the thread to eat through 8 pages of kstack this way implies some kind of unwanted recursion (due to a loop in the graph?). I have almost no insight into how netgraph works. Lutz, can you provide any info on what this test is doing and what code path(s) it might be taking through the kernel?

Unfortunately, we can't easily obtain a useful backtrace if the stack has overflowed. I tried dumping the memory contents from the top of the stack:

db> x/gx 0xffffffc21fbac000,64
0xffffffc21fbac000:  ffffdfffffffffff  ffffffffffffffff
0xffffffc21fbac010:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac020:  ffffffffffffffff  ffff
0xffffffc21fbac030:  f7ffffffffffffff  ffffffffffdfffff
0xffffffc21fbac040:  ffffffffffffffff  7fffffffffffffff
0xffffffc21fbac050:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac060:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac070:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac080:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac090:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac0a0:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac0b0:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac0c0:  ffffffffffffff7f  fffffffffbffffff
0xffffffc21fbac0d0:  fffeffffffffffff  ffffffffffffffff
0xffffffc21fbac0e0:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac0f0:  ffffffffffffffff  fbffffffffffffff
0xffffffc21fbac100:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac110:  ffffffffffffffff  ffffffffbfffffff
0xffffffc21fbac120:  ffffffffffffffff  ffffffffffffffff
0xffffffc21fbac130:  fbffffffffffffff  ffffffffffffffff
0xffffffc21fbac140:  ffffffffffffffff  fffffffffffeffff
0xffffffc21fbac150:  228106050356e2b   1
0xffffffc21fbac160:  ffffffd000f98a00  1
0xffffffc21fbac170:  0                 101
0xffffffc21fbac180:  0                 0
0xffffffc21fbac190:  ffffffc0006fe947  ffffffd000f98ac0
0xffffffc21fbac1a0:  f40               ffffffd000f98ad8
0xffffffc21fbac1b0:  ffffffc21fbac210  ffffffc0003341ee
0xffffffc21fbac1c0:  ffffffffffffffff  efffffffffffffff
0xffffffc21fbac1d0:  ffffffd000f98a00  101
0xffffffc21fbac1e0:  ffffffd000f98ad8  ffffffd000f98a00
0xffffffc21fbac1f0:  0                 0
0xffffffc21fbac200:  ffffffc21fbac250  ffffffc000622408
0xffffffc21fbac210:  0                 101
0xffffffc21fbac220:  1                 0
0xffffffc21fbac230:  ffffffff          101
0xffffffc21fbac240:  ffffffc21fbac330  ffffffc0006216f2
0xffffffc21fbac250:  ffffffd000f98a18  1
0xffffffc21fbac260:  1                 0
0xffffffc21fbac270:  101               ffffffc00475cdc0
0xffffffc21fbac280:  fe                f90
0xffffffc21fbac290:  ffffffc0006fe947  ffffffd17c51a010
0xffffffc21fbac2a0:  10100000000       ffffffc00475cdc0
0xffffffc21fbac2b0:  51a10000          800
0xffffffc21fbac2c0:  ffffffd17c51a000  ffffffd17c51a010
0xffffffc21fbac2d0:  0                 0
0xffffffc21fbac2e0:  0                 ffffffc00475cdc0
0xffffffc21fbac2f0:  ffffffc00475cde0  ffffffd17c51a000
0xffffffc21fbac300:  fe                101
0xffffffc21fbac310:  0                 ffffffc00089d4b0
The contents near the very top are quite strange to me; I'm not sure what is being written here. By examining the addresses that fall in the kernel's .text range, I see repeated occurrences of ng_snd_item() and ng_apply_item() going down the stack.

Finally, the contents of the ktr buffer may give some insight:

db> show ktr
155 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d000
154 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d080
153 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d100
152 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d180
151 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d200
150 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d280
149 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d300
148 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d380
147 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d400
146 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d480
145 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d500
144 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d580
143 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d600
142 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d680
141 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d700
140 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d780
139 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d800
138 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d880
137 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d900
136 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d980
135 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59da00
134 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59da80
133 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59db00
132 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59db80
131 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59dc00
130 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59dc80
129 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59dd00
128 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59dd80
127 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59de00
126 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59de80
125 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59df00
124 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517000
123 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517080
122 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517100
121 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517180
120 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517200
119 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517280
118 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517300
117 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517380
116 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517400
115 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517480
114 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517500
113 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517580
112 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517600
111 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517680
110 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517700
109 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517780
108 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517800
107 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517880
106 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517900
105 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517980
104 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517a00
103 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517a80
102 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517b00
101 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517b80
100 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517c00
99 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517c80
98 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517d00
97 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517d80
96 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517e00
95 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517e80
94 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
93 (0xffffffc0047881c0:cpu2): ng_dequeue: node [a] (0xffffffd0085a0800) queue empty; queue flags 0x0
92 (0xffffffc0047881c0:cpu2): ng_dequeue: node [a] (0xffffffd0085a0800) returning item 0xffffffd17c51db80 as WRITER; queue flags 0x1
91 (0xffffffc0047881c0:cpu2): ngthread: node [a] (0xffffffd0085a0800) taken off worklist
90 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [a] (0xffffffd0085a0800) put on worklist
89 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [a] (0xffffffd0085a0800) queued item 0xffffffd17c51db80 as WRITER
88 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51db80
87 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dc00
86 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [d] (0xffffffd0085a0700) acquired item 0xffffffd17c51dc00
85 (0xffffffc0047870e0:cpu2): ng_dequeue: node [d] (0xffffffd0085a0700) queue empty; queue flags 0x0
84 (0xffffffc0047870e0:cpu2): ng_dequeue: node [d] (0xffffffd0085a0700) returning item 0xffffffd17c51dd00 as READER; queue flags 0x4
83 (0xffffffc0047870e0:cpu2): ngthread: node [d] (0xffffffd0085a0700) taken off worklist
82 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [d] (0xffffffd0085a0700) put on worklist
81 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [d] (0xffffffd0085a0700) queued item 0xffffffd17c51dd00 as READER
80 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd00
79 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c517f00
78 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517f00
77 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
76 (0xffffffc004787680:cpu1): ng_dequeue: node [a] (0xffffffd0085a0800) queue empty; queue flags 0x0
75 (0xffffffc004787680:cpu1): ng_dequeue: node [a] (0xffffffd0085a0800) returning item 0xffffffd17c51dc80 as WRITER; queue flags 0x1
74 (0xffffffc004787680:cpu1): ngthread: node [a] (0xffffffd0085a0800) taken off worklist
73 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [a] (0xffffffd0085a0800) put on worklist
72 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [a] (0xffffffd0085a0800) queued item 0xffffffd17c51dc80 as WRITER
71 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51dc80
70 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd00
69 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [c] (0xffffffd0085a0700) acquired item 0xffffffd17c51dd00
68 (0xffffffc004787c20:cpu2): ng_dequeue: node [c] (0xffffffd0085a0700) queue empty; queue flags 0x0
67 (0xffffffc004787c20:cpu2): ng_dequeue: node [c] (0xffffffd0085a0700) returning item 0xffffffd17c51de80 as READER; queue flags 0x4
66 (0xffffffc004787c20:cpu2): ngthread: node [c] (0xffffffd0085a0700) taken off worklist
65 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [c] (0xffffffd0085a0700) put on worklist
64 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [c] (0xffffffd0085a0700) queued item 0xffffffd17c51de80 as READER
63 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
62 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51de80
61 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
60 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
59 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
58 (0xffffffc0047881c0:cpu2): ng_dequeue: node [9] (0xffffffd0085a0900) queue empty; queue flags 0x0
57 (0xffffffc0047881c0:cpu2): ng_dequeue: node [9] (0xffffffd0085a0900) returning item 0xffffffd17c51de00 as READER; queue flags 0x4
56 (0xffffffc0047881c0:cpu2): ngthread: node [9] (0xffffffd0085a0900) taken off worklist
55 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [9] (0xffffffd0085a0900) already on worklist
54 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [9] (0xffffffd0085a0900) put on worklist
53 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [9] (0xffffffd0085a0900) queued item 0xffffffd17c51de00 as READER
52 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de00
51 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51df00
50 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c7f1f00
49 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c7f1f00
48 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51df00
47 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [8] (0xffffffd0085a0e00) acquired item 0xffffffd17c51df00
46 (0xffffffc0047870e0:cpu0): ng_dequeue: node [8] (0xffffffd0085a0e00) queue empty; queue flags 0x0
45 (0xffffffc0047870e0:cpu0): ng_dequeue: node [8] (0xffffffd0085a0e00) returning item 0xffffffd17c51de00 as READER; queue flags 0x4
44 (0xffffffc0047870e0:cpu0): ngthread: node [8] (0xffffffd0085a0e00) taken off worklist
43 (0xffffffc220f230e0:cpu2): ng_worklist_add: node [8] (0xffffffd0085a0e00) put on worklist
42 (0xffffffc220f230e0:cpu2): ng_queue_rw: node [8] (0xffffffd0085a0e00) queued item 0xffffffd17c51de00 as READER
41 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de00
40 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
39 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
38 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51df00
37 (0xffffffc220f230e0:cpu2): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de00
36 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
35 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
34 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
33 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
32 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
31 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
30 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
29 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
28 (0xffffffc220f230e0:cpu2): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
27 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c517f00
26 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c517e80
25 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
24 (0xffffffc004787680:cpu1): ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
23 (0xffffffc004787680:cpu1): ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51dd80 as WRITER; queue flags 0x1
22 (0xffffffc004787680:cpu1): ngthread: node [6] (0xffffffd0085a0900) taken off worklist
21 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
20 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51dd80 as WRITER
19 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd80
18 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
17 (0xffffffc004787c20:cpu2): ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
16 (0xffffffc004787c20:cpu2): ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51de00 as WRITER; queue flags 0x1
15 (0xffffffc004787c20:cpu2): ngthread: node [6] (0xffffffd0085a0900) taken off worklist
14 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
13 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51de00 as WRITER
12 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de00
11 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
10 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
9 (0xffffffc220f230e0:cpu0): ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
8 (0xffffffc0047881c0:cpu2): ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
7 (0xffffffc0047881c0:cpu2): ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51df00 as READER; queue flags 0x4
6 (0xffffffc0047881c0:cpu2): ngthread: node [6] (0xffffffd0085a0900) taken off worklist
5 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [6] (0xffffffd0085a0900) already on worklist
4 (0xffffffc220f230e0:cpu0): ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
3 (0xffffffc220f230e0:cpu0): ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51df00 as READER
2 (0xffffffc220f230e0:cpu0): ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51df00
1 (0xffffffc2215c2760:cpu1): bpf_attachd: bpf_attach called by pid 414, adding to writer list
0 (0xffffffc2215c2760:cpu1): bpf_attachd: bpf_attach called by pid 414, adding to active list
--- End of trace buffer ---
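To make the guard-page reasoning above concrete: the faulting address 0xffffffc21fbab000 falls in the page immediately below the low end of the stack range reported by "show thread", which is where the guard page sits for a downward-growing kernel stack. Here is a minimal userland sketch of that check (assuming a 4 KiB base page size on riscv64; this is illustration only, not kernel code):

#include <stdint.h>
#include <stdio.h>

#define	PAGE_SIZE	4096UL		/* assumed riscv64 base page size */

int
main(void)
{
	/* Values copied from the panic message and "show thread" output. */
	uint64_t fault_addr = 0xffffffc21fbab000UL;
	uint64_t kstack_lo  = 0xffffffc21fbac000UL;
	uint64_t kstack_hi  = 0xffffffc21fbb3fffUL;

	if (fault_addr >= kstack_lo - PAGE_SIZE && fault_addr < kstack_lo) {
		/*
		 * The kernel stack grows toward lower addresses, so the
		 * guard page is the page just below kstack_lo.
		 */
		printf("fault hit the kstack guard page: stack overflow\n");
	} else if (fault_addr >= kstack_lo && fault_addr <= kstack_hi) {
		printf("fault landed inside the kstack itself\n");
	} else {
		printf("fault is unrelated to this thread's kstack\n");
	}
	return (0);
}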
(In reply to Mitchell Horne from comment #3)

Ah, a stack overflow in the kernel while handling looping traffic? That's expected (see D30633). The interesting point is that it fails this spectacularly.

So the current workaround (disabling the test for RISC-V) is acceptable; otherwise we would need to implement an expected-failure marker per architecture ...
(In reply to Lutz Donnerhacke from comment #4)

The expected behaviour for a page fault in the kernel is to panic, with a couple of exceptions. Certainly, anything that overflows the thread's kernel stack in this way will cause this.

I managed to identify why this panics riscv and not amd64. It turns out the recursion is avoided when the architecture defines GET_STACK_USAGE(), as it allows for an early return from ng_snd_item(). The test also causes panics on 32-bit arm, but we did not notice because the armv7 CI is broken :(

I created a patch which implements the missing macro for the remaining archs, and it 'fixes' the panics in this test. See: https://reviews.freebsd.org/D32580

Still, I wonder if the test is flawed? If it is expected that the test might overflow the kernel stack, then it is not really safe for CI. Maybe the test needs rethinking, or additional safeguards.
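For anyone reading along, the mechanism the patch relies on is simple: ng_snd_item() checks the remaining kstack headroom via GET_STACK_USAGE() and, when it is low, queues the item for a netgraph worker thread instead of delivering it by a further (recursive) direct call. The following is a self-contained userland toy of that pattern, not the actual sys/netgraph/ng_base.c code; the fake 64 KiB "kstack" and the 25% threshold are made up for illustration:

#include <stdint.h>
#include <stdio.h>

#define	FAKE_KSTACK_SIZE	(64 * 1024)	/* stand-in for the real kstack size */

static uintptr_t stack_origin;			/* sampled at the outermost frame */
static int deferred;				/* work we pretended to queue */

/* Rough userland stand-in for GET_STACK_USAGE(total, used). */
static void
get_stack_usage(size_t *total, size_t *used)
{
	volatile char here;

	*total = FAKE_KSTACK_SIZE;
	*used = (size_t)(stack_origin - (uintptr_t)&here);
}

static void snd_item(int remaining);
/* Calling through a volatile pointer keeps the recursion real (no tail-call opt). */
static void (*volatile snd_item_p)(int) = snd_item;

static void
snd_item(int remaining)
{
	volatile char frame_pad[128];	/* give each frame some weight */
	size_t total, used;

	frame_pad[0] = 0;
	get_stack_usage(&total, &used);
	if (used > total || total - used < total / 4) {
		/* Low on headroom: stop recursing and "queue" the rest. */
		deferred = remaining;
		return;
	}
	if (remaining > 0)
		snd_item_p(remaining - 1);	/* direct dispatch: recurse */
}

int
main(void)
{
	volatile char origin;

	stack_origin = (uintptr_t)&origin;
	snd_item_p(1000000);	/* an effectively unbounded stream of items */
	printf("bailed out with %d items deferred to the queue\n", deferred);
	return (0);
}

In the kernel the same check just sets a "queue this item" flag before the item would otherwise be delivered by direct call, which is why the ktr trace above shows occasional ng_queue_rw/ng_worklist_add entries interleaved with the long runs of ng_acquire_read.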
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0d2224733e970aaa67a4e1af7b340044adda92f6

commit 0d2224733e970aaa67a4e1af7b340044adda92f6
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-11-30 15:15:56 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in
    the GEOM and netgraph subsystems to directly dispatch work items when
    there is sufficient stack space, rather than queuing them for a
    worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)
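For reference, a GET_STACK_USAGE() definition is generally of the following shape: "total" is the thread's kernel stack size, and "used" is the distance from the top of the kstack down to the address of a local variable in the current frame. The sketch below shows that general shape only (kernel-only code, modeled on the pre-existing definitions; the exact riscv/arm/mips versions are the ones this commit adds to each architecture's machine/proc.h):

/*
 * Illustrative sketch of a GET_STACK_USAGE() macro, not the committed code.
 * "total" is the thread's kernel stack size; "used" approximates how far
 * the current frame (the address of the local 'td') sits below the top of
 * that stack.
 */
#define	GET_STACK_USAGE(total, used) do {				\
	struct thread *td = curthread;					\
	(total) = td->td_kstack_pages * PAGE_SIZE;			\
	(used) = (char *)td->td_kstack + (total) - (char *)&td;	\
} while (0)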
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1d640e61358469c17fb0ce340f78104a50b26959

commit 1d640e61358469c17fb0ce340f78104a50b26959
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-12-07 18:13:47 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in
    the GEOM and netgraph subsystems to directly dispatch work items when
    there is sufficient stack space, rather than queuing them for a
    worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

    (cherry picked from commit 0d2224733e970aaa67a4e1af7b340044adda92f6)

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b3f404d51f292644fa2b4ade5dc740018a4440e7

commit b3f404d51f292644fa2b4ade5dc740018a4440e7
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-12-07 18:15:59 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in
    the GEOM and netgraph subsystems to directly dispatch work items when
    there is sufficient stack space, rather than queuing them for a
    worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

    (cherry picked from commit 0d2224733e970aaa67a4e1af7b340044adda92f6)

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)