Bug 259157 - Test case sys.netgraph.hub.loop panics RISC-V kernel
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: tests
Version: CURRENT
Hardware: riscv Any
Importance: --- Affects Only Me
Assignee: freebsd-testing (Nobody)
URL: https://reviews.freebsd.org/D32580
Keywords:
Depends on:
Blocks:
 
Reported: 2021-10-13 21:20 UTC by Li-Wen Hsu
Modified: 2021-12-07 18:21 UTC (History)

See Also:


Description Li-Wen Hsu freebsd_committer freebsd_triage 2021-10-13 21:20:43 UTC
https://ci.freebsd.org/job/FreeBSD-main-riscv64-test/14988/consoleFull

sys/netgraph/hub:loop  ->  panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffc0afd93000
cpuid = 0
time = 1633913993
KDB: stack backtrace:
db_trace_self() at db_trace_self
KDB: enter: panic
[ thread pid 42865 tid 100055 ]
Stopped at      kdb_enter+0x4c: sd      zero,0(a0)
db:0:kdb.enter.panic> show pcpu
cpuid        = 0
dynamic pcpu = 0x2b6a80
curthread    = 0xffffffc0afe37680: pid 42865 tid 100055 critnest 1 "hub"
curpcb       = 0xffffffc0afd97d68
fpcurthread  = none
idlethread   = 0xffffffc001c1eb80: tid 100003 "idle: cpu0"
curvnet      = 0xffffffd00105fd00
spin locks held:
Comment 1 commit-hook freebsd_committer freebsd_triage 2021-10-13 21:32:29 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=d5fd5cdc063857132de19a94f63d1adbc581494e

commit d5fd5cdc063857132de19a94f63d1adbc581494e
Author:     Li-Wen Hsu <lwhsu@FreeBSD.org>
AuthorDate: 2021-10-13 21:31:22 +0000
Commit:     Li-Wen Hsu <lwhsu@FreeBSD.org>
CommitDate: 2021-10-13 21:31:22 +0000

    Temporarily skip sys.netgraph.hub.loop on RISC-V in CI

    This case panics kernel.

    PR:             259157
    Sponsored by:   The FreeBSD Foundation

 tests/sys/netgraph/hub.c | 5 +++++
 1 file changed, 5 insertions(+)
Comment 2 Lutz Donnerhacke freebsd_committer freebsd_triage 2021-10-13 23:08:43 UTC
With this build, the netgraph test suite was activated for this architecture for the first time: https://ci.freebsd.org/job/FreeBSD-main-riscv64-test/14974/

The netgraph/hub:loop test has failed since then. It seems to have never worked on the RISC-V architecture.

Do we have somebody with RISC-V experience who is able to investigate this issue?
Comment 3 Mitchell Horne freebsd_committer freebsd_triage 2021-10-14 20:22:21 UTC
Hi,

I looked into this issue a little bit. It appears to be a kernel stack overflow, since we can see that the faulting address lies in the guard page, one page below the lowest address of the thread's stack:

panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffc21fbab000
db> show thread
Thread 100239 at 0xffffffc220f230e0:
 proc (pid 822): 0xffffffd01c2d7000
 name: hub
 pcb: 0xffffffc21fbb3d68
 stack: 0xffffffc21fbac000-0xffffffc21fbb3fff   <------ stack bounds
 flags: 0x4  pflags: 0x100
 state: RUNNING (CPU 0)
 priority: 172
 container lock: sched lock 0 (0xffffffc00156e0c0)
 last voluntary switch: 6.246 s ago
 last involuntary switch: 0.000 s ago

I tried a build of the kernel bumping KSTACK_PAGES from 4 to 8, but it failed in the same way. For the thread to eat through 8 pages of kstack this way implies some kind of unwanted recursion (due to a loop in the graph?).
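
The guard-page arithmetic can be checked with a short user-space sketch. The addresses are copied from the panic message and the `show thread` output above; the 4 KiB page size, the single guard page directly below the stack, and the helper names `stack_pages` and `in_guard_page` are assumptions made for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Values copied from the panic message and the `show thread` output. */
static const uint64_t fault_addr = 0xffffffc21fbab000ULL;
static const uint64_t stack_lo   = 0xffffffc21fbac000ULL;
static const uint64_t stack_hi   = 0xffffffc21fbb3fffULL;

/* Number of whole pages covered by the inclusive range [lo, hi]. */
static uint64_t
stack_pages(uint64_t lo, uint64_t hi)
{
	return ((hi - lo + 1) / PAGE_SIZE);
}

/*
 * Hypothetical helper: true if addr falls in the one page directly below
 * the lowest stack address (assumed to be the guard page).
 */
static bool
in_guard_page(uint64_t addr, uint64_t lo)
{
	return (addr >= lo - PAGE_SIZE && addr < lo);
}
```

With these inputs the stack range covers 8 pages, matching the KSTACK_PAGES=8 build, and the faulting address sits exactly one page below the stack's lowest address.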

I have almost no insight into how netgraph works. Lutz, can you provide any info on what this test is doing and what code path(s) it might be taking through the kernel?

Unfortunately, we can't easily obtain a useful backtrace if the stack has overflowed. I tried dumping the memory contents from the top of the stack:

db> x/gx 0xffffffc21fbac000,64
0xffffffc21fbac000:     ffffdfffffffffff                ffffffffffffffff
0xffffffc21fbac010:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac020:     ffffffffffffffff                ffff
0xffffffc21fbac030:     f7ffffffffffffff                ffffffffffdfffff
0xffffffc21fbac040:     ffffffffffffffff                7fffffffffffffff
0xffffffc21fbac050:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac060:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac070:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac080:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac090:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac0a0:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac0b0:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac0c0:     ffffffffffffff7f                fffffffffbffffff
0xffffffc21fbac0d0:     fffeffffffffffff                ffffffffffffffff
0xffffffc21fbac0e0:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac0f0:     ffffffffffffffff                fbffffffffffffff
0xffffffc21fbac100:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac110:     ffffffffffffffff                ffffffffbfffffff
0xffffffc21fbac120:     ffffffffffffffff                ffffffffffffffff
0xffffffc21fbac130:     fbffffffffffffff                ffffffffffffffff
0xffffffc21fbac140:     ffffffffffffffff                fffffffffffeffff
0xffffffc21fbac150:     228106050356e2b                 1
0xffffffc21fbac160:     ffffffd000f98a00                1
0xffffffc21fbac170:     0                               101
0xffffffc21fbac180:     0                               0
0xffffffc21fbac190:     ffffffc0006fe947                ffffffd000f98ac0
0xffffffc21fbac1a0:     f40                             ffffffd000f98ad8
0xffffffc21fbac1b0:     ffffffc21fbac210                ffffffc0003341ee
0xffffffc21fbac1c0:     ffffffffffffffff                efffffffffffffff
0xffffffc21fbac1d0:     ffffffd000f98a00                101
0xffffffc21fbac1e0:     ffffffd000f98ad8                ffffffd000f98a00
0xffffffc21fbac1f0:     0                               0
0xffffffc21fbac200:     ffffffc21fbac250                ffffffc000622408
0xffffffc21fbac210:     0                               101
0xffffffc21fbac220:     1                               0
0xffffffc21fbac230:     ffffffff                        101
0xffffffc21fbac240:     ffffffc21fbac330                ffffffc0006216f2
0xffffffc21fbac250:     ffffffd000f98a18                1
0xffffffc21fbac260:     1                               0
0xffffffc21fbac270:     101                             ffffffc00475cdc0
0xffffffc21fbac280:     fe                              f90
0xffffffc21fbac290:     ffffffc0006fe947                ffffffd17c51a010
0xffffffc21fbac2a0:     10100000000                     ffffffc00475cdc0
0xffffffc21fbac2b0:     51a10000                        800
0xffffffc21fbac2c0:     ffffffd17c51a000                ffffffd17c51a010
0xffffffc21fbac2d0:     0                               0
0xffffffc21fbac2e0:     0                               ffffffc00475cdc0
0xffffffc21fbac2f0:     ffffffc00475cde0                ffffffd17c51a000
0xffffffc21fbac300:     fe                              101
0xffffffc21fbac310:     0                               ffffffc00089d4b0


The contents near the very top are quite strange to me, and I'm not sure what is being written here. By examining the addresses that fall in the kernel's .text range, I see repeated occurrences of ng_snd_item() and ng_apply_item() going down the stack.
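
Picking candidate return addresses out of a raw stack dump can be sketched as a simple range filter. The `.text` bounds `TEXT_LO` and `TEXT_HI` below are placeholders, not the real bounds of this kernel, and the function name `find_text_addrs` is hypothetical; the sample words are copied from the dump above. The even-address check is a heuristic that relies on RISC-V instructions being at least 2-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: placeholder .text bounds; take the real ones from the kernel image. */
#define TEXT_LO 0xffffffc000000000ULL
#define TEXT_HI 0xffffffc001000000ULL

/* A few words copied from the stack dump above. */
static const uint64_t sample_words[8] = {
	0xffffffc0006fe947ULL, 0xffffffd000f98ac0ULL,
	0xffffffc21fbac210ULL, 0xffffffc0003341eeULL,
	0xffffffc21fbac250ULL, 0xffffffc000622408ULL,
	0xffffffc21fbac330ULL, 0xffffffc0006216f2ULL,
};

/*
 * Copy every word of dump[] that lies in [text_lo, text_hi) and is even
 * (a plausible RISC-V instruction address) into out[], up to out_cap
 * entries. Returns the number of entries written.
 */
static size_t
find_text_addrs(const uint64_t *dump, size_t n, uint64_t text_lo,
    uint64_t text_hi, uint64_t *out, size_t out_cap)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < out_cap; i++) {
		uint64_t w = dump[i];

		if (w >= text_lo && w < text_hi && (w & 1) == 0)
			out[found++] = w;
	}
	return (found);
}
```

Each surviving value still has to be resolved against the kernel's symbol table to learn which function it belongs to; the filter only narrows the search.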

Finally, the contents of the ktr buffer may give some insight:

db> show ktr
155 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d000
154 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d080
153 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d100
152 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d180
151 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d200
150 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d280
149 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d300
148 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d380
147 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d400
146 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d480
145 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d500
144 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d580
143 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d600
142 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d680
141 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d700
140 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d780
139 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d800
138 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d880
137 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59d900
136 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59d980
135 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59da00
134 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59da80
133 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59db00
132 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59db80
131 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59dc00
130 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59dc80
129 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59dd00
128 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59dd80
127 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59de00
126 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd01c59de80
125 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd01c59df00
124 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517000
123 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517080
122 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517100
121 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517180
120 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517200
119 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517280
118 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517300
117 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517380
116 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517400
115 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517480
114 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517500
113 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517580
112 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517600
111 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517680
110 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517700
109 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517780
108 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517800
107 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517880
106 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517900
105 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517980
104 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517a00
103 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517a80
102 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517b00
101 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517b80
100 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517c00
99 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517c80
98 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517d00
97 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517d80
96 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517e00
95 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517e80
94 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
93 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [a] (0xffffffd0085a0800) queue empty; queue flags 0x0
92 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [a] (0xffffffd0085a0800) returning item 0xffffffd17c51db80 as WRITER; queue flags 0x1
91 (0xffffffc0047881c0:cpu2):             ngthread: node [a] (0xffffffd0085a0800) taken off worklist
90 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [a] (0xffffffd0085a0800) put on worklist
89 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [a] (0xffffffd0085a0800) queued item 0xffffffd17c51db80 as WRITER
88 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51db80
87 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dc00
86 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [d] (0xffffffd0085a0700) acquired item 0xffffffd17c51dc00
85 (0xffffffc0047870e0:cpu2):           ng_dequeue: node [d] (0xffffffd0085a0700) queue empty; queue flags 0x0
84 (0xffffffc0047870e0:cpu2):           ng_dequeue: node [d] (0xffffffd0085a0700) returning item 0xffffffd17c51dd00 as READER; queue flags 0x4
83 (0xffffffc0047870e0:cpu2):             ngthread: node [d] (0xffffffd0085a0700) taken off worklist
82 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [d] (0xffffffd0085a0700) put on worklist
81 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [d] (0xffffffd0085a0700) queued item 0xffffffd17c51dd00 as READER
80 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd00
79 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c517f00
78 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c517f00
77 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
76 (0xffffffc004787680:cpu1):           ng_dequeue: node [a] (0xffffffd0085a0800) queue empty; queue flags 0x0
75 (0xffffffc004787680:cpu1):           ng_dequeue: node [a] (0xffffffd0085a0800) returning item 0xffffffd17c51dc80 as WRITER; queue flags 0x1
74 (0xffffffc004787680:cpu1):             ngthread: node [a] (0xffffffd0085a0800) taken off worklist
73 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [a] (0xffffffd0085a0800) put on worklist
72 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [a] (0xffffffd0085a0800) queued item 0xffffffd17c51dc80 as WRITER
71 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51dc80
70 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd00
69 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [c] (0xffffffd0085a0700) acquired item 0xffffffd17c51dd00
68 (0xffffffc004787c20:cpu2):           ng_dequeue: node [c] (0xffffffd0085a0700) queue empty; queue flags 0x0
67 (0xffffffc004787c20:cpu2):           ng_dequeue: node [c] (0xffffffd0085a0700) returning item 0xffffffd17c51de80 as READER; queue flags 0x4
66 (0xffffffc004787c20:cpu2):             ngthread: node [c] (0xffffffd0085a0700) taken off worklist
65 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [c] (0xffffffd0085a0700) put on worklist
64 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [c] (0xffffffd0085a0700) queued item 0xffffffd17c51de80 as READER
63 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
62 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [b] (0xffffffd0085a0e00) acquired item 0xffffffd17c51de80
61 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
60 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [a] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
59 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
58 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [9] (0xffffffd0085a0900) queue empty; queue flags 0x0
57 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [9] (0xffffffd0085a0900) returning item 0xffffffd17c51de00 as READER; queue flags 0x4
56 (0xffffffc0047881c0:cpu2):             ngthread: node [9] (0xffffffd0085a0900) taken off worklist
55 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [9] (0xffffffd0085a0900) already on worklist
54 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [9] (0xffffffd0085a0900) put on worklist
53 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [9] (0xffffffd0085a0900) queued item 0xffffffd17c51de00 as READER
52 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [9] (0xffffffd0085a0900) acquired item 0xffffffd17c51de00
51 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51df00
50 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c7f1f00
49 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c7f1f00
48 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51df00
47 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [8] (0xffffffd0085a0e00) acquired item 0xffffffd17c51df00
46 (0xffffffc0047870e0:cpu0):           ng_dequeue: node [8] (0xffffffd0085a0e00) queue empty; queue flags 0x0
45 (0xffffffc0047870e0:cpu0):           ng_dequeue: node [8] (0xffffffd0085a0e00) returning item 0xffffffd17c51de00 as READER; queue flags 0x4
44 (0xffffffc0047870e0:cpu0):             ngthread: node [8] (0xffffffd0085a0e00) taken off worklist
43 (0xffffffc220f230e0:cpu2):      ng_worklist_add: node [8] (0xffffffd0085a0e00) put on worklist
42 (0xffffffc220f230e0:cpu2):          ng_queue_rw: node [8] (0xffffffd0085a0e00) queued item 0xffffffd17c51de00 as READER
41 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de00
40 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
39 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
38 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51df00
37 (0xffffffc220f230e0:cpu2):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de00
36 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
35 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
34 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
33 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
32 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
31 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
30 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004f00
29 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd1a6004e80
28 (0xffffffc220f230e0:cpu2):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd1a6004f00
27 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c517f00
26 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c517e80
25 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c517f00
24 (0xffffffc004787680:cpu1):           ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
23 (0xffffffc004787680:cpu1):           ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51dd80 as WRITER; queue flags 0x1
22 (0xffffffc004787680:cpu1):             ngthread: node [6] (0xffffffd0085a0900) taken off worklist
21 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
20 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51dd80 as WRITER
19 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51dd80
18 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
17 (0xffffffc004787c20:cpu2):           ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
16 (0xffffffc004787c20:cpu2):           ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51de00 as WRITER; queue flags 0x1
15 (0xffffffc004787c20:cpu2):             ngthread: node [6] (0xffffffd0085a0900) taken off worklist
14 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
13 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51de00 as WRITER
12 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de00
11 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
10 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [7] (0xffffffd0085a0800) acquired item 0xffffffd17c51de80
9 (0xffffffc220f230e0:cpu0):     ng_acquire_write: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51de80
8 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [6] (0xffffffd0085a0900) queue empty; queue flags 0x0
7 (0xffffffc0047881c0:cpu2):           ng_dequeue: node [6] (0xffffffd0085a0900) returning item 0xffffffd17c51df00 as READER; queue flags 0x4
6 (0xffffffc0047881c0:cpu2):             ngthread: node [6] (0xffffffd0085a0900) taken off worklist
5 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [6] (0xffffffd0085a0900) already on worklist
4 (0xffffffc220f230e0:cpu0):      ng_worklist_add: node [6] (0xffffffd0085a0900) put on worklist
3 (0xffffffc220f230e0:cpu0):          ng_queue_rw: node [6] (0xffffffd0085a0900) queued item 0xffffffd17c51df00 as READER
2 (0xffffffc220f230e0:cpu0):      ng_acquire_read: node [6] (0xffffffd0085a0900) acquired item 0xffffffd17c51df00
1 (0xffffffc2215c2760:cpu1): bpf_attachd: bpf_attach called by pid 414, adding to writer list
0 (0xffffffc2215c2760:cpu1): bpf_attachd: bpf_attach called by pid 414, adding to active list
--- End of trace buffer ---
Comment 4 Lutz Donnerhacke freebsd_committer freebsd_triage 2021-10-14 21:31:07 UTC
(In reply to Mitchell Horne from comment #3)

Ah, a stack overflow in the kernel during handling looping traffic?
That's expected. (see D30633)

The interesting point is that it fails so spectacularly.

So the current workaround (disabling the test for RISC-V) is acceptable; otherwise we need to implement expected failure per architecture ...
Comment 5 Mitchell Horne freebsd_committer freebsd_triage 2021-10-20 15:28:39 UTC
(In reply to Lutz Donnerhacke from comment #4)

The expected behaviour for a page fault in the kernel is to panic, with a couple of exceptions. Certainly, anything that overflows the thread's kernel stack in this way will cause this.

I managed to identify why this panics riscv, and not amd64. Turns out, the recursion is avoided when the architecture defines GET_STACK_USAGE(), as it allows for an early return from ng_snd_item(). The test also causes panics on 32-bit arm, but we did not notice because the armv7 CI is broken :(
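
The effect of having a stack-usage estimate can be modelled in user space. This is a sketch only: `stack_usage` and `should_queue` are hypothetical names, and the 25% and 50% free-space thresholds are illustrative assumptions rather than values quoted from this report. The idea is that a caller that knows how much kernel stack is left can queue the item for a worker thread instead of recursing further.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of a stack-usage estimate for a downward-growing stack:
 * total is the stack size in bytes, used is the distance from the
 * highest stack address down to the current stack pointer.
 */
static void
stack_usage(uintptr_t base, size_t pages, size_t page_size, uintptr_t sp,
    size_t *total, size_t *used)
{
	*total = pages * page_size;
	*used = (size_t)(base + *total - sp);
}

/*
 * Decide whether to queue an item instead of delivering it directly.
 * Assumed thresholds: queue when less than 25% of the stack is free,
 * or less than 50% for nodes flagged as heavy stack users.
 */
static bool
should_queue(size_t total, size_t used, bool hi_stack)
{
	size_t left = total - used;

	if (left * 4 < total)
		return (true);
	if (hi_stack && left * 2 < total)
		return (true);
	return (false);
}
```

Without such an estimate the decision cannot be made, so a loop in the graph keeps delivering items directly and each hop adds frames until the stack runs out.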

I created a patch which implements the missing macro for remaining archs, and it 'fixes' the panics in this test. See:
https://reviews.freebsd.org/D32580

Still, I wonder if the test is flawed? If it is expected that the test might overflow the kernel stack then it is not really safe for CI. Maybe the test needs rethinking, or additional safeguards.
Comment 6 commit-hook freebsd_committer freebsd_triage 2021-11-30 15:16:59 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=0d2224733e970aaa67a4e1af7b340044adda92f6

commit 0d2224733e970aaa67a4e1af7b340044adda92f6
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-11-30 15:15:56 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in the
    GEOM and netgraph subsystems to directly dispatch work items when there
    is sufficient stack space, rather than queuing them for a worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)
Comment 7 commit-hook freebsd_committer freebsd_triage 2021-12-07 18:15:13 UTC
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=1d640e61358469c17fb0ce340f78104a50b26959

commit 1d640e61358469c17fb0ce340f78104a50b26959
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-12-07 18:13:47 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in the
    GEOM and netgraph subsystems to directly dispatch work items when there
    is sufficient stack space, rather than queuing them for a worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

    (cherry picked from commit 0d2224733e970aaa67a4e1af7b340044adda92f6)

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)
Comment 8 commit-hook freebsd_committer freebsd_triage 2021-12-07 18:20:15 UTC
A commit in branch stable/12 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=b3f404d51f292644fa2b4ade5dc740018a4440e7

commit b3f404d51f292644fa2b4ade5dc740018a4440e7
Author:     Mitchell Horne <mhorne@FreeBSD.org>
AuthorDate: 2021-11-25 16:01:11 +0000
Commit:     Mitchell Horne <mhorne@FreeBSD.org>
CommitDate: 2021-12-07 18:15:59 +0000

    Implement GET_STACK_USAGE on remaining archs

    This definition enables callers to estimate remaining space on the
    kstack, and take action on it. Notably, it enables optimizations in the
    GEOM and netgraph subsystems to directly dispatch work items when there
    is sufficient stack space, rather than queuing them for a worker thread.

    Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
    not go unimplemented elsewhere.

    PR:             259157
    Reviewed by:    mav, kib, markj (previous version)
    MFC after:      1 week
    Sponsored by:   The FreeBSD Foundation
    Differential Revision:  https://reviews.freebsd.org/D32580

    (cherry picked from commit 0d2224733e970aaa67a4e1af7b340044adda92f6)

 sys/arm/include/proc.h   | 11 +++++++++++
 sys/geom/geom_io.c       |  8 --------
 sys/mips/include/proc.h  | 11 +++++++++++
 sys/netgraph/ng_base.c   |  3 +--
 sys/riscv/include/proc.h | 11 +++++++++++
 5 files changed, 34 insertions(+), 10 deletions(-)