20241016 00:36:10 all (763/970): ext3fs.sh
Oct 16 00:36:58 mercat1 kernel: pid 58404 (swap), jid 0, uid 2007, was killed: failed to reclaim memory
Oct 16 00:37:03 mercat1 kernel: pid 58389 (swap), jid 0, uid 2007, was killed: failed to reclaim memory
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01d903d4e0
hardclock() at hardclock+0x103/frame 0xfffffe01d903d520
handleevents() at handleevents+0xaf/frame 0xfffffe01d903d560
timercb() at timercb+0x18e/frame 0xfffffe01d903d5b0
lapic_handle_timer() at lapic_handle_timer+0xab/frame 0xfffffe01d903d5d0
Xtimerint() at Xtimerint+0xb1/frame 0xfffffe01d903d5d0
--- interrupt, rip = 0xffffffff829c59b2, rsp = 0xfffffe01d903d6a0, rbp = 0xfffffe01d903d710 ---
ext2_htree_split_dirblock() at ext2_htree_split_dirblock+0xb2/frame 0xfffffe01d903d710
ext2_htree_add_entry() at ext2_htree_add_entry+0x233/frame 0xfffffe01d903d890
ext2_direnter() at ext2_direnter+0xac/frame 0xfffffe01d903da50
ext2_makeinode() at ext2_makeinode+0x128/frame 0xfffffe01d903daa0
ext2_create() at ext2_create+0x2c/frame 0xfffffe01d903dac0
VOP_CREATE_APV() at VOP_CREATE_APV+0x5f/frame 0xfffffe01d903dae0
vn_open_cred() at vn_open_cred+0x3f9/frame 0xfffffe01d903dc60
openatfp() at openatfp+0x287/frame 0xfffffe01d903ddb0
sys_openat() at sys_openat+0x3d/frame 0xfffffe01d903dde0
filemon_wrapper_openat() at filemon_wrapper_openat+0x12/frame 0xfffffe01d903de00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe01d903df30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe01d903df30
--- syscall (499, FreeBSD ELF64, openat), rip = 0xcb2e6d966fa, rsp = 0xcb2e5a793d8, rbp = 0xcb2e5a79490 ---
KDB: enter: watchdog timeout

https://people.freebsd.org/~pho/stress/log/log0554.txt

Seems easy to reproduce:

cd /usr/src/tools/test/stress2/misc
./all.sh ext3fs.sh
Presumably the problem is a directory entry with ep->e2d_reclen == 0, so the first loop in ext2_htree_split_dirblock() never terminates.
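To illustrate why a zero record length would hang such a loop: a directory walk advances by each entry's record length, so a length of 0 leaves the cursor in place forever. The following is only a minimal sketch under assumptions. struct toy_dirent and toy_count_entries() are hypothetical names invented here, the layout is simplified, and this is not the ext2fs code. It shows a walk that rejects a bad length instead of spinning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified entry header; NOT the real ext2fs on-disk layout. */
struct toy_dirent {
	uint32_t ino;
	uint16_t reclen;	/* distance in bytes to the next entry */
	uint16_t namlen;
};

/*
 * Count the entries in one directory block.  Returns -1 when an entry's
 * reclen is too small to make progress (including 0) or would run past
 * the end of the block, instead of looping forever.
 */
int
toy_count_entries(const uint8_t *block, size_t blksize)
{
	size_t off = 0;
	int n = 0;

	assert(block != NULL);
	while (off + sizeof(struct toy_dirent) <= blksize) {
		struct toy_dirent de;

		memcpy(&de, block + off, sizeof(de));
		if (de.reclen < sizeof(de) || de.reclen > blksize - off)
			return (-1);	/* corrupt entry: stop, do not spin */
		off += de.reclen;
		n++;
	}
	return (n);
}
```

Without the reclen check, an all-zero block would keep off at 0 and the while loop would never exit, which matches the kind of hang a watchdog timeout reports.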
A commit search for when the ext2fs problems (log0554.txt and log0555.txt) were introduced shows:

10/14 Kevin Bowling (2,9K) git: 7763b194d8de - main - igc: txrx function prototype cleanup OK
10/14 Doug Moore (3,2K) git: 2c8caa4b3925 - main - vfs_subr: optimize inval_buf_range FAIL
Created attachment 254324
Handle a range from negative to positive

A first guess is that somebody is asking to invalidate a range from a negative lower bound to a positive upper bound. I added a fix for that case, and the ext3fs.sh test seems fine for me.
(In reply to Doug Moore from comment #3)
The patch did not seem to make any difference to what I see:
https://people.freebsd.org/~pho/stress/log/log0556.txt
Created attachment 254341
diagnostic patch

Another patch, intended to diagnose the problem rather than fix it. I can't reproduce the problem because I can't install ext2. I'll examine the results after this new assertion fails.
(In reply to Doug Moore from comment #5)
You need to install the e2fsprogs package to run the ext2 tests.
I have not been able to trigger the assertion in your latest diagnostic patch. I still see the same issues as before.
Created attachment 254350
deal with negative values; sort the list

I've had some success with this patch.
(In reply to Doug Moore from comment #8)
Yes, I no longer see any issues with the ext2fs tests.
https://reviews.freebsd.org/D47200 is posted for review to address this bug.
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=e2414d91d33f31d6f2c9f49eef7a1553b5798c9e

commit e2414d91d33f31d6f2c9f49eef7a1553b5798c9e
Author:     Doug Moore <dougm@FreeBSD.org>
AuthorDate: 2024-10-22 21:54:34 +0000
Commit:     Doug Moore <dougm@FreeBSD.org>
CommitDate: 2024-10-22 21:54:34 +0000

    vfs_subr: maintain sorted tailq

    Pctries are based on unsigned index values. Type daddr_t is
    signed. Using daddr_t as an index type for a pctrie works, except that
    the pctrie considers negative values greater than nonnegative
    ones. Building a sorted tailq of bufs, based on pctrie results, sorts
    negative daddr_ts larger than nonnegative ones, and makes code that
    depends on the tailq being actually sorted broken.

    Write wrappers for the functions that do pctrie operations that depend
    on index ordering that fix the order problem, and use them in place of
    direct pctrie operations.

    PR:     282134
    Reported by:    pho
    Reviewed by:    kib, markj
    Tested by:      pho
    Fixes:  2c8caa4b3925aa7335 vfs_subr: optimize inval_buf_range
    Differential Revision:  https://reviews.freebsd.org/D47200

 sys/kern/vfs_subr.c | 56 +++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 44 insertions(+), 12 deletions(-)
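
The ordering problem described in the commit message can be seen in a few lines of userland C. This is a sketch under assumptions, not the code from the commit: toy_daddr_t stands in for the signed daddr_t, all toy_* names are hypothetical, and the sign-bit flip shown here is a generic order-preserving mapping chosen for illustration, whereas the commit itself adds wrappers around the pctrie operations in vfs_subr.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t toy_daddr_t;	/* hypothetical stand-in for the signed daddr_t */

/* Order as an unsigned-keyed trie sees it: negatives sort after nonnegatives. */
bool
toy_less_unsigned(toy_daddr_t a, toy_daddr_t b)
{
	return ((uint64_t)a < (uint64_t)b);
}

/* Flip the sign bit: maps signed order onto unsigned order (illustrative only). */
uint64_t
toy_ordered_key(toy_daddr_t v)
{
	return ((uint64_t)v ^ (UINT64_C(1) << 63));
}

/* Comparison on the remapped keys agrees with ordinary signed comparison. */
bool
toy_less_fixed(toy_daddr_t a, toy_daddr_t b)
{
	return (toy_ordered_key(a) < toy_ordered_key(b));
}

/* Self-check of the claims above; call it from any test harness. */
void
toy_check(void)
{
	assert(toy_less_unsigned(5, -1));	/* the surprising order */
	assert(!toy_less_unsigned(-1, 5));
	assert(toy_less_fixed(-1, 5));		/* the intended order */
	assert(toy_less_fixed(INT64_MIN, -1));
}
```

A list built by walking keys in the first order places a buf at logical block -1 after one at block 5, so any code that assumes the list is sorted by signed value, such as a range invalidation that stops at the first entry past the upper bound, can skip or mishandle entries.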