Bug 293957 - [fusefs]: page fault (0x78) in fuse_vnop_write during vnode recycling (csync2 synchro)
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 15.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: Alan Somers
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2026-03-21 17:14 UTC by zjk
Modified: 2026-04-10 16:46 UTC (History)
3 users (show)

See Also:


Attachments
Handle memory allocation failures more gracefully (4.60 KB, patch)
2026-03-30 14:30 UTC, Alan Somers
no flags Details | Diff

Description zjk 2026-03-21 17:14:39 UTC
I am experiencing a consistent kernel panic on FreeBSD 15.0-RELEASE-p4 when running a heavy file-scanning tool (csync2) on a local SSD while MooseFS (FUSE) mounts are active (even though the FUSE mounts are excluded from the scan).
Panic details:

    Panic String: page fault
    Fault virtual address: 0x78 (NULL pointer dereference)
    Instruction pointer: fuse_vnop_write+0x1d5
    Stack Trace:
    #0 vpanic
    #1 panic
    #2 trap_pfault
    #3 calltrap
    #4 fuse_vnop_write+0x1d5
    #5 VOP_WRITE_APV
    #6 vn_write
    #7 vn_io_fault
    #8 dofilewrite
    #9 sys_write
    #10 amd64_syscall

Observed behavior:
The panic occurs when vfs.numvnodes grows rapidly due to local disk activity. It seems like the vnode reclamation process (recycling) triggers a write operation in the fusefs module, which then tries to access a NULL pointer (offset 0x78).
System Environment:

    FreeBSD 15.0-RELEASE-p4 GENERIC
    RAM: 16GB
    FUSE mounts: MooseFS (mfsmount)
    Settings tried (but still panics): vfs.fusefs.data_cache_mode=0, kern.maxvnodes=2000000, vfs.fusefs.iov_credit=4.

Dump header from device: /dev/gpt/swap
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 2126323712
  Blocksize: 512
  Compression: none
  Dumptime: 2026-03-21 17:37:36 +0100
  Hostname: almach.zjk.pl
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 15.0-RELEASE-p4 GENERIC
  Panic String: page fault
  Dump Parity: 1743253822
  Bounds: 7
  Dump Status: good
Comment 1 Alan Somers freebsd_committer freebsd_triage 2026-03-22 02:13:37 UTC
(In reply to zjk from comment #0)
Do you have a full core dump?  If so, if you look in /var/crash/core.*.txt , there should be a more complete stack trace, with function arguments and stuff.

The stack trace that you showed definitely didn't come from the vnode reclamation process, because it starts with sys_write.  That means it's a write from userland.
Comment 2 zjk 2026-03-22 08:29:27 UTC
(In reply to Alan Somers from comment #1)
Yes, you're right. At first glance it looks like FUSE (from the MooseFS mount), but after extensive testing I concluded that the issue is related to memory management - this still needs to be verified:

1. Situation: I started synchronizing a website (40GB, several hundred files) from one server to multiple servers.

2. The website content contains mount points onto the network MooseFS filesystem - but I configured synchronization exclusions for the entire mounted resource (this volume is mounted the same way on the other servers).

3. csync2 causes panics not only during uploads but also during purely local file scanning. I've now confirmed that rsync also causes panics. (Ultimately, after many restarts, the servers do finish syncing, but this is not suitable for production - the panics recur reliably during sync.) The HTTP server runs constantly, and sometimes the core dump trace leads to it.

4. I varied these settings:
kern.maxvnodes=500000-2000000 (better at 2000000)
vfs.fusefs.iov_credit=4-128 (better at 64)
vfs.fusefs.data_cache_mode=0-1 (better at 0)
With these, vfs.numvnodes ranges from 60,000 to 250,000 when idle, and close to 800,000 while csync2 is running.

5. For MooseFS, I varied mfsentrycacheto, mfsattrcacheto, and mfsdirentrycacheto from 5 to 600 (best at 60). MooseFS mount options: rw,late,failok,noatime,mfsdelayedinit,mfscachemode=DIRECT,mfsport=9421,mountprog=/usr/local/bin/mfsmount,mfsentrycacheto=60,mfsattrcacheto=60,mfsdirentrycacheto=60,mfstimeout=30,mfsioretries=60

6. I still have free memory (1 GB) and swap usage is 0.

I'm sending core dumps, 2 examples:
https://www.filemail.com/d/sqswmdirehrziwk

Best regards
zjk
Comment 3 zjk 2026-03-23 14:00:16 UTC
Subject: Additional observations regarding vnode recycling and stability
I have more data points that might help pinpoint the problem in fusefs.

Observations:
    The Panic Trigger: I've confirmed that the panic (page fault 0x78 in fuse_vnop_write) consistently occurred exactly when vfs.numvnodes reached the kern.maxvnodes limit (initially set to 500,000, later 2,000,000).
    Workload Context: This happens on 2 out of 8 hosts in a cluster. These 2 hosts run heavy Apache/httpd traffic on MooseFS (FUSE) mounts. The other 6 hosts, which have lower file activity (around 500,000 vnodes), remained perfectly stable even with the 500,000 limit (these 6 hosts include the 3 file servers that store the MooseFS data).
    Current Stability: After increasing kern.maxvnodes to 3,000,000 on the 2 Apache servers and setting vfs.fusefs.data_cache_mode=0, the systems have become stable. But I have not yet run a sync (csync2/rsync) today.
    Current Stats: On the problematic hosts, vfs.numvnodes has now stabilized at approximately 2,050,000.
        When the limit was 2,000,000, maybe the system was forced into aggressive vnode reclamation/recycling, which triggered the crash.
       
Hypothesis:
The crash seems to be a race condition triggered by the vnode reclamation process. When the kernel tries to retire an inactive FUSE vnode under memory pressure, fuse_vnop_write is called (likely for a dirty page or metadata sync) and encounters a NULL pointer (offset 0x78).
Current Workaround Settings:

    vfs.fusefs.data_cache_mode=0
    vfs.fusefs.iov_credit=4 (reduced from 128)
    kern.maxvnodes=3000000
    Mount options: noatime (to reduce metadata writes)
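
For what it's worth, sysctl-based workarounds like the above can be persisted across reboots via /etc/sysctl.conf. This is only a sketch using the values reported here, not verified recommendations:

```
# /etc/sysctl.conf - workaround settings as reported in this bug
vfs.fusefs.data_cache_mode=0
vfs.fusefs.iov_credit=4
kern.maxvnodes=3000000
```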

But - once more - I have not yet run today's sync test with csync2/rsync (I need more time, and it must be a window when crashes are acceptable for the www servers ;) ).

Best regards
zjk
Comment 4 Alan Somers freebsd_committer freebsd_triage 2026-03-23 21:02:34 UTC
Those core dump files are too big for me to download.  They take over an hour apiece, and the download failed.  Instead of the full core dump, can you send the summary? It should be in /var/crash/core.txt.* .  If it's not there, you can generate it with this command:

crashinfo /var/crash/vmcore.0
Comment 5 zjk 2026-03-24 15:08:23 UTC
(In reply to Alan Somers from comment #4)
I have two full core dumps available. Since they are very large, please let me know if you need specific information from them (like variable values from kgdb) or if I should try to upload the full files to a specific location.


I was able to extract the stack backtrace from the vmcore using strings.
System version:
FreeBSD almach.zjk.pl 15.0-RELEASE-p4 FreeBSD 15.0-RELEASE-p4 GENERIC amd64
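
A hedged sketch of this kind of extraction (the vmcore path and grep context size below are illustrative, not the exact command used here):

```shell
# Pull panic strings and the backtrace text that follows them out of a vmcore.
# Usage: sh extract_panics.sh [/path/to/vmcore]
core="${1:-/var/crash/vmcore.0}"
strings "$core" | grep -A 20 '^panic:'
```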


################################################################################

cat panic_log_0.txt
panic: page fault
cpuid = 1
time = 1774124510
KDB: stack backtrace:
#0 0xffffffff80bbe1ed at kdb_backtrace+0x5d
#1 0xffffffff80b71576 at vpanic+0x136
#2 0xffffffff80b71433 at panic+0x43
#3 0xffffffff81079f69 at trap_pfault+0x3c9
#4 0xffffffff81050008 at calltrap+0x8
#5 0xffffffff839c9ad5 at fuse_vnop_write+0x1d5
#6 0xffffffff81140944 at VOP_WRITE_APV+0x94
#7 0xffffffff80c7d7f7 at vn_write+0x287
#8 0xffffffff80c750d1 at vn_io_fault+0x1c1
#9 0xffffffff80be0581 at dofilewrite+0x81
#10 0xffffffff80bdff87 at sys_write+0x127
#11 0xffffffff8107a8a6 at amd64_syscall+0x126
#12 0xffffffff810508fb at fast_syscall_common+0xf8
Uptime: 1h6m2s
Dumping 2083 out of 16168 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
<110>ipfw: 6200 Accept TCP 192.168.1.14:61453 192.168.1.27:22000 in via lagg0
<110>ipfw: 6200 Accept TCP 192.168.1.14:61453 192.168.1.27:22000 out via lagg0


##################################################################################



cat panic_log_9.txt
panic: page fault
cpuid = 2
time = 1774120492
KDB: stack backtrace:
#0 0xffffffff80bbe1ed at kdb_backtrace+0x5d
#1 0xffffffff80b71576 at vpanic+0x136
#2 0xffffffff80b71433 at panic+0x43
#3 0xffffffff81079f69 at trap_pfault+0x3c9
#4 0xffffffff81050008 at calltrap+0x8
#5 0xffffffff8341aad5 at fuse_vnop_write+0x1d5
#6 0xffffffff81140944 at VOP_WRITE_APV+0x94
#7 0xffffffff80c7d7f7 at vn_write+0x287
#8 0xffffffff80c750d1 at vn_io_fault+0x1c1
#9 0xffffffff80be0581 at dofilewrite+0x81
#10 0xffffffff80bdff87 at sys_write+0x127
#11 0xffffffff8107a8a6 at amd64_syscall+0x126
#12 0xffffffff810508fb at fast_syscall_common+0xf8
Uptime: 2h12m36s
Dumping 2251 out of 16168 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
<110>ipfw: 900 Deny UDP 192.168.1.102:35390 239.255.255.250:1900 in via lagg0
Comment 6 Alan Somers freebsd_committer freebsd_triage 2026-03-24 16:34:11 UTC
(In reply to zjk from comment #5)
I don't want the full core dumps.  I just want the crashinfo output.  They should be files with names like /var/crash/core.*.txt and they should be hundreds of kB apiece, not GB.  Do you have those?
Comment 7 zjk 2026-03-24 17:19:04 UTC
(In reply to Alan Somers from comment #6)
I have them, but they are all identical and carry no useful information - there is a version error:
"
cat core.txt.9
'version' has unknown type; cast it to its declared type
'version' has unknown type; cast it to its declared type
'version' has unknown type; cast it to its declared type
Unable to find matching kernel for /var/crash/vmcore.9
"
Comment 8 zjk 2026-03-24 17:22:04 UTC
(In reply to zjk from comment #7)
I am using a standard GENERIC kernel installed via freebsd-update. It seems it lacks the debug symbols required by crashinfo, which is why it fails with 'unknown type' and 'matching kernel' errors. However, the manual backtrace I provided earlier was extracted directly from the same vmcore and clearly shows the fault at fuse_vnop_write+0x1d5.
Comment 9 Alan Somers freebsd_committer freebsd_triage 2026-03-24 17:43:32 UTC
(In reply to zjk from comment #8)
Do you have a bunch of *.debug files in /usr/lib/debug/boot/kernel?  If not, then that may be the problem.  The easiest way to fix it is to download https://download.freebsd.org/releases/amd64/15.0-RELEASE/kernel-dbg.txz , then run "tar -C / -xf kernel-dbg.txz".  Then you can run "freebsd-update fetch; freebsd-update install".  Finally, you'll be able to run crashinfo again.  However, that will only work if the crash happened while running the latest release kernel.  If your uptime was more than a few months, then it's likely that an older kernel was running at the time.

The reason why I'd like to see the full crashinfo output is that it contains much more information than the stack trace shown by ddb.  For instance, line numbers and argument values.
Comment 10 zjk 2026-03-24 18:10:44 UTC
(In reply to Alan Somers from comment #9)
OK - I prepared and sent it. I delete the VMcore periodically, but these are the last two dumps I have.

https://www.filemail.com/d/patrqvzolwazjlo

If you need more core dumps, let me know. I can produce them for various scenarios by the end of the week.

zjk
Comment 11 Alan Somers freebsd_committer freebsd_triage 2026-03-24 18:19:48 UTC
(In reply to zjk from comment #10)
That's perfect.  But it would be even better if you could provide one more thing.  If the moosefs fuse daemon has some kind of --verbose or --debug output that prints a list of every FUSE opcode as it arrives, could you please provide whatever that shows before the crash?  If not, that's ok.  I can make do without.
Comment 12 zjk 2026-03-24 18:43:17 UTC
(In reply to Alan Somers from comment #11)
I understand what you're talking about. Yes, mfsmaster is dumping information to the syslog (maybe it's not a full dump of the operation, but I see some interesting things here). 

I'll try a csync2 run tomorrow (it will probably cause a crash). I'll gather the information, and maybe something interesting will be logged from MooseFS.

Maybe a "spontaneous" dump will be created? I'll also check the syslog. I'll post it when I have the complete file.
Comment 13 Alan Somers freebsd_committer freebsd_triage 2026-03-25 20:39:55 UTC
It's obvious that getblk failed, but I really don't have any idea why.  Could you please try reproducing the bug with this patch applied and share the output with me?  This would help to narrow it down.

diff --git a/sys/kern/vfs_bio.c b/sys/kern/vfs_bio.c
index 2368f863a248..6f5ac2b5f363 100644
--- a/sys/kern/vfs_bio.c
+++ b/sys/kern/vfs_bio.c
@@ -3928,8 +3928,10 @@ getblk(struct vnode *vp, daddr_t blkno, int size, int slpflag, int slptimeo,
        int error;
 
        error = getblkx(vp, blkno, blkno, size, slpflag, slptimeo, flags, &bp);
-       if (error != 0)
+       if (error != 0) {
+               printf("getblk failed: \n", strerror(error));
                return (NULL);
+       }
        return (bp);
 }
Comment 14 zjk 2026-03-26 06:46:54 UTC
(In reply to Alan Somers from comment #13)
Since this is a production machine, I'd prefer to avoid a custom kernel build if possible - but it isn't a big problem; I can compile a new kernel too. Also, your patch has a small typo (the printf format string is missing a %s).

Can I use this DTrace one-liner instead?
dtrace -n 'fbt::getblkx:return /arg1 != 0/ { printf("getblk failed for vnode %p, error: %d", args[0]->vp, arg1); }'
If the system panics, this message should still be visible in the dmesg buffer within the vmcore (which I can extract with strings). Would the return value of getblkx and the vnode pointer be enough for you to investigate further?

I just got new core dumps with messages from the MooseFS master - it looks like the master drops communication with the chunkservers - a network problem.
Comment 15 zjk 2026-03-26 06:57:57 UTC
(In reply to zjk from comment #14)
The environment uses Intel i219 NICs, which are known to be unstable under FreeBSD. I am experiencing frequent network glitches/timeouts on the lagg0 interface. It seems that fuse_vnop_write panics instead of returning an error when the underlying network connection to the MooseFS storage is interrupted. This would be consistent with your suspicion about getblk failing due to these network-induced timeouts.
Comment 16 zjk 2026-03-26 12:23:47 UTC
(In reply to zjk from comment #15)
Alan - I built the patched kernel and crashed it with rsync. But nothing interesting was written to the log, so I then ran kgdb.

Of course, if you want I can send you the core.txt, but I've analyzed the vmcore with kgdb and found that the panic does not occur in getblk.
The crash happens in fuse_inval_buf_range (called by fuse_vnop_write) at fuse_vnops.c:359.
Key findings from kgdb: 

vp is a valid VREG vnode. 
vp->v_data is a valid fuse_vnode_data (nid: 78819, size: 484059039). 
The panic is a Page Fault in frame #8.

It looks like fuse_inval_buf_range is trying to invalidate a buffer range and hits a fault, even though the vnode structures seem intact. The getblk debug patch was never reached. Please see the bt and p *vp output below.

bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80b710f9 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:519
#3  0xffffffff80b71607 in vpanic (fmt=0xffffffff811d12fe "%s", ap=ap@entry=0xfffffe00ff360920) at /usr/src/sys/kern/kern_shutdown.c:974
#4  0xffffffff80b71433 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:887
#5  0xffffffff81079f69 in trap_fatal (frame=<optimized out>, eva=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:969
#6  0xffffffff81079f69 in trap_pfault (frame=0xfffffe00ff3609a0, usermode=false, signo=<optimized out>, ucode=<optimized out>)
#7  <signal handler called>
#8  fuse_inval_buf_range (vp=vp@entry=0xfffff8007502f898, filesize=<optimized out>, start=<optimized out>, end=<optimized out>) at /usr/src/sys/fs/fuse/fuse_vnops.c:359
#9  0xffffffff8396bad5 in fuse_vnop_write (ap=<optimized out>) at /usr/src/sys/fs/fuse/fuse_vnops.c:2545
#10 0xffffffff81140944 in VOP_WRITE_APV (vop=0xffffffff83976898 <fuse_vnops>, a=a@entry=0xfffffe00ff360c40) at vnode_if.c:1096
#11 0xffffffff80c7d887 in VOP_WRITE (vp=0xfffff8007502f898, uio=0xfffffe00ff360da8, ioflag=131075, cred=0x0) at ./vnode_if.h:595
#12 vn_write (fp=fp@entry=0xfffff801833b7af0, uio=uio@entry=0xfffffe00ff360da8, active_cred=active_cred@entry=0xfffff802c2b86600, flags=<optimized out>, td=<optimized out>)
    at /usr/src/sys/kern/vfs_vnops.c:1300
#13 0xffffffff80c75161 in vn_io_fault (fp=0xfffff801833b7af0, uio=0xfffffe00ff360da8, active_cred=0xfffff802c2b86600, flags=0, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:1614
#14 0xffffffff80be0581 in fo_write (fp=0xfffff801833b7af0, uio=0xfffffe00ff360da8, active_cred=0x0, td=0xfffff8027eb6a780, flags=<optimized out>) at /usr/src/sys/sys/file.h:362
#15 dofilewrite (td=td@entry=0xfffff8027eb6a780, fd=fd@entry=152, fp=0xfffff801833b7af0, auio=auio@entry=0xfffffe00ff360da8, offset=offset@entry=-1, flags=flags@entry=0) at /usr/src/sys/kern/sys_generic.c:565
#16 0xffffffff80bdff87 in kern_writev (td=0xfffff8027eb6a780, fd=152, auio=0xfffffe00ff360da8) at /usr/src/sys/kern/sys_generic.c:492
#17 sys_write (td=0xfffff8027eb6a780, uap=<optimized out>) at /usr/src/sys/kern/sys_generic.c:407
#18 0xffffffff8107a8a6 in syscallenter (td=0xfffff8027eb6a780) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:193
#19 amd64_syscall (td=0xfffff8027eb6a780, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1208
#20 <signal handler called>
#21 0x000000082a78516a in ?? ()
Backtrace stopped: Cannot access memory at address 0x82078fe18


 f 8
#8  fuse_inval_buf_range (vp=vp@entry=0xfffff8007502f898, filesize=<optimized out>, start=<optimized out>, end=<optimized out>) at /usr/src/sys/fs/fuse/fuse_vnops.c:359
359                     if ((bp->b_flags & B_CACHE) != 0 && bp->b_dirtyoff < right_on) {


p *vp
$1 = {v_type = VREG, v_state = VSTATE_CONSTRUCTED, v_irflag = 0, v_seqc = 0, v_nchash = 443112150, v_hash = 309852434, v_op = 0xffffffff83976898 <fuse_vnops>, v_data = 0xfffff8015a143a80,
  v_mount = 0xfffffe00c1910b00, v_nmntvnodes = {tqe_next = 0xfffff8007541b898, tqe_prev = 0xfffff800755a8028}, {v_mountedhere = 0x0, v_unpcb = 0x0, v_rdev = 0x0, v_fifoinfo = 0x0}, v_hashlist = {
    le_next = 0x0, le_prev = 0xfffffe00f35ef4b8}, v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0xfffff8030ee0ff58, tqh_last = 0xfffff8030ee0ff68}, v_cache_dd = 0xfffff8030ee0ff58, v_lock = {
    lock_object = {lo_name = 0xffffffff839723ff "fuse", lo_flags = 116588544, lo_data = 0, lo_witness = 0x0}, lk_lock = 18446735288332363648, lk_exslpfail = 0, lk_pri = 44, lk_timo = 51}, v_interlock = {
    lock_object = {lo_name = 0xffffffff811d76f7 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, v_vnlock = 0xfffff8007502f908, v_vnodelist = {tqe_next = 0x0,
    tqe_prev = 0xfffff8007502fcc8}, v_lazylist = {tqe_next = 0x0, tqe_prev = 0xfffffe00c1911550}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff81217a74 "bufobj interlock", lo_flags = 86179840,
        lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0xffffffff818e46d0 <buf_ops_bio>, bo_object = 0xfffff803903c02e8, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff8007502f898,
    bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff8007502f9c0}, bv_root = {pt_root = 0x1}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff8007502f9e0}, bv_root = {
        pt_root = 0x1}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_domain = 4, bo_bsize = 65536}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {head = 1, sleepers = false}, v_holdcnt = 1,
  v_usecount = 1, v_iflag = 0, v_vflag = 0, v_mflag = 1, v_dbatchcpu = 0, v_writecount = 1, v_seqc_users = 0}

p *(struct fuse_vnode_data *)vp->v_data
$3 = {nid = 78819, generation = 1, parent_nid = 0, handles = {lh_first = 0xfffff80075db0900}, flag = 6656, attr_cache_timeout = {sec = 1623, frac = 14421952452054373985}, entry_cache_timeout = {sec = 1623,
    frac = 14421952452054373985}, last_local_modify = {tv_sec = 1563, tv_nsec = 892812274}, cached_attrs = {va_type = VREG, va_mode = 416, va_bsdflags = 0, va_uid = 80, va_gid = 80, va_nlink = 1,
    va_fsid = 18446744073390849800, va_fileid = 78819, va_size = 484059039, va_blocksize = 65536, va_atime = {tv_sec = 1774496793, tv_nsec = 0}, va_mtime = {tv_sec = 1774526210, tv_nsec = 0}, va_ctime = {
      tv_sec = 1774526210, tv_nsec = 0}, va_birthtime = {tv_sec = -1, tv_nsec = 0}, va_gen = 0, va_flags = 0, va_rdev = 0, va_bytes = 484054528, va_filerev = 18446744073709551615, va_vaflags = 0,
    va_spare = 0}, nlookup = 34, vtype = VNON, clusterw = {v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0}}
Comment 17 zjk 2026-03-26 12:37:21 UTC
(In reply to zjk from comment #16)
Alan,
I have analyzed the vmcore from the latest crash. It appears the panic occurs in fuse_inval_buf_range at fuse_vnops.c:359, where getblk returns NULL.
Here is the relevant part of the disassembly from my crash (frame #8):
gdb

358                     bp = getblk(vp, right_lbn, right_blksize, PCATCH, 0, 0);
   0xffffffff8396e3f0 <+304>:   call   0xffffffff80c380f0 <getblk>

359                     if ((bp->b_flags & B_CACHE) != 0 && bp->b_dirtyoff < right_on) {
=> 0xffffffff8396e3f5 <+309>:   testb  $0x20,0x78(%rax)

At the time of the panic, the register %rax is 0x0 (returned by getblk), which causes the Page Fault when the code immediately attempts to access bp->b_flags.
Values from kgdb at frame #8:

    bp = 0x0
    right_blksize = 65536
    right_lbn = <optimized out>
    vp = 0xfffff8007502f898 (VREG)

The fuse_inval_buf_range function appears to access the bp pointer without checking it for NULL. This explains why the debug patch in vfs_bio.c was not triggered - the system panics here first.
Comment 18 Alan Somers freebsd_committer freebsd_triage 2026-03-29 15:20:42 UTC
1) You can't use dtrace in this case because dtrace works asynchronously.  The message probably won't be printed before the panic happens.

2) The fact that you never saw that message printed even after patching the
kernel tells us that getblkx did not return an error; it just didn't initialize
bp.  That's valuable to know.  But I don't see how that can happen.  I must be
missing something.  Either that, or you put the debug printf that I supplied in the wrong location by mistake.  Because if getblkx returned an error, then that printf definitely should've printed.

3) At the very end of getblkx, just before the return, is a KASSERT that dereferences bp.  That dereference ought to trigger a page fault, but the KASSERT is disabled by default on release kernels.  If you can reproduce the problem with 16.0-CURRENT, or with a 15.0-RELEASE kernel built using the GENERIC-DEBUG configuration, that would get us closer to the cause of the problem.

4) getblkx also includes some CTR trace points.  Those things are a bit like an older, crappier version of dtrace.  But you should be able to view them from a core file.  That would also reveal some information.  But you'll have to rebuild the kernel to enable it, placing these options in your kernel config file:

options         KTR
options         KTR_BOOT_ENTRIES=1024
options         KTR_ENTRIES=(128*1024)
options         KTR_COMPILE=(KTR_ALL)
options         KTR_MASK=KTR_BUF
options         KTR_CPUMASK=0xFFFF
Comment 19 zjk 2026-03-30 12:48:37 UTC
Alan - I have the results.

I've also improved your debug instructions (the message now only prints the error number - and it works): it appears twice, at startup and then during the panic. The errors are 60 (ETIMEDOUT - at the panic) and 16 (EBUSY - at startup).

You'll also find both instances in the core under the entry ALAN-DEBUG.

I also created a package: core + KTR dump + a capture of the system console at startup, when the message appears.
Sending it. If I'm doing something wrong, let me know; I don't know if this is what you wanted.

https://www.filemail.com/d/cquppwqeturkouo
Comment 20 Alan Somers freebsd_committer freebsd_triage 2026-03-30 14:30:55 UTC
Created attachment 269230 [details]
Handle memory allocation failures more gracefully

That ETIMEDOUT clue was key.  It tells us that we're looking at a memory allocation failure.  That makes sense, considering your workload.  Unfortunately such failures are difficult to simulate during testing.  But could you please try the attached patch?  I think it should fix your problem.  With this patch, writes that would previously page fault should now fail with EINTR.

Also, would you mind telling me how much RAM and swap you have?
Comment 21 zjk 2026-03-30 15:25:27 UTC
The patch won't compile because the kernel doesn't have the strerror function - it throws a compilation error. I don't know how to find these functions in the libraries.

I have 16GB of RAM and 16GB of swap - currently empty, 3 hours after a reboot. If anything lands in swap, it's usually around 0.5GB after many hours of work.

Furthermore, I've seen error 16 displayed on the console twice during this time - but everything works stably despite KTR.
Comment 22 zjk 2026-03-30 15:30:57 UTC
(In reply to zjk from comment #21)
Ah... OK - now I understand, it's not the previous patch but the new one, attached at the top of the bug. OK, I'll apply it and let you know.
Comment 23 zjk 2026-03-30 15:50:10 UTC
You know what - unfortunately, the file versions are different. As I mentioned, I'm on 15.0-RELEASE-p4 (I held off on the latest p5). One file is OK and patches automatically, but the other has to be patched by hand. I won't be able to do that today.
Comment 24 Alan Somers freebsd_committer freebsd_triage 2026-03-30 16:10:47 UTC
That patch applies fine for me on the releng/15.0 branch.  What kind of conflict are you getting?  And if the printf patch is the problem, then don't bother with it.
Comment 25 zjk 2026-03-30 16:22:43 UTC
(In reply to Alan Somers from comment #24)
uname -a
FreeBSD almach.zjk.pl 15.0-RELEASE-p4 FreeBSD 15.0-RELEASE-p4 ALAN_DEBUG amd64


I ran mcdiff - the files look like completely different versions. Mine are from February 11, 2026; I'm updating them with freebsd-update. Maybe I shouldn't wait and should apply the p5 patch (I've been holding off on it)?

Anyway, I can manually see what I need to change, and it's visible in the code, but in completely different places. However, I need to do it exactly right.



Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
| sys/fs/fuse/fuse_node.c  |  6 +-----
| sys/fs/fuse/fuse_vnops.c | 24 +++++++++++++++++-------
| 2 files changed, 18 insertions(+), 12 deletions(-)
|
|diff --git a/sys/fs/fuse/fuse_node.c b/sys/fs/fuse/fuse_node.c
|index f4fb993a7ca1..6768e87fdef1 100644
|--- a/sys/fs/fuse/fuse_node.c
|+++ b/sys/fs/fuse/fuse_node.c
--------------------------
Patching file sys/fs/fuse/fuse_node.c using Plan A...
Hunk #1 failed at 506.
1 out of 1 hunks failed--saving rejects to sys/fs/fuse/fuse_node.c.rej
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/sys/fs/fuse/fuse_vnops.c b/sys/fs/fuse/fuse_vnops.c
|index ad7a2dcf84ef..dd3cc5f16092 100644
|--- a/sys/fs/fuse/fuse_vnops.c
|+++ b/sys/fs/fuse/fuse_vnops.c
--------------------------
Patching file sys/fs/fuse/fuse_vnops.c using Plan A...
Hunk #1 succeeded at 324 with fuzz 2 (offset -3 lines).
Hunk #2 failed at 337.
Hunk #3 failed at 358.
Hunk #4 failed at 728.
Hunk #5 failed at 1025.
Hunk #6 failed at 2677.
Hunk #7 failed at 3211.
6 out of 7 hunks failed--saving rejects to sys/fs/fuse/fuse_vnops.c.rej
Comment 26 zjk 2026-03-30 17:44:39 UTC
Alan - I've already made the changes and the compilation is ok. But I'll test it tomorrow :)
Comment 27 zjk 2026-03-31 06:30:27 UTC
(In reply to zjk from comment #26)
If I've made all the changes correctly (I still need to check), we still get ALAN_DEBUG error 16 at startup (about 32 instances). I'll start stress testing with rsync this afternoon.
Comment 28 zjk 2026-03-31 07:35:39 UTC
I have one more observation regarding system performance - I've always balanced HTTP threads, for example, so that all RAM was used but no swap. Now, without any configuration changes, I see this:

last pid: 20964; Load averages: 0.40, 0.70, 1.08 up 0+01:04:20 09:29:36
128 processes: 1 running, 127 sleeping
CPU: 1.6% user, 0.0% nice, 0.8% system, 0.8% interrupt, 96.9% idle
Mem: 2162M Active, 2388M Inact, 2595M Wired, 1545M Buf, 8649M Free
ARC:
Swap: 16G Total, 16G Free

So I suddenly have over 8.5 GB of free memory!

Question: is this a result of better memory management by FUSE, or has the number of buffers dropped, with a corresponding degradation in overall system performance?

Should we explore a more dynamic way for FUSE to handle request buffers?
Comment 29 zjk 2026-03-31 08:59:38 UTC
OK - I ran a few tests, all with positive results:
1. rsync - passes, although it throws a few error 60 messages
2. csync2 - ok
3. find - passes with error 60 messages.
So far, there's been no system crash.
Other results: besides the massive 8.5G of free memory, "find" is significantly slower. Scanning from "/" used to take 2-3 minutes, now it's 12 (but maybe that's a result of KTR still being enabled?).
I'll continue testing the system today.

Additionally - I have the procstat -kk output for the "frozen" find process. It's stuck in fdisp_wait_answ. The process is constantly sleeping in sleepq_timedwait_sig, waiting for FUSE answers.
Comment 30 zjk 2026-04-08 13:34:16 UTC
Hi Alan,

Quick update after 8 days of testing:

    Stability: The system is rock solid. Zero panics or reboots on 15.0-RELEASE-p4 under production web traffic. Your patch effectively fixed the immediate crash.
    Performance: As mentioned, find and rsync are still significantly slower (12 min vs the 2 min baseline).

Since this is a production machine, I'm staying on this patch for now to avoid crashes, but the performance hit is still there. Do you plan a 'version 2' of the patch? I'm ready to test it.
Comment 31 Alan Somers freebsd_committer freebsd_triage 2026-04-08 14:42:10 UTC
(In reply to zjk from comment #30)
It's good to hear that the patch is working for you.  As for performance, this patch should have no effect.  If you're noticing a performance difference, it might be because you have INVARIANTS enabled now and you didn't before.  When you switched from the GENERIC to the GENERIC-DEBUG kernel, that enabled invariants, which can definitely slow things down.  Also, it may be that your moosefs server is more responsive now than it was two weeks ago.

I'd really like to commit this in time for 15.1-RELEASE.  That means we have about one week.  Would you like to try building a GENERIC kernel with the patch to see if performance improves?  If not I'll commit as is.
Comment 32 zjk 2026-04-08 16:43:28 UTC
Hi Alan. 

I'm not an expert, but I just recompiled the kernel with the new options and specifically removed the invariants for fear of performance loss on a production machine. So the configuration looked like this:

cat /usr/src/sys/amd64/conf/ALAN_DEBUG
# Alan's debug configuration for FreeBSD
include GENERIC

ident ALAN_DEBUG

# Debugging options from Alan
options KTR
options KTR_BOOT_ENTRIES=1024
options KTR_ENTRIES=(128*1024)
options KTR_COMPILE=(KTR_ALL)
options KTR_MASK=KTR_BUF
options KTR_CPUMASK=0xFFFF

# Invariants to catch silent errors
#options INVARIANTS
#options INVARIANT_SUPPORT

i.e. only KTR is permanently enabled.

I also looked at other results: My lockstat shows that while the system is stable, we have high contention on so_rcv, so_snd, and CAM device lock.
It looks like the patch effectively serialized I/O and network operations to prevent panic, but it's hitting performance hard on UFS(?). The latency in g_io_request (21k+ nsec) suggests the I/O path is being throttled by these new locking constraints.

But you'll have to judge for yourself.

I tried it several times, and the rsync time increased from 2 to 12 minutes (more or less). I can of course test this further to confirm, but I think it comes at the expense of speed.
Comment 33 Alan Somers freebsd_committer freebsd_triage 2026-04-08 17:37:27 UTC
I forgot about KTR.  That could definitely affect performance too.  I don't use it much, because dtrace is usually superior.
Comment 34 zjk 2026-04-08 18:37:41 UTC
OK - I'll drop KTR and do some tests - we'll see what happens.
Comment 35 zjk 2026-04-09 17:31:23 UTC
Hi Alan. 

A bit rushed, but I did the tests. I reverted to GENERIC with your fix, and I'm still getting getblk failed messages. I synced my 40GB drive and several hundred thousand web content files several times. During one sync I received about 10-20 getblk failed messages. But of course, there was no kernel panic.

Regarding the times, after removing KTR the synchronization time returned to about 2 minutes (it was also about 2 minutes with your fix - almost identical).

Also keep in mind that my synchronization runs between two web servers, from a directory on UFS to a directory on UFS. MooseFS is bypassed (excluded from the synchronization), yet error 60 still appears. I also run a similar synchronization on MooseFS, but I didn't measure its time for comparison; I only have the rsync time for the hardlink backup - in any case, it's several dozen minutes, over an hour (and error 16 appears there).
Comment 36 Alan Somers freebsd_committer freebsd_triage 2026-04-09 17:53:46 UTC
You can remove those debugging error messages, if they're bothering you.  There are legitimate reasons for that code path to be reached.  I'm glad to hear that the performance is back to normal.  Are you satisfied that your original problem is fixed by this patch?  If so I'll commit it, and it will be available in FreeBSD 15.1.
Comment 37 zjk 2026-04-10 08:51:00 UTC
Let's briefly summarize:
- FreeBSD 15.0 with Moosefs shares on ZFS (including home, network share),
- random panics throughout the day,
- panics forced by a simple find,
- and almost 100% of panics forced by rsync, csync2 (but excluding Moosefs shares)

After the patch:
- no spontaneous or forced panics - stable operation
- for operations on UFS - no performance degradation

Concerns:
- for MooseFS shares on ZFS - unfortunately, I don't have the full results yet. What I have now are find results on the MooseFS share from identical machines with and without the patch: interestingly, the patched one runs about 10-30% faster (surprising, but reproducible). I still need to calmly generate read, write, and copy results on the MooseFS shares. This suggests there's probably still potential for further improvements in FUSE.

Alan - thanks for the fix, please consider further improvements to FUSE :)

zjk
Comment 38 zjk 2026-04-10 12:29:37 UTC
I have performed a side-by-side performance comparison on FreeBSD 15.0-RELEASE using MooseFS (FUSE). The results clearly show that the patch not only fixes the panics but also significantly improves metadata operation performance.
Test Environment:

    OS: FreeBSD 15.0-RELEASE (GENERIC)
    Filesystem: MooseFS (via FUSE) with 2x replication on SSD
    Test Case: 5,000 small files creation/deletion and 512MB sequential I/O.

Performance Comparison: Alan's Patch vs. Unpatched
Operation	With Patch (Alan)	Unpatched (Baseline)	Improvement
Create 5,000 files	10.27s	24.64s	2.4x Faster
Delete 5,000 files	1.67s	11.48s	6.8x Faster
Write 512MB (Seq)	4.64s	4.19s	No significant impact
Read 512MB (Seq)	4.70s	4.47s	No significant impact
Kernel CPU (sys)	3.20s	9.06s	65% Less Overhead

Maybe I should do more test runs...

Observations:

    Metadata Efficiency: The most striking improvement is in file creation and deletion. Without the patch, the system experiences heavy lock contention, evidenced by the high sys time (9.06s). Alan's patch reduces this overhead to 3.20s.
    Unlink Performance: The deletion of files is nearly 7 times faster with the patch. 
    Throughput: Large file I/O remains stable with negligible differences, proving the patch doesn't introduce any regressions in data throughput.

The system remains rock-solid during these tests, with no "getblk failed" errors or panics.
Comment 39 zjk 2026-04-10 16:29:10 UTC
FIO Benchmark Results: Observations from a user's perspective

Hi Alan,

I am just a regular user, not a specialist, but I wanted to provide you with some solid data from my MooseFS/FUSE setup. I compared two systems: almach (with your patch, p4) and almbis (unpatched, p2).

The results look very promising, especially regarding system stability and latency spikes.

Comparison Table (Patched vs. Unpatched)
Metric	Patched (almach)	Unpatched (almbis)	Change
Mixed: Read Throughput (BW)	2617 KiB/s	1665 KiB/s	+57.2%
Mixed: Avg Read Latency	12.1 ms	19.1 ms	-36.6%
RandWrite: IOPS (4k)	2665	2080	+28.1%
RandWrite: Max Latency	1.64 s	3.40 s	-51.7%
RandRead: IOPS (4k)	6119	5322	+15.0%
RandRead: 99.99th Latency	35.9 ms	240.1 ms	-85.0%
SeqRead: Max Latency	2.83 s	10.53 s	-73.1%

Simple Observations:
    Stability: The most important thing for me is that the massive 10-second hangs are gone. With the patch, the system feels much more "fluid" and consistent.
    Speed: I was surprised to see a 57% gain in mixed workloads. It seems the patch helps the system handle reading and writing at the same time much better.
    Reliability: No more panics or weird error messages during these heavy tests.

In my humble opinion, this patch is a huge improvement. Even as a non-expert, I can see that there might be even more potential for FUSE optimizations in the future, but this fix is already a great step forward.

Thank you for your hard work on this!

zjk
Comment 40 zjk 2026-04-10 16:46:09 UTC
System Efficiency & Stability Metrics:

Metric	Patched (almach)	Unpatched (almbis)	Change
RandWrite: Context Switches	143,017	183,178	-22.0%
RandWrite: Latency stdev	18.2 ms	27.8 ms	-34.5%
SeqWrite: 99.9th Latency	859 ms	1,010 ms	-15.0%

From a user's perspective, the system with the patch feels much more stable and "lighter." The 22% reduction in context switches suggests that the kernel handles I/O operations more efficiently, without wasting resources on lock contention. Most importantly, the dramatic reduction in latency spikes (jitter) means the system no longer "freezes" during heavy workloads, which was the main cause of previous issues.

zjk