So... I've been able to reproduce this at least 3 times: the actions of baloo (the KDE Plasma file indexer) create errors on ZFS. I see:

        NAME            STATE     READ WRITE CKSUM
        zhit            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/zhit0a  ONLINE       0     0   218
            gpt/zhit1a  ONLINE       0     0   218

errors: Permanent errors have been detected in the following files:

        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-11h45:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-13h30:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-12h30:/share/baloo/index
        /home/dgilbert/.local/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-13h45:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-12h45:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-12h15:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-11h30:/share/baloo/index
        zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-13h15:/share/baloo/index

I can hear your first question... is this the media? Both drives (4T WD Black SN850X NVMe) say:

        Media and Data Integrity Errors: 0

Both drives are at 0% used. No other operations seem to cause this problem. It's always baloo... although by number of operations, baloo is the major spender. Moving on.
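For anyone wanting to watch the same counters, the CKSUM column can be pulled out of `zpool status` mechanically. Just a sketch: here it runs over a pasted copy of the status above so it works anywhere; on the live system you'd pipe `zpool status zhit` in instead (pool and vdev names are the ones from this report):

```shell
# Print per-leaf-vdev checksum error counts from zpool status output.
# The heredoc is a copy of the status above; replace it with
# `zpool status zhit |` on the real machine.
awk '$1 ~ /^gpt\// { print $1, $5 }' <<'EOF'
        NAME            STATE     READ WRITE CKSUM
        zhit            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/zhit0a  ONLINE       0     0   218
            gpt/zhit1a  ONLINE       0     0   218
EOF
# prints:
# gpt/zhit0a 218
# gpt/zhit1a 218
```

Identical counts on both sides of the mirror would normally suggest the bad data was *written* identically to both disks rather than corrupted on read, which fits the "not the media" observation. The drive health claim can be re-checked on FreeBSD with `nvmecontrol logpage -p 2 nvme0` (the SMART / Health Information log page), if I have the page number right.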
The indexed space is large:

        zhit/home/dgilbert               3.2T   13G  3.2T   0%  /home/dgilbert
        zhit/home/dgilbert/.local        3.2T   19G  3.2T   1%  /home/dgilbert/.local
        zhit/home/dgilbert/tmp           3.2T   22G  3.2T   1%  /home/dgilbert/tmp
        zhit/home/dgilbert/.thunderbird  3.2T   14G  3.2T   0%  /home/dgilbert/.thunderbird
        zhit/home/dgilbert/.cache        3.2T  7.3G  3.2T   0%  /home/dgilbert/.cache
        yhit/retro                        16T  1.7G   16T   0%  /home/dgilbert/retro
        yhit/dgilbert                     16T  5.6G   16T   0%  /home/dgilbert/yhit
        yhit/games/wine_c                 16T  1.0G   16T   0%  /home/dgilbert/.wine/drive_c
        yhit/nextcloud                    18T  1.3T   16T   7%  /home/dgilbert/nextcloud
        vr:/home/dgilbert                3.9T  535G  3.3T  14%  /d/vr/dgilbert

That last NFS mount is reached via a symlink, and I'm pretty sure it's being indexed. I've seen the index as large as 50G.

zhit is the two 4T NVMes (and is where the index is stored and where the bad files show up). yhit is 4x 10T spinning rust with a 2T NVMe index and log. dmesg is here: https://termbin.com/6ppe

System is: FreeBSD hit.dclg.ca 15.0-BETA5 FreeBSD 15.0-BETA5 releng/15.0-n280912-69c726c15077 GENERIC amd64
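If the NFS-backed symlink really is being swept up, baloo can be told to skip it, which would at least shrink the churn while debugging. A sketch of the relevant knobs, assuming the stock config location (`~/.config/baloofilerc`); the path below is illustrative, not my actual symlink name, and the key names are worth checking against the installed baloo version:

```ini
; ~/.config/baloofilerc -- sketch, not a copy of a real config
[General]
; skip the NFS-backed tree (illustrative path, substitute the real symlink)
exclude folders[$e]=$HOME/vr/
; optionally: index filenames only, skip content extraction entirely
only basic indexing=true
```

I believe the same can be done from the command line with `balooctl config add excludeFolders <path>` (the tool is `balooctl6` on newer Plasma), followed by `balooctl purge` to rebuild the index from scratch.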
BTW... other than baloo creating this error, I'm using this system as my daily driver, with all sorts of activity... so for a userland process to be able to create a ZFS error... that seems dire.
Spinning my tires a bit on this. I realized it's four reproductions, actually. The first time this happened, ~/.local was just a directory in ~ (dgilbert). Then I made ~/.local its own filesystem... because a) ZFS, and b) the writes from baloo are insane, generating gigabytes of filesystem churn. For a 50G file and 8 snapshots 15 minutes apart, it was consuming over 300G. Right now (with the error sitting there)... baloo is still churning. I haven't killed it yet. I gather there's some compression.

[3:41:341]root@hit:/home/dgilbert> zfs list -rt all zhit/home/dgilbert/.local
NAME                                                                USED  AVAIL  REFER  MOUNTPOINT
zhit/home/dgilbert/.local                                          19.4G  3.16T  19.4G  /home/dgilbert/.local
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-13h30  39.5K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-13h45  39.5K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-14h15    72K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-14h30    77K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-14h45   110K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-15h15  35.5K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-15h30  35.5K      -  19.4G  -
zhit/home/dgilbert/.local@zfs-auto-snap_frequent-2025-11-13-15h45  35.5K      -  19.4G  -
[3:42:342]root@hit:/home/dgilbert> ll -h .local/baloo
ls: .local/baloo: No such file or directory
[3:43:343]root@hit:/home/dgilbert> ll -h .local/share/baloo/index
-rw-r--r--  1 dgilbert  wheel    28G Nov 13 07:16 .local/share/baloo/index
[3:44:344]root@hit:/home/dgilbert> du -h .local/share/baloo/index
 18G    .local/share/baloo/index

... so that's not bad... but I don't know what's happening to the actual index.
Created attachment 265479 [details] Core dump #1
Created attachment 265480 [details] Core dump #2
Added a couple of kernel dumps here... #2 is particularly dire --- I had to back out the last dozen transactions to get the pool to boot again.
I'm continuing to see crashes. Here's a bit of the backtrace (so people don't have to open the file) from core dump #1:

#5  0xffffffff8107bb98 in trap_fatal (frame=0xfffffe015c0dd910, eva=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:969
        type = <optimized out>
        handled = <optimized out>
#6  <signal handler called>
No locals.
#7  pctrie_node_load (p=p@entry=0x80000000000048, smr=0x0, access=PCTRIE_LOCKED) at /usr/src/sys/kern/subr_pctrie.c:123
No locals.
#8  pctrie_root_load (ptree=ptree@entry=0x80000000000048, smr=0x0, access=PCTRIE_LOCKED) at /usr/src/sys/kern/subr_pctrie.c:164
No locals.
#9  _pctrie_lookup_node (ptree=ptree@entry=0x80000000000048, node=0x0, index=16045693110842147038, smr=0x0, access=PCTRIE_LOCKED, parent_out=<optimized out>) at /usr/src/sys/kern/subr_pctrie.c:299
        parent = 0x0
        slot = <optimized out>

Now... Core dump #2 has something that might mean it's a different bug. It might just be the system trying to deal with ZFS corruption caused by #1. However, this just occurred last night (uploaded as core dump #3):

#5  0xffffffff81079b98 in trap_fatal (frame=0xfffffe015c0dd910, eva=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:969
        type = <optimized out>
        handled = <optimized out>
#6  <signal handler called>
No locals.
#7  pctrie_node_load (p=p@entry=0x80000000000048, smr=0x0, access=PCTRIE_LOCKED) at /usr/src/sys/kern/subr_pctrie.c:123
No locals.
#8  pctrie_root_load (ptree=ptree@entry=0x80000000000048, smr=0x0, access=PCTRIE_LOCKED) at /usr/src/sys/kern/subr_pctrie.c:164
No locals.
#9  _pctrie_lookup_node (ptree=ptree@entry=0x80000000000048, node=0x0, index=16045693110842147038, smr=0x0, access=PCTRIE_LOCKED, parent_out=<optimized out>) at /usr/src/sys/kern/subr_pctrie.c:299
        parent = 0x0
        slot = <optimized out>
Created attachment 265561 [details] Core.txt #3