I tried to destroy a zfs volume:
# zfs destroy -r zroot/foo/bar
cannot unmount '/foo/bar': unmount failed
Sure enough, fstat reveals the culprit:
# fstat | grep "foo/bar"
ehaupt gvfsd-trash 1814 25 /foo/bar 34 drwxr-xr-x 23 r
ehaupt gvfsd-trash 1814 26 /foo/bar 34 drwxr-xr-x 23 r
Attaching truss to the pid, I see:
# truss -f -p 1814
1814: poll({ 4/POLLIN },1,2504) = 0 (0x0)
1814: getfsstat(0x0,0,MNT_NOWAIT) = 41 (0x29)
1814: getfsstat(0x80153e140,96104,MNT_NOWAIT) = 41 (0x29)
1814: poll({ 4/POLLIN },1,2992) = 0 (0x0)
1814: getfsstat(0x0,0,MNT_NOWAIT) = 41 (0x29)
1814: getfsstat(0x80153ee00,96104,MNT_NOWAIT) = 41 (0x29)
1814: poll({ 4/POLLIN },1,2924) = 0 (0x0)
1814: getfsstat(0x0,0,MNT_NOWAIT) = 41 (0x29)
1814: getfsstat(0x80153e340,96104,MNT_NOWAIT) = 41 (0x29)
1814: poll({ 4/POLLIN },1,2965) = 0 (0x0)
1814: getfsstat(0x0,0,MNT_NOWAIT) = 41 (0x29)
1814: getfsstat(0x80153e340,96104,MNT_NOWAIT) = 41 (0x29)
^C
I am on:
FreeBSD freebsd.local 13.0-BETA4 FreeBSD 13.0-BETA4 #5 releng/13.0-n244620-3664067ea91: Wed Mar 3 21:05:06 CET 2021 root@freebsd.local:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
gvfs version: gvfs-1.46.2
Thanks for reporting this. It has bugged me for months; I habitually use htop to kill the process. If I recall correctly, gvfsd-trash also often prevents unmount of other types of file system, not just ZFS.
<https://lists.freebsd.org/pipermail/freebsd-current/2020-November/077692.html> refers to a 2013 post, "gvfsd-trash considered harmful": <https://lists.freebsd.org/pipermail/freebsd-gnome/2013-September/029033.html>
Back in November:
> As far as I can tell the problem occurs only when Thunar,
> which I rarely use, is wrongly opened by Firefox.
Ignore that statement. I'm almost certain that the problem occurs without a start of Thunar.
I'm asked for steps to reproduce. For me, it's very easily reproducible:
1. in Konsole, without changing to any directory on a mounted volume of a pool, touch (create) a file on the volume
2. use the context menu of Thunar, Move to Wastebasket
3. browse away from the volume, to Computer
4. in Konsole, try, fail to export the pool
5. quit Thunar
6. retry, fail to export the pool
7. use a filtered view of htop to kill (15 SIGTERM) the gvfsd-trash process
8. export the pool
9. import the pool
10. touch (create) a file on the same volume
11. use the context menu of PCManFM-Qt, Move to Trash
12. browse away from the volume, to Computer
13. try, fail to export the pool
14. quit PCManFM-Qt
15. retry, fail to export the pool
… and so on.
Thank you. For now please try that patch from https://github.com/helloSystem/ISO/files/6177145/glib-no-kqueue.txt to disable devel/glib20's suspicious use of kqueue, and see if it helps. I'd rather not debug gvfs's trashcan unless absolutely necessary.
I might be right about it being a glib kqueue issue. On https://wiki.freebsd.org/Gnome under "GNOME TODO" it says: gvfsd-trash | starts monitoring mount points, thus making them unmountable. This is actually a issue in glib's monitoring code.
I can't reproduce this.
# dd if=/dev/zero of=/tmp/ztrash bs=1024 count=100000
# zpool create ztrash /tmp/ztrash
# chown user:user /ztrash
$ touch /ztrash/hello
Delete /ztrash/hello in Thunar, change cwd outside mountpoint
$ ls /ztrash/.Trash-1000/files
hello
# lsof -n | grep PID-OF-gvfsd-trash
(no /ztrash path listed)
# zpool export /ztrash
(successful)
Re-importing, creating another file, deleting it in Thunar, gvfsd-trash still doesn't have the mountpoint open in "lsof", and I can still export successfully.
gvfs-1.46.1_2
FreeBSD 12.2
How is your setup different?
Created attachment 223490: hack to disable kqueue in glib
Somehow, after much rebuilding and restarting of gvfs and logging out and back in, I finally managed to reproduce this bug. gvfsd-trash had directories under /ztrash open in "lsof", and then I couldn't zfs export.
With this hack to glib, I managed to disable kqueue, after which gvfsd-trash no longer stopped zfs export. However, it also no longer had any directories open in "lsof", clicking on the trashcan in Thunar shows an empty directory, and other directory changes are not picked up in Thunar until I refresh.
I was hoping that with kqueue removed, it would fall back to dumber file change scanning, e.g. repeated opendir()/readdir() every few seconds, but apparently it stops all file monitoring completely instead?
Looks like a different debugging approach will have to be used: isolate where in the gvfs code it starts monitoring trash on mountpoints (it seems to begin in daemon/trashlib/trashwatcher.c), and follow from there.
Created attachment 223496: g_file_monitor_directory() test
Here's a simple test that monitors a directory with g_file_monitor_directory() like gvfsd-trash does. While it monitors a mountpoint:
On Linux:
1. "lsof" shows an inotify file descriptor, but no directory file descriptors.
2. umount succeeds.
3. glib delivers a "changed" event to GFileMonitor after umount.
On FreeBSD:
1. "lsof" shows a kqueue file descriptor, AND an open file descriptor for each monitored directory.
2. umount fails with "Device busy" (EBUSY).
Thus it seems the directories kept open by glib's kqueue usage are stopping the unmount. Either glib is using kqueue badly and needs fixing, or kqueue, by design, cannot monitor directories without keeping them open, and thus always stops unmounting (which could be much harder to fix).
-------
To build, make a SConstruct file with these lines and run "scons":
glib_mount_monitor = Environment()
glib_mount_monitor.ParseConfig('pkg-config --cflags --libs glib-2.0')
glib_mount_monitor.ParseConfig('pkg-config --cflags --libs gio-2.0')
glib_mount_monitor.Program('glib_mount_monitor', 'glib_mount_monitor.c')
Run:
./glib_mount_monitor /path/to/mountpoint
glib's system calls:
91540: openat(AT_FDCWD,"/mountpoint/.Trash-1003",O_RDONLY,00) = 11 (0xb)
kevent(10,{ 11,EVFILT_VNODE,EV_ADD|EV_CLEAR,NOTE_DELETE|NOTE_WRITE|NOTE_EXTEND|NOTE_ATTRIB|NOTE_RENAME|NOTE_REVOKE|NOTE_CLOSE_WRITE,0,0x801373400 },1,0x0,0,0x0) = 0 (0x0)
Even the sample code in the kqueue(2) man page, when monitoring the mountpoint, is enough to stop umount.
According to the man page, you monitor a directory by registering its file descriptor. Closing the file descriptor stops monitoring. Keeping the file descriptor open stops unmounting. NOTE_REVOKE is meant to notify you of unmounts, but it doesn't, and still stops unmounting.
It's looking like a kernel bug. Can some kernel developer more familiar with kqueue please comment?
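The registration in the trace above can be sketched in Python, whose select module wraps kqueue on the BSDs and macOS. This is only an illustration under stated assumptions, not glib's code: the helper names watch_directory and stop_watching are hypothetical, NOTE_CLOSE_WRITE is left out because Python's select module does not expose it, and on platforms without kqueue the sketch merely opens the directory. What it shows is that the watch exists only for as long as the directory descriptor stays open.

```python
import os
import select


def watch_directory(path):
    """Open `path` and, where kqueue exists, register it for vnode events.

    Hypothetical helper for illustration; returns (fd, kq), with kq set to
    None on platforms that lack kqueue (for example Linux).
    """
    # The descriptor must stay open for the lifetime of the watch. This is
    # the open file that makes a non-forced unmount fail with EBUSY.
    fd = os.open(path, os.O_RDONLY)
    if not hasattr(select, "kqueue"):
        return fd, None

    kq = select.kqueue()
    event = select.kevent(
        fd,
        filter=select.KQ_FILTER_VNODE,
        flags=select.KQ_EV_ADD | select.KQ_EV_CLEAR,
        fflags=(select.KQ_NOTE_DELETE | select.KQ_NOTE_WRITE
                | select.KQ_NOTE_EXTEND | select.KQ_NOTE_ATTRIB
                | select.KQ_NOTE_RENAME | select.KQ_NOTE_REVOKE),
    )
    # Register only: max_events=0 and timeout=0 make the call return at once.
    kq.control([event], 0, 0)
    return fd, kq


def stop_watching(fd, kq):
    """Close the kqueue (if any) and then the directory descriptor."""
    if kq is not None:
        kq.close()
    os.close(fd)
```

Calling watch_directory("/mountpoint") and holding on to the result should reproduce the busy mount described above; stop_watching() releases it again.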
I'm unsure whether it's related, but I use kqueue for my simple log monitoring software, and sometimes I don't get any kevent for modified files that are monitored. I thought it was a problem with my software, which works like a charm with the macOS kqueue implementation.
If there is an open file descriptor referencing any file on the volume, non-forced unmount fails with EBUSY. This is by design. It has nothing to do with kqueue.
What happens when -f is provided to the zfs destroy command to force unmount?
From the zfs man page:
-f Force an unmount of any file systems using the "zfs unmount -f" command. This option has no effect on non-file systems or unmounted file systems.
I ask because a normal umount will not unmount when there are open files. This happens with UFS file systems as well.
(In reply to John-Mark Gurney from comment #11) If -f is provided, the volume is destroyed without error.
devel/glib20 also has an alternate FAM backend option, but it will not help. This happens because open() on FreeBSD can be used only for reading, writing, or both. On macOS, by contrast, it can be used with O_EVTONLY, which does not keep the file, directory or mount point busy. Also, FreeBSD does not have O_NOATIME for open(). Only umount -f is workaround for this now.
(In reply to Graham Perrin from comment #1)
> … I habitually use htop to kill the process. …
(In reply to rozhuk.im from comment #13)
> … Only umount -f is workaround for this now.
Please, why not kill gvfsd-trash?
----
I dislike applying force when it's not known which files are open; and I don't get a list of open files before attempting to unmount.
(In reply to Graham Perrin from comment #14)
> I don't get a list of open files before attempting to unmount.
You will soon get that automatically with:
(1) sysutils/lsof patched with https://github.com/lsof-org/lsof/pull/151
(2) sysutils/bsdisks 0.25
(3) devel/gvfs (git 981787fd860346d2e43104d45dd650a84503d6a6 if possible)
(4) a GIO-based file manager, e.g. Nemo/Thunar/PCmanFM
When you try to unmount from the file manager and it fails with EBUSY, gvfs will run "lsof -t /mountpoint" and show a nice dialog with the processes that still have open files within the mountpoint (and their icons and command line arguments), with an "Unmount anyway" button to force it. This has existed for years; it just never worked on FreeBSD until now due to missing features in bsdisks and lsof.
While I am at it, can someone please make the gvfs port depend on lsof, bug 254322?
O_EVTONLY sounds interesting. Did Apple open source that code?
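The same "lsof -t /mountpoint" query can be run by hand before an unmount attempt. The Python sketch below is an illustration only, not gvfs's implementation: the function names are hypothetical, and it assumes lsof is installed and prints one PID per line in terse mode.

```python
import subprocess


def parse_lsof_terse(output):
    """Turn `lsof -t` output (one PID per line) into a sorted list of ints."""
    return sorted({int(token) for token in output.split() if token.isdigit()})


def pids_holding(mountpoint):
    """Return PIDs with files open on `mountpoint`; [] if lsof is missing."""
    try:
        result = subprocess.run(
            ["lsof", "-t", mountpoint],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # lsof is not installed, so nothing can be reported.
        return []
    # lsof exits non-zero when it finds nothing, hence check=False above.
    return parse_lsof_terse(result.stdout)
```

For example, pids_holding("/mountpoint") could be checked before deciding whether to kill a process or force the unmount.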
(In reply to Graham Perrin from comment #14)
> Please, why not kill gvfsd-trash?
Because you may have many apps that do some fs monitoring.
(In reply to Damjan Jovanovic from comment #15)
> O_EVTONLY sounds interesting. Did Apple open source that code?
I did not dig into this. Even if the source code is available, I suppose it is much different from what we have in our base.
I tried to start a discussion about this on the freebsd-hackers mailing list 4 years ago ("open(): O_EVTONLY and O_NOATIME"), but no one cared.
My next attempt was to add kernel unmount notifications, so that user-space apps could catch them and close all descriptors, in my FAM implementation for glib20: https://reviews.freebsd.org/D19690
I did not finish it, and I do not like this design: it cannot close all descriptors on the first unmount attempt without some sort of sleep in the kernel to wait for all processes to receive and handle the unmount notification.
(In reply to Damjan Jovanovic from comment #15)
Thank you, also:
root@mowa219-gjp4-8570p:~ # lsof /Volumes/t500/
lsof: WARNING: device cache mismatch: /dev/drm/230
lsof: WARNING: no ZFS support has been defined.
See 00FAQ for more information.
lsof: WARNING: /root/.lsof_mowa219-gjp4-8570p was updated.
root@mowa219-gjp4-8570p:~ #
(In reply to Damjan Jovanovic from comment #15)
> O_EVTONLY sounds interesting. Did Apple open source that code?
The XNU kernel is open source and available, e.g. on GitHub: https://github.com/apple/darwin-xnu/
But... better alternative called O_PATH is coming: https://reviews.freebsd.org/D29323. Unlike O_EVTONLY, which only allows unmount, it totally blocks filesystem access (with some exceptions like fstat()), thus making it appropriate for use over fuse, nfs and other slow filesystems.
Created attachment 224174: glib20-libinotify-O_PATH.patch
Finally O_PATH support has hit the tree in series of dozen commits starting from c78e124.
It is possible to test it with the attached patch, which includes a development version of libinotify with O_PATH support and a patch that enables libinotify support in glib20. Just do the following steps:
1. Install 14-CURRENT as of 16 Apr or newer.
2. Apply patch to the ports tree.
3. Rebuild devel/libinotify
4. Rebuild devel/glib20 with LIBINOTIFY file monitoring backend activated in port options.
5. PROFIT!!!
It may make sense to poke the glib kqueue file monitoring backend maintainers to add support for the O_PATH open(2) flag.
P.S. MFC of O_PATH support to 13-STABLE is planned in one week.
(In reply to Vladimir Kondratyev from comment #19) > Finally O_PATH support has hit the tree in series > of dozen commits starting from c78e124. Big thanks to @kib for O_PATH implementation
(In reply to Vladimir Kondratyev from comment #20) > Big thanks to @kib for O_PATH implementation A big +1
I will look into O_PATH after the MFC to 13 and try to integrate it into my alternate FAM for glib20.
(In reply to Vladimir Kondratyev from comment #18) > But... better alternative called O_PATH It is not clear to me: can I use fd = open(O_PATH) in kqueue to monitor fs changes events? > Rebuild devel/glib20 with LIBINOTIFY file monitoring backend activated in port options. You should promote this as option for glib20 port, probably as default on.
(In reply to rozhuk.im from comment #23)
> It is not clear to me: can I use fd = open(O_PATH)
> in kqueue to monitor fs changes events?
Yes, certainly. You should append the typical flag suffix O_NOFOLLOW|O_NONBLOCK|O_CLOEXEC to handle symlinks, FIFOs and execve() as well.
> You should promote this as option for glib20 port, probably as default on.
I'll do it after pushing the O_PATH changes into the libinotify master branch. Unfortunately, it will take some time, as the HDD that held the VMs required for regression testing has died, so I have to recreate these VMs.
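The flag combination suggested above can be sketched as follows. This is a minimal illustration with a hypothetical helper name, not code from libinotify or glib. It assumes the Python build exposes os.O_PATH; where it does not, the sketch falls back to O_RDONLY, which brings back the busy-mount behaviour.

```python
import os


def open_for_monitoring(path):
    """Open `path` with O_PATH plus the suggested flag suffix.

    Hypothetical helper. Falls back to O_RDONLY when this Python build
    does not expose os.O_PATH; that fallback keeps the mount busy again.
    """
    # O_NOFOLLOW: do not follow a symlink in the last path component.
    # O_NONBLOCK: do not block on FIFOs.
    # O_CLOEXEC: do not leak the descriptor across execve().
    flags = os.O_NOFOLLOW | os.O_NONBLOCK | os.O_CLOEXEC
    flags |= getattr(os, "O_PATH", os.O_RDONLY)
    return os.open(path, flags)
```

The returned descriptor still works with fstat(), one of the exceptions mentioned earlier in the thread, and according to the reply above it can be registered with kqueue like any other descriptor.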
O_NOATIME is missing in FreeBSD. It is required to not update dir access time on dir read.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255707
Using FreeBSD 14.0-CURRENT: since I last encountered this bug, a long time has passed (months, maybe). Has there been a fix somewhere? Or have I been lucky?
(In reply to Graham Perrin from comment #27) Bug should be gone after recent glib20 update, try it
(In reply to Dima Panov from comment #28)
Thanks, do you mean the "patches for lock getfsent() usage [1]" in 943a699e47d7205726f1f77983fef0b5e6c82a56?
<https://github.com/freebsd/freebsd-ports/commit/943a699e47d7205726f1f77983fef0b5e6c82a56>
<https://www.freshports.org/devel/glib20/#history>
I gained 2.72.3,2 two days ago:
% zgrep glib /var/log/messages.0.bz2
Jul 19 06:54:18 mowa219-gjp4-8570p-freebsd pkg[39708]: glib upgraded: 2.72.2,2 -> 2.72.3,2
%
Still occurring in FreeBSD 13.1-RELEASE with gvfs 1.50.2 and glib 2.72.3,2.
Steps to debug this:
1. Install devel/gdb
2. Add WITH_DEBUG_PORTS+=devel/glib20 to /etc/make.conf
3. Rebuild devel/glib20, reinstall, reboot system
4. sysctl kern.corefile=/tmp/%N.%I.core
   sysctl kern.compress_user_cores=1
   sysctl kern.compress_user_cores_level=3
   sysctl debug.ncores=16
5. Reproduce crash
6. Run this script: http://www.netlab.linkpc.net/download/software/os_cfg/FBSD/13/base/root/bin/coredumper.sh
7. Post script output here
Maybe steps 4-7 alone will give enough info.
It doesn't crash. Nobody ever said it crashed. It just prevents filesystems from being unmounted.
(In reply to Dima Panov from comment #28)
> Bug should be gone after recent glib20 update, try it
(In reply to Dag-Erling Smørgrav from comment #30)
> Still occurring in FreeBSD 13.1-RELEASE with gvfs 1.50.2 and
> glib 2.72.3,2.
Reproducible with the more recent version? 1.50.2_1
If reproducible: has the major upgrade to sysutils/lsof eased the situation?
<https://www.freebsd.org/status/report-2022-07-2022-09/#_sysutilslsof_major_upgrade>
c80f55d775ccc6a00cd9523b4fe781aa6171817a <https://github.com/FreeBSD/freebsd-ports/commit/c80f55d775ccc6a00cd9523b4fe781aa6171817a> (2022-09-27):
> devel/gvfs: Depend on sysutils/lsof at run time
>
> When a drive cannot be unmounted and returns EBUSY, gvfs calls
> "lsof -t /mountpoint" to find which processes have files open.
> This list is sent over the "show-processes" signal, which allows
> file managers to show which apps are preventing the unmount.
>
> For this to work, sysutils/lsof needs to be around.
The process preventing the unmount is gvfsd-trash.
The problem still persists. I've mitigated the issue by simply removing /usr/local/libexec/gvfsd-trash. Not very elegant but effective.