Bug 254024 - devel/gvfs: gvfsd-trash latches to zfs volumes
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Many People
Assignee: freebsd-gnome (Nobody)
Reported: 2021-03-05 08:14 UTC by Emanuel Haupt
Modified: 2021-04-17 19:10 UTC
CC List: 8 users

Flags: bugzilla: maintainer-feedback? (gnome)


Attachments
hack to disable kqueue in glib (902 bytes, patch)
2021-03-21 18:32 UTC, Damjan Jovanovic
g_file_monitor_directory() test (1.05 KB, text/plain)
2021-03-22 09:24 UTC, Damjan Jovanovic
glib20-libinotify-O_PATH.patch (4.25 KB, patch)
2021-04-16 23:56 UTC, Vladimir Kondratyev

Description Emanuel Haupt freebsd_committer 2021-03-05 08:14:23 UTC
I tried to destroy a zfs volume:

# zfs destroy -r zroot/foo/bar
cannot unmount '/foo/bar': unmount failed

fstat sure enough reveals the culprit:

# fstat | grep "foo/bar"
ehaupt   gvfsd-trash  1814   25 /foo/bar     34 drwxr-xr-x      23  r
ehaupt   gvfsd-trash  1814   26 /foo/bar     34 drwxr-xr-x      23  r

Attaching truss to the pid I see:

# truss -f -p 1814
 1814: poll({ 4/POLLIN },1,2504)                 = 0 (0x0)
 1814: getfsstat(0x0,0,MNT_NOWAIT)               = 41 (0x29)
 1814: getfsstat(0x80153e140,96104,MNT_NOWAIT)   = 41 (0x29)
 1814: poll({ 4/POLLIN },1,2992)                 = 0 (0x0)
 1814: getfsstat(0x0,0,MNT_NOWAIT)               = 41 (0x29)
 1814: getfsstat(0x80153ee00,96104,MNT_NOWAIT)   = 41 (0x29)
 1814: poll({ 4/POLLIN },1,2924)                 = 0 (0x0)
 1814: getfsstat(0x0,0,MNT_NOWAIT)               = 41 (0x29)
 1814: getfsstat(0x80153e340,96104,MNT_NOWAIT)   = 41 (0x29)
 1814: poll({ 4/POLLIN },1,2965)                 = 0 (0x0)
 1814: getfsstat(0x0,0,MNT_NOWAIT)               = 41 (0x29)
 1814: getfsstat(0x80153e340,96104,MNT_NOWAIT)   = 41 (0x29)
^C

I am on:
FreeBSD freebsd.local 13.0-BETA4 FreeBSD 13.0-BETA4 #5 releng/13.0-n244620-3664067ea91: Wed Mar  3 21:05:06 CET 2021     root@freebsd.local:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

gvfs version:
gvfs-1.46.2
Comment 1 Graham Perrin 2021-03-20 13:03:28 UTC
Thanks for reporting this. It has bugged me for months; I habitually use htop to kill the process.

If I recall correctly, gvfsd-trash also often prevents unmount of other types of file system; not just ZFS. 

<https://lists.freebsd.org/pipermail/freebsd-current/2020-November/077692.html> refers to a 2013 post, 

gvfsd-trash considered harmful
<https://lists.freebsd.org/pipermail/freebsd-gnome/2013-September/029033.html> 

Back in November: 

> As far as I can tell the problem occurs only when Thunar, 
> which I rarely use, is wrongly opened by Firefox.

Ignore that statement. I'm almost certain that the problem occurs without a start of Thunar.
Comment 2 Graham Perrin 2021-03-21 09:57:09 UTC
I'm asked for steps to reproduce. 

For me, it's very easily reproducible: 

1. in Konsole, without changing to any directory on a mounted volume of a pool, touch (create) a file on the volume

2. use the context menu of Thunar, Move to Wastebasket

3. browse away from the volume, to Computer

4. in Konsole, try, fail to export the pool

5. quit Thunar

6. retry, fail to export the pool

7. use a filtered view of htop to kill (15 SIGTERM) the gvfsd-trash process

8. export the pool

9. import the pool

10. touch (create) a file on the same volume

11. use the context menu of PCManFM-Qt, Move to Trash

12. browse away from the volume, to Computer

13. try, fail to export the pool

14. quit PCManFM-Qt

15. retry, fail to export the pool

… and so on.
Comment 3 Damjan Jovanovic 2021-03-21 10:16:21 UTC
Thank you. For now please try that patch from https://github.com/helloSystem/ISO/files/6177145/glib-no-kqueue.txt to disable devel/glib20's suspicious use of kqueue, and see if it helps. I'd rather not debug gvfs's trashcan unless absolutely necessary.
Comment 4 Damjan Jovanovic 2021-03-21 13:24:02 UTC
I might be right about it being a glib kqueue issue. On https://wiki.freebsd.org/Gnome under "GNOME TODO" it says:

gvfsd-trash | starts monitoring mount points, thus making them unmountable. This is actually a issue in glib's monitoring code.
Comment 5 Damjan Jovanovic 2021-03-21 16:45:03 UTC
I can't reproduce this.

# dd if=/dev/zero of=/tmp/ztrash bs=1024 count=100000
# zpool create ztrash /tmp/ztrash
# chown user:user /ztrash
$ touch /ztrash/hello
Delete /ztrash/hello in Thunar, change cwd outside mountpoint
$ ls /ztrash/.Trash-1000/files
hello
# lsof -n | grep PID-OF-gvfsd-trash
(no /ztrash path listed)
# zpool export ztrash
(successful)

Re-importing, creating another file, deleting it in Thunar, gvfsd-trash still doesn't have the mountpoint open in "lsof", and I can still export successfully.

gvfs-1.46.1_2
FreeBSD 12.2

How is your setup different?
Comment 6 Damjan Jovanovic 2021-03-21 18:32:33 UTC
Created attachment 223490 [details]
hack to disable kqueue in glib

Somehow after much rebuilding and restarting gvfs and logging out and back in, I finally managed to reproduce this bug. gvfsd-trash opened directories under /ztrash in "lsof", and then I couldn't zfs export.

With this hack to glib, I managed to disable kqueue, after which gvfsd-trash didn't stop zfs export any more, but it also didn't have any directories open in "lsof", and clicking on the trashcan in Thunar shows an empty directory. Other directory changes are not picked up in Thunar until I refresh.

I was hoping that, with kqueue removed, it would fall back to dumber file change scanning, e.g. by repeated opendir()/readdir() every few seconds, but apparently it stops all file monitoring completely instead?

Looks like a different debugging approach will have to be used: isolate where in the gvfs code it starts monitoring trash on mountpoints (it seems to begin in daemon/trashlib/trashwatcher.c), and follow from there.
Comment 7 Damjan Jovanovic 2021-03-22 09:24:01 UTC
Created attachment 223496 [details]
g_file_monitor_directory() test

Here's a simple test that monitors a directory with g_file_monitor_directory() like gvfsd-trash does.

While it monitors a mountpoint:

On Linux:
1. "lsof" shows an inotify file descriptor, but no directory file descriptors.
2. umount succeeds.
3. glib delivers a "changed" event to GFileMonitor after umount.

On FreeBSD:
1. "lsof" shows a kqueue file descriptor, AND an open file descriptor for each monitored directory.
2. umount fails with "Device busy" (EBUSY).

Thus it seems the directories kept open by glib's kqueue usage are stopping the unmount.

Either glib is using kqueue badly and needs fixing, or kqueue, by design, cannot monitor directories without keeping them open, and thus always stops unmounting (which could be much harder to fix).

-------

To build, make a SConstruct file with these lines and run "scons":
glib_mount_monitor = Environment()
glib_mount_monitor.ParseConfig('pkg-config --cflags --libs glib-2.0')
glib_mount_monitor.ParseConfig('pkg-config --cflags --libs gio-2.0')
glib_mount_monitor.Program('glib_mount_monitor', 'glib_mount_monitor.c')

Run: ./glib_mount_monitor /path/to/mountpoint
Comment 8 Damjan Jovanovic 2021-03-22 09:58:43 UTC
glib's system calls:

91540: openat(AT_FDCWD,"/mountpoint/.Trash-1003",O_RDONLY,00) = 11 (0xb)
kevent(10,{ 11,EVFILT_VNODE,EV_ADD|EV_CLEAR,NOTE_DELETE|NOTE_WRITE|NOTE_EXTEND|NOTE_ATTRIB|NOTE_RENAME|NOTE_REVOKE|NOTE_CLOSE_WRITE,0,0x801373400 },1,0x0,0,0x0) = 0 (0x0)



Even the sample code in the kqueue(2) man page, when monitoring the mountpoint, is enough to stop umount.

According to the man page, you monitor a directory by registering its file descriptor. Closing the file descriptor stops monitoring. Keeping the file descriptor open stops unmounting. NOTE_REVOKE is meant to notify you of unmounts, but it doesn't, and still stops unmounting.
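
For illustration, the registration seen in the truss output above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not glib's actual code: the function name watch_directory is hypothetical, only a subset of the NOTE_* flags from the trace is requested, and the #else branch exists only so that the sketch still compiles on systems without kqueue.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

#ifdef __FreeBSD__
#include <sys/event.h>
#include <sys/time.h>
#endif

/*
 * Hypothetical helper: open dir read-only and register it for vnode
 * events on a new kqueue.  On success, returns the kqueue descriptor and
 * stores the directory descriptor in *dirfd.  On failure, returns -1 and
 * sets *dirfd to -1.
 *
 * As long as *dirfd stays open, a non-forced unmount of the file system
 * containing dir fails with EBUSY.
 */
int
watch_directory(const char *dir, int *dirfd)
{
#ifdef __FreeBSD__
	struct kevent ev;
	int fd, kq;

	*dirfd = -1;
	fd = open(dir, O_RDONLY);
	if (fd == -1)
		return (-1);
	kq = kqueue();
	if (kq == -1) {
		close(fd);
		return (-1);
	}
	/* Register the directory descriptor itself, as in the trace. */
	EV_SET(&ev, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
	    NOTE_DELETE | NOTE_WRITE | NOTE_EXTEND | NOTE_ATTRIB |
	    NOTE_RENAME | NOTE_REVOKE, 0, NULL);
	if (kevent(kq, &ev, 1, NULL, 0, NULL) == -1) {
		close(kq);
		close(fd);
		return (-1);
	}
	*dirfd = fd;
	return (kq);
#else
	/* No kqueue here; this branch only keeps the sketch compilable. */
	(void)dir;
	*dirfd = -1;
	return (-1);
#endif
}
```

The caller would be responsible for closing both descriptors. Closing the directory descriptor is what ends the monitoring, and it is also what lets the unmount proceed.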

It's looking like a kernel bug. Can some kernel developer more familiar with kqueue please comment?
Comment 9 dmilith 2021-03-27 12:14:24 UTC
I'm not sure whether it's related, but I use kqueue for my simple log monitoring software, and sometimes I don't get any kevent for modified files that are monitored. I thought it was a problem with my software, which works like a charm with the macOS kqueue implementation.
Comment 10 Konstantin Belousov freebsd_committer 2021-03-27 12:56:36 UTC
If there is an open file descriptor referencing any file on the volume,
non-forced unmount fails with EBUSY.  This is by design.  It has nothing to
do with kqueue.
Comment 11 John-Mark Gurney freebsd_committer 2021-03-30 18:11:36 UTC
What happens when -f is provided to the zfs destroy command to force unmount?

From zfs man page:
         -f      Force an unmount of any file systems using the "zfs unmount
                 -f" command. This option has no effect on non-file systems or
                 unmounted file systems.

I ask because a normal umount will not unmount when there are open files. This happens with UFS file systems as well.
Comment 12 Emanuel Haupt freebsd_committer 2021-03-31 20:07:17 UTC
(In reply to John-Mark Gurney from comment #11)
If -f is provided, the volume is destroyed without error.
Comment 13 rozhuk.im 2021-04-03 02:57:57 UTC
devel/glib20 also has an alternate FAM backend option, but it will not help.

This happens because open() on FreeBSD can be used only for reading, writing, or both (O_RDONLY, O_WRONLY, O_RDWR).
On macOS, by contrast, it can be used with O_EVTONLY, which does not block the file, directory or mount point.

Also, FreeBSD does not have O_NOATIME for open().

Only umount -f is a workaround for this now.
Comment 14 Graham Perrin 2021-04-03 03:17:54 UTC
(In reply to Graham Perrin from comment #1)

> … I habitually use htop to kill the process. …

(In reply to rozhuk.im from comment #13)

> … Only umount -f is a workaround for this now.

Please, why not kill gvfsd-trash?

----

I dislike applying force when it's not known which files are open; and I don't get a list of open files before attempting to unmount.
Comment 15 Damjan Jovanovic 2021-04-03 05:42:59 UTC
(In reply to Graham Perrin from comment #14)

> I don't get a list of open files before attempting to unmount.

You will soon get that automatically:
(1) sysutils/lsof patched with https://github.com/lsof-org/lsof/pull/151
(2) sysutils/bsdisks 0.25
(3) devel/gvfs (git 981787fd860346d2e43104d45dd650a84503d6a6 if possible)
(4) a GIO-based file manager, e.g. Nemo/Thunar/PCManFM

and when you try to unmount from the file manager and it fails with EBUSY, gvfs will run "lsof -t /mountpoint" and show a nice dialog with the processes that still have open files within the mountpoint (and their icons and command line arguments), with an "Unmount anyway" button to force it. This has existed for years; it just never worked on FreeBSD until now due to missing features in bsdisks and lsof.

While I am at it, can someone please make the gvfs port depend on lsof, bug 254322?


O_EVTONLY sounds interesting. Did Apple open source that code?
Comment 16 rozhuk.im 2021-04-03 06:12:53 UTC
(In reply to Graham Perrin from comment #14)

> Please, why not kill gvfsd-trash?

Because you may have many apps that do some fs monitoring.



(In reply to  Damjan Jovanovic from comment #15)

> O_EVTONLY sounds interesting. Did Apple open source that code?

I did not dig into this.
Even if the source code is available, I suppose it is much different from what we have in our base.

I tried to start a discussion about this on the freebsd-hackers mailing list 4 years ago ("open(): O_EVTONLY and O_NOATIME"), but no one cared.

My next attempt was to add kernel unmount notifications, so that user space apps could catch them and close all descriptors, in my FAM implementation for glib20:
https://reviews.freebsd.org/D19690
but I did not finish it, and I do not like this design: it cannot close all descriptors on the first unmount attempt without some sort of sleep in the kernel to wait for all processes to receive and handle the unmount notification.
Comment 17 Graham Perrin 2021-04-03 06:38:08 UTC
(In reply to Damjan Jovanovic from comment #15)

Thank you, also: 

root@mowa219-gjp4-8570p:~ # lsof /Volumes/t500/
lsof: WARNING: device cache mismatch: /dev/drm/230
lsof: WARNING: no ZFS support has been defined.
      See 00FAQ for more information.
lsof: WARNING: /root/.lsof_mowa219-gjp4-8570p was updated.
root@mowa219-gjp4-8570p:~ #
Comment 18 Vladimir Kondratyev freebsd_committer 2021-04-09 12:21:50 UTC
(In reply to Damjan Jovanovic from comment #15)
> O_EVTONLY sounds interesting. Did Apple open source that code?

The XNU kernel is open source and available, e.g., on GitHub: https://github.com/apple/darwin-xnu/

But... a better alternative called O_PATH is coming: https://reviews.freebsd.org/D29323. Unlike O_EVTONLY, which only allows unmount, it blocks file system access through the descriptor entirely (with some exceptions like fstat()), which makes it suitable for use over FUSE, NFS and other slow file systems.
Comment 19 Vladimir Kondratyev freebsd_committer 2021-04-16 23:56:44 UTC
Created attachment 224174 [details]
glib20-libinotify-O_PATH.patch

Finally, O_PATH support has hit the tree in a series of a dozen commits starting from c78e124.

It is possible to test it with the attached patch, which includes a development version of libinotify with O_PATH support and a patch that enables libinotify support in glib20.

Just do the following steps:

1. Install 14-CURRENT as of 16 Apr or newer.
2. Apply patch to the ports tree.
3. Rebuild devel/libinotify
4. Rebuild devel/glib20 with LIBINOTIFY file monitoring backend activated in port options.
5. PROFIT!!!

It may make sense to poke the maintainers of glib's kqueue file monitoring backend to add support for the O_PATH open(2) flag.

P.S. MFC of O_PATH support to 13-STABLE is planned in one week.
Comment 20 Vladimir Kondratyev freebsd_committer 2021-04-17 00:06:00 UTC
(In reply to Vladimir Kondratyev from comment #19)
> Finally, O_PATH support has hit the tree in a series
> of a dozen commits starting from c78e124.
Big thanks to @kib for O_PATH implementation
Comment 21 Emanuel Haupt freebsd_committer 2021-04-17 08:01:29 UTC
(In reply to Vladimir Kondratyev from comment #20)
> Big thanks to @kib for O_PATH implementation

A big +1
Comment 22 rozhuk.im 2021-04-17 12:43:43 UTC
I will look into O_PATH after the MFC to 13 and try to integrate it with my alternate FAM in glib20.
Comment 23 rozhuk.im 2021-04-17 14:55:11 UTC
(In reply to Vladimir Kondratyev from comment #18)

> But... a better alternative called O_PATH

It is not clear to me: can I use fd = open(O_PATH) in kqueue to monitor fs change events?


> Rebuild devel/glib20 with LIBINOTIFY file monitoring backend activated in port options.

You should promote this as an option for the glib20 port, probably on by default.
Comment 24 Vladimir Kondratyev freebsd_committer 2021-04-17 19:10:09 UTC
(In reply to rozhuk.im from comment #23)
> It is not clear to me: can I use fd = open(O_PATH)
> in kqueue to monitor fs change events?
Yes, certainly. You should also append the typical flag suffix O_NOFOLLOW|O_NONBLOCK|O_CLOEXEC to handle symlinks, FIFOs and execve().

> You should promote this as an option for the glib20 port, probably on by default.
I'll do it after pushing the O_PATH changes into the libinotify master branch.
Unfortunately, it will take some time, as the HDD that held the VMs required for regression testing has died, so I have to recreate these VMs.