Bug 258056 - lang/go118: musn't poll /dev/fuse
Summary: lang/go118: musn't poll /dev/fuse
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: Any Any
: --- Affects Some People
Assignee: Dmitri Goutnik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-08-26 10:15 UTC by Keith White
Modified: 2023-07-04 13:18 UTC (History)
14 users (show)

See Also:


Attachments
net/rclone patch to add CMOUNT option (633 bytes, patch)
2022-07-26 11:28 UTC, Keith White
no flags Details | Diff
dtrace wrapper to patch kevent "on the fly" (1.85 KB, text/plain)
2022-07-26 11:43 UTC, Keith White
no flags Details
lang/go118 special case /dev/fuse on FreeBSD (407 bytes, patch)
2022-07-27 13:09 UTC, Keith White
no flags Details | Diff
bazil/fuse workaround (577 bytes, patch)
2022-08-09 08:05 UTC, Yuval Pavel Zholkover
no flags Details | Diff
rclone-1.59.0_3.diff (1.06 KB, patch)
2022-08-09 17:35 UTC, Dmitri Goutnik
tremere: maintainer-approval+
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Keith White 2021-08-26 10:15:29 UTC
After this commit, rclone mount no longer works.

# rclone mount --http-url https://www.freebsd.org/ :http: /mnt --read-only
2021/07/27 07:06:59 Fatal error: failed to umount FUSE fs: resource temporarily unavailable

==================
7b8622fa220b9c08041102f638f848c48e022644 is the first bad commit
commit 7b8622fa220b9c08041102f638f848c48e022644
Author: Alan Somers <asomers@FreeBSD.org>
Date: Tue Jun 15 17:17:28 2021 -0600

fusefs: support EVFILT_WRITE on /dev/fuse

/dev/fuse is always ready for writing, so it's kind of dumb to poll it.
But some applications do it anyway. Better to return ready than EINVAL.

MFC after: 2 weeks
Reviewed by: emaste, pfg
Differential Revision: https://reviews.freebsd.org/D30784

sys/fs/fuse/fuse_device.c | 22 +++++++++++++++++++++-
tests/sys/fs/fusefs/mockfs.cc | 20 +++++++++++++++++---
2 files changed, 38 insertions(+), 4 deletions(-)
=====================

Restoring the previous EINVAL fixes the problem for me.
========
diff --git a/sys/fs/fuse/fuse_device.c b/sys/fs/fuse/fuse_device.c
index f8807d6d1c2..62b71ae1514 100644
--- a/sys/fs/fuse/fuse_device.c
+++ b/sys/fs/fuse/fuse_device.c
@@ -193,7 +193,7 @@ fuse_device_filter(struct cdev *dev, struct knote *kn)
                error = 0;
        } else if (error == 0 && kn->kn_filter == EVFILT_WRITE) {
                kn->kn_fop = &fuse_device_wfiltops;
-               error = 0;
+               error = EINVAL;
        } else if (error == 0) {
                error = EINVAL;
                kn->kn_data = error;
========

# rclone --version
rclone v1.55.1-DEV
- os/type: freebsd
- os/arch: amd64
- go/version: go1.16.6
- go/linking: dynamic
- go/tags: none

...keith
Comment 1 Alan Somers freebsd_committer freebsd_triage 2021-08-26 13:16:57 UTC
That's quite weird.  Have you tried using ktrace to see exactly where the failure is?
Comment 2 Keith White 2021-08-26 22:06:39 UTC
I have tried.  I cannot be of much help here it seems!  It's really not clear to me what ktrace is saying. There's little observable difference in the trace between the working version and the broken version.  What should I be looking for?
Comment 3 Alan Somers freebsd_committer freebsd_triage 2021-08-26 22:11:00 UTC
The "resource temporarily unavailable" message is the string representation of EAGAIN.  So when you run kdump, you should search for EAGAIN.  The last place it occurs is usually the critical thing that caused the program to exit.
Comment 4 Keith White 2021-08-26 22:38:49 UTC
I see, thanks!  I still need help though.  Do you suspect the application?

The "bad" version has: 
  5632 rclone   RET   fork 0
  5632 rclone   CALL  sigfastblock(0x1,0x82ac02038)
  5632 rclone   RET   sigfastblock 0
  5632 rclone   CALL  thr_self(0x7fffdf9faed8)
  5632 rclone   RET   thr_self 0
  5632 rclone   CALL  sigaltstack(0,0x7fffdf9fae98)
  5632 rclone   RET   sigaltstack 0
  5632 rclone   CALL  sigaltstack(0x7fffdf9fae60,0)
  5632 rclone   RET   sigaltstack 0
  5632 rclone   CALL  sigprocmask(SIG_SETMASK,0x7fffdf9faea8,0)
  5632 rclone   RET   sigprocmask 0
  5632 rclone   CALL  read(0x8,0xc000600000,0x21000)
  5632 rclone   RET   read -1 errno 35 Resource temporarily unavailable
  5632 rclone   CALL  close(0x8)
  5632 rclone   RET   close 0

Whereas the "good" version has:
  5640 rclone   RET   fork 0
  5640 rclone   CALL  sigfastblock(0x1,0x82ac02038)
  5640 rclone   RET   sigfastblock 0
  5640 rclone   CALL  thr_self(0x7fffdf9faed8)
  5640 rclone   RET   thr_self 0
  5640 rclone   CALL  sigaltstack(0,0x7fffdf9fae98)
  5640 rclone   RET   sigaltstack 0
  5640 rclone   CALL  sigaltstack(0x7fffdf9fae60,0)
  5640 rclone   RET   sigaltstack 0
  5640 rclone   CALL  sigprocmask(SIG_SETMASK,0x7fffdf9faea8,0)
  5640 rclone   RET   sigprocmask 0
  5640 rclone   CALL  read(0x8,0xc0006c6000,0x21000)
  5640 rclone   RET   nanosleep 0
  5640 rclone   CALL  compat11.kevent(0x5,0,0,0x7fffdfffd698,0x40,0x7fffdfffd670
)
  5640 rclone   STRU  struct kevent_freebsd11[] = {  }

So the read(8, ...) is what is failing.  Finding this in net/rclone will be a challenge...

(fd8 = "/dev/fuse")
Comment 5 Alan Somers freebsd_committer freebsd_triage 2021-08-27 00:54:52 UTC
Yes, I do suspect the application.  My theory is that rclone opens /dev/fuse in nonblocking mode and tries to register it with kevent for EVFILT_WRITE | EVFILT_READ.  Previously this would fail, so it would fallback to a blocking paradigm.  But now that succeeds, so it tries to use the non-blocking path.  But that path is entirely untested because it was previously unreachable.  That's my guess at least.  You should take it up with the rclone developers.  I don't know Go, so I won't be much help.
Comment 6 Keith White 2021-08-28 11:01:34 UTC
Just a go dabbler, me.  I'll add the port maintainer to CC.  Perhaps they have an insight?
"make -C /usr/ports/net/rclone extract" results in 635M of source; so plenty of opportunity for finger pointing!
Comment 7 Keith White 2021-08-28 11:06:41 UTC
Just a go dabbler me. But I agree, most likely a client problem.  I'll add the port maintainer in CC; perhaps they have an insight?

"make -C /usr/ports/net/rclone extract" results in 635M of source.  Plenty of opportunity for finger pointing!
Comment 8 Brad Ackerman 2021-08-31 07:09:13 UTC
I’m moving house this week but will try to get to this over the weekend.
Comment 9 Ralf van der Enden 2021-09-01 07:24:16 UTC
Hi,

I tested the command (rclone mount --http-url https://www.freebsd.org/ :http: /mnt --read-only) against rclone 1.56.0 on 13.0-RELEASE-amd64 and it works fine.

To test if the new version fixes it for you, you can check out bug #258193, which I just submitted.

Cheers,
Ralf
Comment 10 Keith White 2021-09-01 09:58:07 UTC
(In reply to Ralf van der Enden from comment #9)
Thanks for your suggestion! I updating rclone using your patch.  It does not fix the problem with 14.0-CURRENT.

After patching:

# .../net/rclone/work/stage/usr/local/bin/rclone version
rclone v1.56.0-DEV
- os/version: freebsd 14.0-current (64 bit)
- os/kernel: 14.0-current (amd64)
- os/type: freebsd
- os/arch: amd64
- go/version: go1.17
- go/linking: static
- go/tags: none

# .../net/rclone/work/stage/usr/local/bin/rclone mount  --http-url https://www.freebsd.org/ :http: /mnt --read-only
2021/09/01 05:37:33 Fatal error: failed to umount FUSE fs: resource temporarily unavailable

# uname -ap
FreeBSD e6220 14.0-CURRENT FreeBSD 14.0-CURRENT #1 main-n248890-1a4c5061fc5-dirty: Tue Aug 24 13:11:06 EDT 2021     root@e6220:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG  amd64 amd64

=============
It might not be simple.  Here's were I'm at for investigation.

On CURRENT, if you run "go test -v" in the ${WRKDIR}/rclone-1.56.0/vendor/bazil.org/fuse directory, the tests "PASS" but "TestMountOptionReadOnly" takes 60.01s [probably a timeout].  On 13.0-RELEASE (and any release before commit 7b8622fa22) that particular test takes 0.01s.

Go 1.9 added kevent for open files which might exercise different code paths. https://github.com/golang/go/issues/24331

...keith
Comment 11 Nick Craig-Wood 2021-12-01 10:21:32 UTC
There is an rclone issue about this here:

https://github.com/rclone/rclone/issues/5843


However I think the appropriate upstream issue is this one on the github.com/bazil/fuse library rclone uses:


https://github.com/bazil/fuse/issues/280

-- Nick Craig-Wood - rclone maintainer
Comment 12 Keith White 2021-12-01 14:19:40 UTC
Thanks Nick for looking at this!  I've reviewed the suggestion in https://github.com/rclone/rclone/issues/5843 to build rclone with cmount.  I can confirm that using cmount *does* work for me.

I have my suspicions that this might actually be a bug in go as hinted at in https://github.com/bazil/fuse/issues/280

...keith
Comment 13 Lars Wittebrood 2022-04-08 10:18:28 UTC
For what it is worth: I have this same problem on just upgraded systems to 12.3. On 12.2 this does NOT occur.
Rclone version is 1.57
Comment 14 Alan Somers freebsd_committer freebsd_triage 2022-04-08 15:04:32 UTC
(In reply to Lars Wittebrood from comment #13)
That's to be expected, Lars.  fuse on 12.2 supports EVFILT_WRITE, unlike 12.2 or 13.0.
Comment 15 iron.udjin 2022-07-20 22:49:06 UTC
"rclone mount" still broken on 13.1-RELEASE.
Mount bucket use sysutils/fusefs-s3fs works fine.
Comment 16 Keith White 2022-07-26 11:28:27 UTC
Created attachment 235480 [details]
net/rclone patch to add CMOUNT option
Comment 17 Keith White 2022-07-26 11:43:32 UTC
Created attachment 235481 [details]
dtrace wrapper to patch kevent "on the fly"

Changing the kevent EVFILT_WRITE to something bogus "in flight" for "/dev/fuse" allows unmodified kernels and rclones.

For the brave.  Season or file 13 as desired.

...keith
Comment 18 Keith White 2022-07-27 13:09:45 UTC
Created attachment 235506 [details]
lang/go118 special case /dev/fuse on FreeBSD

As suspected by asomers last year, go itself makes assumptions about what it can do with opened files.  The attached patch makes "/dev/fuse" unpollable, and is probably much closer to "the right thing" to do.

This go patch returns "rclone mount" functionality.
Comment 19 Alan Somers freebsd_committer freebsd_triage 2022-07-27 13:18:20 UTC
The whole point of 7b8622fa220b9c08041102f638f848c48e022644 was to make /dev/fuse pollable.  It is pollable now, in other languages.  So there's probably still a bug somewhere in the Go code.  Does Go wrongly assume that the file is ready to read if it polls ready to write?
Comment 20 Keith White 2022-07-27 13:41:35 UTC
Not an issue for you!

I think the issue is how go itself "thinks" kqueue works with a file or device.

The comments in src/os/file_unix.go would seem to indicate that files that are always ready to write are a problem for go's poll code.  It already special cases regular files:

// Don't try to use kqueue with regular files on *BSDs.
// On FreeBSD a regular file is always
// reported as ready for writing.
Comment 21 Keith White 2022-07-27 14:45:32 UTC
(In reply to Keith White from comment #20)
For more background reading.

It looks like the go bug report 19093 (referenced in src/os/file_unix.go) gives more detail as to what go "assumes" and why regular BSD files were removed from polling.
  https://github.com/golang/go/issues/19093

Polling was added in change 36800 which explains, somewhat, what's expected and why go's polling doesn't work on all filetypes on all OSes?
    https://go-review.googlesource.com/c/go/+/36800/
Comment 22 Alan Somers freebsd_committer freebsd_triage 2022-07-27 15:24:13 UTC
So it sounds like they assume that regular files will always poll readable (not true for /dev/fuse) and anything that isn't a socket or a pipe will fail when you try to register it with kqueue/epoll (true for /dev/fuse until commit 7b8622fa220b9c08041102f638f848c48e022644).  And since regular files are special cased on *BSD and fail to register with epoll, the poller code never worked for them.  In that case I'd say that your change is the correct one.

The change looks correct to me, though I haven't tested it.  Please submit it upstream too.
Comment 23 Keith White 2022-07-27 22:29:10 UTC
Added upstream as https://github.com/golang/go/issues/54100

Assuming it gets noticed and incorporated, this FreeBSD bug could be "closed" or changed to "not an issue..."

Thanks asomers for all your hints, support, etc.!
Comment 24 Ralf van der Enden 2022-08-05 09:21:46 UTC
Hi Keith,

Í've added your patch to lang/go119 and can confirm rclone mount works as expected with it. Would it be an option to inlcude it as a patch to the port and not wait for upstream ?

Cheers,
Ralf
Comment 25 Keith White 2022-08-05 12:44:33 UTC
(In reply to Ralf van der Enden from comment #24)

Hi Ralf.  Well, it's certainly a solution I use locally.  So I'd say "yes".

I see that a change to go has been committed (https://go-review.googlesource.com/c/go/+/420334) but I do not know when/if it will make it to a release.

...keith
Comment 26 Yuval Pavel Zholkover 2022-08-09 08:05:10 UTC
Created attachment 235792 [details]
bazil/fuse workaround

Hi,

As I mentioned in the upstream issue https://github.com/golang/go/issues/54100, there are a few issues here.

The Go runtime has a bug in which file descriptors in blocking mode can get added to the network poller. This is what the https://go-review.googlesource.com/c/go/+/420334 fix is about.

However I think there's an additional bug in the /dev/fuse character device:
The call to fcntl(<fd>,F_SETFL,O_RDWR|O_NONBLOCK) is reporting an ENODEV (Operation not supported by device) error, but the device is still put into non-blocking mode. Causing reads to return EAGAIN.
These EAGAIN errors are unexpected by the bazil/fuse package rclone uses.
 
bazil/fuse is doing blocking operations only anyway, so there's a simple workaround which avoids calling fcntl in the first place (attached).
Comment 27 Alan Somers freebsd_committer freebsd_triage 2022-08-09 12:27:01 UTC
Yuval, would you mind opening a separate bug for the fcntl(<fd>,F_SETFL,O_RDWR|O_NONBLOCK) issue?  I'll investigate that one separately.
Comment 28 Yuval Pavel Zholkover 2022-08-09 13:41:41 UTC
(In reply to Alan Somers from comment #27)

Thanks! I've opened bug #265736
Comment 29 Yuval Pavel Zholkover 2022-08-09 13:54:38 UTC
Any thoughts on just fixing rclone by patching the bazil/fuse module it downloads?
I'm not sure how easy it is to do under the ports patch system:
It would require updating the rclone go.mod with a local 'replace' during build
require (
	bazil.org/fuse v0.0.0-20200524192727-fb710f7dfd05
        ...
)

replace (
	bazil.org/fuse v0.0.0-20200524192727-fb710f7dfd05 => ./fork/fuse
)

to point to a local copy of bazil.org/fuse with the applied workaround.

If not, then the "full" exclude fuse devices from the netpoller fix by Dmitri Goutnik should probably be used instead:
https://go-review.googlesource.com/c/go/+/420235/
Comment 30 Dmitri Goutnik freebsd_committer freebsd_triage 2022-08-09 15:51:26 UTC
Just to be clear, is this the workaround we're talking about - 
https://github.com/golang/go/issues/54100#issuecomment-1200265368 ?

Go ports cannot patch go.mod, but they vendor dependencies before the build, and it's possible to patch bazil.org/fuse vendored copy.
Comment 31 Yuval Pavel Zholkover 2022-08-09 16:00:57 UTC
(In reply to Dmitri Goutnik from comment #30)
No, the previous one https://github.com/golang/go/issues/54100#issuecomment-1200165500. Also added here https://bugs.freebsd.org/bugzilla/attachment.cgi?id=235792.

Because bazil/fuse is using Go File.Fd(), so the file descriptor will be switched to blocking anyway.
Comment 32 commit-hook freebsd_committer freebsd_triage 2022-08-09 17:29:45 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=41e4cf7d809db0be86d7aced3107ba9eec12a4bb

commit 41e4cf7d809db0be86d7aced3107ba9eec12a4bb
Author:     Dmitri Goutnik <dmgk@FreeBSD.org>
AuthorDate: 2022-08-09 17:14:18 +0000
Commit:     Dmitri Goutnik <dmgk@FreeBSD.org>
CommitDate: 2022-08-09 17:28:23 +0000

    lang/go119: Backport upstream fix for /dev/fuse polling issue

    This a backport of Go CL 420334.

    PR:             258056
    Obtained from:  https://go-review.googlesource.com/c/go/+/420334/

 lang/go119/Makefile                               |  2 +-
 lang/go119/files/patch-src_os_file__unix.go (new) | 55 +++++++++++++++++++++++
 2 files changed, 56 insertions(+), 1 deletion(-)
Comment 33 Dmitri Goutnik freebsd_committer freebsd_triage 2022-08-09 17:35:40 UTC
Created attachment 235811 [details]
rclone-1.59.0_3.diff

Apply bazil/fuse workaround from https://github.com/golang/go/issues/54100#issuecomment-1200165500 by @paulzhol, bump PORTREVISION.

Note that this fix requires lang/go119 1.19_1 (ports 41e4cf7d809d).
Comment 34 Ralf van der Enden 2022-08-15 07:58:17 UTC
Comment on attachment 235811 [details]
rclone-1.59.0_3.diff


Tested by mounting several of my remotes and works fine.
Comment 35 commit-hook freebsd_committer freebsd_triage 2022-08-15 14:18:35 UTC
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=95c0d83d6082be7af1f380d4f5858c2fd1fdfac2

commit 95c0d83d6082be7af1f380d4f5858c2fd1fdfac2
Author:     Dmitri Goutnik <dmgk@FreeBSD.org>
AuthorDate: 2022-08-15 14:07:58 +0000
Commit:     Dmitri Goutnik <dmgk@FreeBSD.org>
CommitDate: 2022-08-15 14:12:13 +0000

    net/rclone: Fix /dev/fuse polling issue

    Apply bazil/fuse workaround from [1], bump PORTREVISION.

    [1] https://github.com/golang/go/issues/54100#issuecomment-1200165500

    PR:             258056
    Approved by:    Ralf van der Enden <tremere@cainites.net> (maintainer)

 net/rclone/Makefile                                       |  2 +-
 .../patch-vendor_bazil.org_fuse_mount__freebsd.go (new)   | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
Comment 36 commit-hook freebsd_committer freebsd_triage 2022-08-22 13:09:53 UTC
A commit in branch 2022Q3 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=c6701d7260fdbbe1420c554e5a98c92caefbacd3

commit c6701d7260fdbbe1420c554e5a98c92caefbacd3
Author:     Dmitri Goutnik <dmgk@FreeBSD.org>
AuthorDate: 2022-08-22 12:56:39 +0000
Commit:     Dmitri Goutnik <dmgk@FreeBSD.org>
CommitDate: 2022-08-22 12:58:27 +0000

    net/rclone: Fix /dev/fuse polling issue

    Apply bazil/fuse workaround from [1], bump PORTREVISION.

    [1] https://github.com/golang/go/issues/54100#issuecomment-1200165500

    PR:             258056
    Approved by:    Ralf van der Enden <tremere@cainites.net> (maintainer)

 net/rclone/Makefile                                       |  2 +-
 .../patch-vendor_bazil.org_fuse_mount__freebsd.go (new)   | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
Comment 37 commit-hook freebsd_committer freebsd_triage 2022-08-22 13:09:56 UTC
A commit in branch 2022Q3 references this bug:

URL: https://cgit.FreeBSD.org/ports/commit/?id=e27b4194097ae00c09ee7f1ad43316d8e844f537

commit e27b4194097ae00c09ee7f1ad43316d8e844f537
Author:     Dmitri Goutnik <dmgk@FreeBSD.org>
AuthorDate: 2022-08-22 12:55:48 +0000
Commit:     Dmitri Goutnik <dmgk@FreeBSD.org>
CommitDate: 2022-08-22 12:58:27 +0000

    lang/go118: Backport upstream fix for /dev/fuse polling issue

    This a backport of Go CL 420334.

    PR:             258056
    Obtained from:  https://go-review.googlesource.com/c/go/+/420334/

 lang/go118/Makefile                               |  2 +-
 lang/go118/files/patch-src_os_file__unix.go (new) | 55 +++++++++++++++++++++++
 2 files changed, 56 insertions(+), 1 deletion(-)
Comment 38 Graham Perrin freebsd_committer freebsd_triage 2022-12-29 11:12:36 UTC
Thank you … please, what next? 

Partial triage: 

* status

* assignment to the committer of 41e4cf7d809db0be86d7aced3107ba9eec12a4bb, 
  95c0d83d6082be7af1f380d4f5858c2fd1fdfac2, 
  c6701d7260fdbbe1420c554e5a98c92caefbacd3, and 
  e27b4194097ae00c09ee7f1ad43316d8e844f537

* CC kern@ for component kern
Comment 39 Nuno Teixeira freebsd_committer freebsd_triage 2023-07-04 13:18:57 UTC
Hello,

Having same problem with sysutils/restic:

---
# restic -r Work/restic-repo-test mount /mnt/restic/
enter password for repository:
repository 60a8f695 opened (version 2, compression level auto)
Now serving the repository at /mnt/restic/
Use another terminal or tool to browse the contents of this folder.
When finished, quit with Ctrl-c here or umount the mountpoint.
resource temporarily unavailable
---

Any hints?