Scenario:
- FreeBSD 12.2-RELEASE-p2 #4 r368500M
- autofs on /net using a self-developed /etc/autofs/special_hosts3 (similar to the regular special_hosts)

Result:
- On every auto mount, an entry is added to /var/db/mounttab
- On auto unmount, no entry is ever deleted from /var/db/mounttab
- When forcibly timing out all mounts so that none remain, /var/db/mounttab is still full of (even duplicated) entries:

[1]# df -t nfs ; automount -u ; echo after ; df -t nfs ; cat /var/db/mounttab
Filesystem 1K-blocks Used Avail Capacity Mounted on
gandalf:/z/ss 65015520 629996 64385524 1% /net/gandalf/z/ss
after
1608103434 hal /z/SRC/FreeBSD/releng/12.2
1608117698 hal /z/release/FreeBSD-ports/amd64/packages-12
1608117947 hal /z/VOL/FreeBSD-ports
1608145583 hal /z/SRC/FreeBSD-ports/head
1608145583 hal /z/VOL/FreeBSD-ports
1608145584 hal /z/SRC/FreeBSD/releng/12.2
1608145643 hal /z/VOL/FreeBSD-ports
1608145685 hal /z/VOL/ftp
1608145715 hal /z/VOL/FreeBSD-ports
1608145720 hal /z/SRC/FreeBSD/releng/12.2
1608148304 hal /z/release/FreeBSD-ports/amd64/packages-12
1608148333 hal /z/VOL/FreeBSD-ports
1608148364 gandalf /z/SRC/src.local
1608148531 gandalf /z/SRC/src.local
1608148859 gandalf /z/SRC/src.local
1608151096 gandalf /z/SRC/src.local
1608151236 gandalf /z/ss
[0]# \rm /var/db/mounttab
[0]#

Expected result:
- The autounmountd should remove one matching line from /var/db/mounttab whenever it times out a mount

Note:
- See also bug #251395.

-- Martin
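The expected behaviour (drop exactly one matching line per unmount, so duplicates from repeated mounts disappear one at a time) can be sketched as follows. This is only an illustration: `struct mtab_entry` and `mtab_remove_one()` are hypothetical names, and the entries are assumed to have already been parsed into memory from the three columns shown in the output above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one /var/db/mounttab line. */
struct mtab_entry {
	long		 time;	/* first column: mount timestamp */
	const char	*host;	/* second column: NFS server name */
	const char	*path;	/* third column: path on the server */
};

/*
 * Remove the first entry matching host and path, shifting the later
 * entries down by one. Returns true if an entry was removed; *count
 * is updated to the new number of entries.
 */
bool
mtab_remove_one(struct mtab_entry *tab, size_t *count,
    const char *host, const char *path)
{
	for (size_t i = 0; i < *count; i++) {
		if (strcmp(tab[i].host, host) == 0 &&
		    strcmp(tab[i].path, path) == 0) {
			memmove(&tab[i], &tab[i + 1],
			    (*count - i - 1) * sizeof(tab[0]));
			(*count)--;
			return (true);
		}
	}
	return (false);
}
```

With the duplicated gandalf:/z/SRC/src.local lines above, each call would remove only the oldest matching entry and leave the rest in place.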
It seems there is also a similar issue with /var/db/mountdtab: On the NFS server, entries only accumulate in that file but are never cleared. -- Martin
(In reply to Martin Birgmeier from comment #1) Hey Martin, This patch should also fix that. With this patch, `automount -u` will try to use umount(8) when unmounting filesystems. umount(8) does some additional work, such as notifying the mountd server that an NFS mount has been unmounted, which removes the /var/db/mountdtab entry on the mountd server. If the notification to the mountd server is successful, the /var/db/mounttab entry will also be removed. -Rob
Hi Robert, Thank you for your quick reaction. Will this be merged to releng/12.2 eventually, or only to stable/12? And do you know when? Or should I merge myself (to releng/12.2) and report the results? Best regards, Martin
(In reply to Martin Birgmeier from comment #3) Only to stable/12. Maybe 2-3 weeks into CURRENT and then merged into stable/12 a week after that.
I have merged the patch in D27801 to releng/12.2 (together with D27832 for bug #224601). This now results in the following:

[0]# df -t nfs ; automount -u ; echo after ; df -t nfs ; wc /var/db/mounttab && sort -u +1 /var/db/mounttab
Filesystem 1K-blocks Used Avail Capacity Mounted on
hal:/z/SRC/FreeBSD/base/releng/12.2 807739509 2810359 804929150 0% /net/hal/z/SRC/FreeBSD/base/releng/12.2
umount: 04ff003a3a: statfs: No such file or directory
umount: 04ff003a3a: unknown file system
automount: "umount \M-p\M-]\M^?\M^?\M^?\^?04ff003a3a", pid 1101, terminated with exit status 1
after
1 3 47 /var/db/mounttab
1609357704 hal /z/SRC/FreeBSD/base/releng/12.2
[0]#

-- Martin
Thanks for testing this. Looks like an issue with how I'm building up the fsid; I'll try to reproduce it and see what the hang-up is. Would it be possible to get the output of `mount -v` while the automount filesystem is mounted? I'm mostly interested in the fsid field.
Hmmm... I already reverted the change (it's all on zfs and I rolled back). Here is the output on the system running without the two changes mentioned in comment #5:

[0]# mount -v
/dev/ufs/disk908a on / (ufs, NFS exported, local, soft-updates, writes: sync 20 async 80, reads: sync 747 async 5, fsid 2cfcd657fffe02fc)
devfs on /dev (devfs, fsid 00ff007171000000)
/dev/ufs/disk908d on /usr (ufs, NFS exported, local, soft-updates, writes: sync 2 async 162, reads: sync 1141 async 25, fsid 3afcd65706f40533)
/dev/md0 on /tmp (ufs, local, soft-updates, writes: sync 2 async 14, reads: sync 9 async 0, fsid 68e1ec5f11845b81)
procfs on /proc (procfs, local, fsid 01ff000202000000)
fdescfs on /dev/fd (fdescfs, fsid 02ff005959000000)
map -hosts3 on /net (autofs, fsid 03ff00cfcf000000)
hal:/z/SRC/FreeBSD/base/releng/12.2 on /net/hal/z/SRC/FreeBSD/base/releng/12.2 (nfs, nosuid, automounted, fsid 04ff003a3a000000)
[0]#

-- Martin
Created attachment 221111: attempt 2: use f_mntfromname for umount

This uses f_mntfromname instead, let me know if you get around to testing it.
I applied the patch to releng/12.2 and tested it. It seems to be working. Both /var/db/mounttab on the client and /var/db/mountdtab on the server are updated as expected. Thank you for your efforts. -- Martin
One more thing - how are parallel mounts and unmounts handled with respect to updating /var/db/mounttab? -- Martin
I installed the patch on all my machines now. Something is still amiss because very often, even though everything has been unmounted, /var/db/mounttab and the corresponding entries on the server in /var/db/mountdtab are not fully cleared. Such a situation can be worked around by automounting the still-existing entries again and then auto-unmounting them again. In most cases, this will clear /var/db/mounttab (and the corresponding entries on the server in /var/db/mountdtab). Maybe there is another path whereby the unmount of an auto-mounted directory can take place? -- Martin
It seems that a similar patch needs to be done in usr.sbin/autofs/autounmountd.c in order to catch the case where filesystems are unmounted after not being used for some time. -- Martin
It seems that there are further problems with mounttab handling... I am using chroot to change into different environments, and in each environment the same automount structure is being used:

#
# $FreeBSD: releng/12.2/usr.sbin/autofs/auto_master 337749 2018-08-14 13:52:08Z trasz $
#
# Automounter master map, see auto_master(5) for details.
#
/net -hosts3 -nobrowse,nosuid,intr
# When using the -media special map, make sure to edit devd.conf(5)
# to move the call to "automount -c" out of the comments section.
#/media -media -nosuid,noatime,autoro
#/- -noauto
/z/netboot/920/net -hosts3 -nobrowse,nosuid,intr
/z/netboot/921/net -hosts3 -nobrowse,nosuid,intr

(special_hosts3 is an improved version of special_hosts.)

This means that one server directory can well be mounted into multiple locations on the client. But in both mounttab and mountdtab, only the host and the server path are recorded (not the client mount point), so that when unmounting on the client it is not clear how many mounts of the same server directory are still active.

It seems that the handling of /var/db/mounttab and /var/db/mountdtab needs to be thoroughly reworked, including the client path (or in the case of mountdtab maybe just a count), and made race-free for the case where multiple programs try to modify these files.

-- Martin
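The counting problem can be illustrated with a small sketch. It assumes a hypothetical table that records the client mount point alongside the server host and path (which the current files do not do); `struct nfs_mount` and `nfs_mount_count()` are made-up names for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one active NFS mount on the client. */
struct nfs_mount {
	const char	*host;		/* NFS server */
	const char	*expath;	/* path exported by the server */
	const char	*mntpoint;	/* where the client mounted it */
};

/* Count active mounts of host:expath, regardless of mount point. */
size_t
nfs_mount_count(const struct nfs_mount *m, size_t n,
    const char *host, const char *expath)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(m[i].host, host) == 0 &&
		    strcmp(m[i].expath, expath) == 0)
			count++;
	}
	return (count);
}
```

With such a count, the server would only need to be told about an unmount once the result for that host and path drops to zero.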
Next issue: It seems that if the automount is loopback (instead of NFS, achieved by using "-fstype=nullfs" in the automount map), unmounting always first fails with the following lines but then succeeds anyway:

umount: unmount of /z/NCVS/cvs.local failed: Device busy
automount: "umount /z/NCVS/cvs.local", pid 65867, terminated with exit status 1

-- Martin
Created attachment 221372: create a common unmount routine for automount

This patch goes back to using FSIDs. The problem with my original patch using FSIDs was that I was calling `strlen()` on an uninitialized character array, which produced undefined results. I was able to reproduce the problem. This patch also creates a common unmount routine that will try to unmount using umount(8) and, if that fails, will fall back to using unmount(2). Comment #14 was likely from an error when using `f_mntfromname` and should be fixed with this patch by using the FSID instead. Comment #13 is a separate problem from this one. I was able to reproduce that issue as well though - I'll look into it. Thanks for the reporting and testing Martin, much appreciated! -Rob
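For illustration only (this is not the code from the attachment), here is one way to build the hex string that `mount -v` shows in its fsid field without depending on whatever the buffer held before. The function name `fsid_to_hex()` and the raw-byte interface are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format raw file system ID bytes as lowercase hex, the form shown in
 * the "fsid" field of mount -v. Returns 0 on success, or -1 if buf is
 * too small. buf is NUL-terminated whenever bufsize > 0.
 */
int
fsid_to_hex(const unsigned char *fsid, size_t len, char *buf, size_t bufsize)
{
	if (bufsize == 0)
		return (-1);
	buf[0] = '\0';		/* never rely on prior buffer contents */
	if (bufsize < 2 * len + 1)
		return (-1);
	/* Write at a computed offset instead of appending at strlen(buf). */
	for (size_t i = 0; i < len; i++)
		snprintf(buf + 2 * i, 3, "%02x", (unsigned int)fsid[i]);
	return (0);
}
```

Because every write goes to a computed offset and the buffer is terminated up front, the result does not depend on uninitialized stack contents.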
I have installed the new patch on my machines. First tests seem to indicate it is not working correctly. Specifically, /var/db/mounttab is not cleared, and neither is /var/db/mountdtab on the NFS server. -- Martin
... it seems the issue is with what I wrote in comment 12 - the patch also needs to be done in autounmountd.c. -- Martin
Sorry, I see it is, but it is not working for timed-out unmounts, only when using automount -u. Something is still broken for autounmountd. -- Martin
I have recompiled again, this time cleaning out the autofs obj directory entirely (I am recompiling using "make -DNO_CLEAN buildworld"). It seems to be working now... maybe there is a dependency issue in the Makefile for autofs? -- Martin
Another problem remaining is that for busy NFS mounts, /var/log/messages is now spammed with error output from umount(8). -- Martin
Maybe instead of spawning umount(8) it would be better to use the routines from usr.sbin/rpc.umntall/mounttab.c together with unmount(2), similar to what umount(8) is doing. -- Martin
(In reply to Martin Birgmeier from comment #21) Darn, I'm really coming up short on this one so far. You may be right, trying to bring umount(8) into the scene is proving to be fraught with errors. I'll look into putting the mounttab and mountdtab handling directly in automount. Out of curiosity, what were the spam errors in /var/log/messages? If they're gone, don't worry about it - just curious. -Rob
Like this for example:

Jan 8 16:59:02 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 68544, terminated with exit status 1
Jan 8 16:59:02 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 16:59:02 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 68553, terminated with exit status 1
Jan 8 16:59:02 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 16:59:02 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 68556, terminated with exit status 1
Jan 8 16:59:02 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 16:59:32 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 77196, terminated with exit status 1
Jan 8 16:59:32 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 16:59:32 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 77246, terminated with exit status 1
Jan 8 16:59:32 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 16:59:34 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 77654, terminated with exit status 1
Jan 8 16:59:34 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:04 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 86766, terminated with exit status 1
Jan 8 17:00:04 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:04 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 86777, terminated with exit status 1
Jan 8 17:00:04 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:04 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 86783, terminated with exit status 1
Jan 8 17:00:04 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:34 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 1512, terminated with exit status 1
Jan 8 17:00:34 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:34 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 1521, terminated with exit status 1
Jan 8 17:00:34 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:47 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 5179, terminated with exit status 1
Jan 8 17:00:47 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy
Jan 8 17:00:59 mizar autounmountd[46891]: "umount 17ff003a3a000000", pid 10264, terminated with exit status 1
Jan 8 17:00:59 mizar autounmountd[46891]: cannot unmount /net/hal/z/SRC/FreeBSD/ports/head (FSID:973143831:58): Device busy

The same can be seen on the console when manually using "automount -u".

-- Martin
Maybe the stuff from usr.sbin/rpc.umntall/mounttab.[ch] should be put in a library which is then being used by mount(8), umount(8), autofs(8), rpc.umntall, ... -- Martin
It looks like unmount(2) is failing there too, because the error is only logged after both umount(8) and unmount(2) fail.
Yes, but this is expected because these directories/mount points are in use on the client and so cannot be unmounted. This is a normal use case for an auto(un)mounter. -- Martin
(In reply to Martin Birgmeier from comment #26) Those are expected results, got it. At first, I thought those were unexpected. Even unpatched `automount -u` will report 'Device Busy' errors. An unpatched `autounmountd` doesn't log 'Device Busy' errors unless `autounmountd` was called with '-d' or '-v', in which case you'll see the 'Device Busy' errors. My patch did change the behavior of `autounmountd` to log all errors (including 'Device Busy' errors); it appears that was a mistake. Other than the logging issues, it sounds like the patch works as expected? If the patch is working, I'll be more inclined to get the log messages dialed in. The error messages being spammed in /var/log/messages are all logged from the automount code. I agree, it would be handy to have a library to share some of the code between these programs.
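The logging behaviour described for the unpatched `autounmountd` could be expressed roughly like this. It is a sketch of the policy only, not the daemon's actual code; `should_log_unmount_error()` is a hypothetical helper, and `verbose` stands for having been started with '-d' or '-v'.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether a failed unmount should be logged. EBUSY is routine
 * for an automounter (the mount is still in use), so it is reported
 * only when debugging or verbose output was requested.
 */
bool
should_log_unmount_error(int error, bool verbose)
{
	if (error == EBUSY)
		return (verbose);
	return (true);
}
```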
Indeed, the patch is working. Thank you for your efforts! -- Martin
(In reply to Martin Birgmeier from comment #28) Hey Martin, I spoke too soon about this patch making it into base. The consensus is that my approach to fixing the problem with umount(8) is flawed. Maybe someone else might come up with a version of this patch that will be acceptable. Your testing and detailed bug reports are highly appreciated, thanks for your help. Sorry for not being able to follow through on this. -Rob
Thank you. I guess what I've written in comment #24 would need to be done. Is there any concrete time plan when an improved solution could be available? -- Martin
That I do not know. I’ve cc’ed Edward (@trasz), the author/maintainer of automount, on this; he might be able to shed some light on that.
Created attachment 222930: use rpc.umntall(8) after successful unmount(2)

Hey Martin, Here's another patch. If you get around to trying it out, let me know the results. Thanks, Rob
Hello Robert, Please excuse me - I currently do not have the time to test this (but I am still happily running your previous patches :-)). I have briefly looked at the patch and assume it will work. It seems like quite a sledgehammer method, first because it seems to indiscriminately unmount everything, and second because it still uses an exec (popen). I believe it would be better to build a small library for dealing with the interaction of NFS mounts and maintaining the mounttab file (and probably also the mountdtab file on the server), and to call that from the automounter, rpc.umntall, etc. That library should also take care of possibly simultaneous accesses to these files and properly lock/unlock them so that they cannot get corrupted while they are being updated or in use. Finally, this library should also take care of counting any possible mounts - one and the same NFS client might (and in my case, does) mount the same export twice in different places, and this must be properly handled in both mounttab and mountdtab. Best regards, Martin
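As a rough sketch of the locking part of such a library, the helper below runs an update callback while holding an exclusive advisory lock taken with flock(2). The names `with_locked_file()` and `append_entry()` are hypothetical, the appended line is only sample data modelled on the listing earlier in this report, and this only protects against other programs that take the same lock.

```c
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Run update(fd) while holding an exclusive advisory lock on path.
 * Returns -1 if the file cannot be opened or locked, otherwise the
 * callback's return value.
 */
int
with_locked_file(const char *path, int (*update)(int fd))
{
	int fd, ret;

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return (-1);
	if (flock(fd, LOCK_EX) == -1) {
		close(fd);
		return (-1);
	}
	ret = update(fd);
	close(fd);		/* closing the descriptor drops the lock */
	return (ret);
}

/* Example callback: append one sample line at the end of the file. */
int
append_entry(int fd)
{
	static const char line[] = "1608151236 gandalf /z/ss\n";
	ssize_t len = (ssize_t)(sizeof(line) - 1);

	if (lseek(fd, 0, SEEK_END) == -1)
		return (-1);
	return (write(fd, line, (size_t)len) == len ? 0 : -1);
}
```

A caller would pass the table's path and a callback that rewrites it, so that every reader and writer goes through the same lock.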
(In reply to Martin Birgmeier from comment #33) No worries. Keep in mind that rpc.umntall(8) doesn't unmount anything - it only notifies the NFS server of an unmounted NFS file system. Couple other points:
- popen(rpc.umntall) is only called after a successful unmount(2)
- rpc.umntall -k only notifies the NFS server of an unmounted NFS file system when the NFS entry is found in the mounttab and is no longer mounted.
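The ordering described in the first point can be sketched as below. This is not the patch itself: the callbacks are stand-ins so the control flow can be shown without touching real mounts, with `do_unmount` playing the role of unmount(2) and `notify` the role of running `rpc.umntall -k`. All names here are hypothetical.

```c
#include <stddef.h>

typedef int (*unmount_fn)(const char *mntpoint);	/* 0 on success */
typedef void (*notify_fn)(void);

/*
 * Unmount mntpoint and run the cleanup hook only if that succeeded.
 * Returns the result of the unmount attempt.
 */
int
unmount_and_notify(const char *mntpoint, unmount_fn do_unmount,
    notify_fn notify)
{
	int error;

	error = do_unmount(mntpoint);
	if (error == 0 && notify != NULL)
		notify();
	return (error);
}

/* Stubs standing in for unmount(2) and "rpc.umntall -k". */
int notify_calls;

int
stub_unmount_ok(const char *mntpoint)
{
	(void)mntpoint;
	return (0);
}

int
stub_unmount_busy(const char *mntpoint)
{
	(void)mntpoint;
	return (-1);
}

void
stub_notify(void)
{
	notify_calls++;
}
```

A busy mount therefore never triggers the cleanup step; only a successful unmount does.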
A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=88e531f38c2412bf030f4e8dd563efc45b70797e

commit 88e531f38c2412bf030f4e8dd563efc45b70797e
Author: Robert Wing <rew@FreeBSD.org>
AuthorDate: 2021-02-17 07:51:38 +0000
Commit: Robert Wing <rew@FreeBSD.org>
CommitDate: 2021-03-12 15:41:55 +0000

    autofs: best effort to maintain mounttab and mountdtab

    When an automounted filesystem is successfully unmounted, call rpc.umntall(8) with the -k flag.

    rpc.umntall(8) is used to clean up /var/db/mounttab on the client and /var/db/mountdtab on the server. This is only useful for NFSv3.

    PR: 251906
    Reviewed by: trasz
    Differential Revision: https://reviews.freebsd.org/D27801

 usr.sbin/autofs/automount.c | 2 ++
 usr.sbin/autofs/autounmountd.c | 3 ++-
 usr.sbin/autofs/common.c | 13 +++++++++++++
 usr.sbin/autofs/common.h | 1 +
 4 files changed, 18 insertions(+), 1 deletion(-)
A commit in branch stable/13 references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=6fa8a157705debef78e86de378f8a929207d62dc

commit 6fa8a157705debef78e86de378f8a929207d62dc
Author: Robert Wing <rew@FreeBSD.org>
AuthorDate: 2021-02-17 07:51:38 +0000
Commit: Robert Wing <rew@FreeBSD.org>
CommitDate: 2021-05-11 23:48:44 +0000

    autofs: best effort to maintain mounttab and mountdtab

    When an automounted filesystem is successfully unmounted, call rpc.umntall(8) with the -k flag.

    rpc.umntall(8) is used to clean up /var/db/mounttab on the client and /var/db/mountdtab on the server. This is only useful for NFSv3.

    PR: 251906
    Reviewed by: trasz
    Differential Revision: https://reviews.freebsd.org/D27801

    (cherry picked from commit 88e531f38c2412bf030f4e8dd563efc45b70797e)

 usr.sbin/autofs/automount.c | 2 ++
 usr.sbin/autofs/autounmountd.c | 3 ++-
 usr.sbin/autofs/common.c | 13 +++++++++++++
 usr.sbin/autofs/common.h | 1 +
 4 files changed, 18 insertions(+), 1 deletion(-)