When removing subdirectories, __getcwd is sometimes falsely returning ENOENT, and continues to do so until pwd -L is invoked (it is the stat($PWD) that fixes the problem, presumably this is updating some cached information somewhere?) in a specific configuration (with a few edge-cases). This only occurs with tcsh for me, not sh, because the sh builtin compensates for the problem.

^_^ (mallettj@alala:~)267% cd pwd
^_^ (mallettj@alala:~/pwd)268% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo
pwd: .: Permission denied
^_^ (mallettj@alala:~/pwd)269% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -L ; cd .. ; rm -rf foo
/home/mallettj/pwd/foo
^_^ (mallettj@alala:~/pwd)270% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; ~/pwd/pwd -P ; cd .. ; rm -rf foo
/data/home/mallettj/pwd/foo
^_^ (mallettj@alala:~/pwd)271% ls -lartd .
drwxr-xr-x 3 mallettj mallettj 512 Jan 29 15:38 .
^_^ (mallettj@alala:~/pwd)272% pwd -L
/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)273% pwd -P
/data/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)274% ls -lartd /data
drwxr-xr-x 8 root wheel 512 Jul 5 2007 /data
^_^ (mallettj@alala:~/pwd)275% ls -lartd /data/home
drwx--x--x 9 root wheel 512 Jan 23 19:00 /data/home
^_^ (mallettj@alala:~/pwd)276% ls -lartd /data/home/mallettj
drwxr-xr-x 13 mallettj mallettj 1024 Jan 29 15:34 /data/home/mallettj
^_^ (mallettj@alala:~/pwd)277% ls -lartd /data/home/mallettj/pwd
drwxr-xr-x 3 mallettj mallettj 512 Jan 29 15:38 /data/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)278% ls -lartd /home
lrwxr-xr-x 1 root wheel 10 Jun 21 2007 /home -> /data/home
^_^ (mallettj@alala:~/pwd)279% cvs diff -u pwd.c
socket: Protocol not supported
Index: pwd.c
===================================================================
RCS file: /home/ncvs/src/bin/pwd/pwd.c,v
retrieving revision 1.25
diff -u -r1.25 pwd.c
--- pwd.c	9 Feb 2005 17:37:38 -0000	1.25
+++ pwd.c	29 Jan 2008 21:39:47 -0000
@@ -84,6 +84,8 @@
 	 * If we're trying to find the logical current directory and that
 	 * fails, behave as if -P was specified.
 	 */
+	struct stat phy;
+	stat(getenv("PWD"), &phy);
 	if ((!physical && (p = getcwd_logical()) != NULL) ||
 	    (p = getcwd(NULL, 0)) != NULL)
 		printf("%s\n", p);

Fix:

Unlikely to have time to dig in to the kernel to figure out the root cause. Want to stress that the patch in the full description (obviously) is not a solution, should someone stumble over this bug and think it is. This affects all callers of getcwd, including my shell:

^_^ (mallettj@alala:~/pwd)286% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
tcsh: Permission denied
tcsh: Trying to start from "/home/mallettj"
pwd: .: Permission denied
^_^ (mallettj@alala:~/pwd)287% pwd -L
/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)288% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
pwd: .: Permission denied

How-To-Repeat:

tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
Responsible Changed
From-To: freebsd-bugs->attilio

See if I can trick attilio into tracking this down further, since he's touched vfs_cache some recently.
Hi, I couldn't reproduce this on 7.1-STABLE or 8.0-CURRENT (r188436). Do you still see this behavior? -- Jaakko
On Tue, Feb 10, 2009 at 8:55 AM, Jaakko Heinonen <jh@saunalahti.fi> wrote:
> I couldn't reproduce this on 7.1-STABLE or 8.0-CURRENT (r188436). Do you
> still see this behavior?

Yep. Rebuilt userland and kernel today from SVN.

^_^ (mallettj@alala:~/tst2)8% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
pwd: .: Permission denied
^_^ (mallettj@alala:~/tst2)9% uname -a
FreeBSD alala.evergreen.edu 8.0-CURRENT FreeBSD 8.0-CURRENT #3 r188431: Tue Feb 10 11:50:37 PST 2009 root@alala.evergreen.edu:/usr/obj/usr/src/sys/ALALA amd64

I believe it is specific to something in how the parent directories are laid out / their permissions. Some relevant details:

^_^ (mallettj@alala:~)23% ls -ld /data /data/home /data/home/mallettj /home
drwxr-xr-x 13 root wheel 512 Jan 14 14:38 /data
drwx--x--x 11 root wheel 512 Dec 10 13:14 /data/home
drwxr-xr-x 26 mallettj mallettj 1536 Feb 10 17:11 /data/home/mallettj
lrwxr-xr-x 1 root wheel 10 Jun 21 2007 /home -> /data/home

/dev/da0d on /data (ufs, local, soft-updates)

Note that it doesn't happen in, e.g., my /tmp, which is:

drwxrwxrwt 17 root wheel 13312 Feb 10 17:15 /tmp
On 2009-02-10, Juli Mallett wrote:
> ^_^ (mallettj@alala:~/tst2)8% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
> pwd: .: Permission denied

> drwx--x--x 11 root wheel 512 Dec 10 13:14 /data/home

Thanks for the info. I think I know what's going on:

UFS purges the parent directory from the name cache when you do rmdir(2) ("rm -rf bar" in your case). This causes __getcwd() to fail, and getcwd(3) falls back to the userspace traversal method, which fails because of insufficient permissions to read the /data/home directory.

The behavior is allowed by SUSv3:

"If a program is operating in a directory where some (grand)parent directory does not permit reading, getcwd() may fail, as in most implementations it must read the directory to determine the name of the file. This can occur if search, but not read, permission is granted in an intermediate directory, or if the program is placed in that directory by some more privileged process (for example, login)."

The current __getcwd() implementation is not guaranteed to succeed. According to kib@ there is ongoing work to improve __getcwd(). See this message:

http://lists.freebsd.org/pipermail/freebsd-fs/2009-February/005675.html

-- Jaakko
Attached is a little C program that I've used to reproduce the problem on 7.1-RELEASE-p4.

On Linux, this behaves as I would expect:

$ ./getcwd_bug
CWD: /home/pioto/top/mine
CWD: /home/pioto/top/mine

On FreeBSD, though:

$ ./getcwd_bug
CWD: /usr/home/pioto/top/mine
getcwd: Permission denied

-- Mike Kelly
Sorry, I'm stupid, and it looks like I somehow attached the compiled program instead. Here's the real attachment. -- Mike Kelly
The first attachment came through fine, as did the second. It does not have the bug on Mac OS X, either, fwiw.
This seems to be a duplicate of:

- bin/121898: [nullfs] pwd(1)/getcwd(2) fails with Permission denied
- kern/39527: getcwd() and unreadable parent directory
- kern/22291: [nfs] getcwd(3) fails on recently-modified NFS-mounted dirs

I ran my tests on 7.2-RELEASE-p2.

IN KERNEL
=========

The problem comes from the system call __getcwd() when used on some specific FS (at least nullfs). This system call is defined in sys/kern/vfs_cache.c. It looks like cache_enter() is not called from the VOP "lookup" for nullfs (null_lookup(), in sys/fs/nullfs/null_vnops.c). cache_enter() is the function that stores the data later used by vn_fullpath1(), the function that does the actual work for __getcwd(). Not calling cache_enter() in the VOP "lookup" results in an ENOENT error in vn_fullpath1(). The number of ENOENT errors is stored in the sysctl entry "vfs.cache.numfullpathfail2".

IN USERLAND
===========

The getcwd() function (in libc, lib/libc/gen/getcwd.c) works in two phases:

- first, it tries the system call __getcwd()
- if that returns an error, it falls back to a userland algorithm that opens all parent directories.

If the user does not have read permission on at least one parent directory, getcwd() fails with an EACCES error.

A simple workaround is to add a second fallback to getcwd() for the case where both phases fail: returning the content of the environment variable PWD. (It would then be possible to make getcwd() lie: a potential security issue?)
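A rough sketch of that second fallback, with one way to limit the "lying" concern: only trust $PWD if it is absolute and names the same directory (same st_dev and st_ino) as ".". The names `pwd_names_cwd` and `getcwd_with_pwd_fallback` are hypothetical, this is not a proposed libc patch, and the check is an assumption about what would be a reasonable validation. Note that it stat()s $PWD, which needs only search permission on the parents, not read permission.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical check: 1 if `pwd` is absolute and refers to ".", else 0. */
int
pwd_names_cwd(const char *pwd)
{
	struct stat dot, env;

	if (pwd == NULL || pwd[0] != '/')
		return (0);
	if (stat(pwd, &env) == -1 || stat(".", &dot) == -1)
		return (0);
	return (env.st_dev == dot.st_dev && env.st_ino == dot.st_ino);
}

/*
 * Hypothetical wrapper: try getcwd(3) first, then fall back to a validated
 * $PWD.  On failure, returns NULL with the errno from getcwd(3) preserved.
 */
char *
getcwd_with_pwd_fallback(char *buf, size_t size)
{
	const char *pwd;
	int saved;

	assert(buf != NULL && size > 0);
	if (getcwd(buf, size) != NULL)
		return (buf);
	saved = errno;
	pwd = getenv("PWD");
	if (pwd_names_cwd(pwd) && strlen(pwd) < size) {
		strcpy(buf, pwd);	/* logical path, may contain symlinks */
		return (buf);
	}
	errno = saved;
	return (NULL);
}
```

With this check, a forged PWD can at most select a different name for the same directory (for example a path through a symlink), so the result is a logical path rather than the physical one that getcwd() normally returns.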
HOW TO REPEAT
=============

% mkdir -m 111 /aaa
% mkdir /aaa/bbb
% mount -t nullfs /lib /aaa/bbb
% cd /aaa/bbb/geom
% sysctl vfs.cache.numfullpathfail2
vfs.cache.numfullpathfail2: 12498
% su -fm nobody -c ktrace -f /tmp/ktrace.out /bin/pwd
pwd: .: Permission denied
% sysctl vfs.cache.numfullpathfail2
vfs.cache.numfullpathfail2: 12499
% kdump -f /tmp/ktrace.out | grep __getcwd
 99094 pwd      CALL  __getcwd(0x800902400,0x400)
 99094 pwd      RET   __getcwd -1 errno 2 No such file or directory

grep -r cache_enter returns (in /usr/src/sys):

./fs/cd9660/cd9660_lookup.c: cache_enter(vdp, *vpp, cnp);
./fs/cd9660/cd9660_lookup.c: cache_enter(vdp, *vpp, cnp);
./fs/coda/coda_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c: cache_enter(dvp, NULL, cnp);
./fs/coda/coda_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/hpfs/hpfs_vnops.c: cache_enter(dvp, *ap->a_vpp, cnp);
./fs/msdosfs/msdosfs_lookup.c: cache_enter(vdp, *vpp, cnp);
./fs/msdosfs/msdosfs_lookup.c: cache_enter(vdp, *vpp, cnp);
./fs/ntfs/ntfs_vnops.c: cache_enter(dvp, *ap->a_vpp, cnp);
./fs/nwfs/nwfs_io.c: cache_enter(vp, newvp, &cn);
./fs/nwfs/nwfs_vnops.c: cache_enter(dvp, vp, cnp);
./fs/nwfs/nwfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/pseudofs/pseudofs_vncache.c: * Some callers cache_enter(vp) later, so
./fs/pseudofs/pseudofs_vnops.c: cache_enter(vn, *vpp, cnp);
./fs/smbfs/smbfs_io.c: cache_enter(vp, newvp, &cn);
./fs/smbfs/smbfs_vnops.c: cache_enter(dvp, vp, cnp);
./fs/smbfs/smbfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/tmpfs/tmpfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./fs/udf/udf_vnops.c: cache_enter(dvp, *vpp, a->a_cnp);
./fs/udf/udf_vnops.c: cache_enter(dvp, *vpp, a->a_cnp);
./fs/unionfs/union_vnops.c: cache_enter(dvp, NULLVP, cnp);
./fs/unionfs/union_vnops.c: cache_enter(dvp, vp, cnp);
./fs/unionfs/union_vnops.c: cache_enter(dvp, NULLVP, cnp);
./gnu/fs/ext2fs/ext2_lookup.c: cache_enter(vdp, *vpp, cnp);
./gnu/fs/ext2fs/ext2_lookup.c: cache_enter(vdp, *vpp, cnp);
./gnu/fs/reiserfs/reiserfs_namei.c: cache_enter(vdp, *vpp, cnp);
./gnu/fs/xfs/FreeBSD/xfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./gnu/fs/xfs/FreeBSD/xfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./kern/uipc_mqueue.c: cache_enter(dvp, *vpp, cnp);
./kern/vfs_cache.c:cache_enter(dvp, vp, cnp)
./kern/vfs_cache.c: CTR3(KTR_VFS, "cache_enter(%p, %p, %s)", dvp, vp, cnp->cn_nameptr);
./nfs4client/nfs4_vnops.c: cache_enter(dvp, newvp, cnp);
./nfs4client/nfs4_vnops.c: cache_enter(dvp, newvp, cnp);
./nfs4client/nfs4_vnops.c: cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c: cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c: cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c: cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c: cache_enter(ndp->ni_dvp, ndp->ni_vp, cnp);
./sys/vnode.h:void cache_enter(struct vnode *dvp, struct vnode *vp,
./ufs/ufs/ufs_lookup.c: cache_enter(vdp, *vpp, cnp);
./ufs/ufs/ufs_lookup.c: cache_enter(vdp, *vpp, cnp);
./cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c: cache_enter(dvp, *vpp, cnp);
./cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c: cache_enter(dvp, *vpp, cnp);

So it is indeed called from most FS implementations, but not from nullfs.
attilio has turned in his commit bit. To submitter: is this problem still present in modern releases of FreeBSD?
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"
Keyword: patch or patch-ready – in lieu of summary line prefix: [patch]

* bulk change for the keyword
* summary lines may be edited manually (not in bulk).

Keyword descriptions and search interface: <https://bugs.freebsd.org/bugzilla/describekeywords.cgi>
^Triage: feedback timeout (many years).