Bug 120128 - [libc] [patch] __getcwd erroneously returning ENOENT
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 8.0-CURRENT
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs mailing list
Reported: 2008-01-29 21:50 UTC by Juli Mallett
Modified: 2018-05-20 23:50 UTC

Attachments
getcwd_bug.c (1.06 KB, text/plain)
2009-04-08 15:30 UTC, Mike Kelly
getcwd_bug.c (1.05 KB, text/plain)
2009-04-10 04:27 UTC, Mike Kelly

Description Juli Mallett 2008-01-29 21:50:01 UTC
When removing subdirectories, __getcwd sometimes falsely returns ENOENT
and continues to do so until pwd -L is invoked. It is the stat($PWD) that
clears the problem, presumably by refreshing cached information somewhere.
This happens only in a specific configuration (with a few edge cases).  I
see it only with tcsh, not sh, because the sh builtin compensates for the
problem.

^_^ (mallettj@alala:~)267% cd pwd
^_^ (mallettj@alala:~/pwd)268% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo
pwd: .: Permission denied
^_^ (mallettj@alala:~/pwd)269% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -L ; cd .. ; rm -rf foo
/home/mallettj/pwd/foo
^_^ (mallettj@alala:~/pwd)270% mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; ~/pwd/pwd -P ; cd .. ; rm -rf foo
/data/home/mallettj/pwd/foo
^_^ (mallettj@alala:~/pwd)271% ls -lartd .
drwxr-xr-x  3 mallettj  mallettj  512 Jan 29 15:38 .
^_^ (mallettj@alala:~/pwd)272% pwd -L
/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)273% pwd -P
/data/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)274% ls -lartd /data
drwxr-xr-x  8 root  wheel  512 Jul  5  2007 /data
^_^ (mallettj@alala:~/pwd)275% ls -lartd /data/home
drwx--x--x  9 root  wheel  512 Jan 23 19:00 /data/home
^_^ (mallettj@alala:~/pwd)276% ls -lartd /data/home/mallettj
drwxr-xr-x  13 mallettj  mallettj  1024 Jan 29 15:34 /data/home/mallettj
^_^ (mallettj@alala:~/pwd)277% ls -lartd /data/home/mallettj/pwd
drwxr-xr-x  3 mallettj  mallettj  512 Jan 29 15:38 /data/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)278% ls -lartd /home
lrwxr-xr-x  1 root  wheel  10 Jun 21  2007 /home -> /data/home
^_^ (mallettj@alala:~/pwd)279% cvs diff -u pwd.c
socket: Protocol not supported
Index: pwd.c
===================================================================
RCS file: /home/ncvs/src/bin/pwd/pwd.c,v
retrieving revision 1.25
diff -u -r1.25 pwd.c
--- pwd.c       9 Feb 2005 17:37:38 -0000       1.25
+++ pwd.c       29 Jan 2008 21:39:47 -0000
@@ -84,6 +84,8 @@
         * If we're trying to find the logical current directory and that
         * fails, behave as if -P was specified.
         */
+       struct stat phy;
+       stat(getenv("PWD"), &phy);
        if ((!physical && (p = getcwd_logical()) != NULL) ||
            (p = getcwd(NULL, 0)) != NULL)
                printf("%s\n", p);

Fix: 

I am unlikely to have time to dig into the kernel to find the root cause.
I want to stress that the patch in the description above is (obviously) not
a solution, should someone stumble over this bug and think it is.  This
affects all callers of getcwd, including my shell:

^_^ (mallettj@alala:~/pwd)286% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
tcsh: Permission denied
tcsh: Trying to start from "/home/mallettj"
pwd: .: Permission denied
^_^ (mallettj@alala:~/pwd)287% pwd -L
/home/mallettj/pwd
^_^ (mallettj@alala:~/pwd)288% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
pwd: .: Permission denied
How-To-Repeat: tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
Comment 1 Juli Mallett freebsd_committer 2008-02-03 20:53:33 UTC
Responsible Changed
From-To: freebsd-bugs->attilio

See if I can trick attilio into tracking this down further, since he's touched 
vfs_cache some recently.
Comment 2 Jaakko Heinonen 2009-02-10 16:55:17 UTC
Hi,

I couldn't reproduce this on 7.1-STABLE or 8.0-CURRENT (r188436). Do you
still see this behavior?

-- 
Jaakko
Comment 3 Juli Mallett 2009-02-11 01:16:29 UTC
On Tue, Feb 10, 2009 at 8:55 AM, Jaakko Heinonen <jh@saunalahti.fi> wrote:
> I couldn't reproduce this on 7.1-STABLE or 8.0-CURRENT (r188436). Do you
> still see this behavior?

Yep.  Rebuilt userland and kernel today from SVN.

^_^ (mallettj@alala:~/tst2)8% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
pwd: .: Permission denied
^_^ (mallettj@alala:~/tst2)9% uname -a
FreeBSD alala.evergreen.edu 8.0-CURRENT FreeBSD 8.0-CURRENT #3 r188431: Tue Feb 10 11:50:37 PST 2009 root@alala.evergreen.edu:/usr/obj/usr/src/sys/ALALA  amd64

I believe it is specific to something in how the parent directories
are laid out / their permissions.  Some relevant details:

^_^ (mallettj@alala:~)23% ls -ld /data /data/home /data/home/mallettj /home
drwxr-xr-x  13 root      wheel      512 Jan 14 14:38 /data
drwx--x--x  11 root      wheel      512 Dec 10 13:14 /data/home
drwxr-xr-x  26 mallettj  mallettj  1536 Feb 10 17:11 /data/home/mallettj
lrwxr-xr-x   1 root      wheel       10 Jun 21  2007 /home -> /data/home

/dev/da0d on /data (ufs, local, soft-updates)

Note that it doesn't happen in, e.g., my /tmp, which is:
drwxrwxrwt  17 root  wheel  13312 Feb 10 17:15 /tmp
Comment 4 Jaakko Heinonen 2009-02-11 14:24:08 UTC
On 2009-02-10, Juli Mallett wrote:
> ^_^ (mallettj@alala:~/tst2)8% tcsh -c 'mkdir foo ; cd foo ; mkdir bar ; rm -rf bar ; pwd -P ; cd .. ; rm -rf foo'
> pwd: .: Permission denied

> drwx--x--x  11 root      wheel      512 Dec 10 13:14 /data/home

Thanks for the info. I think I know what's going on: UFS purges the
parent directory from the name cache when you do rmdir(2) ("rm -rf bar" in
your case). This causes __getcwd() to fail, and getcwd(3) falls back to its
userspace traversal method, which fails because you lack permission to
read the /data/home directory.

The behavior is allowed by SUSv3: "If a program is operating in a
directory where some (grand)parent directory does not permit reading,
getcwd() may fail, as in most implementations it must read the directory
to determine the name of the file. This can occur if search, but not
read, permission is granted in an intermediate directory, or if the
program is placed in that directory by some more privileged process (for
example, login)."

The current __getcwd() implementation is not guaranteed to succeed.
According to kib@, there is ongoing work to improve __getcwd(). See this
message:

http://lists.freebsd.org/pipermail/freebsd-fs/2009-February/005675.html

-- 
Jaakko
Comment 5 Mike Kelly 2009-04-08 15:30:23 UTC
Attached is a little C program that I've used to reproduce the problem
on 7.1-RELEASE-p4.

On Linux, this behaves as I would expect:

$ ./getcwd_bug
CWD: /home/pioto/top/mine
CWD: /home/pioto/top/mine

On FreeBSD, though:

 $ ./getcwd_bug
CWD: /usr/home/pioto/top/mine
getcwd: Permission denied

-- 
Mike Kelly
Comment 6 Mike Kelly 2009-04-10 04:27:27 UTC
Sorry, I'm stupid, and it looks like I somehow attached the compiled
program instead. Here's the real attachment.

-- 
Mike Kelly
Comment 7 Juli Mallett 2009-04-10 04:30:56 UTC
The first attachment came through fine, as did the second.

It does not have the bug on Mac OS X, either, fwiw.
Comment 8 damien.bobillot 2009-11-22 13:16:32 UTC
This seems to be a duplicate of:
- bin/121898: [nullfs] pwd(1)/getcwd(2) fails with Permission denied
- kern/39527: getcwd() and unreadable parent directory
- kern/22291: [nfs] getcwd(3) fails on recently-modified NFS-mounted dirs

I ran my tests on 7.2-RELEASE-p2.

IN KERNEL
=========

The problem comes from the __getcwd() system call when it is used on some
specific filesystems (at least nullfs). This system call is defined in
sys/kern/vfs_cache.c.

It looks like cache_enter() is not called from the "lookup" VOP for
nullfs (null_lookup(), in sys/fs/nullfs/null_vnops.c). cache_enter() is
the function that stores the data later used by vn_fullpath1(), the
function that does the actual work for __getcwd().

Not calling cache_enter() in the "lookup" VOP results in an ENOENT
error from vn_fullpath1(). The number of these ENOENT errors is recorded
in the sysctl entry "vfs.cache.numfullpathfail2".

IN USERLAND
===========

The getcwd() function (in libc, lib/libc/gen/getcwd.c) works in two
phases:
- first, it tries the __getcwd() system call
- if that returns an error, it falls back to a userland algorithm that
opens all parent directories. If the user lacks read permission on any
one of the parent directories, getcwd() fails with an EACCES error.

A simple workaround is to add a second fallback to getcwd() for the case
where both phases fail: returning the content of the PWD environment
variable. (It would then be possible to make getcwd() lie: a potential
security issue?)
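
One way to limit the lying problem might be to trust $PWD only after checking that it names the same directory as ".", which is roughly what getcwd_logical() in bin/pwd/pwd.c does. The sketch below is an illustration under that assumption, not a proposed libc patch; verified_pwd() and cwd_with_pwd_fallback() are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return $PWD only if it is an absolute path that refers to the same
 * directory (device and inode) as "."; otherwise return NULL.
 */
const char *
verified_pwd(void)
{
	struct stat lg, phy;
	const char *pwd = getenv("PWD");

	if (pwd == NULL || pwd[0] != '/')
		return (NULL);
	if (stat(pwd, &lg) == -1 || stat(".", &phy) == -1)
		return (NULL);
	if (lg.st_dev != phy.st_dev || lg.st_ino != phy.st_ino)
		return (NULL);
	return (pwd);
}

/*
 * Hypothetical wrapper: try getcwd() first and use a verified $PWD only
 * when that fails.  Returns 0 on success, -1 on failure.
 */
int
cwd_with_pwd_fallback(char *buf, size_t len)
{
	const char *pwd;

	if (getcwd(buf, len) != NULL)
		return (0);
	if ((pwd = verified_pwd()) == NULL)
		return (-1);
	if (strlen(pwd) >= len) {
		errno = ERANGE;
		return (-1);
	}
	strcpy(buf, pwd);
	return (0);
}
```

Because stat() needs only search permission on the components of $PWD, the check can still pass below a search-only ancestor. The result is a logical path, so it may contain symlinks (for example /home rather than /data/home), unlike what a successful getcwd() returns.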

HOW TO REPEAT
=============

% mkdir -m 111 /aaa
% mkdir /aaa/bbb
% mount -t nullfs /lib /aaa/bbb
% cd /aaa/bbb/geom
% sysctl vfs.cache.numfullpathfail2
vfs.cache.numfullpathfail2: 12498
% su -fm nobody -c 'ktrace -f /tmp/ktrace.out /bin/pwd'
pwd: .: Permission denied
% sysctl vfs.cache.numfullpathfail2
vfs.cache.numfullpathfail2: 12499
% kdump -f /tmp/ktrace.out | grep __getcwd
99094 pwd      CALL  __getcwd(0x800902400,0x400)
99094 pwd      RET   __getcwd -1 errno 2 No such file or directory


grep -r cache_enter returns (in /usr/src/sys):
./fs/cd9660/cd9660_lookup.c:		cache_enter(vdp, *vpp, cnp);
./fs/cd9660/cd9660_lookup.c:		cache_enter(vdp, *vpp, cnp);
./fs/coda/coda_vnops.c:			cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c:		cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c:			cache_enter(dvp, *vpp, cnp);
./fs/coda/coda_vnops.c:			cache_enter(dvp, NULL, cnp);
./fs/coda/coda_vnops.c:			cache_enter(dvp, *vpp, cnp);
./fs/hpfs/hpfs_vnops.c:			cache_enter(dvp, *ap->a_vpp, cnp);
./fs/msdosfs/msdosfs_lookup.c:		cache_enter(vdp, *vpp, cnp);
./fs/msdosfs/msdosfs_lookup.c:		cache_enter(vdp, *vpp, cnp);
./fs/ntfs/ntfs_vnops.c:		cache_enter(dvp, *ap->a_vpp, cnp);
./fs/nwfs/nwfs_io.c:				cache_enter(vp, newvp, &cn);
./fs/nwfs/nwfs_vnops.c:			cache_enter(dvp, vp, cnp);
./fs/nwfs/nwfs_vnops.c:		cache_enter(dvp, *vpp, cnp);
./fs/pseudofs/pseudofs_vncache.c:				 * Some callers cache_enter(vp) later, so
./fs/pseudofs/pseudofs_vnops.c:		cache_enter(vn, *vpp, cnp);
./fs/smbfs/smbfs_io.c:				cache_enter(vp, newvp, &cn);
./fs/smbfs/smbfs_vnops.c:		cache_enter(dvp, vp, cnp);
./fs/smbfs/smbfs_vnops.c:		cache_enter(dvp, *vpp, cnp);
./fs/tmpfs/tmpfs_vnops.c:		cache_enter(dvp, *vpp, cnp);
./fs/udf/udf_vnops.c:				cache_enter(dvp, *vpp, a->a_cnp);
./fs/udf/udf_vnops.c:			cache_enter(dvp, *vpp, a->a_cnp);
./fs/unionfs/union_vnops.c:			cache_enter(dvp, NULLVP, cnp);
./fs/unionfs/union_vnops.c:		cache_enter(dvp, vp, cnp);
./fs/unionfs/union_vnops.c:		cache_enter(dvp, NULLVP, cnp);
./gnu/fs/ext2fs/ext2_lookup.c:		cache_enter(vdp, *vpp, cnp);
./gnu/fs/ext2fs/ext2_lookup.c:		cache_enter(vdp, *vpp, cnp);
./gnu/fs/reiserfs/reiserfs_namei.c:		cache_enter(vdp, *vpp, cnp);
./gnu/fs/xfs/FreeBSD/xfs_vnops.c:			cache_enter(dvp, *vpp, cnp);
./gnu/fs/xfs/FreeBSD/xfs_vnops.c:		cache_enter(dvp, *vpp, cnp);
./kern/uipc_mqueue.c:			cache_enter(dvp, *vpp, cnp);
./kern/vfs_cache.c:cache_enter(dvp, vp, cnp)
./kern/vfs_cache.c:	CTR3(KTR_VFS, "cache_enter(%p, %p, %s)", dvp, vp, cnp->cn_nameptr);
./nfs4client/nfs4_vnops.c:		cache_enter(dvp, newvp, cnp);
./nfs4client/nfs4_vnops.c:		cache_enter(dvp, newvp, cnp);
./nfs4client/nfs4_vnops.c:		cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c:		cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c:			cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c:			cache_enter(dvp, newvp, cnp);
./nfsclient/nfs_vnops.c:			        cache_enter(ndp->ni_dvp, ndp->ni_vp, cnp);
./sys/vnode.h:void	cache_enter(struct vnode *dvp, struct vnode *vp,
./ufs/ufs/ufs_lookup.c:		cache_enter(vdp, *vpp, cnp);
./ufs/ufs/ufs_lookup.c:		cache_enter(vdp, *vpp, cnp);
./cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:		cache_enter(dvp, *vpp, cnp);
./cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:			cache_enter(dvp, *vpp, cnp);

So it is indeed called from most filesystem implementations, but not
from nullfs.
Comment 9 Mark Linimon freebsd_committer freebsd_triage 2015-01-04 20:26:25 UTC
attilio has turned in his commit bit.

To submitter: is this problem still present in modern releases of FreeBSD?
Comment 10 Eitan Adler freebsd_committer freebsd_triage 2018-05-20 23:50:19 UTC
For bugs matching the following conditions:
- Status == In Progress
- Assignee == "bugs@FreeBSD.org"
- Last Modified Year <= 2017

Do
- Set Status to "Open"