Utilizing find(1)'s 'newer' primary expression is broken with symbolic links. According to the man page, -P should be the default behaviour, but even if explicitly defining -P, find uses timestamp information from the file referenced by the link, not the link itself. To see in action (copy'n'pastable for sh and csh): [ -e /tmp/find-test ] && rm -R /tmp/find-test mkdir /tmp/find-test && cd /tmp/find-test ln -s fILE lINK-to-fILE && sleep 1 echo "Created after lINK-to-fILE" > fILE2 && sleep 1 echo "Newest in the hood" > fILE && sleep 1 find . -newermm lINK-to-fILE It doesn't return fILE2 nor fILE, while stat(1) clearly confirms newer m_times for fILE and fILE2 compared to lINK-to-fILE ( 'stat -f "%N: %Sm" *' ). It's possible that find(1) hasn't been working like expected for a quiet long time. I first recognized strange results arround FreeBSD 8.0 but haven't had time|need to investigate. More precisely, I thought that the error was in my backup script due to timestamp precision changes, which changed around that time if I remember correctly. But since timestamp comparings is a fallback mechanism in my script, never needed until yesterday in real world, I haven't paid any attention until today. To falsify find(1)'s misbehaviour, continue the test above with: touch -m -t `stat -f %Sm -t %y%m%d%H%M.%S lINK-to-fILE` fILE find . -newermm lINK-to-fILE Now you get the result which was expected before. Why don't I just use 'touch -r' ? Well, tried that of course, but I had to find out that touch(1) is not using the m_time of the link, but from the file referenced by the link. touch(1) hasn't options which influence that and doesn't state default behaviour... Will see if I can identify the problem in the source of find(1) and add comments as soon as I found anything. But I hope people with better C skills jump in! Even if it's a old bug, it's a very nasty one, because it can affect user's data... So high priority IMHO. -harry
Here the behavior is described in the POSIX 2008, 2016 edition standard: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html Harald is correct that the default behavior (and for -H, traversed paths) should use the metadata for the link itself, rather than that of the referenced file.
Truss: open(".",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC,01) = 5 (0x5) fstatfs(5,{ fstypename=ufs,mntonname=/,mntfromname=/dev/gpt/freebsd-root,fsid= }) = 0 (0x0) fstat(5,{ mode=drwxr-xr-x ,inode=42455429,size=512,blksize=32768 }) = 0 (0x0) fchdir(0x5) = 0 (0x0) getdirentries(5,"\M^E\M-Q\M^G\^B\0\0\0\0\0\0\0\0"...,4096,{ 0x0 }) = 168 (0xa8) fstatat(AT_FDCWD,"lINK-to-fILE",{ mode=lrwxr-xr-x ,inode=42459218,size=4,blksize=32768 },AT_SYMLINK_NOFOLLOW) = 0 (0x0) fstatat(AT_FDCWD,"fILE2",{ mode=-rw-r--r-- ,inode=42459473,size=27,blksize=32768 },AT_SYMLINK_NOFOLLOW) = 0 (0x0) fstatat(AT_FDCWD,"fILE",{ mode=-rw-r--r-- ,inode=42459474,size=19,blksize=32768 },AT_SYMLINK_NOFOLLOW) = 0 (0x0) getdirentries(5,0x8006a8000,4096,{ 0x200 }) = 0 (0x0) close(5) = 0 (0x0) So we see that find(1) is correctly using AT_SYMLINK_NOFOLLOW to obtain stat information for the symlink, as well as the other files. So either the kernel is broken, or the comparison is somehow broken.
(In reply to Conrad Meyer from comment #2) Ah, that is during traversal. For the reference file (the "plan"), find(1) is incorrectly using fstatat without the NOFOLLOW flag: fstatat(AT_FDCWD,"lINK-to-fILE",{ mode=-rw-r--r-- ,inode=42459474,size=19,blksize=32768 },0x0) = 0 (0x0)
This change ought to be sufficient to fix -newer: --- a/usr.bin/find/function.c +++ b/usr.bin/find/function.c @@ -1201,6 +1201,7 @@ c_newer(OPTION *option, char ***argvp) char *fn_or_tspec; PLAN *new; struct stat sb; + int error; fn_or_tspec = nextarg(option, argvp); ftsoptions &= ~FTS_NOSTAT; @@ -1214,7 +1215,11 @@ c_newer(OPTION *option, char ***argvp) /* Use the seconds only in the comparison. */ new->t_data.tv_nsec = 999999999; } else { - if (stat(fn_or_tspec, &sb)) + if (ftsoptions & FTS_PHYSICAL) + error = lstat(fn_or_tspec, &sb); + else + error = stat(fn_or_tspec, &sb); + if (error != 0) err(1, "%s", fn_or_tspec); if (option->flags & F_TIME2_C) new->t_data = sb.st_ctim; However, -samefile is similarly broken. Here's a patch for that part: --- a/usr.bin/find/function.c +++ b/usr.bin/find/function.c @@ -1066,12 +1066,17 @@ c_samefile(OPTION *option, char ***argvp) char *fn; PLAN *new; struct stat sb; + int error; fn = nextarg(option, argvp); ftsoptions &= ~FTS_NOSTAT; new = palloc(option); - if (stat(fn, &sb)) + if (ftsoptions & FTS_PHYSICAL) + error = lstat(fn, &sb); + else + error = stat(fn, &sb); + if (error != 0) err(1, "%s", fn); new->i_data = sb.st_ino; return new;
This has apparently been broken since inclusion in 2001 (r76250) of a patch originally submitted in 1999.
No, hold on, it was broken even before that, in the BSD 4.4-Lite source import: r1590. Probably broken from the dawn of CSRG.
Thanks a lot Conrad! I spent an hour on this back than and reached the same point in code (as far as I remember), but wasn't even close to your solution. My result was more like pigs staring at clockworks ... :-( But I checked different sources and also all linux versions I found had the same erroneous implementation. Glad you finally made it POSIX/man page compatible (after 20+ years?!?). Unfortunately I don't have time to do any tests, but I don't think there's a need for testing if the example show correct behaviour now. Have an guadn Rutsch! -harry
Happy new years!
A commit references this bug: Author: cem Date: Fri Dec 29 22:08:44 UTC 2017 New revision: 327362 URL: https://svnweb.freebsd.org/changeset/base/327362 Log: find(1): Fix -newer and -samefile to conform to POSIX[0] By default, or with the -P flag, find(1) should evaluate paths "physically." For symlinks, this means using the link itself instead of the target. Historically (since the import of BSD 4.4-lite from CSRG), find(1) has failed to refer to the link itself, at least for -newer and -samefile. [0]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html PR: 222698 Reported by: Harald Schmalzbauer <bugzilla.freebsd AT omnilan.de> Sponsored by: Dell EMC Isilon Changes: head/usr.bin/find/Makefile head/usr.bin/find/function.c head/usr.bin/find/tests/ head/usr.bin/find/tests/Makefile head/usr.bin/find/tests/find_test.sh
What should happen in -H mode (half-logical; symlinks mentioned on the command line are followed but symlinks found in a directory tree walk are not)? A literal interpretation of POSIX (with the assumption that auxiliary arguments work the same way as primary arguments) would be that symlinks should be followed, but in the case of -samefile this makes it non-reflexive.
(In reply to Jilles Tjoelker from comment #10) This is a bit belated, but I think the requested behavior in this PR is actually POSIX-incompliant, and your question should technically be moot (with the benefit of hindsight). The -H and -L descriptions from the 2018 edition are pretty clear here: > ...for each symbolic link encountered as a *path* operand on the command line ... The standard later goes on to carefully describe -newer's argument as a file, rather than an arbitrary path that should be subject to link resolution.