Bug 197336 - find command cannot see more than 32765 subdirectories when using ZFS
Summary: find command cannot see more than 32765 subdirectories when using ZFS
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-bugs (Nobody)
 
Reported: 2015-02-04 23:19 UTC by Will Dormann
Modified: 2022-05-01 12:21 UTC
7 users



Attachments
python script to generate a bunch of subdirectories with files in them (870 bytes, text/plain)
2015-02-04 23:19 UTC, Will Dormann
fts(3) patch that disables the nlink optimization if st_nlink >= LINK_MAX (402 bytes, patch)
2015-04-03 14:38 UTC, Jilles Tjoelker

Description Will Dormann 2015-02-04 23:19:51 UTC
Created attachment 152566 [details]
python script to generate a bunch of subdirectories with files in them

When a directory contains more than 32765 subdirectories, the find command fails to find all of its contents if it is executed on a ZFS filesystem.
If the same command is executed on another filesystem that FreeBSD supports and that also allows large numbers of subdirectories, find sees everything. I've confirmed the correct behavior with both ReiserFS and unionfs, so it appears to be something about the interaction between find and ZFS that triggers the bug.

Steps to reproduce:

1. Create a directory structure using the attached dirgen.py script
2. Verify the file count with the ls command. e.g.: ls -lR find_test_q65puW | egrep "txt$" | wc -l
3. Verify the file count with the find command. e.g.: find find_test_q65puW -name "*.txt" | wc -l
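The attached dirgen.py (attachment 152566) is the authoritative reproducer. As a rough, hypothetical sketch of a generator along those lines (the function and directory names here are illustrative, not taken from the attachment):

```python
import os
import tempfile

def generate_tree(parent, ndirs, files_per_dir=1):
    """Create ndirs subdirectories under a fresh find_test_* directory,
    each containing files_per_dir empty .txt files, and report the totals."""
    root = tempfile.mkdtemp(prefix="find_test_", dir=parent)
    for d in range(ndirs):
        sub = os.path.join(root, "dir_%06d" % d)
        os.mkdir(sub)
        for f in range(files_per_dir):
            # touch an empty .txt file so "find -name '*.txt'" has a target
            open(os.path.join(sub, "file_%d.txt" % f), "w").close()
    print("ndirs: %d nfiles: %d" % (ndirs, ndirs * files_per_dir))
    return root
```

Run with ndirs=300000 on a ZFS dataset to reproduce the layout shown in the transcript below.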

Actual results:
[~/test]$ python ./dirgen.py
ndirs: 300000 nfiles: 300000
[~/test]$ ls -l
total 219058
-rw-r--r--      1 user  user       861 Feb  4 15:30 dirgen.py
drwx------  32767 user  user    300002 Feb  4 15:31 find_test_q65puW
[~/test]$ ls -lR find_test_q65puW | egrep "txt$" | wc -l
  300000
[~/test]$ find find_test_q65puW -name "*.txt" | wc -l
   32765

The count is incomplete: 32765 instead of 300000.

Expected results:
The find command should indicate that there are 300000 files.
Comment 1 Will Dormann 2015-02-12 16:39:04 UTC
Also related:

rm -rf find_test_q65puW
will only delete 32765 subdirectories.  The others will give a "permission denied" error.
Comment 2 Andrey V. Elsukov freebsd_committer freebsd_triage 2015-02-17 09:00:57 UTC
Adding a link to the explanation from bde@:
https://lists.freebsd.org/pipermail/freebsd-bugs/2015-February/060317.html
Comment 3 Will Dormann 2015-02-17 12:33:43 UTC
> It is impossible for the other file systems to work much better.  Perhaps
> they work up to 65535, or have the correct {LINK_MAX} and the python script
> is smart enough to avoid it.  I doubt that python messes with {LINK_MAX},
> but creation of subdirectories should stop when the advertized limit is
> hit, and python or the script should handle that, possibly just by
> stopping.


I don't know what to say about it being impossible that other filesystems work better. Because FreeBSD doesn't support R/W ReiserFS, I used a USB thumb drive that was both formatted and populated (with the Python script) on a Linux system. In the case of unionfs, I simply had a FreeBSD jail that shared the underlying ZFS directory structure.

In both cases, the find command successfully reported the existence of all 300,000 subdirectories.
Comment 4 Jilles Tjoelker freebsd_committer freebsd_triage 2015-02-18 20:48:05 UTC
Although bde@ is right that FreeBSD cannot report the link count correctly if a directory has more than 32765 subdirectories, this usually need not be a problem. The fts(3) code underlying find(1) uses the link count to avoid stat calls, even on filesystems (most of them) that support d_type, which would allow avoiding those same stat calls anyway.
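The numbers in the report line up with this explanation: a directory's link count is 2 (for "." and "..") plus one per subdirectory, the pre-ino64 16-bit nlink_t saturates at 32767, and the nlink optimization stops descending after nlink - 2 subdirectories. A quick check of that arithmetic:

```python
LINK_MAX = 32767                  # pre-ino64 nlink_t limit (16-bit signed)

ndirs = 300000
true_nlink = ndirs + 2            # "." and ".." plus one link per subdirectory
reported_nlink = min(true_nlink, LINK_MAX)  # saturated value, as ls -l showed
subdirs_visited = reported_nlink - 2        # fts stops after nlink - 2 subdirs

print(reported_nlink)   # 32767, matching the ls -l link count in the report
print(subdirs_visited)  # 32765, matching the truncated find output
```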

Some possible solutions are disabling the nlink optimization if fts_nlink >= LINK_MAX, disabling the nlink optimization for ZFS, ignoring the nlink optimization if d_type tells otherwise, or removing the nlink optimization entirely.

As a workaround, specify a find(1) expression that forces it to stat everything, such as -ls or -size 0 -o -size +0.
Comment 5 Jilles Tjoelker freebsd_committer freebsd_triage 2015-04-03 14:38:47 UTC
Created attachment 155161 [details]
fts(3) patch that disables the nlink optimization if st_nlink >= LINK_MAX

This patch disables the nlink optimization if st_nlink >= LINK_MAX (assuming that the real link count may be higher than st_nlink in that case).
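As a toy model of the idea (this is not the actual fts(3) code, just an illustration of the condition the patch adds): with the optimization, traversal of a directory stops once nlink - 2 subdirectories have been seen; treating a saturated st_nlink as untrustworthy restores a full traversal.

```python
LINK_MAX = 32767  # pre-ino64 nlink_t limit

def visited_subdirs(subdir_count, patched):
    """Toy model of fts(3) descending one directory.

    st_nlink is the (possibly saturated) link count the filesystem
    reports; the true count would be subdir_count + 2.
    """
    st_nlink = min(subdir_count + 2, LINK_MAX)
    use_nlink_opt = True
    if patched and st_nlink >= LINK_MAX:
        # Patch behavior: the real link count may exceed st_nlink,
        # so do not trust it to bound the descent.
        use_nlink_opt = False
    if use_nlink_opt:
        return min(subdir_count, st_nlink - 2)  # stop after nlink - 2 subdirs
    return subdir_count                          # stat/visit everything

print(visited_subdirs(300000, patched=False))  # 32765 -- the reported bug
print(visited_subdirs(300000, patched=True))   # 300000 -- with the patch
```

Small directories are unaffected either way, since their st_nlink never reaches LINK_MAX.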
Comment 6 Pau Amma 2022-04-30 18:39:57 UTC
This seems to no longer happen, at least on 13.0. I haven't tried 12.3.

pauamma@bug-197336:~/test % freebsd-version -kru
13.0-RELEASE-p11
13.0-RELEASE-p11
13.0-RELEASE-p11
pauamma@bug-197336:~/test % python2 ./dirgen.py 
ndirs: 300000 nfiles: 300000
pauamma@bug-197336:~/test % ls -l
total 33021
-rw-r--r--       1 pauamma  pauamma     869 Apr 30 17:49 dirgen.py
drwx------  300002 pauamma  pauamma  300002 Apr 30 17:56 find_test_oR_Hyn
pauamma@bug-197336:~/test % ls -lR find_test_oR_Hyn | egrep "txt$" | wc -l
Illegal variable name.
pauamma@bug-197336:~/test % ls -lR find_test_oR_Hyn | egrep 'txt$' | wc -l
  300000
pauamma@bug-197336:~/test % find find_test_oR_Hyn -name "*.txt" | wc -l
  300000
Comment 7 Jilles Tjoelker freebsd_committer freebsd_triage 2022-05-01 12:21:19 UTC
After ino64, which increased nlink_t to 64 bits, the problematic truncation of the link count no longer occurs. All supported releases have ino64, so I'm closing this bug.