When using glusterfs - both the old 3.x version and the new 8.x - we observe that while most (all?) files are correctly replicated, directory listings do not show all files. Tested with older and current glusterfs on FreeBSD versions from 10.x via 11.x to (now) 12.2.

Example: The source filesystem has ~12k files across ~100 directories. We rsync this to a gluster volume mounted on /mnt. Afterwards we run

    find /mnt -type f | wc -l

on two different nodes in the cluster. The numbers are not the same. Then, on both nodes, we run

    find /mnt -type f | sort > /mnt/$(hostname).lst

When comparing the two lists, the sets of files on the two nodes seem to differ.

HOWEVER: When performing spot-checks, we DO find the allegedly missing files in the filesystem, but ONLY when referring to them directly. An 'ls' in the directory that contains a "missing" file will not show the file, but a 'file <filename>' or 'ls -la <file>' or any other operation that directly accesses the file works fine.

Again, we've seen this every time we've tested glusterfs, causing us to abort our attempts. Now that we have a "fresh" maintainer, I'm hoping perhaps this can be solved. I don't know if the problem is in gluster or fuse or elsewhere.
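The spot-check described above could be automated roughly as follows. This is only a sketch: the function name check_hidden and the "hidden"/"absent" labels are made up for illustration, and it assumes both .lst files were sorted in the same locale that comm runs in and that no file name contains a newline.

```shell
#!/bin/sh
# check_hidden LIST_A LIST_B   (hypothetical helper, not part of glusterfs)
#
# LIST_A and LIST_B are the sorted listings from two nodes, e.g. the
# /mnt/$(hostname).lst files produced by: find /mnt -type f | sort
#
# For every path that appears in only one of the two listings, try to
# access it directly by name on the node where this script runs:
#   hidden <path>  -> listed on one node only, yet accessible here
#   absent <path>  -> listed on one node only and not accessible here
check_hidden() {
    {
        comm -23 "$1" "$2"   # paths only in LIST_A
        comm -13 "$1" "$2"   # paths only in LIST_B
    } | while IFS= read -r path; do
        # Testing the full path does not depend on the directory listing.
        if [ -e "$path" ]; then
            printf 'hidden %s\n' "$path"
        else
            printf 'absent %s\n' "$path"
        fi
    done
}
```

Running check_hidden with the two .lst files on either node would then separate files that are only missing from the listing ("hidden") from files that are not accessible on that node at all ("absent").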
/var/log/glusterfs/mnt.log has the following entries, presumably one for each directory where one or more file entries are missing:

[2021-03-03 22:48:00.165217] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 164414: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.271681] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 165663: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.360047] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 166731: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.462738] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 167974: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.551223] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 169088: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.741055] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 171454: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.828434] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 172534: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.918418] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 173662: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.169557] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 176589: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.266421] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 177692: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.522581] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 180580: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.611598] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 181700: READDIR => -1 (Invalid argument)
(In reply to Eirik Oeverby from comment #1)
These log entries are the result of a simple 'find .|wc -l' in the mount point.
Thank you for testing and reporting this. I never noticed or ran into this issue myself in my limited testing, but it does sound terrible. This weekend I will try to reproduce it using my Vagrant test setup (https://github.com/tuaris/Vagrant_GlusterFS), on a new branch in that repo. That way I can go to the upstream devs with more information and (hopefully) get someone there to take a closer look.
Is it still relevant?
(In reply to Zsolt Udvari from comment #4)
I have not tested for some time; is there reason to believe the behaviour has changed? If so, I can try to set up another test.
After your PR, the port was updated to 8.4 (some days later): https://cgit.freebsd.org/ports/commit/net/glusterfs?id=02f3df6d3d23c1083188c28c06e3c8e4e8ebc373