Bug 254000 - net/glusterfs: Directory entries corrupted
Summary: net/glusterfs: Directory entries corrupted
Status: New
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-ports-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-03 22:44 UTC by Eirik Oeverby
Modified: 2024-08-14 05:28 UTC
CC List: 3 users

See Also:
bugzilla: maintainer-feedback? (daniel)


Description Eirik Oeverby 2021-03-03 22:44:02 UTC
When using glusterfs - both the old 3.x version and the new 8.x - we observe that while most (all?) files are correctly replicated, directory listings do not show all files.

Tested with older and current glusterfs on FreeBSD versions from 10.x via 11.x to (now) 12.2.

Example: Source filesystem has ~12k files across ~100 directories. We rsync this to a gluster volume mounted on /mnt.

Afterwards we run
  find /mnt -type f | wc -l
on two different nodes in the cluster. The numbers are not the same.

Then, on the same two nodes, we run
  find /mnt -type f | sort > /mnt/$(hostname).lst
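
A minimal sketch of one way the two resulting listings could be compared, in case it helps anyone reproducing this. The function name compare_listings and the file names in the usage line are hypothetical. Both inputs are re-sorted under the C locale first, so the result does not depend on the locale each node used for sort(1).

```sh
#!/bin/sh
# compare_listings FILE_A FILE_B   (hypothetical helper, not part of glusterfs)
# Prints the paths that appear in only one of the two listings:
#   no indent  -> only in FILE_A
#   tab indent -> only in FILE_B
compare_listings() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm(1) needs both inputs sorted in the same collation order.
    LC_ALL=C sort "$1" > "$a"
    LC_ALL=C sort "$2" > "$b"
    # -3 suppresses the column of lines common to both files.
    LC_ALL=C comm -3 "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

Usage would look something like 'compare_listings /mnt/node1.lst /mnt/node2.lst', with the real host names substituted. Empty output means both nodes listed the same set of paths.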

When comparing the two, the sets of files listed on the two nodes differ.
HOWEVER: When performing spot-checks, we DO find allegedly missing files in the filesystem, but ONLY when referring to them directly. An 'ls' in the directory that contains a "missing" file will not show the file, but a 'file <filename>' or 'ls -la <file>' or any other operation that directly accesses the file works fine.
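
The spot-check described above can be automated roughly as follows. This is only a sketch under stated assumptions: hidden_entries and list_dir are hypothetical helper names, the input is a file with one expected path per line (for example derived from the rsync source tree), and file names containing newlines are not handled. It reports every path that exists when accessed directly but is absent from the listing of its parent directory.

```sh
#!/bin/sh
# list_dir DIR: enumerate the directory (this is the step that goes through readdir).
list_dir() {
    ls -a -- "$1"
}

# hidden_entries PATHLIST   (hypothetical helper)
# PATHLIST holds one expected path per line.
hidden_entries() {
    while IFS= read -r path; do
        dir=$(dirname -- "$path")
        name=$(basename -- "$path")
        # Direct access by name; skip paths that really do not exist.
        [ -e "$path" ] || continue
        # Exact, fixed-string match of the name against the directory listing.
        if ! list_dir "$dir" | grep -Fxq -- "$name"; then
            printf '%s\n' "$path"
        fi
    done < "$1"
}
```

On a healthy filesystem the output should be empty. Any path printed is a file that can be reached by name but does not show up in its directory listing.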

Again, we've seen this every time we've tested glusterfs, causing us to abort our attempts. Now that we have a "fresh" maintainer, I'm hoping perhaps this can be solved. I don't know if the problem is in gluster or fuse or elsewhere.
Comment 1 Eirik Oeverby 2021-03-03 22:49:02 UTC
/var/log/glusterfs/mnt.log has the following entries; presumably one for each directory where one or more file entries are missing:

[2021-03-03 22:48:00.165217] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 164414: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.271681] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 165663: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.360047] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 166731: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.462738] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 167974: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.551223] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 169088: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.741055] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 171454: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.828434] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 172534: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:00.918418] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 173662: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.169557] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 176589: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.266421] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 177692: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.522581] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 180580: READDIR => -1 (Invalid argument)
[2021-03-03 22:48:01.611598] W [fuse-bridge.c:3589:fuse_readdir_cbk] 0-glusterfs-fuse: 181700: READDIR => -1 (Invalid argument)
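
To get a rough count of these warnings (for instance, to compare it against the number of directories with missing entries), the log can be filtered as sketched below. The function name count_readdir_failures is hypothetical, and the pattern assumes the message format shown above, which may differ in other glusterfs versions.

```sh
#!/bin/sh
# count_readdir_failures LOGFILE   (hypothetical helper)
# Prints the number of log lines where fuse_readdir_cbk reported READDIR => -1.
count_readdir_failures() {
    # grep -c prints the count even when it is 0 (exit status is 1 in that case).
    grep -c -e 'fuse_readdir_cbk.*READDIR => -1' -- "$1"
}
```

For the log in this comment, that would be 'count_readdir_failures /var/log/glusterfs/mnt.log'.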
Comment 2 Eirik Oeverby 2021-03-03 22:49:30 UTC
(In reply to Eirik Oeverby from comment #1)
(This is a result of a simple 'find .|wc -l' in the mount point)
Comment 3 Daniel Morante 2021-03-05 04:12:08 UTC
Thank you for testing and reporting this. I never ran into this issue myself in my limited testing, but it does sound terrible.

I will try to duplicate this issue using my Vagrant test setup (https://github.com/tuaris/Vagrant_GlusterFS) on a new branch in that repo this weekend.

This way I can go to the upstream devs with more information and (hopefully) have someone there take a closer look.
Comment 4 Zsolt Udvari freebsd_committer freebsd_triage 2024-08-13 19:29:20 UTC
Is it still relevant?
Comment 5 Eirik Oeverby 2024-08-13 19:31:05 UTC
(In reply to Zsolt Udvari from comment #4)
I have not tested for some time; is there reason to believe the behaviour has changed? If so, I can try to set up another test.
Comment 6 Zsolt Udvari freebsd_committer freebsd_triage 2024-08-14 05:28:34 UTC
A few days after your PR was filed, the port was updated to 8.4:
https://cgit.freebsd.org/ports/commit/net/glusterfs?id=02f3df6d3d23c1083188c28c06e3c8e4e8ebc373