Bug 271925

Summary: chflags(1) fails to remove uarch flag with hardlinked files (ZFS)
Product: Base System Reporter: Jamie Landeg-Jones <jamie>
Component: binAssignee: freebsd-fs (Nobody) <fs>
Status: In Progress ---    
Severity: Affects Some People CC: chris, jamie, jo, martymac, pi, se
Priority: ---    
Version: CURRENT   
Hardware: Any   
OS: Any   
Attachments:
Description Flags
terminal output demonstrating the problem
none
Update stat info of file to be changed
none
Updated patch that fixes the problem none

Description Jamie Landeg-Jones 2023-06-09 21:48:43 UTC
Created attachment 242705 [details]
terminal output demonstrating the problem

Using chflags to remove "uarch" on more than one file at a time
(but when that file count is an even number) when the files are
hardlinked (have the same inode) and on ZFS doesn't work.

This becomes more problematic when using "find... -exec chflags nouarch {} +"
or "chmod -R".

The problrem occurs because if chflags(2) resets uarch on a file
which doesn't already have uarch set, then the very update causes
uarch to be set.

To mitigate that, chflags(1) does a stat of the file, and will only
attempt to remove the flag if it actually exists.

The problem here is that the stat operation is done on all files
before the chflag(2) operation is then applied to all files.

Apart from the fact this introduces generally a race condition,
it means in this specific case that the flag is removed, and then
added again. (A sort of race against itself that it always loses!)

The stat(2) and chflags(2) should be as close together as possible,
to keep the operation as atomic as possible.

If that is unacceptable due to inefficiencies, chflags(1) should
at least ignore files that it stats which have the same inode as
another file previously stat(2)'ed

A demonstration, along with a ktrace of the offending part of the
code is attached.

Cheers, Jamie
Comment 1 Stefan Eßer freebsd_committer freebsd_triage 2023-06-10 07:06:14 UTC
Created attachment 242708 [details]
Update stat info of file to be changed

This seems to be caused by ZFS collecting multiple flag updates and by effectively mapping the setting of flags to toggling of flags.

I'm not sure that I fully understand the complex mapping of attributes in sys/module/os/freebsd/zfs/zfs_vnops_os.c (line 4888ff), but I guess that the second call sees that the new attributes are identical to the (already altered) current attributes of the file and then marks it to not need a vnode update on stable storage.

Anyway, the attached simple patch fixes the issue on ZFS. I chose to update the flags in place instead of in a separate stat structure.
Comment 2 Stefan Eßer freebsd_committer freebsd_triage 2023-06-10 07:08:51 UTC
The attached patch has been verified to fix the issue on ZFS at the cost of an extra fstatat() call.

A fix in ZFS should also be possible, but would probably be much more complex.
Comment 3 Stefan Eßer freebsd_committer freebsd_triage 2023-06-10 13:45:02 UTC
Created attachment 242714 [details]
Updated patch that fixes the problem

The updated patch performs the fstatat() call only in case the fts derived flags indicate a chflags() call is required.
Comment 4 Stefan Eßer freebsd_committer freebsd_triage 2023-06-10 13:50:22 UTC
This issue should probably be fixed in zfs_vnops.c by not invalidating the flush request of the updated flags on each 2nd call of chflags within a very short time period. (I.e., a PR should be opened on the OpenZFS issue tracker.)

And a regression test should be added (but is only useful when executed on a file system that shows the issue, i.e. ZFS, not on /tmp if it is a TMPFS, for example. /var/tmp might be assumed to be a file system supporting stable storage and could be ZFS on at least some systems the test is performed on ...)
Comment 5 Jamie Landeg-Jones 2023-06-23 01:03:14 UTC
(In reply to Stefan Eßer from comment #1)

TThanks for the quick response!
Apologies for the late reply, I haven't been at home for the last 2 weeks, so online time is more sporadic.

The problem with /bin/chflags is more generic - it's just that this ZFS quirk highlighted the difference. The more general case is that whilst /bin/chflags will not modify a file if the modification isn't needed, the logic fails when files are hardlinked... Though fixing this may be not worth the effort, it's just pedantry at the end of the day, and might even be considered "correct" behaviour - i.e. "if a file is hardlinked then expect it to be accessed for each link found."

Anyway, if it was fixed, I think filtering out duplicate inodes would be the cleanest and most efficient solution.

For example, (and I forgot to mention in my post, sorry) I did a quick chflags(2) syscall wrapper code to set the flags on a file to 0:

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
#include <errno.h>
#include <sys/stat.h>

int main (int argc, char **argv)
  {
    while (*++argv != NULL)
      if (lchflags (*argv, 0) == -1)
        warnx("%s: %s", *argv, strerror (errno));
  }

On tmpfs, UFS, and ZFS,touching a test file, and setting some flags, then running the above program resets the flags, and updates ctime, as expected.

On all three filesystems, running it again always updates the ctime on all 3 filesystems, but on ZFS it switches on uarch. Running it again switches off uarch.

Using /bin/chflags, on all three filesystems, running it again doesn't update the ctime, or on ZFS, toggle the uarch.

So the consensus seems to be "chflags(2)" updates regardless, whilst /bin/chflags will stat the file and only make the chflags(2) call if necessary (of course, this gets tripped up in the hardlinked case mentioned in this bug report)

> This seems to be caused by ZFS collecting multiple flag updates and by effectively mapping the setting of flags to toggling of flags.

I don't think it's that - try the above test program - whilst filesystems have the logic "any change updates the ctime", ZFS has the additional "any change other than one that unsets uarch will also set uarch" - this seems to break down because when you attempt to remove uarch on a file that doesn't have uarch set, this fails the "we are removing uarch" logic. This, I suspect, should be fixed in ZFS.

>  but I guess that the second call sees that the new attributes are identical to the (already altered) current attributes of the file and then marks it to not need a vnode update on stable storage.

I think it's the other way around. The chflags system call (and therefore ZFS) run the code even if the attributes are identical.
/bin/chflags doesn't run the call if the new attributes are identical, but this logic fails if a file is hardlinked, because the initial stat shows "both" need to be updated,

There are also consistencies in chmod(1) - chmod on ZFS causes an update even if no actual chmod takes place - i haven't looked more closely at this, but this opens up a whole can of worms.

I guess the first question should be "If any metdata change is attempted on a file that doesn't actually result in the data being changed (because it's "changing" it to the value it already has) then whose responsibility is it to not attempt the change, and consequently update ctime?

Is it userland, or is it the syscall, or is it the filesystem code itself?

Cheers, Jamie