Bug 249413

Summary: bectl - should manipulate canmount on child datasets of the root instead of relying on /etc/rc.d/zfsbe
Product: Base System
Reporter: cmh <freebsd>
Component: bin
Assignee: freebsd-bugs (Nobody) <bugs>
Status: New
Severity: Affects Some People
CC: allanjude, anyonearomatic, bdrewery, kevans, lwhsu, parv.0zero9+freebsd, vsasjason
Priority: ---
Version: 11.4-RELEASE
Hardware: Any
OS: Any

Description cmh 2020-09-17 23:58:12 UTC
bectl uses a counterintuitive and confusing approach to mounting child datasets of the root in boot environments.

The current approach sets all boot environment datasets to canmount=noauto and relies on /etc/rc.d/zfsbe to mount individual child datasets. 

This works, but it is unnecessarily opaque and behaves differently from what users normally expect of zfs filesystem mounting. In addition, the approach requires a specific naming scheme for child datasets (see comments in /etc/rc.d/zfsbe) that appears unnecessary and is not documented anywhere a user of the provided tools would find it without reading the script.

I suggest that bectl be modified to set the zfs canmount=on property on the root dataset and all child datasets when activating a boot environment, and set the zfs canmount=noauto property on the root and child datasets of the boot environment that is deactivated. In this arrangement there would be no need for magic behaviour from /etc/rc.d/zfsbe at boot time.
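As a rough illustration of the proposal above, here is a toy model in Python. It does not run any zfs or bectl commands; datasets are just dictionary keys, and the function name `activate` and the dataset names are made up for the example.

```python
# Toy model only: name -> canmount value. Nothing here touches a real pool.
def activate(props, be_root, new_be):
    """Return a copy of props where new_be and its children are 'on'
    and every other boot environment under be_root is 'noauto'."""
    prefix = be_root + "/"
    updated = dict(props)
    for name in props:
        if not name.startswith(prefix):
            continue  # not a boot environment dataset, leave it alone
        be_name = name[len(prefix):].split("/", 1)[0]
        updated[name] = "on" if be_name == new_be else "noauto"
    return updated


# Hypothetical layout, for illustration only.
props = {
    "zroot/ROOT/default": "on",
    "zroot/ROOT/default/var": "on",
    "zroot/ROOT/upgrade": "noauto",
    "zroot/ROOT/upgrade/var": "noauto",
    "zroot/home": "on",
}
after = activate(props, "zroot/ROOT", "upgrade")
```

In this sketch the canmount values alone describe which boot environment gets mounted, which is the behaviour I would expect from the zfs tools.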

In addition, the current behaviour for child datasets is undocumented in the bectl manpage (which is quite terse), as are the naming scheme requirements in /etc/rc.d/zfsbe. Regardless of the outcome of this bug, bectl needs additional documentation on child datasets of the root, as the process is not fully explained. For example, the significance of the -r flag on 'bectl create' needs to be explained in the context of child datasets of the root.

The discussion at the following forum thread may also be helpful.
https://forums.freebsd.org/threads/boot-environments-child-datasets-not-activating.77002/
Comment 1 Andriy Gapon freebsd_committer freebsd_triage 2020-09-18 12:09:36 UTC
I believe that bectl uses a very good approach to mounting subordinate datasets.

Can you share what you consider to be unnecessarily opaque?
What is unexpected behavior?

Which naming convention, in your opinion, does zfsbe require?
Comment 2 cmh 2020-09-19 13:23:50 UTC
(In reply to Andriy Gapon from comment #1)

Thanks for your questions.

Regarding opacity: ZFS documentation states that canmount=noauto means that the dataset can only be mounted explicitly, not automatically. Therefore, if I do a zfs list -o name,canmount and see that the dataset is set to noauto, I should trust it will not be mounted at boot time. The current FreeBSD approach violates that trust, because an undocumented startup script goes in and manually mounts those filesystems. 

Regarding the naming convention, /etc/rc.d/zfsbe contains a comment stating: "# Handle boot environment subordinate filesystems that may have canmount property set to noauto. For these filesystems mountpoint relative to / must be the same as their dataset name relative to BE root dataset." This is not mentioned in the bectl manpage, nor is the existence of /etc/rc.d/zfsbe mentioned there. Presumably the administrator is supposed to glean all of this without reading the code, but I don't see where or how.
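To make the rule in that comment concrete, here is how I read it, written as a small Python check. This is my interpretation of the comment, not code taken from zfsbe, and the function name and example datasets are hypothetical.

```python
def follows_zfsbe_rule(be_root_dataset, dataset, mountpoint):
    """True if the mountpoint relative to / equals the dataset name
    relative to the BE root dataset (my reading of the zfsbe comment)."""
    prefix = be_root_dataset + "/"
    if not dataset.startswith(prefix):
        return False  # not a child of this boot environment
    relative_name = dataset[len(prefix):]
    return mountpoint.strip("/") == relative_name


# Hypothetical examples.
ok = follows_zfsbe_rule("zroot/ROOT/default", "zroot/ROOT/default/var", "/var")
bad = follows_zfsbe_rule("zroot/ROOT/default", "zroot/ROOT/default/var", "/data")
```

Under this reading, a child dataset named var must be mounted at /var, and one mounted at /data would not qualify.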

I may be missing something, but it seems to me that this system is unnecessary. bectl is already adjusting the canmount property. It should simply set canmount as documented (i.e. between noauto and on as appropriate) and rely on the normal boottime zfs mounting mechanism. Unless I am misguided here, I think the additional magic script (zfsbe) should be removed as it makes the output of the zfs tools irrelevant and misleading.
Comment 3 Anton Saietskii 2022-05-26 12:45:25 UTC
(In reply to cmh from comment #2)

> It should simply set canmount as documented (i.e. between noauto and on as appropriate) and rely on the normal boottime zfs mounting mechanism.
Let's imagine that you have 2 BEs, each with a subdataset:
zroot/ROOT/default
zroot/ROOT/default/var
and
zroot/ROOT/upgrade
zroot/ROOT/upgrade/var

Of course, each pair _needs_ to be mounted, but the mountpoints are the same. When you boot into default, which var will get mounted if this is done automatically?
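A small Python sketch of that conflict, as a toy model only (no zfs commands are run, and the helper name `automount_conflicts` is made up): if both var datasets were eligible for automatic mounting, they would claim the same mountpoint.

```python
from collections import defaultdict


def automount_conflicts(datasets):
    """Group datasets with canmount=on by mountpoint and return the
    mountpoints that more than one dataset would claim."""
    claims = defaultdict(list)
    for name, (canmount, mountpoint) in datasets.items():
        if canmount == "on":
            claims[mountpoint].append(name)
    return {mp: sorted(names) for mp, names in claims.items() if len(names) > 1}


# Both BEs' var datasets marked canmount=on with the same mountpoint.
datasets = {
    "zroot/ROOT/default/var": ("on", "/var"),
    "zroot/ROOT/upgrade/var": ("on", "/var"),
}
conflicts = automount_conflicts(datasets)
```

Here `conflicts` lists /var with both datasets, so something other than the mountpoint has to decide which one wins.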

> Therefore, if I do a zfs list -o name,canmount and see that the dataset is set to noauto, I should trust it will not be mounted at boot time.
But I totally agree with this.

Perhaps using a custom property to indicate BEs would be better.
Comment 4 Allan Jude freebsd_committer freebsd_triage 2022-05-27 12:59:38 UTC
You can temporarily change which BE is being booted from the loader menu, where there is no chance for bectl to swap all of the properties around. Then you end up with the wrong var mounted.

I think canmount=noauto and letting rc.d/zfsbe do it is the only way this can work.
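A toy Python model of the loader-menu case, assuming the canmount swap had been implemented as proposed. It is not how zfsbe is written; both helper names are hypothetical and only contrast "pick by property" with "pick by what is actually booted".

```python
def children_by_property(props):
    """Datasets an automatic mount would pick: those with canmount=on."""
    return sorted(name for name, value in props.items() if value == "on")


def children_of_booted_be(props, booted_root):
    """Datasets chosen from the BE that is actually booted, ignoring canmount."""
    return sorted(name for name in props if name.startswith(booted_root + "/"))


props = {
    "zroot/ROOT/default/var": "noauto",
    "zroot/ROOT/upgrade/var": "on",  # 'upgrade' was the last activated BE
}
booted = "zroot/ROOT/default"  # chosen once from the loader menu

by_property = children_by_property(props)
by_booted = children_of_booted_be(props, booted)
```

With these inputs the property-based choice picks upgrade/var while the running system is default, which is the wrong-var outcome described above.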
Comment 5 Bryan Drewery freebsd_committer freebsd_triage 2022-06-16 15:57:50 UTC
I think canmount=on used to (many years ago) force a mount right away? It's at least a 10-year-old memory. It could explain why we don't use canmount=on. Currently on OpenZFS I don't see it behaving like that.
Comment 6 Bryan Drewery freebsd_committer freebsd_triage 2022-06-16 16:32:14 UTC
(In reply to Bryan Drewery from comment #5)

Yup.

https://github.com/openzfs/zfs/commit/6f1ffb06655008c9b519108ed29fbf03acd6e5de
https://www.illumos.org/issues/2883
"changing "canmount" property to "on" should not always remount dataset"

This is why we avoided the "on" for canmount early on.