Bug 147444 - [rc.d] [patch] /etc/rc.d/zfs stop not called on reboot & modules cause system hang
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: conf
Version: 7.3-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-rc (Nobody)
Reported: 2010-06-03 19:10 UTC by Mykah
Modified: 2020-09-12 16:14 UTC (History)

Attachments
Add 'shutdown' keyword to /etc/rc.d/zfs (230 bytes, patch)
2020-06-21 09:43 UTC, Markus Stoff

Description Mykah 2010-06-03 19:10:01 UTC
Hardware:
CPU: Quad-Core Xeon L3426
MEM: 16G
CONTROLLER: 3ware 9750-8i (all disks as single-disk units exported
individually, except for the OS which is a HW Raid1 = /dev/da0)

ZFS: 10 x 2TB raidz2 currently running defaults. (2TB Hitachi SATA)

ZFS is configured to start at boot via 'zfs_enable="YES"' in /etc/rc.conf.

Creation Commands:
# zpool create storage01 raidz2 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 
# zpool add storage01 spare da11 da12 
# zfs create storage01/tachyon 
# zfs create storage01/iron 
# zfs create storage01/boson 
# zfs create storage01/hydro

Fix: 

Note: You can get a clean 'shutdown -r now' after these edits. Unfortunately,
'reboot' still does not function and hangs the system. I believe 'reboot'
does not trigger rc.shutdown and hence does not fire off the 'rcorder -k
shutdown /etc/rc.d/*' that a 'shutdown' would.

Edit /etc/rc.d/zfs as follows:

ADD below '# REQUIRE: mountcritlocal':
# KEYWORD: shutdown

ADD @ bottom of zfs_stop_main() subroutine/function:
kldunload zfs.ko opensolaris.ko
sleep 5
kldunload zfs.ko opensolaris.ko
sleep 5
How-To-Repeat: Create a pool and then attempt to:
# reboot
or
# shutdown -r now

Result:
Waiting (max 60 seconds) for system process `vnlru' to stop...done.
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done.
Waiting (max 60 seconds) for system process `syncer' to stop...done.
Syncing disks, vnodes remaining... (etc, etc).. done
All buffers synced.
zfs_umount:1005[0] Force unmount is experimental - report any problems.
zfs_umount:1005[0] Force unmount is experimental - report any problems.
zfs_umount:1005[0] Force unmount is experimental - report any problems.
zfs_umount:1005[0] Force unmount is experimental - report any problems.
zfs_umount:1005[0] Force unmount is experimental - report any problems.
Uptime: <TIME>
--- HANGS ----
Comment 1 Mark Linimon freebsd_committer freebsd_triage 2010-06-04 02:11:01 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-rc

Over to maintainer(s).
Comment 2 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:57 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 3 Markus Stoff 2020-06-21 09:43:14 UTC
Created attachment 215840
Add 'shutdown' keyword to /etc/rc.d/zfs
Comment 4 Markus Stoff 2020-06-21 09:43:59 UTC
It's amazing that after more than 10 years a single line could not be added.

Please find attached the diff to facilitate that change. If there is a reason why /etc/rc.d/zfs should not be called by rc.shutdown, please let us know.


Patch:

rc.shutdown calls rcorder(8) with the '-k shutdown' option. rcorder(8) will therefore only return files declaring the 'shutdown' keyword:
# KEYWORD: shutdown

As of 12.1, /etc/rc.d/zfs does not declare any keywords. This patch adds the 'shutdown' keyword.
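
To check which scripts would be picked up at shutdown, the keyword filter can be approximated with grep. This is only a sketch: list_shutdown_scripts is a hypothetical helper, not part of the base system, and it looks at the '# KEYWORD:' (or '# KEYWORDS:') line only, without doing any of the dependency ordering that rcorder(8) performs. A line that lacks the colon is not matched.

```shell
# Hypothetical helper (not part of FreeBSD): a grep-based approximation of
# the filter applied by 'rcorder -k shutdown'. Prints the path of each
# given script that has a '# KEYWORD:' line listing 'shutdown'.
list_shutdown_scripts() {
    for script in "$@"; do
        if grep -Eq '^# KEYWORDS?:(.*[[:space:]])?shutdown([[:space:]]|$)' "$script"; then
            printf '%s\n' "$script"
        fi
    done
}
```

For example, 'list_shutdown_scripts /etc/rc.d/*' should list /etc/rc.d/zfs only once the patch is applied.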


Use case: Providing ZFS datasets to a jail


Problem:

Without calling /etc/rc.d/zfs on shutdown, resources associated with ZFS mounts are not freed and the jail will remain in dying state. In addition, the dataset is now in a dangling state, as the jail it is attached to is dying.


Workaround:

In /etc/jail.conf, make sure to run 'service zfs stop' when the jail is stopped:

    exec.stop = "/bin/sh /etc/rc.shutdown";
    exec.stop += "/usr/sbin/service zfs stop";


Disaster recovery:

With the jail in dying state, issue:

# After the following statements, the dataset will be unmounted and the jail
# will finally be gone
zfs set jailed=off tank/jaildata
zfs unmount tank/jaildata

# Don't forget to set jailed=on before starting the jail again
zfs set jailed=on tank/jaildata
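
The three statements above could be wrapped in a small function so that a failed step stops the sequence. This is a sketch under assumptions: release_jailed_dataset and the ZFS_CMD variable are hypothetical names introduced here for illustration, and restoring jailed=on immediately after the unmount assumes the jail is already gone at that point.

```shell
# Hypothetical wrapper around the recovery statements. ZFS_CMD is an
# assumption added for illustration: it defaults to /sbin/zfs and can be
# set to 'echo' for a dry run that only prints the arguments.
release_jailed_dataset() {
    dataset=$1
    zfs_cmd=${ZFS_CMD:-/sbin/zfs}
    # Turn jailed off so the dataset can be unmounted from the host.
    "$zfs_cmd" set jailed=off "$dataset" || return 1
    "$zfs_cmd" unmount "$dataset" || return 1
    # Restore jailed=on before the jail is started again.
    "$zfs_cmd" set jailed=on "$dataset"
}
```

A dry run would be 'ZFS_CMD=echo; release_jailed_dataset tank/jaildata', which prints the three argument lists instead of calling zfs.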


How to reproduce:

# jail.conf
test {
        path = "/jails/test";
        exec.clean;
        exec.start = "/bin/sh /etc/rc";
        exec.stop = "/bin/sh /etc/rc.shutdown";

        # Make sure to unmount all ZFS datasets before stopping the jail
        # Required unless the jail's /etc/rc.d/zfs contains '# KEYWORD: shutdown'
        exec.stop += "/usr/sbin/service zfs stop";

        # Mandatory to use ZFS in jail
        allow.mount;
        allow.mount.zfs;
        enforce_statfs = 1;     # must be less than 2

        # Attach ZFS dataset to jail
        exec.created = "/sbin/zfs jail test tank/jaildata";

        # Make sure the /dev/zfs device is included (it is with the default
        # devfs_ruleset = 4)
        mount.devfs;
}

# Create dataset
zfs create -o jailed=on -o mountpoint=/data tank/jaildata
mkdir -p /jails/test/data
sysrc -f /jails/test/etc/rc.conf zfs_enable=YES

# Start the jail
jail -c test

# List ZFS mounts
zfs mount | grep jaildata
tank/jaildata                   /jails/test/data

# Stop the jail
jail -r test

# List ZFS mounts (mount is still there)
zfs mount | grep jaildata
tank/jaildata                   /jails/test/data
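
For scripting the check above, an exact match on the dataset column may be more robust than grep, which would also match names such as tank/jaildata2. A minimal sketch, assuming the two-column 'zfs mount' output shown here; dataset_is_mounted is a hypothetical helper name:

```shell
# Hypothetical helper: reads 'zfs mount' output on stdin and succeeds only
# if the first column equals the given dataset name exactly.
dataset_is_mounted() {
    awk -v ds="$1" '$1 == ds { found = 1 } END { exit !found }'
}
```

Usage after 'jail -r test': zfs mount | dataset_is_mounted tank/jaildata && echo "still mounted"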


Additional Remarks:

While the workaround seems to be okay-ish for the jail situation, it is still unclean. However, for physical hosts this may wreak havoc with the pool if shared spares are used, as 'zfs unshare' is never invoked on shutdown.
Comment 5 Markus Stoff 2020-06-21 10:54:01 UTC
Correction for the workaround proposed in my previous post:

Instead of

    exec.stop += "/usr/sbin/service zfs stop";

the second line must read:

    exec.stop += "/usr/sbin/service zfs stop || /usr/bin/true";

This is necessary because '/usr/sbin/service zfs stop' always exits with a non-zero status if the root filesystem of the jail happens to be a ZFS dataset, which is very likely to be the case.
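
The effect of the '|| true' suffix can be seen in plain sh without touching ZFS. In this sketch, failing_stop is a hypothetical stand-in for a stop command that reports failure, and the builtin 'true' is used in place of /usr/bin/true:

```shell
# Hypothetical stand-in for a stop command that reports failure, as
# 'service zfs stop' is described to do in this comment. It does not
# touch ZFS.
failing_stop() {
    return 1
}

# Same shape as the corrected exec.stop line: the exit status of the
# whole list is 0 whether or not the stop command succeeds.
tolerant_stop() {
    failing_stop || true
}
```

Calling failing_stop leaves $? at 1, while tolerant_stop leaves it at 0, so the stop command's failure is no longer reported to the caller.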