Bug 260793 - 'swapon -a' can crash the system
Summary: 'swapon -a' can crash the system
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 12.3-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
Reported: 2021-12-29 11:10 UTC by Peter Much
Modified: 2022-01-12 01:17 UTC

Description Peter Much 2021-12-29 11:10:45 UTC
Reproduce: 
----------
1. Have some early and late swapspaces configured and decently filled already.
2. With paging not yet exhausted, add another *late* swapspace into fstab.
3. Run "swapon -a".

This should not do anything: the old swapspaces are already active, and the new one is configured as "late", so a mere "swapon -a" would not activate it.

Error:
------
kernel: pid 12296 (daemon), jid 5, uid 5100: exited on signal 10 (core dumped)
kernel: pid 17717 (ruby27), jid 5, uid 5100: exited on signal 6 (core dumped)
kernel: pid 14938 (daemon), jid 10, uid 5100: exited on signal 10 (core dumped)
kernel: pid 19184 (ruby27), jid 10, uid 5100: exited on signal 11 (core dumped)
kernel: pid 19182 (ruby27), jid 10, uid 5100: exited on signal 10 (core dumped)
[etc. etc. lots of them]

And subsequent kernel crash.
(The processes dumping core appear to be those that are fully swapped out, but this is unconfirmed.)

I did not find time to analyze this in depth. It happened twice, and it is obvious that 'swapon -a' mangles something seriously, so I no longer use it except during bringup.
Comment 1 Mark Johnston freebsd_committer freebsd_triage 2022-01-05 18:22:19 UTC
Do you have "trimonce" in your fstab options by any chance?  Can you show the full fstab entries for the swap devices?
Comment 2 Peter Much 2022-01-05 19:42:43 UTC
 @markj
You're great. That explains it.

So, indeed, one shouldn't re-run "swapon -a" when trimonce is set. 
Or, maybe, we could catch the issue?
Comment 3 Mark Johnston freebsd_committer freebsd_triage 2022-01-05 20:07:35 UTC
(In reply to Peter Much from comment #2)
I think swapon should handle this scenario.  swapon -a already silently ignores EBUSY from swapon(), which occurs when the file is already a swap device, so it shouldn't also discard blocks in a file that is already a swap device.  Normally the kernel wouldn't allow this, but we have: https://github.com/freebsd/freebsd-src/blob/main/sys/vm/swap_pager.c#L3006
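
To make the ordering problem concrete, here is a minimal model in C. It is a sketch only, not the swapon(8) source: struct swap_target, model_swapon(), enable_unguarded() and enable_guarded() are hypothetical names invented for illustration, and the "trim" is just a flag. It assumes the trim is issued before the swapon() call, so the discard has already happened by the time EBUSY is returned and ignored.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a swap target; not a real kernel or swapon(8) type. */
struct swap_target {
    bool active;    /* already in use as a swap device */
    bool trimonce;  /* fstab option asks for a one-time trim */
    bool trimmed;   /* set when the model "discards" the blocks */
};

/* Stand-in for the swapon() syscall: fails with EBUSY if already active. */
static int model_swapon(struct swap_target *t)
{
    if (t->active) {
        errno = EBUSY;
        return -1;
    }
    t->active = true;
    return 0;
}

/* Trim first, then swapon: the discard happens before EBUSY can be seen. */
static int enable_unguarded(struct swap_target *t)
{
    if (t->trimonce)
        t->trimmed = true;      /* blocks discarded unconditionally */
    if (model_swapon(t) == -1 && errno != EBUSY)
        return -1;
    return 0;                   /* EBUSY is silently ignored */
}

/* Guarded variant: skip the trim when the target is already active. */
static int enable_guarded(struct swap_target *t)
{
    if (t->trimonce && !t->active)
        t->trimmed = true;
    if (model_swapon(t) == -1 && errno != EBUSY)
        return -1;
    return 0;
}
```

In this model both variants return success for an already active target, but only the unguarded one marks it as trimmed. The open question in the rest of this comment is how the real swapon could learn the equivalent of the `active` flag.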

Perhaps swapon's swapon_trim() can check to see if a file at the path is already in use as a swap device.  I don't think we have a good way to do that though.  For devfs files we can compare the device number with the ones available from the vm.swap_info sysctl, but this won't work for regular files, I think.

Perhaps we can extend the swapon() syscall to let the kernel perform the trimming.  Or add an IS_THIS_A_SWAP_DEVICE ioctl that swapon can check before trying to erase the file blocks.
Comment 4 Mark Johnston freebsd_committer freebsd_triage 2022-01-11 15:54:42 UTC
https://reviews.freebsd.org/D33846
Comment 5 Peter Much 2022-01-11 19:49:18 UTC
Mark, 
there seems to be a logical flaw in here:

You say we would be safe if the swap device were opened in write-exclusive mode, and You say this cannot be done because of
https://github.com/freebsd/freebsd-src/blob/main/sys/vm/swap_pager.c#L3006
which says that savecore(8) needs write access to it.

Now I am wondering what business savecore(8) might have on an already activated swap device anyway, because I would think it is simply too late by then.

But what I definitely don't understand: what is the point of having savecore(8) access a swapspace that has been trimonce'd right before? (The swapon(8) manpage does already say that this wouldn't work.)
Comment 6 Mark Johnston freebsd_committer freebsd_triage 2022-01-12 01:17:48 UTC
(In reply to Peter Much from comment #5)
If trimonce is not set, then savecore will most likely be able to recover a kernel dump from a dump (swap) device, after which it clears metadata on the device.  This clearing is what prevents the swap pager from being able to exclusively "claim" a device.  The kernel has no idea whether swapon intends to trim the device or not.