Bug 224975 - shutdown(8) needs to wait longer for swapoff to avoid a “Cannot allocate memory” error
Status: Closed Unable to Reproduce
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Only Me
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on: 224479
Blocks: 187081
Reported: 2018-01-07 18:27 UTC by Wolfram Schneider
Modified: 2023-02-08 07:43 UTC (History)
Description Wolfram Schneider freebsd_committer freebsd_triage 2018-01-07 18:27:32 UTC
While analysing bug #224479 I noticed that “shutdown -r now” runs too fast and fails to swapoff a swap device or swap file.

I see on the console the error message:

  swapoff: /dev/md99: Cannot allocate memory

and shortly afterwards a kernel panic. Not good.

This happens when more swap space is in use than there is free memory available. For example, you have 2.5GB of swap space, 69MB of it is in use, and only 49MB of Free memory is available (according to the top(1) command).

How to repeat:

# start some processes that need a little more RAM than is available
for i in $(seq 1 20);do perl -e '$a=`man tcsh`; for(0..100) { $b.=$a}; sleep 100' & done

top(1) reports:

Mem: 611M Active, 51M Inact, 112M Laundry, 142M Wired, 103M Buf, 49M Free
Swap: 2500M Total, 69M Used, 2431M Free, 2% Inuse


# now reboot with shutdown
$ shutdown -r now


You will see the “swapoff: /dev/md99: Cannot allocate memory” error message, because the 49M of Free memory is less than the 69M of used swap, followed by a kernel swap_pager I/O error message.
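The condition can be written down as a small check. This is only a rough sketch of the rule of thumb used in this report (used swap compared against Free memory); the function name swap_fits_in_free is made up for illustration, and it ignores Inactive and Laundry memory that the kernel may also be able to reclaim.

```sh
#!/bin/sh
# Sketch: compare used swap against free memory, both in the same unit
# (for example MB, as printed by top(1)).
# swap_fits_in_free is a hypothetical helper, not an existing tool.
swap_fits_in_free() {
    used_swap=$1
    free_mem=$2
    if [ "$used_swap" -le "$free_mem" ]; then
        echo "ok"
    else
        echo "short by $((used_swap - free_mem))"
    fi
}

# Values from the top(1) output above: 69M swap used, 49M Free.
swap_fits_in_free 69 49
# prints "short by 20"
```

On a live system the two numbers could presumably be taken from swapinfo(8) and from the vm.stats.vm.v_free_count and hw.pagesize sysctls, after converting both to the same unit.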

When memory is low, I think that shutdown/reboot needs to wait a little (3 to 10 seconds) after we kill the processes. Then there will be enough free memory available, and the swapoff call will run successfully.
Comment 1 Jilles Tjoelker freebsd_committer freebsd_triage 2018-01-07 21:19:50 UTC
Just "waiting for a few seconds" will not help. The order of operations would have to be adjusted. The current order is (incomplete):

 * shutdown(8) prints final warning message
 * shutdown(8) signals init(8)
 * init(8) sends SIGHUP to all /etc/ttys session leaders and revokes the terminals
 * init(8) starts rc.shutdown
 * rc.shutdown shuts down some daemons
 * rc.shutdown runs /etc/rc.d/swaplate, turning off swap with the late flag
 * rc.shutdown shuts down other daemons
 * init(8) revokes /dev/console
 * init(8) signals all processes with SIGTERM and then SIGKILL, waiting up to 20 seconds for them to terminate
 * init(8) calls reboot(2) with appropriate arguments
 * kernel syncs
 * kernel unmounts (forcibly) all filesystems
 * kernel turns off all swap
 * kernel instructs hardware to power off, reboot, etc.

As a result, any swap files must be turned off by /etc/rc.d/swaplate. If not, the kernel will panic when trying to read data from the swap file when turning it off, since the filesystems have already been unmounted.

You can make scenarios like yours work (without changes to FreeBSD) if you ensure the memory-eating processes are either shut down by an rc.d script that runs before swaplate in the shutdown order or are in the foreground of a tty which is enabled in /etc/ttys.
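
A minimal sketch of the first option, an rc.d script that is stopped before swaplate. Everything named memhog here (the service name, memhog_enable, memhog_pattern) is hypothetical and only for illustration. The sketch assumes that rc.shutdown stops scripts carrying the shutdown keyword in the reverse of their rcorder(8) start order, so a script that requires swaplate at boot should be stopped before it at shutdown; it would need memhog_enable="YES" in rc.conf and a location in one of the rc.d directories.

```sh
#!/bin/sh
#
# PROVIDE: memhog
# REQUIRE: swaplate
# KEYWORD: shutdown

. /etc/rc.subr

name="memhog"
rcvar="memhog_enable"
start_cmd=":"               # nothing to start; this script only matters at shutdown
stop_cmd="memhog_stop"

memhog_stop()
{
    # Hypothetical pattern; adjust it to match the real memory-heavy jobs.
    pkill -TERM -f "${memhog_pattern}" || return 0
    # Give the processes a moment to exit so that their pages are freed
    # before swaplate calls swapoff.
    _i=0
    while pgrep -f "${memhog_pattern}" >/dev/null && [ "$_i" -lt 10 ]; do
        sleep 1
        _i=$((_i + 1))
    done
}

load_rc_config $name
: ${memhog_enable:="NO"}
: ${memhog_pattern:="perl -e"}

run_rc_command "$1"
```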

What could be done in FreeBSD is adding unforced unmount and swapoff after all processes have been signaled. This could be in init(8) or the kernel. Some looping may be beneficial since turning off a swap file may make it possible to unmount a filesystem without forcing.
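
The looping idea can be sketched roughly as follows. This is a toy model, not a patch for init(8) or the kernel: retry_teardown, try_umount and try_swapoff are hypothetical names, and the two hooks only simulate a swap file living on a filesystem that cannot be unmounted while the swap is active. On a real system they might wrap an unforced umount and swapoff.

```sh
#!/bin/sh
# Toy model: the filesystem can only be unmounted once the swap file on it
# has been turned off. Both hooks are simulated stand-ins.
swap_active=1
try_umount()  { [ "$swap_active" -eq 0 ]; }
try_swapoff() { swap_active=0; return 0; }

# Retry unforced unmount and swapoff until both succeed in the same pass,
# or until max_passes is reached.
retry_teardown() {
    max_passes=$1
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        pass=$((pass + 1))
        fs_ok=0
        swap_ok=0
        try_umount && fs_ok=1
        try_swapoff && swap_ok=1
        if [ "$fs_ok" -eq 1 ] && [ "$swap_ok" -eq 1 ]; then
            echo "clean after $pass pass(es)"
            return 0
        fi
    done
    echo "gave up after $pass pass(es)"
    return 1
}

retry_teardown 3
# prints "clean after 2 pass(es)"
```

In the simulated run the first pass fails to unmount but turns the swap off, and the second pass then unmounts without forcing, which is the effect described above.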

In case of swap on fuse or the like, it is necessary to turn off the swap before stopping the fuse daemon. However, it is best to kill as many processes as possible before turning off swap to avoid paging in useless things and to avoid high memory pressure.
Comment 2 Wolfram Schneider freebsd_committer freebsd_triage 2023-02-06 16:15:21 UTC
I haven't had the issue for a long time, closing.
Comment 3 Mikhail T. 2023-02-06 18:48:53 UTC
A bumbling novice's question here: why bother with swapoff -- during a shutdown -- at all?

On that note, unmounting filesystems could be replaced by remounting them read-only as well, but that may be complicated...
Comment 4 Volodymyr Kostyrko 2023-02-08 07:43:02 UTC
Well, some devices might not be local, and that means stopping some daemons will make us lose those devices and part of the active memory. Imagine you are running a diskless host that has swap somewhere on rSCSI or gated.

Next there's a phase when the kernel stops all disk activity, meaning that nothing can be read back from swap at this time. Therefore all remaining activities will only be successful if you don't hit an evicted page. Well, this is quite unlikely, but if you have a notebook or are running on a UPS you definitely don't want an extra chance for the system to crash, wait for a timeout, write a core, reset, reboot, and be ready to power off again. Always better to be safe than sorry.