206048 – 11.0-CURRENT -r293227 (and others) arm (rpi2/BeagleBone Black) amd64 etc: swapfile usage hangs; swap partition works


Status:	New

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	11.0-BETA2
Hardware:	Any Any

Importance:	--- Affects Some People
Assignee:	freebsd-arm (Nobody)

URL:
Keywords:

Depends on:	224479
Blocks:

Reported:	2016-01-08 21:43 UTC by Mark Millard
Modified:	2023-02-07 04:14 UTC
CC List:	4 users

See Also:


Description Mark Millard 2016-01-08 21:43:27 UTC

Description for my rpi2-based context: Using a swap partition on the sdcard or on an external SSD both work fine (recently discovered). But using a fast external SSD for the root file system with a /swapfile0 would eventually hang when normal root-file-system IO was mixed with /swapfile0 IO, such as when using gcc49 (from pkg install) to build something like devel/gcc5 (ports). This has been going on since I started with arm, but only recently did I manage to get any evidence about the hangs and find a configuration that avoids them. I've also had hangs during svnlite status, svnlite diff, and other operations that do lots of file activity.

Evidence from a rpi2 context taken from various hangs while using /swapfile0 as the swap space:

Note: a process such as top or gstat running over the serial console connection continues to update its display so long as it is not disturbed. I was also able to get into ddb. These gave me a limited window into the following evidence about the hangs when /swapfile0 was the swap space.

An example top display after the hang showed:

Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free
Swap: 2048M Total 29M Used 2019M Free 1% in use

The unusual STATEs for processes seemed to be (for the specific hang):

STATE   COMMANDs
pfault  [ld] [ld] /usr/sbin/syslogd
vmwait  [ld] [md0] [kernel]
wswbuf  [pagedaemon]

Those same 3 states seem to always be involved. Some of the processes vary from one hang to the next: the prior hang had build/genautoma , /usr/sbin/moused , and /usr/sbin/ntpd instead of 3 [ld]'s. /usr/sbin/syslogd sometimes looked normal for its state.

[md0], [kernel], and [pagedaemon] and their states did not vary from one hang to the next.

After a hang and waiting an hour+: "gstat -cod" showed exactly one non-zero figure in its grid: L(q) for "md0" showed a 4.

After a hang ddb's "ps" showed (my presentation order and formatting):

[pagedaemon] had wmesg wswbuf0 and state D
[swapper]    had wmesg vmwait  and state D
[md0]        had wmesg vmwait  and state DL

[usb]'s threads:
  [usb0]     had wmesg -       and state D (all 5 such lines)
  [smsc0]    had wmesg -       and state D

"show pageq" listed:

pq_free 2 pq_cache 0
dom 0 page_cnt 234761 free 2 pq_act 164873 pq_inact 18563 pass 2

"show freepages" listed only one non-zero "NUMBER POOL 0":

ORDER (SIZE) NUMBER
             POOL 0
01 (000008k) 000001
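
The free-memory figures quoted above can be cross-checked against each other. The sketch below assumes a 4 KiB page size, which is inferred from order 01 being listed as 8k and is not stated in the report; the variable names are illustrative only. The top display and the ddb listings may come from different hangs, so this only shows that the samples are consistent, not that they are the same snapshot.

```python
# Cross-check of the free-memory figures reported during the hangs.
# Assumption: 4 KiB pages (order 01 in "show freepages" is 8k, so order 0 is 4k).
PAGE_KIB = 4

pageq_free_pages = 2   # "free 2" from ddb "show pageq"
freepages_order = 1    # the only non-zero row in "show freepages"
freepages_blocks = 1   # NUMBER column for that row

# Free memory according to "show pageq".
pageq_free_kib = pageq_free_pages * PAGE_KIB
# Free memory according to "show freepages": blocks * (page size * 2**order).
freepages_kib = freepages_blocks * (PAGE_KIB << freepages_order)

print(pageq_free_kib, freepages_kib)  # prints "8 8", matching top's "8k free"
```

In other words, the system was down to a single free 8 KiB block in each of these samples.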

(The gstat output and the last two ddb listings come from Ian Lepore's suggestions for what to look at, though he was not sure either. Hans Petter Selasky had asked about usb process/thread related information.)



Non-RPI2 notes/summaries from others:

Ian Lepore reported similar symptoms, but I'm unsure of the swapfile vs. swap partition status for his context(s). (He was attributing problems to slow IO devices when he wrote of having similar symptoms.) I expect that Ian has arm systems other than rpi2 and BeagleBone Black involved, but I do not know the details beyond his mention of arm64 examples.

Paul Mather reported that for BeagleBone Black ( https://lists.freebsd.org/pipermail/freebsd-arm/2016-January/013015.html ):

This meshes with recent experience I have been having with CURRENT on a BeagleBone Black.  I use a swapfile on the SD card (which also hosts all the OS file systems) and the system would regularly lock up, with GEOM complaining on the console about I/O errors to the mmcsd0 device.

In the past, too, I have experienced panics (on all my arm systems) when the system attempts to page in from swap.

For now, on the BeagleBone Black, I have attached an external USB hard drive and am using a swap partition on there to see if it "solves" the swap problem.  The good news is that it hasn't locked up so far.  (Usually, the nightly periodic jobs are enough to provoke the problem.)

Comment 1 Tom Vijlbrief 2016-01-17 08:50:01 UTC

Looks similar to:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194598

on amd64

Comment 2 Mark Millard 2016-01-19 10:53:56 UTC

I've never had a problem on powerpc64 or powerpc (all PowerMacs). But I've realized that I'd always done something distinct for them compared to where I've seen the hang-up problems:

A) create a partition for the swap file's file system

B) create the ufs file system in that partition, with trim enabled (SATA SSD context).

C) only explicitly add one file to that file system: the swap file --no others.

Thus there likely was never a case of I/O from both the containing file system and the swap file activity mixed together in that file system: The 2 types of I/O activity would be from distinct file systems (and distinct partitions). Using a swap partition also keeps the activity separated.

Every example of a problem that I've had or heard of so far did have both swapfile activity and normal file activity mixed together on the same file system (and, so, the same partition too). That may be essential to having the problem.
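
One way to check whether a given setup mixes the two kinds of I/O is to test whether the swap file and a busy directory sit on the same file system. This is a minimal sketch, not something used in this report: the function name `same_filesystem` and the paths in the example are hypothetical, and it relies only on the device number that `os.stat` reports.

```python
import os

def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted file system."""
    # st_dev identifies the device (file system) that contains each path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

if __name__ == "__main__":
    # Hypothetical paths: substitute the swap file and a heavily used directory.
    swapfile, workdir = "/swapfile0", "/usr/obj"
    if os.path.exists(swapfile) and os.path.exists(workdir):
        print(swapfile, workdir, same_filesystem(swapfile, workdir))
```

A result of True would correspond to the mixed configuration described above; a dedicated partition and file system for the swap file, or a swap partition, would not.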

Comment 3 Tom Vijlbrief 2016-01-22 12:54:59 UTC

This is not related to ARM or USB; I can reproduce it in a VirtualBox amd64 client with the standard emulated hard disk.

Building a kernel where /usr/obj and the /swapfile are on the same filesystem is sufficient. It hangs when linking the kernel, when using 1 Gbyte of RAM and a 1 Gbyte /swapfile.

stress -d 2 -m 3 --vm-keep

will hang the system as well.

Interestingly:

https://github.com/freebsd/freebsd/commit/0f56e66e2f53df9e66c87c4c703a093c7926dc1c

describes this behaviour, but somehow that fix is insufficient...

Comment 4 Sevan Janiyan 2016-07-27 15:24:55 UTC

Also applicable to the first generation PI; I just stumbled across it on a 1st gen model B board running the stock 11-BETA2 image.

Comment 5 Sevan Janiyan 2016-07-27 15:30:24 UTC

In my case, to reproduce the behaviour: boot the PI, add a swap file as per the handbook instructions[1], then run cat /somefile > /someotherfile

In this case /somefile is a 4GB file containing garbage from /dev/urandom


[1] https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/adding-swap-space.html
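
The file-creation and copy steps of this reproduction can also be scripted. The sketch below is an assumption-laden equivalent of filling a file from /dev/urandom and then running cat: the helper names `make_random_file` and `copy_file` are hypothetical, and it does not set up the swap file, which still has to be done as per the handbook.

```python
import os
import shutil

CHUNK = 1024 * 1024  # write and copy in 1 MiB pieces

def make_random_file(path, size_bytes):
    """Fill path with size_bytes of random data, like reading /dev/urandom."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))
            remaining -= n

def copy_file(src, dst):
    """Stream src into dst, the equivalent of `cat src > dst`."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, CHUNK)
```

Calling make_random_file("/somefile", 4 * 1024**3) followed by copy_file("/somefile", "/someotherfile") would mirror the steps above; on an affected system that is expected to provoke the hang, so it should only be run where a hang is acceptable.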

Comment 6 Mark Millard 2016-07-27 19:19:48 UTC

I'm changing Version to 11.0-BETA2 because comments 4 and 5 report seeing the problem in that context.

Comment 7 Mark Millard 2017-02-14 03:44:36 UTC

On 2017-Feb-13, at 7:20 PM, Konstantin Belousov <kostikbel at gmail.com> wrote
on the freebsd-arm list:

. . .

swapfile write requires the write request to come through the filesystem
write path, which might require the filesystem to allocate more memory
and read some data. E.g. it is known that any ZFS write request
allocates memory, and that write request on large UFS file might require
allocating and reading an indirect block buffer to find the block number
of the written block, if the indirect block was not yet read.

As result, swapfile swapping is more prone to the trivial and unavoidable
deadlocks where the pagedaemon thread, which produces free memory, needs
more free memory to make a progress.  Swap write on the raw partition over
simple partitioning scheme directly over HBA are usually safe, while e.g.
zfs over geli over umass is the worst construction.
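
The circular dependency described above can be illustrated with a toy model. This is not FreeBSD code and does not reflect how the VM system is implemented: the function `pageout`, its parameters, and the numbers are all hypothetical, chosen only to show why a write path that must allocate memory can stall a thread whose job is to free memory.

```python
def pageout(free_pages, dirty_pages, pages_needed_per_write):
    """Toy pagedaemon loop; returns (pages_written, deadlocked).

    pages_needed_per_write is the memory the write path must allocate
    before one page can be written out: 0 models a raw swap partition,
    a positive value models a swapfile going through the filesystem.
    """
    written = 0
    while dirty_pages > 0:
        if free_pages < pages_needed_per_write:
            # Freeing memory needs memory that is not available: no progress.
            return written, True
        free_pages -= pages_needed_per_write  # e.g. an indirect block buffer
        dirty_pages -= 1
        written += 1
        # The written page becomes free and the temporary buffer is released.
        free_pages += 1 + pages_needed_per_write
    return written, False

# Starting from 2 free pages, as in the "show pageq" output earlier.
print(pageout(free_pages=2, dirty_pages=100, pages_needed_per_write=0))  # (100, False)
print(pageout(free_pages=2, dirty_pages=100, pages_needed_per_write=4))  # (0, True)
```

In the second call the loop cannot write even one page, because the write itself needs more memory than is free, which is the shape of the hang reported here.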



===

Given what Konstantin wrote this report may be an
example of "the problem exists but there is no plan
to fix nor much probability of a future plan to fix".

(There seems to be no way to mark a bugzilla report
for such a status.)

Implication: avoid file based swapfiles.

It is too bad that crochet sets up a file based swap
context.

Comment 8 Mark Millard 2017-08-09 09:09:55 UTC

Konstantin Belousov wrote on the freebsd-hackers
list in response to another report:

See
https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html