Bug 238960 - panic in vm_pageout_collect_batch() when QUEUE_MACRO_DEBUG_TRASH is enabled
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 12.0-RELEASE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Mark Johnston
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2019-07-03 15:38 UTC by dgmorris@earthlink.net
Modified: 2019-07-07 17:45 UTC

Description dgmorris@earthlink.net 2019-07-03 15:38:08 UTC
Found in an environment where QUEUE_MACRO_DEBUG_TRASH is enabled by default, once the system is put under light memory pressure:

#14 0xffffffff81091654 in trap (frame=0xfffffe0031979600)
    at /usr/src/sys/amd64/amd64/trap.c:443
#15 <signal handler called>
#16 vm_pageout_collect_batch (ss=<optimized out>, dequeue=<optimized out>)
    at /usr/src/sys/vm/vm_pageout.c:283
#17 vm_pageout_next (ss=<optimized out>, dequeue=<optimized out>)
    at /usr/src/sys/vm/vm_pageout.c:315
#18 vm_pageout_scan_inactive (shortage=<optimized out>, vmd=<optimized out>, 
    addl_shortage=<optimized out>) at /usr/src/sys/vm/vm_pageout.c:1397
#19 vm_pageout_worker (arg=<optimized out>)
    at /usr/src/sys/vm/vm_pageout.c:1940
#20 0xffffffff80f10e86 in vm_pageout () at /usr/src/sys/vm/vm_pageout.c:2091

(kgdb) f 16
#16 vm_pageout_collect_batch (ss=<optimized out>, dequeue=<optimized out>)
    at /usr/src/sys/vm/vm_pageout.c:283
283         if ((m->flags & PG_MARKER) == 0) {

(kgdb) l
278 
279     vm_pagequeue_lock(pq);
280     for (m = TAILQ_NEXT(marker, plinks.q); m != NULL &&
281         ss->scanned < ss->maxscan && ss->bq.bq_cnt < VM_BATCHQUEUE_SIZE;
282         m = TAILQ_NEXT(m, plinks.q), ss->scanned++) {
283         if ((m->flags & PG_MARKER) == 0) {
284             KASSERT((m->aflags & PGA_ENQUEUED) != 0,
285                 ("page %p not enqueued", m));
286             KASSERT((m->flags & PG_FICTITIOUS) == 0,
287                 ("Fictitious page %p cannot be in page queue", m));

(kgdb) p m
$1 = (vm_page_t) 0xffffffffffffffff

The root cause is the dequeue logic combined with the iteration step of the for loop:

		(void)vm_batchqueue_insert(&ss->bq, m);
		if (dequeue) {
			TAILQ_REMOVE(&pq->pq_pl, m, plinks.q);
			vm_page_aflag_clear(m, PGA_ENQUEUED);
		}

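Once m is removed from the pagequeue TAILQ it no longer has a valid TAILQ_NEXT, and the DEBUG mode exposes this: the removed element's link is set to (-1), so the loop's TAILQ_NEXT(m, plinks.q) yields 0xffffffffffffffff, and dereferencing that value at line 283 results in the panic shown.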

One obvious fix would be to cache TAILQ_NEXT() of m before the dequeue and assign it to m after the dequeue [for the non-dequeue case this means moving the advance of m out of the for statement and into the loop body]. This approach addresses the problem and removes the panic, but others may have prettier/nicer methods.
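
The approach described above can be sketched in userland C. This is only an illustration of the "read the successor before removing" pattern, not the change that was committed: struct page, collect_batch(), demo_remaining() and BATCH_MAX are hypothetical stand-ins for vm_page, vm_pageout_collect_batch() and VM_BATCHQUEUE_SIZE, and locking, markers and the scan limit are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for vm_page; only the queue linkage matters here. */
struct page {
	int id;
	int enqueued;			/* stand-in for PGA_ENQUEUED */
	TAILQ_ENTRY(page) q;
};
TAILQ_HEAD(pagelist, page);

#define	BATCH_MAX	4		/* stand-in for VM_BATCHQUEUE_SIZE */

/*
 * Collect up to BATCH_MAX pages starting at 'start', optionally removing
 * each one from the queue.  Returns the number of pages collected.
 */
int
collect_batch(struct pagelist *pl, struct page *start, int dequeue,
    struct page **batch)
{
	struct page *m, *next;
	int n = 0;

	for (m = start; m != NULL && n < BATCH_MAX; m = next) {
		/* Read the successor while m is still linked. */
		next = TAILQ_NEXT(m, q);
		batch[n++] = m;
		if (dequeue) {
			/* After this, m's own link fields must not be read. */
			TAILQ_REMOVE(pl, m, q);
			m->enqueued = 0;
		}
	}
	return (n);
}

/* Build a queue of npages pages, run one batch, count what is left. */
int
demo_remaining(int npages, int dequeue)
{
	static struct page pages[16];
	struct pagelist pl;
	struct page *batch[BATCH_MAX], *m;
	int i, left = 0;

	assert(npages >= 0 && npages <= 16);
	TAILQ_INIT(&pl);
	for (i = 0; i < npages; i++) {
		pages[i].id = i;
		pages[i].enqueued = 1;
		TAILQ_INSERT_TAIL(&pl, &pages[i], q);
	}
	(void)collect_batch(&pl, TAILQ_FIRST(&pl), dequeue, batch);
	TAILQ_FOREACH(m, &pl, q)
		left++;
	return (left);
}
```

Because the successor is saved before TAILQ_REMOVE runs, the loop never reads the link fields of a removed element, so it behaves the same whether or not those fields are trashed on removal.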
Comment 1 Mark Johnston freebsd_committer 2019-07-07 17:45:38 UTC
Fixed in r349671 and MFCed to stable/12 in r349812.  Thanks for the report.