Bug 206328 - Crash on shutdown with swap on NFS file (with patch)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: CURRENT
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs mailing list
URL:
Keywords: patch
Depends on:
Blocks:
 
Reported: 2016-01-17 08:44 UTC by Tom Vijlbrief
Modified: 2016-03-11 06:10 UTC

See Also:


Attachments
Proposed patch (634 bytes, patch)
2016-01-18 17:39 UTC, Tom Vijlbrief

Description Tom Vijlbrief 2016-01-17 08:44:08 UTC
When swap is backed by a file on NFS and the swap space has been filled before shutdown, e.g. by running

stress -m 4 --vm-keep

a panic occurs during "shutdown -r", which prevents the reboot:


Jan 17 09:26:12 rpibsd syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...4 4 4 1 1 1 0 0 0 done
All buffers synced.
No strategy for buffer at 0xc1a47b30
vnode
0xc2a60240: tag none, type VBAD
    usecount 1, writecount 0, refcount 448277 mountedhere 0
    flags (VI_DOOMED)
    lock type nfs: UNLOCKED
swap_pager: I/O error - pagein failed; blkno 23904,size 4096, error 45
panic: swap_pager_force_pagein: read from swap failed
KDB: enter: panic
[ thread pid 1 tid 100001 ]
Stopped at      $d.13:  ldrb    r15, [r15, r15, ror r15]!
db> 

==================================
The example is from a Raspberry Pi because I have a serial console attached to it, but the
crash also happens on a 64-bit Intel VirtualBox guest, so it is easy to reproduce.

/etc/fstab:

swan.v7f.eu:/export/all/bsd /media/swan nfs rw,bg,noauto 0 0
/media/swan/swap none          swap    sw              0 0

A "swapoff -a" before shutdown prevents the problem
Comment 1 Tom Vijlbrief 2016-01-17 10:29:06 UTC
Some additional info: the amd64 shutdown output shows:

swapoff: /media/swan/swap: cannot allocate memory
Comment 2 Tom Vijlbrief 2016-01-17 14:11:42 UTC
It is not clear to me why swapoff_all() is called at the end of bufshutdown();
probably as a sanity check.

I created two potential quick fixes which work for me.
One is to not fire the specific KASSERT when rebooting:

*** sys/kern/vfs_bio.c.orig     2016-01-17 14:10:04.000000000 +0100
--- sys/kern/vfs_bio.c  2015-12-22 06:54:16.000000000 +0100
***************
*** 4542,4548 ****
        KASSERT(vp->v_type != VCHR && vp->v_type != VBLK,
            ("Wrong vnode in bufstrategy(bp=%p, vp=%p)", bp, vp));
        i = VOP_STRATEGY(vp, bp);
!       KASSERT(i == 0 && !rebooting, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
  }
  
  void
--- 4542,4548 ----
        KASSERT(vp->v_type != VCHR && vp->v_type != VBLK,
            ("Wrong vnode in bufstrategy(bp=%p, vp=%p)", bp, vp));
        i = VOP_STRATEGY(vp, bp);
!       KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
  }
  
  void
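
A note on the assertion condition in this first quick fix: KASSERT panics when its expression evaluates to false (in kernels built with INVARIANTS). Read that way, `i == 0 && !rebooting` is false whenever `rebooting` is set, so it would fire on every call during reboot rather than stay quiet. If the intent is to tolerate a failed VOP_STRATEGY only while rebooting, the condition would presumably be `i == 0 || rebooting`. The sketch below is a minimal userland model of that difference; `MOCK_KASSERT` and the two helper functions are hypothetical stand-ins, not kernel code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel macro: the real KASSERT panics
 * when its expression is false; this one only records that it fired.
 */
static int assert_fired;
#define MOCK_KASSERT(exp) do { if (!(exp)) assert_fired = 1; } while (0)

/* Returns 1 if the "&& !rebooting" form of the assertion would fire. */
int fires_with_and(int i, bool rebooting)
{
	assert_fired = 0;
	MOCK_KASSERT(i == 0 && !rebooting);
	return (assert_fired);
}

/* Returns 1 if the "|| rebooting" form of the assertion would fire. */
int fires_with_or(int i, bool rebooting)
{
	assert_fired = 0;
	MOCK_KASSERT(i == 0 || rebooting);
	return (assert_fired);
}
```

With `i` set to 45 (the error seen in the log above) and `rebooting` true, `fires_with_and()` reports a firing assertion while `fires_with_or()` does not; with `rebooting` false, both forms still catch the failure.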


-----8<----------------------
 
Alternatively, don't swapoff file-backed swap (anything that is not a disk device) in swapoff_all(), which is only called at the end of a reboot.


*** sys/vm/swap_pager.c.orig    2016-01-17 14:24:40.000000000 +0100
--- sys/vm/swap_pager.c 2016-01-17 14:59:03.000000000 +0100
***************
*** 2284,2294 ****
        mtx_lock(&sw_dev_mtx);
        TAILQ_FOREACH_SAFE(sp, &swtailq, sw_list, spt) {
                mtx_unlock(&sw_dev_mtx);
!               if (vn_isdisk(sp->sw_vp, NULL))
                        devname = devtoname(sp->sw_vp->v_rdev);
!               else
                        devname = "[file]";
!               error = swapoff_one(sp, thread0.td_ucred);
                if (error != 0) {
                        printf("Cannot remove swap device %s (error=%d), "
                            "skipping.\n", devname, error);
--- 2284,2298 ----
        mtx_lock(&sw_dev_mtx);
        TAILQ_FOREACH_SAFE(sp, &swtailq, sw_list, spt) {
                mtx_unlock(&sw_dev_mtx);
!               if (vn_isdisk(sp->sw_vp, NULL)) {
                        devname = devtoname(sp->sw_vp->v_rdev);
!                       error = swapoff_one(sp, thread0.td_ucred);
!               } else {
                        devname = "[file]";
!                       error = 0;
!                       printf("Skip swapoff for (NFS) swap file.\n");
!               }
! 
                if (error != 0) {
                        printf("Cannot remove swap device %s (error=%d), "
                            "skipping.\n", devname, error);
Comment 3 Tom Vijlbrief 2016-01-18 09:48:10 UTC
bufshutdown() ends with:


		if (panicstr == NULL) 
			vfs_unmountall(); 
	} 
	swapoff_all(); 

Why not just reverse the order?

That is, call swapoff_all() before vfs_unmountall(), so that the NFS filesystem is still available.

Perhaps even at the start of bufshutdown().

I am still wondering what the reason is for calling swapoff_all() just before the actual reboot.
Comment 4 Tom Vijlbrief 2016-01-18 17:39:47 UTC
Created attachment 165768
Proposed patch

Do swapoff_all() BEFORE unmounting the filesystems, so that an NFS swap file is not orphaned.

This prevents a panic on reboot/halt, or when shutdown cannot do a successful swapoff due to insufficient working memory.
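
The ordering argument can be illustrated with a small userland model. This is only a sketch under stated assumptions: `mock_swapoff_all()`, `mock_vfs_unmountall()`, `simulate_shutdown()` and the two flags are hypothetical stand-ins for the kernel routines and state discussed above. It is not the attached patch and does not model the out-of-memory case.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real FreeBSD structures. */
static bool nfs_mounted;	/* filesystem holding the swap file is mounted */
static bool swap_active;	/* swap file still holds paged-out data */

/* Paging data back in requires the backing file to be reachable. */
static int mock_swapoff_all(void)
{
	if (!swap_active)
		return (0);
	if (!nfs_mounted)
		return (-1);	/* models the failed pagein that ends in a panic */
	swap_active = false;
	return (0);
}

static void mock_vfs_unmountall(void)
{
	nfs_mounted = false;
}

/*
 * Run the tail of a simulated shutdown in either order.
 * Returns 0 if swap was released cleanly, non-zero otherwise.
 */
int simulate_shutdown(bool swapoff_first)
{
	int error;

	nfs_mounted = true;
	swap_active = true;
	if (swapoff_first) {
		error = mock_swapoff_all();
		mock_vfs_unmountall();
	} else {
		mock_vfs_unmountall();
		error = mock_swapoff_all();
	}
	return (error);
}
```

In this model `simulate_shutdown(false)`, the current order, fails because the backing file is gone by the time swap is released, while `simulate_shutdown(true)`, the proposed order, succeeds.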