Bug 134011 - [hang] swap_pager_getswapspace(4): failed
Summary: [hang] swap_pager_getswapspace(4): failed
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: Unspecified
Hardware: Any Any
: Normal Affects Only Me
Assignee: Kubilay Kocak
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-26 01:40 UTC by randy
Modified: 2021-12-05 23:24 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description randy 2009-04-26 01:40:01 UTC
midnight (gmt) maint runs on a number of 8-current systems.  has been going on for weeks.

swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(2): failed

many of these and then system either recovers or is dead to serial console.  command prompt remains on ssh shells, but type a command and the shell is locked.  system is pingable but non-responsive.

the below are from when running normally in the case where it has recovered.

work0.psg.com:/root# swapinfo
Device          1024-blocks     Used    Avail Capacity
/dev/mirror/bootb     8382708   307416  8075292     4%

work0.psg.com:/root# df -m
Filesystem                                   1M-blocks   Used  Avail Capacity  Mounted on
/dev/mirror/boota                                 7931    603   6693     8%    /
devfs                                                0      0      0   100%    /dev
procfs                                               0      0      0   100%    /proc
tank/data                                       663002      0 663002     0%    /data
tank/data/nfsen                                 828448 165446 663002    20%    /data/nfsen
tank/data/rpki                                  663179    177 663002     0%    /data/rpki
tank                                            663002      0 663002     0%    /tank
tank/usr                                        668072   5069 663002     1%    /usr
tank/usr/home                                   669874   6871 663002     1%    /usr/home
tank/usr/usr                                    668320   5317 663002     1%    /usr/usr
tank/var                                        663109    106 663002     0%    /var
tank/var/log                                    663105    103 663002     0%    /var/log
tank/var/spool                                  663171    168 663002     0%    /var/spool
tank/var/spool/var                              663002      0 663002     0%    /var/spool/var
tank/var/spool/var/spool                        663016     13 663002     0%    /var/spool/var/spool
/dev/md0                                           247      0    227     0%    /tmp
devfs                                                0      0      0   100%    /data/rpki/rcynic/dev
tank/usr@backup                                 668580   5578 663002     1%    /usr/.zfs/snapshot/backup
tank/usr/usr@2009-03-21-00-49-06                668341   5338 663002     1%    /usr/usr/.zfs/snapshot/2009-03-21-00-49-06
tank/var/spool/var/spool@2009-03-21-02-06-08    663016     13 663002     0%    /var/spool/var/spool/.zfs/snapshot/2009-03-21-02-06-08

work0.psg.com:/root# cat /etc/fstab
# Device        Mountpoint              FStype  Options         Dump    Pass#
#
/dev/mirror/boota       /                       ufs     rw              1       1
#
/dev/mirror/bootb       none                    swap    sw              0       0
#
/dev/acd0       /cdrom                  cd9660  ro,noauto       0       0
#
proc            /proc                   procfs  rw              0       0
#linprocfs      /compat/linux/proc      linprocfs rw            0       0
#
# end

work0.psg.com:/root# top -bores
last pid: 29419;  load averages:  1.03,  0.46,  0.17  up 6+06:12:06    12:46:21
147 processes: 3 running, 144 sleeping

Mem: 202M Active, 39M Inact, 1173M Wired, 2756K Cache, 39M Buf, 2540M Free
Swap: 8186M Total, 300M Used, 7886M Free, 3% Inuse


  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1034 ejabberd     2  44    0   124M 51244K ucond   1   7:52  0.00% beam
  803 unbound      1  44    0 36420K 24988K select  0   6:05  0.00% unbound
 1370 root         1  76    0   225M 22056K nanslp  1 135:43  0.00% perl5.8.9
 1229 mysql        9  44    0 66984K 21428K sigwai  1   4:05  0.00% mysqld
29159 www          1  47    0    98M 12824K lockf   0   0:01  0.00% httpd
28447 www          1  44    0 99328K 11968K lockf   0   0:01  0.00% httpd
29081 www          1  44    0 99328K 11836K lockf   0   0:01  0.00% httpd
29217 www          1  45    0 99328K 11600K lockf   1   0:00  1.27% httpd
27072 www          1  44    0 98304K 11064K lockf   0   0:01  0.00% httpd
29018 www          1  44    0 98304K 10912K kqread  1   0:00  0.00% httpd
29141 www          1  44    0 98304K 10840K lockf   0   0:00  0.00% httpd
37090 root        19  44    0   170M  8492K select  0  11:33  0.00% asterisk
28098 root         1  48    0 10528K  6592K select  0   0:17  7.08% cvsup
29222 www          1  44    0 96256K  5700K lockf   0   0:00  0.00% httpd
29306 www          1  44    0 96256K  5624K lockf   0   0:00  0.00% httpd
29337 www          1  44    0 96256K  4536K lockf   0   0:00  0.00% httpd
28072 root         1  44    0 17764K  3316K select  1   0:00  0.00% sshd
 1302 root         1  44    0 96256K  2616K select  0   0:15  0.00% httpd


work0.psg.com:/root# vmstat -h 
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad4 ad6   in   sy   cs us sy id
 0 0 10   4503M  2628M   219   0   0   2   470 423   0   0  205 2341 4388  2  1 97

How-To-Repeat: wait for midnight

work0.psg.com:/usr/src# cat /etc/daily.local
#!/bin/sh
#

#
# reset ipfw filter counts
#
ipfw -f resetlog
ipfw -f zero

#
# mail and ntp
#
( /usr/bin/ntpq -c peers ; echo ; echo ; \
  echo irrd mirror requestors ; \
      /usr/bin/gunzip -c /var/log/irrd/irrd.log.0.gz \
      | for i in `grep 'rror request' | awk '{print $10}' | sort | uniq`; do \
          host $i | awk '{print "   " $5}'; done \
      ; echo ; echo ; \
  /usr/local/sbin/eximstats /var/spool/exim/log/main) \
  | Mail -s "`hostname` ntp/mail log report" postmaster
/usr/local/sbin/exicyclog
#/usr/bin/gzip -9 main.01

#
# system log
#
/usr/bin/gunzip -c /var/log/messages.0.gz | \
    /usr/bin/egrep -iv '(last message repeated|logfile turned over|PAM: authentication error|reverse map|sshd.*(Did not receive identification string|Disabling protocol|does not map back|(ftp|uucp) not allowed|Invalid user|Failed password|accepted|connection closed|received disconnect|SSH: Server)|sshguard)' | \
  /usr/bin/Mail -s "`hostname` System Log" root

/usr/bin/gunzip -c /var/log/messages.0.gz | \
  /usr/bin/egrep 'sshd.*((ftp|uucp) not allowed|Invalid user|Failed password)|sshguard' | \
  /usr/bin/Mail -s "`hostname` attack Log" root

#
Comment 1 randy 2009-04-27 09:31:41 UTC
bug is not restricted to amd64.  here it is on an i386 system

rip1.psg.com:/root# uname -a
FreeBSD rip1.psg.com 8.0-CURRENT FreeBSD 8.0-CURRENT #9: Sun Apr  5 09:32:56 GMT 2009     root@rip1.psg.com:/usr/obj/usr/src/sys/RIP1  i386

Apr 25 00:07:46 rip1 kernel: swap_pager_getswapspace(16): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(12): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(6): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(16): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(12): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(9): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(5): failed
Apr 25 00:07:47 rip1 kernel: swap_pager_getswapspace(16): failed
Apr 25 00:07:47 rip1 kernel: pid 40337 (exim_tidydb), uid 0, was killed: out of swap space
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2009-05-18 03:04:18 UTC
Responsible Changed
From-To: freebsd-amd64->freebsd-bugs

Not specific to amd64.
Comment 3 randy 2009-05-29 01:37:19 UTC
this time it actually said something interesting on console!

for some values of 'interesting' :)

randy

---

swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapsp
a
lcatale t(ra1p 612): : pafgae ifaullte wdhi
 e sin wkearnepl m_odpe
acpugied =r 1_; agpiec itd s= w01a
afasulpt vairtcuael (ad3dre)ss  :=  0x0f
odfalulet dc
  e     s       =w saupeprv_isopr awgrieter _dagtae, ptagse nowta ppressepnat
0inest(ru3ct)i:on  pfoinateir   l= e0dx2
 :0sxfwfaffpff_ffp80a47gc25e6
rst_acgk eptoinsterw     a   p   s =p 0ax2c8:0exf(3ff)ff:f80 7f9fd1a680i
poieadme 
   ntserw         a   p  _ =p 0xag28e:0rxf_fffgff8e07t9fsd16we0
 0xdep ssegpmenta               c= beas(e 30x0), :lim itf 0xaffifflf,e tydpe
   1bs
w                       =a DpPL _0, ppraesg 1,e lorng_ 1g, deef3t2 s0,w garapn s1
ledaocceseso(r 3ef)la:gs        =f ianteirrulpt eendab
   , rseswumae, pIOP_L p= a0
gcuerrren_t pgroecests  s       = w789a (psysslpogad)
ctreap( n9umb)er:               =  f12
epainilc:e pdag
  fsauwlta
pcp_uipd =a 1g
eUptime: 9h50m49s
Physical memory: 4083 MB
Dumping 1958 MB:

<end. required power cycle>
Comment 4 randy 2009-05-31 02:48:05 UTC
a bit better in last night's syslog, possibly during backup

May 30 00:40:14 work0 kernel: lock order reversal:
May 30 00:40:14 work0 kernel: 1st 0xffffff0057d019d0 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:423
May 30 00:40:14 work0 kernel: 2nd 0xffffff8052c01aa0 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2556
May 30 00:40:14 work0 kernel: 3rd 0xffffff0004b8d098 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:544
May 30 00:40:16 work0 kernel: lock order reversal:
May 30 00:40:16 work0 kernel: 1st 0xffffff8052c01aa0 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2556
May 30 00:40:16 work0 kernel: 2nd 0xffffff00d35c7d30 snaplk (snaplk) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:793
May 30 00:50:14 work0 kernel: lock order reversal:
May 30 00:50:14 work0 kernel: 1st 0xffffff00d35c7d30 snaplk (snaplk) @ /usr/src/sys/kern/vfs_vnops.c:297
May 30 00:50:14 work0 kernel: 2nd 0xffffff0057d019d
May 30 00:50:14 work0 kernel: 0 ufs (ufs) @ /u
May 30 00:50:14 work0 kernel: s
May 30 00:50:15 work0 kernel: r/src/sys/
May 30 00:50:15 work0 kernel: ufs/ffs/
May 30 00:50:15 work0 kernel: ffs_snap
May 30 00:50:15 work0 kernel: shot.c:
May 30 00:50:15 work0 kernel: 1587
May 30 01:45:21 work0 kernel: 
May 30 01:45:21 work0 kernel: 
May 30 01:45:21 work0 kernel: Fatal trap 12: page fault while in kernel mode
May 30 01:45:21 work0 kernel: cpuid = 0; apic id = 00
May 30 01:45:21 work0 kernel: fault virtual address	= 0x0
May 30 01:45:21 work0 kernel: fault code		= supervisor write data, page not present
May 30 01:45:21 work0 kernel: instruction pointer	= 0x20:0xffffffff8047c256
May 30 01:45:21 work0 kernel: sta
May 30 01:45:21 work0 kernel: c
May 30 01:45:21 work0 kernel: k pointer	        = 0x28:0xffffff807a057680
May 30 01:45:21 work0 kernel: frame pointer	        = 0x28:0xffffff807a0576e0
May 30 01:45:21 work0 kernel: code segment		= base rx0, limit 0xfffff, type 0x1b
May 30 01:45:21 work0 kernel: = DPL 0, pres 1, long 1, def32 0, gran 
May 30 01:45:21 work0 kernel: 1
May 30 01:45:21 work0 kernel: processor eflags	= interrup
May 30 01:45:21 work0 kernel: t enabled, resume, 
May 30 01:45:21 work0 kernel: IOPL = 0
May 30 01:45:21 work0 kernel: current process
May 30 01:45:21 work0 kernel: = 9181 (nfcapd)
<required power cycle>
May 30 02:10:04 work0 syslogd: kernel boot file is /boot/kernel/kernel
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:59:50 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 6 Graham Perrin freebsd_committer freebsd_triage 2021-11-28 12:28:33 UTC
randy, is this still an issue?
Comment 7 randy 2021-11-28 17:47:31 UTC
not that i am seeing
Comment 8 Kubilay Kocak freebsd_committer freebsd_triage 2021-12-05 23:24:39 UTC
^Triage: Close based on comment 7