Created attachment 163453 [details]
procstat -kk -a output

FreeBSD 10.2-RELEASE-p5 r289218

Deadlock is on an SSD-only mirrored ZFS pool:

        home                 ONLINE       0     0     0
          mirror-0           ONLINE       0     0     0
            gpt/mirror0-a    ONLINE       0     0     0
            gpt/mirror0-b    ONLINE       0     0     0
          mirror-1           ONLINE       0     0     0
            gpt/mirror1-a    ONLINE       0     0     0
            gpt/mirror1-b    ONLINE       0     0     0
          mirror-2           ONLINE       0     0     0
            gpt/mirror2-a    ONLINE       0     0     0
            gpt/mirror2-b    ONLINE       0     0     0
          mirror-3           ONLINE       0     0     0
            gpt/mirror3-a    ONLINE       0     0     0
            gpt/mirror3-b    ONLINE       0     0     0

The deadlocked process is a chksetuid run from periodic. The affected directory is identified. Here is a procstat -kk output:

 1816 102485 find - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_freebsd_open+0xf5 VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x33e kern_openat+0x26f amd64_syscall+0x33a Xfast_syscall+0xfb

Trying to ls the affected directory ends up in the same:

71376 102400 ls - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_freebsd_open+0xf5 VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x33e kern_openat+0x26f amd64_syscall+0x33a Xfast_syscall+0xfb

However, I'm able to list directory entries via zdb(8). The processes are not killable. The machine is running and non-productive, so I'm able to give any debugging information, just give me some pointers.

Attached is the output from procstat -a -kk.
I'm actually having the exact same problem. It started when we upgraded to 10.2 (we were running 10.1 before and it never happened there), it happens about once a month, and after a reboot it's good for about another month. It seems that it's the periodic find that always gets stuck, and after that a lot of other I/O processes get stuck.

Here's the periodic find:

23293 100188 find - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 ufs_lookup_ino+0xa0 VOP_CACHEDLOOKUP_APV+0xa1 vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0xa1 lookup+0x5a1 namei+0x4d4 kern_statat_vnhook+0xae sys_fstatat+0x2c amd64_syscall+0x357 Xfast_syscall+0xfb

And here's an rsync call that's stuck:

83577 100644 rsync - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 ufs_open+0x6d VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x33e kern_openat+0x26f amd64_syscall+0x357 Xfast_syscall+0xfb

One difference in setup is that I'm running an mfi RAID and not ZFS:

[root@backup-se ~]# mfiutil show volumes
mfi0 Volumes:
  Id     Size    Level   Stripe  State   Cache   Name
 mfid0 (   12T) RAID-5      64K OPTIMAL Disabled
 mfid1 (  931G) RAID-0      64K OPTIMAL Disabled

Filesystem      Size    Used   Avail Capacity  Mounted on
/dev/mfid1p2    898G    4.8G    821G     1%    /
/dev/mfid0p1     12T    8.9T    1.8T    83%    /backups
devfs           1.0K    1.0K      0B   100%    /dev
fdescfs         1.0K    1.0K      0B   100%    /dev/fd
(In reply to daniel from comment #1)
Recompile the kernel with INVARIANTS etc.; see the guide at
https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

When the situation occurs, execute 'kgdb <path>/kernel.debug /dev/mem'. After that, switch to the thread which hung, with the kgdb command 'thread <id>'. The backtrace 'bt' will show you the arguments; find the vnode address which caused the hang (most likely it is shown as vp), and do 'p *vp' and 'p *(vp->v_bufobj.bo_object)'.
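A minimal sketch of what that setup looks like in practice, assuming the debug options the linked handbook chapter recommends; the config name and paths below are illustrative defaults, not anything from this report:

    # appended to a copy of GENERIC, e.g. sys/amd64/conf/DEBUG
    options         INVARIANTS
    options         INVARIANT_SUPPORT
    options         WITNESS
    options         DEBUG_LOCKS
    options         DEBUG_VFS_LOCKS
    options         DIAGNOSTIC

    # after the hang, on the live machine (as root):
    kgdb /usr/obj/usr/src/sys/DEBUG/kernel.debug /dev/mem
    (kgdb) info threads          <- locate the thread sleeping on "vodead"
    (kgdb) thread <id>
    (kgdb) bt
    (kgdb) frame <n>             <- the frame showing vnode_create_vobject()
    (kgdb) p *vp
    (kgdb) p *(vp->v_bufobj.bo_object)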
Thanks for the pointers, I'll get back to you with the output as soon as it happens.
Okay! Got a proc stuck again. However, I'm having trouble understanding "find the vnode address which caused the hang"; how do I know what caused the hang?

Here's all the output I've gathered so far:

procstat -kk -a
=====================================
94614 100627 python2.7 - mi_switch+0x179 sleepq_switch+0x152 sleepq_wait+0x43 _sleep+0x366 vnode_create_vobject+0xe9 ufs_lookup_ino+0xa0 VOP_CACHEDLOOKUP_APV+0x10f vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0x10f lookup+0x5c2 namei+0x504 kern_statat_vnhook+0xae sys_lstat+0x30 amd64_syscall+0x278 Xfast_syscall+0xfb
=====================================

[root@hub-se /usr/obj/usr/src/sys/GENERIC]# kgdb kernel.debug /dev/mem
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Reading symbols from /boot/kernel/aio.ko.symbols...done.
Loaded symbols for /boot/kernel/aio.ko.symbols
Reading symbols from /boot/kernel/accf_data.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_data.ko.symbols
Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
Loaded symbols for /boot/kernel/accf_http.ko.symbols
Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
Loaded symbols for /boot/kernel/fdescfs.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
#0  sched_switch (td=0xffffffff81850720, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
1945            cpuid = PCPU_GET(cpuid);

(kgdb) info threads
  505 Thread 100627 (PID=94614: python2.7)  sched_switch (td=0xfffff80379b5e940, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945

(kgdb) thread 505
[Switching to thread 505 (Thread 100627)]#0  sched_switch (td=0xfffff80379b5e940, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
1945            cpuid = PCPU_GET(cpuid);

(kgdb) bt
#0  sched_switch (td=0xfffff80379b5e940, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
#1  0xffffffff8094e149 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:491
#2  0xffffffff8098bb02 in sleepq_switch (wchan=<value optimized out>, pri=<value optimized out>) at /usr/src/sys/kern/subr_sleepqueue.c:538
#3  0xffffffff8098b963 in sleepq_wait (wchan=0xfffff808cdd41e00, pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:617
#4  0xffffffff8094da56 in _sleep (ident=<value optimized out>, lock=<value optimized out>, priority=<value optimized out>, wmesg=<value optimized out>, sbt=<value optimized out>, pr=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:255
#5  0xffffffff80be8759 in vnode_create_vobject (vp=0xfffff8036ee569c0, isize=512, td=0xfffff80379b5e940) at /usr/src/sys/vm/vnode_pager.c:120
#6  0xffffffff80bb1010 in ufs_lookup_ino (vdp=0xfffff8036ee569c0, vpp=0xfffffe0a5bdfd9a8, cnp=0xfffffe0a5bdfd9d0, dd_ino=0x0) at /usr/src/sys/ufs/ufs/ufs_lookup.c:259
#7  0xffffffff80e791ef in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:197
#8  0xffffffff809dba26 in vfs_cache_lookup (ap=<value optimized out>) at vnode_if.h:80
#9  0xffffffff80e7902f in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:129
#10 0xffffffff809e40c2 in lookup (ndp=0xfffffe0a5bdfd948) at vnode_if.h:54
#11 0xffffffff809e3764 in namei (ndp=0xfffffe0a5bdfd948) at /usr/src/sys/kern/vfs_lookup.c:302
#12 0xffffffff809f911e in kern_statat_vnhook (td=0xfffff80379b5e940, flag=<value optimized out>, fd=-100, path=0x8067103d4 <Error reading address 0x8067103d4: Bad address>, pathseg=UIO_USERSPACE, sbp=0xfffffe0a5bdfda60, hook=0) at /usr/src/sys/kern/vfs_syscalls.c:2298
#13 0xffffffff809f92b0 in sys_lstat (td=0x0, uap=0xfffffe0a5bdfdb80) at /usr/src/sys/kern/vfs_syscalls.c:2278
#14 0xffffffff80d51b38 in amd64_syscall (td=0xfffff80379b5e940, traced=0) at subr_syscall.c:134
#15 0xffffffff80d359ab in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#16 0x000000080157f8aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

I'm letting the server stay stuck until I know how to get the info you need out of it :)
In the frame 5, the argument vp points to the vnode. Also do 'p *td' in that frame.
Sorry, but this is the first time I'm doing any kind of kernel debugging. So I'm supposed to run this after the backtrace?

(kgdb) p *0xfffff8036ee569c0
$1 = -2130911031

But then the second command fails:

(kgdb) p *(vp->v_bufobj.bo_object)
No symbol "vp" in current context.
(In reply to Daniel Ylitalo from comment #6)
frame 5
p *vp
p *td
etc.

You must use symbols to access data, otherwise gdb cannot deduce types.
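To illustrate the point about symbols (the address below is just the one from the earlier transcript): a bare address gives gdb no type information, so it dereferences it as an int and prints a meaningless number, while a cast or an in-scope symbol gives it the struct layout:

    (kgdb) p *0xfffff8036ee569c0                     <- treated as an int, result is useless
    (kgdb) p *(struct vnode *)0xfffff8036ee569c0     <- explicit cast works from any frame
    (kgdb) frame 5
    (kgdb) p *vp                                     <- in frame 5, vp is a typed argument
    (kgdb) p *td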
Created attachment 166244 [details] threads frame output
I've attached the output, let me know if you need something else or if I can reboot without the kernel debugging in place.
(In reply to Daniel Ylitalo from comment #9)
This is seemingly what I need to start with, but I probably will not be able to look closer for some time, most likely until tomorrow.
(In reply to Konstantin Belousov from comment #10)
Is there a procstat -kk output somewhere? I want to see the state of all threads when the problem occurs.
Unfortunately I rebooted the server already, but I still have debug compiled in so I'll get one for you the next time it happens.
Created attachment 166295 [details]
Debugging patch

I attached a debugging patch to try to learn something about the code which sets the OBJ_DEAD flag. In fact, I do not expect that this will catch the offender, but we will see.

Somebody who wants to help with the bug should apply the patch, reproduce the issue, and then provide the same information I already asked for; most important are the 'p *vp' and 'p *bo_object' printouts. If I see what I want, I will provide further instructions on how to proceed.
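For reporters applying the attachment, a rough sketch of the usual steps; the patch level and file name below are guesses, so check the paths in the diff itself:

    cd /usr/src
    patch -C < /path/to/debugging.patch     <- dry run first; add -p0 or -p1 to match the diff headers
    patch < /path/to/debugging.patch
    (then rebuild and install the debug kernel as in the handbook link from comment #2, and reboot)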
I've recompiled the kernel with the patch applied, I'll wait for it to happen again.
We've just ran into the same bug on one of our machines. # procstat -kk 52827 PID TID COMM TDNAME KSTACK 52827 100187 ruby21 - mi_switch+0xe1 thread_suspend_switch+0x170 thread_single+0x4e5 exit1+0xbe sigexit+0x925 postsig+0x286 ast+0x427 doreti_ast+0x1f 52827 100389 ruby21 - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_freebsd_open+0xf5 VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x33e kern_openat+0x26f amd64_syscall+0x357 Xfast_syscall+0xfb (kgdb) print *object $10 = { lock = { lock_object = { lo_name = 0xffffffff80ff8196 "vm object", lo_flags = 90374144, lo_data = 0, lo_witness = 0x0 }, rw_lock = 1 }, object_list = { tqe_next = 0xfffff80135aaa600, tqe_prev = 0xfffff80135aaa420 }, shadow_head = { lh_first = 0x0 }, shadow_list = { le_next = 0x0, le_prev = 0xfffff8019f85f030 }, memq = { tqh_first = 0x0, tqh_last = 0xfffff80135aaa548 }, rtree = { rt_root = 0, rt_flags = 0 '\0' }, size = 0, generation = 1, ref_count = 0, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', flags = 24584, pg_color = 1569, paging_in_progress = 0, resident_page_count = 0, backing_object = 0x0, backing_object_offset = 0, pager_object_list = { tqe_next = 0x0, tqe_prev = 0x0 }, rvq = { lh_first = 0x0 }, cache = { rt_root = 0, rt_flags = 0 '\0' }, handle = 0xfffff8010bc08760, un_pager = { vnp = { vnp_size = 0, writemappings = 0 }, devp = { devp_pglist = { tqh_first = 0x0, tqh_last = 0x0 }, ops = 0x0, dev = 0x0 }, sgp = { sgp_pglist = { tqh_first = 0x0, tqh_last = 0x0 } }, swp = { swp_tmpfs = 0x0, swp_bcount = 0 } }, cred = 0x0, charge = 0 } (kgdb) p *vp $11 = { v_tag = 0xffffffff81acefb6 "zfs", v_op = 0xffffffff81ae12f0, v_data = 0xfffff80031109450, v_mount = 0xfffff8000801a330, v_nmntvnodes = { tqe_next = 0xfffff801c912a1d8, tqe_prev = 0xfffff8022bfa0780 }, v_un = { vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0 }, v_hashlist = { le_next = 0x0, le_prev = 0x0 }, v_cache_src = { lh_first = 0x0 }, v_cache_dst = { tqh_first = 0xfffff801a260f4d0, tqh_last = 0xfffff801a260f4f0 }, v_cache_dd = 0x0, v_lock = { lock_object = { lo_name = 0xffffffff81acefb6 "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0 }, lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96 }, v_interlock = { lock_object = { lo_name = 0xffffffff80fd2f2d "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0 }, mtx_lock = 4 }, v_vnlock = 0xfffff8010bc087c8, v_actfreelist = { tqe_next = 0xfffff80079317000, tqe_prev = 0xfffff80092a9e820 }, v_bufobj = { bo_lock = { lock_object = { lo_name = 0xffffffff80fd2f3d "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0 }, rw_lock = 1 }, bo_ops = 0xffffffff8149bff0, bo_object = 0xfffff80135aaa500, bo_synclist = { le_next = 0x0, le_prev = 0x0 }, bo_private = 0xfffff8010bc08760, __bo_vnode = 0xfffff8010bc08760, bo_clean = { bv_hd = { tqh_first = 0x0, tqh_last = 0xfffff8010bc08880 }, bv_root = { pt_root = 0 }, bv_cnt = 0 }, bo_dirty = { bv_hd = { tqh_first = 0x0, tqh_last = 0xfffff8010bc088a0 }, bv_root = { pt_root = 0 }, bv_cnt = 0 }, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072 }, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = { rl_waiters = { tqh_first = 0x0, tqh_last = 0xfffff8010bc088e8 }, rl_currdep = 0x0 }, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 2, v_usecount = 2, v_iflag = 512, v_vflag = 0, v_writecount = 0, v_hash = 17547399, v_type = VREG } (kgdb) print vp->v_cache_dst->tqh_first->nc_name $14 = 0xfffff801a260f512 "puppet20160226-52827-x6ea8y" The file where things go 
wrong for us is: /tmp/puppet20160226-52827-x6ea8y Performing a cat on this file also hangs on vodead. Any relevant waiting channels: # ps -o lwp -laxwwwSH | awk '{ if ($10 !~ "^(-|ttyin|lockf|wait|select|kqread|tx\->)") print; }' LWP UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND 100000 0 0 0 0 -16 0 0 285088 swapin DLs - 0:41.09 [kernel/swapper] 100048 0 0 0 0 -92 0 0 285088 vtbslp DLs - 0:00.00 [kernel/virtio_ballo] 100018 0 2 0 0 -16 0 0 16 crypto_w DL - 0:00.00 [crypto] 100019 0 3 0 0 -16 0 0 16 crypto_r DL - 0:00.00 [crypto returns] 100057 0 5 0 0 20 0 0 6176 arc_recl DL - 13:07.13 [zfskern/arc_reclaim] 100058 0 5 0 0 -8 0 0 6176 l2arc_fe DL - 0:40.44 [zfskern/l2arc_feed_] 102486 0 5 0 0 -8 0 0 6176 zio->io_ DL - 8:24.66 [zfskern/txg_thread_] 100062 0 7 0 0 -16 0 0 32 psleep DL - 17:00.73 [pagedaemon/pagedaem] 100068 0 7 0 0 -16 0 0 32 umarcl DL - 0:00.00 [pagedaemon/uma] 100063 0 8 0 0 -16 0 0 16 psleep DL - 0:00.00 [vmdaemon] 100064 0 9 0 0 155 0 0 16 pgzero DL - 0:00.17 [pagezero] 100001 0 10 0 0 -16 0 0 16 audit_wo DL - 14:36.35 [audit] 100065 0 16 0 0 -16 0 0 16 psleep DL - 0:51.50 [bufdaemon] 100066 0 17 0 0 -16 0 0 16 vlruwt DL - 1:27.70 [vnlru] 100067 0 18 0 0 16 0 0 16 syncer DL - 63:11.82 [syncer] 100406 65534 921 1 0 52 0 71104 33604 uwait Is - 0:00.00 /usr/local/bin/memcached -l 127.0.0.1 -m 1024 -p 11211 -C -d -u nobody -P /var/run/memcached/memcached.pid 100387 0 958 1 0 23 0 16612 2224 nanslp Is - 97:24.14 /usr/sbin/cron -s 100394 0 1194 0 0 -16 0 0 16 pftm DL - 30:48.46 [pf purge] 100388 0 1201 1 0 52 0 14700 2256 sbwait Is - 0:00.00 pflogd: [priv] (pflogd) 100375 64 1202 1201 0 20 0 14700 2312 bpf S - 5:36.59 pflogd: [running] -s 116 -i pflog0 -f /dev/null (pflogd) 100389 0 52827 1 0 52 0 213452 112968 vodead T - 0:08.69 ruby21: puppet agent: applying configuration (ruby21) 100400 0 55604 0 0 -16 0 0 16 ftcl DL - 0:00.00 [ftcleanup] 101076 0 58064 1 0 20 0 16588 2168 auditd Is - 0:07.02 /usr/sbin/auditd 101349 0 37651 37601 0 20 0 12356 2132 vodead D+ 0- 0:00.00 cat puppet20160226-52827-x6ea8y This is the first time we're seeing this.
I have to reboot the machine now, so I hope this gives enough extra information (though I doubt it). This was all the relevant information I could come up with.
We are seeing this issue at Limelight across a good number of boxes on stable/10 @ r285800. We have a newer 10 version in testing @ r296969, but have not used it in anger enough to know empirically whether it has been fixed. It should be getting deployed further in the next month or so. Unfortunately we don't yet have a way to reproduce, so we can't simply introduce INVARIANTS into a kernel to help capture some info. I'll report back if we find one, but in the meantime I did want to +1 this.
Hello, another +1 for us.
And now the debugging info. Unfortunately, we are new to this bug, so we have a kernel without INVARIANTS and without debugging.patch yet. We have a locked bacula-fd, which is a backup client with functionality similar to find & stat, and it occurred during an incremental backup operation.

procstat -kk (I can send the full procstat -kk -a personally; it is 429876 bytes):

1331 101013 bacula-fd - mi_switch+0xbe sleepq_catch_signals+0xab sleepq_wait_sig+0xc _cv_wait_sig+0x17a seltdwait+0xae kern_select+0x8fa sys_select+0x54 amd64_syscall+0x321 Xfast_syscall+0xfb
1331 101035 bacula-fd - mi_switch+0xbe sleepq_catch_signals+0xab sleepq_timedwait_sig+0xf _sleep+0x238 umtxq_sleep+0x125 do_wait+0x387 __umtx_op_wait_uint_private+0x83 amd64_syscall+0x321 Xfast_syscall+0xfb
1331 102066 bacula-fd - mi_switch+0xbe sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 ufs_lookup_ino+0xa0 VOP_CACHEDLOOKUP_APV+0x7a vfs_cache_lookup+0xd6 VOP_LOOKUP_APV+0x7a lookup+0x571 namei+0x344 kern_statat_vnhook+0xae sys_lstat+0x30 amd64_syscall+0x321 Xfast_syscall+0xfb
1331 103370 bacula-fd - mi_switch+0xbe sleepq_catch_signals+0xab sleepq_timedwait_sig+0xf _cv_timedwait_sig_sbt+0x19e seltdwait+0xa4 kern_select+0x8fa sys_select+0x54 amd64_syscall+0x321 Xfast_syscall+0xfb

where the important thread is TID 102066, with vnode_create_vobject.

(kgdb) info threads
...
  701 Thread 102066 (PID=1331: bacula-fd)  sched_switch (td=0xfffff803ec526960, newtd=0x0, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
...
(kgdb) thread 701
[Switching to thread 701 (Thread 102066)]#0  sched_switch (td=0xfffff803ec526960, newtd=0x0, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
1945            cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xfffff803ec526960, newtd=0x0, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945
#1  0xffffffff80580d4e in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:491
#2  0xffffffff805c155a in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:618
#3  0xffffffff80580837 in _sleep (ident=<value optimized out>, lock=<value optimized out>, priority=<value optimized out>, wmesg=<value optimized out>, sbt=<value optimized out>, pr=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:255
#4  0xffffffff80766e60 in vnode_create_vobject (vp=0xfffff816fe4f3588, isize=512, td=0xfffff803ec526960) at /usr/src/sys/vm/vnode_pager.c:120
#5  0xffffffff8072fe80 in ufs_lookup_ino (vdp=0xfffff816fe4f3588, vpp=0xfffffe17f2e63858, cnp=0xfffffe17f2e63880, dd_ino=0x0) at /usr/src/sys/ufs/ufs/ufs_lookup.c:259
#6  0xffffffff807f897a in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:197
#7  0xffffffff806092e6 in vfs_cache_lookup (ap=<value optimized out>) at vnode_if.h:80
#8  0xffffffff807f883a in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:129
#9  0xffffffff806111f1 in lookup (ndp=0xfffffe17f2e637f8) at vnode_if.h:54
#10 0xffffffff80610a04 in namei (ndp=0xfffffe17f2e637f8) at /usr/src/sys/kern/vfs_lookup.c:302
#11 0xffffffff8062631e in kern_statat_vnhook (td=0xfffff803ec526960, flag=<value optimized out>, fd=-100, path=0x43c18128 <Error reading address 0x43c18128: Bad address>, pathseg=UIO_USERSPACE, sbp=0xfffffe17f2e63910, hook=0) at /usr/src/sys/kern/vfs_syscalls.c:2300
#12 0xffffffff80626420 in sys_lstat (td=0x0, uap=0xfffffe17f2e63a40) at /usr/src/sys/kern/vfs_syscalls.c:2280
#13 0xffffffff8078acd1 in amd64_syscall (td=0xfffff803ec526960, traced=0) at
subr_syscall.c:141 #14 0xffffffff8076fffb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #15 0x00000000425a731a in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 4 #4 0xffffffff80766e60 in vnode_create_vobject (vp=0xfffff816fe4f3588, isize=512, td=0xfffff803ec526960) at /usr/src/sys/vm/vnode_pager.c:120 120 VM_OBJECT_SLEEP(object, object, PDROP | PVM, "vodead", 0); (kgdb) p *vp $1 = {v_tag = 0xffffffff8085c651 "ufs", v_op = 0xffffffff80b0f950, v_data = 0xfffff81488311160, v_mount = 0xfffff8019a27d000, v_nmntvnodes = {tqe_next = 0xfffff816894dc3b0, tqe_prev = 0xfffff80b0d4991f8}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0xfffff80824c363b0, le_prev = 0xfffff81742127b48}, v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0xfffff80e5f6a5af0, tqh_last = 0xfffff80e5f6a5b10}, v_cache_dd = 0xfffff80e5f6a5af0, v_lock = {lock_object = {lo_name = 0xffffffff8085c651 "ufs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0xffffffff80870687 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xfffff816fe4f35f0, v_actfreelist = { tqe_next = 0xfffff80b0d4991d8, tqe_prev = 0xfffff816894dc470}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff80870697 "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0xffffffff80afc070, bo_object = 0xfffff802f8423d00, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff816fe4f3588, __bo_vnode = 0xfffff816fe4f3588, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff816fe4f36a8}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff816fe4f36c8}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 32768}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff816fe4f3710}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 8, v_usecount = 7, v_iflag = 512, v_vflag = 0, v_writecount = 0, v_hash = 76467851, v_type = VDIR} (kgdb) p *(vp->v_bufobj.bo_object) $2 = {lock = {lock_object = {lo_name = 0xffffffff8088aeb6 "vm object", lo_flags = 90374144, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff802f8423e00, tqe_prev = 0xfffff802f8423c20}, shadow_head = { lh_first = 0x0}, shadow_list = {le_next = 0x0, le_prev = 0xfffff801744afa30}, memq = {tqh_first = 0xfffff8183ea423d0, tqh_last = 0xfffff8183ea423e0}, rtree = {rt_root = 18446735381746688977, rt_flags = 0 '\0'}, size = 1, generation = 1, ref_count = 0, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', flags = 16392, pg_color = 1608, paging_in_progress = 0, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, cache = {rt_root = 0, rt_flags = 0 '\0'}, handle = 0xfffff816fe4f3588, un_pager = {vnp = {vnp_size = 512, writemappings = 0}, devp = {devp_pglist = { tqh_first = 0x200, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x200, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x200, swp_bcount = 0}}, cred = 0x0, charge = 0} (kgdb) p *td $3 = {td_lock = 0xffffffff80b9d4d0, td_proc = 0xfffff8025dd694f0, td_plist = {tqe_next = 0xfffff801744ac4b0, tqe_prev = 0xfffff805374dd970}, td_runq = {tqe_next = 0x0, tqe_prev 
= 0xffffffff80b728b0}, td_slpq = { tqe_next = 0xfffff80389d11960, tqe_prev = 0xfffff80223243e80}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xfffffe00003849b8}, td_hash = {le_next = 0x0, le_prev = 0xfffffe00007d7590}, td_cpuset = 0xfffff801100593a8, td_sel = 0xfffff806ce36eb00, td_sleepqueue = 0x0, td_turnstile = 0xfffff80174d07e40, td_rlqe = 0xfffff8017477ba28, td_umtxq = 0xfffff803a5a44f00, td_tid = 102066, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = { __bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffff803ec526a18}, sq_proc = 0xfffff8025dd694f0, sq_flags = 1}, td_lend_user_pri = 255 '˙', td_flags = 4, td_inhibitors = 2, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0xfffff802f8423d00, td_wmesg = 0xffffffff8088be0a "vodead", td_lastcpu = 13 '\r', td_oncpu = 255 '˙', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 26721, td_rw_rlocks = 0, td_lk_slocks = 0, td_stopsched = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xfffff8017438e800, td_estcpu = 0, td_slptick = 1266348642, td_blktick = 0, td_swvoltick = 1266348642, td_cow = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 7952, ru_ixrss = 21233268, ru_idrss = 1299996, ru_isrss = 13866624, ru_minflt = 1, ru_majflt = 0, ru_nswap = 0, ru_inblock = 2105608, ru_oublock = 0, ru_msgsnd = 1531599, ru_msgrcv = 100, ru_nsignals = 0, ru_nvcsw = 1685809, ru_nivcsw = 9633}, td_rux = {rux_runtime = 2226241352945, rux_uticks = 46227, rux_sticks = 62106, rux_iticks = 0, rux_uu = 365363300, rux_su = 490865795, rux_tu = 856229096}, td_incruntime = 0, td_runtime = 2226241352945, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_generation = 1695442, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 0}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "bacula-fd\000\000\000\000\000\000\000\000\000\000", td_fpop = 0x0, td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, ksi_flags = 0, ksi_sigq = 0x0}, td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0, td_vp_reserv = 0, td_no_sleeping = 0, td_dom_rr_idx = 0, td_sigmask = {__bits = {0, 0, 0, 0}}, td_rqindex = 48 '0', td_base_pri = 152 '\230', td_priority = 152 '\230', td_pri_class = 3 '\003', td_user_pri = 183 'ˇ', td_base_user_pri = 183 'ˇ', td_pcb = 0xfffffe17f2e63b80, td_state = TDS_INHIBITED, td_retval = {0, 1084620448}, td_slpcallout = {c_links = {le = {le_next = 0x0, le_prev = 0xfffffe0000ad9288}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfffffe0000ad9288}}, c_time = 5430875852315327, c_precision = 26843, c_arg = 0xfffff803ec526960, c_func = 0xffffffff805c1370 <sleepq_timeout>, c_lock = 0x0, c_flags = 2, c_iflags = 272, c_cpu = 11}, td_frame = 0xfffffe17f2e63ac0, td_kstack_obj = 0xfffff8076ec32a00, td_kstack = 18446741977545703424, td_kstack_pages = 4, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 
582, md_spurflt_addr = 1131679744}, td_sched = 0xfffff803ec526de0, td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0, td_rfppwait_p = 0xfffff80616f0a000, td_ma = 0x0, td_ma_cnt = 0, td_su = 0x0, td_dbg_sc_code = 0, td_dbg_sc_narg = 0, td_emuldata = 0x0} (kgdb) p *object $4 = {lock = {lock_object = {lo_name = 0xffffffff8088aeb6 "vm object", lo_flags = 90374144, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff802f8423e00, tqe_prev = 0xfffff802f8423c20}, shadow_head = { lh_first = 0x0}, shadow_list = {le_next = 0x0, le_prev = 0xfffff801744afa30}, memq = {tqh_first = 0xfffff8183ea423d0, tqh_last = 0xfffff8183ea423e0}, rtree = {rt_root = 18446735381746688977, rt_flags = 0 '\0'}, size = 1, generation = 1, ref_count = 0, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', flags = 16392, pg_color = 1608, paging_in_progress = 0, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, cache = {rt_root = 0, rt_flags = 0 '\0'}, handle = 0xfffff816fe4f3588, un_pager = {vnp = {vnp_size = 512, writemappings = 0}, devp = {devp_pglist = { tqh_first = 0x200, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x200, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x200, swp_bcount = 0}}, cred = 0x0, charge = 0}
(In reply to Rudolf Čejka from comment #19)
From your report, I am almost certain (at the ~95% level) that this issue and the issue reported in PR 204426 are the same. I would follow up on that PR, since the reporter there seems to be able to reproduce quickly, and more involved debugging patches are already in flight there.
Just to clarify: After rebooting the machine once, the deadlock did not occur again. I'm currently unable to reproduce this issue but hope that the workloads of the other reporters could lead to a better diagnostic.
And another +1. Deadlock occurs on a raidz1 pool with 4 1.2TB members. This system is running on r294572. The issue started after approximately 59 days of production use with an average load of about 150 io transactions per second. The deadlock is on a directory, which is located on a +-600GB ZFS dataset and contains 7 files and two subdirs, a few kb total. procstat -kk for a stuck du process: PID TID COMM TDNAME KSTACK 97300 101745 du - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_freebsd_open+0xf5 VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x36a kern_openat+0x26f amd64_syscall+0x40f Xfast_syscall+0xfb And here is an rsync that is stuck: 25689 101762 rsync-backup - mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_freebsd_open+0xf5 VOP_OPEN_APV+0xa1 vn_open_vnode+0x234 vn_open_cred+0x36a kern_openat+0x26f amd64_syscall+0x40f Xfast_syscall+0xfb This filesystem is exported through nfs. We have a few nfsd threads stuck as well: root@nfs01:/home/roel # procstat -kk 857 PID TID COMM TDNAME KSTACK 857 100485 nfsd nfsd: master mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_timedwait_sig+0x10 _cv_timedwait_sig_sbt+0x18b svc_run_internal+0x46d svc_run+0x182 nfsrvd_nfsd+0x242 nfssvc_nfsd+0x107 sys_nfssvc+0x9c amd64_syscall+0x40f Xfast_syscall+0xfb 857 100505 nfsd nfsd: service mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _cv_wait_sig+0x16a svc_run_internal+0x86e svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe ... lots of similar threads, except for: ... 857 100541 nfsd nfsd: service mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_fhtovp+0x45b nfsvno_fhtovp+0x7c nfsd_fhtovp+0xc8 nfsrvd_dorpc+0x12c nfssvc_program+0x54e svc_run_internal+0xd2b svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe 857 100588 nfsd nfsd: service mi_switch+0xe1 sleepq_wait+0x3a _sleep+0x287 vnode_create_vobject+0x100 zfs_fhtovp+0x45b nfsvno_fhtovp+0x7c nfsd_fhtovp+0xc8 nfsrvd_dorpc+0x12c nfssvc_program+0x54e svc_run_internal+0xd2b svc_thread_start+0xb fork_exit+0x9a fork_trampoline+0xe Some kgdb output for one of the stuck du threads: [Switching to thread 555 (Thread 101720)]#0 sched_switch (td=0xfffff8033a17e000, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945 1945 cpuid = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=0xfffff8033a17e000, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945 #1 0xffffffff80959ff1 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:491 #2 0xffffffff8099b31a in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:617 #3 0xffffffff80959a17 in _sleep (ident=<value optimized out>, lock=<value optimized out>, priority=<value optimized out>, wmesg=<value optimized out>, sbt=<value optimized out>, pr=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:255 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8033a17e000) at /usr/src/sys/vm/vnode_pager.c:120 #5 0xffffffff81a7d5f5 in zfs_freebsd_open (ap=0xfffffe0466890658) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:6076 #6 0xffffffff80e7e651 in VOP_OPEN_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:469 #7 0xffffffff80a05694 in vn_open_vnode (vp=0xfffff801d3e11b10, fmode=1179653, cred=0xfffff803682bd400, td=0xfffff8033a17e000, fp=0xfffff80240d403c0) at vnode_if.h:196 #8 
0xffffffff80a0528a in vn_open_cred (ndp=0xfffffe0466890870, flagp=0xfffffe046689094c, cmode=<value optimized out>, vn_open_flags=<value optimized out>, cred=0x0, fp=0xfffff80240d403c0) at /usr/src/sys/kern/vfs_vnops.c:268 #9 0xffffffff809fe44f in kern_openat (td=0xfffff8033a17e000, fd=-100, path=0x801422d48 <Error reading address 0x801422d48: Bad address>, pathseg=UIO_USERSPACE, flags=1179653, mode=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1093 #10 0xffffffff80d55aff in amd64_syscall (td=0xfffff8033a17e000, traced=0) at subr_syscall.c:141 #11 0xffffffff80d3ad7b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #12 0x0000000800b7570a in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) frame 4 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8033a17e000) at /usr/src/sys/vm/vnode_pager.c:120 120 VM_OBJECT_SLEEP(object, object, PDROP | PVM, "vodead", 0); (kgdb) print vp $4 = (struct vnode *) 0xfffff801d3e11b10 (kgdb) print *vp $5 = {v_tag = 0xffffffff81aecb3b "zfs", v_op = 0xffffffff81aff790, v_data = 0xfffff801a2d1ca10, v_mount = 0xfffff8011a46e660, v_nmntvnodes = {tqe_next = 0xfffff801a53281d8, tqe_prev = 0xfffff801a6b8e5a8}, v_un = { vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {lh_first = 0xfffff80019708850}, v_cache_dst = {tqh_first = 0xfffff80257063850, tqh_last = 0xfffff80257063870}, v_cache_dd = 0xfffff80257063850, v_lock = {lock_object = {lo_name = 0xffffffff81aecb3b "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0xffffffff80fe1a6c "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xfffff801d3e11b78, v_actfreelist = {tqe_next = 0xfffff801a6b8e588, tqe_prev = 0xfffff801a5328298}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff80fe1a7c "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0xffffffff814b3900, bo_object = 0xfffff801fd0ea100, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff801d3e11b10, __bo_vnode = 0xfffff801d3e11b10, bo_clean = {bv_hd = { tqh_first = 0x0, tqh_last = 0xfffff801d3e11c30}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff801d3e11c50}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff801d3e11c98}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 127, v_usecount = 126, v_iflag = 512, v_vflag = 0, v_writecount = 0, v_hash = 7955980, v_type = VDIR} (kgdb) print *td $6 = {td_lock = 0xffffffff81619670, td_proc = 0xfffff8006233f4f0, td_plist = {tqe_next = 0x0, tqe_prev = 0xfffff8006233f500}, td_runq = {tqe_next = 0x0, tqe_prev = 0xffffffff815e3998}, td_slpq = { tqe_next = 0xfffff803f7d36960, tqe_prev = 0xfffff802836ed030}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xfffffe03e44e7908}, td_hash = {le_next = 0x0, le_prev = 0xfffffe000087aac0}, td_cpuset = 0xfffff800083cd3a8, td_sel = 0xfffff80039695e00, td_sleepqueue = 0x0, td_turnstile = 0xfffff8015b0cacc0, td_rlqe = 0xfffff80117ffa4d8, td_umtxq = 0xfffff802d770a780, td_tid = 101720, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last 
= 0xfffff8033a17e0b8}, sq_proc = 0xfffff8006233f4f0, sq_flags = 1}, td_lend_user_pri = 255 'ÿ', td_flags = 133124, td_inhibitors = 2, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0xfffff801fd0ea100, td_wmesg = 0xffffffff81006e8d "vodead", td_lastcpu = 1 '\001', td_oncpu = 255 'ÿ', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 4348, td_rw_rlocks = 0, td_lk_slocks = 0, td_stopsched = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xfffff803682bd400, td_estcpu = 0, td_slptick = 444244584, td_blktick = 0, td_swvoltick = 444244584, td_cow = 39, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1656, ru_ixrss = 192, ru_idrss = 64, ru_isrss = 2048, ru_minflt = 119, ru_majflt = 0, ru_nswap = 0, ru_inblock = 749, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 739, ru_nivcsw = 2}, td_rux = {rux_runtime = 514120625, rux_uticks = 1, rux_sticks = 15, rux_iticks = 0, rux_uu = 12852, rux_su = 192791, rux_tu = 205644}, td_incruntime = 0, td_runtime = 514120625, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_generation = 741, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "du", '\0' <repeats 17 times>, td_fpop = 0x0, td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, ksi_flags = 0, ksi_sigq = 0x0}, td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0, td_vp_reserv = 0, td_no_sleeping = 0, td_dom_rr_idx = 0, td_sigmask = {__bits = {0, 0, 0, 0}}, td_rqindex = 31 '\037', td_base_pri = 120 'x', td_priority = 120 'x', td_pri_class = 3 '\003', td_user_pri = 126 '~', td_base_user_pri = 126 '~', td_pcb = 0xfffffe0466890b80, td_state = TDS_INHIBITED, td_retval = {0, 1}, td_slpcallout = {c_links = {le = { le_next = 0x0, le_prev = 0xfffffe00008d0a10}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfffffe00008d0a10}}, c_time = 20354765951718809, c_precision = 268435437, c_arg = 0xfffff8033a17e000, c_func = 0xffffffff8099b0d0 <sleepq_timeout>, c_lock = 0x0, c_flags = 0, c_iflags = 272, c_cpu = 2}, td_frame = 0xfffffe0466890ac0, td_kstack_obj = 0xfffff803af9b7200, td_kstack = 18446741893586407424, td_kstack_pages = 4, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582, md_spurflt_addr = 0}, td_sched = 0xfffff8033a17e480, td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0xfffff8037c56e700, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0, td_rfppwait_p = 0xfffff802656e09e0, td_ma = 0x0, td_ma_cnt = 0, td_su = 0x0, td_dbg_sc_code = 0, td_dbg_sc_narg = 0, td_emuldata = 0x0} And another stuck du process: (kgdb) thread 795 [Switching to thread 795 (Thread 101745)]#0 sched_switch (td=0xfffff8006f2e84b0, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945 1945 cpuid = PCPU_GET(cpuid); (kgdb) bt 
#0 sched_switch (td=0xfffff8006f2e84b0, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945 #1 0xffffffff80959ff1 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:491 #2 0xffffffff8099b31a in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:617 #3 0xffffffff80959a17 in _sleep (ident=<value optimized out>, lock=<value optimized out>, priority=<value optimized out>, wmesg=<value optimized out>, sbt=<value optimized out>, pr=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/kern_synch.c:255 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8006f2e84b0) at /usr/src/sys/vm/vnode_pager.c:120 #5 0xffffffff81a7d5f5 in zfs_freebsd_open (ap=0xfffffe046690d658) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:6076 #6 0xffffffff80e7e651 in VOP_OPEN_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:469 #7 0xffffffff80a05694 in vn_open_vnode (vp=0xfffff801d3e11b10, fmode=1179653, cred=0xfffff802f6e2f000, td=0xfffff8006f2e84b0, fp=0xfffff80104fcc5a0) at vnode_if.h:196 #8 0xffffffff80a0528a in vn_open_cred (ndp=0xfffffe046690d870, flagp=0xfffffe046690d94c, cmode=<value optimized out>, vn_open_flags=<value optimized out>, cred=0x0, fp=0xfffff80104fcc5a0) at /usr/src/sys/kern/vfs_vnops.c:268 #9 0xffffffff809fe44f in kern_openat (td=0xfffff8006f2e84b0, fd=-100, path=0x801422d48 <Error reading address 0x801422d48: Bad address>, pathseg=UIO_USERSPACE, flags=1179653, mode=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1093 #10 0xffffffff80d55aff in amd64_syscall (td=0xfffff8006f2e84b0, traced=0) at subr_syscall.c:141 #11 0xffffffff80d3ad7b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396 #12 0x0000000800b7570a in ?? () Previous frame inner to this frame (corrupt stack?) 
(kgdb) frame 4 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8006f2e84b0) at /usr/src/sys/vm/vnode_pager.c:120 120 VM_OBJECT_SLEEP(object, object, PDROP | PVM, "vodead", 0); (kgdb) p *vp $7 = {v_tag = 0xffffffff81aecb3b "zfs", v_op = 0xffffffff81aff790, v_data = 0xfffff801a2d1ca10, v_mount = 0xfffff8011a46e660, v_nmntvnodes = {tqe_next = 0xfffff801a53281d8, tqe_prev = 0xfffff801a6b8e5a8}, v_un = { vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {lh_first = 0xfffff80019708850}, v_cache_dst = {tqh_first = 0xfffff80257063850, tqh_last = 0xfffff80257063870}, v_cache_dd = 0xfffff80257063850, v_lock = {lock_object = {lo_name = 0xffffffff81aecb3b "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0xffffffff80fe1a6c "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xfffff801d3e11b78, v_actfreelist = {tqe_next = 0xfffff801a6b8e588, tqe_prev = 0xfffff801a5328298}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff80fe1a7c "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0xffffffff814b3900, bo_object = 0xfffff801fd0ea100, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff801d3e11b10, __bo_vnode = 0xfffff801d3e11b10, bo_clean = {bv_hd = { tqh_first = 0x0, tqh_last = 0xfffff801d3e11c30}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff801d3e11c50}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff801d3e11c98}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 127, v_usecount = 126, v_iflag = 512, v_vflag = 0, v_writecount = 0, v_hash = 7955980, v_type = VDIR} (kgdb) p *td $8 = {td_lock = 0xffffffff81619670, td_proc = 0xfffff80009eb59e0, td_plist = {tqe_next = 0x0, tqe_prev = 0xfffff80009eb59f0}, td_runq = {tqe_next = 0x0, tqe_prev = 0xffffffff815e87a8}, td_slpq = { tqe_next = 0xfffff803cfb27960, tqe_prev = 0xfffff8006f2e8990}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xfffffe04662f90c8}, td_hash = {le_next = 0x0, le_prev = 0xfffffe000087ab88}, td_cpuset = 0xfffff800083cd3a8, td_sel = 0xfffff80290088500, td_sleepqueue = 0x0, td_turnstile = 0xfffff800b09c4840, td_rlqe = 0xfffff8015a73a910, td_umtxq = 0xfffff8012d91b280, td_tid = 101745, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xfffff8006f2e8568}, sq_proc = 0xfffff80009eb59e0, sq_flags = 1}, td_lend_user_pri = 255 'ÿ', td_flags = 133124, td_inhibitors = 2, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0xfffff801fd0ea100, td_wmesg = 0xffffffff81006e8d "vodead", td_lastcpu = 7 '\a', td_oncpu = 255 'ÿ', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 3768, td_rw_rlocks = 0, td_lk_slocks = 0, td_stopsched = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xfffff802f6e2f000, td_estcpu = 0, td_slptick = 729400807, td_blktick = 0, td_swvoltick = 729400807, td_cow = 41, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1696, ru_ixrss = 312, ru_idrss = 
104, ru_isrss = 3328, ru_minflt = 121, ru_majflt = 0, ru_nswap = 0, ru_inblock = 840, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 795, ru_nivcsw = 0}, td_rux = { rux_runtime = 547891773, rux_uticks = 1, rux_sticks = 25, rux_iticks = 0, rux_uu = 8428, rux_su = 210723, rux_tu = 219152}, td_incruntime = 0, td_runtime = 547891773, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_generation = 795, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "du", '\0' <repeats 17 times>, td_fpop = 0x0, td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0}, ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0}, _poll = { _band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}}, ksi_flags = 0, ksi_sigq = 0x0}, td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_map_def_user = 0x0, td_dbg_forked = 0, td_vp_reserv = 0, td_no_sleeping = 0, td_dom_rr_idx = 0, td_sigmask = {__bits = {0, 0, 0, 0}}, td_rqindex = 32 ' ', td_base_pri = 120 'x', td_priority = 120 'x', td_pri_class = 3 '\003', td_user_pri = 130 '\202', td_base_user_pri = 130 '\202', td_pcb = 0xfffffe046690db80, td_state = TDS_INHIBITED, td_retval = {0, 1}, td_slpcallout = {c_links = {le = { le_next = 0x0, le_prev = 0xfffffe0000ac29b0}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfffffe0000ac29b0}}, c_time = 21579501454196634, c_precision = 268435437, c_arg = 0xfffff8006f2e84b0, c_func = 0xffffffff8099b0d0 <sleepq_timeout>, c_lock = 0x0, c_flags = 0, c_iflags = 272, c_cpu = 9}, td_frame = 0xfffffe046690dac0, td_kstack_obj = 0xfffff80198fc0e00, td_kstack = 18446741893586919424, td_kstack_pages = 4, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582, md_spurflt_addr = 0}, td_sched = 0xfffff8006f2e8930, td_ar = 0x0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0xfffff802a4f1fc00, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0, td_intr_frame = 0x0, td_rfppwait_p = 0xfffff803cb6dd4f0, td_ma = 0x0, td_ma_cnt = 0, td_su = 0x0, td_dbg_sc_code = 0, td_dbg_sc_narg = 0, td_emuldata = 0x0} We are replicating this filesystem using incremental ZFS snapshots onto a secondary host (using zrep). I can mount the replicated fs and read the file there without issue. I am currently compiling the latest -STABLE kernel and am scheduling a reboot tonight.
(In reply to roel from comment #22) It would be interesting to see *vp->v_bufobj.bo_object. Also, it's better to add a really long output as an attachment rather than pasting it into a comment.
Here it is, for both du processes listed above: (kgdb) thread 555 (kgdb) frame 4 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8033a17e000) at /usr/src/sys/vm/vnode_pager.c:120 120 VM_OBJECT_SLEEP(object, object, PDROP | PVM, "vodead", 0); (kgdb) p *vp->v_bufobj.bo_object $1 = {lock = {lock_object = {lo_name = 0xffffffff810062d6 "vm object", lo_flags = 90374144, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff801fd0ea200, tqe_prev = 0xfffff801fd0ea020}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0x0, le_prev = 0xfffff8022d69bb38}, memq = {tqh_first = 0x0, tqh_last = 0xfffff801fd0ea148}, rtree = {rt_root = 0, rt_flags = 0 '\0'}, size = 1, generation = 1, ref_count = 0, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', flags = 16392, pg_color = 21, paging_in_progress = 0, resident_page_count = 0, backing_object = 0x0, backing_object_offset = 0, pager_object_list = { tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, cache = {rt_root = 0, rt_flags = 0 '\0'}, handle = 0xfffff801d3e11b10, un_pager = {vnp = {vnp_size = 16, writemappings = 0}, devp = {devp_pglist = { tqh_first = 0x10, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x10, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x10, swp_bcount = 0}}, cred = 0x0, charge = 0} (kgdb) thread 795 [Switching to thread 795 (Thread 101745)]#0 sched_switch (td=0xfffff8006f2e84b0, newtd=<value optimized out>, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1945 1945 cpuid = PCPU_GET(cpuid); (kgdb) frame 4 #4 0xffffffff80bed620 in vnode_create_vobject (vp=0xfffff801d3e11b10, isize=16, td=0xfffff8006f2e84b0) at /usr/src/sys/vm/vnode_pager.c:120 120 VM_OBJECT_SLEEP(object, object, PDROP | PVM, "vodead", 0); (kgdb) p *vp->v_bufobj.bo_object $2 = {lock = {lock_object = {lo_name = 0xffffffff810062d6 "vm object", lo_flags = 90374144, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff801fd0ea200, tqe_prev = 0xfffff801fd0ea020}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0x0, le_prev = 0xfffff8022d69bb38}, memq = {tqh_first = 0x0, tqh_last = 0xfffff801fd0ea148}, rtree = {rt_root = 0, rt_flags = 0 '\0'}, size = 1, generation = 1, ref_count = 0, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', flags = 16392, pg_color = 21, paging_in_progress = 0, resident_page_count = 0, backing_object = 0x0, backing_object_offset = 0, pager_object_list = { tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, cache = {rt_root = 0, rt_flags = 0 '\0'}, handle = 0xfffff801d3e11b10, un_pager = {vnp = {vnp_size = 16, writemappings = 0}, devp = {devp_pglist = { tqh_first = 0x10, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x10, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x10, swp_bcount = 0}}, cred = 0x0, charge = 0}
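A note on reading the object dumps above: the hung threads are sleeping on "vodead", which vnode_create_vobject() does while the vnode's existing vm object has OBJ_DEAD set. Assuming the stock 10.x definition in sys/vm/vm_object.h (OBJ_DEAD is 0x0008 there; worth re-checking against your own source tree), the reported flags values 16392 (0x4008) and 24584 (0x6008) both include that bit, and ref_count is 0, so the object looks permanently dead, which matches the endless sleep. A quick check from the vnode_create_vobject frame in kgdb, with the 0x0008 value being that assumption:

    (kgdb) p/x vp->v_bufobj.bo_object->flags
    (kgdb) p (vp->v_bufobj.bo_object->flags & 0x0008) != 0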
+1

Same/similar environment as the others:
- zfs mirrored pool with multiple ssd disks
- exported over NFS
- random directories are 'stuck' after a couple of days
- ls works fine, but accessing them results in the vodead state (and the newnfs state on the client)

This is on FreeBSD 10.3-RELEASE. I'll try to get the debug info later.
A commit references this bug:

Author: kib
Date: Tue Jun 21 17:49:33 UTC 2016
New revision: 302063
URL: https://svnweb.freebsd.org/changeset/base/302063

Log:
  The vmtotal sysctl handler marks active vm objects to calculate
  statistics.  Marking is done by setting the OBJ_ACTIVE flag.  The
  flags change is locked, but the problem is that many parts of system
  assume that vm object initialization ensures that no other code could
  change the object, and thus performed lockless.  The end result is
  corrupted flags in vm objects, most visible is spurious OBJ_DEAD flag,
  causing random hangs.

  Avoid the active object marking, instead provide equally inexact but
  immutable is_object_alive() definition for the object mapped state.

  Avoid iterating over the processes mappings altogether by using
  arguably improved definition of the paging thread as one which sleeps
  on the v_free_count.

  PR:           204764
  Diagnosed by: pho
  Tested by:    pho (previous version)
  Reviewed by:  alc
  Sponsored by: The FreeBSD Foundation
  MFC after:    1 week
  Approved by:  re (gjb)

Changes:
  head/sys/vm/vm_meter.c
  head/sys/vm/vm_object.h
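To make the mechanism in that log message concrete, here is a schematic illustration of the general hazard it describes: a lost update on a shared flags word when one updater holds a lock and the other assumes it has exclusive access. This is not the actual FreeBSD code path, and the flag names and values are placeholders.

/*
 * Schematic illustration only; not the FreeBSD code or flag values.
 * Two threads do a read-modify-write on the same flags word.  One
 * takes a lock, the other assumes the object is still private to it
 * and does not.  Depending on the interleaving, either update can be
 * lost, which is how a stale lifecycle bit can end up set (or cleared)
 * forever.
 */
#include <pthread.h>
#include <stdio.h>

#define FLAG_MARK 0x04   /* plays the role of the statistics marker */
#define FLAG_DEAD 0x08   /* plays the role of a lifecycle bit       */

static unsigned flags;   /* stands in for a vm object's flags word  */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

/* Like the statistics scan: locked, but the lock does not help if the
 * other updater never takes it. */
static void *locked_marker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mtx);
	flags |= FLAG_MARK;
	pthread_mutex_unlock(&mtx);
	return NULL;
}

/* Like object initialization: "nobody else can see this object yet",
 * so the update is done without the lock. */
static void *lockless_init(void *arg)
{
	unsigned f;

	(void)arg;
	f = flags;          /* unlocked read                          */
	f |= FLAG_DEAD;     /* a legitimate local flag change         */
	flags = f;          /* unlocked write-back may lose the mark  */
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, locked_marker, NULL);
	pthread_create(&b, NULL, lockless_init, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("flags = %#x\n", flags);  /* depends on the interleaving */
	return 0;
}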
Anyone aware if this will be released as a patch for 10.3 through freebsd-update, or if one should apply the diff by hand? :)
Created attachment 173037 [details]
patch to apply r302063 on FreeBSD-10

Due to the lack of r263260 ("Rename global cnt to vm_cnt to avoid shadowing.") the patch for r302063 cannot be applied directly to FreeBSD-10. I created a patch that does not rely on r263260, so users of FreeBSD-10 can test it, too.

With kind regards,
Robert Schulze
(In reply to Robert Schulze from comment #28)
kib MFCd r302063 into stable/10 a few weeks ago as r302243, so custom patching isn't needed if you bring your sources up to date. You are correct, though: cnt.v_free_count was renamed, and that should be the only difference between 10 and 11.
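For anyone on stable/10 wondering what "bring your sources up to date" amounts to in practice, a rough sketch using svnlite from base and the default paths/KERNCONF; adjust to your own setup, and any revision at or after r302243 will do:

    svnlite update /usr/src        (or a fresh checkout of base/stable/10)
    cd /usr/src
    make buildkernel KERNCONF=GENERIC
    make installkernel KERNCONF=GENERIC
    shutdown -r now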
https://lists.freebsd.org/pipermail/freebsd-stable/2016-August/085150.html
Is it possible to have this for 10.3 as an errata patch?
Upgrading to 10-STABLE solved this issue for me.
Yet another data point. Our environment: Dell R730 servers with ~2TB RAID10 disk arrays running FreeBSD 10.3-RELEASE and postgresql92-server, with UFS2 filesystems.

We applied Kostik's patch from https://lists.freebsd.org/pipermail/freebsd-stable/2016-August/085150.html

Before this we were seeing the vodead problem about 2-3 times per week. Since patching we've run for 16 days without problems.

A definite +1 for issuing this patch as an Errata Note.
Since the problem has been fixed, I think this PR can be closed now.