Bug 156193

Summary: [ufs] [hang] UFS snapshot hangs && deadlocks processes
Product: Base System
Reporter: sec
Component: kern
Assignee: freebsd-bugs mailing list <bugs>
Status: Open
Severity: Affects Only Me
CC: chris
Priority: Normal
Version: 8.2-STABLE
Hardware: Any
OS: Any

Description sec 2011-04-05 10:10:13 UTC
I'm doing nightly snapshots on my (geli-encrypted, ufs) filesystems.

The snapshot never completed; it has been hanging for about 5 hours now.

While regular access appears to work, both "sync" and "ls -l /opt/.snap"
hang instantly and are unkillable.

My rtorrent client, which was running when the snapshot was started and
was writing to the snapshotted filesystem, also hangs and is unkillable.

ice:~>ps axlww|grep -E '(mount|sync|ls|torrent)'
    0    20     0   0  44  0     0    16 snaplk DL    ??   14:22.90 [syncer]
    0 21583 20051   0  44  0  6900  1260 ufs    D     ??    0:11.66 mount -u -o snapshot /opt/.snap/daily.0 /opt
 1000 66868  3190   0  44  0  2744   808 snaplk T+     2    0:00.13 sync
 1000 71590 71589   0  44  0  8232  1648 snaplk D     10    0:00.00 ls -l .snap
 1000  3208  3197   0  76  0  8276  1552 wait   Is+   18    0:00.01 /bin/sh /home/sec/bin/myrtorrent
 1000  3209  3208   0  44  0 52976 21276 -      T+    18   17:10.16 rtorrent -o directory=files,session=session
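
For anyone triaging a similar hang, here is a minimal sketch of how the listing above could be filtered by wait channel. It assumes the usual FreeBSD "ps -l" column order (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND); the header line is not part of the paste above, so that layout is an assumption, and the function names are made up for illustration.

```python
# Rows copied from the ps output above (no header line was pasted).
PS_OUTPUT = """\
    0    20     0   0  44  0     0    16 snaplk DL    ??   14:22.90 [syncer]
    0 21583 20051   0  44  0  6900  1260 ufs    D     ??    0:11.66 mount -u -o snapshot /opt/.snap/daily.0 /opt
 1000 66868  3190   0  44  0  2744   808 snaplk T+     2    0:00.13 sync
 1000 71590 71589   0  44  0  8232  1648 snaplk D     10    0:00.00 ls -l .snap
 1000  3208  3197   0  76  0  8276  1552 wait   Is+   18    0:00.01 /bin/sh /home/sec/bin/myrtorrent
 1000  3209  3208   0  44  0 52976 21276 -      T+    18   17:10.16 rtorrent -o directory=files,session=session
"""

# Assumed column order of "ps -l" on FreeBSD; COMMAND is last and may contain spaces.
FIELDS = ("uid", "pid", "ppid", "cpu", "pri", "ni", "vsz", "rss",
          "mwchan", "stat", "tt", "time", "command")


def parse_ps_l(text):
    """Split each row into a dict keyed by FIELDS; skip lines that do not fit."""
    rows = []
    for line in text.splitlines():
        parts = line.strip().split(None, len(FIELDS) - 1)
        if len(parts) != len(FIELDS) or not parts[1].isdigit():
            continue  # header line, blank line, or truncated row
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def waiting_on(rows, wchans):
    """Return the rows whose wait channel (MWCHAN) is one of wchans."""
    return [row for row in rows if row["mwchan"] in wchans]


if __name__ == "__main__":
    for row in waiting_on(parse_ps_l(PS_OUTPUT), {"snaplk", "ufs"}):
        print(row["pid"], row["mwchan"], row["stat"], row["command"])
```

Run against the rows above, this selects the syncer, mount, sync and ls processes, i.e. the ones sleeping on "snaplk" or "ufs".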

kgdb-backtrace for "mount":
(kgdb) thread 308
[Switching to thread 308 (Thread 100653)]#0  sched_switch (td=0xffffff0118398000, newtd=0xffffff0001a0c000, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1865
1865                    cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xffffff0118398000, newtd=0xffffff0001a0c000, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1865
#1  0xffffffff8042fd0f in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449
#2  0xffffffff80461c9b in sleepq_switch (wchan=Variable "wchan" is not available.
) at /usr/src/sys/kern/subr_sleepqueue.c:530
#3  0xffffffff80462985 in sleepq_wait (wchan=0xffffff00039357f8, pri=80) at /usr/src/sys/kern/subr_sleepqueue.c:609
#4  0xffffffff804102e2 in __lockmgr_args (lk=0xffffff00039357f8, flags=524544, ilk=0xffffff0003935820, wmesg=Variable "wmesg" is not available.
) at /usr/src/sys/kern/kern_lock.c:221
#5  0xffffffff8063efa7 in ffs_lock (ap=0xffffff811bee90f0) at lockmgr.h:94
#6  0xffffffff806f38e5 in VOP_LOCK1_APV (vop=0xffffffff80960160, a=0xffffff811bee90f0) at vnode_if.c:1988
#7  0xffffffff804c4918 in _vn_lock (vp=0xffffff0003935760, flags=524544, file=0xffffffff807708b0 "/usr/src/sys/ufs/ffs/ffs_snapshot.c", line=2422) at vnode_if.h:859
#8  0xffffffff806299c2 in process_deferred_inactive (mp=0xffffff0003142bc0) at /usr/src/sys/ufs/ffs/ffs_snapshot.c:2422
#9  0xffffffff804c38a2 in vfs_write_resume (mp=0xffffff0003142bc0) at /usr/src/sys/kern/vfs_vnops.c:1167
#10 0xffffffff8062b9ef in ffs_snapshot (mp=0xffffff0003142bc0, snapfile=0xffffff00a04eaaa0 "/opt/.snap/daily.0") at /usr/src/sys/ufs/ffs/ffs_snapshot.c:674
#11 0xffffffff8063b6c7 in ffs_mount (mp=0xffffff0003142bc0) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:392
#12 0xffffffff804b0249 in vfs_donmount (td=0x0, fsflags=65536, fsoptions=0xffffff0003466600) at /usr/src/sys/kern/vfs_mount.c:988
#13 0xffffffff804b0bb5 in nmount (td=0xffffff0118398000, uap=0xffffff811bee9bd0) at /usr/src/sys/kern/vfs_mount.c:424
#14 0xffffffff8046501e in syscallenter (td=0xffffff0118398000, sa=0xffffff811bee9bc0) at /usr/src/sys/kern/subr_trap.c:315
#15 0xffffffff80692811 in syscall (frame=0xffffff811bee9c50) at /usr/src/sys/amd64/amd64/trap.c:914
#16 0xffffffff8067b0e2 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:381
#17 0x00000008007ae5bc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

kgdb-backtraces for "rtorrent" (2 threads):

(kgdb) thread 175
[Switching to thread 175 (Thread 100331)]#0  sched_switch (td=0xffffff0050bd78c0, newtd=0xffffff0001a0c000, flags=) at /usr/src/sys/kern/sched_ule.c:1865
1865			cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xffffff0050bd78c0, newtd=0xffffff0001a0c000, flags=) at /usr/src/sys/kern/sched_ule.c:1865
#1  0xffffffff8042fd0f in mi_switch (flags=266, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449
#2  0xffffffff8043577d in thread_suspend_switch (td=0xffffff0050bd78c0) at /usr/src/sys/kern/kern_thread.c:798
#3  0xffffffff80436902 in thread_single (mode=1) at /usr/src/sys/kern/kern_thread.c:646
#4  0xffffffff803f9e48 in exit1 (td=0xffffff0050bd78c0, rv=9) at /usr/src/sys/kern/kern_exit.c:169
#5  0xffffffff80428f9b in sigexit (td=0xffffff0050bd78c0, sig=9) at /usr/src/sys/kern/kern_sig.c:2880
#6  0xffffffff8042aa94 in postsig (sig=9) at /usr/src/sys/kern/kern_sig.c:2767
#7  0xffffffff8046548f in ast (framep=0xffffff811b89fc50) at /usr/src/sys/kern/subr_trap.c:218
#8  0xffffffff8067bea9 in doreti_ast () at /usr/src/sys/amd64/amd64/exception.S:640
#9  0x0000000000000004 in ?? ()
#10 0x0000000802204000 in ?? ()
#11 0x0000000000000000 in ?? ()
#12 0x0000000802158000 in ?? ()
#13 0x0000000000004000 in ?? ()
#14 0x00007fffffbfef30 in ?? ()
#15 0x0000000000000004 in ?? ()
#16 0x000000080200b600 in ?? ()
#17 0x0000000000000001 in ?? ()
#18 0x00000000004be330 in ?? ()
#19 0x0000000802c27000 in ?? ()
#20 0x0004a024b96aee40 in ?? ()
#21 0x0000000000000000 in ?? ()
#22 0x000000080202c600 in ?? ()
#23 0x0000000000000000 in ?? ()
#24 0x001b00130000000c in ?? ()
#25 0x0000000000511dc8 in ?? ()
#26 0x003b003b00000001 in ?? ()
#27 0x0000000000000002 in ?? ()
#28 0x0000000801d7c93c in ?? ()
#29 0x0000000000000043 in ?? ()
#30 0x0000000000000207 in ?? ()
#31 0x00007fffffbfef28 in ?? ()
#32 0x000000000000003b in ?? ()
#33 0xffffffff00ffffff in ?? ()
#34 0xffffffff809add00 in affinity ()
#35 0xffffffff809add00 in affinity ()
#36 0xffffff0001a0c000 in ?? ()
#37 0xffffff811b89f790 in ?? ()
#38 0xffffff811b89f748 in ?? ()
#39 0xffffff0050bd78c0 in ?? ()
#40 0xffffffff8044c423 in sched_switch (td=0x80200b600, newtd=0x4a024b96aee40, flags=) at /usr/src/sys/kern/sched_ule.c:1859
(kgdb) thread 174
[Switching to thread 174 (Thread 100161)]#0  sched_switch (td=0xffffff0003be48c0, newtd=0xffffff005075b000, flags=) at /usr/src/sys/kern/sched_ule.c:1865
1865			cpuid = PCPU_GET(cpuid);
(kgdb) bt
#0  sched_switch (td=0xffffff0003be48c0, newtd=0xffffff005075b000, flags=) at /usr/src/sys/kern/sched_ule.c:1865
#1  0xffffffff8042fd0f in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449
#2  0xffffffff80461c9b in sleepq_switch (wchan=) at /usr/src/sys/kern/subr_sleepqueue.c:530
#3  0xffffffff80462985 in sleepq_wait (wchan=0xffffff0003112a30, pri=80) at /usr/src/sys/kern/subr_sleepqueue.c:609
#4  0xffffffff804102e2 in __lockmgr_args (lk=0xffffff0003112a30, flags=526592, ilk=0xffffff0003389bd0, wmesg=) at /usr/src/sys/kern/kern_lock.c:221
#5  0xffffffff80628d69 in ffs_copyonwrite (devvp=0xffffff0003389b10, bp=0xffffff80ef1f9168) at lockmgr.h:94
#6  0xffffffff80639235 in ffs_geom_strategy (bo=0xffffff0003389c28, bp=0xffffff80ef1f9168) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1997
#7  0xffffffff8064907b in ufs_strategy (ap=) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2212
#8  0xffffffff806f37b5 in VOP_STRATEGY_APV (vop=0xffffffff809609a0, a=0xffffff811b54d290) at vnode_if.c:2171
#9  0xffffffff8049bf2c in bufstrategy (bo=) at vnode_if.h:940
#10 0xffffffff804a259f in bufwrite (bp=0xffffff80ef1f9168) at buf.h:398
#11 0xffffffff8049b7c5 in bawrite (bp=) at buf.h:386
#12 0xffffffff804a4d6d in cluster_wbuild (vp=0xffffff0003935760, size=16384, start_lbn=9880, len=1) at /usr/src/sys/kern/vfs_cluster.c:808
#13 0xffffffff804a6854 in cluster_write (vp=0xffffff0003935760, bp=0xffffff80ef1f9168, filesize=367083290, seqcount=127) at /usr/src/sys/kern/vfs_cluster.c:579
#14 0xffffffff8063e642 in ffs_write (ap=0xffffff811b54d6b0) at /usr/src/sys/ufs/ffs/ffs_vnops.c:798
#15 0xffffffff806f3db5 in VOP_WRITE_APV (vop=0xffffffff80960160, a=0xffffff811b54d6b0) at vnode_if.c:951
#16 0xffffffff80671265 in vnode_pager_generic_putpages (vp=0xffffff0003935760, m=0xffffff811b54d8b0, bytecount=) at vnode_if.h:413
#17 0xffffffff804a6d1e in vop_stdputpages (ap=) at /usr/src/sys/kern/vfs_default.c:715
#18 0xffffffff806f3059 in VOP_PUTPAGES_APV (vop=) at vnode_if.c:2666
#19 0xffffffff8067142a in vnode_pager_putpages (object=0xffffff011de1a5e8, m=0xffffff811b54d8b0, count=16, sync=8, rtvals=0xffffff811b54d7e0) at vnode_if.h:1169
#20 0xffffffff8066bd06 in vm_pageout_flush (mc=0xffffff811b54d8b0, count=16, flags=8, mreq=0, prunlen=) at vm_pager.h:147
#21 0xffffffff80665e96 in vm_object_page_collect_flush (object=) at /usr/src/sys/vm/vm_object.c:884
#22 0xffffffff80666052 in vm_object_page_clean (object=0xffffff011de1a5e8, start=39424, end=) at /usr/src/sys/vm/vm_object.c:835
#23 0xffffffff80668829 in vm_object_sync (object=0xffffff011de1a5e8, offset=161480704, size=524288, syncio=0, invalidate=0) at /usr/src/sys/vm/vm_object.c:941
#24 0xffffffff8066170f in vm_map_sync (map=0xffffff009a892c40, start=34416885760, end=34417410048, syncio=0, invalidate=0) at /usr/src/sys/vm/vm_map.c:2633
#25 0xffffffff806645d2 in msync (td=) at /usr/src/sys/vm/vm_mmap.c:520
#26 0xffffffff8046501e in syscallenter (td=0xffffff0003be48c0, sa=0xffffff811b54dbc0) at /usr/src/sys/kern/subr_trap.c:315
#27 0xffffffff80692811 in syscall (frame=0xffffff811b54dc50) at /usr/src/sys/amd64/amd64/trap.c:914
#28 0xffffffff8067b0e2 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:381
#29 0x0000000801cfa8fc in ?? ()
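
One way to cross-check the backtraces is to look for kernel pointers that show up in more than one thread. The sketch below is only an illustration: the frames are abbreviated copies of the traces above (function name and arguments only), and the helper name is hypothetical. It reports that vp=0xffffff0003935760 appears both in the mount thread (308, in _vn_lock) and in the rtorrent writer thread (174, in cluster_write), which may point to the two threads contending over the same vnode. It does not prove which lock each thread holds.

```python
import re
from collections import defaultdict

# Abbreviated frames from the backtraces above, keyed by kgdb thread number.
TRACES = {
    308: """\
__lockmgr_args (lk=0xffffff00039357f8, flags=524544, ilk=0xffffff0003935820)
_vn_lock (vp=0xffffff0003935760, flags=524544)
process_deferred_inactive (mp=0xffffff0003142bc0)
""",
    174: """\
__lockmgr_args (lk=0xffffff0003112a30, flags=526592, ilk=0xffffff0003389bd0)
ffs_copyonwrite (devvp=0xffffff0003389b10, bp=0xffffff80ef1f9168)
cluster_write (vp=0xffffff0003935760, bp=0xffffff80ef1f9168, filesize=367083290, seqcount=127)
""",
}

# Match only the arguments named exactly vp, lk or mp; the word boundary
# keeps "devvp=" and "ilk=" from matching.
ARG_RE = re.compile(r"\b(vp|lk|mp)=(0x[0-9a-f]+)")


def shared_pointers(traces):
    """Map (argument name, address) to the sorted threads that mention it,
    keeping only addresses seen in more than one thread."""
    seen = defaultdict(set)
    for thread, text in traces.items():
        for name, value in ARG_RE.findall(text):
            seen[(name, value)].add(thread)
    return {key: sorted(threads) for key, threads in seen.items() if len(threads) > 1}


if __name__ == "__main__":
    for (name, value), threads in shared_pointers(TRACES).items():
        print(f"{name}={value} appears in threads {threads}")
```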

If there is any other information that I can provide before rebooting
the system to get it usable again, please tell me.
Comment 1 Mark Linimon 2011-04-09 21:01:05 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 2 Alexander 2011-08-11 08:01:22 UTC
I have the same problem.
FreeBSD domain.com 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Fri Feb 18 02:24:46
UTC 2011     root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386
Comment 3 Eitan Adler 2017-12-31 08:01:24 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped