Bug 114676

Summary: [ufs] snapshot creation panics: snapacct_ufs2: bad block
Product: Base System
Component: kern
Status: Closed FIXED
Severity: Affects Only Me
Priority: Normal
Version: 6.2-STABLE
Hardware: Any
OS: Any
Reporter: Gael Roualland <gael.roualland>
Assignee: freebsd-bugs (Nobody) <bugs>
CC: chris, mckusick
Attachments:
Description                Flags
114676-kgdb1.txt           none
114676-kgdb2.txt           none
114676-dmesg.txt           none
114676-kernelconf.txt      none

Description Gael Roualland 2007-07-17 22:50:01 UTC
I've been using automatic, daily snapshots on this system with no
problems for a while on UFS2 filesystems of various sizes.

In the course of taking a snapshot last night, the box crashed with
the following dump:

Dump header from device /dev/ad0s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 402247680B (383 MB)
  Blocksize: 512
  Dumptime: Tue Jul 17 00:00:06 2007
  Hostname: jerry.priv
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 6.2-STABLE #1: Tue Mar 13 01:41:29 CET 2007
    gael@jerry:/home/cvsup/obj/home/cvsup/src/sys/JERRY
  Panic String: snapacct_ufs2: bad block
  Dump Parity: 1868970034
  Bounds: 4
  Dump Status: good

From the logs of the snapshot utility I can tell that the filesystem which
triggered the panic is /var, which on this system is not really large (2 GB).

After reboot, the background fsck tried to create a snapshot on the
same filesystem which ended up with the same panic, over and over.
 
Manually fscking the filesystems and removing the snapshots cured
the problem.

Here's the kgdb output for the dump. Unfortunately my kernel was not
built with debugging symbols and the stack seems trashed, so it might
not be very useful:

$ kgdb /boot/kernel/kernel /var/crash/vmcore.4
kgdb: kvm_nlist(_stopped_cpus): 
kgdb: kvm_nlist(_stoppcbs): 
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
(no debugging symbols found)...Attempt to extract a component of a value that is not a structure pointer.
(kgdb) bt
#0  0xc051d812 in doadump ()
#1  0xc0786240 in buf.0 ()
#2  0xd6097350 in ?? ()
#3  0xc051dc6d in boot ()
Previous frame inner to this frame (corrupt stack?)

Hope that helps anyway...

Fix:

Run fsck in the foreground and remove past snapshots.

How-To-Repeat: Unknown. Looks like file system and/or snapshot corruption.
Comment 1 Remko Lodder freebsd_committer freebsd_triage 2007-07-25 07:07:01 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Seems more FS related, reassign.
Comment 2 Joerg Wunsch 2008-06-30 21:34:42 UTC
FreeBSD-7-stable is completely unstable for me due to this panic.

I used to have regular snapshots enabled on /var and /home.  While
there has been random (and not quite reproducible) file corruption
within snapshotted binary files in the past, after upgrading from
FreeBSD 6.x to 7-stable, the system went completely unstable due
to this panic.  It crashes every couple of days now.  I have already
disabled the regular snapshots, but the system still tends to lock up
hard during startup after a crash.  The only remedy then is to
manually fsck everything in single-user mode.

Here's the dump information from the recent crash dumps:

Dump header from device /dev/ad4s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 317976576B (303 MB)
  Blocksize: 512
  Dumptime: Wed Jun 25 04:10:25 2008
  Hostname: uriah.heep.sax.de
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-STABLE #5: Tue Jun 17 15:06:46 MET DST 2008
    r@uriah.heep.sax.de:/usr/obj/usr/src/sys/URIAH
  Panic String: snapacct_ufs2: bad block
  Dump Parity: 558154102
  Bounds: 42
  Dump Status: good
Dump header from device /dev/ad4s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 180707328B (172 MB)
  Blocksize: 512
  Dumptime: Mon Jun 30 22:05:09 2008
  Hostname: uriah.heep.sax.de
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-STABLE #6: Mon Jun 30 21:24:35 MET DST 2008
    root@uriah.heep.sax.de:/usr/obj/usr/src/sys/URIAH
  Panic String: snapacct_ufs2: bad block
  Dump Parity: 859829782
  Bounds: 43
  Dump Status: good

I'm going to avoid /any/ kind of snapshot (even dump -L) now just to
get a stable system again (hopefully).

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
Comment 3 Martin Schütte 2009-03-04 01:46:03 UTC
Hello,
I have encountered the same problem since updating from 6.0 to 7.1-RELEASE.

The system is a Pentium 4 with a 500 GB RAID1 (twe). The filesystems are
clean; in the case of the attached dumps, the panics happened just after a
complete fsck in single-user mode.
The panics are not completely reproducible: they happen quite often,
but not on every single snapshot.

-- 
Martin
Comment 4 jo 2010-09-22 13:57:23 UTC
Similar problem on 8-stable, crashed during snapshot creation on a UFS2
filesystem. But for me it seems to be caused by HDD errors:

acd0: FAILURE - ATA_IDENTIFY status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=0
Device: /dev/ad4, 12 Currently unreadable (pending) sectors
Device: /dev/ad4, 12 Offline uncorrectable sectors
Comment 5 Marcin Gryszkalis 2010-10-09 00:21:12 UTC
I hit this panic after a crash caused by a failed UPS.
The system is 8.0-RELEASE-pX.
Worse, it caused an endless loop of reboots (the crash triggered a
background fsck with a snapshot, which caused a panic, which caused a
reboot and another fsck, and so on).

UFS is placed on gmirror-ed disks (additionally checked with smartctl), so I
guess a low-level disk failure is not the cause.

I have a few dumps available.

*blkp is BLK_SNAP 

*ibp is
$5 = {b_bufobj = 0xc585ea14, b_bcount = 16384, b_caller1 = 0x0, b_data = 0xdd1f3000 "\001", b_error = 0, b_iocmd = 2 '\002', b_ioflags = 2 '\002', b_iooffset = 19662389248,
  b_resid = 0, b_iodone = 0, b_blkno = 38403104, b_offset = -140257722368, b_bobufs = {tqe_next = 0xd9226e90, tqe_prev = 0xd90e9a08}, b_left = 0xd92c08e0,
  b_right = 0xd9281790, b_vflags = 0, b_freelist = {tqe_next = 0xd90e99d0, tqe_prev = 0xd9226edc}, b_qindex = 2, b_flags = 2147483808, b_xflags = 1 '\001', b_lock = {
    lock_object = {lo_name = 0xc0a74500 "bufwait", lo_flags = 91422720, lo_data = 0, lo_witness = 0x0}, lk_lock = 3311290624, lk_timo = 0, lk_pri = 80}, b_bufsize = 16384,
  b_runningbufspace = 0, b_kvabase = 0xdd1f3000 "\001", b_kvasize = 16384, b_lblkno = -8560652, b_vp = 0xc585e96c, b_dirtyoff = 0, b_dirtyend = 0, b_rcred = 0x0,
  b_wcred = 0x0, b_saveaddr = 0xdd1f3000, b_pager = {pg_reqpage = 0}, b_cluster = {cluster_head = {tqh_first = 0x0, tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0,
      tqe_prev = 0x0}}, b_pages = {0xc1a7e050, 0xc1a0a568, 0xc1a16ec8, 0xc16fdfe0, 0x0 <repeats 28 times>}, b_npages = 4, b_dep = {lh_first = 0x0}, b_fsprivate1 = 0x0,
  b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0}
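
As a rough sanity check on the buffer dump above, the offset fields can be
cross-checked against the block numbers. The sketch below is not part of the
original report. It assumes a device block size of 512 bytes (DEV_BSIZE is an
assumption here, not a value from the dump) and that the filesystem block
size equals b_bcount (16384). The helper name describe() is hypothetical.
In UFS, negative logical block numbers conventionally denote indirect
blocks, which is consistent with the variable being named ibp.

```python
# Field values copied from the struct buf dump above.
b_blkno = 38403104          # physical block number, in device blocks
b_iooffset = 19662389248    # byte offset on the device
b_lblkno = -8560652         # logical block number within the vnode
b_offset = -140257722368    # byte offset within the vnode
b_bcount = 16384            # buffer size in bytes

DEV_BSIZE = 512  # assumption: device block size, not shown in the dump


def describe(lblkno):
    """Hypothetical helper: classify a UFS logical block number by sign."""
    return "indirect block" if lblkno < 0 else "data block"


# Each entry maps a consistency rule to whether the dumped values obey it.
checks = {
    "b_iooffset == b_blkno * DEV_BSIZE": b_iooffset == b_blkno * DEV_BSIZE,
    "b_offset == b_lblkno * b_bcount": b_offset == b_lblkno * b_bcount,
}

for rule, ok in checks.items():
    print(f"{rule}: {ok}")
print(f"b_lblkno {b_lblkno} looks like an {describe(b_lblkno)}")
```

Both rules hold for the values in this dump, so the buffer header itself
looks internally consistent. This says nothing about whether the contents
of the block are valid.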


-- 
Marcin Gryszkalis, PGP 0x9F183FA3 
jabber jid:mg@fork.pl, gg:2532994
http://the.fork.pl
Comment 6 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:01:16 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 7 Kirk McKusick freebsd_committer freebsd_triage 2021-02-25 17:53:13 UTC
This bug has finally been tracked down and fixed.
See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253158