Bug 150503 - [zfs] ZFS disks are UNAVAIL and corrupted after reboot
Summary: [zfs] ZFS disks are UNAVAIL and corrupted after reboot
Status: Closed Overcome By Events
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: unspecified
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-12 17:30 UTC by William FRANCK
Modified: 2019-07-18 11:35 UTC

See Also:


Attachments

Description William FRANCK 2010-09-12 17:30:04 UTC
After just creating the zpool (not even any ZFS volumes yet), ZFS is fine.
After rebooting the system, all ZFS disks are marked UNAVAIL.

Tested with different disk formatting:
# dd if=/dev/zero of=/dev/ad4 bs=1m count=1
or 
# gpart create -s gpt ad8
# gpart add -b 34 -s 128 -t freebsd-boot ad8
# gpart add -b 162 -s 1465148973 -t freebsd-zfs ad8

Tested with and without any real data.

After reboot:
# zpool status
pool: tank
state: FAULTED
status: One or more devices could not be used because the label is missing 
	or invalid.  There are insufficient replicas for the pool to continue
	functioning.
action: Destroy and re-create the pool from a backup source.
  see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:
	NAME        STATE     READ WRITE CKSUM
	tank        FAULTED      0     0     0  corrupted data
	  raidz1    ONLINE       0     0     0
	    ad4p2   UNAVAIL      0     0     0  corrupted data
	    ad8p2   UNAVAIL      0     0     0  corrupted data

How-To-Repeat: # zpool destroy tank

CASE A (the same applies to ad4 and ad8):
# gpart create -s gpt ad8
# gpart add -b 34 -s 128 -t freebsd-boot ad8
# gpart show ad8
# gpart add -b 162 -s 1465148973 -t freebsd-zfs ad8
# fdisk -a /dev/ad8

note: 1465148973 is the exact number reported by 'gpart show'

or CASE B:
# dd if=/dev/zero of=/dev/ad4 bs=1m count=1
# dd if=/dev/zero of=/dev/ad8 bs=1m count=1


AND THEN, IN EITHER CASE:
# zpool create tank raidz ad4 ad8
# zfs create -p tank/ROOT/freebsd

REBOOT
# shutdown -r now
(restarted ... logged in ...)
# zpool status
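
(An extra diagnostic step, not part of the original report, would be to check whether the partition tables and the on-disk ZFS labels actually survive the reboot; device names follow the ones used above, and the p2 partitions apply only to the GPT case:)

# gpart show ad4 ad8
# zdb -l /dev/ad4p2
# zdb -l /dev/ad8p2

If zdb -l cannot print any of the four vdev labels, the label area on disk really was lost or overwritten; if the labels are intact, the problem is more likely in how the devices are probed at boot.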
Comment 1 wfr 2010-09-12 18:13:26 UTC
ZFS v4
ZPool v15
Motherboard MSI K9N-SLI with SATA 3Gb/s through the nVIDIA® nForce 570 SLI chipset
CPU Athlon 6000+ (2 CPUs)
2 GB of RAM


# atacontrol list
==========
ATA channel 0:
Master: ad0 <ST340016A/3.10> ATA/ATAPI revision 5
Slave: ad1 <ST3120022A/3.06> ATA/ATAPI revision 6
ATA channel 2:
Master: ad4 <ST3750330AS/SD1A> SATA revision 1.x
ATA channel 4:
Master: ad8 <ST3750330AS/SD1A> SATA revision 1.x

# cat /boot/loader.conf
===========
zfs_load="YES" # ZFS
zpool_cache_type="/boot/zfs/zpool.cache"
vfs.zfs.zil_disable="1" # !! avoids a conflict between ZFS and NFS, at the risk of data consistency ??
vfs.zfs.prefetch_disable="1" # Prefetch is disabled by default if less than 4 GB of RAM is present
### specific to amd64 with 2 GB of RAM ###
vm.kmem_size="1024M"
vm.kmem_size_max="1024M"
vfs.zfs.arc_max="100M"
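
(Not part of the original report: after a boot, one can confirm that the loader actually applied these tunables with something like the following.)

# sysctl vm.kmem_size vfs.zfs.arc_max vfs.zfs.prefetch_disable
# kenv | grep zpool_cache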


See also PR 150501: same environment, and zpool is misbehaving there too.
Comment 2 wfr 2010-09-13 08:39:47 UTC
My guess is that this has to do with the nVidia 570 SLI chipset's SATA300 and/or the MCP55.

Others with the nVidia 570 SLI chipset are also seeing SATA problems:
	http://forums.freebsd.org/showthread.php?t=4574
	http://forums.freebsd.org/showthread.php?t=5286   (mcp55 ; nVidia 8200)
	http://forums.freebsd.org/showthread.php?t=2915  (mcp55 ; nVidia nForce4)
	(... and others ...)


See also:
	http://www.freebsd.org/cgi/query-pr.cgi?pr=120296
	http://www.freebsd.org/cgi/query-pr.cgi?pr=121396
	http://www.freebsd.org/cgi/query-pr.cgi?pr=150501
	http://www.freebsd.org/cgi/query-pr.cgi?pr=150503


Could you please forward this to the right person?
Thanks a lot,

William



On 12 Sep 2010, at 18:30, freebsd-gnats-submit@FreeBSD.org wrote:

> Thank you very much for your problem report.
> It has the internal identification `kern/150503'.
> The individual assigned to look at your
> report is: freebsd-bugs.
>
> You can access the state of your problem report at any time
> via this link:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=150503
>
>> Category:       kern
>> Responsible:    freebsd-bugs
>> Synopsis:       ZFS disks are UNAVAIL and corrupted after reboot
>> Arrival-Date:   Sun Sep 12 16:30:04 UTC 2010
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2010-09-17 09:11:48 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 4 Martin Matuska freebsd_committer 2011-07-18 12:53:55 UTC
Any news on this issue? Does it also happen with ZFS v28 and latest
8-STABLE / 9-CURRENT?

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk
Comment 5 Guido Falsi freebsd_committer 2012-11-09 12:32:29 UTC
Hi,

I'm seeing something similar on a CURRENT machine.

I start from this system:

FreeBSD ted 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r242126: Fri Oct 26 
13:03:09 CEST 2012     root@ted:/usr/obj/usr/src/sys/TED  amd64

Please note that this machine was already built with clang, and has been
since I reinstalled the OS on it at the start of October.

I update the sources:

gfalsi@ted:/usr/src [0]> svn info
Path: .
Working Copy Root Path: /usr/src
URL: svn://svn.freebsd.org/base/head
Repository Root: svn://svn.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 242822
Node Kind: directory
Schedule: normal
Last Changed Author: gjb
Last Changed Rev: 242816
Last Changed Date: 2012-11-09 05:52:15 +0100 (Fri, 09 Nov 2012)

I run make buildworld, buildkernel, and installkernel, reboot, and get this
(copied by hand) after the kernel starts booting (so the loader seems to be OK):

Trying to mount root from zfs:tank []...
Mounting from zfs:tank failed with error 22

I then reboot the machine using a snapshot from
https://pub.allbsd.org/FreeBSD-snapshots/amd64-amd64/10.0-HEAD-20121006-JPSNAP/
then load zfs.ko and run zpool status, which gives this output:

   pool: tank
  state: UNAVAIL
 status: One or more devices could not be opened.  There are insufficient
         replicas for the pool to continue functioning.
 action: Attach the missing device and online it using 'zpool online'.
    see: http://illumos.org/msg/ZFS-8000-3C
   scan: none requested
 config:

         NAME                      STATE     READ WRITE CKSUM
         tank                      UNAVAIL      0     0     0
           mirror-0                UNAVAIL      0     0     0
             13149740312808713750  UNAVAIL      0     0     0  was /dev/gpt/disk0
             6984386892400701167   UNAVAIL      0     0     0  was /dev/gpt/disk1
           mirror-1                UNAVAIL      0     0     0
             10066834453677312324  UNAVAIL      0     0     0  was /dev/gpt/disk2
             571766486195567663    UNAVAIL      0     0     0  was /dev/gpt/disk3


I fought with this for a little while, then tried:

zpool export tank ; zpool import -f -R /mnt/tank tank

and I can read the pool once more:

   pool: tank
  state: ONLINE
   scan: scrub repaired 0 in 0h13m with 0 errors on Thu Nov  8 16:43:19 2012
config:

         NAME           STATE     READ WRITE CKSUM
         tank           ONLINE       0     0     0
           mirror-0     ONLINE       0     0     0
             gpt/disk0  ONLINE       0     0     0
             gpt/disk1  ONLINE       0     0     0
           mirror-1     ONLINE       0     0     0
             ada2p2     ONLINE       0     0     0
             gpt/disk3  ONLINE       0     0     0

errors: No known data errors
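
(Side note, not in the original comment: importing with -R sets the pool's cachefile property to "none", so an import done this way is not recorded in /boot/zfs/zpool.cache and is not expected to persist across a reboot. This can be checked with:)

# zpool get cachefile tank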


If I reboot now, I get back to the same error and pool status as before.
If I put the old kernel back, the machine reboots correctly.
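
(Not from the original comment: a convenient way to do that one-shot revert, assuming installkernel left the previous kernel in /boot/kernel.old, is nextboot(8).)

# nextboot -k kernel.old
# shutdown -r now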

I'm now going to check out some older revisions of the source to try to
single out the faulty commit.

Is this information useful to you?

This is a test machine, so I can test patches and fixes there. I'd
rather not lose the pool, but if that happens it's not too bad either.
So I'm available for any further tests or experiments.

Thanks in advance!

-- 
Guido Falsi <mad@madpilot.net>
Comment 6 Andriy Gapon freebsd_committer 2012-11-09 15:14:52 UTC
>    pool: tank
>   state: UNAVAIL
>  status: One or more devices could not be opened.  There are insufficient
>          replicas for the pool to continue functioning.
>  action: Attach the missing device and online it using 'zpool online'.
>     see: http://illumos.org/msg/ZFS-8000-3C
>    scan: none requested
>  config:
>
>          NAME                      STATE     READ WRITE CKSUM
>          tank                      UNAVAIL      0     0     0
>            mirror-0                UNAVAIL      0     0     0
>              13149740312808713750  UNAVAIL      0     0     0  was /dev/gpt/disk0
>              6984386892400701167   UNAVAIL      0     0     0  was /dev/gpt/disk1
>            mirror-1                UNAVAIL      0     0     0
>              10066834453677312324  UNAVAIL      0     0     0  was /dev/gpt/disk2
>              571766486195567663    UNAVAIL      0     0     0  was /dev/gpt/disk3


Commenting only on this piece.  After some ZFS changes from about a month ago, you
can get this kind of output if your ZFS userland is older than your kernel ZFS.
If that is the case, then the above message is just a symptom of that discrepancy.

> Trying to mount root from zfs:tank []...
>  Mounting from zfs:tank failed with error 22

22 is EINVAL, not sure how to interpret this failure.
Could be a result of zpool.cache being produced by the older code, but not sure...
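
(For reference, not from the original mail, the errno value can be confirmed against the system headers:)

# grep -w 22 /usr/include/sys/errno.h
#define EINVAL          22              /* Invalid argument */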

-- 
Andriy Gapon
Comment 7 Guido Falsi freebsd_committer 2012-11-09 15:34:54 UTC
On 11/09/12 16:14, Andriy Gapon wrote:
>>    pool: tank
>>   state: UNAVAIL
>>  status: One or more devices could not be opened.  There are insufficient
>>          replicas for the pool to continue functioning.
>>  action: Attach the missing device and online it using 'zpool online'.
>>     see: http://illumos.org/msg/ZFS-8000-3C
>>    scan: none requested
>>  config:
>>
>>          NAME                      STATE     READ WRITE CKSUM
>>          tank                      UNAVAIL      0     0     0
>>            mirror-0                UNAVAIL      0     0     0
>>              13149740312808713750  UNAVAIL      0     0     0  was /dev/gpt/disk0
>>              6984386892400701167   UNAVAIL      0     0     0  was /dev/gpt/disk1
>>            mirror-1                UNAVAIL      0     0     0
>>              10066834453677312324  UNAVAIL      0     0     0  was /dev/gpt/disk2
>>              571766486195567663    UNAVAIL      0     0     0  was /dev/gpt/disk3
>
>
> Commenting only on this piece.  After some ZFS changes from about a month ago, you
> can get this kind of output if your ZFS userland is older than your kernel ZFS.
> If that is the case, then the above message is just a symptom of that discrepancy.
>

I also tried make installkernel ; make installworld ; reboot, but
had the same symptoms. Luckily I was also able to roll back to a previous
ZFS snapshot from the USB key after the export/import trick.
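
(The rollback step itself is not spelled out in the comment; it would look roughly like the following, with a hypothetical snapshot name:)

# zfs rollback tank@pre-update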

>> Trying to mount root from zfs:tank []...
>>   Mounting from zfs:tank failed with error 22
>
> 22 is EINVAL, not sure how to interpret this failure.
> Could be a result of zpool.cache being produced by the older code, but not sure...
>

Hmm, I don't know how to generate a new zpool.cache from a newly updated
system, since I can't export/import the root pool and don't have a newer
system on a USB key. I'll perhaps have to produce one.
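
(One way to produce a fresh cache file, following the pattern of the old ZFS-on-root setup guides; the paths and mountpoint below are assumptions, not taken from this report. Booted from a sufficiently new rescue system:)

# zpool export tank
# zpool import -o altroot=/mnt -o cachefile=/tmp/zpool.cache tank
# cp /tmp/zpool.cache /mnt/boot/zfs/zpool.cache

and then reboot from the pool, so that the loader picks up the refreshed /boot/zfs/zpool.cache.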

-- 
Guido Falsi <mad@madpilot.net>
Comment 8 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 07:58:42 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped
Comment 9 Andriy Gapon freebsd_committer 2019-07-18 11:35:12 UTC
Very old report, lots of ZFS changes since then.