Bug 192622 - restore corrupting filesystem when using dump levels
Summary: restore corrupting filesystem when using dump levels
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 10.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Many People
Assignee: freebsd-fs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-08-12 20:00 UTC by ahamiltonwright
Modified: 2015-03-15 04:37 UTC

See Also:


Description ahamiltonwright 2014-08-12 20:00:28 UTC
I was attempting to restore my /usr partition today and encountered
some rather terrifying issues using restore.


Some background ...

I have used dump/restore for several years, very happily, to maintain
backups on my machine.

I have a level 0 dump of each file system, and then a cron-based script
that does higher level dumps on a regular basis.  I therefore have dumps
at the following levels for this filesystem at the moment:  0, 2, 3, 5

These were created using snapshots, so the level 0 dump was created via
        dump 0uLCf 32 - /usr
and the higher-level dumps were created similarly.
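
(For reference, the bundled options above expand to the explicit flags shown
below.  The output paths are the ones used in the restore steps later in this
report; the level 2 line is a sketch of what the cron script presumably runs,
not a copy of it.)

        dump -0 -u -L -C 32 -f - /usr > /backup/dumps/current/usr.dump
        dump -2 -u -L -C 32 -f - /usr > /backup/dumps/current/l1d0/l2d0/usr.dump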

My uname info is:
        FreeBSD qemg.org 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #0: Tue Jul  8 06:37:44 UTC 2014     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
I wanted to restore the /usr partition to the state it was in at the last
(level 5) backup.  My expected steps to achieve this are:
        o go to single user (I did this through a full reboot)
        o create a replacement filesystem on the drive (these options
          were constructed by running "dumpfs -m" on the filesystem that
          was present and being replaced):
                newfs -O 2 -U -a 4 -b 32768 -d 32768 -e 4096 -f 4096 \
                                -g 16384 -h 64 -i 8192 -k 0 -m 8 -o time \
                                -s 415236096 /dev/ada0e
        o mount the drive as /usr, and change directory to the mount point
        o restore the level 0 dump (that was created a month ago -- July 11,
          2014, using this same kernel, but probably at an earlier patch level)
                restore ruf /backup/dumps/current/usr.dump
        * this is the first sign of trouble, as restore printed the warning
                expected next file 19266003, got 19100935

        o restore the level 2 dump (created August 1, 2014)
                restore ruf /backup/dumps/current/l1d0/l2d0/usr.dump
        * this failed, indicating that the restore was corrupt; the output
          of restore is as follows (a cross-check of the dump contents is
          sketched after these steps):
                ./src: (inode 10032000) not found on tape
                ./ports: (inode 14205312) not found on tape
                bad entry: removeleaf: not a leaf
                name: ./local/share/licenses/openldap-client-2.4.39
                parent name ./local/share/licenses
                sibling name: ./local/share/licenses/RSTTMP03130448
                next entry name: ./local/share/licenses/openldap-client-2.4.39/OPENLDAP
                entry type: NODE
                inode number: 3613621
                flags: NIL
                abort? [yn] y
                dump core? [yn] n
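
A sanity check that may be useful here: whether the entries reported as
"not found on tape" are actually present in the level 2 dump can be checked
from its table of contents, without writing anything to disk.  A sketch,
using the paths from the steps above:

        restore -tf /backup/dumps/current/l1d0/l2d0/usr.dump ./src ./ports

restore -t lists the named entries only if they occur on the dump, so an
empty listing here would point at the dump file itself rather than at the
restore pass.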


Frankly, this terrifies me.  If dump and restore cannot be trusted
as a robust backup solution, I don't know where to turn.

If it would help to send along the actual dump files, I am happy to do so
(they simply contain the /usr partition data), but as they are many GB in
size, I will wait until they are requested.

I will note that on the questions mailing list, someone brought up the 2011
bug regarding soft updates and snapshots, which may be a contributing factor
here.  The referenced bug is this one:
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=160662

I will certainly switch my dump script to ensure that soft updates are turned
off for the duration of the dump process.
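
A sketch of that script change, under a couple of assumptions: tunefs(8) can
only modify a filesystem that is unmounted or mounted read-only, so the
downgrade step below only succeeds when nothing on /usr is open for writing
(e.g. when run from single-user mode).  The device name is the one from the
newfs command above; the output path is left as a hypothetical variable.

        mount -u -o ro /usr                   # downgrade so tunefs can run
        tunefs -n disable /dev/ada0e          # soft updates off for the dump
        mount -u -o rw /usr
        dump -5 -u -L -C 32 -f - /usr > "$DUMPFILE"  # $DUMPFILE: hypothetical path
        mount -u -o ro /usr
        tunefs -n enable /dev/ada0e           # soft updates back on
        mount -u -o rw /usr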