Bug 228174 - [dump] dump(8) can read garbage and loop forever
Summary: [dump] dump(8) can read garbage and loop forever
Status: Closed Works As Intended
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 11.1-STABLE
Hardware: Any
OS: Any
Importance: --- Affects Some People
Assignee: freebsd-stable (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-11 22:21 UTC by Eugene Grosbein
Modified: 2018-05-26 14:30 UTC
CC List: 3 users

See Also:


Description Eugene Grosbein 2018-05-11 22:21:20 UTC
Hi!

I have several FreeBSD boxes with UFS2 and soft-updates enabled (no SU+J), mounted as (ufs, local, soft-updates). They periodically run dump(8) to make level-zero backups once a week and level-one incremental backups on the other days of the week.

The command line is like: dump -$level -C 32 -h 0 -au -f - /home
That is, no snapshots are used and the cache size is large.

Sometimes it finishes just fine. Sometimes its four children start to eat CPU at the maximum and loop forever, making no progress and never finishing until killed manually.

I've rebuilt /sbin/dump with the debugging options -O0 -g and installed the unstripped binary, and when the problem reproduced, I ran sysctl kern.corefile=/var/tmp/%N.%P.core && killall -QUIT dump. So I have five core files with debug info.

I've dug into it a bit and I think I found the problem. Here is what I get for FreeBSD 11.1-STABLE/amd64 r332356 and a file system having 41943040 sectors, 512 bytes per sector, and the standard bsize=32768, fsize=4096:

Core was generated by `dump: /dev/mirror/gm0s1g: pass 4: 60.53% done, finished in 0:03 at Fri May 11 03'.
Program terminated with signal 3, Quit.
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x000000080098bfea in _read () from /lib/libc.so.7
(gdb) bt
#0  0x000000080098bfea in _read () from /lib/libc.so.7
#1  0x0000000000408806 in atomic (func=0x4021f4 <read@plt>, fd=5, buf=0x7ffffffedb44 "", count=4)
    at /usr/local/src/sbin/dump/tape.c:877
#2  0x0000000000407d75 in flushtape () at /usr/local/src/sbin/dump/tape.c:245
#3  0x000000000040814b in dumpblock (blkno=3906360519982919218, size=32768)
    at /usr/local/src/sbin/dump/tape.c:188
#4  0x000000000040cd38 in ufs2_blksout (dp=0x8011a9fc0, blkp=0x7ffffffedc58, frags=679,
    ino=522029, last=1) at /usr/local/src/sbin/dump/traverse.c:704
#5  0x000000000040cfea in dmpindir (dp=0x8011a9fc0, ino=522029, blk=5832576, ind_level=0,
    size=0x7fffffffdd58) at /usr/local/src/sbin/dump/traverse.c:609
#6  0x000000000040c322 in dumpino (dp=0x8011a9fc0, ino=522029)
    at /usr/local/src/sbin/dump/traverse.c:573
#7  0x0000000000404cd1 in main (argc=0, argv=0x7fffffffec60)
    at /usr/local/src/sbin/dump/main.c:579

Note the insanely large blkno in frame 3.
I think the problem is in the dmpindir() function, frame 5.
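
For illustration, here is a quick back-of-the-envelope check against the geometry above. This throwaway program is mine, not part of dump, and it does not matter whether blkno is counted in fragments or in 512-byte sectors; the value is hopelessly out of range either way:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	/* Geometry quoted above: 41943040 sectors of 512 bytes, fsize=4096. */
	int64_t nsectors = 41943040LL;
	int64_t nfrags   = nsectors * 512 / 4096;	/* 5242880 fragments */
	int64_t blkno    = 3906360519982919218LL;	/* blkno from frame 3 */

	printf("sectors in the file system:   %jd\n", (intmax_t)nsectors);
	printf("fragments in the file system: %jd\n", (intmax_t)nfrags);
	printf("blkno from the core file:     %jd\n", (intmax_t)blkno);
	printf("blkno / nsectors:             %jd\n", (intmax_t)(blkno / nsectors));
	return (0);
}

That blkno is roughly 10^11 times larger than the number of sectors on the whole disk, so it cannot be anything but garbage picked up in idblk.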

Note ino=522029. It corresponds to a large text CSV file this box produces periodically; "find /home -inum 522029 -ls" shows its pathname, and I can read its contents to the end with the "less" command just fine.

However:

(gdb) frame 5
#5  0x000000000040cfea in dmpindir (dp=0x8011a9fc0, ino=522029, blk=5832576, ind_level=0,
    size=0x7fffffffdd58) at /usr/local/src/sbin/dump/traverse.c:609
(gdb) l dmpindir
580      * Read indirect blocks, and pass the data blocks to be dumped.
581      */
582     static void
583     dmpindir(union dinode *dp, ino_t ino, ufs2_daddr_t blk, int ind_level,
584             off_t *size)
585     {
586             union {
587                     ufs1_daddr_t ufs1[MAXBSIZE / sizeof(ufs1_daddr_t)];
588                     ufs2_daddr_t ufs2[MAXBSIZE / sizeof(ufs2_daddr_t)];
589             } idblk;
590             int i, cnt, last;
591
592             if (blk != 0)
593                     bread(fsbtodb(sblock, blk), (char *)&idblk,
594                         (int)sblock->fs_bsize);
595             else
596                     memset(&idblk, 0, sblock->fs_bsize);
597             if (ind_level <= 0) {
598                     if (*size > NINDIR(sblock) * sblock->fs_bsize) {
599                             cnt = NINDIR(sblock) * sblock->fs_frag;
(gdb) p blk
$1 = 5832576
(gdb) p idblk.ufs2[0]
$2 = 3329910757767148590

The code reads garbage from the file system via bread() (line 593) and uses it later without any sanity checks.
Comment 1 Kirk McKusick 2018-05-12 05:08:33 UTC
Are you dumping a live filesystem? If so, is it possible that the file causing the problem is being written at the time that dump is trying to dump it? If so, then dump may read it in an inconsistent state (e.g., it has allocated, but not yet initialized, one of its indirect blocks, hence the garbage value seen when the not-yet-initialized block is read). The reason for dumping a snapshot is to ensure that you get a consistent view of the filesystem.
Comment 2 Eugene Grosbein 2018-05-12 08:39:11 UTC
(In reply to Kirk McKusick from comment #1)

Yes, this is a mounted, live file system, and files are being created on it.

Does a UFS2+SU snapshot guarantee that dump won't get uninitialized indirect blocks from the file system if snapshot creation occurs at the same moment?

I still believe dump should not loop forever and should perform some sanity checks on indirect blocks.
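
To show the kind of check I mean, here is a rough, untested sketch written against the dmpindir() listing above (sane_indir_entry is a name I made up; sblock and msg() are dump's own global and helper). dmpindir() could run it over each idblk.ufs2[] entry before handing the block on, instead of trusting whatever bread() returned:

/*
 * Untested sketch: an entry read from an indirect block must be either
 * zero (a hole) or a fragment address that fits inside the file system.
 * Anything else is garbage from an uninitialized or concurrently
 * modified indirect block and should not reach dumpblock().
 */
static int
sane_indir_entry(ufs2_daddr_t blkno)
{

	/* Zero means a hole; fs_size is the fragment count of the fs. */
	if (blkno < 0 || blkno >= sblock->fs_size) {
		msg("ignoring out-of-range block address %jd\n",
		    (intmax_t)blkno);
		return (0);
	}
	return (1);
}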
Comment 3 Kirk McKusick 2018-05-14 04:57:04 UTC
Taking a snapshot is the same as if you unmounted the filesystem and dumped it. When you take a snapshot, the following steps are taken:
1) Any new system calls that want to write to the filesystem are suspended.
2) All system calls currently writing to the filesystem are allowed to finish.
3) All dirty blocks are flushed to the filesystem.
4) A snapshot of the fully flushed filesystem is taken.
5) The suspended system calls are awakened, but none of their changes appear in the snapshot.

The dump of the snapshot therefore gets a consistent view of the filesystem. The date of the snapshot is recorded, so that when you later take a level-1 dump (using another snapshot) you get exactly the changes between the two snapshots.

Putting in error checking for every possible inconsistency is a massive undertaking, and the resulting dumps would still be inconsistent and prone to all sorts of errors when you attempt to restore from them. That is why snapshots were added: they allow consistent and correct dumps to be made on a live system.