(lldb) bt * thread #1, name = 'dump', stop reason = signal SIGSEGV * frame #0: 0x00000000002091f1 dump`flushtape + 1889 frame #1: 0x000000000020c53a dump`dumpmap + 250 frame #2: 0x00000000002068db dump`main + 3803 frame #3: 0x0000000000205095 dump`_start + 149 (lldb) up frame #1: 0x000000000020c53a dump`dumpmap + 250 dump`dumpmap: 0x20c53a <+250>: addl $0x1, %ebx 0x20c53d <+253>: addq $0x400, %r14 ; imm = 0x400 0x20c544 <+260>: cmpl 0x3c2e(%rip), %ebx ; u_spcl + 160 0x20c54a <+266>: jl 0x20c530 ; <+240> (lldb) up frame #2: 0x00000000002068db dump`main + 3803 dump`main: 0x2068db <+3803>: movl $0x3, 0x978f(%rip) 0x2068e5 <+3813>: movq 0x9744(%rip), %rsi ; disk 0x2068ec <+3820>: xorl %ebx, %ebx 0x2068ee <+3822>: movl $0x200aae, %edi ; imm = 0x200AAE (lldb) p sblock (void *) $0 = 0x00000008006a7000 (lldb) x/x sblock 0x8006a7000: 0x00000000 (lldb) 0x8006a7004: 0x00000000 (lldb) 0x8006a7008: 0x00000018 (lldb) 0x8006a700c: 0x00000020 (lldb) 0x8006a7010: 0x00000028 (lldb) 0x8006a7014: 0x000013c0
dump cores only when using a SATA SSD drive, on spinning rust it's fine.
> > + sblock = NULL; > > if ((ret = sbget(diskfd, &sblock, -1)) != 0) { > > switch (ret) { > > case ENOENT: This is all completely bogus as mentioned by @emaste. I know exactly the bug now. Here are the excruciating details. (Well most of it) Setting sblock to NULL is not needed here because of the way sbget works. Heinsenberg was playing here. dump main.c starts up and works out sizes of various things looking at line 454 and 455 We see the following snippet ... maxino = sblock->fs_ipg * sblock->fs_ncg; mapsize = roundup(howmany(maxino, CHAR_BIT), TP_BSIZE); ... Looking at my SSD and using printfs in main.c we see sblock->fs_ipg= 80256 sblock->fs_ncg = 655 maxino=52567680 mapsize = 6571008 inodes are dumped to "tape" from dumpino() line 459 in traverse.c dumpmap is then called line 832 traverse.c On my SSD spcl.c_count is worked out in and this results in a spcl.c_count=6417 Eventually flushtape() line 227 in tape.c is called line 296 of tape.c we have ... if (spcl.c_type != TS_END) { for (i = 0; i < spcl.c_count; i++) if (spcl.c_addr[i] != 0) blks++; } ... spcl.c_count of 6147 is quite a bit out of bounds for spcl.c_addr and it segfaults here. In fact, looking at /usr/include/protocols/dumprestore.h line 117 ... int32_t c_count; /* number of valid c_addr entries */ char c_addr[TP_NINDIR]; /* 1 => data; 0 => hole in inode */ char c_label[LBLSIZE]; /* dump label */ ... Looking at TP_NINDIR comment at line 53 of dumprestore.h ... * TP_NINDIR is the number of indirect pointers in a TS_INODE * or TS_ADDR record. Note that it must be a power of two. */ #define TP_BSIZE 1024 #define NTREC 10 #define HIGHDENSITYTREC 32 #define TP_NINDIR (TP_BSIZE/2) ... Hence c_addr[TP_NINDIR]; is 512 bytes. We overflow this fairly easily. &spcl = 0x2100d8 top = 0x2104d8 &pcl.c_addr[3715]=0x210fff (lldb) x/x 0x210ffe 0x00210ffe: 0x00000000 warning: Not all bytes (2/4) were able to be read from 0x210ffe. Boom. Apparently when the sblock buffer was static, the linked arrangment of sblock_buf[MAXBSIZE] and spcl from dumprestore.h was "just right" to mask this bounds error. This would also explain why malloc'ing a huge area for the superblock made no difference. I don't know how to fix this. You will note that a dump on my spinning rust system will also go out of bounds in c_addr but perhaps the following variables are not needed at this point? e.g. c_label and up. It's also overflowing spcl itself and going into memory it shouldn't. I'd not trust the dump itself at this point. ... sblock->fs_ipg= 80256 sblock->fs_ncg = 319 maxino=25601664 mapsize = 3201024 ... spcl=0x2100d8 top= 0x2104d8 spcl.c_count=3126 spcl.c_addr[3125]=0 blks=266 spcl.c_addr[3125]=0 blks=267 &spcl.c_addr[3125]=0x210db1 ^^ it doesn't segfault here because it doesn't get to a page boundary but it certainly is outside the bounds of spcl. This was fun!
Created attachment 194171 [details] Simple fix for now to abort dump if fs is too large I have a more complex diff that I am working on which will go on phabricator if Kirk (or someone) says I am on the right track.
A commit references this bug: Author: db Date: Mon Jun 11 19:12:51 UTC 2018 New revision: 334968 URL: https://svnweb.freebsd.org/changeset/base/334968 Log: Large file systems with inodes > 512K have been silently overflowing c_addr in spcl. So check before we start dumping otherwise we can end up with a corrupted dump. PR: 228807 Submitted by: db Reviewed by: imp Approved by: imp Changes: head/sbin/dump/main.c
A commit references this bug: Author: imp Date: Mon Jun 11 19:32:36 UTC 2018 New revision: 334969 URL: https://svnweb.freebsd.org/changeset/base/334969 Log: Add asserts to prevent overflows of c_addr. Add some asserts that prevents the overflows of c_addr. This can't happen, absent bugs. However, certain large filesystems can cause problems. These have been prevented by r334968, but a solution is needed. These asserts will help assure that solution is correct. PR: 228807 Reviewed by: db Changes: head/sbin/dump/tape.c head/sbin/dump/traverse.c
*** Bug 228882 has been marked as a duplicate of this bug. ***
Promoting this to the Release 12 meta bug
Is this now fixed?
It's fixed in head. I don't think it's been MFC'd.
(In reply to Warner Losh from comment #9) There is some other change that causes this to be a bug, I am not sure that change is merged to stable/11 either. I can build and do a test to see if this bug is in stable/11, or we can try to bisect head and find the trigger commit that made this a "bug" and make sure we merge both changes. I'll be back home tonight where resources are more plentiful if we need to track this down.
(In reply to Warner Losh from comment #9) My understanding was Kirk was going to look at it when he got back from his trip. dump works as is on 11.1 although it's silently overflowing c_addr like crazy for some people. It's not my call.
(In reply to Diane Bruce from comment #11) How do we know that it is "silently overflowing c_addr" in 11.1 if it is silent? Also if there is a problem in 11.1 we would need to issue an errata to fix it. And if that problem still exists in 11.2 it needs an errata too. I guess when I get home tonight I can start to look at issues in 11/11.1/11.2. Warner, do we know the commit in head that added the other use of c_addr?
(In reply to Warner Losh from comment #9) Thanks Warner - so as far as tracking this for FreeBSD 12 (PR228911) we're done.
Fixed in sbin/dump/tape.c: r334979 | imp | 2018-06-11 13:38:26 -0700 (Mon, 11 Jun 2018) | 10 lines Fix a bug in the counting of blks. We shouldn't count the bytes set in c_addr for TS_CLRI and TS_BITS nodes. Those block overload c_count to communicate how many blocks follow, not now many c_addr spaces are used. Dump would dump core (now) because memory layout moved around and we'd access elements past the end to make a count. Reviewed by: kib@