Bug 245907 - fsck_ufs segfaults with gjournal (SU+J)
Summary: fsck_ufs segfaults with gjournal (SU+J)
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: bin
Version: 12.1-RELEASE
Hardware: amd64 Any
Importance: --- Affects Only Me
Assignee: freebsd-fs (Nobody)
Reported: 2020-04-25 10:57 UTC by crypt47
Modified: 2020-10-15 10:09 UTC
Description crypt47 2020-04-25 10:57:24 UTC
I use a combination of UFS (SU+J) and UFS+gjournal volumes:

/dev/ada0p2 on / (ufs, local, noatime, journaled soft-updates)
/dev/label/home on /usr/home (ufs, local, gjournal)


When I hard-reset the VM, fsck successfully checks the SU+J file system, then fails while checking the second (gjournal) file system, and init drops to a shell. On manual examination afterwards, the gjournal file system already has a clean status. Below is the boot log, where I run fsck under a debugger.

Trying to mount root from ufs:/dev/ada0p2 [rw,noatime]...
uhub1: 8 ports with 8 removable, self powered
Enter passphrase for ada1p1: uhub0: 8 ports with 8 removable, self powered
GEOM_ELI: Device ada1p1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI:     Crypto: software
GEOM_JOURNAL: Journal 1212489781: ada1p1.eli contains data.
GEOM_JOURNAL: Journal 1212489781: ada1p1.eli contains journal.
WARNING: / was not properly dismounted
WARNING: /: mount pending error: blocks 32 files 0
GEOM_JOURNAL: Journal ada1p1.eli consistent.
Setting hostuuid: f145b4bd-15db-4390-8ee5-f8675e40513f.
Setting hostid: 0x72fcbc5f.
Starting file system checks:
background with debugger
GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.1".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /sbin/fsck...
(No debugging symbols found in /sbin/fsck)
Starting program: /sbin/fsck -F -p
[Detaching after vfork from child process 53]
[Detaching after vfork from child process 54]
** SU+J Recovering /dev/ada0p2
** Reading 33554432 byte journal from inode 4.
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
** 6 journal records in 1024 bytes for 18.75% utilization
** Freed 1 inodes (0 dirs) 0 blocks, and 0 frags.

***** FILE SYSTEM MARKED CLEAN *****
[Detaching after vfork from child process 55]
[Detaching after vfork from child process 56]
pid 56 (fsck_ufs), jid 0, uid 0: exited on signal 11
fsck: /dev/label/home: Segmentation fault
[Inferior 1 (process 52) exited with code 01]
(gdb) bt
No stack.
(gdb) quit
Mounting local filesystems:.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/compat/pkg
32-bit compatibility ldconfig path: /usr/lib32
Setting hostname: bewitched.

Please feel free to request more information.
Comment 1 Conrad Meyer freebsd_committer 2020-04-25 14:12:02 UTC
Thanks for the report.

If possible, can you first save a copy of the offending volumes (/dev/label/home and corresponding gjournal), then try running 'gdb91 fsck_ufs -d -p /dev/label/home'?  If the filesystem is now "clean" and the problem can't be readily reproduced, I'm not sure how much we can do.
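One way to "save a copy" of a volume is to image it with dd while it is not mounted. The following is only a minimal sketch under stated assumptions: `save_volume_copy` is a hypothetical helper (not a FreeBSD tool), the destination path is an assumed example, and the destination must have enough free space to hold the whole device.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): copy a device or file to an
# image file, then verify that the copy is byte-identical to the source.
save_volume_copy() {
    src=$1
    dst=$2
    # A numeric block size (1 MiB) avoids suffix differences between dd versions.
    dd if="$src" of="$dst" bs=1048576 2>/dev/null || return 1
    # cmp exits non-zero if the two files differ.
    cmp -s "$src" "$dst"
}

# Example (assumed paths; run only while the file system is not mounted):
#   save_volume_copy /dev/label/home /var/tmp/home.img
```

In the boot log above, ada1p1.eli is reported as containing both data and journal. If the journal lived on a separate provider, that provider would need its own copy as well.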
Comment 2 crypt47 2020-04-26 08:04:41 UTC
(In reply to Conrad Meyer from comment #1)

Hello Conrad. The problem is 100% reproducible, but could you please explain what you mean by 'save a copy'? Do you want me to make a copy of the device file?

Also, gdb doesn't run exactly this way, but I hope I understood you correctly. Here is the execution of fsck_ufs with the parameters you asked for:

WARNING: / was not properly dismounted
WARNING: /: mount pending error: blocks 0 files 1
GEOM_JOURNAL: Journal ada1p1.eli consistent.
Setting hostuuid: f145b4bd-15db-4390-8ee5-f8675e40513f.
Setting hostid: 0x72fcbc5f.
Starting file system checks:
background with debugger
GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.1".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /sbin/fsck_ufs...
(No debugging symbols found in /sbin/fsck_ufs)
Starting program: /sbin/fsck_ufs -d -p /dev/label/home
/dev/label/home: starting

Program received signal SIGSEGV, Segmentation fault.
0x00000008003917da in free () from /lib/libc.so.7
(gdb) bt
#0  0x00000008003917da in free () from /lib/libc.so.7
#1  0x000000000021b6f2 in ?? ()
#2  0x000000000021b651 in ?? ()
#3  0x000000000020f665 in ?? ()
#4  0x000000000020810f in ?? ()
#5  0x0000000800246000 in ?? ()
#6  0x0000000000000000 in ?? ()
(gdb) quit
A debugging session is active.

	Inferior 1 [process 52] will be killed.

Quit anyway? (y or n) y
mount: /dev/ada0p2: R/W mount of / denied. Filesystem is not clean - run fsck. Forced mount will invalidate journal contents: Operation not permitted
Mounting root filesystem rw failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2020-04-26T21:58:59.255639+07:00  init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh: # fsck -p
** SU+J Recovering /dev/ada0p2
** Reading 33554432 byte journal from inode 4.
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
** 6 journal records in 1024 bytes for 18.75% utilization
** Freed 2 inodes (0 dirs) 0 blocks, and 0 frags.

***** FILE SYSTEM MARKED CLEAN *****
/dev/label/home: FILE SYSTEM CLEAN; SKIPPING CHECKS
# ^DSetting hostuuid: f145b4bd-15db-4390-8ee5-f8675e40513f.
Setting hostid: 0x72fcbc5f.
Fast boot: skipping disk checks.
Mounting local filesystems:.
Comment 3 crypt47 2020-04-26 08:42:44 UTC
Here I've installed debugging symbols and changed the command to fsck_ffs.

Starting program: /sbin/fsck_ffs -d -p /dev/label/home
warning: the debug information found in "/usr/lib/debug//lib/libc.so.7.debug" does not match "/lib/libc.so.7" (CRC mismatch).

/dev/label/home: starting

Program received signal SIGSEGV, Segmentation fault.
0x00000008003917da in free () from /lib/libc.so.7
#0  0x00000008003917da in free () from /lib/libc.so.7
#1  0x000000000021b6f2 in closedisk ()
    at /usr/src/sbin/fsck_ffs/gjournal.c:250
#2  0x000000000021b651 in gjournal_check (filesys=<optimized out>)
    at /usr/src/sbin/fsck_ffs/gjournal.c:504
#3  0x000000000020f665 in checkfilesys (
    filesys=0x7fffffffef15 "/dev/label/home")
    at /usr/src/sbin/fsck_ffs/main.c:307
#4  main (argc=1, argv=0x7fffffffed08) at /usr/src/sbin/fsck_ffs/main.c:205
A debugging session is active.

	Inferior 1 [process 52] will be killed.

Quit anyway? (y or n) y
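A backtrace like the one above can also be captured without interactive prompts by running GDB in batch mode, which disables pagination and confirmation. This is a sketch, not a required procedure: `capture_backtrace` is a hypothetical wrapper, and the debugger name is passed in as an argument because the binary may be installed as `gdb` or `gdb91` depending on the port.

```shell
#!/bin/sh
# Hypothetical wrapper: run a program under GDB non-interactively and print
# a full backtrace (including local variables) if it stops on a signal.
# $1 is the debugger binary; the remaining arguments are the program and
# its own arguments.
capture_backtrace() {
    debugger=$1
    shift
    # -batch : exit after the commands, no paging or confirmation prompts
    # -ex    : execute a GDB command ("run", then "bt full")
    # --args : treat everything that follows as the program and its arguments
    "$debugger" -batch -ex run -ex 'bt full' --args "$@"
}

# Example (assumed invocation, matching the session above):
#   capture_backtrace gdb91 /sbin/fsck_ffs -d -p /dev/label/home
```

Whether the locals in the closedisk() frame are readable depends on how the binary was built; some may still show as optimized out.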
Comment 4 crypt47 2020-04-30 06:09:40 UTC
Hello Conrad,

Is this information sufficient?

Thanks.