Bug 256712 - UFS: kernel panic: ffs_blkfree_cg: freeing free frag – in response to pkg-delete(8) soon after login
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-RELEASE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-fs (Nobody)
URL:
Keywords: panic
Depends on:
Blocks:
 
Reported: 2021-06-19 10:39 UTC by Graham Perrin
Modified: 2021-06-22 08:27 UTC
CC: 3 users

See Also:


Attachments
photograph of the panic, including the backtrace (133.66 KB, image/jpeg)
2021-06-19 10:43 UTC, Graham Perrin
Konsole output, including vmcore.0 backtrace (3.11 KB, text/plain)
2021-06-20 17:16 UTC, Graham Perrin

Description Graham Perrin 2021-06-19 10:39:18 UTC
I began with a fresh installation of FreeBSD 13.0 to UFS, then (in a nutshell): 

1. single user mode
2. tunefs -n disable
3. ee /etc/fstab
4. ee /etc/sysctl.conf
5. shutdown -r now
6. an update to the OS
7. shutdown -r now
8. /etc/pkg/FreeBSD.conf set to latest
9. tests for file system resilience. 
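The steps above can be sketched as a shell outline. This is a dry run that only prints the commands (they would modify a live FreeBSD system); the freebsd-update invocation and the tunefs target are my assumptions, since the report names neither.

```shell
# Dry-run sketch of the reported setup: echo each step instead of
# executing it, because these commands alter a live system.
setup_steps() {
  echo 'tunefs -n disable /'              # disable soft updates (target "/" assumed)
  echo 'ee /etc/fstab'                    # add the rw,sync mount options
  echo 'ee /etc/sysctl.conf'              # add the kern.*delay tunables
  echo 'shutdown -r now'                  # reboot
  echo 'freebsd-update fetch install'     # "an update to the OS" (exact command assumed)
  echo 'shutdown -r now'                  # reboot again
}
setup_steps
```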

----

# Device	Mountpoint	FStype	Options	Dump	Pass#
/dev/ada0s1a	/		ufs	rw,sync	1	1
/dev/ada0s1b	none		swap	sw	0	0

----

kern.filedelay=10
kern.dirdelay=9
kern.metadelay=8

hw.acpi.power_button_state="NONE"

----

Tests typically involved pressing and holding the power button whilst installing packages: 

* specifically, timing the press for interruption to occur 
  during or immediately after extraction of devel/gdb

– most, if not all, results were OK. Dozens of tests yesterday and this morning. 

----

For just one test, after-effects were remarkable: 

pkg: sqlite error while executing INSERT INTO files (path, sha256, package_id) VALUES (?1, ?2, ?3) in file pkgdb.c:1825: database disk image is malformed

<https://forums.freebsd.org/threads/80925/> – the broken local.sqlite was no surprise, given the nature of the tests, and honestly, I'm not certain that the breakage was a _direct_ consequence. I mention this incident solely as a point of reference; unless advised otherwise, I'll treat it as negligible. 

======================================================================

For a next round, this morning (with the file system reportedly clean) I _enabled_ soft updates then tested in much the same way – observing automated file system checks and repairs at startup time. 

Shortly after a login, I ran: 

pkg delete -fy python38 gdb

The system crashed. I awaited completion of the dump, suppressed automatic reboot and took a photograph of the backtrace. 

The round of testing that led to this situation was very brief – maybe only two or three tests (interruptions) – so I hope to make things reproducible with not too much difficulty.
Comment 1 Graham Perrin 2021-06-19 10:43:00 UTC
Created attachment 225927 [details]
photograph of the panic, including the backtrace
Comment 2 Graham Perrin 2021-06-19 12:32:36 UTC
root@mowa219-gjp4-ev631-freebsd-13:/var/crash # dumpfs -m /

# newfs command for / (/dev/ada0s1a)
newfs -O 2 -a 32 -b 32768 -d 32768 -e 4096 -f 4096 -g 16384 -h 64 -i 8192 -j -k 6400 -m 8 -o time -s 616562688 /dev/ada0s1a 
root@mowa219-gjp4-ev631-freebsd-13:/var/crash # freebsd-version -kru
13.0-RELEASE-p1
13.0-RELEASE-p1
13.0-RELEASE-p2
root@mowa219-gjp4-ev631-freebsd-13:/var/crash # date ; ls -hl /.sujournal 
Sat Jun 19 10:24:22 BST 2021
-r--------  1 root  wheel    32M Jun 17 12:12 /.sujournal
root@mowa219-gjp4-ev631-freebsd-13:/var/crash # cat info.0
Dump header from device: /dev/ada0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 335659008
  Blocksize: 512
  Compression: none
  Dumptime: 2021-06-19 08:48:06 +0100
  Hostname: mowa219-gjp4-ev631-freebsd-13
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 13.0-RELEASE-p1 #0: Wed May 26 22:15:09 UTC 2021
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
  Panic String: ffs_blkfree_cg: freeing free frag
  Dump Parity: 1107028828
  Bounds: 0
  Dump Status: good
Comment 3 Kirk McKusick freebsd_committer 2021-06-19 20:52:15 UTC
(In reply to Graham Perrin from comment #0)
Thanks for your report. Always great to have people doing hard failure testing.

When you say you enabled soft updates, did you do so with or without journaling?

When you say you were using `automated filesystem checks' were the runs of fsck being done before booting multi-user or were they being done in background after the system started running?
Comment 4 Graham Perrin 2021-06-20 10:26:00 UTC
Thanks for responding so soon, quite unexpected, especially at a weekend. 

(In reply to Kirk McKusick from comment #3)

> … When you say you enabled soft updates, did you do so 
> with or without journaling?

I never actively disabled journaling, so I imagined that it would be (by default) enabled – and effective – after re-enabling soft updates. 

Re: comment #2 I was struck by the date of modification of /.sujournal

Having now viewed the metadata with Dolphin, it's clear to me that the file was created (2021-06-17 12:12:49) but never modified. 

In retrospect, I guess that journaling was nominally enabled 
– note the '-j' in the output of dumpfs -m / 
– but not effective. Correct me, please, if I'm wrong.
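For anyone wanting to repeat this cross-check, here is a sketch of the commands involved, printed as a dry run rather than executed. The device and path are the ones from comment #2; whether tunefs -p is the most convenient way to read the flags is my assumption.

```shell
# Dry-run sketch: commands to cross-check whether SU+J journaling
# is actually active. Printed, not executed, so it is safe anywhere.
journal_checks() {
  echo 'tunefs -p /dev/ada0s1a'                # print current tunables, incl. journaling
  echo 'dumpfs /dev/ada0s1a | grep -i flags'   # superblock flags, e.g. soft-updates+journal
  echo 'stat -f "%Sm %N" /.sujournal'          # FreeBSD stat: does the journal mtime advance?
}
journal_checks
```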
Comment 5 Graham Perrin 2021-06-20 10:26:51 UTC
(In reply to Kirk McKusick from comment #3)

> When you say you were using `automated filesystem checks' 
> were the runs of fsck being done before booting multi-user or 
> were they being done in background after the system started running?

Here's where I admit to not properly understanding some of what's under <https://docs.freebsd.org/en/books/handbook/config/#soft-updates> and (unless I'm missing something) the history there touches upon journaling _but not_ journaled soft updates. 

The simplest answer to the question might be that (for tests of this nature) I would never manually exit to multi-user mode without seeing an explicit message confirming a mark of cleanliness. 

If there's a mark of cleanliness for a particular file system 
– in this case, / 
– and if the mark results from an automated fsck that precedes automated progress to multi-user mode, will a background check still occur for this file system regardless of the mark?
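For reference, the rc.conf(5) knobs that decide whether a background check runs are roughly these (a sketch of the stock defaults as I understand them; check /etc/defaults/rc.conf and rc.conf(5) on the system itself):

```shell
# /etc/rc.conf fragment (values shown are the documented defaults)
background_fsck="YES"       # allow fsck to run in background after going multi-user
background_fsck_delay="60"  # seconds to wait after boot before starting it
fsck_y_enable="NO"          # if YES, rerun a failed preen pass with fsck -y
```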

Here's where I take particular interest in comments at and under bug 255799 comment 2. 

Other questions are bubbling under, but for now I'll can them … and I feel that I haven't properly answered your question (because I'm not certain of the answer). Sorry.
Comment 6 Graham Perrin 2021-06-20 10:34:04 UTC
(In reply to comment #2)

> … Dump Status: good

# gdb file /var/crash/vmcore.0
…
"/var/crash/vmcore.0" is not a core dump; file format not recognized
(gdb) q

From this, I assume that whilst there was true goodness when the status was determined, the file that was then written by savecore(8) became a casualty of file system inconsistency, despite a mark of cleanliness.
Comment 7 andrew 2021-06-20 12:40:18 UTC
I've been trying to help Graham reproduce this one (per discussions on the libera #freebsd irc channel) but so far what I seem to have managed is to produce a different error:

** SU+J Recovering /dev/vtbd1
** Reading 4194304 byte journal from inode 4.
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
/dev/vtbd1: ino_remref: ino 2767 mode 0100000 about to go negative
/dev/vtbd1: UNEXPECTED SU+J INCONSISTENCY
/dev/vtbd1: INTERNAL ERROR: GOT TO reply()
/dev/vtbd1: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.

I've managed to reproduce this one fairly reliably (3 times out of 4 tries) and I have a copy of a filesystem image taken before the fsck. The steps I used to reproduce were (this is in a 13-stable install in a bhyve vm, where vtbd0 is the system image and vtbd1 is a 256MB md device on the host):

newfs -O2 -U -j /dev/vtbd1
mount /dev/vtbd1 /mnt
tar -xf python38-3.8.10.tgz -C /mnt
find /mnt/usr -type f -print0 | xargs -0 rm
tar -xf python38-3.8.10.tgz -C /mnt
(when the prompt returns at this point, immediately destroy the vm)

On booting the vm again and then running fsck -p /dev/vtbd1, the above error appears. (vtbd1 is not mentioned in fstab.)
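Collected into one place, the reproduction might look like this as a script. It is a dry-run sketch under the same assumptions (a bhyve guest where vtbd1 is a 256 MB md-backed disk, with the python38 tarball already fetched); it only prints the commands.

```shell
# Dry-run sketch of the in-guest reproduction from this comment.
repro_steps() {
  echo 'newfs -O2 -U -j /dev/vtbd1'                   # fresh SU+J filesystem
  echo 'mount /dev/vtbd1 /mnt'
  echo 'tar -xf python38-3.8.10.tgz -C /mnt'          # populate
  echo 'find /mnt/usr -type f -print0 | xargs -0 rm'  # delete the files
  echo 'tar -xf python38-3.8.10.tgz -C /mnt'          # repopulate
  echo '# destroy the VM the moment the prompt returns'
  echo 'fsck -p /dev/vtbd1'                           # after rebooting the guest
}
repro_steps
```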

Should I split this off into a separate bug? (I will continue to try to reproduce the original issue)
Comment 8 andrew 2021-06-20 13:19:42 UTC
Incidentally, background fsck (for SU without journalling) seems to have been broken for 6 months: commit 5cc52631b3b88 breaks it by trying to open the device for writing when invoked with -B -p, as rc.d/bgfsck does. The result is a "NO WRITE ACCESS" error in the log, and no fsck happens.
Comment 9 Gary Jennejohn 2021-06-20 13:32:41 UTC
(In reply to Graham Perrin from comment #6)
This is a kernel crash dump, not a core dump from a normal process.  You probably have to use kgdb to analyze it.
Comment 10 Graham Perrin 2021-06-20 15:34:42 UTC
Thanks, 

(In reply to Gary Jennejohn from comment #9)

Oops! I'm suitably embarrassed (probably never needed kgdb before today). 

----

# date ; uptime
Sun Jun 20 16:08:08 BST 2021
4:08PM up 24 mins, 5 users, load averages: 0.33, 0.17, 0.10
# kgdb file /var/crash/vmcore.0
GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
…
Reading symbols from file...
Reading symbols from /usr/lib/debug//usr/bin/file.debug...
/wrkdirs/usr/ports/devel/gdb/work-py38/gdb-10.2/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it. For instructions, see:
<https://www.gnu.org/software/gdb/bugs/>.

/wrkdirs/usr/ports/devel/gdb/work-py38/gdb-10.2/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) y
Abort trap (core dumped)
# pwd
/root
# ls -hl *.core
-rw------- 1 root wheel 354M Jun 20 16:09 kgdb102.core
# gdb file kgdb102.core
…

----

No symbols in the backtrace for kgdb102.core (and to keep things here reasonably concise, I'm taking advice elsewhere).
Comment 11 Graham Perrin 2021-06-20 17:16:42 UTC
Created attachment 225951 [details]
Konsole output, including vmcore.0 backtrace
Comment 12 Gary Jennejohn 2021-06-21 09:24:08 UTC
(In reply to Graham Perrin from comment #11)
This reminds me of a kernel panic which I saw and reported on.
I was asked to add INVARIANTS and DEBUG=-O0 to my kernel config file to get better backtraces.  Since I already had makeoptions DEBUG=-g, I actually added makeoptions DEBUG+=-O0 to get both -g and -O0 in the compiler flags.
As one can see in your backtrace, parts of the kernel were optimized out.  The -O0 ensures that no optimization is done, which will produce a better backtrace.
For INVARIANTS I added options INVARIANTS and options INVARIANT_SUPPORT.
So, if you can easily reproduce the panic, I recommend adding these to your kernel config file and then rerunning your test.
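As a concrete fragment, the kernel-config additions described above would look something like this (a sketch based on this comment; syntax per config(5), added to a GENERIC-derived config):

```
makeoptions	DEBUG+=-O0
options		INVARIANTS
options		INVARIANT_SUPPORT
```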
Comment 13 Kirk McKusick freebsd_committer 2021-06-21 12:04:54 UTC
(In reply to Graham Perrin from comment #4)
Thanks for these details. I have been able to reproduce the odd result where the reenabling of soft updates does not appear to reenable the journaling. I need some time to track down what is actually happening.
Comment 14 Kirk McKusick freebsd_committer 2021-06-21 12:15:52 UTC
(In reply to Graham Perrin from comment #5)
It appears that the default is to run `fsck -p' during startup and not kick it off in the background, which is good because it eliminates all the issues of bugs in the background checks. What I would like to see is the output of those checks, which appears in the console output as the system starts up.
Comment 15 andrew 2021-06-21 12:16:43 UTC
I put the background fsck thing up as a separate bug #256746
Comment 16 Kirk McKusick freebsd_committer 2021-06-21 12:29:02 UTC
(In reply to andrew from comment #7)
There are three cases here:
1) failure to properly clean up the filesystem when using journal
2) failure to properly clean up when running in background
3) failure to properly clean up when running synchronous fsck -p during startup.

I would like to analyse each of these in turn so as to determine which method(s) are having problems. First I want to test (3); if it fails, then the bug is in fsck_ffs itself. If the problem fails to manifest in (3), then next is to test (1); if it fails, then the bug is in the journaling subsystem. Finally, test (2); if it fails, then the bug is in the background checking. So, to make forward progress, it would be useful to test in each of the above ways in the order (3), (1), then (2).
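A sketch of how each configuration might be established with tunefs(8), printed as a dry run. The flag spellings are from the manual; the device name and the exact sequencing relative to the filesystem's current state are my assumptions.

```shell
# Dry-run sketch: tunefs settings for the three test configurations.
case_setup() {
  case "$1" in
    1) echo 'tunefs -n enable -j enable /dev/vtbd1' ;;    # SU with journaling
    2) echo 'tunefs -n enable -j disable /dev/vtbd1; sysrc background_fsck=YES' ;;  # SU, background check
    3) echo 'tunefs -n enable -j disable /dev/vtbd1' ;;   # SU only, fsck -p at startup
  esac
}
case_setup 3
```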
Comment 17 Kirk McKusick freebsd_committer 2021-06-21 12:33:55 UTC
At this point getting more detail on the kernel crash is not important. It is an expected panic that comes about because the system is running with a corrupted filesystem. Separately we are working to have a better recovery mechanism when these corruptions are detected (forcible downgrade of the filesystem to read-only rather than panic'ing). But the focus of this bug is figuring out why the filesystem checks are failing to detect and fix the corruption.
Comment 18 andrew 2021-06-21 12:41:00 UTC
(In reply to Kirk McKusick from comment #16)

To clarify, for your case (3), are you interested in the case of synchronous fsck -p when journalling is NOT enabled (but soft updates are)?
Comment 19 Kirk McKusick freebsd_committer 2021-06-21 12:45:51 UTC
(In reply to andrew from comment #18)
Yes, case (3) is soft updates without journaling and running fsck -p during system startup (not in background).
If that still leads to kernel crashes, then a case (4) is running without soft updates at all and running fsck -p during system startup (not in background).
Comment 20 Graham Perrin 2021-06-22 06:13:13 UTC
Quoting from comment #0

> # Device	Mountpoint	FStype	Options	Dump	Pass#
> /dev/ada0s1a	/		ufs	rw,sync	1	1
> /dev/ada0s1b	none		swap	sw	0	0

– that is, mount(8) option 'sync'; 

>> All I/O to the file system should be done synchronously.

Kirk, please: 

* for this bug 256712, should all tests adhere to the sync criterion?

----

Similarly, 

> kern.filedelay=10
> kern.dirdelay=9
> kern.metadelay=8

* for this bug 256712, should all tests adhere to those criteria?
Comment 21 Kirk McKusick freebsd_committer 2021-06-22 06:29:14 UTC
(In reply to Graham Perrin from comment #20)
I had missed that the filesystem had been mounted with the 'sync' option. I will have to check, but off the top of my head that will probably disable soft updates. If so, it is very surprising that there can be any failures at all. I will investigate and let you know. It will be next week before I am likely to have time to do the investigation, though.

As for the other timings, they are irrelevant if soft updates are disabled.
Comment 22 Graham Perrin 2021-06-22 06:37:41 UTC
Earlier this month, I had a quite different test case with these criteria: 

a) soft updates disabled

b) sync _not_ set in /etc/fstab

Following interruption whilst installing a package, or packages, the base system was broken in that things such as /bin/sh and /sbin/fsck were unusable in single user mode (and so, no automated check). Whilst subsequent repairs yielded a file system that was marked clean, the base system remained essentially unusable. 

Please: should I open a separate bug for this? Or is breakage a (rare but explicable) possibility with the combination of those two criteria?

----

man 8 mount

> …
> noasync
>   Metadata I/O should be done synchronously, while data I/O
>   should be done asynchronously.  This is the default.
> …

<https://docs.freebsd.org/en/books/handbook/config/#_more_details_about_soft_updates> mentions "… no guarantee for a consistent state of the file system. If there is a failure during an operation that updated large amounts of meta-data, …" 

* in the context of    async
* but not in the context of    noasync
Comment 23 Graham Perrin 2021-06-22 07:01:37 UTC
(In reply to Kirk McKusick from comment #21)

Thank you, no rush. 

(I sometimes need to switch off, mentally, for weeks, between test sessions. So, apologies in advance if I'm unusually slow to respond to anything.)

----

Re: the comment #0 case, I kicked myself for not taking an image of the entire hard disk at an appropriate time. I did pause for thought and discussion in IRC, then decided (for myself) that there probably was no bug … began allowing repairs and other changes to the file system … eventually realised that I had made a poor decision.