Bug 208691 - "panic: ffs_valloc: dup alloc" as soon as UFS root partition is mounted
Status: Closed Not A Bug
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 10.2-STABLE
Hardware: Any Any
Importance: --- Affects Some People
Assignee: freebsd-fs (Nobody)
URL:
Keywords: crash
Depends on:
Blocks:
 
Reported: 2016-04-11 07:37 UTC by Martin Guy
Modified: 2020-06-24 05:43 UTC (History)
CC List: 4 users

See Also:


Attachments
panic: ffs_valloc: dup alloc (29.84 KB, image/png)
2020-06-21 02:17 UTC, Kubilay Kocak

Description Martin Guy 2016-04-11 07:37:08 UTC
On a system that has been running STABLE for a week, with a STABLE kernel updated from svn, I left it on overnight, doing nothing, and woke up this morning to find it rebooting continually:
-----------------------
** Resolving unreferenced inode list.
** Processing journal entries.

***** FILE SYSTEM MARKED CLEAN *****
Mounting local file systems:.
mode = 0100600, inum = 6742034, fs=/
panic: ffs_valloc: dup alloc
cpuid = 1
KDB: stack backtrace:
#0 0xc0b7c3c2 at kdb_backtrace+0x52
#1 0xc0b3c72b at vpanic+0x11b
#2 0xc0b3c60b at panic+0x1b
#3 0xc0d94221 at ffs_valloc+0x961
#4 0xc0ddbec3 at ufs_makeinode+0x73
#5 0xc0dd8110 at ufs_create+0x30
#6 0xc108ecd5 at VOP_CREATE_APV+0x95
#7 0xc0bfa316 at vn_open_cred+0x2d6
#8 0xc0bfa02d at vn_open+0x3d
#9 0xc0bf1fe0 at kern_openat+0x310
#10 0xc0bf1cba at sys_openat+0x3a
#11 0xc1067359 at syscall+0x5c9
#12 0xc10510bf at Xint0x80_syscall+0x2f
Uptime: 12s
Physical memory: 990 MB
Dumping 63 MB: 48 32 16
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
-----------------------

This is using the i386 port running 10.2 STABLE with a GENERIC kernel recompiled from updated svn. It has a single Hitachi SATA hard disk using MBR default partitioning with everything on one slice.
Device         Start       End   Sectors  Size Type
/dev/sdg1         34      1057      1024  512K FreeBSD boot
/dev/sdg2       1058 111149089 111148032   53G FreeBSD UFS
/dev/sdg3  111149090 117010465   5861376  2.8G FreeBSD swap

I don't know if it's relevant, but one thing I changed yesterday was to update the kernel (while chasing a different panic!) from rX to rY and rebuild it with buildkernel and installkernel. However, the symptom also happens when I boot kernel.old instead.
Unfortunately I can't tell you the kernel revision numbers because I can't stop the boot sequence from scrolling, but the update was from the STABLE svn version from about a week ago to yesterday's (10 April 2016), bringing 6 to 10 new commits.
The kernel config is GENERIC with no modifications to code or config except for installing ndiswrapper around bwnwl5.* to make bwnwl5_sys.ko for the BCM4311 wireless card (which also didn't work!).

The laptop tests 100% fine with memtest and has never given any signs of hardware flakiness. The hard disk is a 60GB model from a few years ago but has never given any signs of failure, and a read test on a Linux box using dd reads the whole partition without problems.

Running from live CD (10.2-RELEASE #0 r286666), mount says:
/dev/ada0p2: R/W mount of / denied. Filesystem is not clean - run fsck. ...

"fsck /dev/ada0p2" says:
-----------------------------------
USE JOURNAL? [yn] y

** SU+J Recovering /dev/ada0p2
Reading 33554432 byte journal from inode 4

RECOVER? [yn] y

** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.

***** FILE SYSTEM MARKED CLEAN *****
------------------------------------
But booting it from the HD after this still panics; the only difference is that it says "FILE SYSTEM CLEAN" before trying to mount the partition and panicking.
Comment 1 Kirk McKusick freebsd_committer 2016-04-11 22:36:32 UTC
This is a known problem with journaled soft updates. It only fixes things that are in its log. Most disks are run with write cache enabled which means that they lie about completing I/O operations. Specifically they report that an I/O has been made to stable store when in fact it is only in the RAM-cache on the disk. If power to the disk is lost before the cache is flushed, the write is lost, but journaled soft updates believes it to have been done so does not check for the error. Thus it marks the disk clean when in fact it is not clean. To resolve this problem, you must bring the machine up single user and run a full fsck on it using ``fsck -f -y /filesystem_in_question''. This will find the hidden problem and correct it.
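A minimal sketch of the forced full check described in this comment, meant to be run from single-user mode or a live CD. The `DRY_RUN` variable and the `full_fsck` function are assumptions of this sketch, not part of fsck(8); the default device is the one from the report. With `DRY_RUN` left at its default the command is only printed, so nothing touches the disk until it is explicitly switched off.

```sh
#!/bin/sh
# Sketch only: force a full check that bypasses the journal replay.
# DRY_RUN and full_fsck are names invented for this example.
DRY_RUN="${DRY_RUN:-1}"

full_fsck() {
    # $1: device or mount point to check, e.g. /dev/ada0p2
    if [ "$DRY_RUN" = 1 ]; then
        # Show what would be run without touching the disk.
        echo "fsck -f -y $1"
    else
        # -f forces the check even if the file system is marked clean;
        # -y answers "yes" to every repair question.
        fsck -f -y "$1"
    fi
}

full_fsck "${1:-/dev/ada0p2}"
```

Saved under a hypothetical name such as full-fsck.sh, it would be run for real with `DRY_RUN=0 sh full-fsck.sh /dev/ada0p2`.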
Comment 2 Martin Guy 2016-04-12 13:05:50 UTC
Thanks Kirk, that made the system boot again. Unfortunately the interesting parts of the fsck output scrolled off the screen before I could copy them, so I don't know what the filesystem defect was that provoked this panic.

However, if the same thing happens to a distant server with only a console, it would panic while mounting / to go into single-user mode too, making the system unrecoverable.

Peter H wrote to say:
> the first (journaled) fsck just replays the journal; it does not
> check the file system. That is why journaling fsck is so fast.

So another way of seeing the problem is that, if "fsck -p" fails, it tells the user to run "fsck -y" which says "yes" to fsck's initial "USE JOURNAL?" question and this results in a faster check that doesn't fix the kernel-panicking inconsistency. It needs -fy to get round the "USE JOURNAL?" option.

Another solution could be to always do a full check without the journal when a fast filesystem is not clean, instead of "-p"? I don't see a combination of fsck flags to do that. -fyC looks like the closest but needs changing to make -C override -f and skip the check if the filesystem is clean.

Or turning write-behind off by default until a better fix is found?
Comment 3 Kirk McKusick freebsd_committer 2016-04-12 19:19:53 UTC
To your question of recovering when you have a corrupted root filesystem, that is not a problem because the kernel always comes up with the root filesystem read-only. The panics that you can get will only happen when attempting to write the filesystem. So if you are single-user, you will have the opportunity to fix the filesystem while it is mounted read-only. It only switches to read-write as it exits single-user and goes to multi-user.

The long-term fix is to note in the superblock when the filesystem has panic'ed and to force a full fsck when finding this flag set. This is part of a bigger project that reduces the number of panics in the filesystem that should be coming in over the next year.

As a general rule, you should run with write cache disabled if you want full recoverability after a crash.
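One way to follow this advice on disks attached through the ada(4) driver is the `kern.cam.ada.write_cache` tunable. That name comes from ada(4), not from this report, and the sketch assumes the disk really is an ada device (it is ada0 when booted from the live CD above). The `loader_conf_line` helper is invented for the example. The script changes nothing: it prints the current value, if any, and the line that could be added to /boot/loader.conf.

```sh
#!/bin/sh
# Sketch only: inspect the ada(4) write-cache setting and print the
# loader.conf(5) line that would disable the cache at the next boot.
# WC_TUNABLE and loader_conf_line are names invented for this example.
WC_TUNABLE="kern.cam.ada.write_cache"

loader_conf_line() {
    # $1: 0 = write cache off, 1 = on, -1 = leave the drive's own default
    printf '%s="%s"\n' "$WC_TUNABLE" "$1"
}

# Print the current value if sysctl(8) knows the tunable.
sysctl "$WC_TUNABLE" 2>/dev/null || echo "$WC_TUNABLE not available on this system"

echo "Line to add to /boot/loader.conf:"
loader_conf_line 0
```

Disabling the cache trades write performance for recoverability, so it is worth measuring on the drive in question before making it permanent.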
Comment 4 Kirk McKusick freebsd_committer 2016-05-07 23:16:49 UTC
I am closing this bug as it is triggered by disks running in an unreliable mode. As such there is no change called for in the software, though one can legitimately argue that the default should be to configure disks to run with `write cache disabled' so that they will be reliable. FreeBSD ran for a while with this default, but for many disks the performance was abysmal, so the default reverted back to unreliable. Most disks support tag queuing today which does not suffer poor performance from `write cache disabled', so perhaps it is time to revisit this default. But that is not a topic for this report.
Comment 5 Martin Guy 2016-05-08 15:18:34 UTC
An alternative to prevent the observed behaviour, of rebooting continually, would be to always fully fsck the filesystem when it is dirty, rather than the current fsck -p behaviour of replaying the journal and applying simpler checks.

I'm not sure a new "filesystem panic flag" would help, as there's not a lot of difference between the state the FS can be left in after a kernel panic and when it stops due to a power failure.
Comment 6 Kirk McKusick freebsd_committer 2016-05-08 16:22:56 UTC
You suggest ``An alternative to prevent the observed behaviour, of rebooting continually, would be to always fully fsck the filesystem when it is dirty, rather than the current fsck -p behaviour of replaying the journal and applying simpler checks.''

To get this behavior, turn off journalling using the command:

      tunefs -j disable /filesystem/to/disable

When the system finds a clean filesystem at boot, it skips fsck. When the system finds a dirty filesystem at boot and no journal, it runs a full fsck.
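A minimal sketch of that tunefs step, with the settings printed before and after so the change can be confirmed. It assumes the file system is unmounted or mounted read-only, since tunefs(8) cannot rewrite an active read-write file system. `DRY_RUN`, `run` and `disable_suj` are names invented for the example, and by default the commands are only printed.

```sh
#!/bin/sh
# Sketch only: turn off soft updates journaling and show the settings
# before and after. DRY_RUN, run and disable_suj are invented names.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_suj() {
    # $1: device or mount point, unmounted or mounted read-only
    run tunefs -p "$1"           # print the current tunable settings
    run tunefs -j disable "$1"   # turn off soft updates journaling
    run tunefs -p "$1"           # print the settings again to confirm
}

disable_suj "${1:-/dev/ada0p2}"
```

Soft updates themselves stay enabled; only the journal is turned off, so a dirty file system gets a full fsck at the next boot.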

You note ``I'm not sure a new "filesystem panic flag" would help, as there's not a lot of difference between the state the FS can be left in after a kernel panic and when it stops due to a power failure.''

When a panic occurs, the filesystem code never gets another chance to run, thus there is no way for it to set a `filesystem panic'ed flag'. The only indication of something unexpected having happened is the absence of a `filesystem cleanly unmounted flag'.
Comment 7 Jason W. Bacon freebsd_committer 2019-03-08 00:15:47 UTC
I just hit this on a laptop that was running with a completely dead battery for a while and went down hard a couple times due to a touchy power connector.

It would boot fine but always panic within an hour.

Manual "fsck -fy" resolved it - thanks for posting the workaround.

As for long-term solutions, is it possible to quickly determine if a filesystem is still dirty after an "fsck -p", and automatically fall back to a full fsck when necessary?
Comment 8 Jason W. Bacon freebsd_committer 2020-05-14 15:12:58 UTC
Note to posterity:

I updated sysutils/desktop-installer to explain this situation to the user and offer the option to disable write caching.  This should help alleviate the problems until a systemic solution is implemented.
Comment 9 Kubilay Kocak freebsd_committer freebsd_triage 2020-06-21 02:17:10 UTC
Created attachment 215833
panic: ffs_valloc: dup alloc

Just hit what I believe is this issue on 13.0-CURRENT #9 r359315 GENERIC-NODEBUG amd64.

Attaching screenshot for our future selves
Comment 10 Kubilay Kocak freebsd_committer freebsd_triage 2020-06-21 02:20:30 UTC
Running tunefs -j disable per comment 6, I saw:

Clearing journal flags from inode 4
tunefs: Failed to write journal inode: failed to open disk for writing: Operation not permitted
tunefs: soft updates journaling cleared but soft updates still set.
tunefs: remove .sujournal to reclaim space
tunefs: /dev/da0p3: failed to open disk for writing

Does this output, with the initial error followed by "journaling cleared", mean journaling was fully disabled, or only partially?
Comment 11 Kirk McKusick freebsd_committer 2020-06-24 05:43:17 UTC
(In reply to Kubilay Kocak from comment #10)
Once journaling is disabled, it remains disabled until such time as you reenable it. It does leave you running with soft updates. But after a crash, you will always do a full fsck. There is no journal and it will not attempt a journal recovery.