| Summary: | "panic: ffs_valloc: dup alloc" as soon as UFS root partition is mounted | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Base System | Reporter: | Martin Guy <martinwguy> | ||||
| Component: | kern | Assignee: | freebsd-fs (Nobody) <fs> | ||||
| Status: | Closed Not A Bug | ||||||
| Severity: | Affects Some People | CC: | chris, jwb, koobs, martinwguy, mckusick | ||||
| Priority: | --- | Keywords: | crash | ||||
| Version: | 10.2-STABLE | ||||||
| Hardware: | Any | ||||||
| OS: | Any | ||||||
| Attachments: |
|
||||||
|
Description
Martin Guy
2016-04-11 07:37:08 UTC
This is a known problem with journaled soft updates. It only fixes things that are in its log. Most disks are run with write cache enabled which means that they lie about completing I/O operations. Specifically they report that an I/O has been made to stable store when in fact it is only in the RAM-cache on the disk. If power to the disk is lost before the cache is flushed, the write is lost, but journaled soft updates believes it to have been done so does not check for the error. Thus it marks the disk clean when in fact it is not clean.

To resolve this problem, you must bring the machine up single user and run a full fsck on it using ``fsck -f -y /filesystem_in_question''. This will find the hidden problem and correct it.

Thanks Kirk, that made the system boot again. Unfortunately the interesting parts of the fsck output scrolled off the screen before I could copy them, so I don't know what the filesystem defect was that provoked this panic.
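For reference, the full check recommended above could be wrapped as in the sketch below. The helper name `full_fsck_cmd` and the `/usr` mount point are hypothetical; the helper only prints the command so that it can be reviewed before being run by hand as root from single-user mode. The flags are the ones quoted in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the full-check command for one filesystem.
# -f and -y are the flags quoted in this report: per the discussion here,
# -f is what avoids the journal-only check, and -y answers "yes" to
# every question fsck asks.
full_fsck_cmd() {
    printf 'fsck -f -y %s\n' "$1"
}

# Print the command for review; /usr is only an example mount point.
full_fsck_cmd /usr
```

Printing instead of executing is deliberate: fsck should not be run against a filesystem that is mounted read-write.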
However, if the same thing happens to a distant server with only a console, it would panic while mounting / to go into single-user mode too, making the system unrecoverable.
Peter H wrote to say:
> the first (journaled) fsck just replays the journal; it does not
> check the file system. That is why journaling fsck is so fast.
So another way of seeing the problem is that, if "fsck -p" fails, it tells the user to run "fsck -y" which says "yes" to fsck's initial "USE JOURNAL?" question and this results in a faster check that doesn't fix the kernel-panicking inconsistency. It needs -fy to get round the "USE JOURNAL?" option.
Another solution could be to always do a full check without the journal when a fast filesystem is not clean, instead of "-p"? I don't see a combination of fsck flags to do that. -fyC looks like the closest but needs changing to make -C override -f and skip the check if the filesystem is clean.
Or turning write-behind off by default until a better fix is found?
To your question of recovering when you have a corrupted root filesystem, that is not a problem because the kernel always comes up with the root filesystem read-only. The panics that you can get will only happen when attempting to write the filesystem. So if you are single-user, you will have the opportunity to fix the filesystem while it is mounted read-only. It only switches to read-write as it exits single-user and goes to multi-user.

The long-term fix is to note in the superblock when the filesystem has panic'ed and to force a full fsck when finding this flag set. This is part of a bigger project that reduces the number of panics in the filesystem that should be coming in over the next year.

As a general rule, you should run with write cache disabled if you want full recoverability after a crash.

I am closing this bug as it is triggered by disks running in an unreliable mode. As such there is no change called for in the software, though one can legitimately argue that the default should be to configure disks to run with `write cache disabled' so that they will be reliable. FreeBSD ran for a while with this default, but for many disks the performance was abysmal, so the default reverted back to unreliable. Most disks support tag queuing today which does not suffer poor performance from `write cache disabled', so perhaps it is time to revisit this default. But that is not a topic for this report.

An alternative to prevent the observed behaviour, of rebooting continually, would be to always fully fsck the filesystem when it is dirty, rather than the current fsck -p behaviour of replaying the journal and applying simpler checks. I'm not sure a new "filesystem panic flag" would help, as there's not a lot of difference between the state the FS can be left in after a kernel panic and when it stops due to a power failure.
You suggest ``An alternative to prevent the observed behaviour, of rebooting continually, would be to always fully fsck the filesystem when it is dirty, rather than the current fsck -p behaviour of replaying the journal and applying simpler checks.''
To get this behavior, turn off journalling using the command:
tunefs -j disable /filesystem/to/disable
When the system finds a clean filesystem at boot, it skips fsck. When the system finds a dirty filesystem at boot and no journal, it runs a full fsck.
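The boot-time behaviour described in this report can be summarised as a small decision function. This is only an illustrative model with a hypothetical name (`boot_fsck_action`), not the actual rc or fsck logic; the "journal replay only" branch follows the earlier remark that the journaled fsck just replays the journal.

```shell
#!/bin/sh
# Hypothetical model of the boot-time decision described in this report.
#   $1: "clean" or "dirty"         (state recorded on the filesystem)
#   $2: "journal" or "nojournal"   (whether soft updates journaling is on)
boot_fsck_action() {
    if [ "$1" = clean ]; then
        echo "skip fsck"
    elif [ "$2" = journal ]; then
        echo "journal replay only"
    else
        echo "full fsck"
    fi
}

boot_fsck_action dirty journal     # journaling enabled
boot_fsck_action dirty nojournal   # after tunefs -j disable
```

With journaling disabled, a dirty filesystem always takes the "full fsck" branch, which is the behaviour asked for above.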
You note ``I'm not sure a new "filesystem panic flag" would help, as there's not a lot of difference between the state the FS can be left in after a kernel panic and when it stops due to a power failure.''
When a panic occurs, the filesystem code never gets another chance to run, thus there is no way for it to set a `filesystem panic'ed flag'. The only indication of something unexpected having happened is the absence of a `filesystem cleanly unmounted flag'.
I just hit this on a laptop that was running with a completely dead battery for a while and went down hard a couple times due to a touchy power connector. It would boot fine but always panic within an hour. Manual "fsck -fy" resolved it - thanks for posting the workaround. As for long-term solutions, is it possible to quickly determine if a filesystem is still dirty after an "fsck -p", and automatically fall back to a full fsck when necessary?

Note to posterity: I updated sysutils/desktop-installer to explain this situation to the user and offer the option to disable write caching. This should help alleviate the problems until a systemic solution is implemented.

Created attachment 215833
panic: ffs_valloc: dup alloc
Just hit what I believe is this issue on 13.0-CURRENT #9 r359315 GENERIC-NODEBUG amd64.
Attaching screenshot for our future selves
Running tunefs -j disable per comment 6, I saw:

Clearing journal flags from inode 4
tunefs: Failed to write journal inode: failed to open disk for writing: Operation not permitted
tunefs: soft updates journaling cleared but soft updates still set.
tunefs: remove .sujournal to reclaim space
tunefs: /dev/da0p3: failed to open disk for writing

Does this, with the initial error, but subsequent "journaling cleared" mean it was disabled, or only partially?

(In reply to Kubilay Kocak from comment #10)
Once journaling is disabled, it remains disabled until such time as you reenable it. It does leave you running with soft updates. But after a crash, you will always do a full fsck. There is no journal and it will not attempt a journal recovery. |