On system crash, fsck sets the size of the most recently modified files to 0, losing their content.

Steps to reproduce:
1) Edit a file, for example /boot/loader.conf
2) Cause a panic, for example sysctl debug.kdb.panic=1

On next boot the content of loader.conf is gone, and its size is set to zero.

I'm not sure if this is expected behavior, but it is really bad to lose the content of the last modified files on a crash/panic. I don't expect my latest modifications to be saved, but I would at least expect the original file to survive.

My UFS settings are the following, but I've tried different combinations, mainly playing with the -n and -j options, with the same result.

tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: soft update journaling: (-j) enabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) enabled
tunefs: maximum blocks per file in a cylinder group: (-e) 4096
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: space to hold for metadata blocks: (-k) 6408
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)
What editor are you using? Most editors follow the safe update practice:

    write file to new_name
    fsync(new_name);
    rename(new_name, orig_name);

The fsync will not return until the contents are on the disk, and the rename is atomic, so orig_name will point at either the original file contents or the new_name file contents, which because of the fsync will be on the disk.
I had this issue when I edited loader.conf with nano. I actually tested the edit/panic cycle many times on an old test machine. The problem does not occur every time, only sometimes.
(In reply to Ali Abdallah from comment #2)

I have looked at nano and it does not follow the proper protocol for writing out the file (which is detailed in my comment #1). It simply does:

    fd = open(file, O_WRONLY|O_CREAT|O_TRUNC, 0666);
    write new contents
    close(fd);

The O_TRUNC flag truncates the file to zero length. If the system crashes before the new contents are written, you get a zero length file. By default the contents will not be written for up to 30 seconds. So if you panic within 30 seconds of the file being written, you will get a zero length file.

The proper way to fix this is detailed in my comment #1. The gap could be closed significantly by adding an fsync(fd) before calling close, as that would cause the file to be written to disk within a few milliseconds of its finishing being written (but it would still be possible to get a zero length file). So, it really should be fixed properly.

I am closing this bug because it is a bug in nano and not in FreeBSD.
@Ali, Please file a bug with the nano project and chase up getting that fixed there. I know nano has a lot of users and it is a shame it isn't safe!
@Conrad I will file a bug against nano.

@Kirk thanks for looking into this.

BTW, while I was testing on my test system, I did the following on a clean 11.2 installation:
1) pkg install nano
2) nano /etc/sysctl.conf, modify, save and exit
3) sysctl debug.kdb.panic=1

On the next boot, nano was registered as installed by pkg, but the nano binary (together with the indexinfo and gettext-runtime files) was not present on disk. I think pkg does not fsync the installed files in this case, right?
(In reply to Ali Abdallah from comment #5)

You are correct that the installed binaries are not being fsync'ed before the database is updated (which, being database software, DOES properly fsync). I have not looked at pkg, but if it does the write itself, it should do the fsync before it updates the database. It may be that it is running a shell script that uses the install(1) utility, in which case it is install that should be doing the fsync. Obviously, more checking needs to be done. Thanks for your help in tracking down these issues.
A commit references this bug:

Author: mckusick
Date: Mon Aug 27 15:20:42 UTC 2018
New revision: 338340
URL: https://svnweb.freebsd.org/changeset/base/338340

Log:
  When doing a -S "safe copy", the install command should do an fsync(2) system call after copying the installed file to ensure that it is on stable storage.

  PR: 230851
  Reviewed by: kib
  Approved by: re (marius)

Changes:
  head/usr.bin/xinstall/xinstall.c
(In reply to commit-hook from comment #7)

Thanks a lot for your effort, this will make UFS even better. Unfortunately, not all software follows the safe update practice. For example, I lost my .zsh_history in a crash yesterday. I assume that zsh doesn't fsync on every command entered in the shell. I think I can do nothing in those situations unless I gjournal my home UFS partition, or I can just use backups/snapshots.

I had never experienced crashes in the past, but all my troubles started when I moved my production system from 11.1 to 11.2 (vboxdrv caused me all kinds of trouble when running my Windows 10 VM, which unfortunately I need for my work).

Thanks!
(In reply to Ali Abdallah from comment #8) Yeah, with zsh I end up taking backups. I rely on ^R history a lot to work quickly, so losing it is painful. Here's a really ugly kludge I use (saves a new copy any time .zhistory is modified): https://github.com/cemeyer/backup_zhistory