When running trivial rsync (-avz) operations from a ZFS dataset to an ext3 partition mounted with the ext2fs.ko module, a kernel panic occurs and trashes the whole server:
- First UFS filesystem (first disk) is corrupted
- Second UFS filesystem (second disk) is corrupted
- Ext3fs filesystem (third disk) is corrupted
This is the output of /var/log/daemon.log:
Aug 19 07:27:47 myServer savecore[1396]: reboot after panic: ext2_dirbad: /EXT3-TARGET: bad dir ino 164859907 at offset 0: mangled entry
Aug 19 07:27:47 myServer savecore[1396]: writing core to /var/crash/vmcore.1
- The stability and reliability of ext2fs have to be fixed. This is mandatory.
- Perhaps this was improved on the 13.1 branch?
- Is there a safe way to manage ext3 filesystems without risking a kernel panic?
Looks like a corrupted directory entry. The simplest way to avoid this sort of panic is to run fsck on the Ext3fs filesystem after the server reboots. You can install e2fsprogs from ports, or build it from source, and run the e2fsck utility with the appropriate flags.
Thanks for your answer. The problem is that the filesystem was freshly created on FreeBSD and the corruption seemed to occur while the rsync was running, which made the kernel panic impossible to avoid. There should be some safety code to avoid any kernel panic in this use case (as the data disk which holds ext3fs was distinct from the system disk).
> There should be some safety code to avoid any kernel panic on this usecase (as the data disk which hold ext3fs was distinct from the system disk).
Of course. That's what makes this a bug. There's a fair amount of code already that tries to return an error rather than use erroneous data, both inside and outside of the ext3fs code. Of course it shouldn't panic.
To help make this bug report actionable, however, a more complete way to reproduce this would be useful. The description is a good start, but what options do I need to pass to rsync? What kind of files are required in the ZFS dataset? How many? What sizes? etc. The corruption could come from ext2fs.ko, zfs.ko or something else that's impossible to guess given the current information.
Also, what was the corruption in UFS? On crashes, the superblocks and other metadata, let alone dirty data pages, aren't pushed out to disk (since when the system panics, it's hard to know if you are syncing good data or junk, or if syncing is even possible given the damage). fsck will repair these problems, though. So more information about the corruption on the UFS partitions, and what options were used to create them, would also help us know if that part of the bug is 'expected inflight operation uncertainty at the time of panic' or if it's something new that also needs to be investigated.
Yep, it would be great to have a reproducer. The rsync activity should not cause corrupted directory entries. Could you please provide the extfs mkfs options:
% dumpe2fs /path/to/corrupted/extfs/drive
It is possible to try to reproduce it with this information. The FreeBSD/ZFS version and pool parameters will also help. At least:
% uname -a
% zpool status
Greetings, and many thanks for your comment about ZFS. It made me see that I made an error when reporting the bug, so I've changed the title (from ZFS to UFS2). ZFS was on another disk too (my bad), so it's not a ZFS issue. The problem occurs on a simple system disk freshly installed from the FreeBSD installer. The filesystem was UFS2, which is why I was worried about a kernel panic in such a fresh and raw configuration. To test the data disk, I ran an rsync of /etc from the first system disk (UFS2) to another disk (ext3), and a kernel panic occurred. Nothing advanced was done, and the ext3 disk had been wiped with dd beforehand. I'll provide the "uname -a" result and the "% dumpe2fs /path/to/corrupted/extfs/drive" output when the server is powered on and before it is reinitialized (it will be a production server and I cannot afford to leave it in the debug state). But I will try to reproduce the error in a VM if you want. Best Regards.
Really sorry for the wrong description (I made a mess and caused confusion with ZFS). I think I copied 1 or 2 files from the ZFS data disk in an earlier test, but the problem occurred (now I remember) when I did the rsync from /etc (so from the first disk, which is UFS2). Thank you for your understanding! And again, please accept my apologies. Regards.
Greetings, I've tried to upgrade to 13.1, so uname won't report the same string as when the bug occurred, but I've found an older log in case it helps:
Kernel version: FreeBSD 13.0-RELEASE-p11 #0: Tue Apr 5 18:54:35 UTC 2022
Regards.
Regarding the panic in case of a corrupted directory entry: https://reviews.freebsd.org/D38503
The directory corruption when using rsync on ext2fs needs to be reproduced independently.
A commit in branch main references this bug:
URL: https://cgit.FreeBSD.org/src/commit/?id=3c2dc524c333747a8c5deb3f0f88b29a8e36dff4
commit 3c2dc524c333747a8c5deb3f0f88b29a8e36dff4
Author:     Fedor Uporov <fsu@FreeBSD.org>
AuthorDate: 2023-03-18 06:11:27 +0000
Commit:     Fedor Uporov <fsu@FreeBSD.org>
CommitDate: 2023-03-18 06:16:24 +0000
    Do not panic in case of corrupted directory
    The panic() will be called under ext2_dirbad() function in case of rw mount.
    It cause user confusion, like in BZ 265951.
    PR: 265951
    Reviewed by: pfg, mckusick
    MFC after: 2 week
    Differential revision: https://reviews.freebsd.org/D38503
 sys/fs/ext2fs/ext2_lookup.c | 14 +++++---------
 sys/fs/ext2fs/ext2_vnops.c  |  9 +++++++++
 2 files changed, 14 insertions(+), 9 deletions(-)
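The general idea discussed in this thread, reporting a mangled directory entry to the caller as an error instead of calling panic(), can be illustrated with a small user-space sketch. This is not the FreeBSD code from the commit: `dirent_hdr`, `check_direntry`, `read_direntry` and the choice of `EIO` as the returned error are hypothetical names and assumptions made for illustration only. The sketch also assumes a little-endian host, so the on-disk fields can be copied without byte swapping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Fixed 8-byte header of an ext2 on-disk directory entry; the name follows it. */
struct dirent_hdr {
    uint32_t inode;
    uint16_t rec_len;    /* distance to the next entry, in bytes */
    uint8_t  name_len;
    uint8_t  file_type;
};
_Static_assert(sizeof(struct dirent_hdr) == 8, "unexpected header padding");

#define HDR_SIZE 8u
/* Space needed for a name of the given length, rounded up to 4 bytes. */
#define REC_LEN(namelen) (((namelen) + HDR_SIZE + 3u) & ~3u)

/* Hypothetical checker: returns NULL if the entry looks sane, else a reason. */
static const char *
check_direntry(const struct dirent_hdr *de, size_t offset, size_t blksize)
{
    if (de->rec_len < REC_LEN(1))
        return "rec_len is smaller than the minimal entry";
    if (de->rec_len % 4 != 0)
        return "rec_len is not a multiple of 4";
    if (de->rec_len < REC_LEN(de->name_len))
        return "rec_len is too small for name_len";
    if ((offset % blksize) + de->rec_len > blksize)
        return "entry crosses a block boundary";
    return NULL;
}

/*
 * Hypothetical reader: copies the header at 'offset' out of a directory
 * block and returns 0 on success. A mangled entry is logged and reported
 * as EIO (an assumed error code) so the caller can fail the operation
 * instead of bringing the whole system down.
 */
static int
read_direntry(const uint8_t *block, size_t offset, size_t blksize,
    struct dirent_hdr *out)
{
    const char *reason;

    assert(block != NULL && out != NULL);
    if (offset + HDR_SIZE > blksize)
        return EIO;
    memcpy(out, block + offset, HDR_SIZE);
    reason = check_direntry(out, offset, blksize);
    if (reason != NULL) {
        fprintf(stderr, "bad dir entry at offset %zu: %s\n", offset, reason);
        return EIO;
    }
    return 0;
}
```

With this shape, a caller walking a directory block can stop at the first bad entry and propagate the error to the process that issued the request, leaving the other mounted filesystems untouched.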