265951 – ext2fs: kernel panic when trivial rsync operation from UFS2 (system disk) to ext3 partitions (data disk)

Status:	Open

Alias:	None

Product:	Base System
Classification:	Unclassified
Component:	kern
Version:	13.0-RELEASE
Hardware:	Any Any

Importance:	--- Affects Many People
Assignee:	Fedor Uporov

URL:
Keywords:

Depends on:
Blocks:

Reported:	2022-08-20 00:54 UTC by clear.screen
Modified:	2024-11-25 07:49 UTC
CC List:	4 users

See Also:	https://reviews.freebsd.org/D38503

Flags:	linimon: mfc-stable14?

Description clear.screen 2022-08-20 00:54:04 UTC

When running a trivial rsync (-avz) operation from a ZFS dataset to an ext3 partition mounted with the ext2fs.ko module, a kernel panic occurs, trashing the whole server.

First UFS filesystem (first disk) is corrupted
Second UFS filesystem (second disk) is corrupted
Ext3fs filesystem (third disk) is corrupted

This is the output of /var/log/daemon.log

Aug 19 07:27:47 myServer savecore[1396]: reboot after panic: ext2_dirbad: /EXT3-TARGET: bad dir ino 164859907 at offset 0: mangled entry
Aug 19 07:27:47 myServer savecore[1396]: writing core to /var/crash/vmcore.1

- The stability and reliability of ext2fs have to be fixed. This is mandatory.
- Perhaps this was improved on the 13.1 branch?
- Is there a safe way to manage ext3 filesystems with regard to kernel panics?

Comment 1 Fedor Uporov freebsd_committer

2022-08-30 10:08:59 UTC

Looks like a corrupted directory entry.

The simplest way to avoid this sort of panic is to run fsck on the ext3 filesystem after the server reboots.
It is possible to install e2fsprogs from ports, or build it from source, and run the e2fsck utility with the appropriate flags.

Comment 2 clear.screen 2022-08-30 18:27:53 UTC

Thanks for your answer.

The problem is that the filesystem was freshly created on FreeBSD, and the corruption seemed to occur while rsync was running, which made the kernel panic impossible to avoid.

There should be some safety code to avoid any kernel panic in this use case (as the data disk which holds ext3fs was distinct from the system disk).

Comment 3 Warner Losh freebsd_committer

2022-08-30 19:05:20 UTC

> There should be some safety code to avoid any kernel panic in this use case (as the data disk which holds ext3fs was distinct from the system disk).

Of course. That's what makes this a bug. There's a fair amount of code already that tries to return an error rather than use erroneous data, both inside and outside of the ext3fs code. Of course it shouldn't panic.
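
As a rough illustration of the "return an error rather than use erroneous data" approach mentioned above, here is a minimal userland sketch in C. It is not the actual ext2fs code. It assumes the classic ext2 on-disk directory entry layout (4-byte inode, 2-byte little-endian rec_len, 1-byte name_len, 1-byte file_type, then the name) and blocks smaller than 64 KiB; the function name check_dirent, the DIRENT_HDR_SIZE macro, and the choice of EIO as the error code are assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of an entry: inode(4) + rec_len(2) + name_len(1) + file_type(1). */
#define DIRENT_HDR_SIZE 8u

/*
 * Hypothetical helper: sanity-check the directory entry that starts at
 * byte offset `off` inside a directory block of `blksize` bytes.
 * Returns 0 if the entry looks consistent, or EIO if it is mangled.
 * It never dereferences anything outside the block.
 */
int
check_dirent(const uint8_t *blk, size_t blksize, size_t off)
{
	size_t rec_len, name_len, min_len;

	/* The fixed header itself must fit in the remaining space. */
	if (off > blksize || blksize - off < DIRENT_HDR_SIZE)
		return (EIO);

	/* Decode the little-endian fields byte by byte (host independent). */
	rec_len = (size_t)blk[off + 4] | ((size_t)blk[off + 5] << 8);
	name_len = blk[off + 6];

	/* Smallest record that can hold the header plus the name, 4-byte aligned. */
	min_len = (DIRENT_HDR_SIZE + name_len + 3) & ~(size_t)3;

	if ((rec_len & 3) != 0)		/* records are 4-byte aligned */
		return (EIO);
	if (rec_len < min_len)		/* too short for its own name */
		return (EIO);
	if (rec_len > blksize - off)	/* would run past the block */
		return (EIO);

	return (0);
}
```

A caller walking a directory block would stop and propagate the error at the first entry for which check_dirent() returns nonzero, instead of following a bogus rec_len into unrelated memory or panicking.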

To help make this bug report actionable, however, a more complete way to reproduce this would be useful. The description is a good start, but what options do I need to pass to rsync? What kind of files are required in the ZFS dataset? How many? What sizes? etc. The corruption could come from ext2fs.ko, zfs.ko or something else that's impossible to guess given the current information.

Also, what was the corruption in UFS? On crashes, the superblocks and other metadata, let alone dirty data pages, aren't pushed out to disk (since when the system panics, it's hard to know if you are syncing good data or junk, or if syncing is even possible given the damage). fsck will repair these problems, though. So more information about the corruption on the UFS partitions, and what options were used to create them, would also help us know if that part of the bug is 'expected inflight operation uncertainty at the time of panic' or if it's something new that also needs to be investigated.

Comment 4 Fedor Uporov freebsd_committer

2022-09-06 12:30:19 UTC

Yep, it would be great to have a reproducer.
The rsync activity should not cause corrupted directory entries.
Could you please provide the extfs mkfs options:
% dumpe2fs /path/to/corrupted/extfs/drive

It is possible to try to reproduce it with this information. The FreeBSD/ZFS versions and pool parameters will also help. At least:
% uname -a
% zpool status

Comment 5 clear.screen 2022-09-07 01:32:09 UTC

Greetings, and many thanks for your comment about ZFS. It made me see that I made an error while reporting the bug, so I've changed the title (from ZFS to UFS2).

ZFS was on another disk too (my bad), so it's not a ZFS issue.

The problem occurs on a simple system disk freshly installed from the FreeBSD installer. The filesystem was UFS2, and that's why I was worried about a kernel panic in such a fresh and raw configuration.

To test the data disk, I ran an rsync of /etc from the first system disk (UFS2) to another disk (ext3), and a kernel panic occurred.

Nothing advanced was done, and the ext3 disk had been wiped beforehand with dd.

I'll provide the "uname -a" result and the "% dumpe2fs /path/to/corrupted/extfs/drive" output when the server is powered on and before it is reinitialized.

(It will be a production server and I can't afford to leave it in the debug state.)

But I will try to reproduce the error in a VM if you want.
Best Regards.

Comment 6 clear.screen 2022-09-07 01:36:23 UTC

Really sorry for the wrong description (I caused confusion by mentioning ZFS).

I think I copied 1 or 2 files from the ZFS data disk in an earlier test, but the problem occurred (now I remember) when I ran the rsync from /etc (so on the first disk, which is UFS2).

Thank you for your understanding!
And again please accept my apologies.

Regards.

Comment 7 clear.screen 2022-09-13 17:08:27 UTC

Greetings,
I've tried to upgrade to 13.1, so uname won't report the same string as when the bug occurred, but I've found an older log in case it helps:

Kernel version: FreeBSD 13.0-RELEASE-p11 #0: Tue Apr  5 18:54:35 UTC 2022

Regards.

Comment 8 Fedor Uporov freebsd_committer

2023-02-18 07:27:40 UTC

Regarding the panic in case of a corrupted directory entry:
https://reviews.freebsd.org/D38503

The directory corruption when using rsync on ext2fs needs to be reproduced independently.

Comment 9 commit-hook freebsd_committer

2023-03-18 06:17:13 UTC

A commit in branch main references this bug:

URL: https://cgit.FreeBSD.org/src/commit/?id=3c2dc524c333747a8c5deb3f0f88b29a8e36dff4

commit 3c2dc524c333747a8c5deb3f0f88b29a8e36dff4
Author:     Fedor Uporov <fsu@FreeBSD.org>
AuthorDate: 2023-03-18 06:11:27 +0000
Commit:     Fedor Uporov <fsu@FreeBSD.org>
CommitDate: 2023-03-18 06:16:24 +0000

    Do not panic in case of corrupted directory

    The panic() will be called under ext2_dirbad()
    function in case of rw mount. It cause user confusion,
    like in BZ 265951.

    PR:                     265951
    Reviewed by:            pfg, mckusick
    MFC after:              2 week
    Differential revision:  https://reviews.freebsd.org/D38503

 sys/fs/ext2fs/ext2_lookup.c | 14 +++++---------
 sys/fs/ext2fs/ext2_vnops.c  |  9 +++++++++
 2 files changed, 14 insertions(+), 9 deletions(-)