Bug 262157 - su+j: Crashes during mmc(4) fsck after timeout: Error reading journal block 41408 -> INTERNAL ERROR: GOT TO reply()
Summary: su+j: Crashes during mmc(4) fsck after timeout: Error reading journal block 4...
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: 13.0-RELEASE
Hardware: arm64 Any
: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords: needs-qa
Depends on:
Blocks:
 
Reported: 2022-02-23 17:45 UTC by Andreas Schuh
Modified: 2024-10-04 12:41 UTC (History)
4 users (show)

See Also:
koobs: maintainer-feedback? (mckusick)
koobs: maintainer-feedback? (manu)


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andreas Schuh 2022-02-23 17:45:21 UTC
Compiled RELENG/13.0 with crochet, flashed the image onto a SD card, and booted an Orange Pi Zero Plus (Allwinner H5, http://www.orangepi.org/OrangePiZeroPlus/).
Ran freebsd-update fetch & install afterwards.

While extracting Python 3.8, system hung; after reboot, the filesystem was broken:
mmc0: <MMC/SD bus> on aw_mmc1
mmc0: No compatible cards found on bus
aw_mmc1: Spurious interrupt - no active request, rint: 0x00000004

mmc1: <MMC/SD bus> on aw_mmc0
mmcsd0: 64GB <SDHC SC64G 8.0 SN 8B0D14CD MFG 08/2018 by 3 SD> at mmc1 50.0MHz/4bit/32768-block
mmc1: Failed to set VCCQ for card at relative address 43690
uhub2: 1 port with 1 removable, self powered
uhub4: 1 port with 1 removable, self powered
mountroot: waiting for device /dev/mmcsd0s2a...
WARNING: / was not properly dismounted
Setting hostuuid: 30633238-3030-3130-6462-623536366366.
Setting hostid: 0x005c1ace.
Starting file system checks:
** SU+J Recovering /dev/mmcsd0s2a
** Reading 4194304 byte journal from inode 4.
aw_mmc0: prepare_dma failed: 27
aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout
/dev/mmcsd0s2a: Error reading journal block 41408
/dev/mmcsd0s2a: UNEXPECTED SU+J INCONSISTENCY
/dev/mmcsd0s2a: INTERNAL ERROR: GOT TO reply()
/dev/mmcsd0s2a: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY.

aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout
aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout
aw_mmc0: controller timeout


Booting into Single User mode and running fsck resulted in a kernel panic:

mmc1: <MMC/SD bus> on aw_mmc0
mmcsd0: 64GB <SDHC SC64G 8.0 SN 8B0D14CD MFG 08/2018 by 3 SD> at mmc1 50.0MHz/4bit/32768-block
mmc1: Failed to set VCCQ for card at relative address 43690
uhub1: 1 port with 1 removable, self powered
uhub0: 1 port with 1 removable, self powered
mountroot: waiting for device /dev/mmcsd0s2a...
Enter full pathname of shell or RETURN for /bin/sh: 
root@:/ # fsck -y /
** /dev/mmcsd0s2a
** SU+J Recovering /dev/mmcsd0s2a

USE JOURNAL? yes

** Reading 4194304 byte journal from inode 4.
aw_mmc0: prepare_dma failed: 27

aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout
Error reading journal block 41408
UNEXPECTED SU+J INCONSISTENCY

FALLBACK TO FULL FSCK? yes

** Skipping journal, falling through to full fsck

aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
aw_mmc0: controller timeout
aw_mmc0: timeout updating clock
mmcsd0: Error indicated: 1 Timeout

CANNOT READ BLK:panic: vm_fault_lookup: fault on nofault entry, addr: 0xffff000044189000
cpuid = 2
time = 1645555374
KDB: stack backtrace:
#0 0xffff00000050d058 at kdb_backtrace+0x60
#1 0xffff0000004b7228 at vpanic+0x184
#2 0xffff0000004b70a0 at panic+0x44
#3 0xffff00000077f790 at vm_fault+0x1964
#4 0xffff00000077dd48 at vm_fault_trap+0x9c
#5 0xffff000000824c48 at data_abort+0xf4
#6 0xffff000000805078 at handle_el1h_sync+0x78
#7 0xffff000000802164 at bounce_bus_dmamap_sync+0x22c
#8 0xffff0000007bd0d0 at aw_mmc_request+0x3ec
#9 0xffff000000203ed4 at mmc_wait_for_request+0x128
#10 0xffff00000020d3f4 at mmcsd_rw+0x1a8
#11 0xffff00000020bf0c at mmcsd_task+0x284
#12 0xffff0000004677ec at fork_exit+0x88
#13 0xffff000000823ae8 at fork_trampoline+0x10
Uptime: 1m14s


I then disabled journaling, and the system is running fine since then.
Comment 1 Kirk McKusick freebsd_committer freebsd_triage 2023-02-27 19:25:17 UTC
Is this still an issue? If so, is there a way to reproduce it? If there is please post it as that is the only way that I can debug it.
Comment 2 Andreas Schuh 2023-03-01 14:27:44 UTC
Guess it's still an issue; though, since I disabled journaling, machine is running fine.

Kernel panic was reproducible all the time, steps were:
1) Make sure, SU+J is enabled
2) Generate "high" I/O on the SD card, e.g. by installing Python with "pkg install python38"
3) At some point, machine would hang
4) Reboot; as the file system was broken after the system hung, fsck in single user mode was necessary
5) fsck always resulted in the reported kernel panic
Comment 3 Mark Linimon freebsd_committer freebsd_triage 2024-10-04 12:41:52 UTC
^Triage: clear unneeded flags.  Nothing has yet been committed to be merged.