Bug 251018 - sdhc errors on growfs with RPI3
Summary: sdhc errors on growfs with RPI3
Status: In Progress
Alias: None
Product: Base System
Classification: Unclassified
Component: kern (show other bugs)
Version: CURRENT
Hardware: arm64 Any
: --- Affects Only Me
Assignee: Michal Meloun
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-10 15:31 UTC by Steve Wills
Modified: 2020-12-16 14:37 UTC (History)
3 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Steve Wills freebsd_committer 2020-11-10 15:31:05 UTC
Booting this image:

https://download.freebsd.org/ftp/snapshots/arm64/aarch64/ISO-IMAGES/13.0/FreeBSD-13.0-CURRENT-arm64-aarch64-RPI3-20201105-ef87bd449eb.img.xz

on RPI3 produces SDHC errors. See:

https://lists.freebsd.org/pipermail/freebsd-arm/2020-November/022692.html

It seemed to start with r366106 and was supposed to be fixed by rS367268 but still seems to be happening. The proposed patch from the arm email list didn't help.
Comment 1 Steve Wills freebsd_committer 2020-11-11 20:44:55 UTC
For the record, here's the full boot log:

Consoles: EFI console  
    Reading loader env vars from /efi/freebsd/loader.env
Setting currdev to disk0p1:
FreeBSD/arm64 EFI loader, Revision 1.1
(Thu Nov  5 07:53:09 UTC 2020 root@releng1.nyi.freebsd.org)

   Command line arguments: loader.efi
   Image base: 0x39e17000
   EFI version: 2.80
   EFI Firmware: Das U-Boot (rev 8224.1792)
   Console: comconsole (0)
   Load Path: /efi\boot\bootaa64.efi
   Load Device: /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)/HD(1,0x01,0,0x81f,0x18fa8)
Trying ESP: /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)/HD(1,0x01,0,0x81f,0x18fa8)
Setting currdev to disk0p1:
Trying: /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)/HD(2,0x01,0,0x197c7,0x5e6821)
Setting currdev to disk0p2:
Loading /boot/defaults/loader.conf
Loading /boot/defaults/loader.conf
Loading /boot/device.hints
Loading /boot/loader.conf
Loading /boot/loader.conf.local
Loading kernel...
/boot/kernel/kernel text=0x2a8 text=0x87ade8 text=0x1efa4c data=0x193940 data=0x0+0x541146 syms=[0x8+0x116598+0x8+0x13b120]
Loading configured modules...
/boot/kernel/umodem.ko text=0x2120 text=0x1390 data=0x6e0+0x10 syms=[0x8+0xf48+0x8+0xb6e]
loading required module 'ucom'
/boot/kernel/ucom.ko text=0x21a0 text=0x2e20 data=0x880+0x858 syms=[0x8+0x11a0+0x8+0xb2c]
can't find '/etc/hostid'
can't find '/boot/entropy'

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...               
Using DTB provided by EFI at 0x7ef6000.
EFI framebuffer information:
addr, size     0x3eaf0000, 0x10a800
dimensions     656 x 416
stride         656
masks          0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
---<<BOOT>>---
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2020 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #0 ef87bd449eb-c254225(main): Thu Nov  5 08:56:01 UTC 2020
    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
FreeBSD clang version 11.0.0 (git@github.com:llvm/llvm-project.git llvmorg-11.0.0-0-g176249bd673)
WARNING: WITNESS option enabled, expect reduced performance.
VT(efifb): resolution 656x416
module firmware already present!
real memory  = 993849344 (947 MB)
avail memory = 943214592 (899 MB)
Starting CPU 1 (1)
Starting CPU 2 (2)
Starting CPU 3 (3)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
MAP 39f3f000 mode 2 pages 1
MAP 39f45000 mode 2 pages 2
MAP 3b350000 mode 2 pages 16
MAP 3f100000 mode 0 pages 1
WARNING: Device "kbd" is Giant locked and may be deleted before FreeBSD 13.0.
kbd0 at kbdmux0
WARNING: Device "openfirm" is Giant locked and may be deleted before FreeBSD 13.0.
ofwbus0: <Open Firmware Device Tree>
simplebus0: <Flattened device tree simple bus> on ofwbus0
ofw_clkbus0: <OFW clocks bus> on ofwbus0
clk_fixed0: <Fixed clock> on ofw_clkbus0
clk_fixed1: <Fixed clock> on ofw_clkbus0
regfix0: <Fixed Regulator> on ofwbus0
regfix1: <Fixed Regulator> on ofwbus0
psci0: <ARM Power State Co-ordination Interface Driver> on ofwbus0
lintc0: <BCM2836 Interrupt Controller> mem 0x40000000-0x400000ff on simplebus0
intc0: <BCM2835 Interrupt Controller> mem 0x7e00b200-0x7e00b3ff irq 48 on simplebus0
gpio0: <BCM2708/2835 GPIO controller> mem 0x7e200000-0x7e2000b3 irq 24,25 on simplebus0
gpiobus0: <OFW GPIO bus> on gpio0
mbox0: <BCM2835 VideoCore Mailbox> mem 0x7e00b880-0x7e00b8bf irq 23 on simplebus0
bcm2835_firmware0: <BCM2835 Firmware> on simplebus0
gpio1: <Raspberry Pi Firmware GPIO controller> on bcm2835_firmware0
gpiobus1: <GPIO bus> on gpio1
generic_timer0: <ARMv7 Generic Timer> irq 1,2,3,4 on ofwbus0
Timecounter "ARM MPCore Timecounter" frequency 19200000 Hz quality 1000
Event timer "ARM MPCore Eventtimer" frequency 19200000 Hz quality 1000
usb_nop_xceiv0: <USB NOP PHY> on ofwbus0
bcm_dma0: <BCM2835 DMA Controller> mem 0x7e007000-0x7e007eff irq 6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21 on simplebus0
bcmwd0: <BCM2708/2835 Watchdog> mem 0x7e100000-0x7e100113,0x7e00a000-0x7e00a023 on simplebus0
bcm2835_clkman0: <BCM283x Clock Manager> mem 0x7e101000-0x7e102fff on simplebus0
bcmrng0: <Broadcom BCM2835 RNG> mem 0x7e104000-0x7e10400f irq 22 on simplebus0
gpioc0: <GPIO controller> on gpio0
uart0: <PrimeCell UART (PL011)> mem 0x7e201000-0x7e2011ff irq 26 on simplebus0
uart0: console (115200,n,8,1)
spi0: <BCM2708/2835 SPI controller> mem 0x7e204000-0x7e2041ff irq 28 on simplebus0
spibus0: <OFW SPI bus> on spi0
spibus0: <unknown card> at cs 0 mode 0
spibus0: <unknown card> at cs 1 mode 0
iichb0: <BCM2708/2835 BSC controller> mem 0x7e804000-0x7e804fff irq 40 on simplebus0
bcm283x_dwcotg0: <DWC OTG 2.0 integrated USB controller (bcm283x)> mem 0x7e980000-0x7e98ffff,0x7e006000-0x7e006fff irq 46,47 on simplebus0
usbus1 on bcm283x_dwcotg0
sdhci_bcm0: <Broadcom 2708 SDHCI controller> mem 0x7e300000-0x7e3000ff irq 50 on simplebus0
mmc0: <MMC/SD bus> on sdhci_bcm0
gpioc1: <GPIO controller> on gpio1
fb0: <BCM2835 VT framebuffer driver> on simplebus0
fb0: keeping existing fb bpp of 32
fbd0 on fb0
WARNING: Device "fb" is Giant locked and may be deleted before FreeBSD 13.0.
VT: Replacing driver "efifb" with new "fb".
fb0: 656x416(656x416@0,0) 32bpp
fb0: fbswap: 1, pitch 2624, base rx3eaf0000, screen_size 1091584
pmu0: <Performance Monitoring Unit> irq 0 on ofwbus0
cpulist0: <Open Firmware CPU Group> on ofwbus0
cpu0: <Open Firmware CPU> on cpulist0
bcm2835_cpufreq0: <CPU Frequency Control> on cpu0
cpu1: <Open Firmware CPU> on cpulist0
cpu2: <Open Firmware CPU> on cpulist0
cpu3: <Open Firmware CPU> on cpulist0
gpioled0: <GPIO LEDs> on ofwbus0
gpioled0: <led0> failed to map pin
lock order reversal: (sleepable after non-sleepable)
 1st 0xffff000000c14ad8 LED mtx (LED mtx, sleep mutex) @ /usr/src/sys/dev/led/led.c:298
 2nd 0xffffa00000849810 Raspberry Pi firmware gpio (Raspberry Pi firmware gpio, sx) @ /usr/src/sys/arm/broadcom/bcm2835/raspberrypi_gpio.c:252
lock order LED mtx -> Raspberry Pi firmware gpio attempted at:
#0 0xffff000000568374 at witness_checkorder+0xc54
#1 0xffff000000505c84 at _sx_xlock+0x7c
#2 0xffff0000007dd6a4 at rpi_fw_gpio_pin_set+0xe8
#3 0xffff000000295afc at led_create_state+0x158
#4 0xffff000000267448 at gpioled_attach+0x290
#5 0xffff0000005345cc at device_attach+0x400
#6 0xffff000000534134 at device_probe_and_attach+0x7c
#7 0xffff00000053631c at bus_generic_new_pass+0xf8
#8 0xffff0000005362cc at bus_generic_new_pass+0xa8
#9 0xffff0000005362cc at bus_generic_new_pass+0xa8
#10 0xffff0000005316b0 at bus_set_pass+0x4c
#11 0xffff00000048c57c at mi_startup+0x12c
uma_zalloc_debug: zone "malloc-64" with the following non-sleepable locks held:
exclusive sleep mutex LED mtx (LED mtx) r = 0 (0xffff000000c14ad8) locked @ /usr/src/sys/dev/led/led.c:298
stack backtrace:
#0 0xffff000000568694 at witness_debugger+0x64
#1 0xffff000000569824 at witness_warn+0x3ec
#2 0xffff000000772194 at uma_zalloc_debug+0x2c
#3 0xffff000000771bf8 at uma_zalloc_arg+0x2c
#4 0xffff0000004d1848 at malloc+0xf0
#5 0xffff0000007d2864 at bcm2835_firmware_property+0x44
#6 0xffff0000007dd6bc at rpi_fw_gpio_pin_set+0x100
#7 0xffff000000295afc at led_create_state+0x158
#8 0xffff000000267448 at gpioled_attach+0x290
#9 0xffff0000005345cc at device_attach+0x400
#10 0xffff000000534134 at device_probe_and_attach+0x7c
#11 0xffff00000053631c at bus_generic_new_pass+0xf8
#12 0xffff0000005362cc at bus_generic_new_pass+0xa8
#13 0xffff0000005362cc at bus_generic_new_pass+0xa8
#14 0xffff0000005316b0 at bus_set_pass+0x4c
#15 0xffff00000048c57c at mi_startup+0x12c
uma_zalloc_debug: zone "malloc-16" with the following non-sleepable locks held:
exclusive sleep mutex LED mtx (LED mtx) r = 0 (0xffff000000c14ad8) locked @ /usr/src/sys/dev/led/led.c:298
stack backtrace:
#0 0xffff000000568694 at witness_debugger+0x64
#1 0xffff000000569824 at witness_warn+0x3ec
#2 0xffff000000772194 at uma_zalloc_debug+0x2c
#3 0xffff000000771bf8 at uma_zalloc_arg+0x2c
#4 0xffff0000004d1848 at malloc+0xf0
#5 0xffff0000007ea740 at bounce_bus_dmamem_alloc+0x50
#6 0xffff0000007d533c at bcm2835_mbox_property+0xdc
#7 0xffff0000007d2898 at bcm2835_firmware_property+0x78
#8 0xffff0000007dd6bc at rpi_fw_gpio_pin_set+0x100
#9 0xffff000000295afc at led_create_state+0x158
#10 0xffff000000267448 at gpioled_attach+0x290
#11 0xffff0000005345cc at device_attach+0x400
#12 0xffff000000534134 at device_probe_and_attach+0x7c
#13 0xffff00000053631c at bus_generic_new_pass+0xf8
#14 0xffff0000005362cc at bus_generic_new_pass+0xa8
#15 0xffff0000005362cc at bus_generic_new_pass+0xa8
#16 0xffff0000005316b0 at bus_set_pass+0x4c
#17 0xffff00000048c57c at mi_startup+0x12c
uma_zalloc_debug: zone "malloc-128" with the following non-sleepable locks held:
exclusive sleep mutex LED mtx (LED mtx) r = 0 (0xffff000000c14ad8) locked @ /usr/src/sys/dev/led/led.c:298
stack backtrace:
#0 0xffff000000568694 at witness_debugger+0x64
#1 0xffff000000569824 at witness_warn+0x3ec
#2 0xffff000000772194 at uma_zalloc_debug+0x2c
#3 0xffff000000771bf8 at uma_zalloc_arg+0x2c
#4 0xffff0000004d1848 at malloc+0xf0
#5 0xffff0000007ea788 at bounce_bus_dmamem_alloc+0x98
#6 0xffff0000007d533c at bcm2835_mbox_property+0xdc
#7 0xffff0000007d2898 at bcm2835_firmware_property+0x78
#8 0xffff0000007dd6bc at rpi_fw_gpio_pin_set+0x100
#9 0xffff000000295afc at led_create_state+0x158
#10 0xffff000000267448 at gpioled_attach+0x290
#11 0xffff0000005345cc at device_attach+0x400
#12 0xffff000000534134 at device_probe_and_attach+0x7c
#13 0xffff00000053631c at bus_generic_new_pass+0xf8
#14 0xffff0000005362cc at bus_generic_new_pass+0xa8
#15 0xffff0000005362cc at bus_generic_new_pass+0xa8
#16 0xffff0000005316b0 at bus_set_pass+0x4c
#17 0xffff00000048c57c at mi_startup+0x12c
uma_zalloc_debug: zone "malloc-64" with the following non-sleepable locks held:
exclusive sleep mutex LED mtx (LED mtx) r = 0 (0xffff000000c14ad8) locked @ /usr/src/sys/dev/led/led.c:298
stack backtrace:
#0 0xffff000000568694 at witness_debugger+0x64
#1 0xffff000000569824 at witness_warn+0x3ec
#2 0xffff000000772194 at uma_zalloc_debug+0x2c
#3 0xffff000000771bf8 at uma_zalloc_arg+0x2c
#4 0xffff0000004d1848 at malloc+0xf0
#5 0xffff0000007ea8d8 at bounce_bus_dmamem_alloc+0x1e8
#6 0xffff0000007d533c at bcm2835_mbox_property+0xdc
#7 0xffff0000007d2898 at bcm2835_firmware_property+0x78
#8 0xffff0000007dd6bc at rpi_fw_gpio_pin_set+0x100
#9 0xffff000000295afc at led_create_state+0x158
#10 0xffff000000267448 at gpioled_attach+0x290
#11 0xffff0000005345cc at device_attach+0x400
#12 0xffff000000534134 at device_probe_and_attach+0x7c
#13 0xffff00000053631c at bus_generic_new_pass+0xf8
#14 0xffff0000005362cc at bus_generic_new_pass+0xa8
#15 0xffff0000005362cc at bus_generic_new_pass+0xa8
#16 0xffff0000005316b0 at bus_set_pass+0x4c
#17 0xffff00000048c57c at mi_startup+0x12c
cryptosoft0: <software crypto>
Timecounters tick every 1.000 msec
usbus1: 480Mbps High Speed USB v2.0
iicbus0: <OFW I2C bus> on iichb0
iic0: <I2C generic I/O> on iicbus0
ugen1.1: <DWCOTG OTG Root HUB> at usbus1
uhub0 on usbus1
uhub0: <DWCOTG OTG Root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
mmcsd0: 32GB <SDHC SD32G 8.5 SN F1970496 MFG 09/2020 by 3 SD> at mmc0 50.0MHz/4bit/65535-block
bcm2835_cpufreq0: ARM 600MHz, Core 250MHz, SDRAM 400MHz, Turbo OFF
Release APs...done
CPU  0: ARM Cortex-A53 r0p4 affinity:  0
                   Cache Type = <64 byte D-cacheline,64 byte I-cacheline,VIPT ICache,64 byte ERG,64 byte CWG>
Trying to mount root from ufs:/dev/ufs/rootfs [rw]...
 Instruction Set Attributes 0 = <CRC32>
 Instruction Set Attributes 1 = <>
         Processor Features 0 = <AdvSIMD,FP,EL3 32,EL2 32,EL1 32,EL0 32>
         Processor Features 1 = <>
      Memory Model Features 0 = <TGran4,TGran64,SNSMem,BigEnd,16bit ASID,1TB PA>
      Memory Model Features 1 = <8bit VMID>
      Memory Model Features 2 = <32bit CCIDX,48bit VA>
             Debug Features 0 = <2 CTX BKPTs,4 Watchpoints,6 Breakpoints,PMUv3,Debugv8>
             Debug Features 1 = <>
         Auxiliary Features 0 = <>
         Auxiliary Features 1 = <>
CPU  1: ARM Cortex-A53 r0p4 affinity:  1
CPU  2: ARM Cortex-A53 r0p4 affinity:  2
CPU  3: ARM Cortex-A53 r0p4 affinity:  3
WARNING: WITNESS option enabled, expect reduced performance.
Warning: no time-of-day clock registered, system time will not be set accurately
Dual Console: Serial Primary, Video Secondary
uhub0: 1 port with 1 removable, self powered
Growing root partition to fill device
random: randomdev_wait_until_seeded unblock wait
ugen1.2: <vendor 0x0424 product 0x9514> at usbus1
uhub1 on uhub0
uhub1: <vendor 0x0424 product 0x9514, class 9/0, rev 2.00/2.00, addr 2> on usbus1
uhub1: MTT enabled
uhub1: 5 ports with 4 removable, self powered
random: unblocking device.
GEOM_PART: mmcsd0s2 was automatically resized.
  Use `gpart commit mmcsd0s2` to save changes or `gpart undo mmcsd0s2` to revert them.
mmcsd0s2 resized
mmcsd0s2a resized
gpart: arg0 'ufs/rootfs': Invalid argument
mmcsd0: Error indicated: 6 NO MEMORY
grordfs: read emmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
rror: 5129216: Operation not supported
sdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000010
sdhci_bcm0-slot0: Argument: 0x000197d7 | Trn mode: 0x00000036
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000003
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
ugen1.3: <vendor 0x0424 product 0xec00> at usbus1
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
smsc0 on uhub1
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
smsc0: <vendor 0x0424 product 0xec00, rev 2.00/2.00, addr 3> on usbus1
sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
sdhci_bcm0-slot0: ===========================================
sdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
smsc0: chip 0xec00, rev. 0002
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000010
sdhci_bcm0-slot0: Argument: 0x000197c7 | Trn mode: 0x00000036
miibus0: <MII bus>sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000002
 on smsc0
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
smscphy0: <SMC LAN8700 10/100 interface>sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
 PHY 1sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
 on miibus0
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
smscphy0:  sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
10baseTsdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
, 10baseT-FDXsdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
, 100baseTXsdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
, 100baseTX-FDXsdhci_bcm0-slot0: ===========================================
, autosdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.

sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
ue0: <USB Ethernet> on smsc0
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
ue0: Ethernet address: b8:27:eb:e7:f1:dc
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000010
sdhci_bcm0-slot0: Argument: 0x000199c7 | Trn mode: 0x00000036
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000003
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
sdhci_bcm0-slot0: ===========================================
sdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000001
sdhci_bcm0-slot0: Argument: 0x03b71ffe | Trn mode: 0x00000012
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000003
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
sdhci_bcm0-slot0: ===========================================
No suitable dump device was found.
eval: /sbin/md5: Input/output error
Setting hostuuid: 30303030-3030-3030-3930-653766316463.
Setting hostid: 0x.
sysctl: invalid unsigned long '0x'
eval: /sbin/swapon: Input/output error
Starting file system chesdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000028
sdhci_bcm0-slot0: Argument: 0x004b6400 | Trn mode: 0x00000036
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000003
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
sdhci_bcm0-slot0: ===========================================
cks:
eval: fsck: Input/output error
Unknown error 126; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
sdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000040
sdhci_bcm0-slot0: Argument: 0x00019980 | Trn mode: 0x00000036
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000002
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
sdhci_bcm0-slot0: 94ADMA addr:0x00000000 | Slot int: 0x00000000
ugen1.4: <Realtek 802.11n WLAN Adapter> at usbus1
sdhci_bcm0-slot0: ===========================================
nit 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
sdhci_bcm0-slot0: Got data interrupt 0x00600000, but there is no active command.
sdhci_bcm0-slot0: ============== REGISTER DUMP ==============
sdhci_bcm0-slot0: Sys addr: 0x00000000 | Version:  0x00009902
sdhci_bcm0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000008
sdhci_bcm0-slot0: Argument: 0x005f8218 | Trn mode: 0x00000036
sdhci_bcm0-slot0: Present:  0x010f0000 | Host ctl: 0x00000003
sdhci_bcm0-slot0: Power:    0x0000000f | Blk gap:  0x00000000
sdhci_bcm0-slot0: Wake-up:  0x00000000 | Clock:    0x00000207
sdhci_bcm0-slot0: Timeout:  0x0000000e | Int stat: 0x00000000
sdhci_bcm0-slot0: Int enab: 0x01ff00bb | Sig enab: 0x01ff00bb
sdhci_bcm0-slot0: AC12 err: 0x00000000 | Host ctl2:0x00000000
sdhci_bcm0-slot0: Caps:     0x00000000 | Caps2:    0x00000000
sdhci_bcm0-slot0: Max curr: 0x00000001 | ADMA err: 0x00000000
sdhci_bcm0-slot0: ADMA addr:0x00000000 | Slot int: 0x00000000
sdhci_bcm0-slot0: ===========================================
inpx scanner failed
Growing root partition to fill device
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
gpart: Operation not supported
gpart: Operation not supported
gpart: arg0 'ufs/rootfs': Invalid argument
growfs: unable to read superblock: Input/output error
No suitable dump device was found.
eval: /sbin/md5: Input/output error
Setting hostuuid: 30303030-3030-3030-3930-653766316463.
Setting hostid: 0x.
ugen1.5: <Logitech USB Receiver> at usbus1
ukbd0 on uhub1
ukbd0: <Logitech USB Receiver, class 0/0, rev 2.00/29.01, addr 5> on usbus1
invalid unsigned long '0x'
kbd1 at ukbd0
eval: /swapon: Input/output error
Fast boot: skipping disk checks.
mount: /dev/ufs/rootfs: Operation not supported
Mounting root filesystem rw failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2020-11-05T10:05:22.765870+00:00 - init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
input in flex scanner failed
Growing root partition to fill device
gpart: Operation not supported
gpart: Operation not supported
gpart: arg0 'ufs/rootfs': Invalid argument
growfs: unable to read superblock: Input/output error
No suitable dump device was found.
eval: /sbin/md5: Input/output error
Setting hostuuid: 30303030-3030-3030-3930-653766316463.
Setting hostid: 0x.
sysctl: invalid unsigned long '0x'
eval: /sbin/swapon: Input/output error
Fast boot: skipping disk checks.
WARNING: / was not properly dismounted
fsync: giving up on dirty (error = 45) 0xffffa00000d5f7c0: type VCHR
    usecount 1, writecount 0, refcount 25 seqc users 0 rdev 0xffffa00000f7bc00
    hold count flags ()
    flags ()
    v_object 0xffffa00000837b58 ref 0 pages 179 cleanbuf 22 dirtybuf 1
    lock type mntfs: EXCL by thread 0xffffa00000ff6580 (pid 171, mount, tid 100084)
mount: /dev/ufs/rootfs: Operation not supported
Mounting root filesystem rw failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2020-11-05T10:05:23.396207+00:00 - init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
input in flex scanner failed
Growing root partition to fill device
mmcsd0: Error indicated: 1 Timeout
gpaOperation not supported
mmcsd0: Error indicated: 1 Timeout
gparOperation not supported
gpart: arg0 'ufs/rootfs': Invalid argument
mmcsd0: Error indicated: 1 Timeout
growunable to read superblock: Input/output error
No suitable dump device was found.
mmcsd0: Error indicated: 1 Timeout
mmcsd0: Error indicated: 1 Timeout
eval: /sbin/md5: Input/output error
Setting hostuuid: 30303030-3030-3030-3930-653766316463.
Setting hostid: 0x.
sysctl: invalid unsigned long '0x'
eval: /sbin/swapon: Input/output error
Fast boot: skipping disk checks.
WARNING: / was not properly dismounted
Comment 2 Mark Johnston freebsd_committer 2020-11-11 21:48:33 UTC
I looked at this a bit.  For whatever reason, growfs is triggering a DMA transfer of length 512 on a buffer that is not cacheline size-aligned.  Because both the beginning and end of the buffer address range are not cacheline size-aligned, the loop in _bus_dmamap_count_pages() concludes that it needs 2 pages in order to bounce the request, but clearly that's not true since the entire request is only 512 bytes.
Comment 3 Mark Johnston freebsd_committer 2020-11-11 21:51:32 UTC
(In reply to Mark Johnston from comment #2)
... and the broadcom sdhci driver is using a busdma tag with a maximum transfer size of 512 bytes, so it didn't reserve enough pages, and bus_dmamap_load() fails.
Comment 4 Mark Johnston freebsd_committer 2020-11-11 22:06:14 UTC
BTW, I'm not sure why the driver fails so catastrophically after the initial failure.  Normal I/O should still work fine since it's sufficiently aligned.  Without knowing anything about these drivers it feels like the mmc driver or the broadcom driver have some error handling bug.
Comment 5 Steve Wills freebsd_committer 2020-11-23 16:20:30 UTC
This patch:

https://github.com/strejda/freebsd/commit/f7b80551d7e765d3b39f8ef9ed10cd18662dacc6

seems to fix it for me
Comment 6 Michal Meloun freebsd_committer 2020-12-03 12:59:55 UTC
Steve,
I afraid that above patch is wrong.
Mark,
because I cannot reproduce this issue, I need additional round of explanation :)
I think that you analysis is slightly unprecise. I think (if I’m able to read the code correctly)  that_bus_dmamap_count_pages()  returns 1 for unaligned buffer  *if whole buffer fits  into one page*. It returns 2 * if buffer is unaligned and if it crossing page boundary.  This looks like correct behavior for me.  IMHO, problem is that we should reflect this fact in pre-allocation of bounced pages - we short of 1 page in this computation (https://svnweb.freebsd.org/base/head/sys/arm64/arm64/busdma_bounce.c?annotate=367984#l311 or  https://svnweb.freebsd.org/base/head/sys/arm64/arm64/busdma_bounce.c?annotate=367984#l449 ) But arm code does this correctly (https://svnweb.freebsd.org/base/head/sys/arm/arm/busdma_machdep.c?annotate=365706#l546 or https://svnweb.freebsd.org/base/head/sys/arm/arm/busdma_machdep.c?annotate=365706#l672 ).

Did I miss something?
Thanks,
Michal
Comment 7 Mark Johnston freebsd_committer 2020-12-03 15:58:50 UTC
(In reply to Michal Meloun from comment #6)
> It returns 2 * if buffer is unaligned and if it crossing page boundary.  This looks like correct behavior for me.

So to be clear, I'm not familiar with busdma.  Why do we need two pages to bounce a 512 buffer?  Does busdma need to preserve the property that the original buffer crosses a page boundary?
Comment 8 Michal Meloun freebsd_committer 2020-12-03 16:44:14 UTC
Current busdma implementation may want to preserve page offset for bounced buffers (see BUS_DMA_KEEP_PG_OFFSET). Probably because of this, the all implementations splits source buffer on page boundaries, then process each fragment separately and at end bounced fragments are checked for physical continuity and eventually joined together.
Comment 9 Mark Johnston freebsd_committer 2020-12-03 17:04:56 UTC
(In reply to Michal Meloun from comment #8)
But here we are bouncing because the buffer is not aligned to a cache line, so we cannot preserve the page offset anyway.  Maybe I am misunderstanding what BUS_DMA_KEEP_PG_OFFSET means though.

Allocating two pages will work of course, but I do not see why it is required in this case.
Comment 10 Michal Meloun freebsd_committer 2020-12-04 08:34:32 UTC
We can hold page offset although buffer is not cacheline aligned. The cacheline alignment is necessary for generic buffer because we must ensure that other party doesn’t write simultaneously to other part of cacheline (thus making full cacheline dirty). But by putting bounced buffer to any location of fully owned page satisfies this condition – nobody can write to unused parts of start/tail of cacheline. 


But we slightly diverging. All implementation for counting bounced pages and loading buffers are now +/- consistent, although suboptimal. I’m not ready to change all these – so I want to change allocation of bounced pages. This is part where individual implementation differs, and where arm64 implementation is clearly not consistent with buffer loading/ page counting.
Comment 11 Mark Johnston freebsd_committer 2020-12-04 15:06:44 UTC
(In reply to Michal Meloun from comment #10)
That seems like a reasonable argument to me.  Thanks for explaining.
Comment 12 commit-hook freebsd_committer 2020-12-16 14:37:57 UTC
A commit references this bug:

Author: mmel
Date: Wed Dec 16 14:36:58 UTC 2020
New revision: 368697
URL: https://svnweb.freebsd.org/changeset/base/368697

Log:
  Allocate right number of pages for the bounced buffers crossing the page.

  One of the disadvantages of our current busdma code is the fact that
  we process the bounced buffer in a page-by-page manner. This means that
  the short (subpage) buffer allocated across page boundaries is bounced
  to 2 separate pages.

  This suboptimal behavior is consistent across all platforms and can be
  related to (probably unimplementable or incompatible with bouncing)
  BUS_DMA_KEEP_PG_OFFSET flag.

  Therefore, allocate one additional page to be fully comply with this
  requirement.

  Discused with:	markj
  PR:		251018

Changes:
  head/sys/arm64/arm64/busdma_bounce.c