Created attachment 201780 [details] Panic log after attempting to load sdhci during boot

When booting a Jetson TK1 on FreeBSD 13.0-CURRENT, the kernel panics while trying to initialize the SD controller (sdhci device). However, if this device is disabled, a subsequent `kldload sdhci` succeeds. Even then the SD controller does not appear to be functional, since no kernel messages appear when an SD card is inserted and no `/dev/sd[a-z]` devices exist.
Created attachment 201781 [details] Full boot log (with VERBOSE_SYSINIT enabled)
In response to a query on IRC: "if this device is disabled" refers to removing it from the kernel config (so we can load it as a module for testing).
Reverting base r342634 also seems to fix the issue. The cause appears to be related to DMA support, since disabling it on the current version by commenting out `sys/dev/sdhci/sdhci.c:998-1006` also seems to fix the issue.
It works for me without issues (but I have only tested eMMC so far).

As for `kldload sdhci` - this cannot work. The sdhci module is the generic driver for PCI-based SDHCI cards. The TK1 (and any other SoC-based SDHCI implementation) uses its own driver, subclassed from the generic SDHCI one -> https://svnweb.freebsd.org/base/head/sys/arm/nvidia/tegra_sdhci.c?view=markup

Also, lines 998-1006 in the current sdhci.c are not related to DMA in any way: https://svnweb.freebsd.org/base/head/sys/dev/sdhci/sdhci.c?revision=342634&view=markup#l998

I think this may be related to the DTB used and/or to the boot method. Since you already have WITHOUT_LLD_BOOTSTRAP/LD=ld.bfd, can you try to boot the TK1 by loading the kernel directly? Something like:
`bootp 0x80200000 1.2.3.4:tegra/arm.armv7/sys/JETSON-TK1/kernel; go 0x80200100`
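For context, a heavily abridged sketch of that subclassing pattern, written from memory rather than copied from the linked file (method and symbol names are approximations, not verbatim tegra_sdhci.c): the SoC driver supplies its own probe/attach and register access and forwards the MMC bridge interface to the generic sdhci(4) helpers, which is why loading the generic sdhci module by itself does not attach to the Tegra controller.
```
/*
 * Abridged, from-memory sketch of the SoC SDHCI subclassing pattern
 * (not the actual tegra_sdhci.c contents).  The SoC-specific probe,
 * attach and softc are omitted; the bus and mmcbr methods are simply
 * forwarded to the generic sdhci(4) code.
 */
static device_method_t tegra_sdhci_methods[] = {
	/* Device interface: SoC-specific */
	DEVMETHOD(device_probe,		tegra_sdhci_probe),
	DEVMETHOD(device_attach,	tegra_sdhci_attach),

	/* Bus interface: generic sdhci(4) helpers */
	DEVMETHOD(bus_read_ivar,	sdhci_generic_read_ivar),
	DEVMETHOD(bus_write_ivar,	sdhci_generic_write_ivar),

	/* MMC bridge interface: generic sdhci(4) helpers */
	DEVMETHOD(mmcbr_update_ios,	sdhci_generic_update_ios),
	DEVMETHOD(mmcbr_request,	sdhci_generic_request),
	DEVMETHOD(mmcbr_get_ro,		sdhci_generic_get_ro),
	DEVMETHOD(mmcbr_acquire_host,	sdhci_generic_acquire_host),
	DEVMETHOD(mmcbr_release_host,	sdhci_generic_release_host),

	DEVMETHOD_END
};

static devclass_t tegra_sdhci_devclass;

static driver_t tegra_sdhci_driver = {
	"sdhci",		/* devices probe as sdhci0, as in the boot log */
	tegra_sdhci_methods,
	sizeof(struct tegra_sdhci_softc),
};

/* The SoC driver attaches to simplebus (FDT) and depends on the generic code. */
DRIVER_MODULE(sdhci_tegra, simplebus, tegra_sdhci_driver, tegra_sdhci_devclass, NULL, NULL);
MODULE_DEPEND(sdhci_tegra, sdhci, 1, 1, 1);
```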
The changes from r342634 apparently trigger a bug in the ARMv6 bus_dma(9) implementation, probably due to the increased alignment requirement or the increased size of the DMA allocation. For me, nothing obvious stands out in sys/arm/arm/busdma_machdep-v6.c, but I'm not familiar with that code and an ARM maintainer needs to take care of it. It would be helpful to determine what line _bus_dmamap_load_phys+0x7d0 corresponds to in any case, though.

A long shot would be to double MAX_BPAGES in busdma_machdep-v6.c. With r342634 in place, sdhci_tegra(4) probably takes 32 bounce pages on ARMv6, and if MAX_BPAGES is the global limit - which I'm not sure of - the system might run out of total bounce pages. However, even if bumping MAX_BPAGES solves the problem, that would nevertheless indicate a bug in that bus_dma(9) implementation, as it should fail gracefully when exceeding the bounce page limit.

Michal, I found no indication in sdhci_tegra(4) (or the corresponding Linux driver) that r342634 would pose a problem for that driver or the hardware it drives. However, tegra_sdhci.c currently seems to obtain the clock twice:
```
314     rv = clk_get_by_ofw_index(dev, 0, 0, &sc->clk);
315     if (rv != 0) {
316
317             device_printf(dev, "Cannot get clock\n");
318             goto fail;
319     }
320
321     rv = clk_get_by_ofw_index(dev, 0, 0, &sc->clk);
322     if (rv != 0) {
323             device_printf(dev, "Cannot get clock\n");
324             goto fail;
325     }
```
Is this done on purpose (in which case a comment would be helpful), is it a bug with the second call actually redundant, or is it a copy & paste error with the second invocation in fact supposed to do something else?
So, I just tested an SD card on my TK1. It also works without problems, so I'm unable to reproduce this issue.

I don't think this issue is related to DMA bounce pages, because we never bounce pages allocated by bus_dmamem_alloc() (I hope). I'm also not sure about the exact meaning of MAX_BPAGES, but I think it limits the maximum number of bounce pages per single busdma map. In any case, the actual dump from my TK1 is below (with two SDHCI controllers active - SD card + eMMC).

And thanks for noticing the problem with the clocks - it's a bug and the second call is indeed redundant. I will fix this ASAP.

```
root@tegra124:~ # sysctl hw.busdma
hw.busdma.zone1.alignment: 131072
hw.busdma.zone1.lowaddr: 0xffffffff
hw.busdma.zone1.total_deferred: 0
hw.busdma.zone1.total_bounced: 0
hw.busdma.zone1.active_bpages: 0
hw.busdma.zone1.reserved_bpages: 0
hw.busdma.zone1.free_bpages: 33
hw.busdma.zone1.total_bpages: 33
hw.busdma.zone0.alignment: 4096
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 33
hw.busdma.zone0.active_bpages: 0
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 516
hw.busdma.zone0.total_bpages: 516
hw.busdma.total_bpages: 549
hw.busdma.maploads_physmem: 17400
hw.busdma.maploads_mbuf: 259
hw.busdma.maploads_dmamem: 794
hw.busdma.maploads_coherent: 794
hw.busdma.maploads_bounced: 33
hw.busdma.maploads_total: 19479
hw.busdma.maps_coherent: 790
hw.busdma.maps_dmamem: 794
hw.busdma.maps_total: 1596
hw.busdma.tags_total: 34
```
Given that sdhci(4) creates the DMA tag with BUS_DMA_ALLOCNOW specified (as appropriate for its use case), that leads at least to the allocation of a bounce zone (zone1 with an alignment of 131072 in your dump). My understanding is that MAX_BPAGES is the maximum number of bounce pages per bounce zone. However, bounce zones are global and can be shared by different DMA tags. Thus, if another DMA tag happens to be served by the same bounce zone, the limit of 64 bounce pages may be too low with r342634 in place and trigger a bug in the ARMv6 bus_dma(9) code. Like I wrote, though, this is a long shot, i.e. not necessarily the case. Whether the bounce pages allocated at the time of DMA tag creation actually get used later on is a different story.

Gerald, could you please get _bus_dmamap_load_phys+0x7d0 (or the corresponding address from a newer back trace of that panic) translated to the actual line number?
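For reference, a minimal sketch of the kind of tag creation being discussed - not the actual sys/dev/sdhci/sdhci.c code, and the slot->sdma_bbufsz and slot->dmat names are illustrative assumptions. It combines the facts mentioned in this thread: alignment equal to the SDMA bounce-buffer size, a 32-bit address restriction, and BUS_DMA_ALLOCNOW, which is what makes a bounce zone (zone1 above) appear at tag-creation time.
```
/*
 * Illustrative sketch only (not the actual sdhci.c code); field names
 * are assumptions.  With BUS_DMA_ALLOCNOW, bus_dma_tag_create()
 * reserves bounce resources up front, which is why zone1 above already
 * shows an alignment of 131072 before any I/O has happened.
 */
error = bus_dma_tag_create(
    bus_get_dma_tag(dev),	/* parent tag */
    slot->sdma_bbufsz, 0,	/* alignment = SDMA buffer size, no boundary */
    BUS_SPACE_MAXADDR_32BIT,	/* lowaddr: controller does 32-bit DMA */
    BUS_SPACE_MAXADDR,		/* highaddr */
    NULL, NULL,			/* no filter */
    slot->sdma_bbufsz,		/* maxsize */
    1,				/* nsegments */
    slot->sdma_bbufsz,		/* maxsegsize */
    BUS_DMA_ALLOCNOW,		/* reserve bounce resources now */
    NULL, NULL,			/* default locking */
    &slot->dmat);
if (error != 0)
	return (error);
```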
This should correspond with `_bus_dmamap_load_phys+0x7d0`:
```
; STAILQ_INSERT_TAIL(&(map->bpages), bpage, links);
c0576d58: 18 70 85 e5   str r7, [r5, #24]
c0576d5c: 04 00 94 e5   ldr r0, [r4, #4]
c0576d60: 00 50 80 e5   str r5, [r0]      <<< THIS ADDRESS
c0576d64: 04 80 84 e5   str r8, [r4, #4]
```
which is actually a line in `add_bounce_page` ...
Created attachment 201790 [details] Protect all bounce page stailq manipulation in add_bounce_page() with bounce_lock
Hrm, that stailq manipulation is at least missing protection by bounce_lock, making it racy, which might explain why Michal doesn't manage to reproduce the panic. Gerald, could you please give the patch from attachment 201790 [details] a try?
No luck. After applying the patch, `llvm-objdump` points me to this line:
```
; STAILQ_INSERT_TAIL(&(map->bpages), bpage, links);
c0576d54: 04 00 94 e5   ldr r0, [r4, #4]
c0576d58: 00 50 80 e5   str r5, [r0]      <<< THIS ADDRESS
```
which is still in `add_bounce_page`. Also, `addr2line` points me here:
```
940 static int
941 _bus_dmamap_reserve_pages(bus_dma_tag_t dmat, bus_dmamap_t map, int flags)
942 {
943
944         /* Reserve Necessary Bounce Pages */
945         mtx_lock(&bounce_lock);
946         if (flags & BUS_DMA_NOWAIT) {
947                 if (reserve_bounce_pages(dmat, map, 0) != 0) {
948                         map->pagesneeded = 0;
949                         mtx_unlock(&bounce_lock);      <<< THIS LINE
950                         return (ENOMEM);
951                 }
```
Created attachment 201793 [details] Improved panic log after applying patch

I got a better panic log after applying the patch as well. It points to a different line: `sys/arm/arm/busdma_machdep-v6.c:1642`
Yeah, on a closer look, bounce_lock isn't actually meant to protect the per-map bounce page stailq, and the patch from attachment 201790 [details] is in fact the cause of the panic seen in attachment 201793 [details], i.e. "page fault with the following non-sleepable locks held" - sorry. Back to the previous panic: so far I don't see why that STAILQ_INSERT_TAIL() in add_bounce_page() should trigger an unresolvable page fault. So apart from giving a doubled MAX_BPAGES a shot, I'm out of ideas for now. Assuming that other ARMv6 sub-systems like the PMAP are fine, the only other thing that comes to mind is defective RAM or a silicon bug that needs a workaround applied. Are you using a current version of U-Boot?
I'm using the nvidia branch that comes with the default driver package. I'm pretty sure this is not the current version of U-Boot.
Just noticed this one on the mailing list this morning...

It looks like part of the problem is that the per-map bounce page list is initialized by allocate_bz_and_pages(), which is only called for maps created through bus_dmamap_create(). The map created in sdhci_dma_alloc() is allocated through bus_dmamem_alloc(), so the bpages list head will be left zero-filled. That would explain the data abort with far==0. But we also never expect bounce pages to be needed for maps created with bus_dmamem_alloc() (maybe add a KASSERT on !(map->flags & DMAMAP_DMAMEM_ALLOC)).

I'm gonna guess we're getting here because sdhci_dma_alloc() now passes slot->sdma_bbufsz as the alignment requirement. I don't think the allocators used by bus_dmamem_alloc() can guarantee anything more than system page size alignment (4K), so if the SDHCI bounce buffer size is bigger than that then alignment_bounce() could cause busdma to allocate bounce pages.
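A minimal sketch of the assertion suggested above - an assumption about placement and exact wording, not committed code:
```
/*
 * Hedged sketch, not committed code: the suggested sanity check, placed
 * at the top of add_bounce_page() in sys/arm/arm/busdma_machdep-v6.c.
 * Maps created by bus_dmamem_alloc() carry DMAMAP_DMAMEM_ALLOC in
 * map->flags and are never expected to bounce, so trap that case under
 * INVARIANTS instead of walking an uninitialized per-map bpages list.
 */
KASSERT((map->flags & DMAMAP_DMAMEM_ALLOC) == 0,
    ("add_bounce_page: bounce page requested for a map created by "
    "bus_dmamem_alloc()"));
```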
EDIT: actually it looks like the allocation logic in bus_dmamem_alloc() should handle > PAGE_SIZE alignment correctly, assuming kmem_alloc_contig() does the right thing. I think the problem might instead be that _bus_dmamap_load_buffer() iterates the buffer on a per-page basis both when computing pagesneeded and when allocating bounce pages. So a page somewhere in the middle of a > PAGE_SIZE buffer with > PAGE_SIZE alignment will fail the alignment check and trigger a bounce page allocation even though it wouldn't really be needed.
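To make that concrete, here is a small self-contained illustration (plain userland C, not kernel code; the addresses are made up) of how a strictly per-page alignment test misfires for an already-aligned, contiguous buffer once the required alignment exceeds PAGE_SIZE:
```
/*
 * Standalone illustration (userland C, hypothetical addresses): a
 * per-page "addr & (alignment - 1)" test flags every page except the
 * first of a buffer that is in fact contiguous and correctly aligned
 * as a whole, once alignment > PAGE_SIZE.
 */
#include <stdio.h>

#define PAGE_SIZE	4096UL

static int
alignment_bounce(unsigned long alignment, unsigned long addr)
{
	return ((addr & (alignment - 1)) != 0);
}

int
main(void)
{
	unsigned long buf = 0x80100000UL;	/* hypothetical 64 KiB-aligned PA */
	unsigned long alignment = 65536UL;	/* > PAGE_SIZE, like sdma_bbufsz */
	unsigned long size = 65536UL;
	unsigned long addr;

	for (addr = buf; addr < buf + size; addr += PAGE_SIZE)
		printf("page at 0x%08lx: %s\n", addr,
		    alignment_bounce(alignment, addr) ?
		    "would be bounced" : "no bounce needed");
	return (0);
}
```
With this input, 15 of the 16 pages are reported as needing a bounce even though none of them actually do.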
Yeah, that was also my conclusion after your comment #15 made me look at the code again.
Any chance this patch fixes the panic?

```
Index: sys/arm/arm/busdma_machdep-v6.c
===================================================================
--- sys/arm/arm/busdma_machdep-v6.c	(revision 343478)
+++ sys/arm/arm/busdma_machdep-v6.c	(working copy)
@@ -346,9 +346,10 @@
     bus_size_t size)
 {
 
-	return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+	return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+	    ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
 	    alignment_bounce(dmat, addr) ||
-	    cacheline_bounce(map, addr, size));
+	    cacheline_bounce(map, addr, size)));
 }
 
 /*
```
Yea that worked.
Jason, I don't agree. might_bounce() in _bus_dmamap_load_buffer() is called before the page loop, not within it, so it is called with the full buffer size and should return false. Due to this, map->pagesneeded should be zero, and thus the '(map->pagesneeded != 0 && must_bounce())' test inside the page loop should always be negative. I really think we are seeing some secondary damage caused by some sort of memory corruption (or, less likely, a kmem_alloc_contig() failure). Please take into account that this works for me (on identical hardware, but with a different U-Boot version and a different boot sequence), so the problem must be triggered by a different memory layout (or initial context) or so...
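For readers following along, an abbreviated paraphrase of the control flow under discussion - reconstructed from memory and simplified, not verbatim sys/arm/arm/busdma_machdep-v6.c source:
```
/*
 * Abbreviated paraphrase of _bus_dmamap_load_buffer() (simplified, not
 * verbatim source).  The might_bounce() pre-check runs once, before the
 * per-page loop; only if it returns true are bounce pages counted and
 * reserved, and the per-page must_bounce() test is gated on
 * map->pagesneeded != 0.
 */
if (might_bounce(dmat, map, (bus_addr_t)buf, buflen)) {
	_bus_dmamap_count_pages(dmat, map, pmap, buf, buflen, flags);
	if (map->pagesneeded != 0) {
		error = _bus_dmamap_reserve_pages(dmat, map, flags);
		if (error != 0)
			return (error);
	}
}

while (buflen > 0) {
	/* curaddr = physical address of the current chunk of buf */
	sgsize = MIN(buflen, PAGE_SIZE - (curaddr & PAGE_MASK));
	if (map->pagesneeded != 0 && must_bounce(dmat, map, curaddr, sgsize))
		curaddr = add_bounce_page(dmat, map, kvaddr, curaddr, sgsize);
	/* ... append (curaddr, sgsize) to the segment list ... */
	buflen -= sgsize;
}
```
Michal's point is that if the pre-check were evaluated against the whole, correctly aligned buffer, pagesneeded would stay zero and the loop could never reach add_bounce_page(); as it turns out later in the thread, the pre-check is handed the buffer's kernel VA as a micro-optimization, which cannot satisfy a > PAGE_SIZE alignment test.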
(In reply to Michal Meloun from comment #20) might_bounce() uses this logic right now: ((dmat->flags & BUS_DMA_EXCL_BOUNCE) || alignment_bounce(dmat, addr) || cacheline_bounce(map, addr, size) While you're correct that alignment_bounce() and cacheline_bounce() should never return true for the base buffer address, the BUS_DMA_EXCL_BOUNCE check might. The tag passed by SDHCI specifies (BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR] as the exclusion zone. I don't think we'd ever expect a physical page on a 32-bit device like this to be outside that zone, but it's possible that a parent DMA tag has a bigger exclusion zone. In that case the SDHCI tag will inherit the tighter restriction from its parent. Regardless of what's really happening with this device, I think the arm busdma code needs to also check against DMAMAP_DMAMEM_ALLOC to cope with cases like that. That would be the direct equivalent of the check against nobounce_dmamap that gates all bounce tests in the x86 busdma_bounce implementation.
(In reply to Jason A. Harmening from comment #21) "outside that zone" -> "inside that zone"
In this case, the parent tag is NULL (verified).

> Regardless of what's really happening with this device, I think the arm
> busdma code needs to also check against DMAMAP_DMAMEM_ALLOC to cope
> with cases like that.

Yes, true, I agree with this. But the missing check cannot affect this problem; there is clearly some other issue, and I don't want to "solve" it by papering over it.
Created attachment 201833 [details] testing patch

Gerald, can you please try the attached patch and send me the console output? (The lines after 'sdhci0: <Tegra SDHCI controller> mem 0x700b0400-0x700b05ff irq 78 on ofwbus0' are sufficient.)
Created attachment 201848 [details] Test patch results
Thanks for the test. The bad news is that this issue is a kmem_alloc_contig() failure. The good news is that I am finally able to reproduce it on my TK1. It is not related to the boot method or the kernel load address, but to the total memory size. U-Boot, when doing the bootm boot sequence, adjusts the memory size in the DTB to 0x7FF0000, and this size triggers the bug. Direct boot (loading kernel.bin and jumping to the start address) uses the "safe" default size of 0x70000000, and the problem does not occur. Anyway, I need some time to debug this; the VM subsystem is not the easiest part of the kernel code.
Created attachment 201864 [details] Proposed patch

Bah, I'm an idiot. We use the VA of the buffer (explicitly documented in a comment, as a micro-optimization) as the argument to the might_bounce() function. This, of course, cannot work for alignments larger than PAGE_SIZE. Please see the attached patch. Jason, any comments? Is 'Reviewed by: jah' OK with you?
(In reply to Michal Meloun from comment #27)

Good catch, I didn't see that when I looked through the code earlier. I think you need to handle the case where buf is a user VA and pmap is a user pmap, though.

But I also think it might still be better to just check against DMAMAP_DMAMEM_ALLOC in might_bounce(). I say that because it's faster than walking the pmap, and the existing scheme works fine as long as dmat->alignment <= PAGE_SIZE. We wouldn't realistically expect an arbitrary non-bus_dmamem_alloc()'ed buffer to satisfy > PAGE_SIZE physical alignment, and we don't really support that right now anyway. If we wanted to support it, we would need to check in might_bounce() that each non-contiguous physical segment in the buffer satisfies the alignment requirement. And if a segment didn't satisfy the requirement, we would either need to error out or go to extra lengths to bounce the segment correctly, since bounce buffers are all single pages right now. But I don't think something like that would be worth the complexity, because I doubt anyone is going to need to map arbitrarily-allocated buffers through tags with > PAGE_SIZE alignment. AFAIK no other bounce-buffer busdma implementation supports that either.

What would you think of a version of my earlier patch with a KASSERT to make sure (with INVARIANTS at least) that we don't paper over a legitimate allocation bug? Something like:

```
Index: sys/arm/arm/busdma_machdep-v6.c
===================================================================
--- sys/arm/arm/busdma_machdep-v6.c	(revision 343478)
+++ sys/arm/arm/busdma_machdep-v6.c	(working copy)
@@ -346,9 +346,18 @@
     bus_size_t size)
 {
 
-	return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+	KASSERT(!(dmat->flags & DMAMAP_DMAMEM_ALLOC) ||
+	    ((addr <= dmat->lowaddr) && ((addr + size - 1) <= dmat->lowaddr) &&
+	    !alignment_bounce(dmat, addr) &&
+	    !cacheline_bounce(map, addr, size)),
+	    ("%s: alloc()'ed buffer at PA:0x%08lx of size 0x%08lx"
+	    "does not satisfy tag alignment 0x%08lx, lowaddr 0x%08lx",
+	    __func__, dmat->alignment, dmat->lowaddr, addr, size));
+
+	return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+	    ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
 	    alignment_bounce(dmat, addr) ||
-	    cacheline_bounce(map, addr, size));
+	    cacheline_bounce(map, addr, size)));
 }
 
 /*
```
(In reply to Jason A. Harmening from comment #28) EDIT: the logic in that KASSERT would need to change for the case when addr is a VA instead of a PA, and the assert should maybe also be moved to bus_dmamem_alloc().
(In reply to Jason A. Harmening from comment #28)

> I think you need to handle the case where buf is a user VA and pmap is a user pmap though.

Oops, right, my bad.

> But I also think it might still be better to just check against
> DMAMAP_DMAMEM_ALLOC in might_bounce() though. I say that because it's
> faster than walking the pmap and the existing scheme works fine as long
> as dmat->alignment <= PAGE_SIZE.

I agree. The root cause of the above problem is now known, and we are sure we are not papering over a real problem, so why not.

> We wouldn't realistically expect an arbitrary non-bus_dmamem_alloc()'ed
> buffer to satisfy > PAGE_SIZE physical alignment, and we don't really
> support it right now anyway.

Again, I agree. And because of this, a KASSERT (or panic?) for this unsupported combination should be enough. For me, the kernel is trustworthy; drivers (mainly those still in development) are not. So what about this (plus some improvement of the function comment):

```
diff --git a/sys/arm/arm/busdma_machdep-v6.c b/sys/arm/arm/busdma_machdep-v6.c
index 2aff3f8a150..a9e386e8bca 100644
--- a/sys/arm/arm/busdma_machdep-v6.c
+++ b/sys/arm/arm/busdma_machdep-v6.c
@@ -346,9 +346,16 @@ might_bounce(bus_dma_tag_t dmat, bus_dmamap_t map, bus_addr_t addr,
     bus_size_t size)
 {
 
-	return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+	KASSERT(dmat->flags & DMAMAP_DMAMEM_ALLOC ||
+	    dmat->alignment <= PAGE_SIZE,
+	    ("%s: unsupported alignment (0x%08lx) for buffer not "
+	    "allocated by bus_dmamem_alloc()",
+	    __func__, dmat->alignment));
+
+	return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+	    ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
 	    alignment_bounce(dmat, addr) ||
-	    cacheline_bounce(map, addr, size));
+	    cacheline_bounce(map, addr, size)));
 }
```
(In reply to Michal Meloun from comment #30) That looks good. You can put me on the "Reviewed by:" tag for that.
A commit references this bug:

Author: mmel
Date: Sun Feb 10 14:25:29 UTC 2019
New revision: 343962
URL: https://svnweb.freebsd.org/changeset/base/343962

Log:
  Properly handle alignment requests bigger that page size.
  - for now, alignments bigger that page size is allowed only for buffers
    allocated by bus_dmamem_alloc(), cover this fact by KASSERT.
  - never bounce buffers allocated by bus_dmamem_alloc(), these always
    comply with the required rules (alignment, boundary, address range).

  MFC after: 1 week
  Reviewed by: jah
  PR: 235542

Changes:
  head/sys/arm/arm/busdma_machdep-v6.c
Fixed in r343962. Jason, many thanks for your cooperation.