Bug 235542 - Jetson TK1 SD Card Controller causes kernel panic on FreeBSD 13.0-CURRENT
Summary: Jetson TK1 SD Card Controller causes kernel panic on FreeBSD 13.0-CURRENT
Status: Closed FIXED
Alias: None
Product: Base System
Classification: Unclassified
Component: arm (show other bugs)
Version: CURRENT
Hardware: arm Any
: --- Affects Only Me
Assignee: Michal Meloun
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-05 21:04 UTC by Gerald Aryeetey
Modified: 2019-02-10 14:47 UTC (History)
4 users (show)

See Also:


Attachments
Panic log after attempting to load sdhci during boot (5.29 KB, text/plain)
2019-02-05 21:04 UTC, Gerald Aryeetey
no flags Details
Full boot log (with VERBOSE_SYSINIT enabled) (42.44 KB, text/plain)
2019-02-05 21:06 UTC, Gerald Aryeetey
no flags Details
Protect all bounce page stailq manipulation in add_bounce_page() with bounce_lock (755 bytes, patch)
2019-02-06 16:44 UTC, Marius Strobl
no flags Details | Diff
Improved panic log after applying patch (5.50 KB, text/plain)
2019-02-06 17:32 UTC, Gerald Aryeetey
no flags Details
testing patch (3.28 KB, patch)
2019-02-08 09:08 UTC, Michal Meloun
no flags Details | Diff
Test patch results (6.36 KB, text/plain)
2019-02-08 17:30 UTC, Gerald Aryeetey
no flags Details
Proposed patch (2.42 KB, patch)
2019-02-09 09:18 UTC, Michal Meloun
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gerald Aryeetey 2019-02-05 21:04:22 UTC
Created attachment 201780 [details]
Panic log after attempting to load sdhci during boot

When booting a Jetson TK1 on FreeBSD 13.0-CURRENT, the kernel panics when trying to initialize the SD Controller (sdhci device).  

However, if this device is disabled, a `kldload sdhci` occurs successfully. The SD Controller appears to still not be functional since no kernel messages appear when an SD card is inserted and no `/dev/sd[a-z]` devices exist.
Comment 1 Gerald Aryeetey 2019-02-05 21:06:15 UTC
Created attachment 201781 [details]
Full boot log (with VERBOSE_SYSINIT enabled)
Comment 2 Ed Maste freebsd_committer 2019-02-05 22:09:41 UTC
In response to a query on IRC - "if this device is disabled" refers to removing it from the kernel config (so we can load it as a module for testing)
Comment 3 Gerald Aryeetey 2019-02-05 22:38:39 UTC
Reverting the base r342634 also seems to fix the issue. 

The cause of the issue seems related to DMA support since disabling that on the current version by commenting out `sys/dev/sdhci/sdhci.c:998-1006` also seems to fix the issue.
Comment 4 Michal Meloun freebsd_committer 2019-02-06 09:41:04 UTC
It works for me without issues (but I have tested only emmc now).

For 'kldload sdhci' - this cannot work. sdhci module is generic driver for PCI based SDHCI cards. TK1 (and any other SoC based SDHCI implementation) uses it's own (subclassed from generic SDHCI) driver ->
https://svnweb.freebsd.org/base/head/sys/arm/nvidia/tegra_sdhci.c?view=markup

Also, lines 998-1006 in current sdhci.c are not related to DMA in any way:
https://svnweb.freebsd.org/base/head/sys/dev/sdhci/sdhci.c?revision=342634&view=markup#l998

I think that this can be related to used DTB and/or to boot method.

Since you already have WITHOUT_LLD_BOOTSTRAP/LD=ld.bfd, can you try to boot
TK1 by loading kernel directly? Something like 
bootp 0x80200000 1.2.3.4:tegra/arm.armv7/sys/JETSON-TK1/kernel; go 0x80200100
Comment 5 Marius Strobl freebsd_committer 2019-02-06 11:28:47 UTC
The changes from r342634 apparently triggers a bug in the ARMv6 bus_dma(9)
implementation, probably due to the increased alignment requirement or
the increased size of the DMA allocation. For me, nothing obvious stands
out in sys/arm/arm/busdma_machdep-v6.c but I'm not familiar with that
code and an ARM maintainer needs to take care of that. It would be helpful
to determine what line _bus_dmamap_load_phys+0x7d0 corresponds to in any
case, though.
A long shot would be to double MAX_BPAGES in busdma_machdep-v6.c. With
r342634 in place, sdhci_tegra(4) probably takes 32 bounce pages on ARMv6
and if MAX_BPAGES is the global limit - which I'm not sure of -, the
system might run out of total bounce pages. However, even if bumping
MAX_BPAGES solves the problem, that nevertheless would indicate a bug in
that bus_dma(9) implementation as it should fail gracefully when exceeding
bounce pages.

Michal, I found no indication in sdhci_tegra(4) (or the corresponding
Linux driver) that r342634 would pose a problem for that driver or the
hardware it drives. However, tegra_sdhci.c currently seems to obtain
the clock twice:
314 	rv = clk_get_by_ofw_index(dev, 0, 0, &sc->clk);
315 	if (rv != 0) {
316 	
317 		device_printf(dev, "Cannot get clock\n");
318 		goto fail;
319 	}
320 	
321 	rv = clk_get_by_ofw_index(dev, 0, 0, &sc->clk);
322 	if (rv != 0) {
323 		device_printf(dev, "Cannot get clock\n");
324 		goto fail;
325 	}

Is this done on purpose (in which case a comment would be helpful) or
a bug and actually redundant or a copy & paste error with the second
invocation in fact supposed to do something else?
Comment 6 Michal Meloun freebsd_committer 2019-02-06 14:19:34 UTC
So, I just tested SD card on my TK1. It also works without problems, so I'm unable to reproduce this issue.

I don't think that this issue is related to DMA bounced pages because we never bounce pages allocated by bus_dmamem_alloc() (I hope).
I'm also not sure about meaning of MAX_BPAGES, but I think that it limits maximum number of bounced pages per single busdma map. In any case, actual dump of my TK1 is attached (with two SDHCI controllers active - SD card + eMMC). 

And thanks for having noticed the problem with clocks - its a bug and actually redundant. I will fix this ASAP.


root@tegra124:~ # sysctl hw.busdma
hw.busdma.zone1.alignment: 131072
hw.busdma.zone1.lowaddr: 0xffffffff
hw.busdma.zone1.total_deferred: 0
hw.busdma.zone1.total_bounced: 0
hw.busdma.zone1.active_bpages: 0
hw.busdma.zone1.reserved_bpages: 0
hw.busdma.zone1.free_bpages: 33
hw.busdma.zone1.total_bpages: 33
hw.busdma.zone0.alignment: 4096
hw.busdma.zone0.lowaddr: 0xffffffff
hw.busdma.zone0.total_deferred: 0
hw.busdma.zone0.total_bounced: 33
hw.busdma.zone0.active_bpages: 0
hw.busdma.zone0.reserved_bpages: 0
hw.busdma.zone0.free_bpages: 516
hw.busdma.zone0.total_bpages: 516
hw.busdma.total_bpages: 549
hw.busdma.maploads_physmem: 17400
hw.busdma.maploads_mbuf: 259
hw.busdma.maploads_dmamem: 794
hw.busdma.maploads_coherent: 794
hw.busdma.maploads_bounced: 33
hw.busdma.maploads_total: 19479
hw.busdma.maps_coherent: 790
hw.busdma.maps_dmamem: 794
hw.busdma.maps_total: 1596
hw.busdma.tags_total: 34
Comment 7 Marius Strobl freebsd_committer 2019-02-06 15:13:45 UTC
Given that sdhci(4) creates the DMA tag with BUS_DMA_ALLOCNOW specified
(as appropriate for its use case), that leads at least to the allocation
of a bounce zone (zone1 with an alignment of 131072 in your dump). My
understanding is that MAX_BPAGES is the maximum number of bounces pages
per bounce zone. However, bounce zones are global and can be shared by
different DMA tags. Thus, if another DMA tag happens to be served by the
same bounce zone, the limit of 64 bounce pages may be too low with r342634
in place and trigger a bug in the ARMv6 bus_dma(9) code. However, like I
wrote, this is a long shot, i. e. not necessarily the case.
Whether, the bounce pages allocated at the time of DMA tag creation
actually get used later on is a different story.

Gerald, could you please get _bus_dmamap_load_phys+0x7d0 (or the
corresponding address from a newer back trace of that panic) translated
to the actual line number?
Comment 8 Gerald Aryeetey 2019-02-06 16:29:46 UTC
This should correspond with `_bus_dmamap_load_phys+0x7d0`
```
; STAILQ_INSERT_TAIL(&(map->bpages), bpage, links);
c0576d58:       18 70 85 e5     str     r7, [r5, #24]
c0576d5c:       04 00 94 e5     ldr     r0, [r4, #4]
c0576d60:       00 50 80 e5     str     r5, [r0] <<< THIS ADDRESS
c0576d64:       04 80 84 e5     str     r8, [r4, #4]
```

which is actually a line in `add_bounce_page` ...
Comment 9 Marius Strobl freebsd_committer 2019-02-06 16:44:25 UTC
Created attachment 201790 [details]
Protect all bounce page stailq manipulation in add_bounce_page() with bounce_lock
Comment 10 Marius Strobl freebsd_committer 2019-02-06 16:46:56 UTC
Hrm, that stailq manipulation is at least missing protection by the
bounce_lock, making it racing which might explain why Michal doesn't
manage to reproduce the panic. Gerald, could you please give the patch
from attachment 201790 [details] a try?
Comment 11 Gerald Aryeetey 2019-02-06 17:04:59 UTC
No luck.

After applying the patch, `llvm-objdump` points me to this line:
```
; STAILQ_INSERT_TAIL(&(map->bpages), bpage, links);
c0576d54:       04 00 94 e5     ldr     r0, [r4, #4]
c0576d58:       00 50 80 e5     str     r5, [r0] <<< THIS ADDRESS
```
which is still in `add_bounce_page`.

Also, `addr2line` points me here
```
 940 static int
 941 _bus_dmamap_reserve_pages(bus_dma_tag_t dmat, bus_dmamap_t map, int flags)
 942 {
 943 
 944         /* Reserve Necessary Bounce Pages */
 945         mtx_lock(&bounce_lock);
 946         if (flags & BUS_DMA_NOWAIT) {
 947                 if (reserve_bounce_pages(dmat, map, 0) != 0) {
 948                         map->pagesneeded = 0;
 949                         mtx_unlock(&bounce_lock); <<< THIS LINE
 950                         return (ENOMEM);
 951                 }
```
Comment 12 Gerald Aryeetey 2019-02-06 17:32:00 UTC
Created attachment 201793 [details]
Improved panic log after applying patch

I got a better error panic log after the applying the patch as well. It points to a different line: `sys/arm/arm/busdma_machdep-v6.c:1642`
Comment 13 Marius Strobl freebsd_committer 2019-02-06 17:57:17 UTC
Yeah, on a closer look, bounce_lock isn't actually meant to protect the
per-map bounce page stailq and the patch from attachment 201790 [details] in fact
is the cause for the panic seen in attachment 201793 [details], i. e. "page fault
with the following non-sleepable locks held", sorry.

Back to the previous panic, I don't see why that STAILQ_INSERT_TAIL() in add_bounce_page() should trigger an unresolvable page fault so far. So
apart from giving a doubled MAX_BPAGES a shot, I'm out of ideas for now.

Assuming that other ARMv6 sub-systems like the PMAP are fine, the only
other thing that came to my mind is defective RAM or a silicon bug that
needs a workaround applied. Are you using a current version of U-Boot?
Comment 14 Gerald Aryeetey 2019-02-06 18:16:35 UTC
I'm using the nvidia branch that comes with the default driver package. I'm pretty sure this is not the current version of u-boot
Comment 15 Jason A. Harmening freebsd_committer 2019-02-06 18:37:22 UTC
Just noticed this one on the mailing list this morning...

It looks like part of the problem is that the per-map bounce page list is initialized by allocate_bz_and_pages(), which is only called for maps created through bus_dmamap_create().  The map created in sdhci_dma_alloc() is allocated through bus_dmamem_alloc(), so the bpages list head will be left zero-filled.  That would explain the data abort with far==0.

But we also never expect bounce pages to be needed for maps created with bus_dmamem_alloc()  (maybe add a KASSERT on !(map->flags & DMAMAP_DMAMEM_ALLOC)).
I'm gonna guess we're getting here because sdhci_dma_alloc() now passes  slot->sdma_bbufsz as the alignment requirement.  I don't think the allocators used by bus_dmamem_alloc() can guarantee anything more than system page size alignment (4K), so if the SDHCI bounce buffer size is bigger than that then alignment_bounce() could cause busdma to allocate bounce pages.
Comment 16 Jason A. Harmening freebsd_committer 2019-02-06 19:23:37 UTC
EDIT: actually it looks like the allocation logic in bus_dmamem_alloc() should handle > PAGE_SIZE alignment correctly, assuming kmem_alloc_contig() does the right thing.  I think the problem might instead be that _bus_dmamap_load_buffer() iterates the buffer on a per-page basis when computing both pagesneeded and allocating bounce pages.  So a page somewhere in the middle of > PAGE_SIZE buffer with > PAGE_SIZE alignment will fail the alignment check and trigger a bounce page allocation even though it wouldn't really be needed.
Comment 17 Marius Strobl freebsd_committer 2019-02-06 20:28:45 UTC
Yeah, that was also my conclusion after your comment #15 made me
look at the code again.
Comment 18 Jason A. Harmening freebsd_committer 2019-02-06 20:34:13 UTC
Any chance this patch fixes the panic?

Index: sys/arm/arm/busdma_machdep-v6.c
===================================================================
--- sys/arm/arm/busdma_machdep-v6.c     (revision 343478)
+++ sys/arm/arm/busdma_machdep-v6.c     (working copy)
@@ -346,9 +346,10 @@
     bus_size_t size)
 {
 
-       return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+       return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+           ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
            alignment_bounce(dmat, addr) ||
-           cacheline_bounce(map, addr, size));
+           cacheline_bounce(map, addr, size)));
 }
 
 /*
Comment 19 Gerald Aryeetey 2019-02-06 21:07:54 UTC
Yea that worked.
Comment 20 Michal Meloun freebsd_committer 2019-02-08 07:38:43 UTC
Jason,
I don't agree. might_bounce() in _bus_dmamap_load_buffer() is called before page loop, not within. So it should be called with full buffer size and it should return false. Do to this, map->pagesneeded should be zero thus '(map-pagesneeded != 0 && must_bounce())' test inside page loop should be always negative.

I really thing that we see some secondary damage caused by some sort of memory corruption (or, unlikely, kmem_alloc_contig() failure).

Please, take in account that this works for me (on identical hardware but with different U-Boot version and with different boot sequence).
So the problem must be triggered by different memory layout (or initial context) or so...
Comment 21 Jason A. Harmening freebsd_committer 2019-02-08 07:55:27 UTC
(In reply to Michal Meloun from comment #20)

might_bounce() uses this logic right now:

           ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
            alignment_bounce(dmat, addr) ||
            cacheline_bounce(map, addr, size)

While you're correct that alignment_bounce() and cacheline_bounce() should never return true for the base buffer address, the BUS_DMA_EXCL_BOUNCE check might.  The tag passed by SDHCI specifies (BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR] as the exclusion zone.  I don't think we'd ever expect a physical page on a 32-bit device like this to be outside that zone, but it's possible that a parent DMA tag has a bigger exclusion zone.  In that case the SDHCI tag will inherit the tighter restriction from its parent.

Regardless of what's really happening with this device, I think the arm busdma code needs to also check against DMAMAP_DMAMEM_ALLOC to cope with cases like that.  That would be the direct equivalent of the check against nobounce_dmamap that gates all bounce tests in the x86 busdma_bounce implementation.
Comment 22 Jason A. Harmening freebsd_committer 2019-02-08 08:06:32 UTC
(In reply to Jason A. Harmening from comment #21)

"outside that zone" -> "inside that zone"
Comment 23 Michal Meloun freebsd_committer 2019-02-08 08:51:40 UTC
In this case, parent tag is NULL(verified).

> Regardless of what's really happening with this device, I think the arm
> busdma code needs to also check against DMAMAP_DMAMEM_ALLOC to cope
> with cases like that.

Yes, true. I agree with this. But missing check cannot affect this problem, there is clearly some other issue and I don't want to "solve" it by papering over it.
Comment 24 Michal Meloun freebsd_committer 2019-02-08 09:08:19 UTC
Created attachment 201833 [details]
testing patch

Gerald,
can you, please, try attached patch and send me console output?
(lines after 
'sdhci0: <Tegra SDHCI controller> mem 0x700b0400-0x700b05ff irq 78 on ofwbus0'
are sufficient)
Comment 25 Gerald Aryeetey 2019-02-08 17:30:27 UTC
Created attachment 201848 [details]
Test patch results
Comment 26 Michal Meloun freebsd_committer 2019-02-09 08:27:09 UTC
Thanks for the test.
Bad news is that this issue is kmem_alloc_contig() failure.
Good news is that I finally able to reproduce this on my TK1. It's not related to boot method or kernel load address, but to total memory size. U-Boot, when doing bootm boot sequence, adjusts memory size in DTB to 0x7FF0000 and this size triggers this bug. Direct boot (load kernel.bin and jump to start address) uses "safe" default size 0x70000000 and problem will not occur.

Anyway, I need some time to debug this, the VM subsystem is not easiest part of kernel code.
Comment 27 Michal Meloun freebsd_committer 2019-02-09 09:18:21 UTC
Created attachment 201864 [details]
Proposed patch

Bah, I'm idiot. We uses (explicitly documented in comment, as micro optimization) VA address of buffer as argument to might_bounce() function. This, of course, cannot work for alignments larger than PAGE_SIZE.
Please see attached patch.

Jason,
please, any comment? 
Is 'Reviewed by:  jah' OK for you?
Comment 28 Jason A. Harmening freebsd_committer 2019-02-09 10:04:12 UTC
(In reply to Michal Meloun from comment #27)

Good catch, I didn't see that when I looked through the code earlier.

I think you need to handle the case where buf is a user VA and pmap is a user pmap though.

But I also think it might still be better to just check against DMAMAP_DMAMEM_ALLOC in might_bounce() though.  I say that because it's faster than walking the pmap and the existing scheme works fine as long as dmat->alignment <= PAGE_SIZE.

We wouldn't realistically expect an arbitrary non-bus_dmamem_alloc()'ed buffer to satisfy > PAGE_SIZE physical alignment, and we don't really support it right now anyway.  If we wanted to support it, we would need to check that each non-contiguous physical segment in the buffer satisfies the alignment requirement when in might_bounce().  And if a segment didn't satisfy the requirement, we would either need to error out or go to extra lengths to bounce the segment correctly, since bounce buffers are all single pages right now.  But I don't think something like that would be worth the complexity, because I doubt anyone is going to need to map arbitrarily-allocated buffers through tags with > PAGE_SIZE alignment.  AFAIK no other bounce-buffer busdma implementation supports that either.

What would you think of a version of my earlier patch with a KASSERT to make sure (on INVARIANTS at least) that we don't paper over a legitimate allocation bug.  Something like:

Index: sys/arm/arm/busdma_machdep-v6.c
===================================================================
--- sys/arm/arm/busdma_machdep-v6.c     (revision 343478)
+++ sys/arm/arm/busdma_machdep-v6.c     (working copy)
@@ -346,9 +346,18 @@
     bus_size_t size)
 {
 
-       return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+       KASSERT(!(dmat->flags & DMAMAP_DMAMEM_ALLOC) ||
+           ((addr <= dmat->lowaddr) && ((addr + size - 1) <= dmat->lowaddr) &&
+           !alignment_bounce(dmat, addr) &&
+           !cacheline_bounce(map, addr, size)),
+           ("%s: alloc()'ed buffer at PA:0x%08lx of size 0x%08lx"
+           "does not satisfy tag alignment 0x%08lx, lowaddr 0x%08lx",
+           __func__, dmat->alignment, dmat->lowaddr,  addr, size));
+
+       return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+           ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
            alignment_bounce(dmat, addr) ||
-           cacheline_bounce(map, addr, size));
+           cacheline_bounce(map, addr, size)));
 }
 
 /*
Comment 29 Jason A. Harmening freebsd_committer 2019-02-09 10:59:16 UTC
(In reply to Jason A. Harmening from comment #28)

EDIT: the logic in that KASSERT would need to change for the case when addr is a VA instead of a PA, and the assert should maybe also be moved to bus_dmamem_alloc().
Comment 30 Michal Meloun freebsd_committer 2019-02-09 12:10:16 UTC
(In reply to Jason A. Harmening from comment #28)
> I think you need to handle the case where buf is a user VA and pmap is a user pmap though.

Oups, right, my bad.

> But I also think it might still be better to just check against
> DMAMAP_DMAMEM_ALLOC in might_bounce() though.  I say that because it's
> faster than walking the pmap and the existing scheme works fine as long
> as dmat->alignment <= PAGE_SIZE.
I agree. Root cause  of above problem is now known and we  sure that we don't papering over real problem, so why not.

> We wouldn't realistically expect an arbitrary non-bus_dmamem_alloc()'ed
> buffer to satisfy > PAGE_SIZE physical alignment, and we don't really
> support it right now anyway.

Again agree. And due to this, KASSERT(or panic?) for this unsupported combination can be enough. For me, kernel is trustworthy, drivers (mainly in development phase) not.

So what about this (plus some improvement of function comment):
diff --git a/sys/arm/arm/busdma_machdep-v6.c b/sys/arm/arm/busdma_machdep-v6.c
index 2aff3f8a150..a9e386e8bca 100644
--- a/sys/arm/arm/busdma_machdep-v6.c
+++ b/sys/arm/arm/busdma_machdep-v6.c
@@ -346,9 +346,16 @@ might_bounce(bus_dma_tag_t dmat, bus_dmamap_t map, bus_addr_t addr,
     bus_size_t size)
 {

-  return ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
+  KASSERT(dmat->flags & DMAMAP_DMAMEM_ALLOC ||
+      dmat->alignment <= PAGE_SIZE,
+      ("%s: unsupported alignment (0x%08lx) for buffer not "
+      "allocated by bus_dmamem_alloc()",
+      __func__, dmat->alignment));
+
+  return (!(dmat->flags & DMAMAP_DMAMEM_ALLOC) &&
+      ((dmat->flags & BUS_DMA_EXCL_BOUNCE) ||
       alignment_bounce(dmat, addr) ||
-      cacheline_bounce(map, addr, size));
+      cacheline_bounce(map, addr, size)));
 }
Comment 31 Jason A. Harmening freebsd_committer 2019-02-09 18:35:28 UTC
(In reply to Michal Meloun from comment #30)

That looks good.  You can put me on the "Reviewed by:" tag for that.
Comment 32 commit-hook freebsd_committer 2019-02-10 14:25:48 UTC
A commit references this bug:

Author: mmel
Date: Sun Feb 10 14:25:29 UTC 2019
New revision: 343962
URL: https://svnweb.freebsd.org/changeset/base/343962

Log:
  Properly handle alignment requests bigger that page size.
   - for now, alignments bigger that page size is allowed only for buffers
     allocated by bus_dmamem_alloc(), cover this fact by KASSERT.
   - never bounce buffers allocated by bus_dmamem_alloc(), these always comply
     with the required rules (alignment, boundary, address range).

  MFC after:	1 week
  Reviewed by:	jah
  PR:		235542

Changes:
  head/sys/arm/arm/busdma_machdep-v6.c
Comment 33 Michal Meloun freebsd_committer 2019-02-10 14:47:04 UTC
Fixed in r343962.
Jason, many thanks for cooperation.