Hello Team, We are using the below 3 routines to alloc DMA memory for one of our devices connected through PCIe interface. Out of 10 rebooted 4-5 times we get the address outside greated 32G range of addresses, due to which our device functionality is effected and there is a crash seen. error = bus_dma_tag_create(bus_get_dma_tag(dev), PAGE_SIZE, 0, BUS_SPACE_MAXADDR_32BIT, BUS_SPACE_MAXADDR, NULL, NULL, 128*1024*1024, 1, 128*1024*1024, BUS_DMA_COHERENT, NULL, NULL, &dma_tag); error = bus_dmamem_alloc(dma_tag, &vaddr, BUS_DMA_WAITOK | BUS_DMA_ZERO | BUS_DMA_COHERENT, &dma_map); error = bus_dmamap_load(dma_tag, dma_map, vaddr, 128*1024*1024, dma_mem_callback, &dma_paddr, BUS_DMA_NOWAIT); Below are the addresses we get for Working and non-working case Working case vaddr and dma_paddr: DMA phys addr with arm64_address_translate is 0xff000000cf000b80 for vaddr 0xffff00007ec00000 DMA phys addr with vtophys is 0xcf000000 for vaddr 0xffff00007ec00000 Non-Working case vaddr and dma_paddr: DMA phys addr with arm64_address_translate is 0xff00000163000b80 for vaddr 0xffff00007ec00000 DMA phys addr with vtophys is 0x163000000 for vaddr 0xffff00007ec00000
Which exact revision of FreeBSD are you running?
Below is the upstream hash, commit 8f00c7ed76076f69466f5324b1fe84da899316af Merge: f02f96ae032 f7f4bd06a8d Author: Steve Kiernan <stevek@juniper.net> Date: Wed Oct 4 12:02:49 2023 -0700 Merge remote-tracking branch 'origin/freebsd/main' Change-Id: I6ee447a86939db7bc52de14819e3fe77af46dce7
(In reply to Sha Faisal from comment #2) Ok, thank you, that's quite recent. Did you verify that kmem_alloc_contig() is getting called with low=0,high=0xffffffff?
Hi mark, below are the debug prints in kmem_alloc_contig dmat lowaddr=0xffffffff, dmat highaddr= 0xffffffffffffffff, dmat maxsize 0x8000000 In case kmem_alloc_contig Args to kmem_alloc_contig are dmat->alloc_size = 0x08000000, mflags= 0x102, dmat->common.lowaddr = 0xffffffff, dmat->alloc_alignment= 0x1000, dmat->common.boundary = 0x0, attr= 0x2 Regards Faisal
(In reply to Sha Faisal from comment #0) Could you please show the output of "sysctl vm.phys_segs" from the system where this is happening? Doug, is it possible that one of the recent commits to vm_phys.c introduced a regression?
I don't know that commit 9e8174289236 (as fixed by commit 58d427172157ded) introduced a regression here, but if I had to pick one to investigate, I'd start with that one.
The below o/p of "sysctl vm.phys_segs" Working case: In case kmem_alloc_contig Args to kmem_alloc_contig are dmat->alloc_size = 0x08000000, mflags= 0x102,dmat->common.lowaddr = 0xffffffff, dmat->alloc_alignment= 0x1000, dmat->common.boundary = 0x0, attr= 0x2 dma tag is 0xffffa000ad7f2500 and vaddr is 0xffff00007ec00000 DMA phys addr with arm64_address_translate is 0xff000000de000b80 for vaddr 0xffff00007ec00000 DMA phys addr with vtophys is 0xde000000 for vaddr 0xffff00007ec00000 root@:~ # sysctl vm.phys_segs vm.phys_segs: SEGMENT 0: start: 0x80000000 end: 0xf365f000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 1: start: 0xf3662000 end: 0xf3800000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 2: start: 0xf6364000 end: 0xfdaef000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 3: start: 0xfdaf8000 end: 0xfffc7000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 4: start: 0xfffc8000 end: 0x179b37000 domain: 0 free list: 0xffff0000007e6e00 root@:~ # Non-working case (where the error is seen) In case kmem_alloc_contig Args to kmem_alloc_contig are dmat->alloc_size = 0x08000000, mflags= 0x102,dmat->common.lowaddr = 0xffffffff, dmat->alloc_alignment= 0x1000, dmat->common.boundary = 0x0, attr= 0x2 dma tag is 0xffffa000afb94200 and vaddr is 0xffff00007ec00000 DMA phys addr with arm64_address_translate is 0xff0000014c000b80 for vaddr 0xffff00007ec00000 DMA phys addr with vtophys is 0x14c000000 for vaddr 0xffff00007ec00000 root@:~ # sysctl vm.phys_segs vm.phys_segs: SEGMENT 0: start: 0x80000000 end: 0xf365f000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 1: start: 0xf3662000 end: 0xf3800000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 2: start: 0xf6364000 end: 0xfdaef000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 3: start: 0xfdaf8000 end: 0xfffc7000 domain: 0 free list: 0xffff0000007e6e00 SEGMENT 4: start: 0xfffc8000 end: 0x179b37000 domain: 0 free list: 0xffff0000007e6e00 root@:~ #
I don't think Mark's question in comment 3 has been answered. It's not clear to me that this is reporting a regression; nothing written here says that this problem appeared only recently, or after some version. I cannot tell whether the bug reported is in code checked in already, or in the code that the reporter is developing. I'm sorry for my general unhelpfulness, but I don't yet understand the problem well enough to know whether some change of mine in the last year or two could have caused it.
(In reply to Doug Moore from comment #8) From a short conversation on IRC I believe this is indeed a regression, some driver code is being rebased onto a new FreeBSD, and this behaviour of kmem_alloc_contig() is causing problems because the constraints of the busdma tag aren't being honoured. (In reply to Sha Faisal from comment #7) Two more questions: which FreeBSD revision did you upgrade from, and are you running with KASSERTs enabled?
(In reply to Mark Johnston from comment #9) Last working version was with freeBSD 12, I guess KASSERTS are not enabled, i will confirm
(In reply to Doug Moore from comment #6) I took the patch "https://github.com/freebsd/freebsd-src/commit/9e8174289236" and reverted it on my SB, with the patch reverted the issue was seen. 3 out of 2 reboots, I got an address out-of-range and our device does not come up.
commit 342056fa1c7a36d90feafd593fb980f98563f32c is another change I'm responsible for that could be the cause if it has some problem I've missed.
(In reply to Doug Moore from comment #12) With this patch reverted, the issue is not seen with 6-8 reboots.
(In reply to Sha Faisal from comment #13) So that I can better understand what you mean by "with this patch reverted", can you provide a diff file showing the difference between the current vm_phys.c code and what you found to work? I'm not sure whether you've undone only the changes made in that patch, or also undone some more recent changes. If it was that patch only, then I'd expect that just removing these lines: /* * If a previous segment led to a search using * the same free lists as would this segment, then * we've actually already searched within this * too. So skip it. */ if (seg->free_queues == queues) continue; would fix the problem.
Created attachment 245992 [details] patch reverted on vm_phys.c, which is working fine the patch aplied on vm_phys.c seems to be working.
(In reply to Doug Moore from comment #14) with this change reverted the issue is still there.
I can't apply this patch to main. 4 of 7 hunks fail. So I'm not sure what branch/version you are applying it to.
(In reply to Doug Moore from comment #17) Yes, I had to apply manually and test it. Attaching the vm_phys.c file, which is pathed with this change "https://github.com/freebsd/freebsd-src/commit/342056fa1c7a36d90feafd593fb980f98563f32c?diff=split"
Created attachment 246017 [details] vm_phys.c patched - working the vm_phys.c file from our main and patched withthe below changes https://github.com/freebsd/freebsd-src/commit/342056fa1c7a36d90feafd593fb980f98563f32c?diff=split
Created attachment 246038 [details] An intermediate vm_phys.c I've taken your version of vm_phys.c (that works), and updated it toward the current vm_phys.c in ways that I don't expect to introduce problems. I hope that you'll tell me if this version works, or doesn't, so that I can do a sort of binary search for the part of the change that introduces the problem.
(In reply to Doug Moore from comment #20) with these changes the issue is not seen and works fine.
Created attachment 246060 [details] Simplify vm_phys_find_freelist_contig This version of vm_phys.c modifies only one function. Please verify that this change is enough to address your problem. It is an un-optimization, so if this test passes, I'll still want to fix this function to preserve that optimization.
(In reply to Doug Moore from comment #22) This change also works fine.
Created attachment 246076 [details] vm_phys_find_freelist_contig, but without using adjacent smaller blocks Can you test this please?
(In reply to Doug Moore from comment #24) With this patch in 6 reboots, twice the the box crashed with below bt The crashed happened in bus_dmamem_alloc Fatal data abort: x0: 0x0000000000000000 x1: 0x0000000000008000 x2: 0x0000000000000000 x3: 0x00000000ffffffff x4: 0x0000000000000138 x5: 0xffffffffff000000 x6: 0x0000000000000068 x7: 0x0000000001000000 x8: 0xffffa000fcac4860 x9: 0xffff00000080ee80 (vm_phys_free_queues + 0x0) x10: 0x0000000000000005 x11: 0x000000000000000f x12: 0x0000000008000000 x13: 0x00000000001fffff x14: 0x0000000000000000 x15: 0x0000000000000038 x16: 0xffff0000005a4ba8 (vm_phys_segs + 0x0) x17: 0x0000000000000018 x18: 0xffff00007bfbae80 (__stop_set_sysinit_set + 0x79633a48) x19: 0x0000000000000000 x20: 0x0000000000000000 x21: 0xffff0000005a4c88 (vm_phys_segs + 0xe0) x22: 0xffff0000005a4c90 (vm_phys_segs + 0xe8) x23: 0x067fa000fca5c860 x24: 0x00000000f7000000 x25: 0xffffa000fca74ec8 x26: 0x00000000f63c1000 x27: 0x0000000000000001 x28: 0x00000000f3800000 x29: 0xffff00007bfbae80 (__stop_set_sysinit_set + 0x79633a48) sp: 0xffff00007bfbae80 lr: 0xffff000000380cdc (vm_reserv_alloc_contig + 0x3fc) elr: 0xffff00000037fa40 (vm_phys_alloc_contig + 0x23c) spsr: 0x0000000060000045 far: 0x067fa000fca5c8bc esr: 0x0000000096000004 panic: vm_fault failed: 0xffff00000037fa40 error 1 cpuid = 0 time = 1699036038 KDB: stack backtrace: db_trace_self() at db_trace_self db_trace_self_wrapper() at db_trace_self_wrapper+0x34 vpanic() at vpanic+0x280 panic() at panic+0x48 data_abort() at data_abort+0x320 handle_el1h_sync() at handle_el1h_sync+0x18 --- exception, esr 0x96000004 vm_phys_alloc_contig() at vm_phys_alloc_contig+0x23c vm_reserv_alloc_contig() at vm_reserv_alloc_contig+0x3f8 vm_page_alloc_contig_domain() at vm_page_alloc_contig_domain+0xb4 kmem_alloc_contig_pages() at kmem_alloc_contig_pages+0x80 kmem_alloc_contig_domainset() at kmem_alloc_contig_domainset+0x15c bounce_bus_dmamem_alloc() at bounce_bus_dmamem_alloc+0x17c bus_dma_tag_destroy() at 0xffff00007e16d634 bus_dma_tag_destroy() at 0xffff00007e16c894 device_attach() at device_attach+0x3fc pci_driver_added() at pci_driver_added+0x110 devclass_driver_added() at devclass_driver_added+0x48 devclass_add_driver() at devclass_add_driver+0x138 module_register_init() at module_register_init+0xb0 linker_load_module() at linker_load_module+0x9d8 kern_kldload() at kern_kldload+0x148 sys_kldload() at sys_kldload+0x68 do_el0_sync() at do_el0_sync+0x648 handle_el0_sync() at handle_el0_sync+0x3c --- exception, esr 0x56000000
Created attachment 246094 [details] add check for upper bound while scanning for aligned block I have guessed a cause for the previous failure and seek to address that cause.
(In reply to Doug Moore from comment #26) with this patch, the box fails when it encounters the first dma_tag_create function, reboots continously. iproc_sdhci0: <Broadcom iProc SDHCI controller> mem 0x66420000-0x664200ff irq 74 on simplebus0 iproc_sdhci0-slot0: Can't create DMA tag for SDMA x0: 0x0000000000000008 x1: 0xffff00000042a326 (console_pausestr + 0x2d8) x2: 0x0000000000000192 x3: 0xffff000000439c46 (console_pausestr + 0xfbf8) x4: 0xffff0000003e9c76 (cam_status_table + 0xb5ae) x5: 0xffff000002ba3f48 (__stop_set_sysinit_set + 0x21cb10) x6: 0xffff000002ba3280 (__stop_set_sysinit_set + 0x21be48) x7: 0x0000000000000000 x8: 0x0000000000000000 x9: 0x0000000000000000 x10: 0xffff00000055ba20 (lock_classes + 0x0) x11: 0x000000007ff6d83f x12: 0x000000007ff6d83f x13: 0x0000000000002af8 x14: 0x0000000000000000 x15: 0x0000000000002af8 x16: 0x0000000000000000 x17: 0x0000000000000000 x18: 0xffff0000403e2990 (__stop_set_sysinit_set + 0x3da5b558) x19: 0x0000000000000040 x20: 0xffff00006c5b1f54 (__stop_set_sysinit_set + 0x69c2ab1c) x21: 0xffff0000403e2a00 (__stop_set_sysinit_set + 0x3da5b5c8) x22: 0xffff00000019bde8 (ithread_loop + 0x0) x23: 0xffffa000004fa460 x24: 0xffff000041c2a5a0 (__stop_set_sysinit_set + 0x3f2a3168) x25: 0x0000000000000000 x26: 0x0000000000000000 x27: 0x0000000000000000 x28: 0x0000000000000000 x29: 0xffff0000403e2990 (__stop_set_sysinit_set + 0x3da5b558) sp: 0xffff0000403e2990 lr: 0xffff0000001982b0 (fork_exit + 0x98) elr: 0xffff0000003a1b34 (spinlock_exit + 0x44) spsr: 0x0000000060000045 far: 0x0000000000000000 esr: 0x00000000bf000002 panic: Unhandled System Error cpuid = 0 time = 1 KDB: stack backtrace: db_trace_self() at db_trace_self db_trace_self_wrapper() at db_trace_self_wrapper+0x34 vpanic() at vpanic+0x280 panic() at panic+0x48 do_serror() at do_serror+0x6c handle_el1h_serror() at handle_el1h_serror+0x14 (null)() at 0xc60
Created attachment 246126 [details] Back to the inefficient page search, but with minor changes intended to support the efficient (broken) page search. Thanks for your patience.
(In reply to Doug Moore from comment #28) With this patch also, same issue as mentioned in comment 27
Created attachment 246141 [details] just change arguments to two functions, nothing else
Created attachment 246148 [details] Don't pass oind; use vm_seg_addr_to_page Discard the last version.
(In reply to Doug Moore from comment #31) This patch works fine. no issues are seen with 12 reboots
Created attachment 246165 [details] Use max_size. Reorder fictitious fields. Trying another small step forward.
(In reply to Doug Moore from comment #33) This patch also works fine, Please let us know once you start the review process and commit.
Created attachment 246184 [details] final draft I'll post this version for review once it has passed your tests.
(In reply to Doug Moore from comment #35) The patch works fine, without any issues. I have tested with 10-12 reboots.
The final draft has been posted for review to: https://reviews.freebsd.org/D42509
Created attachment 246280 [details] reviewed version From the last test version, code review has led to some changes. Review is now complete, and you may want to test the reviewed version to make sure everything still works.
(In reply to Doug Moore from comment #38) This reviewed patch works fine, no issue was seen with 12 reboots.
A commit in branch main references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=2a4897bd4e1bd8430d955abd3cf6675956bb9d61 commit 2a4897bd4e1bd8430d955abd3cf6675956bb9d61 Author: Doug Moore <dougm@FreeBSD.org> AuthorDate: 2023-11-15 09:25:45 +0000 Commit: Doug Moore <dougm@FreeBSD.org> CommitDate: 2023-11-15 09:25:45 +0000 vm_phys: fix freelist_contig vm_phys_find_freelist_contig is called to search a list of max-sized free page blocks and find one that, when joined with adjacent blocks in memory, can satisfy a request for a memory allocation bigger than any single max-sized free page block. In commit fa8a6585c7522b7de6d29802967bd5eba2f2dcf1, I defined this function in order to offer two improvements: 1) reduce the worst-case search time, and 2) allow solutions that include less-than max-sized free page blocks at the front or back of the giant allocation. However, it turns out that this change introduced an error, reported in In Bug 274592. That error concerns failing to check segment boundaries. This change fixes an error in vm_phys_find_freelist_config that resolves that bug. It also abandons improvement 2), because the value of that improvement is small and because preserving it would require more testing than I am able to do. PR: 274592 Reported by: shafaisal.us@gmail.com Reviewed by: alc, markj Tested by: shafaisal.us@gmail.com Fixes: fa8a6585c752 vm_phys: avoid waste in multipage allocation MFC after: 10 days Differential Revision: https://reviews.freebsd.org/D42509 sys/vm/vm_phys.c | 146 +++++++++++++++++++++++-------------------------------- 1 file changed, 62 insertions(+), 84 deletions(-)
Thank you Doug and Sha Faisal for tracking down the problem. I'll work on getting an erratum prepared for 14.0 for this bug. I'm not sure that this bug is widely hit, but we don't have a good way to be sure.
A commit in branch stable/14 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=210fce73ae0e4106a3aeb1970ffbeb30d0baa6ba commit 210fce73ae0e4106a3aeb1970ffbeb30d0baa6ba Author: Doug Moore <dougm@FreeBSD.org> AuthorDate: 2023-11-15 09:25:45 +0000 Commit: Doug Moore <dougm@FreeBSD.org> CommitDate: 2023-11-25 01:19:05 +0000 vm_phys: fix freelist_contig vm_phys_find_freelist_contig is called to search a list of max-sized free page blocks and find one that, when joined with adjacent blocks in memory, can satisfy a request for a memory allocation bigger than any single max-sized free page block. In commit fa8a6585c7522b7de6d29802967bd5eba2f2dcf1, I defined this function in order to offer two improvements: 1) reduce the worst-case search time, and 2) allow solutions that include less-than max-sized free page blocks at the front or back of the giant allocation. However, it turns out that this change introduced an error, reported in In Bug 274592. That error concerns failing to check segment boundaries. This change fixes an error in vm_phys_find_freelist_config that resolves that bug. It also abandons improvement 2), because the value of that improvement is small and because preserving it would require more testing than I am able to do. PR: 274592 Reported by: shafaisal.us@gmail.com Reviewed by: alc, markj Tested by: shafaisal.us@gmail.com Fixes: fa8a6585c752 vm_phys: avoid waste in multipage allocation MFC after: 10 days Differential Revision: https://reviews.freebsd.org/D42509 (cherry picked from commit 2a4897bd4e1bd8430d955abd3cf6675956bb9d61) sys/vm/vm_phys.c | 146 +++++++++++++++++++++++-------------------------------- 1 file changed, 62 insertions(+), 84 deletions(-)
This does not apply to stable/13 I assume?
(In reply to Ed Maste from comment #43) commit 342056fa1c7a36d90feafd593fb980f98563f32c, which introduced the regression, has not been merged into stable/13.
A commit in branch releng/14.0 references this bug: URL: https://cgit.FreeBSD.org/src/commit/?id=4be96902ba82364810b86a6a0b3c58065e45e4cd commit 4be96902ba82364810b86a6a0b3c58065e45e4cd Author: Doug Moore <dougm@FreeBSD.org> AuthorDate: 2023-11-15 09:25:45 +0000 Commit: Mark Johnston <markj@FreeBSD.org> CommitDate: 2023-12-04 14:05:55 +0000 vm_phys: fix freelist_contig vm_phys_find_freelist_contig is called to search a list of max-sized free page blocks and find one that, when joined with adjacent blocks in memory, can satisfy a request for a memory allocation bigger than any single max-sized free page block. In commit fa8a6585c7522b7de6d29802967bd5eba2f2dcf1, I defined this function in order to offer two improvements: 1) reduce the worst-case search time, and 2) allow solutions that include less-than max-sized free page blocks at the front or back of the giant allocation. However, it turns out that this change introduced an error, reported in In Bug 274592. That error concerns failing to check segment boundaries. This change fixes an error in vm_phys_find_freelist_config that resolves that bug. It also abandons improvement 2), because the value of that improvement is small and because preserving it would require more testing than I am able to do. PR: 274592 Reported by: shafaisal.us@gmail.com Reviewed by: alc, markj Tested by: shafaisal.us@gmail.com Fixes: fa8a6585c752 vm_phys: avoid waste in multipage allocation MFC after: 10 days Differential Revision: https://reviews.freebsd.org/D42509 Approved by: so Security: FreeBSD-EN-23:20.vm (cherry picked from commit 2a4897bd4e1bd8430d955abd3cf6675956bb9d61) (cherry picked from commit 210fce73ae0e4106a3aeb1970ffbeb30d0baa6ba) sys/vm/vm_phys.c | 146 +++++++++++++++++++++++-------------------------------- 1 file changed, 62 insertions(+), 84 deletions(-)
Fixed in 14.0-RELEASE-p2.