Created attachment 215037 [details] bounce_bus_dmamap_load_buffer fix. When the arm64 version of bounce_bus_dmamap_load_buffer() exits with error EFBIG (due to too many segments), it leaves the dma map in a state where the next call fails with EFBIG too even though the next request has a single segment. This happens when a network driver calls bus_dmamap_load_mbuf_sg() with a complicated mbuf chain. Typical drivers respond to the EFBIG error by defragging the mbuf and trying the bus_dmamap_load_mbuf_sg() again. In my case, the second call would also fail in which case the driver gives up and drops the packet. It looks like this was fixed in the arm and mips versions of busdma_machdep.c at some point. In those architectures, the routine calls bus_dmamap_unload() to clean up the state if it fails with EFBIG. The same change was made to bounce_bus_dmamap_load_phys() too. I've attached a possible fix based on those. It seems to work for me. It looks like riscv also has the same issue.
It looks like x86 has the same issue as well? I guess the problem is only noticeable if the request required the allocation of bounce pages.
For MIPS it least it looks like the unload was added back in r246713 and wasn't present in the old code prior to the rework. RISC-V undoubtedly copied the bug from arm64 (just as it copied all of the other arm64 code). arm64 was copied from x86 in r282655 and x86 didn't have the unload calls added in r246713. I guess it is less work to do the unloads in the MD backends vs doing them in subr_busdma.c itself though conceptually it would be a bit cleaner to do the unload in the MI code I think. However, I think you'd ultimately have to patch more places in the MI code?
I don't think the x86 code has a problem. It looks like the arm64 code keeps a list of virtual addresses for cache operations and the list doesn't get reset when bounce_bus_dmamap_load_buffer() fails with EFBIG. I think it's because map->sync_count isn't reset to zero that the next call fails. That was introduced after the arm64 code was cloned from x86.