1. debugnet_mbuf_reinit() is racy.

With netdump we only populated the mbuf cache when a device was *configured*. Now we populate the cache when a device comes up, provided it *supports* debugnet. So if a driver has multiple devices, each device coming up calls debugnet_mbuf_reinit(), and multiple threads race while touching the mbufqs. This is easily fixed but leaves more issues. Populating the cache at driver link up does make sense, because we may not configure the device until after panic, in ddb with .netdump.

2. dn_buf_import() may overflow an mbuf from the queue with trash_init() on <without INVARIANTS>.

If one device has jumbo frames (MTU 9000) and the other a normal MTU of 1500, the hwm/dn_clsize can become MJUM9BYTES (9216).

[This next part may only be a problem for something like mlx4, which has some cached mbufs of its own. This can be seen in mlx4_en_alloc_buf(), where it appears to always keep 1 extra mbuf around for each ring. It appears it may use that mbuf at panic time if mlx4_en_alloc_mbuf() fails. The issue I ran into downstream was a very different allocation scenario, but the FreeBSD version appears to have a similar issue.]

If the device used at dump time has an MTU of 1500, it is possible for the device to return a smaller mbuf to the dn_clustq than expected for that zone (vs the high water mark of 9216). When that mbuf is later removed in dn_buf_import(), it has trash_init(9216) run over it rather than the expected MCLBYTES size.
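To make the size mismatch in item 2 concrete, here is a minimal userland sketch. It is not debugnet code: every sim_* name is hypothetical, the fill byte is arbitrary, and each simulated cluster records its own real size, which the real dn_clustq entries may have no equivalent of. It only shows that a 2048-byte buffer must not be trashed at the 9216-byte high water mark, and one possible shape of a guard. The real fix may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIM_MCLBYTES	2048	/* standard cluster size */
#define SIM_MJUM9BYTES	9216	/* 9k jumbo cluster size */

/* Hypothetical stand-in for a cluster that remembers its real size. */
struct sim_clust {
	size_t		 sc_size;	/* bytes actually backing sc_data */
	unsigned char	*sc_data;
};

static struct sim_clust *
sim_clust_alloc(size_t size)
{
	struct sim_clust *c;

	c = malloc(sizeof(*c));
	if (c == NULL)
		return (NULL);
	c->sc_data = malloc(size);
	if (c->sc_data == NULL) {
		free(c);
		return (NULL);
	}
	c->sc_size = size;
	return (c);
}

static void
sim_clust_free(struct sim_clust *c)
{
	if (c == NULL)
		return;
	free(c->sc_data);
	free(c);
}

/*
 * Simulated import step: trash the buffer only if its real size matches
 * the size the zone expects. Returns 0 on success, -1 on a mismatch
 * (where an unguarded fill of zone_size bytes would overrun the buffer).
 */
static int
sim_import(struct sim_clust *c, size_t zone_size)
{
	assert(c != NULL);
	if (c->sc_size != zone_size)
		return (-1);
	memset(c->sc_data, 0xde, zone_size);
	return (0);
}
```

With a zone size of SIM_MJUM9BYTES, sim_import() rejects a SIM_MCLBYTES buffer instead of writing 9216 bytes into 2048 bytes of storage, and accepts a buffer that really is SIM_MJUM9BYTES long.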
^Triage: Unsure of the specific relevance, but see also src 5a7de2b42caf via bug 258923, given the mlx mention here and the mlx, panic, debugnet_mbuf_reinit() and debugnet activation mentions there.
See also base 5a7de2b42caf via bug 258923. Apologies.
(In reply to Kubilay Kocak from comment #2) None of this is a recent regression. These are design flaws.