Bug 257318

Summary: uma consumes memory when reattaching device
Product: Base System Reporter: jcaplan
Component: kern Assignee: freebsd-bugs (Nobody) <bugs>
Status: New ---    
Severity: Affects Some People    
Priority: ---    
Version: 13.0-STABLE   
Hardware: Any   
OS: Any   

Description jcaplan 2021-07-21 19:25:30 UTC
Overview
--------

After patching in fixes for all identified leaks, detaching and re-attaching the same device still consistently results in additional kmem allocation by UMA.

Currently open leaks:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256714
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257218

Steps to Reproduce:
-------------------

#!/bin/bash

for i in {1..1000}; do
        echo $i
        devctl detach pci0:3:0:0
        devctl attach pci0:3:0:0
        sysctl vm.uma_kmem_total
done
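
To make the growth easier to spot than in raw totals, the samples from the loop above can be turned into per-iteration differences. This is only a sketch: print_deltas is a hypothetical helper, not part of the original script, and the commented usage assumes sysctl's -n flag (print the value only) and the same device selector as above.

```shell
# Read one vm.uma_kmem_total sample per line on stdin and print the
# change between each pair of consecutive samples.
# print_deltas is a hypothetical helper name used for illustration.
print_deltas() {
    prev=""
    while read -r cur; do
        if [ -n "$prev" ]; then
            echo $((cur - prev))
        fi
        prev=$cur
    done
}

# Example use on the affected system (as root):
#   for i in $(seq 1 1000); do
#       devctl detach pci0:3:0:0
#       devctl attach pci0:3:0:0
#       sysctl -n vm.uma_kmem_total
#   done | print_deltas
```

A run that has reached a steady state would print only zeros; any recurring non-zero value indicates continued growth.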

Expected Result
---------------

Perhaps after some initial warming stage, detaching and re-attaching the same device should not increase the amount of memory consumed by uma.

Actual Result
-------------

Even after many thousands of iterations, vm.uma_kmem_total continues to increase.

Build Date & Hardware
---------------------
FreeBSD freebsd 13.0-RELEASE FreeBSD 13.0-RELEASE #7 releng/13.0-n244733-ea31abc261f-dirty: Wed Jul 21 14:53:10 UTC 2021

Additional Information
----------------------

We seem to keep hitting keg_alloc_slab from cache_alloc_retry.
Comment 1 jcaplan 2021-08-20 18:24:14 UTC
Just to add a bit more detail.

A single iteration might look like this:

# sysctl vm.uma_kmem_total
vm.uma_kmem_total: 26247168
# devctl attach pci0:3:0:0
# devctl detach pci0:3:0:0
# sysctl vm.uma_kmem_total
vm.uma_kmem_total: 26251264
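
For reference, the difference between the two readings above works out as follows. The 4096-byte page size is an assumption (it can be checked with sysctl -n hw.pagesize); the two totals are taken directly from the transcript.

```shell
before=26247168   # vm.uma_kmem_total before the attach/detach cycle
after=26251264    # vm.uma_kmem_total after the cycle
page_size=4096    # assumption: 4 KiB pages; verify with `sysctl -n hw.pagesize`

delta=$((after - before))
pages=$((delta / page_size))
echo "grew by $delta bytes ($pages page(s))"
```

Under that assumption, this single iteration grew the total by exactly one page.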


Comparing the output of vmstat -m before and after shows no change in dynamic memory allocated from malloc().

However, vm.uma_kmem_total shows that UMA is gradually consuming more memory, and a diff of vmstat -z shows several changes in zone allocations at the UMA layer.
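
A rough way to narrow down which zones changed is to compare two saved vmstat -z snapshots field by field rather than with a plain diff. The sketch below is not the method used for this report: zone_changes is a hypothetical helper, and it assumes each zone line has the form "NAME: SIZE, LIMIT, USED, FREE, ...", which should be checked against the actual output on the system under test.

```shell
# Print zones whose USED or FREE count differs between two saved
# `vmstat -z` snapshots ($1 = before, $2 = after).
# Assumed line format: NAME: SIZE, LIMIT, USED, FREE, ...
# zone_changes is a hypothetical helper name used for illustration.
zone_changes() {
    awk -F'[:,]' '
        NF < 5 { next }                  # skip header and blank lines
        {
            name = $1
            used = $4 + 0
            free = $5 + 0
        }
        NR == FNR {                      # first file: remember the counts
            u[name] = used
            f[name] = free
            next
        }
        (name in u) && (u[name] != used || f[name] != free) {
            printf "%s: used %+d, free %+d\n", name, used - u[name], free - f[name]
        }
    ' "$1" "$2"
}

# Example use on the affected system (as root):
#   vmstat -z > before.txt
#   devctl detach pci0:3:0:0
#   devctl attach pci0:3:0:0
#   vmstat -z > after.txt
#   zone_changes before.txt after.txt
```

Zones that show a net change after every cycle are the most likely candidates for the growth seen in vm.uma_kmem_total, though item counts alone do not show how many slab pages each zone holds.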

It appears that all allocations are accounted for at the malloc level, yet there is a continuous slow leak of additional pages allocated at the UMA level.

I'm not sure there's anything particularly special about attaching/detaching; it's just an obviously reversible operation that allocates and then frees a lot of memory. So it's possible this is a specific instance of a more general problem.