Bug 253281 - enabling ktls leads to severe kernel memory leak
Status: New
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 13.0-STABLE
Hardware: amd64 Any
Importance: --- Affects Some People
Assignee: freebsd-bugs (Nobody)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-05 21:46 UTC by weiss
Modified: 2021-05-30 18:59 UTC

See Also:


Attachments
dmesg.boot (13.64 KB, text/plain)
2021-02-05 21:46 UTC, weiss

Description weiss 2021-02-05 21:46:36 UTC
Created attachment 222192
dmesg.boot

System:
13.0-ALPHA3 on amd64
lagg with two ix 10 gigabit Ethernet adapters
ktls_intel-isa-l software crypto
nginx-lite as a caching reverse proxy between client and web server on
different machines; both the client-to-proxy and proxy-to-web-server
connections use TLS

With ktls enabled (kern.ipc.tls.enable=1), kernel memory disappears within 1 hour.
Processes get killed by the OOM condition, and the system must be rebooted to get back to normal; terminating nginx is not enough.

top:
Fri Feb  5 00:07:47 CET 2021
last pid:  6283;  load averages:  6.01,  4.65,  3.73  up 0+00:17:18    00:07:47
52 processes:  2 running, 50 sleeping
CPU:  4.6% user,  0.0% nice, 15.1% system,  0.1% interrupt, 80.2% idle
Mem: 1850M Active, 16G Inact, 2300K Laundry, 30G Wired, 10G Free
ARC: 27G Total, 5669M MFU, 20G MRU, 682M Anon, 118M Header, 612M Other
     24G Compressed, 28G Uncompressed, 1.16:1 Ratio
Swap: 16G Total, 13M Used, 16G Free

top:
Fri Feb  5 01:03:11 CET 2021
last pid:  6460;  load averages:  4.51,  5.62,  5.83  up 0+01:12:42    01:03:11
54 processes:  4 running, 50 sleeping
CPU:  2.9% user,  0.0% nice, 15.1% system,  0.1% interrupt, 81.9% idle
Mem: 7560K Active, 5204K Inact, 4314M Wired, 248M Free
ARC: 2060M Total, 156M MFU, 891M MRU, 468K Anon, 19M Header, 993M Other
     166M Compressed, 936M Uncompressed, 5.62:1 Ratio
Swap: 16G Total, 134M Used, 16G Free
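
To make the size of the loss explicit, the "Mem:" lines of the two top snapshots above can be summed and compared. This is only a rough sketch: the helper name parse_mem_line is made up for illustration, it assumes top prints K/M/G as powers of 1024, and top rounds its figures, so the result is approximate.

```python
import re

# Assumption: top(1) size suffixes are binary multiples (K = 1024 bytes, etc.).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_mem_line(line):
    """Return {category: bytes} for one top(1) 'Mem:' line (hypothetical helper)."""
    fields = {}
    for size, unit, name in re.findall(r"(\d+)([KMG])\s+(\w+)", line):
        fields[name] = int(size) * UNITS[unit]
    return fields


# The two "Mem:" lines copied verbatim from the snapshots in this report.
early = parse_mem_line(
    "Mem: 1850M Active, 16G Inact, 2300K Laundry, 30G Wired, 10G Free"
)
late = parse_mem_line("Mem: 7560K Active, 5204K Inact, 4314M Wired, 248M Free")

# Memory that top attributes to any category at each point in time.
early_total = sum(early.values())
late_total = sum(late.values())
missing_gib = (early_total - late_total) / 1024**3

print(f"accounted for at 00:07: {early_total / 1024**3:.1f} GiB")
print(f"accounted for at 01:03: {late_total / 1024**3:.1f} GiB")
print(f"no longer accounted for: {missing_gib:.1f} GiB")
```

By this count, about 53 GiB that top attributed to some category 17 minutes after boot is no longer listed under any category an hour later.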

netstat -m
25011/159/25170 mbufs in use (current/cache/total)
24698/2/24700/4074507 mbuf clusters in use (current/cache/total/max)
0/0 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/2037253 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/603630 9k jumbo clusters in use (current/cache/total/max)
0/0/0/339542 16k jumbo clusters in use (current/cache/total/max)
55648K/43K/55692K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
100940570 sendfile syscalls
28612821 sendfile syscalls completed without I/O request
72578063 requests for I/O initiated by sendfile
305506031 pages read by sendfile as part of a request
288967671 pages were valid at time of a sendfile request
197618 pages were valid and substituted to bogus page
0 pages were requested for read ahead by applications
377190 pages were read ahead by sendfile
4494 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed


Same setup with ktls not enabled: nothing bad happens even after half a day.
Comment 1 John Baldwin freebsd_committer freebsd_triage 2021-05-27 19:56:53 UTC
Can you confirm via the kern.ipc.tls.stats sysctls that KTLS is being used?  Are you able to reproduce the same leak if you use ktls_ocf with either aesni or security/isal-kmod from ports?
Comment 2 weiss 2021-05-30 18:59:32 UTC
Yes, I verified with kern.ipc.tls.stats that KTLS was being used. I tested both ktls_ocf and ktls_intel-isa-l. The problem occurred in both cases.

I just upgraded to 13.0-RELEASE-p1 with freebsd-update and updated nginx from ports, and the problem went away. Tested with both ktls_ocf and ktls_intel-isa-l. I wonder why this happened. The new nginx in userland should not make a difference, and at first sight I did not find a change in -p1 related to ktls.