Bug 250929

Summary: sysutils/lsof consumes all swap space
Product: Ports & Packages
Reporter: Sigi <freebsd-bt>
Component: Individual Port(s)
Assignee: Larry Rosenman <ler>
Status: New
Severity: Affects Only Me
CC: Waldemar.dick, attila.kover, cy, devin, freebsd-doc, theraven
Priority: ---
Flags: bugzilla: maintainer-feedback? (ler)
Version: Latest
Hardware: amd64
OS: Any

Description Sigi 2020-11-07 16:15:40 UTC
Hi,

Running lsof without parameters consumes swap until it is full.
System: 12.2-STABLE FreeBSD 12.2-STABLE r366130 GENERIC  amd64
lsof: rebuilt with portmaster.
RAM is 16 GB; I increased swap to 32 GB just for fun. lsof still gets other services killed when it runs during periodic/weekly.

Oct 31 05:01:59 myservername kernel: pid 5909 (mysqld), jid 0, uid 88, was killed: out of swap space
Oct 31 05:02:09 myservername kernel: pid 33760 (mongod), jid 0, uid 975, was killed: out of swap space
Oct 31 05:02:24 myservername kernel: pid 30034 (netdata), jid 0, uid 302, was killed: out of swap space
Oct 31 05:03:03 myservername kernel: pid 2873 (lsof), jid 0, uid 0, was killed: out of swap space
Oct 31 05:03:04 myservername kernel: pid 3338 (clamd), jid 0, uid 106, was killed: out of swap space
Oct 31 05:03:05 myservername kernel: pid 51377 (named), jid 0, uid 53, was killed: out of swap space
Nov  2 03:35:00 myservername kernel: pid 2427 (mysqld), jid 0, uid 88, was killed: out of swap space
Nov  2 03:35:19 myservername kernel: pid 53224 (lsof), jid 0, uid 0, was killed: out of swap space
Nov  7 04:56:36 myservername kernel: pid 2737 (netdata), jid 0, uid 302, was killed: out of swap space
Nov  7 04:57:25 myservername kernel: pid 3179 (mongod), jid 0, uid 975, was killed: out of swap space
Nov  7 04:57:40 myservername kernel: pid 63601 (mysqld), jid 0, uid 88, was killed: out of swap space
Nov  7 04:57:54 myservername kernel: pid 55095 (lsof), jid 0, uid 0, was killed: out of swap space
Nov  7 04:57:56 myservername kernel: pid 1212 (named), jid 0, uid 53, was killed: out of swap space
Nov  7 11:34:01 myservername kernel: pid 76164 (lsof), jid 0, uid 0, was killed: out of swap space
Nov  7 12:33:39 myservername kernel: pid 44701 (lsof), jid 0, uid 0, was killed: out of swap space

Occasionally there was some output:
lsof: WARNING: name cache hash chain too long

and many, many syslog entries like
kernel: swap_pager_getswapspace(18): failed
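
The kill messages above all follow one pattern, so a short script can tally which processes were killed most often. This is only a minimal sketch in Python, assuming the log lines have exactly the format quoted above; the name `tally_kills` is made up for illustration.

```python
import re
from collections import Counter

# Matches kernel kill lines in the format quoted above, e.g.
# "... kernel: pid 5909 (mysqld), jid 0, uid 88, was killed: out of swap space"
KILL_RE = re.compile(
    r"kernel: pid \d+ \(([^)]+)\), jid \d+, uid \d+, "
    r"was killed: out of swap space"
)

def tally_kills(lines):
    """Count out-of-swap kills per process name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing it the lines of the system log (often /var/log/messages on FreeBSD, though that path is an assumption about this setup) would give a per-process count of kills.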
Comment 1 Frank Leonhardt 2020-11-30 18:00:09 UTC
Doesn't it just! It ate through 32 GB of RAM and a 4 GB swap file for me - took a remote server out in the process.

The first time I ran it, nothing was displayed for several minutes, so I broke out of it and tried again with the option for "non-blocking" system calls. Some time later it was "Thank you and goodnight" from FreeBSD 12.2-RELEASE. Having driven 100 miles to find out what happened, I looked in the logs and it was clear the swap space had been eaten.

The only other thing running was nfsd, and I had cp in the background copying data to a small (400 GB) ZFS dataset - booted off UFS.
Comment 2 Cy Schubert freebsd_committer 2020-11-30 18:34:23 UTC
Can you provide the uname -a output, please?
Comment 3 David Chisnall freebsd_committer 2021-01-25 12:03:53 UTC
(In reply to Cy Schubert from comment #2)

```
$ uname -a
FreeBSD {hostname} 12.2-RELEASE-p1 FreeBSD 12.2-RELEASE-p1 GENERIC  amd64
```

This seems to be a fairly recent regression.  I saw it because I have rkhunter running its periodic security job and it dies (it also sometimes kills a bunch of other things because it runs the entire system out of memory).  Running lsof myself, I can see exactly the behaviour described here.  I have fdescfs mounted, but not procfs.

Looking at ktrace, it appears as if it is opening /dev/kmem and then reading the entire contents in a loop.  It is reading from fd 4 in a loop and that appears to have been opened very early on in the trace.
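
For anyone who wants to reproduce this without taking the whole machine down, one option is to run lsof under an address-space limit so it fails on its own instead of exhausting swap. The following is a rough sketch in Python, not a tested recipe for this bug: the helper name `run_capped` is made up, and it assumes the runaway allocations count against `RLIMIT_AS` on the affected system.

```python
import resource
import subprocess

def run_capped(cmd, max_bytes, timeout=60):
    """Run cmd with its address space capped at max_bytes.

    Returns the exit status, or None if the command timed out.
    Hypothetical helper for illustration only.
    """
    def limit():
        # Runs in the child just before exec: cap its address space.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    try:
        proc = subprocess.run(
            cmd,
            preexec_fn=limit,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return None
    return proc.returncode
```

A call such as `run_capped(["lsof"], 1024**3)` would then give lsof at most 1 GiB of address space and 60 seconds before it is stopped.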
Comment 4 attila.kover 2021-02-02 18:18:11 UTC
Same here. The problem persists regardless of how much swap there is, or whether there is any at all.

sysctl hw.realmem
hw.realmem: 8589934592

uname -a
FreeBSD {hostname} 12.2-RELEASE-p1 FreeBSD 12.2-RELEASE-p1 GENERIC  amd64

freebsd-version -kru
12.2-RELEASE-p1
12.2-RELEASE-p1
12.2-RELEASE-p2

Best
Attila
Comment 5 Tod McQuillin 2021-02-11 04:13:26 UTC
I believe this is a duplicate of bug 250916:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250916

That bug talks about lsof going CPU-bound and never exiting, but at the same time it is also allocating memory.

I had the same symptoms until I upgraded my ports tree past r554915.

Sigi, does this still happen with the latest lsof build from ports?