Bug 32659

Summary: [vm] [patch] vm and vnode leak with vm.swap_idle_enabled=1
Product: Base System
Reporter: gemini
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED
Severity: Affects Only Me
CC: gemini
Priority: Normal
Version: 4.4-STABLE
Hardware: Any
OS: Any
Attachments:
Description Flags
file.diff none

Description gemini 2001-12-10 00:20:00 UTC
In /usr/src/sys/vm/vm_glue.c (around line 482) there are two
if() clauses that deal with VM_SWAP_IDLE, the flag triggered by
"sysctl vm.swap_idle_enabled=1".

For this code to actually work, the two if() clauses need to be logical
mirrors. However, they aren't. The case
"p->p_slptime == swap_idle_threshold2" isn't covered at all. Since the
sleep time has a granularity of one second, it won't take a busy system
long to reach this code with both values equal.
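
The gap can be illustrated with a small user-space model of the two
tests. This is only a sketch, not the kernel source: the function names
are hypothetical, the flag and threshold values are placeholders, and
the comparison operators are assumed from the description above (the
first clause skips processes below the threshold, the second acts only
on processes strictly above it).

```c
#include <assert.h>
#include <stdbool.h>

#define VM_SWAP_IDLE 2                 /* placeholder value for this sketch */

static int swap_idle_threshold2 = 10;  /* seconds; illustrative value only */

/* First clause (assumed): a process is skipped only when it has slept
 * for less than the threshold, so the "equal" case is let through. */
static bool passes_first_check(int action, int p_slptime)
{
    return (action & VM_SWAP_IDLE) != 0 &&
           p_slptime >= swap_idle_threshold2;
}

/* Second clause (assumed): a swap-out happens only when the process has
 * slept strictly longer than the threshold, so "equal" is rejected. */
static bool passes_second_check(int action, int p_slptime)
{
    return (action & VM_SWAP_IDLE) != 0 &&
           p_slptime > swap_idle_threshold2;
}

/* True when a process gets past the first clause but is not swapped out
 * by the second one. */
static bool falls_in_gap(int action, int p_slptime)
{
    assert(p_slptime >= 0);
    return passes_first_check(action, p_slptime) &&
           !passes_second_check(action, p_slptime);
}
```

In this model only a sleep time exactly equal to the threshold falls
into the gap; one second less or one second more is handled
consistently by both clauses.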

What happens then is that no swap-out takes place and the for() loop
just iterates to the next process. However, since we have already
incremented "vm->vm_refcnt" by one at this point and don't drop it by
calling vmspace_free(), we effectively have a stuck VM object now.
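
A simplified user-space model of the loop shows how the reference count
drifts. Everything here is an assumption made for illustration: the
structs are cut down to the two fields mentioned above, scan_once() and
vmspace_release_model() are hypothetical stand-ins, and the real loop
in vm_glue.c does considerably more work than this.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (model only). */
struct vmspace { int vm_refcnt; };
struct proc    { int p_slptime; struct vmspace *p_vmspace; };

static int swap_idle_threshold2 = 10;  /* seconds; illustrative value only */

/* Stand-in for vmspace_free(): this model only drops the reference. */
static void vmspace_release_model(struct vmspace *vm)
{
    assert(vm->vm_refcnt > 0);
    --vm->vm_refcnt;
}

/* One pass over the process table. Returns the number of processes
 * that would have been swapped out. */
static int scan_once(struct proc *procs, size_t n)
{
    int swapped = 0;

    for (size_t i = 0; i < n; i++) {
        struct proc *p = &procs[i];
        struct vmspace *vm = p->p_vmspace;

        if (p->p_slptime < swap_idle_threshold2)
            continue;                    /* not idle long enough */

        ++vm->vm_refcnt;                 /* hold the vmspace */

        if (p->p_slptime > swap_idle_threshold2) {
            /* the swap-out itself would happen here */
            vmspace_release_model(vm);
            swapped++;
        }
        /* p_slptime == threshold: nothing is swapped out and the
         * reference taken above is never dropped */
    }
    return swapped;
}
```

Each pass that finds a process sleeping for exactly the threshold adds
one reference that is never returned, which would match a usage count
that keeps growing over time.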

When the process later exits this also prevents the VNODE of
the backing object from being released, so we have a stuck VNODE
as well. In my case it was "/bin/sh".

Originally, I started my investigation because I couldn't "umount"
the respective filesystem without "-f", for no apparent reason.
I found out that the VNODE of "/bin/sh" had a usage count of 12,
without any matching processes running. After some days of searching
I finally traced it back to the code location given above.

Fix: Once you know where to look, the reason for the problem and its fix
are obvious. I suggest fixing it along the lines of the patch
below. With that change the problem went away, and I didn't notice
any side effects.
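
One possible shape for such a fix, again as a user-space sketch rather
than the attached patch: both clauses use the same predicate, so they
mirror each other by construction, and the reference is dropped on
every path. The names idle_long_enough(), scan_once_fixed() and
vmspace_release_model() are hypothetical, and treating the "equal" case
as eligible for swap-out is a choice made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (model only). */
struct vmspace { int vm_refcnt; };
struct proc    { int p_slptime; struct vmspace *p_vmspace; };

static int swap_idle_threshold2 = 10;  /* seconds; illustrative value only */

/* Stand-in for vmspace_free(): this model only drops the reference. */
static void vmspace_release_model(struct vmspace *vm)
{
    assert(vm->vm_refcnt > 0);
    --vm->vm_refcnt;
}

/* Single predicate shared by both clauses, so they cannot disagree. */
static bool idle_long_enough(const struct proc *p)
{
    return p->p_slptime >= swap_idle_threshold2;
}

/* One pass over the process table. Returns the number of processes
 * that would have been swapped out. */
static int scan_once_fixed(struct proc *procs, size_t n)
{
    int swapped = 0;

    for (size_t i = 0; i < n; i++) {
        struct proc *p = &procs[i];
        struct vmspace *vm = p->p_vmspace;

        if (!idle_long_enough(p))
            continue;                    /* first clause */

        ++vm->vm_refcnt;                 /* hold the vmspace */

        if (idle_long_enough(p))         /* second clause, same test */
            swapped++;                   /* swap-out would happen here */

        vmspace_release_model(vm);       /* dropped on every path */
    }
    return swapped;
}
```

With the reference released unconditionally, the count returns to its
starting value after each pass regardless of how the second test turns
out.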

How-To-Repeat: Set "vm.swap_idle_enabled=1" and run
"make -j4 -DMAKE_KERBEROS4 -DMAKE_KERBEROS5 buildworld" in
an endless loop for some hours. Then try to "umount" the filesystem.

Normally this is a problem since I think you can't "umount" the root
filesystem where "/bin/sh" resides on an active system. So instead
you may want to try it via the VN driver. I used one of these file
based filesystems containing a system set up for jail() and actually
ran the "buildworld" loop inside a jail. Then it shouldn't be a
problem to "umount" the filesystem and see how it balks.

Comment 1 Sheldon Hearn freebsd_committer freebsd_triage 2001-12-10 12:17:32 UTC
Responsible Changed
From-To: freebsd-bugs->jhb

John, you were the last person to fiddle with the swap_idle stuff
(in rev 1.114 of vm_glue.c).  Could you take a look?

Comment 2 Matt Dillon freebsd_committer freebsd_triage 2002-02-26 07:56:58 UTC
Responsible Changed
From-To: jhb->dillon

Mike Silbersack and I are working in this part of the system and we 
believe we may have reproduced the bug.  We will test a simplified 
version of the patch and hopefully commit the fix soon to -current 
and -stable.

Comment 3 Giorgos Keramidas freebsd_committer freebsd_triage 2003-02-23 02:16:14 UTC
Responsible Changed
From-To: dillon->freebsd-bugs

Back to the free pool.

Comment 4 K. Macy freebsd_committer freebsd_triage 2007-11-18 08:23:02 UTC
State Changed
From-To: open->feedback


Is this still an issue on RELENG_6 or RELENG_7?

Comment 5 Alan Cox freebsd_committer freebsd_triage 2007-11-23 18:57:48 UTC
State Changed
From-To: feedback->closed

The vmspace reference leak was eliminated in revision 1.94.2.3 
of vm_glue.c.