Bug 32672

Summary: Invalid FFS node allocation algorithm on systems with a lot of memory and lots of small files accessed
Product: Base System
Reporter: vova <vova>
Component: kern
Assignee: Matt Dillon <dillon>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.4-RELEASE   
Hardware: Any   
OS: Any   

Description vova 2001-12-10 14:10:00 UTC
On a system with a lot of memory that performs many operations on small files
('make release' in my case), the kernel can reach the maximum number of
M_FFSNODE (inode) objects and then deadlock in
ufs/ffs/ffs_vfsops.c:ffs_vget():

==============================================================================
        /*
         * Lock out the creation of new entries in the FFS hash table in
         * case getnewvnode() or MALLOC() blocks, otherwise a duplicate 
         * may occur!
         */
        if (ffs_inode_hash_lock) {
                while (ffs_inode_hash_lock) {
                        ffs_inode_hash_lock = -1;
                        tsleep(&ffs_inode_hash_lock, PVM, "ffsvgt", 0);
                }
                goto restart;
        }
        ffs_inode_hash_lock = 1;

        /*
         * If this MALLOC() is performed after the getnewvnode()
         * it might block, leaving a vnode with a NULL v_data to be
         * found by ffs_sync() if a sync happens to fire right then,
         * which will cause a panic because ffs_sync() blindly
         * dereferences vp->v_data (as well it should).
         */
        MALLOC(ip, struct inode *, sizeof(struct inode),
            ump->um_malloctype, M_WAITOK);
=========================================================================


One process sleeps on "FFS Node" (in the MALLOC in the code above) because
the maximum number of M_FFSNODE objects has been reached (the limit is
0x6400000 on my system). In my case it was the 'cvs checkout' run by the
make release scripts.

All other processes that try to access the disk then block on "ffsvgt",
because ffs_inode_hash_lock is held by cvs.

Some comments:

1st: I think the order of the lock and the MALLOC in ffs_vget() needs to be
changed to avoid the deadlock (do the MALLOC first, then take
ffs_inode_hash_lock).

2nd: Something needs to happen when the limit on allocated ffsnode objects is
reached (it defaults to vm_kmem_size/2), such as freeing some cached objects.
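
The reordering proposed in the first comment can be modelled in userspace. This is only a sketch, not the kernel change: struct inode_model, hash_tbl, hash_lock, hash_lookup() and vget_model() are hypothetical names, a pthread mutex stands in for ffs_inode_hash_lock, and malloc() stands in for MALLOC(). The point it illustrates is that the allocation that may block happens before the lock is taken, and the table is re-checked under the lock so that no duplicate entry is created.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NBUCKETS 64

/* Hypothetical stand-in for struct inode; only what the sketch needs. */
struct inode_model {
	unsigned long ino;
	struct inode_model *next;	/* hash chain link */
};

static struct inode_model *hash_tbl[NBUCKETS];
/* Stand-in for ffs_inode_hash_lock. */
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold hash_lock. */
static struct inode_model *
hash_lookup(unsigned long ino)
{
	struct inode_model *ip;

	for (ip = hash_tbl[ino % NBUCKETS]; ip != NULL; ip = ip->next)
		if (ip->ino == ino)
			return (ip);
	return (NULL);
}

struct inode_model *
vget_model(unsigned long ino)
{
	struct inode_model *ip, *found;

	/* Allocate first: if this blocks, no lock is held meanwhile. */
	ip = malloc(sizeof(*ip));
	if (ip == NULL)
		return (NULL);
	ip->ino = ino;

	pthread_mutex_lock(&hash_lock);
	/* Someone may have inserted the same inode while we allocated. */
	found = hash_lookup(ino);
	if (found != NULL) {
		pthread_mutex_unlock(&hash_lock);
		free(ip);		/* discard our unused copy */
		return (found);
	}
	ip->next = hash_tbl[ino % NBUCKETS];
	hash_tbl[ino % NBUCKETS] = ip;
	assert(hash_lookup(ino) == ip);
	pthread_mutex_unlock(&hash_lock);
	return (ip);
}
```

The cost of this ordering is an occasional wasted allocation when two callers race for the same inode. The sketch does not address the second comment: an exhausted allocator still stalls the caller, but it no longer stalls everyone else waiting for the lock.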

Fix: 

See above
How-To-Repeat: 
Get a system with 2 GB of RAM and run 'make release' (with ports and docs).
Comment 1 Sheldon Hearn freebsd_committer freebsd_triage 2001-12-30 12:24:19 UTC
Responsible Changed
From-To: freebsd-bugs->dillon

FFS and lots of memory -- looks like Matt's field. :-)
Comment 2 Matt Dillon freebsd_committer freebsd_triage 2001-12-30 19:35:10 UTC
State Changed
From-To: open->closed

This is believed to be fixed in -stable (and thus for the upcoming 4.5 
release).  The problem was that the vnode/inode reclamation system depends 
on the VM system running out of memory and having to free vnodes/inodes up. 
Machines with large amounts of RAM, however, will often exhaust the malloc 
bucket for vnodes or inodes before they run out of memory. 

Our solution is to enforce the kern.maxvnodes limit by proactively reclaiming 
vnodes/inodes when the limit is reached, even if there is still lots of free 
memory.
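
The idea of reclaiming at a fixed limit instead of waiting for memory pressure can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the code committed to -stable: struct vnode_model, struct vcache, vcache_get() and vcache_release() are hypothetical names, and the maxvnodes field merely plays the role of kern.maxvnodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vnode/inode pair. */
struct vnode_model {
	int ino;
	struct vnode_model *next;	/* idle-list link */
};

struct vcache {
	int maxvnodes;			/* plays the role of kern.maxvnodes */
	int numvnodes;			/* nodes allocated so far */
	struct vnode_model *free_head;	/* idle nodes, oldest first */
	struct vnode_model *free_tail;
};

/* Mark a node idle; it becomes a candidate for reclamation. */
void
vcache_release(struct vcache *vc, struct vnode_model *vp)
{
	vp->next = NULL;
	if (vc->free_tail != NULL)
		vc->free_tail->next = vp;
	else
		vc->free_head = vp;
	vc->free_tail = vp;
}

struct vnode_model *
vcache_get(struct vcache *vc, int ino)
{
	struct vnode_model *vp;

	if (vc->numvnodes >= vc->maxvnodes && vc->free_head != NULL) {
		/* At the limit: recycle the oldest idle node. */
		vp = vc->free_head;
		vc->free_head = vp->next;
		if (vc->free_head == NULL)
			vc->free_tail = NULL;
	} else {
		/*
		 * Below the limit, or nothing idle to recycle.  In this
		 * sketch the limit is soft: we still allocate.
		 */
		vp = malloc(sizeof(*vp));
		if (vp == NULL)
			return (NULL);
		vc->numvnodes++;
	}
	vp->ino = ino;
	vp->next = NULL;
	return (vp);
}
```

With maxvnodes set to 2, a third vcache_get() after one vcache_release() reuses the released node rather than growing the pool, regardless of how much memory is still free.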