Bug 117954 - [ufs] dirhash on very large directories blocks the machine for tens of seconds
Status: Open
Alias: None
Product: Base System
Classification: Unclassified
Component: kern
Version: 6.2-RELEASE
Hardware: Any Any
Importance: Normal Affects Only Me
Assignee: freebsd-bugs mailing list
Depends on:
Reported: 2007-11-10 08:10 UTC by Martin Birgmeier
Modified: 2019-06-13 01:52 UTC
Description Martin Birgmeier 2007-11-10 08:10:00 UTC
	I am mirroring the KDE Subversion repository via rsync. KDE is currently at rev. 734839, meaning that two subdirectories (revs and revprops) hold 734840 files each. For this to work at all, I have enabled dirhash and set the hashing area to 32 MB via vfs.ufs.dirhash_maxmem=33554432 in sysctl.conf.

	The problem is that whenever the hash has to be built (i.e., when these directories are accessed after not having been in the kernel for some time), the dirhash algorithm reads them in completely. Doing this consumes a lot of processor time (my xload jumps to 8+ all at once) and, as far as I can make out in such a situation, all (or at least most) of the available disk bandwidth.
	On my machine the behavior is so bad that the X Window System freezes completely (including the cursor) for about a minute. (In fact it is more like 2 x 30 seconds, evidently once for each of the two directories involved.) The xload spike becomes visible only after the freeze. Also, as I am using pppoa (basically ADSL over USB), the buffers allotted to it are exhausted, as shown by log messages on the console. To me this looks as if even interrupts are no longer being serviced.


	I assume that the fix involves modifying the dirhash algorithm so that it obeys standard process scheduling behavior, especially with regard to relinquishing the CPU according to the process's scheduling parameters.
	This probably means that the syscall in question can no longer be implemented as a single atomic operation (which it currently seems to be).
	Since I am no expert in this area, please take those ideas with a grain of salt!
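
	The idea above can be illustrated with a minimal user-space sketch. It is not the kernel's dirhash code: the function names, the table layout, the FNV-1a stand-in hash and the YIELD_INTERVAL value are all hypothetical, and sched_yield() merely stands in for whatever mechanism the kernel would use to give up the CPU. The sketch also ignores any locks that would be held across the yield.

```c
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of entries hashed between voluntary yields. */
#define YIELD_INTERVAL 4096

/* FNV-1a, used here only as a stand-in hash function. */
static uint32_t
name_hash(const char *name)
{
	uint32_t h = 2166136261u;

	while (*name != '\0') {
		h ^= (unsigned char)*name++;
		h *= 16777619u;
	}
	return (h);
}

/*
 * Insert every name into an open-addressed table of nslots slots
 * (each slot holds an entry index, or -1 when empty), yielding the
 * CPU after every YIELD_INTERVAL entries.  Returns the number of
 * yields performed, or -1 if the table is too small.
 */
static long
build_hash_with_yield(const char *const *names, size_t n,
    int32_t *slots, size_t nslots)
{
	long yields = 0;
	size_t i, s;

	if (nslots <= n)
		return (-1);
	for (s = 0; s < nslots; s++)
		slots[s] = -1;
	for (i = 0; i < n; i++) {
		s = name_hash(names[i]) % nslots;
		while (slots[s] != -1)		/* linear probing */
			s = (s + 1) % nslots;
		slots[s] = (int32_t)i;
		if ((i + 1) % YIELD_INTERVAL == 0) {
			sched_yield();		/* let other runnable threads in */
			yields++;
		}
	}
	return (yields);
}
```

	A caller would pass the array of entry names and a slot array somewhat larger than the number of entries. The point is only that the long loop is broken into bounded chunks, so other runnable threads get a chance to run between them.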

Please note that the e-mail address given above is not valid, as I am paranoid about spam. Simply reply by adding to the PR; I will monitor it regularly.
How-To-Repeat: 	Enter a directory with > 250 k entries after it has not been accessed for a long time.
Comment 1 Robert Watson freebsd_committer 2008-03-07 18:29:08 UTC
Responsible Changed
From-To: freebsd-bugs->iedowse

Over to Ian, who wrote UFS dirhash.
Comment 2 Mark Linimon freebsd_committer freebsd_triage 2009-05-28 23:08:53 UTC
Responsible Changed
From-To: iedowse->freebsd-bugs

iedowse is not actively working on this problem ATM.
Comment 3 John Baldwin freebsd_committer freebsd_triage 2010-04-01 14:01:57 UTC
While the kernel scheduler will not preempt a thread in the kernel (e.g.
during a system call) when its timeslice expires, it will preempt that thread
for interrupts (assuming you have 'options PREEMPTION' enabled, which has been
on by default in GENERIC on i386 for some time now). Thus the dirhash
calculations should not starve interrupts.  However, X is not an interrupt, so
while things like ping should still work, X will not get to run.

While it would be tempting, for large directories, to defer the hashing of
the directory contents to an asynchronous task running in a low-priority
thread, this might have bad side effects due to priority inversions caused by
a very low priority thread holding various vnode locks.

John Baldwin
Comment 4 Alexander Best freebsd_committer 2010-09-07 01:21:40 UTC
Responsible Changed
From-To: freebsd-bugs->freebsd-fs

Over to maintainer(s).
Comment 5 Eitan Adler freebsd_committer freebsd_triage 2017-12-31 08:01:02 UTC
For bugs matching the following criteria:

Status: In Progress Changed: (is less than) 2014-06-01

Reset to default assignee and clear in-progress tags.

Mail being skipped