I'm benchmarking the Squid proxy cache, which generates a lot of disk I/O.
After an hour or two, all of the Squid processes get stuck in the 'ffsvgt'
state, and one of them is stuck in 'FFS no':

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  368 wessels   98    0  102M 99036K RUN    106:01 98.14% 98.14% squid
  375 wessels  -18    0 1272K   812K ffsvgt   2:23  0.00%  0.00% diskd
  373 wessels  -18    0 1272K   812K ffsvgt   2:22  0.00%  0.00% diskd
  374 wessels  -20    0 1272K   812K FFS no   2:22  0.00%  0.00% diskd
  372 wessels  -18    0 1272K   812K ffsvgt   2:20  0.00%  0.00% diskd
  370 wessels  -18    0 1272K   812K ffsvgt   2:19  0.00%  0.00% diskd
  371 wessels  -18    0 1272K   812K ffsvgt   2:19  0.00%  0.00% diskd

Although the system continues to run, it seems like any process that tries
to access a filesystem also becomes stuck in the ffsvgt state.  I can't
start new commands, log in, etc.

Filesystems are mounted normally.  No special options:

  /dev/da0c /cache1 ufs rw 0 3
  /dev/da1c /cache2 ufs rw 0 3
  /dev/da2c /cache3 ufs rw 0 3
  /dev/da3c /cache4 ufs rw 0 3
  /dev/da4c /cache5 ufs rw 0 3
  /dev/da5c /cache6 ufs rw 0 3

I gather that 'FFS no' is related to filesystem vnodes.  I wonder if I'm
running out of vnodes, or if they're not getting freed?
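One way to quantify the "running out of vnodes" suspicion is to compare the kernel's current vnode count against its target.  The sketch below is hedged: the sysctl names (`vfs.numvnodes`, `kern.maxvnodes`) are the ones I'd expect on later FreeBSD and may differ on 3.x, and the sample figures are the ones from the systat screen quoted in this PR.

```shell
#!/bin/sh
# Sketch: express the live vnode count as a percentage of the kernel's
# target.  A value far above 100 suggests vnodes are not being reclaimed.
# The sysctl names in the usage comment are assumptions, not verified
# against FreeBSD 3.x.

# vnode_pressure NUMVNODES DESIREDVNODES
vnode_pressure() {
    numvnodes=$1
    desiredvnodes=$2
    echo $(( numvnodes * 100 / desiredvnodes ))
}

# On a live box you might run something like:
#   vnode_pressure "$(sysctl -n vfs.numvnodes)" "$(sysctl -n kern.maxvnodes)"
# With the figures from the systat screen in this report:
vnode_pressure 55047 32702    # -> 168
```

With the reported numbers the count sits at 168% of desiredvnodes, which is consistent with vnodes not being freed rather than the target simply being too small.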
Here's systat after the benchmark has been running for 20 minutes:

    2 users    Load  1.23  1.16  0.90                  Fri Jun 23 18:10

Mem:KB    REAL            VIRTUAL                      VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free          in   out     in   out
Act   68748    1064    74260     3860   21700  count
All  513584    1440  3026816     4408          pages
                                                   53 cow       Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt 75 zfod      2377 total
     1  7  4 527 297 6247 2377  268  152        50408 wire       100 clk0 irq0
                                                71716 act        128 rtc0 irq8
 45.3%Sys   6.3%Intr  47.1%User  0.0%Nice  1.2%Idl 370040 inact 1864 pci irq10
|    |    |    |    |    |    |    |    |    |      20564 cache  266 pci irq11
=======================+++>>>>>>>>>>>>>>>>>>>>>>     1136 free       fdc0 irq6
                                                          daefr   19 wdc0 irq14
Namei         Name-cache    Dir-cache                 106 prcfr
    Calls     hits    %     hits    %                     react
     1920     1529   80        6    0                     pdwake
                                                      525 pdpgs
Discs  da0  da1  da2  da3  da4  da5  wd0                  intrn
KB/t  7.18 7.38 8.13 8.26 10.58 8.03 8.00            8342 buf
tps     55   57   50   40    18   47    1             202 dirtybuf
MB/s  0.39 0.41 0.39 0.32  0.18 0.37 0.01           32702 desiredvnodes
% busy  25   24   25   16    12   21    0            55047 numvnodes
                                                        24 freevnodes

How-To-Repeat: Install the latest squid-2.4 code and pummel it with a
Web Polygraph workload.
State Changed From-To: open->feedback

Does this problem still occur in newer versions of FreeBSD, such as
4.3-RELEASE?
Adding to Audit-Trail.

On Sun, Jul 22, 2001 at 05:56:16AM -0600, Duane Wessels wrote:
>
> On Sat, 21 Jul 2001 mike@FreeBSD.org wrote:
>
> > Synopsis: processes stuck in 'ffsvgt' and 'FFS no' states
> >
> > State-Changed-From-To: open->feedback
> > State-Changed-By: mike
> > State-Changed-When: Sat Jul 21 21:24:15 PDT 2001
> > State-Changed-Why:
> >
> > Does this problem still occur in newer versions of FreeBSD,
> > such as 4.3-RELEASE?
> >
> > http://www.FreeBSD.org/cgi/query-pr.cgi?pr=19479
>
> Nope.  I've only seen this in 3.x versions.  I haven't checked any
> recent 3.x versions.  All my servers run 4.x now.  Back then, it
> was really easy to reproduce the problem, however.
State Changed From-To: feedback->closed

This is no longer a problem in newer versions of FreeBSD.
Adding to Audit-Trail.

----- Forwarded message from Chuck Cranor <chuck@research.att.com> -----

Delivered-To: mike@freebsd.org
Date: Fri, 21 Sep 2001 13:41:34 -0400
From: Chuck Cranor <chuck@research.att.com>
To: freebsd-stable@FreeBSD.org
Cc: mike@FreeBSD.org
Subject: more on 4.4-RELEASE kernel deadlock ("FFS node")
User-Agent: Mutt/1.2.5i
Organization: AT&T Labs-Research

this is a followup to my previous message:

first, the problem i am seeing is almost identical to the one reported
in kern/19479 (see http://www.FreeBSD.org/cgi/query-pr.cgi?pr=19479)
which was closed by mike@freebsd back in july.  my data would suggest
that the PR should be re-opened, because it doesn't seem fixed?

i reran a "make release" while having a "vmstat -m | grep 'FFS node'"
running every three seconds in another window.  the memory allocation
for "FFS node" is exploding during the "make release"...  here is the
last bit of vmstat -m output before the kernel deadlocked:

Memory statistics by type                          Type  Kern
        Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
....
    FFS node 199352  99676K  99677K 102400K   648929     0     0  512
    FFS node 199871  99936K  99936K 102400K   649930     0     0  512
    FFS node 200457 100229K 100229K 102400K   651733     0     0  512
    FFS node 200794 100397K 100398K 102400K   653082     0     0  512
    FFS node 200875 100438K 100440K 102400K   653902     0     0  512
    FFS node 201466 100733K 100733K 102400K   655886     0     0  512
    FFS node 202732 101366K 101367K 102400K   658611     0     0  512
    FFS node 203300 101650K 101651K 102400K   660008     0     0  512
    FFS node 203851 101926K 101926K 102400K   661718     0     0  512
    FFS node 203923 101962K 101962K 102400K   661927     0     0  512
    FFS node 204489 102245K 102245K 102400K   663650     0     0  512
<deadlock here!>

102245K seems like an awful lot of memory for FFS nodes?  what would
cause this?  does anyone else see very large FFS node memory allocation
during "make release"?

i got a stack trace of the process stuck in "FFS node"...
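The polling loop described above can be sketched as a small filter that turns the raw "FFS node" line into a percentage of the limit.  This is a hedged sketch: the field positions are assumed from the vmstat -m output quoted in this message, and other FreeBSD versions may format the table differently.

```shell
#!/bin/sh
# ffs_node_usage reads vmstat -m style lines on stdin and prints the
# "FFS node" MemUse as a percentage of its Limit.  Field numbers assume
# the column layout shown in this PR ("FFS" and "node" split across two
# awk fields, so MemUse is $4 and Limit is $6).
ffs_node_usage() {
    awk '/FFS node/ {
        memuse = $4; limit = $6            # e.g. "102245K" and "102400K"
        sub(/K$/, "", memuse); sub(/K$/, "", limit)
        printf "%d%%\n", memuse * 100 / limit
    }'
}

# On a live system the loop might look like (assumed invocation):
#   while :; do vmstat -m | ffs_node_usage; sleep 3; done
# With the last line captured before the deadlock:
echo '    FFS node 204489 102245K 102245K 102400K   663650     0     0  512' |
    ffs_node_usage    # -> 99%
```

At 99% of the limit, the very next inode allocation is the one that has nowhere to go.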
it is basically:

  open
  vn_open
  namei
  lookup
  ufs_vnoperate
  vfs_cache_lookup
  ufs_vnoperate
  ufs_lookup
  ffs_vget+0x8b
  malloc+0x1a3
  tsleep

clearly we are reaching the malloc limit for "FFS node" and hanging.

chuck

----- End forwarded message -----
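The arithmetic behind the "reaching the malloc limit" conclusion checks out against the quoted vmstat -m numbers: InUse allocations times the 512-byte Size column lands essentially at the 102400K limit, so malloc() sleeps in tsleep waiting for memory that nothing is releasing.  A quick sketch of that check:

```shell
#!/bin/sh
# Sanity-check the final vmstat -m line quoted above: does InUse x Size
# really account for (nearly) all of the "FFS node" limit?

inuse=204489      # InUse column at the moment of deadlock
size=512          # bytes per allocation (Size(s) column)
limit_kb=102400   # malloc-type limit in KB

used_kb=$(( inuse * size / 1024 ))
echo "${used_kb}K of ${limit_kb}K used"    # -> 102244K of 102400K used
```

That leaves under 156K of headroom, i.e. fewer than ~312 more 512-byte inodes before the allocator blocks.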
State Changed From-To: closed->open

The originator's problem disappeared, but unfortunately another user is
seeing a similar problem.
State Changed From-To: open->feedback

This problem should be fixed in recent -STABLE versions.  One problem was
that I had accidentally pushed the size of the inode struct over 256
bytes, but a bigger issue relating to limiting the number of vnodes was
fixed by Matt Dillon about a month ago.  Can you confirm that this is
fixed now?
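The inode-size remark above connects to the Size(s) column earlier in this PR: kernel malloc hands out power-of-two buckets, so pushing a struct just past 256 bytes doubles its real memory cost to 512 bytes per inode.  A minimal sketch of that rounding (an illustration, not actual kernel code; the smallest bucket size is an assumption):

```shell
#!/bin/sh
# bucket N prints the power-of-two malloc bucket an N-byte request
# lands in, starting from an assumed 16-byte minimum bucket.
bucket() {
    n=$1; b=16
    while [ "$b" -lt "$n" ]; do b=$(( b * 2 )); done
    echo "$b"
}

bucket 256   # -> 256
bucket 257   # -> 512, matching the Size(s) column in the vmstat output
```

One byte of struct growth past the boundary is enough to halve how many inodes fit under the same "FFS node" limit.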
State Changed From-To: feedback->closed

Feedback timeout.