Bug 21009

Summary: /etc/security makes the system hang
Product: Base System
Reporter: Masachika ISHIZUKA <ishizuka>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: 4.1-RELEASE   
Hardware: Any   
OS: Any   

Description Masachika ISHIZUKA 2000-09-03 07:00:01 UTC
	The daily cron job crashes the system if I have too many files
	on the disks. The call chain is as follows.

	/etc/crontab (periodic daily)
	-> /etc/periodic/daily/450.status-security
	-> /etc/security
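
For context, the last script in this chain walks each mounted file system, among other checks looking for setuid and setgid files. The sketch below is a rough stand-in for that kind of scan, not the code from /etc/security itself; the function names scan_setid and time_scan are hypothetical. It can help show which mount point dominates the run time.

```sh
#!/bin/sh
# Rough stand-in for a setuid/setgid scan of one file system.
# This is an illustration, not the actual code in /etc/security.
scan_setid() {
    # -xdev keeps find from crossing into other mounted file systems.
    # -perm -4000 matches setuid files, -perm -2000 matches setgid files.
    find "$1" -xdev -type f \( -perm -4000 -o -perm -2000 \) -print
}

# Report how many seconds the scan of one mount point takes.
time_scan() {
    start=$(date +%s)
    scan_setid "$1" > /dev/null
    end=$(date +%s)
    echo "$1: $((end - start)) seconds"
}

# Usage (one line of output per mount point):
#   for mp in `mount -t ufs | grep -v " nosuid" | awk '{ print $3 }'`; do
#       time_scan "$mp"
#   done
```

Running the scan by hand on a single mount point, outside of cron, may also show whether the scan alone is enough to trigger the hang.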

Fix: 

I don't know of a proper fix.
	Because I have too many files on the disk mounted on /www,
	I changed line 30 of /etc/security from

	MP=`mount -t ufs | grep -v " nosuid" | sed 's;/dev/;&r;' |
	     awk '{ print $3 }'`

	to

	MP=`mount -t ufs | grep -v " nosuid" | sed 's;/dev/;&r;' |
	     awk '{ print $3 }' | grep -v '^/www$'`

	But I don't think this is a good idea.
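
The workaround above hard-codes /www into the script. A slightly more general sketch keeps the excluded mount points in a variable instead. Both the variable security_exclude_mounts and the function select_mount_points are hypothetical names and are not part of the stock /etc/security. Like the workaround, this only avoids the scan and does not address the hang itself.

```sh
#!/bin/sh
# Hypothetical knob: space-separated mount points to skip.
# Not a variable the stock /etc/security knows about.
: "${security_exclude_mounts:=/www}"

# Read `mount -t ufs` style output on stdin and print the mount
# points to scan, one per line. The sed stage of the original line
# is left out because it only rewrites the device name in field 1,
# and only field 3 (the mount point) is printed.
select_mount_points() {
    grep -v " nosuid" | awk '{ print $3 }' |
    while read -r mp; do
        skip=no
        for ex in $security_exclude_mounts; do
            if [ "$mp" = "$ex" ]; then
                skip=yes
            fi
        done
        if [ "$skip" = no ]; then
            printf '%s\n' "$mp"
        fi
    done
}

# Usage, replacing the MP assignment:
#   MP=`mount -t ufs | select_mount_points`
```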
Comment 1 Sheldon Hearn 2000-09-04 14:32:11 UTC
State Changed
From-To: open->feedback

Please explain what "crashes the system" means.  Since the environment
you describe is hard to set up (especially given that you don't 
define "too many files"), this will be easier to investigate 
based on the actual error messages you see.
Comment 2 Masachika ISHIZUKA 2000-09-06 10:45:56 UTC
> Synopsis: /etc/security makes the system hang
> State-Changed-From-To: open->feedback
> State-Changed-By: sheldonh
> State-Changed-When: Mon Sep 4 06:32:11 PDT 2000
> State-Changed-Why: 
>
> Please explain what "crashes the system" means.  Since the environment
> you describe is hard to set up (especially given that you don't
> define "too many files"), this will be easier to investigate
> based on the actual error messages you see.
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=21009

  Hi, sheldonh-san.
  Thank you for your mail.
There are no error messages. When it happens, I can't log in and there
is no response to ping.
  Only Alt-F? still works, but after I switch screens with Alt-F? and
press the return key, the login prompt just scrolls up without any
new messages. After I press the return key 25 times, there is nothing
left on the screen at all.

  Four machines hang this way. The output of 'df -ki' on each of them
follows. All machines run 4.1-RELEASE. Most of the directories
and files on the disk mounted on /www have about 5000 hard links,
so there are more than 6,000,000 links, files or directories on each
/www disk.
  /etc/security takes about 1.5 or 2.5 hours to run on a
Pentium III/600 or Pentium II/400.  The ccd0c and /dev/vinum/www
volumes are each striped across two 16GB or 20GB UDMA33 ATA drives.

(1)
/dev/ccd0c   17591175  1999479 14184402    12%  443610 3952932    10%   /www

(2)
/dev/ccd0c   17072631  2645276 13061545    17%  448429 8096465     5%   /www

(3)
/dev/vinum/www  17072515   353165 15353549     2%  119432 8425462     1%   /www

(4)
/dev/ccd0c   17072631  2407959 13298862    15%  454757 8090137     5%   /www
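
The lines above have no header, so the following sketch spells out how they can be read. It assumes the usual FreeBSD 'df -ki' column order (Filesystem, 1K-blocks, Used, Avail, Capacity, iused, ifree, %iused, Mounted on); the function name inode_summary is hypothetical. Note that iused counts inodes, and a file with many hard links still uses a single inode, so this column can be far smaller than the number of directory entries.

```sh
#!/bin/sh
# Summarize inode usage from `df -ki` output read on stdin.
# Assumed column order (FreeBSD):
#   Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted-on
inode_summary() {
    # Skip the header line if present; require all nine columns.
    awk '$1 != "Filesystem" && NF >= 9 {
        printf "%s iused=%d ifree=%d total=%d\n", $9, $6, $7, $6 + $7
    }'
}

# Usage:
#   df -ki | inode_summary
```

For machine (1) this would report 443610 inodes in use on /www, which is well below the 6,000,000 links mentioned above because hard links share inodes.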

-- 
ishizuka@ish.org
Comment 3 Sheldon Hearn 2000-09-06 12:15:33 UTC
On Wed, 06 Sep 2000 02:50:04 MST, Masachika ISHIZUKA wrote:

>    Four machines hang this way. The output of 'df -ki' on each of
>  them follows. All machines run 4.1-RELEASE. Most of the directories
>  and files on the disk mounted on /www have about 5000 hard links,
>  so there are more than 6,000,000 links, files or directories on each
>  /www disk.
>    /etc/security takes about 1.5 or 2.5 hours to run on a
>  Pentium III/600 or Pentium II/400.  The ccd0c and /dev/vinum/www
>  volumes are each striped across two 16GB or 20GB UDMA33 ATA drives.

Could you stick a debugging kernel on one of those boxes and use DDB or
remote kgdb to figure out what the kernel's stuck in?  There are just
too many variables here.

Instructions that you might find helpful are available at:

	http://www.freebsd.org/handbook/kerneldebug.html

Ciao,
Sheldon.
Comment 4 Masachika ISHIZUKA 2000-09-06 17:13:49 UTC
>>    Four machines hang this way. The output of 'df -ki' on each of
>>  them follows. All machines run 4.1-RELEASE. Most of the directories
>>  and files on the disk mounted on /www have about 5000 hard links,
>>  so there are more than 6,000,000 links, files or directories on each
>>  /www disk.
> 
> Could you stick a debugging kernel on one of those boxes and use DDB or
> remote kgdb to figure out what the kernel's stuck in?  There are just
> too many variables here.
>
> Instructions that you might find helpful are available at:
> 
> 	http://www.freebsd.org/handbook/kerneldebug.html

  Hi, sheldonh-san.
  Thank you for your mail.
I'll try the above. But I cannot let these machines panic, so I
must set up a test machine with the same environment. I'm too busy
to do that right now, so please wait a couple of weeks or more.

-- 
ishizuka@ish.org
Comment 5 Sheldon Hearn 2000-09-06 17:53:31 UTC
On Thu, 07 Sep 2000 01:15:20 +0900, Masachika ISHIZUKA wrote:

> I'll try the above. But I cannot let these machines panic, so I
> must set up a test machine with the same environment. I'm too busy
> to do that right now, so please wait a couple of weeks or more.

Another suggestion that I got in private mail was this:

Run something like this on its own virtual terminal:

	while sleep 1; do
		vmstat -m | tail -2
	done

Report back the values displayed at the time of the lock-up.  This
should be easy to do, since you said that you were able to switch
virtual terminals during the lock-up.
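
If reading the values off the screen at the moment of the lock-up is awkward, a variant of the loop above can timestamp each sample and append it to a file. This is only a sketch: the function name log_samples and the log path in the usage line are hypothetical, and if the disk holding the log is itself stuck, the last samples may never reach it.

```sh
#!/bin/sh
# Run a command repeatedly, keeping the last two lines of its output
# and appending them to a log file with a timestamp.
#   log_samples COUNT INTERVAL LOGFILE COMMAND [ARGS...]
log_samples() {
    count=$1
    interval=$2
    logfile=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            date '+%Y-%m-%d %H:%M:%S'
            "$@" | tail -2
        } >> "$logfile"
        # Push the sample to disk so it has a chance to survive a hang.
        sync
        i=$((i + 1))
        sleep "$interval"
    done
}

# Usage (three hours of one-second samples):
#   log_samples 10800 1 /var/tmp/vmstat-m.log vmstat -m
```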

Ciao,
Sheldon.
Comment 6 Dag-Erling Smørgrav 2001-03-13 04:04:34 UTC
State Changed
From-To: feedback->closed

Feedback timeout.