Bug 20804

Summary: deadlocking when using vnode disk file and quotas
Product: Base System
Reporter: chrisx77 <chrisx77>
Component: kern
Assignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description chrisx77 2000-08-23 17:40:01 UTC
The problem is that I want quotas 
to work inside of a jail which is running with the vnode disk as its "root" 
directory.  The problem that I'm having is that after turning on quotas 
(this must be done outside of the jail), if I try to access a file which 
already has a quota, or I try to modify the quota using edquota (from inside 
of the jail), the process hangs.  Once the process is hung, there is no 
way to kill it, and any subsequent access to the vnode disk also hangs. 
These processes will stop in 1 of 5 different tsleep statements (inode, 
chkdq1, chkdq2, chkiq1, or chkiq2).  I am assuming that the problem is that 
the main filesystem puts a lock on the vnode disk file, and subsequently 
tries to lock the inode inside of the file and is unable to because of the 
previous lock on the vnode disk file.
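
The locking pattern hypothesized above can be sketched in user space. This is
a minimal illustration only, not kernel code: it assumes that a single
non-reentrant lock stands in for the lock on the file backing the vnode disk,
and the names backing_file_lock, write_backing_file and
update_quota_while_holding_backing_lock are hypothetical. A timeout is used so
that the sketch reports the failed acquisition instead of hanging the way the
real processes do.

```python
import threading

# Hypothetical stand-in for the lock on the file that backs the vnode disk.
# threading.Lock is non-reentrant: the holder cannot acquire it a second time.
backing_file_lock = threading.Lock()


def write_backing_file(timeout=0.1):
    """Simulate inner-filesystem I/O that must lock the backing file."""
    # The real processes sleep here forever; the timeout only keeps the
    # sketch from hanging.
    acquired = backing_file_lock.acquire(timeout=timeout)
    if acquired:
        backing_file_lock.release()
    return acquired


def update_quota_while_holding_backing_lock():
    """Hold the outer lock, then start an operation that needs it again."""
    with backing_file_lock:
        return write_backing_file()


if __name__ == "__main__":
    print("uncontended write acquired lock:", write_backing_file())
    print("nested write acquired lock:", update_quota_while_holding_backing_lock())
```

Run as a script, the uncontended call acquires the lock and the nested call
does not, which is the user-space analogue of a process sleeping forever on a
lock it already holds. Whether the kernel deadlock really has this shape is
the open question in this report.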

This is being submitted at the request of Robert Watson.

Fix: 

none as of yet.
How-To-Repeat: Set up a jail whose root directory is a filesystem in a file (vnode disk), and then turn on quotas from outside of the jail.  After doing this, run the edquota command from inside the jail.
Comment 1 Sheldon Hearn 2000-08-24 10:19:33 UTC
On Wed, 23 Aug 2000 09:32:12 MST, chrisx77@hotmail.com wrote:

> >Number:         20804
> >Category:       kern
> >Synopsis:       deadlocking when using vnode disk file and quotas
[...]

> This is being submitted at the request of Robert Watson.

Robert, do you want this PR for yourself, or was it your intention to
have it assigned to phk?

Ciao,
Sheldon.
Comment 2 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-24 14:18:28 UTC
State Changed
From-To: open->closed

Duplicate of PR 20787.
Comment 3 Robert Watson freebsd_committer freebsd_triage 2000-08-24 14:37:05 UTC
On Thu, 24 Aug 2000, Sheldon Hearn wrote:

> On Wed, 23 Aug 2000 09:32:12 MST, chrisx77@hotmail.com wrote:
> 
> > >Number:         20804
> > >Category:       kern
> > >Synopsis:       deadlocking when using vnode disk file and quotas
> [...]
> 
> > This is being submitted at the request of Robert Watson.
> 
> Robert, do you want this PR for yourself, or was it your intention to
> have it assigned to phk?

If phk will accept it, that's probably best, as I believe md is his baby,
and it's a recursive locking issue relating to UFS and md.  If not, I'll
take it but it may take me a bit to solve the problem as I'll be moving
to Massachusetts next week, and will be dealing with packing, furniture,
etc. :-)  I did request that a PR be filed as I think this is an important
bug for us to work through.


  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
Comment 4 Robert Watson freebsd_committer freebsd_triage 2000-08-24 14:39:17 UTC
Poul-Henning,

Do you have a chance to look at this bug, which was raised on -fs
recently?  It has to do, I believe, with a recursive lock/deadlock issue
relating to the md device and the UFS quota implementation, presumably a
lock ordering issue relating to the md device vnode lock and the vnode
lock for the quota data file on that file system.  I haven't had a chance
to look much further, and apparently it mostly manifests in jail (I'm not
sure I understand why that might be the case -- possibly having to do with
the starting point (root) for recursive name lookups with lock requests).

Thanks,

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services

---------- Forwarded message ----------
Date: Thu, 24 Aug 2000 11:19:33 +0200
From: Sheldon Hearn <sheldonh@uunet.co.za>
To: rwatson@freebsd.org
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: kern/20804: deadlocking when using vnode disk file and quotas 



On Wed, 23 Aug 2000 09:32:12 MST, chrisx77@hotmail.com wrote:

> >Number:         20804
> >Category:       kern
> >Synopsis:       deadlocking when using vnode disk file and quotas
[...]

> This is being submitted at the request of Robert Watson.

Robert, do you want this PR for yourself, or was it your intention to
have it assigned to phk?

Ciao,
Sheldon.
Comment 5 Poul-Henning Kamp 2000-08-24 14:43:02 UTC
In message <Pine.NEB.3.96L.1000824093709.31571B-100000@fledge.watson.org>, Robert Watson writes:
>
>Poul-Henning,
>
>Do you have a chance to look at this bug, which was raised on -fs
>recently?  It has to do, I believe, with a recursive lock/deadlock issue
>relating to the md device and the UFS quota implementation, presumably a
>lock ordering issue relating to the md device vnode lock and the vnode
>lock for the quota data file on that file system.  I haven't had a chance
>to look much further, and apparently it mostly manifests in jail (I'm not
>sure I understand why that might be the case -- possibly having to do with
>the starting point (root) for recursive name lookups with lock requests).
>
>Thanks,
>
>  Robert N M Watson 

Robert,

No I haven't even looked at it.  The jail stuff doesn't even know
what a lock is, much less touch one, so the jail involvement was
not enough to make me put it on my busy schedule...

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
Comment 6 Sheldon Hearn freebsd_committer freebsd_triage 2000-08-24 14:48:09 UTC
State Changed
From-To: closed->open

PR grunt on drugs.
Comment 7 Sheldon Hearn 2000-08-24 14:50:31 UTC
On Thu, 24 Aug 2000 15:43:02 +0200, Poul-Henning Kamp wrote:

> No I haven't even looked at it.  The jail stuff doesn't even know
> what a lock is, much less touch one, so the jail involvement was
> not enough to make me put it on my busy schedule...

From Robert's earlier comments, it sounds more like an MD thing than a
jail thing?

Ciao,
Sheldon.
Comment 8 Robert Watson freebsd_committer freebsd_triage 2000-08-24 14:56:34 UTC
On Thu, 24 Aug 2000, Poul-Henning Kamp wrote:

> In message <Pine.NEB.3.96L.1000824093709.31571B-100000@fledge.watson.org>, Robert Watson writes:
> >
> >Poul-Henning,
> >
> >Do you have a chance to look at this bug, which was raised on -fs
> >recently?  It has to do, I believe, with a recursive lock/deadlock issue
> >relating to the md device and the UFS quota implementation, presumably a
> >lock ordering issue relating to the md device vnode lock and the vnode
> >lock for the quota data file on that file system.  I haven't had a chance
> >to look much further, and apparently it mostly manifests in jail (I'm not
> >sure I understand why that might be the case -- possibly having to do with
> >the starting point (root) for recursive name lookups with lock requests).
> 
> No I haven't even looked at it.  The jail stuff doesn't even know
> what a lock is, much less touch one, so the jail involvement was
> not enough to make me put it on my busy schedule...

My impression was that the bug had more to do with "md" -- is md yours, or
someone else's?  The reason I'm looking at the jail issue is that that's
how the reporter of the bug described it.  The interaction with jail() has
to do with chroot() and recursive name lookup and locking: when a process
is jailed, the name lookup, and hence tree walking and lock handling,
starts at a different point in the tree, probably with a root in the md
file system, which could affect the locking issues.


  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
Comment 9 Poul-Henning Kamp 2000-08-24 14:58:27 UTC
In message <84319.967125031@axl.fw.uunet.co.za>, Sheldon Hearn writes:
>
>
>On Thu, 24 Aug 2000 15:43:02 +0200, Poul-Henning Kamp wrote:
>
>> No I haven't even looked at it.  The jail stuff doesn't even know
>> what a lock is, much less touch one, so the jail involvement was
>> not enough to make me put it on my busy schedule...
>
>From Robert's earlier comments, it sounds more like an MD thing than a
>jail thing?

The only thing magic about MD is its access time...

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
Comment 10 Robert Watson freebsd_committer freebsd_triage 2000-08-24 15:01:33 UTC
On Thu, 24 Aug 2000, Poul-Henning Kamp wrote:

> In message <84319.967125031@axl.fw.uunet.co.za>, Sheldon Hearn writes:
> >
> >> No I haven't even looked at it.  The jail stuff doesn't even know
> >> what a lock is, much less touch one, so the jail involvement was
> >> not enough to make me put it on my busy schedule...
> >
> >From Robert's earlier comments, it sounds more like an MD thing than a
> >jail thing?
> 
> The only thing magic about MD is its access time...

I guess the magic question for the bug reporter would be, ``if you run
edquota from within a jail on a non-md disk, does the same lock problem
occur''.  He may have listed this in the original PR, but I don't recall.


  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services
Comment 11 Poul-Henning Kamp 2000-08-24 15:12:44 UTC
In message <Pine.NEB.3.96L.1000824100047.31571D-100000@fledge.watson.org>, Robert Watson writes:
>On Thu, 24 Aug 2000, Poul-Henning Kamp wrote:
>
>> In message <84319.967125031@axl.fw.uunet.co.za>, Sheldon Hearn writes:
>> >
>> >> No I haven't even looked at it.  The jail stuff doesn't even know
>> >> what a lock is, much less touch one, so the jail involvement was
>> >> not enough to make me put it on my busy schedule...
>> >
>> >From Robert's earlier comments, it sounds more like an MD thing than a
>> >jail thing?
>> 
>> The only thing magic about MD is its access time...
>
>I guess the magic question for the bug reporter would be, ``if you run
>edquota from within a jail on a non-md disk, does the same lock problem
>occur''.  He may have listed this in the original PR, but I don't recall.

I think this is a preexisting bug which jail/chroot + MD exposes.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD coreteam member | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
Comment 12 tarcieri 2004-11-09 20:11:22 UTC
I've experienced this problem in FreeBSD 5.3-RELEASE as well, and I 
believe it's tied to the quota.user file.  Placing the quota.user file 
outside the md filesystem, by mounting the md device with the 
userquota=/anywhere/outside/the/loopback/filesystem option, does *not* 
result in a deadlock and, in fact, seems to work perfectly.

Tony Arcieri
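
The workaround described above might look like the following /etc/fstab
entry. This is a sketch only: the device name /dev/md0, the mount point
/jails/example and the quota file path /var/quotas/example.user are
placeholders, and it assumes the vnode-backed md device has already been
attached before the entry is processed. The userquota=<path> form is the
fstab(5) way of overriding the default quota.user location at the root of the
file system.

```
# Device    Mountpoint      FStype  Options                                   Dump  Pass#
/dev/md0    /jails/example  ufs     rw,userquota=/var/quotas/example.user     2     2
```

With an entry of this shape the quota data file lives on the host file
system rather than inside the md-backed one, which is the arrangement
reported above as not deadlocking.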
Comment 13 K. Macy freebsd_committer freebsd_triage 2007-11-16 03:54:17 UTC
State Changed
From-To: open->feedback


may be fixed
Comment 14 Mark Linimon freebsd_committer freebsd_triage 2008-03-01 19:40:05 UTC
State Changed
From-To: feedback->closed

Feedback timeout (> 3 months).