Bug 32331

Summary: system panic in quotaoff
Product: Base System Reporter: Derren Lu <derrenl>
Component: kernAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed FIXED    
Severity: Affects Only Me    
Priority: Normal    
Version: Unspecified   
Hardware: Any   
OS: Any   

Description Derren Lu 2001-11-27 13:20:05 UTC
In a machine with serveral quata-enabled file systems, you can make it panic with the following steps: 
1. unmount some of these file systems, 2. do some quota operations (e.g. setquota) on existed file systems, 3. re-mount those unmounted file systems and enable quota, 4. reapeat step 1 ~ 3 and system will panic in qutaoff. 

The code stack is:
dqflush()
quotaoff()
ffs_flushfiles()
ffs_unmount()
dounmount()
unmount()
syscall2()
Xint0x80_syscall()


The line begins with "=>" is exactly the source of panic. It gets page fault because the value of dq->dq_ump is 0.
static void
dqflush()
{
...
    for (dqh = ...) {
        for (dq = ...) {
            nextdq = dq->dq_hash.le_next;
=>          if (dq->dq_ump->um_quotas[dq->dq_type] != vp)
                contine;
            ...
        }
    }
...
}

Fix: 

I traced the codes and found the problem may be in dqget() of sys/ufs/ufs/ufs_quota.c

static int
dqget()
{
...
if (numdquot < desireddquot) {
    dq = ...
    bzero((char *) dq, sizeof *dq);
    numdquot++;
} else {
    if ((dq = dqfreelist.tqh_first) == NULL) {
        ...
    }
    if (dq->dq_cnt || (dq->dq_flags & DO_MOD))
        panic("...");
    TAILQ_REMOVE(&dqfreelist, dq, dq_freelist);
=>  LIST_REMOVE(dq, dq_hast);
}
...
}

In my opinion, the "LIST_REMOVE(dq, dq_hash)" is buggy because it assumes when you get a free dquot from freelist this dquot must also be linked in dqhash list. Unfortunately, this assumption is incorrect. Once this dquot is freed from quotaoff(), it will be removed from dqhast list in dqflush(). When you get this dquot in dqget(), it is actually not be linked in dqhash list. However, the dqflush() doesn't reset the fields of this dquot. So the execution of "LIST_REMOVE(dq, dq_hash)" may relink the next element of this dquot to dqhash list if it does have next element before freed. Since this "next elment" may also be a freed dquot, its dq_ump field may be 0 and this may result system panic in next dqflush().

To solve this problem, I try to add a new quota flag DQ_FLUSHED.

#define DQ_FLUSHED  0x40

And modify functions dqflush() and dqget(). Those lines begin with "=>" is the codes I added.

dqget()
{
...
if (numdquot < desireddquot) {
    dq = ...
    bzero((char *) dq, sizeof *dq);
    numdquot++;
} else {
    if ((dq = dqfreelist.tqh_first) == NULL) {
        ...
    }
    if (dq->dq_cnt || (dq->dq_flags & DO_MOD))
        panic("...");
    TAILQ_REMOVE(&dqfreelist, dq, dq_freelist);
    /* 
     * only the dquot linked in dqhast list will be removed from hash
     * list 
     */
=>  if ((dq->dq_flags & DQ_FLUSHED) == 0)
       LIST_REMOVE(dq, dq_hast);
}
...
}


dqflush()
{
...
    for (dqh = ...) {
        for (dq = ...) {
            ...
            dq->dq_ump = (struct ufsmount *)0;
            /*
             * Mark this dquot so that dqget() will not remove it from
             * dqhast list one more time.
             */
=>          dq->dq_flags |= DQ_FLUSHED;
        }
    }
...
}
How-To-Repeat: In a machine with serveral quata-enabled file systems, you can make it panic with the following steps: 
1. quotaoff some of these file systems, 2. do some quota operations (e.g. setquota), 3.quotaon them, 4. reapeat step 1 ~ 4 and system will panic in qutaoff
Comment 1 brad 2002-01-01 17:54:19 UTC
I'd like to confirm this same set of circumstances as of 4.2-STABLE and 
4.4-STABLE - I don't know whether this patch is still relevant or 
whether it worked, but this problem still exists.

I was unable to get the output of a crash report (the kernel message 
appearing onscreen), but I can provide other info if it is needed.
Comment 2 Poul-Henning Kamp freebsd_committer freebsd_triage 2002-01-10 15:03:10 UTC
State Changed
From-To: open->feedback

Can you please try the patch I committed to sys/ufs/ufs/ufs_quota.c 
in version 1.50 ? 

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/ufs/ufs/ufs_quota.c.diff?r1=1.49&r2=1.50 

Thanks!
Comment 3 Giorgos Keramidas freebsd_committer freebsd_triage 2002-08-22 00:21:28 UTC
State Changed
From-To: feedback->closed

Feedback timeout.