| Summary: | system panic in quotaoff | ||
|---|---|---|---|
| Product: | Base System | Reporter: | Derren Lu <derrenl> |
| Component: | kern | Assignee: | freebsd-bugs (Nobody) <bugs> |
| Status: | Closed FIXED | ||
| Severity: | Affects Only Me | ||
| Priority: | Normal | ||
| Version: | Unspecified | ||
| Hardware: | Any | ||
| OS: | Any | ||
I'd like to confirm this same set of circumstances as of 4.2-STABLE and 4.4-STABLE - I don't know whether this patch is still relevant or whether it worked, but this problem still exists. I was unable to get the output of a crash report (the kernel message appearing onscreen), but I can provide other info if it is needed. State Changed From-To: open->feedback Can you please try the patch I committed to sys/ufs/ufs/ufs_quota.c in version 1.50 ? http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/ufs/ufs/ufs_quota.c.diff?r1=1.49&r2=1.50 Thanks! State Changed From-To: feedback->closed Feedback timeout. |
In a machine with serveral quata-enabled file systems, you can make it panic with the following steps: 1. unmount some of these file systems, 2. do some quota operations (e.g. setquota) on existed file systems, 3. re-mount those unmounted file systems and enable quota, 4. reapeat step 1 ~ 3 and system will panic in qutaoff. The code stack is: dqflush() quotaoff() ffs_flushfiles() ffs_unmount() dounmount() unmount() syscall2() Xint0x80_syscall() The line begins with "=>" is exactly the source of panic. It gets page fault because the value of dq->dq_ump is 0. static void dqflush() { ... for (dqh = ...) { for (dq = ...) { nextdq = dq->dq_hash.le_next; => if (dq->dq_ump->um_quotas[dq->dq_type] != vp) contine; ... } } ... } Fix: I traced the codes and found the problem may be in dqget() of sys/ufs/ufs/ufs_quota.c static int dqget() { ... if (numdquot < desireddquot) { dq = ... bzero((char *) dq, sizeof *dq); numdquot++; } else { if ((dq = dqfreelist.tqh_first) == NULL) { ... } if (dq->dq_cnt || (dq->dq_flags & DO_MOD)) panic("..."); TAILQ_REMOVE(&dqfreelist, dq, dq_freelist); => LIST_REMOVE(dq, dq_hast); } ... } In my opinion, the "LIST_REMOVE(dq, dq_hash)" is buggy because it assumes when you get a free dquot from freelist this dquot must also be linked in dqhash list. Unfortunately, this assumption is incorrect. Once this dquot is freed from quotaoff(), it will be removed from dqhast list in dqflush(). When you get this dquot in dqget(), it is actually not be linked in dqhash list. However, the dqflush() doesn't reset the fields of this dquot. So the execution of "LIST_REMOVE(dq, dq_hash)" may relink the next element of this dquot to dqhash list if it does have next element before freed. Since this "next elment" may also be a freed dquot, its dq_ump field may be 0 and this may result system panic in next dqflush(). To solve this problem, I try to add a new quota flag DQ_FLUSHED. #define DQ_FLUSHED 0x40 And modify functions dqflush() and dqget(). Those lines begin with "=>" is the codes I added. dqget() { ... if (numdquot < desireddquot) { dq = ... bzero((char *) dq, sizeof *dq); numdquot++; } else { if ((dq = dqfreelist.tqh_first) == NULL) { ... } if (dq->dq_cnt || (dq->dq_flags & DO_MOD)) panic("..."); TAILQ_REMOVE(&dqfreelist, dq, dq_freelist); /* * only the dquot linked in dqhast list will be removed from hash * list */ => if ((dq->dq_flags & DQ_FLUSHED) == 0) LIST_REMOVE(dq, dq_hast); } ... } dqflush() { ... for (dqh = ...) { for (dq = ...) { ... dq->dq_ump = (struct ufsmount *)0; /* * Mark this dquot so that dqget() will not remove it from * dqhast list one more time. */ => dq->dq_flags |= DQ_FLUSHED; } } ... } How-To-Repeat: In a machine with serveral quata-enabled file systems, you can make it panic with the following steps: 1. quotaoff some of these file systems, 2. do some quota operations (e.g. setquota), 3.quotaon them, 4. reapeat step 1 ~ 4 and system will panic in qutaoff