FreeBSD Bugzilla – Attachment 161461 Details for
Bug 202607
[panic] Poudriere umounting file systems causes 'solaris assert: avl_is_empty(&dn->dn_dbufs)' panic
[patch]
Generically handle deferred dbuf eviction
0001-dn_bonus-evicted-too-early.patch (text/plain), 9.58 KB, created by Justin T. Gibbs on 2015-09-27 21:43:36 UTC
Flags: patch, obsolete
From c4745133d13d085f12976de90d78f5a759afb2e9 Mon Sep 17 00:00:00 2001
From: "Justin T. Gibbs" <gibbs@FreeBSD.org>
Date: Thu, 24 Sep 2015 06:39:14 -0600
Subject: [PATCH] dn_bonus evicted too early

The bonus buffer associated with a dnode is expected to remain resident
until: the dnode is evicted via dnode_buf_pageout(), the dnode is
freed in dnode_sync_free(), or the objset containing the dnode is
evicted via dmu_objset_evict().  However, since bonus buffers (and DMU
buffers in general) can have draining references when these events occur,
dbuf_rele_and_unlock() has logic to ensure that once these late references
are released, affected buffers will be evicted.

dbuf_rele_and_unlock() currently checks for a dbuf of an evicting
objset via the os->os_evicting flag, and detects buffers for a freed
dnode by testing the dn->dn_type and dn->dn_free_txg fields.  Unfortunately,
the freed-dnode test can fire prematurely - anytime after the dnode is
scheduled to be freed via dnode_free() until the free is committed in
dnode_sync_free().  If all references to the bonus buffer are dropped
within this window, the bonus buffer will be evicted and code in
dnode_sync() that relies on the bonus buffer will fail.

Additionally, the freed-dnode test isn't applied to normal buffers
(buffers that are not the bonus buffer), and there is no mechanism to
guarantee eviction in the dnode_buf_pageout() case (the dnode is not
being freed nor is the objset being evicted).

Replace the two existing deferred eviction mechanisms with a per-dbuf
flag, db_pending_evict.  This is set when explicit eviction is requested
via either dnode_evict_dbufs() or dnode_evict_bonus().  These actions
only occur after it is safe for dnode buffers to be evicted (e.g. the
bonus buffer will not be referenced again).

uts/common/fs/zfs/sys/dbuf.h:
	Add comments for boolean fields in dmu_buf_impl_t.

	Add the db_pending_evict field.
uts/common/fs/zfs/sys/dbuf.h:
uts/common/fs/zfs/dbuf.c:
	Rename db_immediate_evict to db_user_immediate_evict to avoid
	confusion between dbuf user state eviction and deferred eviction
	of a dbuf.

uts/common/fs/zfs/dbuf.c:
	Consistently use TRUE/FALSE for boolean fields in
	dmu_buf_impl_t.

	Simplify pending eviction logic to use the new db_pending_evict
	flag in all cases.

uts/common/fs/zfs/dmu_objset.c:
uts/common/fs/zfs/sys/dmu_objset.h:
	Remove objset_t's os_evicting field.  This same functionality
	is now provided by db_pending_evict.

uts/common/fs/zfs/dnode_sync.c:
	In dnode_evict_dbufs() and dnode_evict_bonus(), mark dbufs
	with draining references (dbufs that can't be evicted inline)
	as pending_evict.

	In dnode_sync_free(), remove the ASSERT() that a dnode being
	freed has no active dbufs.  This is usually the case, but is not
	guaranteed due to draining references.  (e.g. The deadlist for a
	deleted dataset may still be open if another thread referenced
	the dataset just before it was freed and the dsl_dataset_t hasn't
	been released or is still being evicted.)
---
 usr/src/uts/common/fs/zfs/dbuf.c           | 47 ++++++++----------------
 usr/src/uts/common/fs/zfs/dmu_objset.c     |  1 -
 usr/src/uts/common/fs/zfs/dnode_sync.c     | 14 +++++----
 usr/src/uts/common/fs/zfs/sys/dbuf.h       | 18 +++++++++++-
 usr/src/uts/common/fs/zfs/sys/dmu_objset.h |  1 -
 5 files changed, 38 insertions(+), 43 deletions(-)

diff --git a/usr/src/uts/common/fs/zfs/dbuf.c b/usr/src/uts/common/fs/zfs/dbuf.c
index e75bac0..4a5faf5 100644
--- a/usr/src/uts/common/fs/zfs/dbuf.c
+++ b/usr/src/uts/common/fs/zfs/dbuf.c
@@ -272,7 +272,7 @@ dbuf_verify_user(dmu_buf_impl_t *db, dbvu_verify_type_t verify_type)
		 */
		ASSERT3U(holds, >=, db->db_dirtycnt);
	} else {
-		if (db->db_immediate_evict == TRUE)
+		if (db->db_user_immediate_evict == TRUE)
			ASSERT3U(holds, >=, db->db_dirtycnt);
		else
			ASSERT3U(holds, >, 0);
@@ -1834,8 +1834,9 @@ dbuf_create(dnode_t *dn, uint8_t level, uint64_t blkid,
	db->db_blkptr = blkptr;

	db->db_user = NULL;
-	db->db_immediate_evict = 0;
-	db->db_freed_in_flight = 0;
+	db->db_user_immediate_evict = FALSE;
+	db->db_freed_in_flight = FALSE;
+	db->db_pending_evict = FALSE;

	if (blkid == DMU_BONUS_BLKID) {
		ASSERT3P(parent, ==, dn->dn_dbuf);
@@ -2391,12 +2392,13 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag)
		arc_buf_freeze(db->db_buf);

	if (holds == db->db_dirtycnt &&
-	    db->db_level == 0 && db->db_immediate_evict)
+	    db->db_level == 0 && db->db_user_immediate_evict)
		dbuf_evict_user(db);

	if (holds == 0) {
		if (db->db_blkid == DMU_BONUS_BLKID) {
			dnode_t *dn;
+			boolean_t evict_dbuf = db->db_pending_evict;

			/*
			 * If the dnode moves here, we cannot cross this
@@ -2411,7 +2413,7 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag)
			 * Decrementing the dbuf count means that the bonus
			 * buffer's dnode hold is no longer discounted in
			 * dnode_move(). The dnode cannot move until after
-			 * the dnode_rele_and_unlock() below.
+			 * the dnode_rele() below.
			 */
			DB_DNODE_EXIT(db);

@@ -2421,35 +2423,10 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag)
			 */
			mutex_exit(&db->db_mtx);

-			/*
-			 * If the dnode has been freed, evict the bonus
-			 * buffer immediately.  The data in the bonus
-			 * buffer is no longer relevant and this prevents
-			 * a stale bonus buffer from being associated
-			 * with this dnode_t should the dnode_t be reused
-			 * prior to being destroyed.
-			 */
-			mutex_enter(&dn->dn_mtx);
-			if (dn->dn_type == DMU_OT_NONE ||
-			    dn->dn_free_txg != 0) {
-				/*
-				 * Drop dn_mtx.  It is a leaf lock and
-				 * cannot be held when dnode_evict_bonus()
-				 * acquires other locks in order to
-				 * perform the eviction.
-				 *
-				 * Freed dnodes cannot be reused until the
-				 * last hold is released.  Since this bonus
-				 * buffer has a hold, the dnode will remain
-				 * in the free state, even without dn_mtx
-				 * held, until the dnode_rele_and_unlock()
-				 * below.
-				 */
-				mutex_exit(&dn->dn_mtx);
+			if (evict_dbuf)
				dnode_evict_bonus(dn);
-				mutex_enter(&dn->dn_mtx);
-			}
-			dnode_rele_and_unlock(dn, db);
+
+			dnode_rele(dn, db);
		} else if (db->db_buf == NULL) {
			/*
			 * This is a special case: we never associated this
@@ -2496,7 +2473,7 @@ dbuf_rele_and_unlock(dmu_buf_impl_t *db, void *tag)
			} else {
				dbuf_clear(db);
			}
-		} else if (db->db_objset->os_evicting ||
+		} else if (db->db_pending_evict ||
		    arc_buf_eviction_needed(db->db_buf)) {
			dbuf_clear(db);
		} else {
@@ -2544,7 +2521,7 @@ dmu_buf_set_user_ie(dmu_buf_t *db_fake, dmu_buf_user_t *user)
 {
	dmu_buf_impl_t *db = (dmu_buf_impl_t *)db_fake;

-	db->db_immediate_evict = TRUE;
+	db->db_user_immediate_evict = TRUE;
	return (dmu_buf_set_user(db_fake, user));
 }

diff --git a/usr/src/uts/common/fs/zfs/dmu_objset.c b/usr/src/uts/common/fs/zfs/dmu_objset.c
index f84ff37..6846572 100644
--- a/usr/src/uts/common/fs/zfs/dmu_objset.c
+++ b/usr/src/uts/common/fs/zfs/dmu_objset.c
@@ -686,7 +686,6 @@ dmu_objset_evict(objset_t *os)
	if (os->os_sa)
		sa_tear_down(os);

-	os->os_evicting = B_TRUE;
	dmu_objset_evict_dbufs(os);

	mutex_enter(&os->os_lock);
diff --git a/usr/src/uts/common/fs/zfs/dnode_sync.c b/usr/src/uts/common/fs/zfs/dnode_sync.c
index 0787885..9aee513 100644
--- a/usr/src/uts/common/fs/zfs/dnode_sync.c
+++ b/usr/src/uts/common/fs/zfs/dnode_sync.c
@@ -424,6 +424,7 @@ dnode_evict_dbufs(dnode_t *dn)
			db_next = AVL_NEXT(&dn->dn_dbufs, &db_marker);
			avl_remove(&dn->dn_dbufs, &db_marker);
		} else {
+			db->db_pending_evict = TRUE;
			mutex_exit(&db->db_mtx);
			db_next = AVL_NEXT(&dn->dn_dbufs, db);
		}
@@ -437,10 +438,14 @@ void
 dnode_evict_bonus(dnode_t *dn)
 {
	rw_enter(&dn->dn_struct_rwlock, RW_WRITER);
-	if (dn->dn_bonus && refcount_is_zero(&dn->dn_bonus->db_holds)) {
-		mutex_enter(&dn->dn_bonus->db_mtx);
-		dbuf_evict(dn->dn_bonus);
-		dn->dn_bonus = NULL;
+	if (dn->dn_bonus != NULL) {
+		if (refcount_is_zero(&dn->dn_bonus->db_holds)) {
+			mutex_enter(&dn->dn_bonus->db_mtx);
+			dbuf_evict(dn->dn_bonus);
+			dn->dn_bonus = NULL;
+		} else {
+			dn->dn_bonus->db_pending_evict = TRUE;
+		}
	}
	rw_exit(&dn->dn_struct_rwlock);
 }
@@ -492,7 +497,6 @@ dnode_sync_free(dnode_t *dn, dmu_tx_t *tx)

	dnode_undirty_dbufs(&dn->dn_dirty_records[txgoff]);
	dnode_evict_dbufs(dn);
-	ASSERT(avl_is_empty(&dn->dn_dbufs));

	/*
	 * XXX - It would be nice to assert this, but we may still
diff --git a/usr/src/uts/common/fs/zfs/sys/dbuf.h b/usr/src/uts/common/fs/zfs/sys/dbuf.h
index 482ccb0..e2e7679 100644
--- a/usr/src/uts/common/fs/zfs/sys/dbuf.h
+++ b/usr/src/uts/common/fs/zfs/sys/dbuf.h
@@ -230,9 +230,25 @@ typedef struct dmu_buf_impl {
	/* User callback information. */
	dmu_buf_user_t *db_user;

-	uint8_t db_immediate_evict;
+	/*
+	 * Evict user data as soon as the dirty and reference
+	 * counts are equal.
+	 */
+	uint8_t db_user_immediate_evict;
+
+	/*
+	 * This block was freed while a read or write was
+	 * active.
+	 */
	uint8_t db_freed_in_flight;

+	/*
+	 * dnode_evict_dbufs() or dnode_evict_bonus() tried to
+	 * evict this dbuf, but couldn't due to outstanding
+	 * references.  Evict once the refcount drops to 0.
+	 */
+	uint8_t db_pending_evict;
+
	uint8_t db_dirtycnt;
 } dmu_buf_impl_t;

diff --git a/usr/src/uts/common/fs/zfs/sys/dmu_objset.h b/usr/src/uts/common/fs/zfs/sys/dmu_objset.h
index 9e98350..8a263a3 100644
--- a/usr/src/uts/common/fs/zfs/sys/dmu_objset.h
+++ b/usr/src/uts/common/fs/zfs/sys/dmu_objset.h
@@ -93,7 +93,6 @@ struct objset {
	uint8_t os_copies;
	enum zio_checksum os_dedup_checksum;
	boolean_t os_dedup_verify;
-	boolean_t os_evicting;
	zfs_logbias_op_t os_logbias;
	zfs_cache_type_t os_primary_cache;
	zfs_cache_type_t os_secondary_cache;
-- 
1.9.4