author Matthew Sakai <msakai@redhat.com> 2024-02-06 22:00:42 -0500
committer Mike Snitzer <snitzer@kernel.org> 2024-02-20 13:43:18 -0500
commit ea9ca07affd80668b207703919eaba849654e11f
tree 354165437d43efc1363e9860f296790963970d76 /drivers
parent 512039b41b08177dce08f5cf324f2f57f9629639
dm vdo: add documentation details on zones and locking

Add details describing the vdo zone and thread model to the documentation
comments for major vdo components. Also added some high-level description
of the block map structure.

Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Diffstat (limited to 'drivers')
-rw-r--r-- drivers/md/dm-vdo/block-map.h        | 15
-rw-r--r-- drivers/md/dm-vdo/dedupe.c           |  5
-rw-r--r-- drivers/md/dm-vdo/recovery-journal.h |  4
-rw-r--r-- drivers/md/dm-vdo/slab-depot.h       | 16
4 files changed, 35 insertions, 5 deletions
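
The comments added in this commit describe two zone-assignment rules: each block map radix tree belongs to a logical zone so that the zone for a logical address is easy to compute, and hash_locks are assigned to hash_zones by a modulus on the hash. The following is a minimal sketch of what such computations could look like. It is not the driver's code: every function name, the ENTRIES_PER_PAGE value, the round-robin tree assignment, and the choice of hash byte are assumptions made for illustration. Only the tree count of 60 comes from the block-map.h comment.

```c
#include <stdint.h>

/* Illustrative constants; only TREE_COUNT (60) is taken from the block-map.h comment. */
enum {
	TREE_COUNT = 60,
	ENTRIES_PER_PAGE = 812, /* assumed number of mappings per leaf page */
};

/* Hypothetical: pick the radix tree holding the mapping for a logical block number. */
static unsigned int tree_for_lbn(uint64_t lbn)
{
	return (unsigned int)((lbn / ENTRIES_PER_PAGE) % TREE_COUNT);
}

/* Hypothetical: trees are assumed to be dealt out to logical zones round-robin. */
static unsigned int logical_zone_for_lbn(uint64_t lbn, unsigned int zone_count)
{
	return tree_for_lbn(lbn) % zone_count;
}

/* Hypothetical: select a hash zone with a modulus on the first byte of the hash. */
static unsigned int hash_zone_for_hash(const uint8_t *hash, unsigned int zone_count)
{
	return hash[0] % zone_count;
}
```

Because each result depends only on the address or the hash, any thread can compute the owning zone without taking a lock and then enqueue the work on that zone's single thread, which is the property the comments rely on to omit finer-grained locking.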
diff --git a/drivers/md/dm-vdo/block-map.h b/drivers/md/dm-vdo/block-map.h
index c574bd524bc2..b662c318c2ea 100644
--- a/drivers/md/dm-vdo/block-map.h
+++ b/drivers/md/dm-vdo/block-map.h
@@ -19,6 +19,21 @@
 #include "vio.h"
 #include "wait-queue.h"
+/*
+ * The block map is responsible for tracking all the logical to physical mappings of a VDO. It
+ * consists of a collection of 60 radix trees gradually allocated as logical addresses are used.
+ * Each tree is assigned to a logical zone such that it is easy to compute which zone must handle
+ * each logical address. Each logical zone also has a dedicated portion of the leaf page cache.
+ *
+ * Each logical zone has a single dedicated queue and thread for performing all updates to the
+ * radix trees assigned to that zone. The concurrency guarantees of this single-threaded model
+ * allow the code to omit more fine-grained locking for the block map structures.
+ *
+ * Load operations must be performed on the admin thread. Normal operations, such as reading and
+ * updating mappings, must be performed on the appropriate logical zone thread. Save operations
+ * must be launched from the same admin thread as the original load operation.
+ */
+
 enum {
 	BLOCK_MAP_VIO_POOL_SIZE = 64,
 };
diff --git a/drivers/md/dm-vdo/dedupe.c b/drivers/md/dm-vdo/dedupe.c
index 4b00135511dd..d81065a0951c 100644
--- a/drivers/md/dm-vdo/dedupe.c
+++ b/drivers/md/dm-vdo/dedupe.c
@@ -14,6 +14,11 @@
  * deduplicate against a single block instead of being serialized through a PBN read lock. Only one
  * index query is needed for each hash_lock, instead of one for every data_vio.
  *
+ * Hash_locks are assigned to hash_zones by computing a modulus on the hash itself. Each hash_zone
+ * has a single dedicated queue and thread for performing all operations on the hash_locks assigned
+ * to that zone. The concurrency guarantees of this single-threaded model allow the code to omit
+ * more fine-grained locking for the hash_lock structures.
+ *
  * A hash_lock acts like a state machine perhaps more than as a lock. Other than the starting and
  * ending states INITIALIZING and BYPASSING, every state represents and is held for the duration of
  * an asynchronous operation. All state transitions are performed on the thread of the hash_zone
diff --git a/drivers/md/dm-vdo/recovery-journal.h b/drivers/md/dm-vdo/recovery-journal.h
index 19fa7ed9648a..d78c6c7da4ea 100644
--- a/drivers/md/dm-vdo/recovery-journal.h
+++ b/drivers/md/dm-vdo/recovery-journal.h
@@ -26,6 +26,10 @@
  * write amplification of writes by providing amortization of slab journal and block map page
  * updates.
  *
+ * The recovery journal has a single dedicated queue and thread for performing all journal updates.
+ * The concurrency guarantees of this single-threaded model allow the code to omit more
+ * fine-grained locking for recovery journal structures.
+ *
  * The journal consists of a set of on-disk blocks arranged as a circular log with monotonically
  * increasing sequence numbers. Three sequence numbers serve to define the active extent of the
  * journal. The 'head' is the oldest active block in the journal. The 'tail' is the end of the
diff --git a/drivers/md/dm-vdo/slab-depot.h b/drivers/md/dm-vdo/slab-depot.h
index efdef566709a..fba293f9713e 100644
--- a/drivers/md/dm-vdo/slab-depot.h
+++ b/drivers/md/dm-vdo/slab-depot.h
@@ -29,11 +29,17 @@
  * a single array of slabs in order to eliminate the need for additional math in order to compute
  * which physical zone a PBN is in. It also has a block_allocator per zone.
  *
- * Load operations are required to be performed on a single thread. Normal operations are assumed
- * to be performed in the appropriate zone. Allocations and reference count updates must be done
- * from the thread of their physical zone. Requests to commit slab journal tail blocks from the
- * recovery journal must be done on the journal zone thread. Save operations are required to be
- * launched from the same thread as the original load operation.
+ * Each physical zone has a single dedicated queue and thread for performing all updates to the
+ * slabs assigned to that zone. The concurrency guarantees of this single-threaded model allow the
+ * code to omit more fine-grained locking for the various slab structures. Each physical zone
+ * maintains a separate copy of the slab summary to remove the need for explicit locking on that
+ * structure as well.
+ *
+ * Load operations must be performed on the admin thread. Normal operations, such as allocations
+ * and reference count updates, must be performed on the appropriate physical zone thread. Requests
+ * from the recovery journal to commit slab journal tail blocks must be scheduled from the recovery
+ * journal thread to run on the appropriate physical zone thread. Save operations must be launched
+ * from the same admin thread as the original load operation.
  */
 enum {