From ac23d1a964600bb9c14b5048bdf4f18ae13226f4 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)"
Date: Mon, 13 Dec 2021 23:03:54 -0500
Subject: XArray: Document the locking requirement for the xa_state

It wasn't obvious to all readers that it's unsafe to reuse an xa_state
after dropping the xas_lock() or the rcu_read_lock().

Reported-by: Charan Teja Kalla
Signed-off-by: Matthew Wilcox (Oracle)
---
 Documentation/core-api/xarray.rst | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst
index a137a0e6d068..77e0ece2b1d6 100644
--- a/Documentation/core-api/xarray.rst
+++ b/Documentation/core-api/xarray.rst
@@ -315,11 +315,15 @@ indeed the normal API is implemented in terms of the advanced API. The
 advanced API is only available to modules with a GPL-compatible license.
 
 The advanced API is based around the xa_state. This is an opaque data
-structure which you declare on the stack using the XA_STATE()
-macro. This macro initialises the xa_state ready to start walking
-around the XArray. It is used as a cursor to maintain the position
-in the XArray and let you compose various operations together without
-having to restart from the top every time.
+structure which you declare on the stack using the XA_STATE() macro.
+This macro initialises the xa_state ready to start walking around the
+XArray. It is used as a cursor to maintain the position in the XArray
+and let you compose various operations together without having to restart
+from the top every time. The contents of the xa_state are protected by
+the rcu_read_lock() or the xas_lock(). If you need to drop whichever of
+those locks is protecting your state and tree, you must call xas_pause()
+so that future calls do not rely on the parts of the state which were
+left unprotected.
 
 The xa_state is also used to store errors. You can call
 xas_error() to retrieve the error. All operations check whether
-- 
cgit

From 22f56b8e890d4e2835951b437bb6eeebfd1cb18b Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)"
Date: Thu, 3 Feb 2022 16:01:39 -0500
Subject: XArray: Include bitmap.h from xarray.h

xas_find_chunk() calls find_next_bit(), which is defined in find.h,
included from bitmap.h. Inside the kernel, this isn't a problem because
bitmap.h is included from cpumask.h which is dragged in (eventually) by
gfp.h. When building the test-suite, that doesn't happen, so we need to
include bitmap.h explicitly.

Fixes: 4ade0818cf04 ("tools: sync tools/bitmap with mother linux")
Reported-by: Liam Howlett
Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/xarray.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index d6d5da6ed735..66e28bc1a023 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -9,6 +9,7 @@
  * See Documentation/core-api/xarray.rst for how to use the XArray.
  */
 
+#include <linux/bitmap.h>
 #include <linux/bug.h>
 #include <linux/compiler.h>
 #include <linux/gfp.h>
-- 
cgit

From 3e3c658055c002900982513e289398a1aad4a488 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)"
Date: Mon, 28 Mar 2022 19:25:11 -0400
Subject: XArray: Fix xas_create_range() when multi-order entry present

If there is already an entry present that is of order >= XA_CHUNK_SHIFT
when we call xas_create_range(), xas_create_range() will misinterpret
that entry as a node and dereference xa_node->parent, generally leading
to a crash that looks something like this:

general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 32 Comm: khugepaged Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13 #0
RIP: 0010:xa_parent_locked include/linux/xarray.h:1207 [inline]
RIP: 0010:xas_create_range+0x2d9/0x6e0 lib/xarray.c:725

It's deterministically reproducible once you know what the problem is,
but producing it in a live kernel requires khugepaged to hit a race.
While the problem has been present since xas_create_range() was
introduced, I'm not aware of a way to hit it before the page cache was
converted to use multi-index entries.

Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache")
Reported-by: syzbot+0d2b0bf32ca5cfd09f2e@syzkaller.appspotmail.com
Signed-off-by: Matthew Wilcox (Oracle)
---
 lib/test_xarray.c | 22 ++++++++++++++++++++++
 lib/xarray.c      |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index 8b1c318189ce..e77d4856442c 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -1463,6 +1463,25 @@ unlock:
 	XA_BUG_ON(xa, !xa_empty(xa));
 }
 
+static noinline void check_create_range_5(struct xarray *xa,
+		unsigned long index, unsigned int order)
+{
+	XA_STATE_ORDER(xas, xa, index, order);
+	unsigned int i;
+
+	xa_store_order(xa, index, order, xa_mk_index(index), GFP_KERNEL);
+
+	for (i = 0; i < order + 10; i++) {
+		do {
+			xas_lock(&xas);
+			xas_create_range(&xas);
+			xas_unlock(&xas);
+		} while (xas_nomem(&xas, GFP_KERNEL));
+	}
+
+	xa_destroy(xa);
+}
+
 static noinline void check_create_range(struct xarray *xa)
 {
 	unsigned int order;
@@ -1490,6 +1509,9 @@ static noinline void check_create_range(struct xarray *xa)
 		check_create_range_4(xa, (3U << order) + 1, order);
 		check_create_range_4(xa, (3U << order) - 1, order);
 		check_create_range_4(xa, (1U << 24) + 1, order);
+
+		check_create_range_5(xa, 0, order);
+		check_create_range_5(xa, (1U << order), order);
 	}
 
 	check_create_range_3();
diff --git a/lib/xarray.c b/lib/xarray.c
index 6f47f6375808..757644617b9b 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -722,6 +722,8 @@ void xas_create_range(struct xa_state *xas)
 
 		for (;;) {
 			struct xa_node *node = xas->xa_node;
+			if (node->shift >= shift)
+				break;
 			xas->xa_node = xa_parent_locked(xas->xa, node);
 			xas->xa_offset = node->offset - 1;
 			if (node->offset != 0)
-- 
cgit

From 3ed4bb77156da0bc732847c8c9df92454c1fbeea Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)"
Date: Thu, 31 Mar 2022 08:27:09 -0400
Subject: XArray: Update the LRU list in xas_split()

When splitting a value entry, we may need to add the new nodes to the LRU
list and remove the parent node from the LRU list. The WARN_ON checks
in shadow_lru_isolate() catch this oversight. This bug was latent
until we stopped splitting folios in shrink_page_list() with commit
820c4e2e6f51 ("mm/vmscan: Free non-shmem folios without splitting them").
That allows the creation of large shadow entries, and subsequently when
trying to page in a small page, we will split the large shadow entry
in __filemap_add_folio().

Fixes: 8fc75643c5e1 ("XArray: add xas_split")
Reported-by: Hugh Dickins
Signed-off-by: Matthew Wilcox (Oracle)
---
 lib/xarray.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/xarray.c b/lib/xarray.c
index 757644617b9b..88ca87435e3d 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1081,6 +1081,7 @@ void xas_split(struct xa_state *xas, void *entry, unsigned int order)
 					xa_mk_node(child));
 			if (xa_is_value(curr))
 				values--;
+			xas_update(xas, child);
 		} else {
 			unsigned int canon = offset - xas->xa_sibs;
 
@@ -1095,6 +1096,7 @@ void xas_split(struct xa_state *xas, void *entry, unsigned int order)
 	} while (offset-- > xas->xa_offset);
 
 	node->nr_values += values;
+	xas_update(xas, node);
 }
 EXPORT_SYMBOL_GPL(xas_split);
 #endif
-- 
cgit
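
Note on the first patch in this series: the rule it documents (call
xas_pause() before dropping the lock that protects the xa_state) may be
easier to see in a small userspace model. What follows is only a sketch
under stated assumptions. struct toy_node, struct toy_cursor, toy_load(),
toy_pause() and toy_demo() are hypothetical names invented for this
illustration, not kernel API, and the model does not attempt to reproduce
how the kernel implements xas_pause(). It only shows why a cursor that
caches a node pointer has to forget that pointer, and keep just its
index, before the protecting lock is released.

```c
#include <stddef.h>

#define TOY_SLOTS 4

/* Hypothetical stand-in for a tree node; not the kernel's struct xa_node. */
struct toy_node {
	int slots[TOY_SLOTS];
};

/* Hypothetical cursor: an index plus a cached pointer into the "tree". */
struct toy_cursor {
	struct toy_node **root;		/* where the current top node lives */
	size_t index;			/* position; safe to keep across unlock */
	struct toy_node *cached;	/* only trustworthy while "locked" */
	unsigned int walks;		/* how many times we walked from the top */
};

/* Load the entry at the cursor, walking from the top only if needed. */
static int toy_load(struct toy_cursor *c)
{
	if (!c->cached) {
		c->cached = *c->root;
		c->walks++;
	}
	return c->cached->slots[c->index];
}

/* Analogue of pausing: drop the cached pointer, keep the index. */
static void toy_pause(struct toy_cursor *c)
{
	c->cached = NULL;
}

/*
 * Scenario: load, pause, "drop the lock" while another writer replaces
 * the node, then load again.  Returns the second value loaded, or -1 if
 * the first phase did not behave as this model expects.
 */
int toy_demo(void)
{
	struct toy_node a = { { 10, 11, 12, 13 } };
	struct toy_node b = { { 20, 21, 22, 23 } };
	struct toy_node *root = &a;
	struct toy_cursor c = { .root = &root, .index = 2 };
	int first, second;

	first = toy_load(&c);	/* walks from the top and caches &a */
	toy_pause(&c);		/* about to drop the lock: forget &a */
	root = &b;		/* lock dropped: the node is replaced */
	second = toy_load(&c);	/* lock retaken: walks from the top again */

	return (first == 12 && c.walks == 2) ? second : -1;
}
```

Without the toy_pause() call, the second toy_load() would still read
through the stale pointer to the old node. In this model that merely
returns old data; in the kernel the node may already have been freed.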