diff options
author | Alexei Starovoitov <ast@kernel.org> | 2025-02-21 18:44:23 -0800 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2025-02-27 09:32:27 -0800 |
commit | 97769a53f117e2f33864c587d85992ee35194ecf (patch) | |
tree | 257ef0eda79af4065d8a86845b948b7c853d8287 /mm/internal.h | |
parent | 0aaddfb06882504dded9cde57f91035ab9403b82 (diff) |
mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
Tracing BPF programs execute from tracepoints and kprobes where
running context is unknown, but they need to request additional
memory. The prior workarounds were using pre-allocated memory and
BPF specific freelists to satisfy such allocation requests.
Instead, introduce gfpflags_allow_spinning() condition that signals
to the allocator that running context is unknown.
Then rely on percpu free list of pages to allocate a page.
try_alloc_pages() -> get_page_from_freelist() -> rmqueue() ->
rmqueue_pcplist() will spin_trylock to grab the page from percpu
free list. If it fails (due to re-entrancy or list being empty)
then rmqueue_bulk()/rmqueue_buddy() will attempt to
spin_trylock zone->lock and grab the page from there.
spin_trylock() is not safe in PREEMPT_RT when in NMI or in hard IRQ.
Bailout early in such case.
The support for gfpflags_allow_spinning() mode for free_page and memcg
comes in the next patches.
This is a first step towards supporting BPF requirements in SLUB
and getting rid of bpf_mem_alloc.
That goal was discussed at LSFMM: https://lwn.net/Articles/974138/
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20250222024427.30294-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'mm/internal.h')
-rw-r--r-- | mm/internal.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/mm/internal.h b/mm/internal.h index 109ef30fee11..10a8b4b3b86e 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1187,6 +1187,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone, #define ALLOC_NOFRAGMENT 0x0 #endif #define ALLOC_HIGHATOMIC 0x200 /* Allows access to MIGRATE_HIGHATOMIC */ +#define ALLOC_TRYLOCK 0x400 /* Only use spin_trylock in allocation path */ #define ALLOC_KSWAPD 0x800 /* allow waking of kswapd, __GFP_KSWAPD_RECLAIM set */ /* Flags that allow allocations below the min watermark. */ |