summaryrefslogtreecommitdiff
path: root/tools/sched_ext/include/scx/compat.bpf.h
AgeCommit message (Collapse)Author
2025-02-27tools/sched_ext: Provide a compatible helper for scx_bpf_events()Andrea Righi
Introduce __COMPAT_scx_bpf_events() to use scx_bpf_events() in a compatible way also with kernels that don't provide this kfunc. This also fixes the following error with scx_qmap when running on a kernel that does not provide scx_bpf_events(): ; scx_bpf_events(&events, sizeof(events)); @ scx_qmap.bpf.c:777 318: (b7) r2 = 72 ; R2_w=72 async_cb 319: <invalid kfunc call> kfunc 'scx_bpf_events' is referenced but wasn't resolved Fixes: 9865f31d852a4 ("sched_ext: Add scx_bpf_events() and scx_read_event() for BPF schedulers") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-24sched_ext: idle: Introduce scx_bpf_nr_node_ids()Andrea Righi
Similarly to scx_bpf_nr_cpu_ids(), introduce a new kfunc scx_bpf_nr_node_ids() to expose the maximum number of NUMA nodes in the system. BPF schedulers can use this information together with the new node-aware kfuncs, for example to create per-node DSQs, validate node IDs, etc. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-18sched_ext: idle: Introduce node-aware idle cpu kfunc helpersAndrea Righi
Introduce a new kfunc to retrieve the node associated to a CPU: int scx_bpf_cpu_node(s32 cpu) Add the following kfuncs to provide BPF schedulers direct access to per-node idle cpumasks information: const struct cpumask *scx_bpf_get_idle_cpumask_node(int node) const struct cpumask *scx_bpf_get_idle_smtmask_node(int node) s32 scx_bpf_pick_idle_cpu_node(const cpumask_t *cpus_allowed, int node, u64 flags) s32 scx_bpf_pick_any_cpu_node(const cpumask_t *cpus_allowed, int node, u64 flags) Moreover, trigger an scx error when any of the non-node aware idle CPU kfuncs are used when SCX_OPS_BUILTIN_IDLE_PER_NODE is enabled. Cc: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-08tools/sched_ext: Compatible testing of SCX_ENQ_CPU_SELECTEDChangwoo Min
This provides compatible testing of SCX_ENQ_CPU_SELECTED. More specifically, it handles two cases: 1. a BPF scheduler is compiled against vmlinux.h where SCX_ENQ_CPU_SELECTED is defined, but it runs on a kernel that does not have SCX_ENQ_CPU_SELECTED. In this case, the test result of 'enq_flags & SCX_ENQ_CPU_SELECTED' will always be false. That test result is semantically incorrect because the kernel before SCX_ENQ_CPU_SELECTED has never skipped select_task_rq_scx(), so the result should be true. 2. a BPF scheduler is compiling against vmlinux.h where SCX_ENQ_CPU_SELECTED is not defined. In this case, directly using SCX_ENQ_CPU_SELECTED causes compilation errors. To hide such complexity, introduce __COMPAT_is_enq_cpu_selected(), which checks if SCX_ENQ_CPU_SELECTED exists in runtime using BPF CO-RE. This consists of three parts: 1. Add enum_defs.autogen.h, which has macros (HAVE_{enum name}) denoting whether SCX enums are defined in the vmlinux.h or not. 2. Implement __COMPAT_is_enq_cpu_selected(), which provide the test of SCX_ENQ_CPU_SELECTED in a compatible way. 3. Use __COMPAT_is_enq_cpu_selected() in scx_qmap. Note that this is a sync of the relevant PR [1] in the scx repo. [1] https://github.com/sched-ext/scx/pull/1314 Signed-off-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Add scx_bpf_now() for BPF schedulerChangwoo Min
scx_bpf_now() is added to the header files so the BPF scheduler can use it. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-11-11sched_ext: Rename scx_bpf_dispatch[_vtime]_from_dsq*() -> ↵Tejun Heo
scx_bpf_dsq_move[_vtime]*() In sched_ext API, a repeatedly reported pain point is the overuse of the verb "dispatch" and confusion around "consume": - ops.dispatch() - scx_bpf_dispatch[_vtime]() - scx_bpf_consume() - scx_bpf_dispatch[_vtime]_from_dsq*() This overloading of the term is historical. Originally, there were only built-in DSQs and moving a task into a DSQ always dispatched it for execution. Using the verb "dispatch" for the kfuncs to move tasks into these DSQs made sense. Later, user DSQs were added and scx_bpf_dispatch[_vtime]() updated to be able to insert tasks into any DSQ. The only allowed DSQ to DSQ transfer was from a non-local DSQ to a local DSQ and this operation was named "consume". This was already confusing as a task could be dispatched to a user DSQ from ops.enqueue() and then the DSQ would have to be consumed in ops.dispatch(). Later addition of scx_bpf_dispatch_from_dsq*() made the confusion even worse as "dispatch" in this context meant moving a task to an arbitrary DSQ from a user DSQ. Clean up the API with the following renames: 1. scx_bpf_dispatch[_vtime]() -> scx_bpf_dsq_insert[_vtime]() 2. scx_bpf_consume() -> scx_bpf_dsq_move_to_local() 3. scx_bpf_dispatch[_vtime]_from_dsq*() -> scx_bpf_dsq_move[_vtime]*() This patch performs the third set of renames. Compatibility is maintained by: - The previous kfunc names are still provided by the kernel so that old binaries can run. Kernel generates a warning when the old names are used. - compat.bpf.h provides wrappers for the new names which automatically fall back to the old names when running on older kernels. They also trigger build error if old names are used for new builds. - scx_bpf_dispatch[_vtime]_from_dsq*() were already wrapped in __COMPAT macros as they were introduced during v6.12 cycle. Wrap new API in __COMPAT macros too and trigger build errors on both __COMPAT prefixed and naked usages of the old names. The compat features will be dropped after v6.15. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Andrea Righi <arighi@nvidia.com> Acked-by: Changwoo Min <changwoo@igalia.com> Acked-by: Johannes Bechberger <me@mostlynerdless.de> Acked-by: Giovanni Gherdovich <ggherdovich@suse.com> Cc: Dan Schatzberg <dschatzberg@meta.com> Cc: Ming Yang <yougmark94@gmail.com>
2024-11-11sched_ext: Rename scx_bpf_consume() to scx_bpf_dsq_move_to_local()Tejun Heo
In sched_ext API, a repeatedly reported pain point is the overuse of the verb "dispatch" and confusion around "consume": - ops.dispatch() - scx_bpf_dispatch[_vtime]() - scx_bpf_consume() - scx_bpf_dispatch[_vtime]_from_dsq*() This overloading of the term is historical. Originally, there were only built-in DSQs and moving a task into a DSQ always dispatched it for execution. Using the verb "dispatch" for the kfuncs to move tasks into these DSQs made sense. Later, user DSQs were added and scx_bpf_dispatch[_vtime]() updated to be able to insert tasks into any DSQ. The only allowed DSQ to DSQ transfer was from a non-local DSQ to a local DSQ and this operation was named "consume". This was already confusing as a task could be dispatched to a user DSQ from ops.enqueue() and then the DSQ would have to be consumed in ops.dispatch(). Later addition of scx_bpf_dispatch_from_dsq*() made the confusion even worse as "dispatch" in this context meant moving a task to an arbitrary DSQ from a user DSQ. Clean up the API with the following renames: 1. scx_bpf_dispatch[_vtime]() -> scx_bpf_dsq_insert[_vtime]() 2. scx_bpf_consume() -> scx_bpf_dsq_move_to_local() 3. scx_bpf_dispatch[_vtime]_from_dsq*() -> scx_bpf_dsq_move[_vtime]*() This patch performs the second rename. Compatibility is maintained by: - The previous kfunc names are still provided by the kernel so that old binaries can run. Kernel generates a warning when the old names are used. - compat.bpf.h provides wrappers for the new names which automatically fall back to the old names when running on older kernels. They also trigger build error if old names are used for new builds. The compat features will be dropped after v6.15. v2: Comment and documentation updates. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Andrea Righi <arighi@nvidia.com> Acked-by: Changwoo Min <changwoo@igalia.com> Acked-by: Johannes Bechberger <me@mostlynerdless.de> Acked-by: Giovanni Gherdovich <ggherdovich@suse.com> Cc: Dan Schatzberg <dschatzberg@meta.com> Cc: Ming Yang <yougmark94@gmail.com>
2024-11-11sched_ext: Rename scx_bpf_dispatch[_vtime]() to scx_bpf_dsq_insert[_vtime]()Tejun Heo
In sched_ext API, a repeatedly reported pain point is the overuse of the verb "dispatch" and confusion around "consume": - ops.dispatch() - scx_bpf_dispatch[_vtime]() - scx_bpf_consume() - scx_bpf_dispatch[_vtime]_from_dsq*() This overloading of the term is historical. Originally, there were only built-in DSQs and moving a task into a DSQ always dispatched it for execution. Using the verb "dispatch" for the kfuncs to move tasks into these DSQs made sense. Later, user DSQs were added and scx_bpf_dispatch[_vtime]() updated to be able to insert tasks into any DSQ. The only allowed DSQ to DSQ transfer was from a non-local DSQ to a local DSQ and this operation was named "consume". This was already confusing as a task could be dispatched to a user DSQ from ops.enqueue() and then the DSQ would have to be consumed in ops.dispatch(). Later addition of scx_bpf_dispatch_from_dsq*() made the confusion even worse as "dispatch" in this context meant moving a task to an arbitrary DSQ from a user DSQ. Clean up the API with the following renames: 1. scx_bpf_dispatch[_vtime]() -> scx_bpf_dsq_insert[_vtime]() 2. scx_bpf_consume() -> scx_bpf_dsq_move_to_local() 3. scx_bpf_dispatch[_vtime]_from_dsq*() -> scx_bpf_dsq_move[_vtime]*() This patch performs the first set of renames. Compatibility is maintained by: - The previous kfunc names are still provided by the kernel so that old binaries can run. Kernel generates a warning when the old names are used. - compat.bpf.h provides wrappers for the new names which automatically fall back to the old names when running on older kernels. They also trigger build error if old names are used for new builds. The compat features will be dropped after v6.15. v2: Documentation updates. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Andrea Righi <arighi@nvidia.com> Acked-by: Changwoo Min <changwoo@igalia.com> Acked-by: Johannes Bechberger <me@mostlynerdless.de> Acked-by: Giovanni Gherdovich <ggherdovich@suse.com> Cc: Dan Schatzberg <dschatzberg@meta.com> Cc: Ming Yang <yougmark94@gmail.com>
2024-09-25sched_ext: Add __COMPAT helpers for features added during v6.12 devel cycleTejun Heo
cgroup support and scx_bpf_dispatch[_vtime]_from_dsq() are newly added since 8bb30798fd6e ("sched_ext: Fixes incorrect type in bpf_scx_init()") which is the current earliest commit targeted by BPF schedulers. Add compat helpers for them and apply them in the example schedulers. These will be dropped after a few kernel releases. The exact backward compatibility window hasn't been decided yet. Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-18sched_ext: Add scx_simple and scx_example_qmap example schedulersTejun Heo
Add two simple example BPF schedulers - simple and qmap. * simple: In terms of scheduling, it behaves identical to not having any operation implemented at all. The two operations it implements are only to improve visibility and exit handling. On certain homogeneous configurations, this actually can perform pretty well. * qmap: A fixed five level priority scheduler to demonstrate queueing PIDs on BPF maps for scheduling. While not very practical, this is useful as a simple example and will be used to demonstrate different features. v7: - Compat helpers stripped out in prepartion of upstreaming as the upstreamed patchset will be the baselinfe. Utility macros that can be used to implement compat features are kept. - Explicitly disable map autoattach on struct_ops to avoid trying to attach twice while maintaining compatbility with older libbpf. v6: - Common header files reorganized and cleaned up. Compat helpers are added to demonstrate how schedulers can maintain backward compatibility with older kernels while making use of newly added features. - simple_select_cpu() added to keep track of the number of local dispatches. This is needed because the default ops.select_cpu() implementation is updated to dispatch directly and won't call ops.enqueue(). - Updated to reflect the sched_ext API changes. Switching all tasks is the default behavior now and scx_qmap supports partial switching when `-p` is specified. - tools/sched_ext/Kconfig dropped. This will be included in the doc instead. v5: - Improve Makefile. Build artifects are now collected into a separate dir which change be changed. Install and help targets are added and clean actually cleans everything. - MEMBER_VPTR() improved to improve access to structs. ARRAY_ELEM_PTR() and RESIZEABLE_ARRAY() are added to support resizable arrays in .bss. - Add scx_common.h which provides common utilities to user code such as SCX_BUG[_ON]() and RESIZE_ARRAY(). - Use SCX_BUG[_ON]() to simplify error handling. v4: - Dropped _example prefix from scheduler names. v3: - Rename scx_example_dummy to scx_example_simple and restructure a bit to ease later additions. Comment updates. - Added declarations for BPF inline iterators. In the future, hopefully, these will be consolidated into a generic BPF header so that they don't need to be replicated here. v2: - Updated with the generic BPF cpumask helpers. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: David Vernet <dvernet@meta.com> Acked-by: Josh Don <joshdon@google.com> Acked-by: Hao Luo <haoluo@google.com> Acked-by: Barret Rhoden <brho@google.com>