| author | Jakub Kicinski <kuba@kernel.org> | 2025-09-22 17:40:32 -0700 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2025-09-22 17:40:32 -0700 |
| commit | c5aaf0225a81f94ac64efb477d17b36dc692617d (patch) | |
| tree | 49f92e73d86d5b9dcb5fe3856c5f84901fca0424 /rust/helpers/xarray.c | |
| parent | dfff18082a6c2c52b07a6e0830298c6b3f48dfa9 (diff) | |
| parent | 27ce71e1ce81875df72f7698ba27988392bef602 (diff) | |
Merge branch 'net-replace-wq-users-and-add-wq_percpu-to-alloc_workqueue-users'
Marco Crivellari says:
====================
net: replace wq users and add WQ_PERCPU to alloc_workqueue() users
Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
Link: https://lore.kernel.org/20250221112003.1dSuoGyc@linutronix.de
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs; for !WQ_UNBOUND the local CPU is selected.
This leads to different scenarios when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work(), which will queue the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
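A minimal kernel-style sketch of the two cases (my_work, my_work_fn and
example() are hypothetical names used only for illustration):

    #include <linux/workqueue.h>

    static void my_work_fn(struct work_struct *work)
    {
            /* ... */
    }
    static DECLARE_DELAYED_WORK(my_work, my_work_fn);

    /* Assume this function runs on an isolated CPU: */
    static void example(void)
    {
            /* delay == 0: __queue_work() queues on the current (isolated) CPU */
            schedule_delayed_work(&my_work, 0);

            /* delay > 0: the timer is moved to a housekeeping CPU and the
             * work item is queued there
             */
            schedule_delayed_work(&my_work, 1);
    }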
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
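For reference, a simplified sketch of how these wrappers expand (based on
include/linux/workqueue.h before this series; details trimmed):

    static inline bool schedule_work(struct work_struct *work)
    {
            return queue_work(system_wq, work);     /* per-cpu wq */
    }

    static inline bool queue_work(struct workqueue_struct *wq,
                                  struct work_struct *work)
    {
            /* no CPU specified: WORK_CPU_UNBOUND */
            return queue_work_on(WORK_CPU_UNBOUND, wq, work);
    }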
=== Plan and next steps ===
This patchset is the first step of a refactoring needed in order to address
the points mentioned above; in the long term it will also have a positive
impact on cpu isolation, moving away from per-cpu workqueues in favor of an
unbound model.
These are the main steps:
1) API refactoring (that this patch is introducing)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound, to avoid any possible confusion about what should be used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
introduced in this patchset and used on all the callers that are not
currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they don't have locality
requirements; because of that, a future step will be to make "unbound"
the default behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag when it is not strictly required.
3) Add a new API (prefer local cpu)
- There are users that don't require local execution, as mentioned above;
despite that, local execution yields a performance gain.
This new API will prefer local execution, without requiring it.
=== Introduced Changes by this series ===
1) [P 1-2] Replace use of system_wq and system_unbound_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_wq has been renamed to system_percpu_wq, and
system_unbound_wq has been renamed to system_dfl_wq.
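A caller-side sketch of what the rename looks like (the queue_work() call
sites and my_work are hypothetical, chosen only for illustration):

    /* Before: the names say nothing about locality */
    queue_work(system_wq, &my_work);          /* actually per-cpu */
    queue_work(system_unbound_wq, &my_work);  /* unbound */

    /* After this series: intent is visible in the name */
    queue_work(system_percpu_wq, &my_work);   /* per-cpu, local execution */
    queue_work(system_dfl_wq, &my_work);      /* default, no locality */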
2) [P 3] add WQ_PERCPU to remaining alloc_workqueue() users
Every alloc_workqueue() caller should use one of WQ_PERCPU or WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
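A sketch of the conversion applied by this kind of patch ("my_wq" and the
WQ_MEM_RECLAIM combination are hypothetical examples):

    struct workqueue_struct *wq;

    /* Before: no explicit placement flag, per-cpu by default */
    wq = alloc_workqueue("my_wq", WQ_MEM_RECLAIM, 0);

    /* After: per-cpu behavior made explicit; callers that are already
     * unbound keep WQ_UNBOUND instead
     */
    wq = alloc_workqueue("my_wq", WQ_MEM_RECLAIM | WQ_PERCPU, 0);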
====================
Link: https://patch.msgid.link/20250918142427.309519-1-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
