Add initial settings to all core resources, such as
the RX, AGG, TX, CQ, and NQ rings, as well as the VNIC.
This will help enable these resources in future patches.
Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20250919174742.24969-6-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the VNIC-specific structures and DMA memory necessary to support
UC/MC and RSS functionality.
Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20250919174742.24969-5-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allocate CP and NQ related data structures and add support to
associate NQ and CQ rings. Also, add the association of NQ, NAPI,
and interrupts.
Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20250919174742.24969-4-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allocate data structures to support the RX, AGG, and TX rings.
While allocating the data structures for the RX/AGG rings,
initialise the page pool accordingly.
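For illustration, creating a page pool for an RX ring generally looks
like the sketch below; the parameter values are illustrative, not the
driver's actual configuration:
```
#include <net/page_pool/helpers.h>

/* Hedged sketch: create a page pool sized for one RX ring. */
static struct page_pool *example_rx_page_pool(struct device *dev,
					      unsigned int ring_size)
{
	struct page_pool_params pp = {
		.pool_size	= ring_size,		/* illustrative sizing */
		.nid		= NUMA_NO_NODE,
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,	/* device writes RX data */
	};

	/* Returns ERR_PTR() on failure. */
	return page_pool_create(&pp);
}
```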
Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20250919174742.24969-3-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ensure bnge_alloc_ring() frees any intermediate allocations
when it fails. This enables later patches to rely on this
self-unwinding behavior.
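The self-unwinding shape, in generic form (a hedged sketch using generic
kernel allocators, not the actual bnge code):
```
#include <linux/slab.h>

/* On any failure, free everything allocated so far before returning. */
static int example_alloc_ring(void **mem, int npages, size_t page_sz)
{
	int i;

	for (i = 0; i < npages; i++) {
		mem[i] = kzalloc(page_sz, GFP_KERNEL);
		if (!mem[i])
			goto unwind;
	}
	return 0;

unwind:
	while (--i >= 0) {
		kfree(mem[i]);
		mem[i] = NULL;
	}
	return -ENOMEM;
}
```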
Signed-off-by: Bhargava Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com>
Link: https://patch.msgid.link/20250919174742.24969-2-bhargava.marreddy@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Marco Crivellari says:
====================
net: replace wq users and add WQ_PERCPU to alloc_workqueue() users
Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
Link: https://lore.kernel.org/20250221112003.1dSuoGyc@linutronix.de
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs; for !WQ_UNBOUND the local CPU is selected.
This leads to different scenarios if a work item is scheduled on an isolated
CPU with a "delay" value of 0 or greater than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work(), which will queue the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
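A compilable sketch of the two cases (my_work/my_fn are hypothetical
names):
```
#include <linux/workqueue.h>

static void my_fn(struct work_struct *work) { /* ... */ }
static DECLARE_DELAYED_WORK(my_work, my_fn);

static void example(void)
{
	/* delay == 0: __queue_work() queues the item on the current
	 * (possibly isolated) CPU. */
	schedule_delayed_work(&my_work, 0);

	/* delay > 0: the timer is placed on a housekeeping CPU and
	 * the work runs there. */
	schedule_delayed_work(&my_work, 1);
}
```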
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Plan and future plans ===
This patchset is the first stone of a refactoring needed in order to
address the aforementioned points; in the long term it will also have a
positive impact on CPU isolation, moving away from per-CPU workqueues in
favor of an unbound model.
These are the main steps:
1) API refactoring (that this patch is introducing)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound. This avoids any possible confusion about what should be
used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
introduced in this patchset and used by all the callers that are not
currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they don't have
locality requirements; because of that, a future step will be to
make "unbound" the default behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag when it is not strictly required.
3) Add a new API (prefer local cpu)
- There are users that don't require local execution, as mentioned
above; despite that, local execution yields a performance gain.
This new API will prefer local execution, without requiring it.
=== Introduced Changes by this series ===
1) [P 1-2] Replace use of system_wq and system_unbound_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_wq has been renamed to system_percpu_wq, and
system_unbound_wq has been renamed to system_dfl_wq.
2) [P 3] add WQ_PERCPU to remaining alloc_workqueue() users
Every alloc_workqueue() caller should use one of WQ_PERCPU or
WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
====================
Link: https://patch.msgid.link/20250918142427.309519-1-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag in the network subsystem to explicitly
request the use of the per-CPU behavior. Both flags coexist for one release
cycle to allow callers to transition their calls.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
All existing users have been updated accordingly.
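For a caller that previously relied on the implicit per-CPU default, the
change looks like this (illustrative names):
```
#include <linux/workqueue.h>

static struct workqueue_struct *example_wq;

static int example_init(void)
{
	/* Previously: alloc_workqueue("example_wq", 0, 0);
	 * per-CPU behavior was the unstated default.
	 * Now the per-CPU choice is explicit: */
	example_wq = alloc_workqueue("example_wq", WQ_PERCPU, 0);
	return example_wq ? 0 : -ENOMEM;
}
```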
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-4-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Add system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
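Queueing work with no locality requirement then reads (illustrative
names):
```
#include <linux/workqueue.h>

static void my_fn(struct work_struct *work) { /* ... */ }
static DECLARE_WORK(my_work, my_fn);

static void example(void)
{
	/* No locality requirement: use the default (unbound) wq. */
	queue_work(system_dfl_wq, &my_work);
}
```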
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-3-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again
makes use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
system_unbound_wq should be the default workqueue so as not to enforce
locality constraints for random work whenever it's not required.
Add system_dfl_wq to encourage its use when unbound work should be used.
The old system_unbound_wq will be kept for a few release cycles.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Link: https://patch.msgid.link/20250918142427.309519-2-marco.crivellari@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2025-09-22
1) Fix 0 assignment for SPIs. 0 is not a valid SPI,
it means no SPI assigned.
2) Fix offloading for inter address family tunnels.
* tag 'ipsec-2025-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: fix offloading of cross-family tunnels
xfrm: xfrm_alloc_spi shouldn't use 0 as SPI
====================
Link: https://patch.msgid.link/20250922073512.62703-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-09-19 (ice, idpf, iavf, ixgbevf, fm10k)
Paul adds support for Earliest TxTime First (ETF) hardware offload
for E830 devices on ice. ETF is configured per-queue using tc-etf Qdisc;
a new Tx flow mechanism utilizes a dedicated timestamp ring alongside
the standard Tx ring. The timestamp ring contains descriptors that
specify when hardware should transmit packets; up to 2048 Tx queues can
be supported.
Additional info: https://lore.kernel.org/intel-wired-lan/20250818132257.21720-1-paul.greenwalt@intel.com/
Dave removes an excess cleanup call to ice_lag_move_new_vf_nodes() in the
error path.
Milena adds reporting of timestamping statistics to idpf.
Alex changes the error variable type for code clarity in iavf and ixgbevf.
Brahmajit Das removes unused parameter from fm10k_unbind_hw_stats_q().
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
net: intel: fm10k: Fix parameter idx set but not used
ixgbevf: fix proper type for error code in ixgbevf_resume()
iavf: fix proper type for error code in iavf_resume()
idpf: add HW timestamping statistics
ice: Remove deprecated ice_lag_move_new_vf_nodes() call
ice: add E830 Earliest TxTime First Offload support
ice: move ice_qp_[ena|dis] for reuse
====================
Link: https://patch.msgid.link/20250919175412.653707-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
i40e: virtchnl improvements
Przemek Kitszel says:
Improvements hardening PF-VF communication for i40e driver.
This patchset targets several issues that can cause undefined behavior
or be exploited in some other way.
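The common shape of such hardening is to bounds-check every index
received from the (untrusted) VF before using it; a generic sketch, not
the actual i40e code, with hypothetical names:
```
#include <linux/errno.h>

/* Reject out-of-range queue ids taken from a VF message. */
static int example_validate_vf_qid(unsigned int qid,
				   unsigned int num_queue_pairs)
{
	if (qid >= num_queue_pairs)
		return -EINVAL;
	return 0;
}
```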
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: improve VF MAC filters accounting
i40e: add mask to apply valid bits for itr_idx
i40e: add max boundary check for VF filters
i40e: fix validation of VF state in get resources
i40e: fix input validation logic for action_meta
i40e: fix idx validation in config queues msg
i40e: fix idx validation in i40e_validate_queue_map
i40e: add validation for ring_len param
====================
Link: https://patch.msgid.link/20250919184959.656681-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the PHY_ID_MATCH_MODEL() macro instead of hardcoding the values in
asix_driver[] and asix_tbl[].
In asix_tbl[], the macro also uses designated initializers instead of
positional initializers, which allows the struct fields to be reordered.
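For reference, PHY_ID_MATCH_MODEL() expands to designated
.phy_id/.phy_id_mask initializers, so a match-table entry looks roughly
like this (the ID value is illustrative):
```
#include <linux/phy.h>

static const struct mdio_device_id example_tbl[] = {
	/* Match any revision of one PHY model. */
	{ PHY_ID_MATCH_MODEL(0x003b1861) },
	{ /* sentinel */ }
};
```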
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20250919103944.854845-2-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add missing "Return:" sections to kernel-doc comments for four functions:
- axienet_calc_cr()
- axienet_device_reset()
- axienet_free_tx_chain()
- axienet_dim_coalesce_count_rx()
Also standardize the return documentation format by replacing inline
"Returns" text with proper "Return:" tags as per kernel documentation
guidelines.
Fixes below kernel-doc warnings:
- Warning: No description found for return value of 'axienet_calc_cr'
- Warning: No description found for return value of 'axienet_device_reset'
- Warning: No description found for return value of 'axienet_free_tx_chain'
- Warning: No description found for return value of
'axienet_dim_coalesce_count_rx'
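The kernel-doc shape being applied is the standard one, roughly (wording
illustrative):
```
/**
 * axienet_dim_coalesce_count_rx - Compute the RX coalesce count from
 * DIM state
 * @lp: driver private data
 *
 * Return: the RX coalesce count, in frames.
 */
```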
Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
Link: https://patch.msgid.link/20250919103754.434711-1-suraj.gupta2@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'net-dsa-microchip-add-strap-description-to-set-spi-as-interface-bus'
Bastien Curutchet says:
====================
net: dsa: microchip: Add strap description to set SPI as interface bus
At reset, the KSZ8463 uses a strap-based configuration to set SPI as
interface bus. If the required pull-ups/pull-downs are missing
(by mistake or by design to save power) the pins may float and the
configuration can go wrong, preventing any communication with the switch.
This small series aims to allow configuring the KSZ8463 switch at
reset when the hardware straps are missing.
PATCH 0 and 1 add a new property to the bindings that describes the GPIOs
to be set during reset in order to configure the switch properly.
PATCH 2 implements the use of these properties in the driver.
====================
Link: https://patch.msgid.link/20250918-ksz-strap-pins-v3-0-16662e881728@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
At reset, the KSZ8463 uses a strap-based configuration to set SPI as
bus interface. SPI is the only bus supported by the driver. If the
required pull-ups/pull-downs are missing (by mistake or by design to
save power) the pins may float and the configuration can go wrong,
preventing any communication with the switch.
Introduce a ksz8463_configure_straps_spi() function called during the
device reset. It relies on the 'straps-rxd-gpios' OF property and the
'reset' pinmux configuration to enforce SPI as bus interface.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20250918-ksz-strap-pins-v3-3-16662e881728@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
At reset, KSZ8463 uses a strap-based configuration to set SPI as
interface bus. If the required pull-ups/pull-downs are missing (by
mistake or by design to save power) the pins may float and the
configuration can go wrong, preventing any communication with the switch.
Add a 'reset' pinmux state.
Add a KSZ8463-specific strap description that can be used by the driver
to drive the strap pins during reset. Two GPIOs are used. Users must
describe either both of them or none of them.
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20250918-ksz-strap-pins-v3-2-16662e881728@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
An upcoming patch adds a new if/then clause. It needs to be grouped with
the already existing if/then clause under an 'allOf:' tag.
Move the if/then clause under the already existing 'allOf:' tag to
prepare for the next patch.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20250918-ksz-strap-pins-v3-1-16662e881728@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:
====================
mlx5-next updates 2025-09-21
* tag 'mlx5-next-counters' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Add uar access and odp page fault counters
====================
Link: https://patch.msgid.link/1758443940-708689-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Russell King says:
====================
net: rework SFP capability parsing and quirks
The original SFP module parsing was implemented prior to gaining any
quirks, and was designed such that the upstream calls the parsing
functions to get the translated capabilities of the module.
SFP quirks were then added to cope with modules that didn't correctly
fill out their ID EEPROM. The quirk function was called from
sfp_parse_support() to allow quirks to modify the ethtool link mode
masks.
Using just ethtool link mode masks eventually led to difficulties
determining the correct phy_interface_t mode, so a bitmap of these
modes was added, requiring both the upstream API and quirks to be
updated.
We have had significantly more SFP module quirks added since, some
of which modify the ID EEPROM as a way of influencing the data
we provide to the upstream - for example, sfp_fixup_10gbaset_30m()
changes id.base.connector so we report PORT_TP. This could be done
more cleanly if the quirks had access to the parsed SFP port.
In order to improve flexibility, and to simplify some of the upstream
code, we group all module capabilities into a single structure that
the upstream can access via sfp_get_module_caps(). This will allow
the module capabilities to be expanded if required without reworking
all the infrastructure and upstreams again.
In this series, we rework the SFP code to use the capability structure
and then rework all the upstream implementations, finally removing the
old kernel internal APIs.
====================
Link: https://patch.msgid.link/aMnaoPjIuzEAsESZ@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the old sfp_parse_*() functions that are now no longer used.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVz-000000061Wj-13Yd@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update all PHYs to use sfp_get_module_caps() rather than the
sfp_parse_*() family of functions.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVu-000000061Wd-0cAG@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use sfp_get_module_caps() to get SFP module's capabilities.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVp-000000061WW-08YM@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Provide a function to retrieve the current sfp_module_caps structure
so that upstreams can get the entire module support in one go.
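An upstream would then do something along these lines; a hedged sketch,
where the member names (link_modes, interfaces) and the
phy_interface_copy() helper added elsewhere in this series are
assumptions:
```
static void example_upstream_attach(struct sfp_bus *bus,
				    unsigned long *support,
				    unsigned long *interfaces)
{
	/* Fetch all parsed module capabilities in one call. */
	const struct sfp_module_caps *caps = sfp_get_module_caps(bus);

	linkmode_copy(support, caps->link_modes);
	phy_interface_copy(interfaces, caps->interfaces);
}
```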
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVj-000000061WQ-3q47@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to provide extensible module support properties, arrange for
the SFP quirks to modify any member of the sfp_module_support struct,
rather than just the ethtool link modes and interfaces.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVe-000000061WK-3KwI@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pre-parse the module support on insert rather than when the upstream
requests the data. This will allow more flexible and extensible
parsing.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVZ-000000061WE-2pXD@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a helper for copying PHY interface bitmasks. This will be used by
the SFP bus code, which will then be moved to phylink in the subsequent
patches.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1uydVU-000000061W8-2IDT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The is_prs_invalid helper function is redundant as it serves a similar
purpose to is_partition_invalid. It can be fully replaced by the existing
is_partition_invalid function, so this patch removes the is_prs_invalid
helper.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
If the parent is not a valid partition, an error will be returned before
any partition update command is processed. This means the
WARN_ON_ONCE(!is_partition_valid(parent)) can never be triggered, so
it is safe to remove.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The nodelist_parse function already handles empty nodemask input
appropriately, making it unnecessary to handle this case separately
during the node mask update process.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The hfsplus_strcasecmp() logic can trigger the issue:
[ 117.317703][ T9855] ==================================================================
[ 117.318353][ T9855] BUG: KASAN: slab-out-of-bounds in hfsplus_strcasecmp+0x1bc/0x490
[ 117.318991][ T9855] Read of size 2 at addr ffff88802160f40c by task repro/9855
[ 117.319577][ T9855]
[ 117.319773][ T9855] CPU: 0 UID: 0 PID: 9855 Comm: repro Not tainted 6.17.0-rc6 #33 PREEMPT(full)
[ 117.319780][ T9855] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 117.319783][ T9855] Call Trace:
[ 117.319785][ T9855] <TASK>
[ 117.319788][ T9855] dump_stack_lvl+0x1c1/0x2a0
[ 117.319795][ T9855] ? __virt_addr_valid+0x1c8/0x5c0
[ 117.319803][ T9855] ? __pfx_dump_stack_lvl+0x10/0x10
[ 117.319808][ T9855] ? rcu_is_watching+0x15/0xb0
[ 117.319816][ T9855] ? lock_release+0x4b/0x3e0
[ 117.319821][ T9855] ? __kasan_check_byte+0x12/0x40
[ 117.319828][ T9855] ? __virt_addr_valid+0x1c8/0x5c0
[ 117.319835][ T9855] ? __virt_addr_valid+0x4a5/0x5c0
[ 117.319842][ T9855] print_report+0x17e/0x7e0
[ 117.319848][ T9855] ? __virt_addr_valid+0x1c8/0x5c0
[ 117.319855][ T9855] ? __virt_addr_valid+0x4a5/0x5c0
[ 117.319862][ T9855] ? __phys_addr+0xd3/0x180
[ 117.319869][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490
[ 117.319876][ T9855] kasan_report+0x147/0x180
[ 117.319882][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490
[ 117.319891][ T9855] hfsplus_strcasecmp+0x1bc/0x490
[ 117.319900][ T9855] ? __pfx_hfsplus_cat_case_cmp_key+0x10/0x10
[ 117.319906][ T9855] hfs_find_rec_by_key+0xa9/0x1e0
[ 117.319913][ T9855] __hfsplus_brec_find+0x18e/0x470
[ 117.319920][ T9855] ? __pfx_hfsplus_bnode_find+0x10/0x10
[ 117.319926][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10
[ 117.319933][ T9855] ? __pfx___hfsplus_brec_find+0x10/0x10
[ 117.319942][ T9855] hfsplus_brec_find+0x28f/0x510
[ 117.319949][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10
[ 117.319956][ T9855] ? __pfx_hfsplus_brec_find+0x10/0x10
[ 117.319963][ T9855] ? __kmalloc_noprof+0x2a9/0x510
[ 117.319969][ T9855] ? hfsplus_find_init+0x8c/0x1d0
[ 117.319976][ T9855] hfsplus_brec_read+0x2b/0x120
[ 117.319983][ T9855] hfsplus_lookup+0x2aa/0x890
[ 117.319990][ T9855] ? __pfx_hfsplus_lookup+0x10/0x10
[ 117.320003][ T9855] ? d_alloc_parallel+0x2f0/0x15e0
[ 117.320008][ T9855] ? __lock_acquire+0xaec/0xd80
[ 117.320013][ T9855] ? __pfx_d_alloc_parallel+0x10/0x10
[ 117.320019][ T9855] ? __raw_spin_lock_init+0x45/0x100
[ 117.320026][ T9855] ? __init_waitqueue_head+0xa9/0x150
[ 117.320034][ T9855] __lookup_slow+0x297/0x3d0
[ 117.320039][ T9855] ? __pfx___lookup_slow+0x10/0x10
[ 117.320045][ T9855] ? down_read+0x1ad/0x2e0
[ 117.320055][ T9855] lookup_slow+0x53/0x70
[ 117.320065][ T9855] walk_component+0x2f0/0x430
[ 117.320073][ T9855] path_lookupat+0x169/0x440
[ 117.320081][ T9855] filename_lookup+0x212/0x590
[ 117.320089][ T9855] ? __pfx_filename_lookup+0x10/0x10
[ 117.320098][ T9855] ? strncpy_from_user+0x150/0x290
[ 117.320105][ T9855] ? getname_flags+0x1e5/0x540
[ 117.320112][ T9855] user_path_at+0x3a/0x60
[ 117.320117][ T9855] __x64_sys_umount+0xee/0x160
[ 117.320123][ T9855] ? __pfx___x64_sys_umount+0x10/0x10
[ 117.320129][ T9855] ? do_syscall_64+0xb7/0x3a0
[ 117.320135][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 117.320141][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 117.320145][ T9855] do_syscall_64+0xf3/0x3a0
[ 117.320150][ T9855] ? exc_page_fault+0x9f/0xf0
[ 117.320154][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 117.320158][ T9855] RIP: 0033:0x7f7dd7908b07
[ 117.320163][ T9855] Code: 23 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 08
[ 117.320167][ T9855] RSP: 002b:00007ffd5ebd9698 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
[ 117.320172][ T9855] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dd7908b07
[ 117.320176][ T9855] RDX: 0000000000000009 RSI: 0000000000000009 RDI: 00007ffd5ebd9740
[ 117.320179][ T9855] RBP: 00007ffd5ebda780 R08: 0000000000000005 R09: 00007ffd5ebd9530
[ 117.320181][ T9855] R10: 00007f7dd799bfc0 R11: 0000000000000202 R12: 000055e2008b32d0
[ 117.320184][ T9855] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 117.320189][ T9855] </TASK>
[ 117.320190][ T9855]
[ 117.351311][ T9855] Allocated by task 9855:
[ 117.351683][ T9855] kasan_save_track+0x3e/0x80
[ 117.352093][ T9855] __kasan_kmalloc+0x8d/0xa0
[ 117.352490][ T9855] __kmalloc_noprof+0x288/0x510
[ 117.352914][ T9855] hfsplus_find_init+0x8c/0x1d0
[ 117.353342][ T9855] hfsplus_lookup+0x19c/0x890
[ 117.353747][ T9855] __lookup_slow+0x297/0x3d0
[ 117.354148][ T9855] lookup_slow+0x53/0x70
[ 117.354514][ T9855] walk_component+0x2f0/0x430
[ 117.354921][ T9855] path_lookupat+0x169/0x440
[ 117.355325][ T9855] filename_lookup+0x212/0x590
[ 117.355740][ T9855] user_path_at+0x3a/0x60
[ 117.356115][ T9855] __x64_sys_umount+0xee/0x160
[ 117.356529][ T9855] do_syscall_64+0xf3/0x3a0
[ 117.356920][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 117.357429][ T9855]
[ 117.357636][ T9855] The buggy address belongs to the object at ffff88802160f000
[ 117.357636][ T9855] which belongs to the cache kmalloc-2k of size 2048
[ 117.358827][ T9855] The buggy address is located 0 bytes to the right of
[ 117.358827][ T9855] allocated 1036-byte region [ffff88802160f000, ffff88802160f40c)
[ 117.360061][ T9855]
[ 117.360266][ T9855] The buggy address belongs to the physical page:
[ 117.360813][ T9855] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x21608
[ 117.361562][ T9855] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 117.362285][ T9855] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
[ 117.362929][ T9855] page_type: f5(slab)
[ 117.363282][ T9855] raw: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[ 117.364015][ T9855] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[ 117.364750][ T9855] head: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[ 117.365491][ T9855] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[ 117.366232][ T9855] head: 00fff00000000003 ffffea0000858201 00000000ffffffff 00000000ffffffff
[ 117.366968][ T9855] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
[ 117.367711][ T9855] page dumped because: kasan: bad access detected
[ 117.368259][ T9855] page_owner tracks the page as allocated
[ 117.368745][ T9855] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN1
[ 117.370541][ T9855] post_alloc_hook+0x240/0x2a0
[ 117.370954][ T9855] get_page_from_freelist+0x2101/0x21e0
[ 117.371435][ T9855] __alloc_frozen_pages_noprof+0x274/0x380
[ 117.371935][ T9855] alloc_pages_mpol+0x241/0x4b0
[ 117.372360][ T9855] allocate_slab+0x8d/0x380
[ 117.372752][ T9855] ___slab_alloc+0xbe3/0x1400
[ 117.373159][ T9855] __kmalloc_cache_noprof+0x296/0x3d0
[ 117.373621][ T9855] nexthop_net_init+0x75/0x100
[ 117.374038][ T9855] ops_init+0x35c/0x5c0
[ 117.374400][ T9855] setup_net+0x10c/0x320
[ 117.374768][ T9855] copy_net_ns+0x31b/0x4d0
[ 117.375156][ T9855] create_new_namespaces+0x3f3/0x720
[ 117.375613][ T9855] unshare_nsproxy_namespaces+0x11c/0x170
[ 117.376094][ T9855] ksys_unshare+0x4ca/0x8d0
[ 117.376477][ T9855] __x64_sys_unshare+0x38/0x50
[ 117.376879][ T9855] do_syscall_64+0xf3/0x3a0
[ 117.377265][ T9855] page last free pid 9110 tgid 9110 stack trace:
[ 117.377795][ T9855] __free_frozen_pages+0xbeb/0xd50
[ 117.378229][ T9855] __put_partials+0x152/0x1a0
[ 117.378625][ T9855] put_cpu_partial+0x17c/0x250
[ 117.379026][ T9855] __slab_free+0x2d4/0x3c0
[ 117.379404][ T9855] qlist_free_all+0x97/0x140
[ 117.379790][ T9855] kasan_quarantine_reduce+0x148/0x160
[ 117.380250][ T9855] __kasan_slab_alloc+0x22/0x80
[ 117.380662][ T9855] __kmalloc_noprof+0x232/0x510
[ 117.381074][ T9855] tomoyo_supervisor+0xc0a/0x1360
[ 117.381498][ T9855] tomoyo_env_perm+0x149/0x1e0
[ 117.381903][ T9855] tomoyo_find_next_domain+0x15ad/0x1b90
[ 117.382378][ T9855] tomoyo_bprm_check_security+0x11c/0x180
[ 117.382859][ T9855] security_bprm_check+0x89/0x280
[ 117.383289][ T9855] bprm_execve+0x8f1/0x14a0
[ 117.383673][ T9855] do_execveat_common+0x528/0x6b0
[ 117.384103][ T9855] __x64_sys_execve+0x94/0xb0
[ 117.384500][ T9855]
[ 117.384706][ T9855] Memory state around the buggy address:
[ 117.385179][ T9855] ffff88802160f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 117.385854][ T9855] ffff88802160f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 117.386534][ T9855] >ffff88802160f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 117.387204][ T9855] ^
[ 117.387566][ T9855] ffff88802160f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 117.388243][ T9855] ffff88802160f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 117.388918][ T9855] ==================================================================
The issue takes place if the length field of struct hfsplus_unistr
is bigger than HFSPLUS_MAX_STRLEN. The patch checks the lengths of
the strings being compared, and if a string's length is bigger than
HFSPLUS_MAX_STRLEN, it is clamped to that value.
v2: The string length correction has also been added for hfsplus_strcmp().
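The clamp is essentially this (a sketch; the actual patch may differ in
detail):
```
	len1 = be16_to_cpu(s1->length);
	len2 = be16_to_cpu(s2->length);

	/* Clamp on-disk lengths so the comparison never reads past the
	 * HFSPLUS_MAX_STRLEN-sized unicode buffer. */
	if (len1 > HFSPLUS_MAX_STRLEN)
		len1 = HFSPLUS_MAX_STRLEN;
	if (len2 > HFSPLUS_MAX_STRLEN)
		len2 = HFSPLUS_MAX_STRLEN;
```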
Reported-by: Jiaming Zhang <r772577952@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
cc: syzkaller@googlegroups.com
Link: https://lore.kernel.org/r/20250919191243.1370388-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
|
|
Make use of the newly-available `Alignment` type and remove the
corresponding TODO item.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Alignment operations are very common in the kernel. Since they are
always performed using a power-of-two value, enforcing this invariant
through a dedicated type leads to fewer bugs and can improve the
generated code.
Introduce the `Alignment` type, inspired by the nightly Rust type of the
same name and providing the same interface, and a new `Alignable` trait
allowing unsigned integers to be aligned up or down.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
[ Used `build_assert!`, added intra-doc link, `allow`ed
`clippy::incompatible_msrv`, added `feature(const_option)`, capitalized
safety comment. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Currently, the QMI interface only works on little endian systems due to how
it encodes and decodes data. Most QMI related data structures do not use
endian specific types and are already defined in CPU native order. The
ath12k specific QMI structs are an exception: they use partially endian
specific types, which prevents the QMI interface from being extended to
support big endian systems.
Update the two affected ath12k QMI structs to use CPU order types instead.
This is required because the QMI interface is being extended to support big
endian systems, and that support depends on QMI data structures being
defined in CPU native order.
This change:
* preserves compatibility with existing kernels, which only support little
endian systems
* enables future support for big endian systems
* aligns ath12k QMI handling with the general QMI design
Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250922061607.11543-1-alexander.wilhelm@westermo.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Currently, if the descriptor size exceeds 128 bytes, the total
descriptor is split into multiple 128-byte segments, each
requiring a separate flush cache queue command. This results in
multiple commands being issued to flush a single TID, which
negatively impacts performance. To optimize this, use the
_FLUSH_QUEUE_1K_DESC REO command to flush a 1KB descriptor in a
single operation.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-8-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Flush cache failures were observed after RX queue update for TID
delete. This occurred because the queue was invalid during flush.
Set the VLD bit in the RX queue update command for TID delete.
This ensures the queue remains valid during the flush cache process.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-7-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
During stress test scenarios, when the REO command ring becomes full,
the RX queue update command issued during peer deletion fails due to
insufficient space. In response, the host performs a dma_unmap and
frees the associated memory. However, the hardware still retains a
reference to the same memory address. If the kernel later reallocates
this address, unaware that the hardware is still using it, it can
lead to memory corruption, since the host might access or modify
memory that is still actively referenced by the hardware.
Implement a retry mechanism for the HAL_REO_CMD_UPDATE_RX_QUEUE
command during TID deletion to prevent memory corruption. Introduce
a new list, reo_cmd_update_rx_queue_list, in the struct ath12k_dp to
track pending RX queue updates. Protect this list with
reo_rxq_flush_lock, which also ensures synchronized access to
reo_cmd_cache_flush_list. Defer memory release until hardware
confirms the virtual address is no longer in use, avoiding immediate
deallocation on command failure. Release memory for pending RX queue
updates via ath12k_dp_rx_reo_cmd_list_cleanup() on system reset
if hardware confirmation is not received.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Manish Dharanenthiran <manish.dharanenthiran@oss.qualcomm.com>
Co-developed-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-6-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Introduce ath12k_dp_rx_tid_rxq as a lightweight structure to represent
only the necessary fields for REO command construction. Replace direct
usage of ath12k_dp_rx_tid in REO command paths with this new structure.
This decouples REO command logic from internal TID state representation,
improves modularity, and reduces unnecessary data dependencies.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-5-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Introduce ath12k_dp_rx_tid_cleanup() to handle RX TID buffer
unmapping and freeing. This replaces duplicated cleanup logic
across multiple code paths.
This improves code maintainability and avoids redundancy in
buffer cleanup operations.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-4-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Refactor RX TID deletion handling by moving the REO command
setup and send sequence into a new helper function:
ath12k_dp_rx_tid_delete_handler().
This improves code readability and modularity, and prepares
the codebase for potential reuse of the REO command logic in
other contexts where RX TID deletion is required.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-3-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Increase DP_REO_CMD_RING_SIZE from 128 to 256 to avoid
queuing failures observed during stress test scenarios.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Nithyanantham Paramasivam <nithyanantham.paramasivam@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250806111750.3214584-2-nithyanantham.paramasivam@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
https://github.com/Rust-for-Linux/linux into rust-next
Pull timekeeping updates from Andreas Hindborg:
- Add methods on 'HrTimer' that can only be called with exclusive
access to an unarmed timer, or from timer callback context.
- Add arithmetic operations to 'Instant' and 'Delta'.
- Add a few convenience and access methods to 'HrTimer' and 'Instant'.
* tag 'rust-timekeeping-v6.18' of https://github.com/Rust-for-Linux/linux:
rust: time: Implement basic arithmetic operations for Delta
rust: time: Implement Add<Delta>/Sub<Delta> for Instant
rust: hrtimer: Add HrTimer::expires()
rust: time: Add Instant::from_ktime()
rust: hrtimer: Add forward_now() to HrTimer and HrTimerCallbackContext
rust: hrtimer: Add HrTimerCallbackContext and ::forward()
rust: hrtimer: Add HrTimer::raw_forward() and forward()
rust: hrtimer: Add HrTimerInstant
rust: hrtimer: Document the return value for HrTimerHandle::cancel()
|
|
This is a port of the Binder data structure introduced in commit
15d9da3f818c ("binder: use bitmap for faster descriptor lookup") to
Rust.
Like drivers/android/dbitmap.h, the ID pool abstraction lets
clients acquire and release IDs. The implementation uses a bitmap to
know what IDs are in use, and gives clients fine-grained control over
the time of allocation. This fine-grained control is needed in the
Android Binder. We provide an example that releases a spinlock for
allocation, and unit tests (rustdoc examples).
The implementation does not permit shrinking the capacity below
BITS_PER_LONG.
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Burak Emir <bqe@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
Add a microbenchmark protected by the FIND_BIT_BENCHMARK_RUST config,
following `find_bit_benchmark.c` but testing the Rust Bitmap API.
We add a fill_random() method protected by the same config in order to
maintain the abstraction.
The sample output from the benchmark, both C and Rust version:
find_bit_benchmark.c output:
```
Start testing find_bit() with random-filled bitmap
[ 438.101937] find_next_bit: 860188 ns, 163419 iterations
[ 438.109471] find_next_zero_bit: 912342 ns, 164262 iterations
[ 438.116820] find_last_bit: 726003 ns, 163419 iterations
[ 438.130509] find_nth_bit: 7056993 ns, 16269 iterations
[ 438.139099] find_first_bit: 1963272 ns, 16270 iterations
[ 438.173043] find_first_and_bit: 27314224 ns, 32654 iterations
[ 438.180065] find_next_and_bit: 398752 ns, 73705 iterations
[ 438.186689]
Start testing find_bit() with sparse bitmap
[ 438.193375] find_next_bit: 9675 ns, 656 iterations
[ 438.201765] find_next_zero_bit: 1766136 ns, 327025 iterations
[ 438.208429] find_last_bit: 9017 ns, 656 iterations
[ 438.217816] find_nth_bit: 2749742 ns, 655 iterations
[ 438.225168] find_first_bit: 721799 ns, 656 iterations
[ 438.231797] find_first_and_bit: 2819 ns, 1 iterations
[ 438.238441] find_next_and_bit: 3159 ns, 1 iterations
```
find_bit_benchmark_rust.rs output:
```
[ 451.182459] find_bit_benchmark_rust:
[ 451.186688] Start testing find_bit() Rust with random-filled bitmap
[ 451.194450] next_bit: 777950 ns, 163644 iterations
[ 451.201997] next_zero_bit: 918889 ns, 164036 iterations
[ 451.208642] Start testing find_bit() Rust with sparse bitmap
[ 451.214300] next_bit: 9181 ns, 654 iterations
[ 451.222806] next_zero_bit: 1855504 ns, 327026 iterations
```
Here are the results from 32 samples, with 95% confidence interval.
The microbenchmark was built with RUST_BITMAP_HARDENED=n and run on a
machine that did not execute other processes.
Random-filled bitmap:
+-----------+-------+-----------+--------------+-----------+-----------+
| Benchmark | Lang | Mean (ms) | Std Dev (ms) | 95% CI Lo | 95% CI Hi |
+-----------+-------+-----------+--------------+-----------+-----------+
| find_bit/ | C | 825.07 | 53.89 | 806.40 | 843.74 |
| next_bit | Rust | 870.91 | 46.29 | 854.88 | 886.95 |
+-----------+-------+-----------+--------------+-----------+-----------+
| find_zero/| C | 933.56 | 56.34 | 914.04 | 953.08 |
| next_zero | Rust | 945.85 | 60.44 | 924.91 | 966.79 |
+-----------+-------+-----------+--------------+-----------+-----------+
Rust appears 5.5% slower for next_bit, 1.3% slower for next_zero.
Sparse bitmap:
+-----------+-------+-----------+--------------+-----------+-----------+
| Benchmark | Lang | Mean (ms) | Std Dev (ms) | 95% CI Lo | 95% CI Hi |
+-----------+-------+-----------+--------------+-----------+-----------+
| find_bit/ | C | 13.17 | 6.21 | 11.01 | 15.32 |
| next_bit | Rust | 14.30 | 8.27 | 11.43 | 17.17 |
+-----------+-------+-----------+--------------+-----------+-----------+
| find_zero/| C | 1859.31 | 82.30 | 1830.80 | 1887.83 |
| next_zero | Rust | 1908.09 | 139.82 | 1859.65 | 1956.54 |
+-----------+-------+-----------+--------------+-----------+-----------+
Rust appears 8.5% slower for next_bit, 2.6% slower for next_zero.
In summary, taking the arithmetic mean of all slow-downs, we can say
the Rust API has a 4.5% slowdown.
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Reviewed-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Burak Emir <bqe@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
Provides an abstraction for C bitmap API and bitops operations.
This commit enables a Rust implementation of an Android Binder
data structure from commit 15d9da3f818c ("binder: use bitmap for faster
descriptor lookup"), which can be found in drivers/android/dbitmap.h.
It is a step towards upstreaming the Rust port of Android Binder driver.
We follow the C Bitmap API closely in naming and semantics, with
a few differences that take advantage of Rust language facilities
and idioms. The main types are `BitmapVec` for owned bitmaps and
`Bitmap` for references to C bitmaps.
* We leverage Rust type system guarantees as follows:
* all (non-atomic) mutating operations require a &mut reference which
amounts to exclusive access.
* the `BitmapVec` type implements Send. This enables transferring
ownership between threads and is needed for Binder.
* the `BitmapVec` type implements Sync, which enables passing shared
references &Bitmap between threads. Atomic operations can be
used to safely modify from multiple threads (interior
mutability), though without ordering guarantees.
* The Rust API uses `{set,clear}_bit` vs `{set,clear}_bit_atomic` as
names for clarity, which differs from the C naming convention
`set_bit` for atomic vs `__set_bit` for non-atomic.
* we include enough operations for the API to be useful. Not all
operations are exposed yet in order to avoid dead code. The missing
ones can be added later.
* We take a fine-grained approach to safety:
* Low-level bit-ops get a safe API with bounds checks. Calling
{set,clear}_bit with an out-of-bounds argument becomes a no-op and
gets logged as an error.
* We also introduce a RUST_BITMAP_HARDENED config, which
causes invocations with out-of-bounds arguments to panic.
* methods corresponding to the find_* C methods tolerate out-of-bounds
arguments, since the C implementation does. Here too, out-of-bounds
arguments are logged as errors, or panic in RUST_BITMAP_HARDENED
mode.
* We add a way to "borrow" bitmaps from C in Rust, to make C bitmaps
that were allocated in C directly usable in Rust code (`Bitmap`).
* the Rust API is optimized to represent the bitmap inline if it would
fit into a pointer. This saves allocations, which is
relevant in the Binder use case.
The underlying C bitmap is *not* exposed for raw access in Rust. Doing so
would permit bypassing the Rust API and would lose its static guarantees.
An alternative route of vendoring an existing Rust bitmap package was
considered but suboptimal overall. Reusing the C implementation is
preferable for a basic data structure like bitmaps. It enables Rust
code to be a lot more similar and predictable with respect to C code
that uses the same data structures and enables the use of code that
has been tried-and-tested in the kernel, with the same performance
characteristics whenever possible.
We use the `usize` type for sizes and indices into the bitmap,
because Rust generally always uses that type for indices and lengths
and it will be more convenient if the API accepts that type. This means
that we need to perform some casts to/from u32 and usize, since the C
headers use unsigned int instead of size_t/unsigned long for these
numbers in some places.
Adds new MAINTAINERS section BITMAP API [RUST].
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Burak Emir <bqe@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
Makes atomic set_bit and clear_bit inline functions as well as the
non-atomic variants __set_bit and __clear_bit available to Rust.
Adds a new MAINTAINERS section BITOPS API BINDINGS [RUST].
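For context, these are the C operations now reachable from Rust, with
their atomic vs non-atomic variants:
```
#include <linux/bitops.h>

static void example(unsigned long *addr)
{
	set_bit(0, addr);	/* atomic RMW, safe under concurrency */
	clear_bit(0, addr);	/* atomic RMW */
	__set_bit(1, addr);	/* non-atomic; caller excludes concurrency */
	__clear_bit(1, addr);	/* non-atomic */
}
```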
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Burak Emir <bqe@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
Makes the bitmap_copy_and_extend inline function available to Rust.
Adds F: to existing MAINTAINERS section BITMAP API BINDINGS [RUST].
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Burak Emir <bqe@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Acked-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.
The Rockchip PCIe PHY driver, used on the RK3399, has its own definition
of HIWORD_UPDATE.
Remove it, and replace instances of it with hw_bitfield.h's
FIELD_PREP_WM16. To achieve this, some mask defines are reshuffled, as
FIELD_PREP_WM16 uses the mask as both the mask of bits to write and to
derive the shift amount from in order to shift the value.
In order to ensure that the mask is always a constant, the inst->index
shift is performed after the FIELD_PREP_WM16, as this is a runtime
value.
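The before/after shape is roughly the following (register and field
names are illustrative, and the include path is an assumption):
```
#include <linux/bits.h>
#include <linux/hw_bitfield.h>
#include <linux/io.h>

#define EXAMPLE_FIELD_MASK	GENMASK(3, 0)	/* illustrative field */

static void example_write(void __iomem *reg, u32 val)
{
	/* Hand-rolled: writel(HIWORD_UPDATE(val, 0xf, 0), reg);
	 * With FIELD_PREP_WM16, the shift is derived from the mask and
	 * the write-enable mask is placed in the high 16 bits for us: */
	writel(FIELD_PREP_WM16(EXAMPLE_FIELD_MASK, val), reg);
}
```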
From this, we gain compile-time error checking and, in my humble opinion,
nicer code, as well as a single definition of this macro across the
entire codebase to aid in code comprehension.
Tested on a RK3399 ROCKPro64, where PCIe still works as expected when
accessing an NVMe drive.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
The sp7021 clock driver has its own shifted high word mask macro,
similar to the ones many Rockchip drivers have.
Remove it, and replace instances of it with hw_bitfield.h's
FIELD_PREP_WM16 macro, which does the same thing except in a common
macro that also does compile-time error checking.
This was compile-tested on 32-bit ARM with Clang; no runtime tests
were performed as I lack the hardware. However, I verified that fix
commit 5c667d5a5a3e ("clk: sp7021: Adjust width of _m in HWM_FIELD_PREP()")
is not regressed. No warning is produced.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
|
|
Replace the open-coded for_each_cpu(cpu, cpu_online_mask) loop with the
more readable and equivalent for_each_online_cpu(cpu) macro.
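The change is mechanical (do_something() is a placeholder):
```
/* Before: */
for_each_cpu(cpu, cpu_online_mask)
	do_something(cpu);

/* After, equivalent per the macro's definition in linux/cpumask.h: */
for_each_online_cpu(cpu)
	do_something(cpu);
```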
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Link: https://lore.kernel.org/r/20250811065216.3320-1-wangfushuai@baidu.com
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
|