Merge tag 'tegra-for-6.9-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into soc/drivers
soc/tegra: Changes for v6.9-rc1
This set of changes adds ACPI support for the APBMISC driver and cleans
up a few things like dependencies and unused code.
* tag 'tegra-for-6.9-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
soc/tegra: pmc: Add SD wake event for Tegra234
soc/tegra: pmc: Update scratch as an optional aperture
soc/tegra: pmc: Update address mapping sequence for PMC apertures
bus: tegra-aconnect: Update dependency to ARCH_TEGRA
soc/tegra: Fix build failure on Tegra241
soc/tegra: fuse: Fix crash in tegra_fuse_readl()
soc/tegra: fuse: Define tegra194_soc_attr_group for Tegra241
soc/tegra: fuse: Add support for Tegra241
soc/tegra: fuse: Add ACPI support for Tegra194 and Tegra234
soc/tegra: fuse: Add function to print SKU info
soc/tegra: fuse: Add function to add lookups
soc/tegra: fuse: Add tegra_acpi_init_apbmisc()
soc/tegra: fuse: Refactor resource mapping
soc/tegra: fuse: Use dev_err_probe for probe failures
mm/util: Introduce kmemdup_array()
soc/tegra: pmc: Remove some old and deprecated functions and constants
Link: https://lore.kernel.org/r/20240223174849.1509465-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Merge tag 'scmi-updates-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers
Arm SCMI updates for v6.9
Quite a few changes to extend support to SCMI v3.2 specification,
to enhance notification handling and other miscellaneous updates.
1. Enhancements to notification handling
Until now, trying to register a notifier for an unsupported
notification returned an error, generating unneeded message exchanges
with the SCMI platform. This can be avoided by looking up in advance
the specific protocol and resources available.
With these changes an SCMI driver user will still fail to register a
notifier if the related command or resource is not supported (as
before), but without the need to exchange any messages.
Perf notifications are also extended to provide the pre-calculated
frequencies corresponding to the level or index carried by the
notification.
2. More SCMI v3.2 related updates
One of the main additions is centralized support in the SCMI core for
handling the v3.2 optional protocol version negotiation, so that at
protocol initialization time, if the platform-advertised version is
newer than the kernel supports and protocol version negotiation is
supported, the SCMI core will attempt to negotiate an older protocol
version.
It also includes clock get permissions, which indicate whether any of
the clock operations are forbidden by the platform for the OSPM agent.
This can be used in the clock driver to avoid unnecessary message
exchanges between the kernel and the platform that would always end
up failing. It also includes other missing bits of the clock v3.2
protocol so that the supported protocol version can be bumped to
0x30000 (v3.2).
3. Miscellaneous updates
This includes the addition of a warning if the domain frequency
multiplier is 0 or rounded off, to indicate that the actual frequencies
are either wrong or rounded off, hardening of clock domain info lookups,
addition of multiple protocols registration support within an SCMI
driver, an update to the SCMI entry in MAINTAINERS to include the HWMON
driver, and constifying the scmi_bus_type structure.
This also includes a couple of fixes for minor issues: a double free in
the SMC transport cleanup path and struct kernel-doc warnings in the
optee transport.
* tag 'scmi-updates-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (29 commits)
MAINTAINERS: Update SCMI entry with HWMON driver
firmware: arm_scmi: Update the supported clock protocol version
firmware: arm_scmi: Add standard clock OEM definitions
firmware: arm_scmi: Add clock check for extended config support
firmware: arm_scmi: Add support for v3.2 NEGOTIATE_PROTOCOL_VERSION
firmware: arm_scmi: Fix struct kernel-doc warnings in optee transport
firmware: arm_scmi: Report frequencies in the perf notifications
firmware: arm_scmi: Use opps_by_lvl to store opps
firmware: arm_scmi: Implement is_notify_supported callback in powercap protocol
firmware: arm_scmi: Implement is_notify_supported callback in reset protocol
firmware: arm_scmi: Implement is_notify_supported callback in sensor protocol
firmware: arm_scmi: Implement is_notify_supported callback in clock protocol
firmware: arm_scmi: Implement is_notify_supported callback in system power protocol
firmware: arm_scmi: Implement is_notify_supported callback in power protocol
firmware: arm_scmi: Implement is_notify_supported callback in perf protocol
firmware: arm_scmi: Add a common helper to check if a message is supported
firmware: arm_scmi: Check for notification support
firmware: arm_scmi: Make scmi_bus_type const
firmware: arm_scmi: Fix double free in SMC transport cleanup path
firmware: arm_scmi: Implement clock get permissions
...
Link: https://lore.kernel.org/r/20240223033435.118028-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
i2c_adapter_type and i2c_client_type variables to be constant structures as
well, placing them into read-only memory which cannot be modified at
runtime.
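A minimal sketch of the pattern (initializer values are illustrative,
not the actual i2c code):

  /* Marking the type const lets the compiler place it in .rodata. */
  static const struct device_type i2c_adapter_type_example = {
          .name   = "i2c-adapter",
          /* .groups, .release, etc. stay as in the driver */
  };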
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
There is no point in having seven architectures implementing the same empty
stub.
Provide a weak function in the init code and remove the stubs.
This also allows utilizing the function on UP, which is required to
sanitize the per-CPU handling on X86 UP.
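The generic pattern looks roughly like the sketch below; the function
name is hypothetical since this log excerpt does not show it.
Architectures that need real work simply provide a non-weak definition
that overrides the stub:

  /* Generic fallback: does nothing unless an architecture overrides it. */
  void __weak arch_example_init(void)
  {
  }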
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.567671691@linutronix.de
|
|
(skb_transport_header(skb) - skb_network_header(skb))
can be replaced by skb_network_header_len(skb).
Add a DEBUG_NET_WARN_ON_ONCE() in skb_network_header_len()
to catch cases where the transport_header was not set.
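A sketch of the helper with the new check (close to, but not
necessarily identical to, the actual include/linux/skbuff.h change):

  static inline u32 skb_network_header_len(const struct sk_buff *skb)
  {
          DEBUG_NET_WARN_ON_ONCE(!skb_transport_header_was_set(skb));
          return skb->transport_header - skb->network_header;
  }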
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the driver core can properly handle constant struct bus_type,
move the serio_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240210-bus_cleanup-input2-v1-2-0daef7e034e0@marliere.net
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver
core can properly handle constant struct device_type. Move the
sdw_master_type and sdw_slave_type variables to be constant structures as
well, placing them into read-only memory which cannot be modified at
runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net>
Link: https://lore.kernel.org/r/20240219-device_cleanup-soundwire-v1-1-9edd51767611@marliere.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
of_machine_compatible_match() works with a table of strings.
of_machine_is_compatible() is a simpler version with only one string.
Re-implement of_machine_is_compatible() by setting up a table of strings
with a single string and then using of_machine_compatible_match().
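A sketch of that re-implementation (the actual code may differ in
detail):

  bool of_machine_is_compatible(const char *compat)
  {
          const char *compats[] = { compat, NULL };

          return of_machine_compatible_match(compats);
  }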
Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231214103152.12269-3-mpe@ellerman.id.au
|
|
of_machine_is_compatible() currently returns a positive integer if it
finds a match. However, none of the callers ever check the value; they
all treat it as true/false.
So change of_machine_is_compatible() to return bool, which will allow
the implementation to be changed in a subsequent patch.
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231214103152.12269-2-mpe@ellerman.id.au
|
|
We have of_machine_is_compatible() to check if a machine is compatible
with a single compatible string. However some code is able to support
multiple compatible boards, and so wants to check for one of many
compatible strings.
So add of_machine_compatible_match() which takes a NULL terminated
array of compatible strings to check against the root node's
compatible property.
Compared to an open-coded match this is slightly more self-documenting,
and it also avoids the caller needing to juggle the root node either
directly or via of_find_node_by_path().
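A hypothetical caller, to show the intended shape (the compatible
strings and helper are made up for illustration):

  static const char * const example_boards[] = {
          "vendor,board-a",
          "vendor,board-b",
          NULL
  };

  static void example_setup(void)
  {
          if (of_machine_compatible_match(example_boards))
                  pr_info("running on a supported board\n");
  }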
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231214103152.12269-1-mpe@ellerman.id.au
|
|
Previously the driver got a few updates in order to replace OF APIs
with the respective firmware node APIs; however, the conversion was not
finished to its logical end, e.g., some of the APIs that are used still
require an OF node to be passed. Finish that job by converting the
leftovers to use firmware node APIs.
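A generic before/after sketch of this kind of conversion (the property
name and probe function are hypothetical, not taken from the patch):

  static int example_probe(struct device *dev)
  {
          u32 ngpios;

          /* OF-specific: requires dev->of_node, DT only */
          /* of_property_read_u32(dev->of_node, "ngpios", &ngpios); */

          /* fwnode-based equivalent: works for DT and ACPI alike */
          return device_property_read_u32(dev, "ngpios", &ngpios);
  }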
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20240302173401.217830-1-andy.shevchenko@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-02-29
We've added 119 non-merge commits during the last 32 day(s) which contain
a total of 150 files changed, 3589 insertions(+), 995 deletions(-).
The main changes are:
1) Extend the BPF verifier to enable static subprog calls in spin lock
critical sections, from Kumar Kartikeya Dwivedi.
2) Fix confusing and incorrect inference of PTR_TO_CTX argument type
in BPF global subprogs, from Andrii Nakryiko.
3) Larger batch of riscv BPF JIT improvements and enabling inlining
of the bpf_kptr_xchg() for RV64, from Pu Lehui.
4) Allow skeleton users to change the values of the fields in struct_ops
maps at runtime, from Kui-Feng Lee.
5) Extend the verifier's capabilities of tracking scalars when they
are spilled to stack, especially when the spill or fill is narrowing,
from Maxim Mikityanskiy & Eduard Zingerman.
6) Various BPF selftest improvements to fix errors under gcc BPF backend,
from Jose E. Marchesi.
7) Avoid module loading failure when the module trying to register
a struct_ops has its BTF section stripped, from Geliang Tang.
8) Annotate all kfuncs in .BTF_ids section which eventually allows
for automatic kfunc prototype generation from bpftool, from Daniel Xu.
9) Several updates to the instruction-set.rst IETF standardization
document, from Dave Thaler.
10) Shrink the size of struct bpf_map resp. bpf_array,
from Alexei Starovoitov.
11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer,
from Benjamin Tissoires.
12) Fix bpftool to be more portable to musl libc by using POSIX's
basename(), from Arnaldo Carvalho de Melo.
13) Add libbpf support to gcc in CORE macro definitions,
from Cupertino Miranda.
14) Remove a duplicate type check in perf_event_bpf_event,
from Florian Lehner.
15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them
with notrace correctly, from Yonghong Song.
16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible
array to fix build warnings, from Kees Cook.
17) Fix resolve_btfids cross-compilation to non host-native endianness,
from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
selftests/bpf: Test if shadow types work correctly.
bpftool: Add an example for struct_ops map and shadow type.
bpftool: Generated shadow variables for struct_ops maps.
libbpf: Convert st_ops->data to shadow type.
libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
bpf, arm64: use bpf_prog_pack for memory management
arm64: patching: implement text_poke API
bpf, arm64: support exceptions
arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT
bpf: add is_async_callback_calling_insn() helper
bpf: introduce in_sleepable() helper
bpf: allow more maps in sleepable bpf programs
selftests/bpf: Test case for lacking CFI stub functions.
bpf: Check cfi_stubs before registering a struct_ops type.
bpf: Clarify batch lookup/lookup_and_delete semantics
bpf, docs: specify which BPF_ABS and BPF_IND fields were zero
bpf, docs: Fix typos in instruction-set.rst
selftests/bpf: update tcp_custom_syncookie to use scalar packet offset
bpf: Shrink size of struct bpf_map/bpf_array.
...
====================
Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A new port configuration was added to set max_queue_size. Clamp user
configuration to RDMA transport limits.
Increase the maximal queue size of RDMA controllers from 128 to 256
(the default size stays 128, the same as before).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
This definition will be used by controllers that are configured with
metadata support. For now, both regular and metadata controllers have
the same maximal queue size but later commit will increase the maximal
queue size for regular RDMA controllers to 256.
We'll keep the maximal queue size for metadata controllers at 128
since there are more resources needed for metadata operations, and
128 is the optimal size found for metadata controllers based on
testing.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The correct place for this definition is the nvme rdma header file and
not the common nvme header file.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
We have a macro. It should be used.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://lore.kernel.org/r/20240229132401.3270-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Merge tag 'thunderbolt-for-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next
Mika writes:
thunderbolt: Changes for v6.9 merge window
This includes following USB4/Thunderbolt changes for the v6.9 merge
window:
- Reset the topology also for USB4 v1 routers on driver load
- DisplayPort tunneling and bandwidth allocation mode improvements
- Tracepoint support for the control channel
- Couple of minor fixes and cleanups.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (23 commits)
thunderbolt: Constify the struct device_type usage
thunderbolt: Add trace events support for the control channel
thunderbolt: Keep the domain powered when USB4 port is in redrive mode
thunderbolt: Improve DisplayPort tunnel setup process to be more robust
thunderbolt: Calculate DisplayPort tunnel bandwidth after DPRX capabilities read
thunderbolt: Reserve released DisplayPort bandwidth for a group for 10 seconds
thunderbolt: Introduce tb_tunnel_direction_downstream()
thunderbolt: Re-order bandwidth group functions
thunderbolt: Fail the failed bandwidth request properly
thunderbolt: Log an error if DPTX request is not cleared
thunderbolt: Handle bandwidth allocation mode disable request
thunderbolt: Re-calculate estimated bandwidth when allocation mode is enabled
thunderbolt: Use DP_LOCAL_CAP for maximum bandwidth calculation
thunderbolt: Correct typo in host_reset parameter
thunderbolt: Skip discovery also in USB4 v2 host
thunderbolt: Reset only non-USB4 host routers in resume
thunderbolt: Remove usage of the deprecated ida_simple_xx() API
thunderbolt: Fix rollback in tb_port_lane_bonding_enable() for lane 1
thunderbolt: Fix XDomain rx_lanes_show and tx_lanes_show
thunderbolt: Reset topology created by the boot firmware
...
|
|
Merge tag 'coresight-next-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux into char-misc-next
Suzuki writes:
coresight: hwtracing subsystem updates for v6.9
Changes targeting Linux v6.9 include:
- CoreSight: Enable W=1 warnings as default
- CoreSight: Clean up sysfs/perf mode handling for tracing
- Support for Qualcomm TPDM CMB Dataset
- Miscellaneous fixes to the CoreSight subsystem
- Fix for hisi_ptt PMU to reject events targeting other PMUs
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
* tag 'coresight-next-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux: (32 commits)
coresight-tpda: Change qcom,dsb-element-size to qcom,dsb-elem-bits
dt-bindings: arm: qcom,coresight-tpdm: Rename qcom,dsb-element-size
hwtracing: hisi_ptt: Move type check to the beginning of hisi_ptt_pmu_event_init()
coresight: tpdm: Fix build break due to uninitialised field
coresight: etm4x: Set skip_power_up in etm4_init_arch_data function
coresight-tpdm: Add msr register support for CMB
dt-bindings: arm: qcom,coresight-tpdm: Add support for TPDM CMB MSR register
coresight-tpdm: Add timestamp control register support for the CMB
coresight-tpdm: Add pattern registers support for CMB
coresight-tpdm: Add support to configure CMB
coresight-tpda: Add support to configure CMB element
coresight-tpdm: Add CMB dataset support
dt-bindings: arm: qcom,coresight-tpdm: Add support for CMB element size
coresight-tpdm: Optimize the useage of tpdm_has_dsb_dataset
coresight-tpdm: Optimize the store function of tpdm simple dataset
coresight: Add helper for setting csdev->mode
coresight: Add a helper for getting csdev->mode
coresight: Add helper for atomically taking the device
coresight: Add explicit member initializers to coresight_dev_type
coresight: Remove unused stubs
...
|
|
Merge tag 'mhi-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi into char-misc-next
Manivannan writes:
MHI Host
========
- Added new MHI_PM_SYS_ERR_FAIL state to the MHI state machine to properly
cleanup the channel state if the device fails to respond to the MHI reset
during SYS_ERR handling. This issue was discovered with the Qualcomm AIC100 AI
accelerator device.
- Modified the code that reads and exposes the OEM_PK_HASH registers through
sysfs to read them on demand instead of reading them once during boot.
Qualcomm AIC100 devices support provisioning the keys dynamically, so
this allows users to know the up-to-date information.
- Added tracepoint support to expose the debug information over tracefs.
- Reverted the commit that reads the MHI device revision from the device during
boot. This was done because the read info was not used anywhere (dead code)
and it is also not possible to read the revision info from all devices.
- Constified the modem config for Telit FN980 modem as required by the MHI core.
MHI Endpoint
============
- Replaced kzalloc() with kcalloc() in an effort to avoid integer overflows
during multiplication. Even though there is no potential overflow in the
endpoint code, this is done for the sake of uniformity and best practice.
- Fixed the kmem_cache_create() failure check to use the correct variable.
* tag 'mhi-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi:
bus: mhi: host: pci_generic: constify modem_telit_fn980_hw_v1_config
bus: mhi: host: Change the trace string for the userspace tools mapping
bus: mhi: ep: check the correct variable in mhi_ep_register_controller()
Revert "bus: mhi: core: Add support for reading MHI info from device"
bus: mhi: host: Add tracing support
bus: mhi: ep: Use kcalloc() instead of kzalloc()
bus: mhi: host: Read PK HASH dynamically
bus: mhi: host: Add MHI_PM_SYS_ERR_FAIL state
|
|
In commit 19416123ab3e ("block: define 'struct bvec_iter' as packed"),
what we need is to save the 4-byte padding and avoid `bio` spreading
onto one extra cache line.
It is enough to define it as '__packed __aligned(4)', as '__packed'
alone means byte-aligned and can cause the compiler to generate horrible
code on architectures that don't support unaligned access in case
bvec_iter is embedded in other structures.
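The resulting annotation, with the fields abridged from
include/linux/bvec.h for illustration:

  struct bvec_iter {
          sector_t        bi_sector;      /* device address in 512 byte sectors */
          unsigned int    bi_size;        /* residual I/O count */
          unsigned int    bi_idx;         /* current index into bvl_vec */
          unsigned int    bi_bvec_done;   /* bytes completed in current bvec */
  } __packed __aligned(4);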
Cc: Mikulas Patocka <mpatocka@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 19416123ab3e ("block: define 'struct bvec_iter' as packed")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Functions which can't access the MFRL (Management Firmware Reset Level)
register have no use for fw_reset structures or events. Remove fw_reset
structure allocation and registration for fw reset event notifications
for these functions.
Having the devlink param enable_remote_dev_reset on functions that don't
have this capability is misleading as these functions are not allowed to
influence the reset flow. Hence, this patch removes this parameter for
such functions.
In addition, return not supported on devlink reload action fw_activate
for these functions.
Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The __is_constexpr() macro is dark magic. Shed some light on it with
a comment to explain how and why it works.
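For reference, the macro in question (as found in include/linux/const.h)
relies on the C rules for null pointer constants in the conditional
operator:

  /*
   * If x is an integer constant expression, (long)(x) * 0l is a null
   * pointer constant, so the ?: expression has type int * and
   * sizeof(*...) == sizeof(int). Otherwise the void * arm dominates,
   * sizeof(void) is 1 (GNU extension), and the comparison is false.
   */
  #define __is_constexpr(x) \
          (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))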
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20240301044428.work.411-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A common use of type_max() is to find the max for the type of a
variable. Using the pattern type_max(typeof(var)) is needlessly
verbose. Instead, since typeof(type) == type we can just explicitly
call typeof() on the argument to type_max() and type_min(). Add
wrappers for readability.
We can do some replacements right away:
$ git grep '\btype_\(min\|max\)(typeof' | wc -l
11
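A sketch of the idea, reusing the existing __type_half_max() helper from
overflow.h; the exact names chosen in the patch may differ. Since
typeof(type) == type, the typeof()-applying form works for both types
and variables:

  #define __type_max(T)   ((T)((__type_half_max(T) - 1) + __type_half_max(T)))
  #define type_max(t)     __type_max(typeof(t))
  #define __type_min(T)   ((T)((T)-__type_max(T) - (T)1))
  #define type_min(t)     __type_min(typeof(t))

  /* Both forms are now valid: type_max(u32) and type_max(some_u32_var). */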
Link: https://lore.kernel.org/r/20240301062221.work.840-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the power_supply_class structure to be declared at build
time placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
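A minimal sketch of the pattern (member values illustrative, not the
actual power-supply code):

  static const struct class power_supply_class_example = {
          .name           = "power_supply",
          /* .dev_uevent and friends as in the driver */
  };

  /* class_register() accepts a const struct class * since 43a7206b0963: */
  /* ret = class_register(&power_supply_class_example); */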
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20240301-class_cleanup-power-v1-1-97e0b7bf9c94@marliere.net
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
The x86 architecture has an idle routine for AMD CPUs which are affected
by erratum 400. On the affected CPUs the local APIC timer stops in the
C1E halt state.
It therefore requires tick broadcasting. The invocation of
tick_broadcast_enter()/exit() from this function violates the RCU
constraints because it can end up in lockdep or tracing, which
rightfully triggers a warning.
tick_broadcast_enter()/exit() must be invoked before ct_cpuidle_enter()
and after ct_cpuidle_exit() in default_idle_call().
Add a static branch conditional invocation of tick_broadcast_enter()/exit()
into this function to allow X86 to replace the AMD specific idle code. It's
guarded by a config switch which will be selected by x86. Otherwise it's
a NOOP.
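A sketch of what the described change could look like; the static key
name is illustrative and the whole thing sits behind the x86-selected
config switch:

  DEFINE_STATIC_KEY_FALSE(example_idle_needs_broadcast);

  static void example_idle_call(void)
  {
          /* Broadcast before RCU is told the CPU is idle... */
          if (static_branch_unlikely(&example_idle_needs_broadcast))
                  tick_broadcast_enter();

          ct_cpuidle_enter();
          arch_cpu_idle();
          ct_cpuidle_exit();

          /* ...and undo it only after RCU is watching again. */
          if (static_branch_unlikely(&example_idle_needs_broadcast))
                  tick_broadcast_exit();
  }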
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.266708822@linutronix.de
|
|
Add a small wrapper around blk_stack_limits that allows passing a bdev
for the bottom device and prints an error in case of a misaligned
device. The name fits into the new queue limits API and the intent is
to eventually replace disk_stack_limits.
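A rough sketch of such a wrapper (function name and message are
assumptions, not the actual patch):

  static void example_stack_bdev_limits(struct queue_limits *t,
                                        struct block_device *bdev,
                                        sector_t start, const char *pfx)
  {
          if (blk_stack_limits(t, &bdev_get_queue(bdev)->limits,
                               start + bdev->bd_start_sect))
                  pr_notice("%s: Warning: device %pg is misaligned\n",
                            pfx, bdev);
  }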
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240228225653.947152-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a small wrapper around queue_limits_commit_update for stacking
drivers that don't want to update existing limits, but set an
entirely new set.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240228225653.947152-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Chain RDMA Writes that convey Write chunks onto the local Send
chain. This means all WRs for an RPC Reply are now posted with a
single ib_post_send() call, and there is a single Send completion
when all of these are done. That reduces both the per-transport
doorbell rate and completion rate.
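A simplified sketch of the chaining idea (not the actual svcrdma code):
link the Write WRs ahead of the Send WR and post the whole chain once.

  static int example_post_reply(struct ib_qp *qp,
                                struct ib_send_wr *write_chain,
                                struct ib_send_wr *send_wr)
  {
          struct ib_send_wr *wr = write_chain;
          const struct ib_send_wr *bad_wr;

          /* Append the Send WR after the last Write WR. */
          while (wr->next)
                  wr = wr->next;
          wr->next = send_wr;

          /* Only the trailing Send WR needs to generate a completion. */
          send_wr->send_flags |= IB_SEND_SIGNALED;

          return ib_post_send(qp, write_chain, &bad_wr);
  }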
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Refactor to eventually enable svcrdma to post the Write WRs for each
RPC response using the same ib_post_send() as the Send WR (i.e., as a
single WR chain).
svc_rdma_result_payload (originally svc_rdma_read_payload) was added
so that the upper layer XDR encoder could identify a range of bytes
to be possibly conveyed by RDMA (if a Write chunk was provided by
the client).
The purpose of commit f6ad77590a5d ("svcrdma: Post RDMA Writes while
XDR encoding replies") was to post as much of the result payload
outside of svc_rdma_sendto() as possible because svc_rdma_sendto()
used to be called with the xpt_mutex held.
However, since commit ca4faf543a33 ("SUNRPC: Move xpt_mutex into
socket xpo_sendto methods"), the xpt_mutex is no longer held when
calling svc_rdma_sendto(). Thus, that benefit is no longer an issue.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Reduce the doorbell and Send completion rates when sending RPC/RDMA
replies that have Reply chunks. NFS READDIR procedures typically
return their result in a Reply chunk, for example.
Instead of calling ib_post_send() to post the Write WRs for the
Reply chunk, and then calling it again to post the Send WR that
conveys the transport header, chain the Write WRs to the Send WR
and call ib_post_send() only once.
Thanks to the Send Queue completion ordering rules, when the Send
WR completes, that guarantees that Write WRs posted before it have
also completed successfully. Thus all Write WRs for the Reply chunk
can remain unsignaled. Instead of handling a Write completion and
then a Send completion, only the Send completion is seen, and it
handles clean up for both the Writes and the Send.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since the RPC transaction's svc_rdma_send_ctxt will stay around for
the duration of the RDMA Write operation, the write_info structure
for the Reply chunk can reside in the request's svc_rdma_send_ctxt
instead of being allocated separately.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Eventually I'd like the server to post the reply's Send WR along
with any Write WRs using only a single call to ib_post_send(), in
order to reduce the NIC's doorbell rate.
To do this, add an anchor for a WR chain to svc_rdma_send_ctxt, and
refactor svc_rdma_send() to post this WR chain to the Send Queue. For
the moment, the posted chain will continue to contain a single Send
WR.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Now that this isn't used anywhere, remove it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Since only one service actually reports the rpc stats, there's not much
of a reason to have a pointer to it in the svc_program struct. Adjust
the svc_create_pooled function to take the sv_stats as an argument and
pass the struct through there as desired instead of getting it from the
svc_program->pg_stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Make pointer to fwnode_handle a pointer to const for code safety.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args. Make the argument
pointer to const for code safety and readability.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Moving pidfds from the anonymous inode infrastructure to a separate tiny
in-kernel filesystem similar to sockfs, pipefs, and anon_inodefs causes
selinux denials and thus makes various userspace components that make
heavy use of pidfds fail, because pidfds used anon_inode_getfile(),
which isn't subject to any LSM hooks. But dentry_open() is, and that
would cause regressions.
The failures that are seen are selinux denials. But the core failure is
dbus-broker. That cascades into other services failing that depend on
dbus-broker. For example, when dbus-broker fails to start polkit and all
the others won't be able to work because they depend on dbus-broker.
The reason for dbus-broker failing is because it doesn't handle failures
for SO_PEERPIDFD correctly. Last kernel release we introduced
SO_PEERPIDFD (and SCM_PIDFD). SO_PEERPIDFD allows dbus-broker and polkit
and others to receive a pidfd for the peer of an AF_UNIX socket. This is
the first time in the history of Linux that we can safely authenticate
clients in a race-free manner.
dbus-broker immediately made use of this but messed up the error
checking. It only allowed EINVAL as a valid failure for SO_PEERPIDFD.
That's obviously problematic not just because of LSM denials but because
of seccomp denials that would prevent SO_PEERPIDFD from working; or any
other new error code from there.
So this is catching a flawed implementation in dbus-broker as well. It
has to fallback to the old pid-based authentication when SO_PEERPIDFD
doesn't work no matter the reasons otherwise it'll always risk such
failures. So overall that LSM denial should not have caused dbus-broker
to fail. It can never assume that a feature released only one kernel
release ago, like SO_PEERPIDFD, is available.
So, the next fix separate from the selinux policy update is to try and
fix dbus-broker at [3]. That should make it into Fedora as well. In
addition the selinux reference policy should also be updated. See [4]
for that. If SELinux is in enforcing mode in userspace and it encounters
anything that it doesn't know about, it will deny it by default. And the
policy is entirely in userspace, including declaring new types for stuff
like nsfs or pidfs to allow it.
For now we continue to raise S_PRIVATE on the inode if it's a pidfs
inode which means things behave exactly like before.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2265630
Link: https://github.com/fedora-selinux/selinux-policy/pull/2050
Link: https://github.com/bus1/dbus-broker/pull/343 [3]
Link: https://github.com/SELinuxProject/refpolicy/pull/762 [4]
Reported-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240222190334.GA412503@dev-arch.thelio-3990X
Link: https://lore.kernel.org/r/20240218-neufahrzeuge-brauhaus-fb0eb6459771@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use the newly added path_from_stashed() helper for nsfs.
Link: https://lore.kernel.org/r/20240218-neufahrzeuge-brauhaus-fb0eb6459771@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This moves pidfds from the anonymous inode infrastructure to a tiny
pseudo filesystem. This has been on my todo for quite a while as it will
unblock further work that we weren't able to do simply because of the
very justified limitations of anonymous inodes. Moving pidfds to a tiny
pseudo filesystem allows:
* statx() on pidfds becomes useful for the first time.
* pidfds can be compared simply via statx() and then comparing inode
numbers.
* pidfds have unique inode numbers for the system lifetime.
* struct pid is now stashed in inode->i_private instead of
file->private_data. This means it is now possible to introduce
concepts that operate on a process once all file descriptors have been
closed. A concrete example is kill-on-last-close.
* file->private_data is freed up for per-file options for pidfds.
* Each struct pid will refer to a different inode, but the same struct
pid will refer to the same inode if it's opened multiple times. This is
in contrast to now, where all struct pids refer to the same inode. Even
if we were to move to anon_inode_create_getfile(), which creates new
inodes, we'd still be associating the same struct pid with multiple
different inodes.
The tiny pseudo filesystem is not visible anywhere in userspace exactly
like e.g., pipefs and sockfs. There's no lookup, there's no complex
inode operations, nothing. Dentries and inodes are always deleted when
the last pidfd is closed.
We allocate a new inode for each struct pid and we reuse that inode for
all pidfds. We use iget_locked() to find that inode again based on the
inode number which isn't recycled. We allocate a new dentry for each
pidfd that uses the same inode. That is similar to anonymous inodes
which reuse the same inode for thousands of dentries. For pidfds we're
talking way less than that. There usually won't be a lot of concurrent
openers of the same struct pid. They can probably often be counted on
two hands. I know that systemd does use separate pidfd for the same
struct pid for various complex process tracking issues. So I think with
that things actually become way simpler. Especially because we don't
have to care about lookup. Dentries and inodes continue to be always
deleted.
The code is entirely optional and fairly small. If it's not selected we
fallback to anonymous inodes. Heavily inspired by nsfs which uses a
similar stashing mechanism just for namespaces.
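A minimal sketch of such an internal-only pseudo filesystem (names and
the magic value are assumptions, not the actual pidfs code):

  static int pidfs_example_init_fs_context(struct fs_context *fc)
  {
          struct pseudo_fs_context *ctx;

          ctx = init_pseudo(fc, 0x50494446 /* "PIDF", assumed magic */);
          if (!ctx)
                  return -ENOMEM;
          return 0;
  }

  static struct file_system_type pidfs_example_type = {
          .name                   = "pidfs",
          .init_fs_context        = pidfs_example_init_fs_context,
          .kill_sb                = kill_anon_super,
  };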
Link: https://lore.kernel.org/r/20240213-vfs-pidfd_fs-v1-2-f863f58cfce1@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
It is now possible to disable BQL, but that causes the cpsw driver to break:
drivers/net/ethernet/ti/am65-cpsw-nuss.c:297:28: error: no member named 'dql' in 'struct netdev_queue'
297 | dql_avail(&netif_txq->dql),
There is already a helper function in net/sch_generic.h that could
be used to help here. Move its implementation into the common
linux/netdevice.h along with the other bql interfaces and change
both users over to the new interface.
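Such a helper could look roughly like this (name assumed); when
CONFIG_BQL is disabled it degrades to a constant, so callers like the
cpsw driver need no #ifdef:

  static inline int example_queue_dql_avail(const struct netdev_queue *txq)
  {
  #ifdef CONFIG_BQL
          return dql_avail(&txq->dql);
  #else
          return 0;
  #endif
  }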
Fixes: ea7f3cfaa588 ("net: bql: allow the config to be disabled")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is no need to wrap calls to the no_printk() helper inside an
always-false check, as no_printk() already does that internally.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPv6 TX and RX fast path use the following fields:
- disable_ipv6
- hop_limit
- mtu6
- forwarding
- disable_policy
- proxy_ndp
Place them in a group to increase data locality.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The VMBUS_RING_SIZE macro adds space for a ring buffer header to the
requested ring buffer size. The header size is always 1 page, and so
its size varies based on the PAGE_SIZE for which the kernel is built.
If the requested ring buffer size is a large power-of-2 size and the header
size is small, the resulting size is inefficient in its use of memory.
For example, a 512 Kbyte ring buffer with a 4 Kbyte page size results in
a 516 Kbyte allocation, which is rounded up to 1 Mbyte by the memory
allocator, and wastes 508 Kbytes of memory.
In such situations, the exact size of the ring buffer isn't that important,
and it's OK to allocate the 4 Kbyte header at the beginning of the 512
Kbytes, leaving the ring buffer itself with just 508 Kbytes. The memory
allocation can be 512 Kbytes instead of 1 Mbyte and nothing is wasted.
Update VMBUS_RING_SIZE to implement this approach for "large" ring buffer
sizes. "Large" is somewhat arbitrarily defined as 8 times the size of
the ring buffer header (which is of size PAGE_SIZE). For example, for
4 Kbyte PAGE_SIZE, ring buffers of 32 Kbytes and larger use the first
4 Kbytes as the ring buffer header. For 64 Kbyte PAGE_SIZE, ring buffers
of 512 Kbytes and larger use the first 64 Kbytes as the ring buffer
header. In both cases, smaller sizes add space for the header so
the ring size isn't reduced too much by using part of the space for
the header. For example, with a 64 Kbyte page size, we don't want
a 128 Kbyte ring buffer to be reduced to 64 Kbytes by allocating half
of the space for the header. In such a case, the memory allocation
is less efficient, but it's the best that can be done.
While the new algorithm slightly changes the amount of space allocated
for ring buffers by drivers that use VMBUS_RING_SIZE, the devices aren't
known to be sensitive to small changes in ring buffer size, so there
shouldn't be any effect.
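The sizing rule described above boils down to something like the
following sketch (not the exact macro from the patch):

  /* "hdr" is the PAGE_SIZE-sized ring buffer header. */
  static inline unsigned long example_vmbus_ring_size(unsigned long payload)
  {
          unsigned long hdr = PAGE_SIZE;

          if (payload >= 8 * hdr)
                  /* "large": carve the header out of the requested size */
                  return PAGE_ALIGN(payload);

          /* small: add space for the header so the ring isn't shrunk much */
          return PAGE_ALIGN(payload + hdr);
  }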
Fixes: c1135c7fd0e9 ("Drivers: hv: vmbus: Introduce types of GPADL")
Fixes: 6941f67ad37d ("hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218502
Cc: stable@vger.kernel.org
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Tested-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Link: https://lore.kernel.org/r/20240229004533.313662-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240229004533.313662-1-mhklinux@outlook.com>
|
|
The new flags parameter allows controlling
- Whether or not the units suffix is separated by a space, for
compatibility with sort -h
- Whether or not to append a B suffix - we're not always printing
bytes.
Co-developed-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20240229205345.93902-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
net/mptcp/protocol.c
adf1bb78dab5 ("mptcp: fix snd_wnd initialization for passive socket")
9426ce476a70 ("mptcp: annotate lockless access for RX path fields")
https://lore.kernel.org/all/20240228103048.19255709@canb.auug.org.au/
Adjacent changes:
drivers/dpll/dpll_core.c
0d60d8df6f49 ("dpll: rely on rcu for netdev_dpll_pin()")
e7f8df0e81bf ("dpll: move xa_erase() call in to match dpll_pin_alloc() error path order")
drivers/net/veth.c
1ce7d306ea63 ("veth: try harder when allocating queue memory")
0bef512012b1 ("net: add netdev_lockdep_set_classes() to virtual drivers")
drivers/net/wireless/intel/iwlwifi/mvm/d3.c
8c9bef26e98b ("wifi: iwlwifi: mvm: d3: implement suspend with MLO")
78f65fbf421a ("wifi: iwlwifi: mvm: ensure offloading TID queue exists")
net/wireless/nl80211.c
f78c1375339a ("wifi: nl80211: reject iftype change with mesh ID change")
414532d8aa89 ("wifi: cfg80211: use IEEE80211_MAX_MESH_ID_LEN appropriately")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Boqun pointed out that workqueues aren't handling BH work items on offlined
CPUs. Unlike tasklet which transfers out the pending tasks from
CPUHP_SOFTIRQ_DEAD, BH workqueue would just leave them pending which is
problematic. Note that this behavior is specific to BH workqueues as the
non-BH per-CPU workers just become unbound when the CPU goes offline.
This patch fixes the issue by draining the pending BH work items from an
offlined CPU from CPUHP_SOFTIRQ_DEAD. Because work items carry more context,
it's not as easy to transfer the pending work items from one pool to
another. Instead, the BH work items pending in the offlined pools are
run on an online CPU.
Note that this assumes that no further BH work items will be queued on the
offlined CPUs. This assumption is shared with tasklet and should be fine for
conversions. However, this issue also exists for per-CPU workqueues which
will just keep executing work items queued after CPU offline on unbound
workers and workqueue should reject per-CPU and BH work items queued on
offline CPUs. This will be addressed separately later.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Link: http://lkml.kernel.org/r/Zdvw0HdSXcU3JZ4g@boqun-archlinux
|
|
The check_shl_overflow() macro uses the u64 type that is defined in
types.h. Instead of including that header, just switch to using the POD
type directly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240228204919.3680786-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
lib/cmdline.c is basically a set of small string parsers which are
widely used in the kernel. Their prototypes belong in string.h rather
than kernel.h.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20231003130142.2936503-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Improve the reporting of buffer overflows under CONFIG_FORTIFY_SOURCE to
help accelerate debugging efforts. The calculations are all just sitting
in registers anyway, so pass them along to the function to be reported.
For example, before:
detected buffer overflow in memcpy
and after:
memcpy: detected buffer overflow: 4096 byte read of buffer size 1
Link: https://lore.kernel.org/r/20230407192717.636137-10-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
The standard C string APIs were not designed to have a failure mode;
they were expected to always succeed without memory safety issues.
Normally, CONFIG_FORTIFY_SOURCE will use fortify_panic() to stop
processing, as truncating a read or write may provide an even worse
system state. However, this creates a problem for testing under things
like KUnit, which needs a way to survive failures.
When building with CONFIG_KUNIT, provide a failure path for all users
of fortify_panic, and track whether the failure was a read overflow or
a write overflow, for KUnit tests to examine. Inspired by similar logic
in the slab tests.
Signed-off-by: Kees Cook <keescook@chromium.org>
|