Age | Commit message (Collapse) | Author |
|
Cross-merge BPF and other fixes after downstream PR.
No conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN, netfilter and wireless.
We have an IPv6 routing regression with the relevant fix still a WiP.
This includes a last-minute revert to avoid more problems.
Current release - new code bugs:
- wifi: nl80211: completely disable per-link stats for now
Previous releases - regressions:
- dev_ioctl: take ops lock in hwtstamp lower paths
- netfilter:
- fix spurious set lookup failures
- fix lockdep splat due to missing annotation
- genetlink: fix genl_bind() invoking bind() after -EPERM
- phy: transfer phy_config_inband() locking responsibility to phylink
- can: xilinx_can: fix use-after-free of transmitted SKB
- hsr: fix lock warnings
- eth:
- igb: fix NULL pointer dereference in ethtool loopback test
- i40e: fix Jumbo Frame support after iPXE boot
- macsec: sync features on RTM_NEWLINK
Previous releases - always broken:
- tunnels: reset the GSO metadata before reusing the skb
- mptcp: make sync_socket_options propagate SOCK_KEEPOPEN
- can: j1939: implement NETDEV_UNREGISTER notification handler
- wifi: ath12k: fix WMI TLV header misalignment"
* tag 'net-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
Revert "net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups"
hsr: hold rcu and dev lock for hsr_get_port_ndev
hsr: use hsr_for_each_port_rtnl in hsr_port_get_hsr
hsr: use rtnl lock when iterating over ports
wifi: nl80211: completely disable per-link stats for now
net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups
net: ethtool: fix wrong type used in struct kernel_ethtool_ts_info
MAINTAINERS: add Phil as netfilter reviewer
netfilter: nf_tables: restart set lookup on base_seq change
netfilter: nf_tables: make nft_set_do_lookup available unconditionally
netfilter: nf_tables: place base_seq in struct net
netfilter: nft_set_rbtree: continue traversal if element is inactive
netfilter: nft_set_pipapo: don't check genbit from packetpath lookups
netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation
can: rcar_can: rcar_can_resume(): fix s2ram with PSCI
can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB
can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails
can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed
can: j1939: implement NETDEV_UNREGISTER notification handler
selftests: can: enable CONFIG_CAN_VCAN as a module
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a nasty hibernation regression introduced during the 6.16
cycle, an issue related to energy model management occurring on Intel
hybrid systems where some CPUs are offline to start with, and two
regressions in the amd-pstate driver:
- Restore a pm_restrict_gfp_mask() call in hibernation_snapshot()
that was removed incorrectly during the 6.16 development cycle
(Rafael Wysocki)
- Introduce a function for registering a perf domain without
triggering a system-wide CPU capacity update and make the
intel_pstate driver use it to avoid recurring unsuccessful
attempts to update capacities of all CPUs in the system (Rafael
Wysocki)
- Fix setting of CPPC.min_perf in the active mode with performance
governor in the amd-pstate driver to restore its expected behavior
changed recently (Gautham Shenoy)
- Avoid mistakenly setting EPP to 0 in the amd-pstate driver after
system resume as a result of recent code changes (Mario
Limonciello)"
* tag 'pm-6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: hibernate: Restrict GFP mask in hibernation_snapshot()
PM: EM: Add function for registering a PD without capacity update
cpufreq/amd-pstate: Fix a regression leading to EPP 0 after resume
cpufreq/amd-pstate: Fix setting of CPPC.min_perf in active mode for performance governor
|
|
The Texas Instruments BQ25703A is an integrated charger manager and
boost converter.
The MFD driver initializes the device for the regulator driver
and power supply driver.
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://lore.kernel.org/r/20250904160530.66178-3-macroalpha82@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
mc13xxx_irq_ack() got deprecated and became dead code with commit
10f9edaeaa30 ("mfd: mc13xxx: Use regmap irq framework for interrupts").
It should be safe to remove it now.
Signed-off-by: Alexander Kurz <akurz@blala.de>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> # for input
Link: https://lore.kernel.org/r/20250811064358.1659-1-akurz@blala.de
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Recent changes to genpd prevent those PM domains being powered-on during
initialization from being powered-off during the boot sequence. Based upon
whether CONFIG_PM_GENERIC_DOMAINS_OF is set or not, genpd relies
on the sync_state mechanism or the genpd_power_off_unused() (which is a
late_initcall_sync), to understand when it's okay to allow these PM domains
to be powered-off.
This new behaviour in genpd has led to problems on different platforms.
Let's therefore restore the behavior of genpd_power_off_unused().
Moreover, let's introduce GENPD_FLAG_NO_STAY_ON, to allow genpd OF
providers to opt-out from the new behaviour.
Link: https://lore.kernel.org/all/20250701114733.636510-1-ulf.hansson@linaro.org/
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/20250902-rk3576-lockup-regression-v1-1-c4a0c9daeb00@collabora.com/
Reported-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Fixes: 0e789b491ba0 ("pmdomain: core: Leave powered-on genpds on until sync_state")
Fixes: 13a4b7fb6260 ("pmdomain: core: Leave powered-on genpds on until late_initcall_sync")
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
Add tee_shm_alloc_dma_mem() to allocate DMA memory. The memory is
represented by a tee_shm object using the new flag TEE_SHM_DMA_MEM to
identify it as DMA memory. The allocated memory will later be lent to
the TEE to be used as protected memory.
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
Add a userspace API to create a tee_shm object that refers to a dmabuf
reference.
Userspace registers the dmabuf file descriptor in a tee_shm object.
The registration is completed with a tee_shm returned file descriptor.
Userspace is free to close the dmabuf file descriptor after it has been
registered since all the resources are now held via the new tee_shm
object.
Closing the tee_shm file descriptor will eventually release all
resources used by the tee_shm object when all references are released.
The new IOCTL, TEE_IOC_SHM_REGISTER_FD, supports dmabuf references to
physically contiguous memory buffers. Dmabuf references acquired from
the TEE DMA-heap can be used as protected memory for Secure Video Path
and such use cases. It depends on the TEE and the TEE driver if dmabuf
references acquired by other means can be used.
A new tee_shm flag is added to identify tee_shm objects built from a
registered dmabuf, TEE_SHM_DMA_BUF.
Signed-off-by: Etienne Carriere <etienne.carriere@foss.st.com>
Signed-off-by: Olivier Masse <olivier.masse@nxp.com>
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
Implement DMA heap for protected DMA-buf allocation in the TEE
subsystem.
Protected memory refers to memory buffers behind a hardware enforced
firewall. It is not accessible to the kernel under normal circumstances
but rather only accessible to certain hardware IPs or CPUs executing in
higher or differently privileged mode than the kernel itself. This
interface allows such protected memory buffers to be allocated and managed
via interaction with a TEE implementation.
The protected memory is allocated for a specific use-case, like Secure
Video Playback, Trusted UI, or Secure Video Recording where certain
hardware devices can access the memory.
The DMA-heaps are enabled explicitly by the TEE backend driver. The TEE
backend drivers need to implement a protected memory pool to manage the
protected memory.
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 15 are cc:stable and the remainder address post-6.16
issues or aren't considered necessary for -stable kernels. 14 of these
fixes are for MM.
This includes
- kexec fixes from Breno for a recently introduced
use-uninitialized bug
- DAMON fixes from Quanmin Yan to avoid div-by-zero crashes
which can occur if the operator uses poorly-chosen insmod
parameters
and misc singleton fixes"
* tag 'mm-hotfixes-stable-2025-09-10-20-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
MAINTAINERS: add tree entry to numa memblocks and emulation block
mm/damon/sysfs: fix use-after-free in state_show()
proc: fix type confusion in pde_set_flags()
compiler-clang.h: define __SANITIZE_*__ macros only when undefined
mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()
ocfs2: fix recursive semaphore deadlock in fiemap call
mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
mm/mremap: fix regression in vrm->new_addr check
percpu: fix race on alloc failed warning limit
mm/memory-failure: fix redundant updates for already poisoned pages
s390: kexec: initialize kexec_buf struct
riscv: kexec: initialize kexec_buf struct
arm64: kexec: initialize kexec_buf struct in load_other_segments()
mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
mm/damon/core: set quota->charged_from to jiffies at first charge window
mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()
init/main.c: fix boot time tracing crash
mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range()
mm/khugepaged: fix the address passed to notifier on testing young
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vmscape mitigation fixes from Dave Hansen:
"Mitigate vmscape issue with indirect branch predictor flushes.
vmscape is a vulnerability that essentially takes Spectre-v2 and
attacks host userspace from a guest. It particularly affects
hypervisors like QEMU.
Even if a hypervisor may not have any sensitive data like disk
encryption keys, guest-userspace may be able to attack the
guest-kernel using the hypervisor as a confused deputy.
There are many ways to mitigate vmscape using the existing Spectre-v2
defenses like IBRS variants or the IBPB flushes. This series focuses
solely on IBPB because it works universally across vendors and all
vulnerable processors. Further work doing vendor and model-specific
optimizations can build on top of this if needed / wanted.
Do the normal issue mitigation dance:
- Add the CPU bug boilerplate
- Add a list of vulnerable CPUs
- Use IBPB to flush the branch predictors after running guests"
* tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmscape: Add old Intel CPUs to affected list
x86/vmscape: Warn when STIBP is disabled with SMT
x86/bugs: Move cpu_bugs_smt_update() down
x86/vmscape: Enable the mitigation
x86/vmscape: Add conditional IBPB mitigation
x86/vmscape: Enumerate VMSCAPE bug
Documentation/hw-vuln: Add VMSCAPE documentation
|
|
To ensure successful builds when CONFIG_IMX_SCMI_CPU_DRV is not enabled,
this patch adds static inline stub implementations for the following
functions:
- scmi_imx_cpu_start()
- scmi_imx_cpu_started()
- scmi_imx_cpu_reset_vector_set()
These stubs return -EOPNOTSUPP to indicate that the functionality is not
supported in the current configuration. This avoids potential build or
link errors in code that conditionally calls these functions based on
feature availability.
Fixes: 1055faa5d660 ("firmware: imx: Add i.MX95 SCMI CPU driver")
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
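The stub pattern described in this commit can be sketched in plain C as follows. This is a minimal illustration, not the actual header: DEMO_FEATURE_ENABLED, demo_cpu_start() and demo_boot_cpu() are hypothetical names standing in for the Kconfig symbol and the scmi_imx_cpu_*() functions, and a kernel header would typically test the option with IS_ENABLED() rather than a bare #ifdef.

```c
#include <errno.h>

/*
 * Hypothetical feature switch standing in for a Kconfig symbol such as
 * CONFIG_IMX_SCMI_CPU_DRV. When it is defined, the real implementation
 * is expected to be provided by another object file.
 */
#ifdef DEMO_FEATURE_ENABLED
int demo_cpu_start(unsigned int cpuid, int start);
#else
/* Stub used when the feature is disabled, so callers still compile and link. */
static inline int demo_cpu_start(unsigned int cpuid, int start)
{
	(void)cpuid;
	(void)start;
	return -EOPNOTSUPP;
}
#endif

/* A caller needs no #ifdef of its own; it only handles the error code. */
static inline int demo_boot_cpu(unsigned int cpuid)
{
	int ret = demo_cpu_start(cpuid, 1);

	/* Treat "feature not built" as "nothing to do" in this sketch. */
	return ret == -EOPNOTSUPP ? 0 : ret;
}
```

Because the stub is static inline, every translation unit that includes the header gets its own copy and no symbol has to be resolved at link time, which is what avoids the link error when the provider is disabled but a consumer is built.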
|
|
To ensure successful builds when CONFIG_IMX_SCMI_LMM_DRV is not enabled,
this patch adds static inline stub implementations for the following
functions:
- scmi_imx_lmm_operation()
- scmi_imx_lmm_info()
- scmi_imx_lmm_reset_vector_set()
These stubs return -EOPNOTSUPP to indicate that the functionality is not
supported in the current configuration. This avoids potential build or
link errors in code that conditionally calls these functions based on
feature availability.
Fixes: 7242bbf418f0 ("firmware: imx: Add i.MX95 SCMI LMM driver")
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
To ensure successful builds when CONFIG_IMX_SCMI_MISC_DRV is not enabled,
this patch adds static inline stub implementations for the following
functions:
- scmi_imx_misc_ctrl_get()
- scmi_imx_misc_ctrl_set()
These stubs return -EOPNOTSUPP to indicate that the functionality is not
supported in the current configuration. This avoids potential build or
link errors in code that conditionally calls these functions based on
feature availability.
This patch also drops the changes in commit 540c830212ed ("firmware: imx:
remove duplicate scmi_imx_misc_ctrl_get()").
The original change aimed to simplify the handling of optional features by
removing conditional stubs. However, the use of conditional stubs is
necessary when CONFIG_IMX_SCMI_MISC_DRV is n, while a consumer driver is
set to y.
This is not a matter of preserving legacy patterns, but rather of ensuring
that there is no link error for either modular or built-in builds.
Fixes: 0b4f8a68b292 ("firmware: imx: Add i.MX95 MISC driver")
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
In C, enumerated types do not have a defined size, apart from being
compatible with one of the standard types. This allows an ABI /
compiler to choose the type of an enum depending on the values it
needs to store, and storing larger values in it can lead to undefined
behaviour.
The tx_type and rx_filters members of struct kernel_ethtool_ts_info
are defined as enumerated types, but are bit arrays, where each bit
is defined by the enumerated type. This means they typically store
values in excess of the maximum value of the enumerated type, in
fact (1 << max_value) and thus must not be declared using the
enumerated type.
Fix both of these to use u32, as per the corresponding __u32 UAPI type.
Fixes: 2111375b85ad ("net: Add struct kernel_ethtool_ts_info")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/E1uvMEK-00000003Amd-2pWR@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
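The problem described above can be illustrated with a small userspace sketch. The enum and struct names here are hypothetical stand-ins, not the kernel definitions; the point is only that a bit array indexed by enum values holds numbers larger than the largest enumerator, so the field is declared as a fixed-width unsigned type.

```c
#include <stdint.h>

/* Hypothetical enumeration of bit positions, modelled on the description. */
enum demo_tx_type {
	DEMO_TX_OFF,
	DEMO_TX_ON,
	DEMO_TX_ONESTEP_SYNC,
	DEMO_TX_MAX = DEMO_TX_ONESTEP_SYNC,
};

struct demo_ts_info {
	/* Bit array with one bit per enum demo_tx_type value, hence not an enum. */
	uint32_t tx_types;
};

/* Set the bit that corresponds to enumerator t. */
static inline void demo_set_tx(struct demo_ts_info *info, enum demo_tx_type t)
{
	info->tx_types |= UINT32_C(1) << t;
}

/* Test the bit that corresponds to enumerator t. */
static inline int demo_has_tx(const struct demo_ts_info *info,
			      enum demo_tx_type t)
{
	return (info->tx_types >> t) & 1;
}
```

Setting only DEMO_TX_ONESTEP_SYNC (value 2) stores 1 << 2 = 4 in the field, which already exceeds DEMO_TX_MAX and so is not a value the enumerated type is guaranteed to represent.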
|
|
to cgroup.procs
The static usage pattern of creating a cgroup, enabling controllers,
and then seeding it with CLONE_INTO_CGROUP doesn't require write
locking cgroup_threadgroup_rwsem and thus doesn't benefit from this
patch.
To avoid affecting other users, the per threadgroup rwsem is only used
when favordynmods is enabled.
As computer hardware advances, modern systems are typically equipped
with many CPU cores and large amounts of memory, enabling the deployment
of numerous applications. On such systems, container creation and
deletion become frequent operations, making cgroup process migration no
longer a cold path. This leads to noticeable contention with common
process operations such as fork, exec, and exit.
To alleviate the contention between cgroup process migration and
operations like process fork, this patch changes the locking to take the
write lock on signal_struct->group_rwsem when writing a pid to
cgroup.procs/threads instead of holding a global write lock.
Cgroup process migration has historically relied on
signal_struct->group_rwsem to protect thread group integrity. In commit
<1ed1328792ff> ("sched, cgroup: replace signal_struct->group_rwsem with
a global percpu_rwsem"), this was changed to a global
cgroup_threadgroup_rwsem. The advantage of using a global lock was
simplified handling of process group migrations. This patch retains the
use of the global lock for protecting process group migration, while
reducing contention by using a per-thread-group lock during
cgroup.procs/threads writes.
The locking behavior is as follows:
write cgroup.procs/threads  | process fork,exec,exit | process group migration
------------------------------------------------------------------------------
cgroup_lock()               | down_read(&g_rwsem)    | cgroup_lock()
down_write(&p_rwsem)        | down_read(&p_rwsem)    | down_write(&g_rwsem)
critical section            | critical section       | critical section
up_write(&p_rwsem)          | up_read(&p_rwsem)      | up_write(&g_rwsem)
cgroup_unlock()             | up_read(&g_rwsem)      | cgroup_unlock()
g_rwsem denotes cgroup_threadgroup_rwsem, p_rwsem denotes
signal_struct->group_rwsem.
This patch eliminates contention between cgroup migration and fork
operations for threads that belong to different thread groups, thereby
reducing the long-tail latency of cgroup migrations and lowering system
load.
With this patch, under heavy fork and exec interference, the long-tail
latency of cgroup migration has been reduced from milliseconds to
microseconds. Under heavy cgroup migration interference, the multi-CPU
score of the spawn test case in UnixBench increased by 9%.
tj: Update comment in cgroup_favor_dynmods() and switch WARN_ONCE() to
pr_warn_once().
Signed-off-by: Yi Tao <escape@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Dynamic cgroup migration involving threadgroup locks can be in one of
two states: no lock held, or holding the global lock. Explicitly
declare the different lock modes to make the code easier to
understand and to facilitate future extensions of the lock modes.
Signed-off-by: Yi Tao <escape@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add scoped versions of fwnode child node iterators that automatically
handle reference counting cleanup using the __free() attribute:
- fwnode_for_each_child_node_scoped()
- fwnode_for_each_available_child_node_scoped()
These macros follow the same pattern as existing scoped iterators in the
kernel, ensuring fwnode references are automatically released when the
iterator variable goes out of scope. This prevents resource leaks and
eliminates the need for manual cleanup in error paths.
The implementation mirrors the non-scoped variants but uses
__free(fwnode_handle) for automatic resource management, providing a
safer and more convenient interface for drivers iterating over firmware
node children.
Signed-off-by: Jean-François Lessard <jefflessard3@gmail.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
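The scoped-iterator idea from this commit can be sketched in userspace with the GCC/Clang cleanup attribute, which is the compiler feature the kernel's __free() builds on. Everything named demo_* below is a hypothetical stand-in; the real macros operate on struct fwnode_handle and its get/put helpers.

```c
#include <stddef.h>

/* Hypothetical reference-counted node with a sibling link. */
struct demo_node {
	int refcount;
	struct demo_node *next;
};

static struct demo_node *demo_get(struct demo_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void demo_put(struct demo_node *n)
{
	if (n)
		n->refcount--;
}

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void demo_put_cleanup(struct demo_node **np)
{
	demo_put(*np);
}

/* Take a reference on the next child and drop the one held on prev. */
static struct demo_node *demo_next_child(struct demo_node *first,
					 struct demo_node *prev)
{
	struct demo_node *next = demo_get(prev ? prev->next : first);

	demo_put(prev);
	return next;
}

/* Scoped iterator: the loop variable is released on any exit from the loop. */
#define demo_for_each_child_scoped(first, child)			\
	for (struct demo_node *child					\
		__attribute__((cleanup(demo_put_cleanup))) =		\
			demo_next_child(first, NULL);			\
	     child; child = demo_next_child(first, child))

/* Early return inside the loop without an explicit demo_put(). */
static int demo_find(struct demo_node *first, const struct demo_node *wanted)
{
	int idx = 0;

	demo_for_each_child_scoped(first, child) {
		if (child == wanted)
			return idx;
		idx++;
	}
	return -1;
}
```

With a non-scoped iterator, the early return in demo_find() would leak the reference held on the current child unless the caller remembered to drop it; here the cleanup handler runs automatically when the loop variable goes out of scope.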
|
|
Catching up with some display dependencies.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
No functional changes are intended. Some drivers, like mdraid, split
bios in their own internal processing; this prepares for unifying the
bio split code.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
bio->issue_time_ns is initialized for every bio; however, it's only used
by blk-iolatency. Add a new queue_flag and only set this flag when
blk-iolatency is enabled, so that the extra blk_time_get_ns() call can be
saved for disks on which blk-iolatency is not enabled.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now that bio->bi_issue is only used by blk-iolatency to get bio issue
time, replace bio_issue with u64 time directly and remove bio_issue to
make code cleaner.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The intel_pstate driver manages CPU capacity changes itself and it does
not need an update of the capacity of all CPUs in the system to be
carried out after registering a PD.
Moreover, in some configurations (for instance, an SMT-capable
hybrid x86 system booted with nosmt in the kernel command line) the
em_check_capacity_update() call at the end of em_dev_register_perf_domain()
always fails and reschedules itself to run once again in 1 s, so
effectively it runs in vain every 1 s forever.
To address this, introduce a new variant of em_dev_register_perf_domain(),
called em_dev_register_pd_no_update(), that does not invoke
em_check_capacity_update(), and make intel_pstate use it instead of the
original.
Fixes: 7b010f9b9061 ("cpufreq: intel_pstate: EAS support for hybrid platforms")
Closes: https://lore.kernel.org/linux-pm/40212796-734c-4140-8a85-854f72b8144d@panix.com/
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We only need to consider data and metadata dma mapping types separately.
The request and bio integrity payload have enough flag bits to
internally track the mapping type for each. Use these so the caller
doesn't need to track them, and provide separate request and integrity
helpers to the common code. This will make it easier to scale new
mappings, like the proposed MMIO attribute, without burdening the caller
to track such things.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No more callers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
No more callers.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The block layer tries to align bio vectors to the block device's logical
block size. Some cases don't have a block device, or we may need to
align to something larger, which we can't derive from the queue
limits. Have the caller specify what they want, or allow any length
alignment if nothing was specified. Since the most common use case
relies on the block device's limits, a helper function is provided.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
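A caller-supplied length alignment of the kind described in this commit can be modelled with a simple mask check. This is a sketch under assumptions: representing the requirement as a power-of-two mask, with 0 meaning "any length", is an illustrative choice, and the demo_* helpers are hypothetical rather than the block layer's interface.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * len_align_mask is (alignment - 1) for a power-of-two alignment.
 * A mask of 0 means the caller did not ask for any length alignment.
 */
static inline bool demo_len_aligned(size_t len, unsigned int len_align_mask)
{
	return (len & len_align_mask) == 0;
}

/* Round a length down to the largest multiple the caller's alignment allows. */
static inline size_t demo_trim_len(size_t len, unsigned int len_align_mask)
{
	return len & ~(size_t)len_align_mask;
}
```

For a 512-byte logical block size the mask would be 511, so a 4097-byte run is trimmed to 4096, while a mask of 0 leaves every length untouched.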
|
|
We're already iterating every segment, so check these for valid IO
lengths at the same time. Individual segment lengths will not be checked
on passthrough commands. The read/write command segments must be sized
to the dma alignment.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Replace hverkuil@xs4all.nl by hverkuil@kernel.org.
Signed-off-by: Hans Verkuil <hverkuil@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Add missing documentation for the tags_srcu member that was introduced
to defer freeing of tags page_list to prevent use-after-free when
iterating tags.
Fixes htmldocs warning:
WARNING: include/linux/blk-mq.h:536 struct member 'tags_srcu' not described in 'blk_mq_tag_set'
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Bios are embedded into other structures, and at least sparse is unhappy
about embedding structures with variable sized arrays. There's no
real need for the array anyway, we can replace it with a helper pointing
to the memory just behind the bio, and with the previous cleanups there
are very few sites doing anything special with it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
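The "helper pointing to the memory just behind the structure" technique can be shown with a small userspace model. The demo_* types and functions are hypothetical and much simpler than struct bio; the sketch assumes the vector type needs no stricter alignment than the containing structure, which holds here because both are pointer-aligned.

```c
#include <stdlib.h>

/* Hypothetical, simplified vector and container types. */
struct demo_vec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

struct demo_bio {
	unsigned short max_vecs;
	struct demo_vec *io_vec;
	/* No trailing variable sized array, so embedding this struct is fine. */
};

/* Inline vectors, when allocated, start right after the structure. */
static inline struct demo_vec *demo_bio_inline_vecs(struct demo_bio *bio)
{
	return (struct demo_vec *)(bio + 1);
}

/* Allocate the structure and its inline vectors as one zeroed block. */
static struct demo_bio *demo_bio_alloc(unsigned short nr_vecs)
{
	struct demo_bio *bio;

	bio = calloc(1, sizeof(*bio) + nr_vecs * sizeof(struct demo_vec));
	if (!bio)
		return NULL;
	bio->max_vecs = nr_vecs;
	bio->io_vec = demo_bio_inline_vecs(bio);
	return bio;
}
```

Only the allocation site has to know about the extra space; code that embeds struct demo_bio in a larger structure sees a fixed-size type.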
|
|
Just a simpler wrapper around bio_init for callers that want to
initialize a bio with inline bvecs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The get_time() callbacks always need to match the base's clockid.
Instead of maintaining that association twice in hrtimer_bases,
use a helper.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250821-hrtimer-cleanup-get_time-v2-8-3ae822e5bfbd@linutronix.de
|
|
Various other helpers contain open-coded implementations of
hrtimer_cb_get_time(). This prevents refactoring the implementation.
Reuse the existing helper.
For this to work, move hrtimer_cb_get_time() a bit up in the file and also
make its argument 'const'.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250821-hrtimer-cleanup-get_time-v2-7-3ae822e5bfbd@linutronix.de
|
|
Clang 22 recently added support for defining __SANITIZE__ macros similar
to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e)
with the existing defines that the kernel creates to emulate this behavior
with existing clang versions.
In file included from <built-in>:3:
In file included from include/linux/compiler_types.h:171:
include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
37 | #define __SANITIZE_THREAD__
| ^
<built-in>:352:9: note: previous definition is here
352 | #define __SANITIZE_THREAD__ 1
| ^
Refactor compiler-clang.h to only define the sanitizer macros when they
are undefined and adjust the rest of the code to use these macros for
checking if the sanitizers are enabled, clearing up the warnings and
allowing the kernel to easily drop these defines when the minimum
supported version of LLVM for building the kernel becomes 22.0.0 or newer.
Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf3702ca3d92@kernel.org
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444dae4fdc8a9c [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
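The "define only when undefined" guard described in this commit can be demonstrated with ordinary macros. DEMO_SANITIZE_A and DEMO_SANITIZE_B are hypothetical names: the first is defined up front to mimic a compiler that already predefines the macro with the value 1, the second mimics an older compiler where the header has to supply it.

```c
/* Pretend the compiler predefined this one, as in the diagnostic above. */
#define DEMO_SANITIZE_A 1

/* Guarded define: skipped here, so there is no macro redefinition. */
#ifndef DEMO_SANITIZE_A
#define DEMO_SANITIZE_A
#endif

/* Guarded define: taken here, because nothing defined the macro before. */
#ifndef DEMO_SANITIZE_B
#define DEMO_SANITIZE_B
#endif

/* Users only test whether the macro is defined, never its value. */
static inline int demo_sanitizer_a_enabled(void)
{
#ifdef DEMO_SANITIZE_A
	return 1;
#else
	return 0;
#endif
}

static inline int demo_sanitizer_b_enabled(void)
{
#ifdef DEMO_SANITIZE_B
	return 1;
#else
	return 0;
#endif
}
```

Since all users check with #ifdef, it does not matter that one definition expands to 1 and the other to nothing, and the guarded defines can be dropped once every supported compiler provides the macro itself.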
|
|
kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and
always allocate memory using the hardcoded GFP_KERNEL flag. This makes
them inconsistent with vmalloc(), which was recently extended to support
GFP_NOFS and GFP_NOIO allocations.
Page table allocations performed during shadow population also ignore the
external gfp_mask. To preserve the intended semantics of GFP_NOFS and
GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate
memalloc scope.
xfs calls vmalloc with GFP_NOFS, so this bug could lead to deadlock.
There was a report here
https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com
This patch:
- Extends kasan_populate_vmalloc() and helpers to take gfp_mask;
- Passes gfp_mask down to alloc_pages_bulk() and __get_free_page();
- Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore()
around apply_to_page_range();
- Updates vmalloc.c and percpu allocator call sites accordingly.
Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com
Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reported-by: syzbot+3470c9ffee63e4abafeb@syzkaller.appspotmail.com
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute_const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
Add missing __attribute_const__ annotations to generic implementations of
ffs(), __ffs(), fls(), and __fls() functions. These are pure mathematical
functions that always return the same result for the same input with no
side effects, making them eligible for compiler optimization.
Build tested with x86_64 defconfig using GCC 14.2.0, which should validate
the implementations when used by ARM, ARM64, LoongArch, Microblaze,
NIOS2, and SPARC32 architectures.
Link: https://github.com/KSPP/linux/issues/364 [1]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250804164417.1612371-2-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
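The annotation discussed above maps to the GCC/Clang const function attribute. The following userspace sketch uses a hypothetical demo_attribute_const macro and a demo_ffs() written in the style of a generic binary-search ffs(); it is an illustration of the annotation, not a copy of the kernel implementation.

```c
/* Userspace stand-in for the kernel's __attribute_const__ annotation. */
#define demo_attribute_const __attribute__((__const__))

/*
 * Find first set bit: returns the 1-based index of the least significant
 * set bit, or 0 when no bit is set. The result depends only on x and the
 * function touches no memory, which is what the const attribute promises.
 */
static inline demo_attribute_const int demo_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int r = 1;

	if (!v)
		return 0;
	if (!(v & 0xffff)) {
		v >>= 16;
		r += 16;
	}
	if (!(v & 0xff)) {
		v >>= 8;
		r += 8;
	}
	if (!(v & 0xf)) {
		v >>= 4;
		r += 4;
	}
	if (!(v & 3)) {
		v >>= 2;
		r += 2;
	}
	if (!(v & 1))
		r += 1;
	return r;
}
```

With the attribute in place the compiler may assume that a call has no side effects and may merge repeated calls with the same argument, instead of treating each call as something that could modify other state.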
|
|
It's only used to hold the corresponding receive message, so fix the
name to make that clear and the type so nothing else can be accidentally
assigned to it.
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
Getting ready for handling when a BMC is non-responsive or broken, allow
the sender operation to fail in an SMI. If it was a user-generated
message it will return the error.
The powernv code was already doing this internally, but the way it was
written could result in deep stack descent if there were a lot of
messages queued. Have its send return an error in this case.
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"fuse:
- Prevent opening of non-regular backing files.
Fuse doesn't support non-regular files anyway.
- Check whether copy_file_range() returns a larger size than
requested.
- Prevent overflow in copy_file_range() as fuse currently only
supports 32-bit sized copies.
- Cache the blocksize value if the server returned a new value as
inode->i_blkbits isn't modified directly anymore.
- Fix i_blkbits handling for iomap partial writes.
By default i_blkbits is set to PAGE_SIZE which causes iomap to mark
the whole folio as uptodate even on a partial write. But fuseblk
filesystems support choosing a blocksize smaller than PAGE_SIZE
risking data corruption. Simply enforce PAGE_SIZE as blocksize for
fuseblk's internal inode for now.
- Prevent out-of-bounds access in fuse_dev_write() when the number of
bytes to be retrieved is truncated to the fc->max_pages limit.
virtiofs:
- Fix page faults for DAX page addresses.
Misc:
- Tighten file handle decoding from userns.
Check that the decoded dentry itself has a valid idmapping in the
user namespace.
- Fix mount-notify selftests.
- Fix some indentation errors.
- Add an FMODE_ flag to indicate IOCB_HAS_METADATA availability.
This will be moved to an FOP_* flag later, but that needs a bit more
rework that is not suitable for a fix.
- Don't silently ignore metadata for sync read/write.
- Don't pointlessly log warning when reading coredump sysctls"
* tag 'vfs-6.17-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fuse: virtio_fs: fix page fault for DAX page address
selftests/fs/mount-notify: Fix compilation failure.
fhandle: use more consistent rules for decoding file handle from userns
fuse: Block access to folio overlimit
fuse: fix fuseblk i_blkbits for iomap partial writes
fuse: reflect cached blocksize if blocksize was changed
fuse: prevent overflow in copy_file_range return value
fuse: check if copy_file_range() returns larger than requested size
fuse: do not allow mapping a non-regular backing file
coredump: don't pointlessly check and spew warnings
fs: fix indentation style
block: don't silently ignore metadata for sync read/write
fs: add a FMODE_ flag to indicate IOCB_HAS_METADATA availability
|
|
A manual application of this patch resulted in a typo for the stub
function __io_uring_cmd_do_in_task(), for the case where CONFIG_IO_URING
isn't true. Fix that up.
Reported-by: Klara Modin <klarasmodin@gmail.com>
Fixes: df3a7762ee24 ("io_uring/uring_cmd: add io_uring_cmd_tw_t type alias")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Tag iterators can race with the freeing of the request pages (tags->page_list),
potentially leading to use-after-free issues.
Defer the freeing of the page list and the tags structure itself until
after an SRCU grace period has passed. This ensures that any concurrent
tag iterators have completed before the memory is released. This way,
we can replace the big tags->lock in the tag iterator code path with
SRCU to solve the issue.
This is achieved by:
- Adding a new `srcu_struct tags_srcu` to `blk_mq_tag_set` to protect
tag map iteration.
- Adding an `rcu_head` to `struct blk_mq_tags` to be used with
`call_srcu`.
- Moving the page list freeing logic and the `kfree(tags)` call into a
new callback function, `blk_mq_free_tags_callback`.
- In `blk_mq_free_tags`, invoking `call_srcu` to schedule the new
callback for deferred execution.
The read-side protection for the tag iterators will be added in a
subsequent patch.
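
The deferred-free pattern described above can be sketched in plain
userspace C. This is a single-threaded mock under stated assumptions, not
the block layer code: demo_head stands in for `rcu_head`, demo_call_srcu()
for `call_srcu`, and demo_grace_period_elapsed() simply pretends that a
grace period has ended. All demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Mock of the callback head that the kernel calls struct rcu_head. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *head);
};

/* Hypothetical, cut-down tags object with the embedded callback head. */
struct demo_tags {
	void *page_list;
	struct demo_head rcu_head;
};

static struct demo_head *demo_pending;	/* callbacks awaiting a grace period */
static int demo_freed;			/* number of tags objects released */

/* Mock of call_srcu(): queue the callback instead of running it now. */
static void demo_call_srcu(struct demo_head *head,
			   void (*func)(struct demo_head *head))
{
	head->func = func;
	head->next = demo_pending;
	demo_pending = head;
}

/* Mock of a grace period ending: run every queued callback. */
static void demo_grace_period_elapsed(void)
{
	while (demo_pending) {
		struct demo_head *head = demo_pending;

		demo_pending = head->next;
		head->func(head);
	}
}

/* Plays the role of blk_mq_free_tags_callback(): the actual release. */
static void demo_free_tags_callback(struct demo_head *head)
{
	/* Open-coded container_of(): recover the enclosing object. */
	struct demo_tags *tags = (struct demo_tags *)
		((char *)head - offsetof(struct demo_tags, rcu_head));

	free(tags->page_list);
	free(tags);
	demo_freed++;
}

/* Plays the role of blk_mq_free_tags(): schedule the free, do not free. */
static void demo_free_tags(struct demo_tags *tags)
{
	demo_call_srcu(&tags->rcu_head, demo_free_tags_callback);
}
```

The point of the sketch is the ordering: after demo_free_tags() returns,
the memory is still valid, and it is released only once the mocked grace
period has elapsed.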
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Implement locking in the hardware monitoring core for drivers using
the _with_info() API functions.
Most hardware monitoring drivers need to support locking to protect
against parallel accesses from userspace. With older API functions, such
locking had to be implemented in the driver code since sysfs attributes
were created by the driver. However, the _with_info() API creates sysfs
attributes in the hardware monitoring core. This makes it easy to move
the locking primitives into that code. This has the benefit of simplifying
driver code while at the same time reducing the risk of incomplete or bad
locking implementations in hardware monitoring drivers.
While this means that all accesses are forced to be synchronized, this
has little if any practical impact since accesses are expected to be low
frequency and are typically synchronized from userspace anyway since
only a single process is accessing the data. On top of that, many drivers
use regmap, which also has its own locking scheme and already serializes
accesses.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Move range_overflows() and range_end_overflows(), along with their _t
variants, over from drm/i915 and drm/buddy to overflow.h.
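
As a rough sketch of what such helpers check, assuming the semantics of
the i915 versions: a range overflows when start + size would exceed max,
evaluated without ever computing start + size so the check itself cannot
wrap. The demo_* functions below are hypothetical fixed-type equivalents,
not the type-generic macros that were moved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if [start, start + size) does not fit below max, treating
 * start == max as already out of range.
 */
static inline bool demo_range_overflows(size_t start, size_t size, size_t max)
{
	return start >= max || size > max - start;
}

/*
 * Same check, but start == max is still acceptable, which suits
 * callers that compare the end of the range against max.
 */
static inline bool demo_range_end_overflows(size_t start, size_t size,
					    size_t max)
{
	return start > max || size > max - start;
}
```

Note the subtraction max - start is only evaluated after start has been
compared against max, so it cannot underflow.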
Cc: Kees Cook <kees@kernel.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250829174601.2163064-3-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
This generic pin config property is confusingly named so let's
rename it to make things clearer.
There are already drivers in the tree that use PIN_CONFIG_OUTPUT
to *read* the value of an output driven pin, which is a big
semantic confusion for the head: are we then reading the
setting of the output or the actual value/level that is put
out on the pin?
We already have PIN_CONFIG_OUTPUT_ENABLE that turns on driver
buffers for output, so this can by logical conclusion only
drive the voltage level if it should be any different.
But if we read the pin, are we then reading the *setting* of
the output value or the *actual* value we can see on the
line?
If the pin has not first been set into output mode with
PIN_CONFIG_OUTPUT_ENABLE, but is instead in some input mode
or tristate, what will reading this property actually
return?
Looking at the current users that read this property, it is clear
that what we read is the logical level of the pin as 0 or 1
depending on if it is low or high.
Rename it to PIN_CONFIG_LEVEL so it is crystal clear that
we set or read the voltage level of the pin and nothing else.
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The name of the pin function has no real meaning to pinctrl core and is
there only for human readability of device properties. Some pins are
muxed as GPIOs but for "strict" pinmuxers it's impossible to request
them as GPIOs if they're bound to a device - even if their function name
explicitly says "gpio". Add a new field to struct pinfunction that
allows passing additional flags to pinctrl core. While we could go with
a boolean "is_gpio" field, a flags field is more future-proof.
If the PINFUNCTION_FLAG_GPIO is set for a given function, the pin muxed
to it can be requested as GPIO even on strict pin controllers. Add a new
callback to struct pinmux_ops - function_is_gpio() - that allows pinmux
core to inspect a function and see if it's a GPIO one. Provide a generic
implementation of this callback.
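
A simplified userspace sketch of the idea, with hypothetical demo_* names
throughout: the struct layout, the flag value and the request check are
assumptions made for illustration and do not mirror the pinctrl core.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag value; stands in for PINFUNCTION_FLAG_GPIO. */
#define DEMO_PINFUNCTION_FLAG_GPIO	(1UL << 0)

/* Cut-down stand-in for struct pinfunction with the new flags field. */
struct demo_pinfunction {
	const char *name;	/* human-readable only, never parsed */
	unsigned long flags;
};

/* Generic "is this function a GPIO one?" check based on the flag. */
static bool demo_function_is_gpio(const struct demo_pinfunction *func)
{
	return func && (func->flags & DEMO_PINFUNCTION_FLAG_GPIO);
}

/*
 * On a strict controller, a pin already bound to a device may be
 * requested as GPIO only if its current function is flagged as GPIO.
 */
static bool demo_gpio_request_allowed(bool strict, bool bound_to_device,
				      const struct demo_pinfunction *func)
{
	if (!strict || !bound_to_device)
		return true;
	return demo_function_is_gpio(func);
}
```

The decision depends on the flag alone, so a function named "gpio" without
the flag is still refused on a strict controller.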
Tested-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Provide a function similar to devm_strdup_const() but for copying blocks
of memory that are likely to be placed in .rodata.
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into gpio/for-next
Linux 6.17-rc5
|
|
Many chips require 64-bit variables to display the accumulated energy,
even more so since the energy units are micro-Joule. Add new sensor type
"energy64" to support reporting the chip energy as 64-bit values.
Changing the entire hardware monitoring API is not feasible, and it is only
really necessary to support reading 64-bit values for the "energyX_input"
attribute. For this reason, keep the API as-is and use type casts on both
ends to pass 64-bit pointers when reading the accumulated energy. On the
write side (which is only useful for the energyX_enable attribute), keep
passing the written value as long.
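
The casting scheme can be sketched in userspace as follows. The demo_*
names and the sample values are hypothetical; the sketch only shows how a
callback that takes a long pointer can carry a 64-bit value when both
sides agree on the sensor type.

```c
#include <stdint.h>

enum demo_sensor_type { DEMO_ENERGY, DEMO_ENERGY64 };

/* Driver-side read callback: the prototype keeps its long pointer. */
static int demo_driver_read(enum demo_sensor_type type, long *val)
{
	if (type == DEMO_ENERGY64) {
		/* The caller really passed a pointer to a 64-bit object. */
		*(int64_t *)val = INT64_C(5000000000);
		return 0;
	}
	*val = 1234;
	return 0;
}

/* Core-side read: picks the storage width based on the sensor type. */
static int64_t demo_core_read(enum demo_sensor_type type)
{
	int64_t val64 = 0;
	long val = 0;

	if (type == DEMO_ENERGY64) {
		/* Cast on the way in; the driver casts back on its side. */
		demo_driver_read(type, (long *)&val64);
		return val64;
	}
	demo_driver_read(type, &val);
	return val;
}
```

The sample value 5000000000 does not fit in a 32-bit long, which is the
case the new sensor type is meant to cover.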
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> # INA780
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix a severe slowdown regression in the timer vDSO code related to the
while() loop in __iter_div_u64_rem(), when the AUX-clock is enabled"
* tag 'timers-urgent-2025-09-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
vdso/vsyscall: Avoid slow division loop in auxiliary clock update
|