Age | Commit message (Collapse) | Author |
|
Wiring up uretprobe system call, which comes in following changes.
We need to do the wiring before, because the uretprobe implementation
needs the syscall number.
Note at the moment uretprobe syscall is supported only for native
64-bit process.
Link: https://lore.kernel.org/all/20240611112158.40795-3-jolsa@kernel.org/
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Delete kvm_arch_sched_in() now that all implementations are nops.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a kvm_vcpu.scheduled_out flag to track if a vCPU is in the process of
being scheduled out (vCPU put path), or if the vCPU is being reloaded
after being scheduled out (vCPU load path). In the short term, this will
allow dropping kvm_arch_sched_in(), as arch code can query scheduled_out
during kvm_arch_vcpu_load().
Longer term, scheduled_out opens up other potential optimizations, without
creating subtle/brittle dependencies. E.g. it allows KVM to keep guest
state (that is managed via kvm_arch_vcpu_{load,put}()) loaded across
kvm_sched_{out,in}(), if KVM knows the state isn't accessed by the host
kernel. Forcing arch code to coordinate between kvm_arch_sched_{in,out}()
and kvm_arch_vcpu_{load,put}() is awkward, not reusable, and relies on the
exact ordering of calls into arch code.
Adding scheduled_out also obviates the need for a kvm_arch_sched_out()
hook, e.g. if arch code needs to do something novel when putting vCPU
state.
And even if KVM never uses scheduled_out for anything beyond dropping
kvm_arch_sched_in(), just being able to remove all of the arch stubs makes
it worth adding the flag.
Link: https://lore.kernel.org/all/20240430224431.490139-1-seanjc@google.com
Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Setup empty IRQ routing during VM creation so that x86 and s390 don't need
to set empty/dummy IRQ routing during KVM_CREATE_IRQCHIP (in future
patches). Initializing IRQ routing before there are any potential readers
allows KVM to avoid the synchronize_srcu() in kvm_set_irq_routing(), which
can introduces 20+ milliseconds of latency in the VM creation path.
Ensuring that all VMs have non-NULL IRQ routing also hardens KVM against
misbehaving userspace VMMs, e.g. RISC-V dynamically instantiates its
interrupt controller, but doesn't override kvm_arch_intc_initialized() or
kvm_arch_irqfd_allowed(), and so can likely reach kvm_irq_map_gsi()
without fully initialized IRQ routing.
Signed-off-by: Yi Wang <foxywang@tencent.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20240506101751.3145407-2-foxywang@tencent.com
[sean: init refcount after IRQ routing, fix stub, massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
GPIO chips should be added with driver-private data associated with the
chip. If none is needed, NULL can be used. All users already do this
except one, fix that here. With no more users of the base gpiochip_add()
we can drop this function so no more users show up later.
Signed-off-by: Andrew Davis <afd@ti.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240610135313.142571-1-afd@ti.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"Misc:
- Restore debugfs behavior of ignoring unknown mount options
- Fix kernel doc for netfs_wait_for_oustanding_io()
- Fix struct statx comment after new addition for this cycle
- Fix a check in find_next_fd()
iomap:
- Fix data zeroing behavior when an extent spans the block that
contains i_size
- Restore i_size increasing in iomap_write_end() for now to avoid
stale data exposure on xfs with a realtime device
Cachefiles:
- Remove unneeded fdtable.h include
- Improve trace output for cachefiles_obj_{get,put}_ondemand_fd()
- Remove requests from the request list to prevent accessing already
freed requests
- Fix UAF when issuing restore command while the daemon is still
alive by adding an additional reference count to requests
- Fix UAF by grabbing a reference during xarray lookup with xa_lock()
held
- Simplify error handling in cachefiles_ondemand_daemon_read()
- Add consistency checks read and open requests to avoid crashes
- Add a spinlock to protect ondemand_id variable which is used to
determine whether an anonymous cachefiles fd has already been
closed
- Make on-demand reads killable allowing to handle broken cachefiles
daemon better
- Flush all requests after the kernel has been marked dead via
CACHEFILES_DEAD to avoid hung-tasks
- Ensure that closed requests are marked as such to avoid reusing
them with a reopen request
- Defer fd_install() until after copy_to_user() succeeded and thereby
get rid of having to use close_fd()
- Ensure that anonymous cachefiles on-demand fds are reused while
they are valid to avoid pinning already freed cookies"
* tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: Fix iomap_adjust_read_range for plen calculation
iomap: keep on increasing i_size in iomap_write_end()
cachefiles: remove unneeded include of <linux/fdtable.h>
fs/file: fix the check in find_next_fd()
cachefiles: make on-demand read killable
cachefiles: flush all requests after setting CACHEFILES_DEAD
cachefiles: Set object to close if ondemand_id < 0 in copen
cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
cachefiles: never get a new anonymous fd if ondemand_id is valid
cachefiles: add spin_lock for cachefiles_ondemand_info
cachefiles: add consistency check for copen/cread
cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
cachefiles: remove requests from xarray during flushing requests
cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
statx: Update offset commentary for struct statx
netfs: fix kernel doc for nets_wait_for_outstanding_io()
debugfs: continue to ignore unknown mount options
|
|
Channel device name is used for sysfs, but also by dmatest filter function.
With dynamic channel registration, channels can be registered after dma
controller registration. Users may want to have specific channel names.
If name is NULL, the channel name relies on previous implementation,
dma<controller_device_id>chan<channel_device_id>.
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20240531150712.2503554-11-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Introduce a new external lockspace flag DLM_LSFL_SOFTIRQ_SAFE. A
lockspace user will set this flag if it can handle dlm running the
callback functions from softirq context. When not set, dlm will
continue to run callback functions from the dlm_callback workqueue.
The new lockspace flag cannot be used for user space lockspaces, so
a uapi placeholder definition is used for the new flag value.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
|
|
All architectures that implement function graph also implements
HAVE_FUNCTION_GRAPH_RET_ADDR_PTR. Remove it, as it is no longer a
differentiator.
Link: https://lore.kernel.org/linux-trace-kernel/20240611031737.982047614@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
In ice_ptp_cfg_clkout(), the ice driver needs to calculate the nearest next
second of a current time value specified in nanoseconds. It implements this
using div64_u64, because the time value is a u64. It could use div_u64
since NSEC_PER_SEC is smaller than 32-bits.
Ideally this would be implemented directly with roundup(), but that can't
work on all platforms due to a division which requires using the specific
macros and functions due to platform restrictions, and to ensure that the
most appropriate and fast instructions are used.
The kernel doesn't currently provide any 64-bit equivalents for doing
roundup. Attempting to use roundup() on a 32-bit platform will result in a
link failure due to not having a direct 64-bit division.
The closest equivalent for this is DIV64_U64_ROUND_UP, which does a
division always rounding up. However, this only computes the division, and
forces use of the div64_u64 in cases where the divisor is a 32bit value and
could make use of div_u64.
Introduce DIV_U64_ROUND_UP based on div_u64, and then use it to implement
roundup_u64 which takes a u64 input value and a u32 rounding value.
The name roundup_u64 matches the naming scheme of div_u64, and future
patches could implement roundup64_u64 if they need to round by a multiple
that is greater than 32-bits.
Replace the logic in ice_ptp.c which does this equivalent with the newly
added roundup_u64.
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240607-next-2024-06-03-intel-next-batch-v3-2-d1470cee3347@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-06-06
We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).
The main changes are:
1) Add a user space notification mechanism via epoll when a struct_ops
object is getting detached/unregistered, from Kui-Feng Lee.
2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
tests, from Geliang Tang.
3) Add BTF field (type and string fields, right now) iterator support
to libbpf instead of using existing callback-based approaches,
from Andrii Nakryiko.
4) Extend BPF selftests for the latter with a new btf_field_iter
selftest, from Alan Maguire.
5) Add new kfuncs for a generic, open-coded bits iterator,
from Yafang Shao.
6) Fix BPF selftests' kallsyms_find() helper under kernels configured
with CONFIG_LTO_CLANG_THIN, from Yonghong Song.
7) Remove a bunch of unused structs in BPF selftests,
from David Alan Gilbert.
8) Convert test_sockmap section names into names understood by libbpf
so it can deduce program type and attach type, from Jakub Sitnicki.
9) Extend libbpf with the ability to configure log verbosity
via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.
10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
in nested VMs, from Song Liu.
11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
optimization, from Xiao Wang.
12) Enable BPF programs to declare arrays and struct fields with kptr,
bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
selftests/bpf: Add start_test helper in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
selftests/bpf: Add btf_field_iter selftests
selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
libbpf: Remove callback-based type/string BTF field visitor helpers
bpftool: Use BTF field iterator in btfgen
libbpf: Make use of BTF field iterator in BTF handling code
libbpf: Make use of BTF field iterator in BPF linker code
libbpf: Add BTF field iterator
selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
selftests/bpf: Fix bpf_cookie and find_vma in nested VM
selftests/bpf: Test global bpf_list_head arrays.
selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
bpf: limit the number of levels of a nested struct type.
bpf: look into the types of the fields of a struct type recursively.
...
====================
Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.11
The first "new features" pull request for v6.11 with changes both in
stack and in drivers. Nothing out of ordinary, except that we have
two conflicts this time:
net/mac80211/cfg.c
https://lore.kernel.org/all/20240531124415.05b25e7a@canb.auug.org.au
drivers/net/wireless/microchip/wilc1000/netdev.c
https://lore.kernel.org/all/20240603110023.23572803@canb.auug.org.au
Major changes:
cfg80211/mac80211
* parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers
wilc1000
* read MAC address during probe to make it visible to user space
iwlwifi
* bump FW API to 91 for BZ/SC devices
* report 64-bit radiotap timestamp
* enable P2P low latency by default
* handle Transmit Power Envelope (TPE) advertised by AP
* start using guard()
rtlwifi
* RTL8192DU support
ath12k
* remove unsupported tx monitor handling
* channel 2 in 6 GHz band support
* Spatial Multiplexing Power Save (SMPS) in 6 GHz band support
* multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA)
support
* dynamic VLAN support
* add panic handler for resetting the firmware state
ath10k
* add qcom,no-msa-ready-indicator Device Tree property
* LED support for various chipsets
* tag 'wireless-next-2024-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (194 commits)
wifi: ath12k: add hw_link_id in ath12k_pdev
wifi: ath12k: add panic handler
wifi: rtw89: chan: Use swap() in rtw89_swap_sub_entity()
wifi: brcm80211: remove unused structs
wifi: brcm80211: use sizeof(*pointer) instead of sizeof(type)
wifi: ath12k: do not process consecutive RDDM event
dt-bindings: net: wireless: ath11k: Drop "qcom,ipq8074-wcss-pil" from example
wifi: ath12k: fix memory leak in ath12k_dp_rx_peer_frag_setup()
wifi: rtlwifi: handle return value of usb init TX/RX
wifi: rtlwifi: Enable the new rtl8192du driver
wifi: rtlwifi: Add rtl8192du/sw.c
wifi: rtlwifi: Constify rtl_hal_cfg.{ops,usb_interface_cfg} and rtl_priv.cfg
wifi: rtlwifi: Add rtl8192du/dm.{c,h}
wifi: rtlwifi: Add rtl8192du/fw.{c,h} and rtl8192du/led.{c,h}
wifi: rtlwifi: Add rtl8192du/rf.{c,h}
wifi: rtlwifi: Add rtl8192du/trx.{c,h}
wifi: rtlwifi: Add rtl8192du/phy.{c,h}
wifi: rtlwifi: Add rtl8192du/hw.{c,h}
wifi: rtlwifi: Add new members to struct rtl_priv for RTL8192DU
wifi: rtlwifi: Add rtl8192du/table.{c,h}
...
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Link: https://lore.kernel.org/r/20240607093517.41394C2BBFC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the driver core allows for struct class to be in read-only memory,
we should make all 'class' structures declared at build time placing them
into read-only memory, instead of having to be dynamically allocated at
runtime.
Link: https://lore.kernel.org/r/2024061053-online-unwound-b173@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-By: Logan Gunthorpe <logang@deltatee.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
|
|
Back in 2007, in commit af65bdfce98d ("[NETLINK]: Switch cb_lock spinlock
to mutex and allow to override it") netlink core was extended to allow
subsystems to replace the dump mutex lock with its own lock.
The mechanism was used by rtnetlink to take rtnl_lock but it isn't
sufficiently flexible for other users. Over the 17 years since
it was added no other user appeared. Since rtnetlink needs conditional
locking now, and doesn't use it either, axe this feature complete.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The granularity of DMA mappings is transfer and moreover,
the direction is also important as it can be unidirect.
The current cur_msg_mapped flag doesn't fit well the DMA mapping
and syncing calls and we have tons of checks around on top of it.
So, instead of doing that rework the code to use per transfer per
direction flag to show if it's DMA mapped or not.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD
Link: https://lore.kernel.org/r/20240531194723.1761567-9-andriy.shevchenko@linux.intel.com
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The ACPI _ADR is a 64-bit value. We changed the definitions in commit
ca6f998cf9a2 ("ACPI: bus: change _ADR representation to 64 bits") but
some helpers still assume the value is a 32-bit value.
This patch adds a new helper to extract the full 64-bits. The existing
32-bit helper is kept for backwards-compatibility and cases where the
_ADR is known to fit in a 32-bit value.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240528192936.16180-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
ACLs in Spectrum-2 and newer ASICs can reside in the algorithmic TCAM
(A-TCAM) or in the ordinary circuit TCAM (C-TCAM). The former can
contain more ACLs (i.e., tc filters), but the number of masks in each
region (i.e., tc chain) is limited.
In order to mitigate the effects of the above limitation, the device
allows filters to share a single mask if their masks only differ in up
to 8 consecutive bits. For example, dst_ip/25 can be represented using
dst_ip/24 with a delta of 1 bit. The C-TCAM does not have a limit on the
number of masks being used (and therefore does not support mask
aggregation), but can contain a limited number of filters.
The driver uses the "objagg" library to perform the mask aggregation by
passing it objects that consist of the filter's mask and whether the
filter is to be inserted into the A-TCAM or the C-TCAM since filters in
different TCAMs cannot share a mask.
The set of created objects is dependent on the insertion order of the
filters and is not necessarily optimal. Therefore, the driver will
periodically ask the library to compute a more optimal set ("hints") by
looking at all the existing objects.
When the library asks the driver whether two objects can be aggregated
the driver only compares the provided masks and ignores the A-TCAM /
C-TCAM indication. This is the right thing to do since the goal is to
move as many filters as possible to the A-TCAM. The driver also forbids
two identical masks from being aggregated since this can only happen if
one was intentionally put in the C-TCAM to avoid a conflict in the
A-TCAM.
The above can result in the following set of hints:
H1: {mask X, A-TCAM} -> H2: {mask Y, A-TCAM} // X is Y + delta
H3: {mask Y, C-TCAM} -> H4: {mask Z, A-TCAM} // Y is Z + delta
After getting the hints from the library the driver will start migrating
filters from one region to another while consulting the computed hints
and instructing the device to perform a lookup in both regions during
the transition.
Assuming a filter with mask X is being migrated into the A-TCAM in the
new region, the hints lookup will return H1. Since H2 is the parent of
H1, the library will try to find the object associated with it and
create it if necessary in which case another hints lookup (recursive)
will be performed. This hints lookup for {mask Y, A-TCAM} will either
return H2 or H3 since the driver passes the library an object comparison
function that ignores the A-TCAM / C-TCAM indication.
This can eventually lead to nested objects which are not supported by
the library [1].
Fix by removing the object comparison function from both the driver and
the library as the driver was the only user. That way the lookup will
only return exact matches.
I do not have a reliable reproducer that can reproduce the issue in a
timely manner, but before the fix the issue would reproduce in several
minutes and with the fix it does not reproduce in over an hour.
Note that the current usefulness of the hints is limited because they
include the C-TCAM indication and represent aggregation that cannot
actually happen. This will be addressed in net-next.
[1]
WARNING: CPU: 0 PID: 153 at lib/objagg.c:170 objagg_obj_parent_assign+0xb5/0xd0
Modules linked in:
CPU: 0 PID: 153 Comm: kworker/0:18 Not tainted 6.9.0-rc6-custom-g70fbc2c1c38b #42
Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:objagg_obj_parent_assign+0xb5/0xd0
[...]
Call Trace:
<TASK>
__objagg_obj_get+0x2bb/0x580
objagg_obj_get+0xe/0x80
mlxsw_sp_acl_erp_mask_get+0xb5/0xf0
mlxsw_sp_acl_atcam_entry_add+0xe8/0x3c0
mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0
mlxsw_sp_acl_tcam_vchunk_migrate_one+0x16b/0x270
mlxsw_sp_acl_tcam_vregion_rehash_work+0xbe/0x510
process_one_work+0x151/0x370
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Resctrl open codes a search for information about a given cache level in
a couple of places (and more are on the way).
Provide a new inline function get_cpu_cacheinfo_level() in
<linux/cacheinfo.h> to do the search and return a pointer to the
cacheinfo structure.
Add lockdep_assert_cpus_held() to enforce the comment that cpuhp lock
must be held.
Simplify the existing get_cpu_cacheinfo_id() by using this new function
to do the search.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240610003927.341707-4-tony.luck@intel.com
|
|
This file was created with a direct cut and paste from cpu.h so
kept the legacy declaration style.
But the Linux coding standard for function declarations in header
files is to avoid use of "extern".
Drop "extern" from all function declarations.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240610003927.341707-3-tony.luck@intel.com
|
|
Avoid upcoming #include hell when <linux/cachinfo.h> wants to use
lockdep_assert_cpus_held() and creates a #include loop that would
break the build for arch/riscv.
[ bp: s/cpu/CPU/g ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240610003927.341707-2-tony.luck@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking doc fix from Ingo Molnar:
"Fix typos in the kerneldoc of some of the atomic APIs"
* tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"14 hotfixes, 6 of which are cc:stable.
All except the nilfs2 fix affect MM and all are singletons - see the
chagelogs for details"
* tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
mm: fix xyz_noprof functions calling profiled functions
codetag: avoid race at alloc_slab_obj_exts
mm/hugetlb: do not call vma_add_reservation upon ENOMEM
mm/ksm: fix ksm_zero_pages accounting
mm/ksm: fix ksm_pages_scanned accounting
kmsan: do not wipe out origin when doing partial unpoisoning
vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr()
mm: page_alloc: fix highatomic typing in multi-block buddies
nilfs2: fix potential kernel bug due to lack of writeback flag waiting
memcg: remove the lockdep assert from __mod_objcg_mlstate()
mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes
mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n
mm: drop the 'anon_' prefix for swap-out mTHP counters
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Core:
- Make iommu-dma code recognize 'force_aperture' again
- Fix for potential NULL-ptr dereference from iommu_sva_bind_device()
return value
AMD IOMMU fixes:
- Fix lockdep splat for invalid wait context
- Add feature bit check before enabling PPR
- Make workqueue name fit into buffer
- Fix memory leak in sysfs code"
* tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix Invalid wait context issue
iommu/amd: Check EFR[EPHSup] bit before enabling PPR
iommu/amd: Fix workqueue name
iommu: Return right value in iommu_sva_bind_device()
iommu/dma: Fix domain init
iommu/amd: Fix sysfs leak in iommu init
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux into driver-core-next
Uwe writes:
Change struct platform_driver::remove() to return void
This is step b) of the plan outlined in commit 5c5a7680e67b ("platform:
Provide a remove callback that returns no value"), which completes the
first major step of making the remove callback return no value. Up to
now it returned an int which however was mostly ignored by the driver
core and lured driver authors to believe there is some error handling.
Note that the Linux driver model assumes that removing a device cannot
fail, so this isn't about being lazy and not implementing error handling
in the core and so making .remove return void is the right thing to do.
* tag 'platform-remove-void-step-b' of https://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
platform: Make platform_driver::remove() return void
samples: qmi: Convert to platform remove callback returning void
nvdimm/of_pmem: Convert to platform remove callback returning void
nvdimm/e820: Convert to platform remove callback returning void
gpu: ipu-v3: Convert to platform remove callback returning void
gpu: host1x: Convert to platform remove callback returning void
drm/mediatek: Convert to platform remove callback returning void
drm/imagination: Convert to platform remove callback returning void
gpu: host1x: mipi: Benefit from devm_clk_get_prepared()
pps: clients: gpio: Convert to platform remove callback returning void
fsi: occ: Convert to platform remove callback returning void
fsi: master-gpio: Convert to platform remove callback returning void
fsi: master-ast-cf: Convert to platform remove callback returning void
fsi: master-aspeed: Convert to platform remove callback returning void
reset: ti-sci: Convert to platform remove callback returning void
reset: rzg2l-usbphy-ctrl: Convert to platform remove callback returning void
reset: meson-audio-arb: Convert to platform remove callback returning void
|
|
New CPU #defines encode vendor and family as well as model.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Device drivers with optional firmware may still want to use the
asynchronous firmware loading interface. To avoid printing a
warning into the kernel log when the optional firmware is
absent, add a nowarn variant of this interface.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20240516102532.213874-1-l.stach@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Commit 31c89007285d ("workqueue.c: Increase workqueue name length")
increased WQ_NAME_LEN from 24 to 32, but forget to increase
WORKER_DESC_LEN, which would cause truncation when setting kworker's
desc from workqueue_struct's name, process_one_work() for example.
Fixes: 31c89007285d ("workqueue.c: Increase workqueue name length")
Signed-off-by: Wenchao Hao <haowenchao22@gmail.com>
CC: Audra Mitchell <audra@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
generic_ci_match can be used by case-insensitive filesystems to compare
strings under lookup with dirents in a case-insensitive way. This
function is currently reimplemented by each filesystem supporting
casefolding, so this reduces code duplication in filesystem-specific
code.
[eugen.hristev@collabora.com: rework to first test the exact match, cleanup
and add error message]
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com>
Link: https://lore.kernel.org/r/20240606073353.47130-4-eugen.hristev@collabora.com
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
To avoid redundant memory barriers, add smp_mb__after_srcu_read_lock() to
pair with smp_mb__after_srcu_read_unlock() for use in paths that need to
emit a memory barrier, but already do srcu_read_lock(), which includes a
full memory barrier. Provide an API, e.g. as opposed to having callers
document the behavior via a comment, as the full memory barrier provided
by srcu_read_lock() is an implementation detail that shouldn't bleed into
random subsystems.
KVM will use smp_mb__after_srcu_read_lock() in it's VM-Exit path to ensure
a memory barrier is emitted, which is necessary to ensure correctness of
mixed memory types on CPUs that support self-snoop.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
[sean: massage changelog]
Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org
Link: https://lore.kernel.org/r/20240309010929.1403984-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Older systems will not populate the security attributes in the
capabilities register. The PSP on these systems, however, does have a
command to get the security attributes. Use this command during ccp
startup to populate the attributes if they're missing.
Closes: https://github.com/fwupd/fwupd/issues/5284
Closes: https://github.com/fwupd/fwupd/issues/5675
Closes: https://github.com/fwupd/fwupd/issues/6253
Closes: https://github.com/fwupd/fwupd/issues/7280
Closes: https://github.com/fwupd/fwupd/issues/6323
Closes: https://github.com/fwupd/fwupd/discussions/5433
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Align the whitespace so that future messages will also be better
aligned.
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"The core change is to detect unusually large number of VPD pages
(caused by device manufacturers having an endiannes issue) and reject
them rather than trying to parse a huge non-existent array.
The remaining fixes are in drivers the most user visible of which is
the ALUA state transition recognition (leads to intermittent I/O
errors in some situations otherwise)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort()
scsi: core: Handle devices which return an unusually large VPD page count
scsi: mpt3sas: Add missing kerneldoc parameter descriptions
scsi: qedf: Set qed_slowpath_params to zero before use
scsi: qedf: Wait for stag work during unload
scsi: qedf: Don't process stag work during unload and recovery
scsi: sr: Fix unintentional arithmetic wraparound
scsi: core: alua: I/O errors for ALUA state transitions
scsi: mpi3mr: Use proper format specifier in mpi3mr_sas_port_add()
|
|
Drivers often need to first disable an interrupt, carry out some
action, and then reenable the interrupt. Introduce support for the
"guard" notation for this so that the following is possible:
...
scoped_cond_guard(mutex_intr, return -EINTR, &data->sysfs_mutex) {
guard(disable_irq)(&client->irq);
error = elan_acquire_baseline(data);
if (error)
return error;
}
...
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/ZljAV6HjkPSEhWSw@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fix from Bjorn Helgaas:
- Revert lockdep checking on locking that protects device resets from
user-space config accesses; it exposed issues for which fixes are in
the works but are too risky for this cycle (Dan Williams)
* tag 'pci-v6.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Revert the cfg_access_lock lockdep mechanism
|
|
Sparse complains that function_trace_op is not static but is not declared
in a header file. It is used only in assembly code. But add it to a header
so that sparse no longer complains:
kernel/trace/ftrace.c:99:19: warning: symbol 'function_trace_op' was not declared. Should it be static?
Link: https://lore.kernel.org/linux-trace-kernel/20240605202708.289105647@goodmis.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/pensando/ionic/ionic_txrx.c
d9c04209990b ("ionic: Mark error paths in the data path as unlikely")
491aee894a08 ("ionic: fix kernel panic in XDP_TX action")
net/ipv6/ip6_fib.c
b4cb4a1391dc ("net: use unrcu_pointer() helper")
b01e1c030770 ("ipv6: fix possible race in __fib6_drop_pcpu_from()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
GCC 14.1 complains about the argument usage of kmemdup_array():
drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
130 | fuse->lookups = kmemdup_array(fuse->soc->lookups, sizeof(*fuse->lookups),
| ^
drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should specify number of elements, later size of each element
The annotation introduced by commit 7d78a7773355 ("string: Add
additional __realloc_size() annotations for "dup" helpers") lets the
compiler think that kmemdup_array() follows the same format as calloc(),
with the number of elements preceding the size of one element. So we
could simply swap the arguments to __realloc_size() to get rid of that
warning, but it seems cleaner to instead have kmemdup_array() follow the
same format as krealloc_array(), memdup_array_user(), calloc() etc.
Fixes: 7d78a7773355 ("string: Add additional __realloc_size() annotations for "dup" helpers")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20240606144608.97817-2-jean-philippe@linaro.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The GIC architecture specification defines a set of registers for
redistributors and ITSes that control the sharebility and cacheability
attributes of redistributors/ITSes initiator ports on the interconnect
(GICR_[V]PROPBASER, GICR_[V]PENDBASER, GITS_BASER<n>).
Architecturally the GIC provides a means to drive shareability and
cacheability attributes signals but it is not mandatory for designs to
wire up the corresponding interconnect signals that control the
cacheability/shareability of transactions.
Redistributors and ITSes interconnect ports can be connected to
non-coherent interconnects that are not able to manage the
shareability/cacheability attributes; this implicitly makes the
redistributors and ITSes non-coherent observers.
To enable non-coherent GIC designs on ACPI based systems, parse the MADT
GICC/GICR/ITS subtables non-coherent flags to determine whether the
respective components are non-coherent observers and force the
shareability attributes to be programmed into the redistributors and
ITSes registers.
An ACPI global function (acpi_get_madt_revision()) is added to retrieve
the MADT revision, in that it is essential to check the MADT revision
before checking for flags that were added with MADT revision 7 so that
if the kernel is booted with an ACPI MADT table with revision < 7 it
skips parsing the newly added flags (that should be zeroed reserved
values for MADT versions < 7 but they could turn out to be buggy and
should be ignored).
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20240606094238.757649-2-lpieralisi@kernel.org
|
|
In function deferred_init_memmap(), we call
deferred_init_mem_pfn_range_in_zone() to get the next deferred_init_pfn.
But we always search it from the very beginning.
Since we save the index in i, we can leverage this to search from i next
time.
[rppt refine the comment]
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Link: https://lore.kernel.org/all/20240605071339.15330-1-richard.weiyang@gmail.com
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
|
|
Add back HW-GRO to the reported features.
As the current implementation of HW-GRO uses KSMs with a
specific fixed buffer size (256B) to map its headers buffer,
we reported the feature only if the NIC is supporting KSM and
the minimum value for buffer size is below the requested one.
iperf3 bandwidth comparison:
+---------+--------+--------+-----------+
| streams | SW GRO | HW GRO | Unit |
|---------+--------+--------+-----------|
| 1 | 36 | 42 | Gbits/sec |
| 4 | 34 | 39 | Gbits/sec |
| 8 | 31 | 35 | Gbits/sec |
+---------+--------+--------+-----------+
A downstream patch will add skb fragment coalescing which will improve
performance considerably.
Benchmark details:
VM based setup
CPU: Intel(R) Xeon(R) Platinum 8380 CPU, 24 cores
NIC: ConnectX-7 100GbE
iperf3 and irq running on same CPU over a single receive queue
Signed-off-by: Yoray Zack <yorayz@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240603212219.1037656-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
KSM Mkey is KLM Mkey with a fixed buffer size. Due to this fact,
it is a faster mechanism than KLM.
SHAMPO feature used KLMs Mkeys for memory mappings of its headers buffer.
As it used KLMs with the same buffer size for each entry,
we can use KSMs instead.
This commit changes the Mkeys that map the SHAMPO headers buffer
from KLMs to KSMs.
Signed-off-by: Yoray Zack <yorayz@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240603212219.1037656-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We normally ksm_zero_pages++ in ksmd when page is merged with zero page,
but ksm_zero_pages-- is done from page tables side, where there is no any
accessing protection of ksm_zero_pages.
So we can read very exceptional value of ksm_zero_pages in rare cases,
such as -1, which is very confusing to users.
Fix it by changing to use atomic_long_t, and the same case with the
mm->ksm_zero_pages.
Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-2-34bb358fdc13@linux.dev
Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM")
Fixes: 6080d19f0704 ("ksm: add ksm zero pages for each process")
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Cc: Stefan Roesch <shr@devkernel.io>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
if CONFIG_SYSFS is not enabled in config, we get the below error,
All errors (new ones prefixed by >>):
s390-linux-ld: mm/memory.o: in function `count_mthp_stat':
>> include/linux/huge_mm.h:285:(.text+0x191c): undefined reference to `mthp_stats'
s390-linux-ld: mm/huge_memory.o:(.rodata+0x10): undefined reference to `mthp_stats'
vim +285 include/linux/huge_mm.h
279
280 static inline void count_mthp_stat(int order, enum mthp_stat_item item)
281 {
282 if (order <= 0 || order > PMD_ORDER)
283 return;
284
> 285 this_cpu_inc(mthp_stats.stats[order][item]);
286 }
287
Link: https://lkml.kernel.org/r/20240523210045.40444-1-21cnbao@gmail.com
Fixes: ec33687c6749 ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback counters")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405231728.tCAogiSI-lkp@intel.com/
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The mTHP swap related counters: 'anon_swpout' and 'anon_swpout_fallback'
are confusing with an 'anon_' prefix, since the shmem can swap out
non-anonymous pages. So drop the 'anon_' prefix to keep consistent with
the old swap counter names.
This is needed in 6.10-rcX to avoid having an inconsistent ABI out in the
field.
Link: https://lkml.kernel.org/r/7a8989c13299920d7589007a30065c3e2c19f0e0.1716431702.git.baolin.wang@linux.alibaba.com
Fixes: d0f048ac39f6 ("mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters")
Fixes: 42248b9d34ea ("mm: add docs for per-order mTHP counters and transhuge_page ABI")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Suggested-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the ACPI EC and AC drivers, the ACPI APEI error injection
driver and build issues related to the dev_is_pnp() macro referring to
pnp_bus_type that is not exported to modules.
Specifics:
- Fix error handling during EC operation region accesses in the ACPI
EC driver (Armin Wolf)
- Fix a memory leak in the APEI error injection driver introduced
during its converion to a platform driver (Dan Williams)
- Fix build failures related to the dev_is_pnp() macro by redefining
it as a proper function and exporting it to modules as appropriate
and unexport pnp_bus_type which need not be exported any more (Andy
Shevchenko)
- Update the ACPI AC driver to use power_supply_changed() to let the
power supply core handle configuration changes properly (Thomas
Weißschuh)"
* tag 'acpi-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: AC: Properly notify powermanagement core about changes
PNP: Hide pnp_bus_type from the non-PNP code
PNP: Make dev_is_pnp() to be a function and export it for modules
ACPI: EC: Avoid returning AE_OK on errors in address space handler
ACPI: EC: Abort address space access upon error
ACPI: APEI: EINJ: Fix einj_dev release leak
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix the intel_pstate and amd-pstate cpufreq drivers and the
cpupower utility.
Specifics:
- Fix a recently introduced unchecked HWP MSR access in the
intel_pstate driver (Srinivas Pandruvada)
- Add missing conversion from MHz to KHz to amd_pstate_set_boost() to
address sysfs inteface inconsistency and fix P-state frequency
reporting on AMD Family 1Ah CPUs in the cpupower utility (Dhananjay
Ugwekar)
- Get rid of an excess global header file used by the amd-pstate
cpufreq driver (Arnd Bergmann)"
* tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix unchecked HWP MSR access
cpufreq: amd-pstate: Fix the inconsistency in max frequency units
cpufreq: amd-pstate: remove global header file
tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs
|
|
No users after do_readlinkat started doing the job on its own.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20240604155257.109500-3-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use preempt_model_preemptible() to detect a preemptible kernel when
deciding whether or not to reschedule in order to drop a contended
spinlock or rwlock. Because PREEMPT_DYNAMIC selects PREEMPTION, kernels
built with PREEMPT_DYNAMIC=y will yield contended locks even if the live
preemption model is "none" or "voluntary". In short, make kernels with
dynamically selected models behave the same as kernels with statically
selected models.
Somewhat counter-intuitively, NOT yielding a lock can provide better
latency for the relevant tasks/processes. E.g. KVM x86's mmu_lock, a
rwlock, is often contended between an invalidation event (takes mmu_lock
for write) and a vCPU servicing a guest page fault (takes mmu_lock for
read). For _some_ setups, letting the invalidation task complete even
if there is mmu_lock contention provides lower latency for *all* tasks,
i.e. the invalidation completes sooner *and* the vCPU services the guest
page fault sooner.
But even KVM's mmu_lock behavior isn't uniform, e.g. the "best" behavior
can vary depending on the host VMM, the guest workload, the number of
vCPUs, the number of pCPUs in the host, why there is lock contention, etc.
In other words, simply deleting the CONFIG_PREEMPTION guard (or doing the
opposite and removing contention yielding entirely) needs to come with a
big pile of data proving that changing the status quo is a net positive.
Opportunistically document this side effect of preempt=full, as yielding
contended spinlocks can have significant, user-visible impact.
Fixes: c597bfddc9e9 ("sched: Provide Kconfig support for default dynamic preempt mode")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@proxmox.com
|
|
Move the declarations and inlined implementations of the preempt_model_*()
helpers to preempt.h so that they can be referenced in spinlock.h without
creating a potential circular dependency between spinlock.h and sched.h.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
Link: https://lkml.kernel.org/r/20240528003521.979836-2-ankur.a.arora@oracle.com
|
|
For ${atomic}_sub_and_test() the @i parameter is the value to subtract,
not add. Fix the typo in the kerneldoc template and generate the headers
with this update.
Fixes: ad8110706f38 ("locking/atomic: scripts: generate kerneldoc comments")
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com
|