Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest update from Steven Rostedt:
- Fix failure of directory of log file not existing
If a LOG_FILE option is set for ktest to log its messages, and the
directory path does not exist. Then ktest fails. Have ktest attempt
to create the directory where the log file exists and if that
succeeds continue on testing.
* tag 'ktest-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Fix Test Failures Due to Missing LOG_FILE Directories
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tooling updates from Steven Rostedt:
- Allow RTLA to collect data via BPF
The current implementation of rtla uses libtracefs and libtraceevent
to pull sample events generated by the timerlat tracer from the trace
buffer. rtla then processes the sample by updating the histogram and
summary (current, maximum, minimum, and sum values) as well as checks
if tracing has been stopped due to threshold overflow.
In use cases where a large number of samples is being generated, that
is, with measurements running on many CPUs and with a low interval,
this sample processing design causes a significant CPU load on the
rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was
reported as not being able to keep up with the samples and dropping
most of them, leading to it being unusable.
Change the way the timerlat trace processes samples by attaching a
BPF program to the trace event using the BPF skeleton feature of
bpftool. Unlike the current implementation, the BPF implementation
does not check whether tracing is stopped (in BPF mode, tracing is
always off to improve performance), but waits for a write to a BPF
ringbuffer instead. This allows rtla to exit immediately when a
threshold is violated, without waiting for the next iteration of the
while loop.
If the requirements for the BPF implementation are not met, either at
build time or at run time, the current implementation is used as
fallback. Which implementation is being used can be seen when running
rtla timerlat with "-D" option. rtla can be forced to run in non-BPF
mode by setting the RTLA_NO_BPF option to 1, for debugging purposes.
- Fix LD_FLAGS from being dropped in build
- Refactor code to remove duplication of save_trace_to_file
- Always set options and do not rely on default settings
Do not rely on the default kernel settings of the tracers when
starting. They could have been changed by the user which gives
inconsistent results. Always set the options that rtla expects.
- Add creation of ctags and TAGS for traversing code
* tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Add the ability to create ctags and etags
rtla/tests: Test setting default options
rtla/tests: Reset osnoise options before check
rtla: Always set all tracer options
rtla/osnoise: Set OSNOISE_WORKLOAD to true
rtla: Unify apply_config between top and hist
rtla/osnoise: Unify params struct
rtla: Fix segfault in save_trace_to_file call
tools/build: Use SYSTEM_BPFTOOL for system bpftool
rtla: Refactor save_trace_to_file
tools/rv: Keep user LDFLAGS in build
rtla/timerlat: Test BPF mode
rtla/timerlat_top: Use BPF to collect samples
rtla/timerlat_top: Move divisor to update
rtla/timerlat_hist: Use BPF to collect samples
rtla/timerlat: Add BPF skeleton to collect samples
rtla: Add optional dependency on BPF tooling
tools/build: Add bpftool-skeletons feature test
rtla/timerlat: Unify params struct
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull latency tracing updates from Steven Rostedt:
- Add some trace events to osnoise and timerlat sample generation
This adds more information to the osnoise and timerlat tracers as
well as allows BPF programs to be attached to these locations to
extract even more data.
- Fix to DECLARE_TRACE_CONDITION() macro
It wasn't used but now will be and it happened to be broken causing
the build to fail.
- Add scheduler specification monitors to runtime verifier (RV)
This is a continuation of Daniel Bristot's work.
RV allows monitors to run and react concurrently. Running the
cumulative model is equivalent to running single components using the
same reactors, with the advantage that it's easier to point out which
specification failed in case of error.
This update introduces nested monitors to RV, in short, the sysfs
monitor folder will contain a monitor named sched, which is nothing
but an empty container for other monitors. Controlling the sched
monitor (enable, disable, set reactors) controls all nested monitors.
The following scheduling monitors are added:
- sco: scheduling context operations
Monitor to ensure sched_set_state happens only in thread context
- tss: task switch while scheduling
Monitor to ensure sched_switch happens only in scheduling context
- snroc: set non runnable on its own context
Monitor to ensure set_state happens only in the respective task's context
- scpd: schedule called with preemption disabled
Monitor to ensure schedule is called with preemption disabled
- snep: schedule does not enable preempt
Monitor to ensure schedule does not enable preempt
- sncid: schedule not called with interrupt disabled
Monitor to ensure schedule is not called with interrupt disabled
* tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/rv: Allow rv list to filter for container
Documentation/rv: Add docs for the sched monitors
verification/dot2k: Add support for nested monitors
tools/rv: Add support for nested monitors
rv: Add scpd, snep and sncid per-cpu monitors
rv: Add snroc per-task monitor
rv: Add sco and tss per-cpu monitors
rv: Add option for nested monitors and include sched
sched: Add sched tracepoints for RV task model
rv: Add license identifiers to monitor files
tracing: Fix DECLARE_TRACE_CONDITION
trace/osnoise: Add trace events for samples
|
|
- Remove unused tools 'pci' build target left over after moving tests to
tools/testing/selftests/pci_endpoint (Jianfeng Liu)
- Fix typos and whitespace errors (Bjorn Helgaas)
* pci/misc:
PCI: Fix typos
tools/Makefile: Remove pci target
# Conflicts:
# drivers/pci/endpoint/functions/pci-epf-test.c
|
|
- Fix endpoint BAR testing so the test can skip disabled BARs instead of
reporting them as failures (Niklas Cassel)
- Verify that pci_endpoint interrupt tests set the correct IRQ type
(Kunihiko Hayashi)
- Fix interpretation of pci_endpoint_test_bars_read_bar() error returns
(Niklas Cassel)
- Fix potential string truncation in pci_endpoint_test_probe() (Niklas
Cassel)
- Increase endpoint test BAR size variable to accommodate BARs larger than
INT_MAX (Niklas Cassel)
- Release IRQs to avoid leak in pci_endpoint interrupt tests (Kunihiko
Hayashi)
- Log the correct IRQ type when pci_endpoint IRQ request test fails
(Kunihiko Hayashi)
- Remove pci_endpoint_test irq_type and no_msi globals; instead use
test->irq_type (Kunihiko Hayashi)
- Remove unnecessary use of managed IRQ functions in pci_endpoint_test
(Kunihiko Hayashi)
- Add and use IRQ_TYPE_* defines in pci_endpoint_test (Niklas Cassel)
- Add struct pci_epc_features.intx_capable and note that RK3568 and RK3588
can't raise INTx interrupts (Niklas Cassel)
- Expose supported IRQ types in CAPS so pci_endpoint_test can set
appropriate type (Niklas Cassel)
- Add PCITEST_IRQ_TYPE_AUTO to pci_endpoint_test for cases where the IRQ
type doesn't matter (Niklas Cassel)
* pci/endpoint-test:
misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO
PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
PCI: endpoint: Add intx_capable to epc_features struct
selftests: pci_endpoint: Use IRQ_TYPE_* defines from UAPI header
misc: pci_endpoint_test: Use IRQ_TYPE_* defines from UAPI header
PCI: endpoint: pcitest: Add IRQ_TYPE_* defines to UAPI header
misc: pci_endpoint_test: Do not use managed IRQ functions
misc: pci_endpoint_test: Remove global 'irq_type' and 'no_msi'
misc: pci_endpoint_test: Fix 'irq_type' to convey the correct type
misc: pci_endpoint_test: Fix displaying 'irq_type' after 'request_irq' error
misc: pci_endpoint_test: Avoid issue of interrupts remaining after request_irq error
misc: pci_endpoint_test: Handle BAR sizes larger than INT_MAX
misc: pci_endpoint_test: Give disabled BARs a distinct error code
misc: pci_endpoint_test: Fix potential truncation in pci_endpoint_test_probe()
misc: pci_endpoint_test: Fix pci_endpoint_test_bars_read_bar() error handling
selftests: pci_endpoint: Add GET_IRQTYPE checks to each interrupt test
selftests: pci_endpoint: Skip disabled BARs
|
|
Handle missing parent directories for LOG_FILE path to prevent test
failures. If the parent directories don't exist, create them to ensure
the tests proceed successfully.
Cc: <warthog9@eaglescrag.net>
Link: https://lore.kernel.org/20250307043854.2518539-1-Ayush.jain3@amd.com
Signed-off-by: Ayush Jain <Ayush.jain3@amd.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Add argument limitation test case for dynamic events.
This is a boudary check for the maximum number of the probe
event arguments.
Link: https://lore.kernel.org/all/174055078295.4079315.14702008939511417359.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Add BAD_TP_NAME syntax error message check.
Link: https://lore.kernel.org/all/174055077485.4079315.3624012056141021755.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Expand the tprobe event test case to check wrong tracepoint
format.
Link: https://lore.kernel.org/all/174055076681.4079315.16941322116874021804.stgit@mhiramat.tok.corp.google.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls)
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked) in
BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU). Improving TCP stream
performance up to 2x.
- Improve performance of contended connect() by 200% by searching for
an available 4-tuple under RCU rather than a spin lock. Bring an
additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles. There are up to 4
namespace roles but we used to have just 2 netns pointer arguments,
interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout in
TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a
module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name to
messages as metadata
Driver API:
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where
possible. Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers:
- Remove the IBM LCS driver for s390
- Remove the sb1000 cable modem driver
- Add support for SFP module access over SMBus
- Add MCTP transport driver for MCTP-over-USB
- Enable XDP metadata support in multiple drivers
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD
platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96%
with 1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused
cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at
runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7"
* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
unix: fix up for "apparmor: add fine grained af_unix mediation"
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
net: usb: asix: ax88772: Increase phy_name size
net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
net: libwx: fix Tx L4 checksum
net: libwx: fix Tx descriptor content for some tunnel packets
atm: Fix NULL pointer dereference
net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
net: phy: aquantia: add essential functions to aqr105 driver
net: phy: aquantia: search for firmware-name in fwnode
net: phy: aquantia: add probe function to aqr105 for firmware loading
net: phy: Add swnode support to mdiobus_scan
gve: add XDP DROP and PASS support for DQ
gve: update XDP allocation path support RX buffer posting
gve: merge packet buffer size fields
gve: update GQ RX to use buf_size
gve: introduce config-based allocation for XDP
gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Move vm_table members out of kernel/sysctl.c
All vm_table array members have moved to their respective subsystems
leading to the removal of vm_table from kernel/sysctl.c. This
increases modularity by placing the ctl_tables closer to where they
are actually used and at the same time reducing the chances of merge
conflicts in kernel/sysctl.c.
- ctl_table range fixes
Replace the proc_handler function that checks variable ranges in
coredump_sysctls and vdso_table with the one that actually uses the
extra{1,2} pointers as min/max values. This tightens the range of the
values that users can pass into the kernel effectively preventing
{under,over}flows.
- Misc fixes
Correct grammar errors and typos in test messages. Update sysctl
files in MAINTAINERS. Constified and removed array size in
declaration for alignment_tbl
* tag 'sysctl-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (22 commits)
selftests/sysctl: fix wording of help messages
selftests: fix spelling/grammar errors in sysctl/sysctl.sh
MAINTAINERS: Update sysctl file list in MAINTAINERS
sysctl: Fix underflow value setting risk in vm_table
coredump: Fixes core_pipe_limit sysctl proc_handler
sysctl: remove unneeded include
sysctl: remove the vm_table
sh: vdso: move the sysctl to arch/sh/kernel/vsyscall/vsyscall.c
x86: vdso: move the sysctl to arch/x86/entry/vdso/vdso32-setup.c
fs: dcache: move the sysctl to fs/dcache.c
sunrpc: simplify rpcauth_cache_shrink_count()
fs: drop_caches: move sysctl to fs/drop_caches.c
fs: fs-writeback: move sysctl to fs/fs-writeback.c
mm: nommu: move sysctl to mm/nommu.c
security: min_addr: move sysctl to security/min_addr.c
mm: mmap: move sysctl to mm/mmap.c
mm: util: move sysctls to mm/util.c
mm: vmscan: move vmscan sysctls to mm/vmscan.c
mm: swap: move sysctl to mm/swap.c
mm: filemap: move sysctl to mm/filemap.c
...
|
|
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (scsi_debug, ufs, lpfc, st, fnic, mpi3mr,
mpt3sas) and the removal of cxlflash.
The only non-trivial core change is an addition to unit attention
handling to recognize UAs for power on/reset and new media so the tape
driver can use it"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (107 commits)
scsi: st: Tighten the page format heuristics with MODE SELECT
scsi: st: ERASE does not change tape location
scsi: st: Fix array overflow in st_setup()
scsi: target: tcm_loop: Fix wrong abort tag
scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flag
scsi: hisi_sas: Fixed failure to issue vendor specific commands
scsi: fnic: Remove unnecessary NUL-terminations
scsi: fnic: Remove redundant flush_workqueue() calls
scsi: core: Use a switch statement when attaching VPD pages
scsi: ufs: renesas: Add initialization code for R-Car S4-8 ES1.2
scsi: ufs: renesas: Add reusable functions
scsi: ufs: renesas: Refactor 0x10ad/0x10af PHY settings
scsi: ufs: renesas: Remove register control helper function
scsi: ufs: renesas: Add register read to remove save/set/restore
scsi: ufs: renesas: Replace init data by init code
scsi: ufs: dt-bindings: renesas,ufs: Add calibration data
scsi: mpi3mr: Task Abort EH Support
scsi: storvsc: Don't report the host packet status as the hv status
scsi: isci: Make most module parameters static
scsi: megaraid_sas: Make most module parameters static
...
|
|
Pull io_uring updates from Jens Axboe:
"This is the first of the io_uring pull requests for the 6.15 merge
window, there will be others once the net tree has gone in. This
contains:
- Cleanup and unification of cancelation handling across various
request types.
- Improvement for bundles, supporting them both for incrementally
consumed buffers, and for non-multishot requests.
- Enable toggling of using iowait while waiting on io_uring events or
not. Unfortunately this is still tied with CPU frequency boosting
on short waits, as the scheduler side has not been very receptive
to splitting the (useless) iowait stat from the cpufreq implied
boost.
- Add support for kbuf nodes, enabling zero-copy support for the ublk
block driver.
- Various cleanups for resource node handling.
- Series greatly cleaning up the legacy provided (non-ring based)
buffers. For years, we've been pushing the ring provided buffers as
the way to go, and that is what people have been using. Reduce the
complexity and code associated with legacy provided buffers.
- Series cleaning up the compat handling.
- Series improving and cleaning up the recvmsg/sendmsg iovec and msg
handling.
- Series of cleanups for io-wq.
- Start adding a bunch of selftests. The liburing repository
generally carries feature and regression tests for everything, but
at least for ublk initially, we'll try and go the route of having
it in selftests as well. We'll see how this goes, might decide to
migrate more tests this way in the future.
- Various little cleanups and fixes"
* tag 'for-6.15/io_uring-20250322' of git://git.kernel.dk/linux: (108 commits)
selftests: ublk: add stripe target
selftests: ublk: simplify loop io completion
selftests: ublk: enable zero copy for null target
selftests: ublk: prepare for supporting stripe target
selftests: ublk: move common code into common.c
selftests: ublk: increase max buffer size to 1MB
selftests: ublk: add single sqe allocator helper
selftests: ublk: add generic_01 for verifying sequential IO order
selftests: ublk: fix starting ublk device
io_uring: enable toggle of iowait usage when waiting on CQEs
selftests: ublk: fix write cache implementation
selftests: ublk: add variable for user to not show test result
selftests: ublk: don't show `modprobe` failure
selftests: ublk: add one dependency header
io_uring/kbuf: enable bundles for incrementally consumed buffers
Revert "io_uring/rsrc: simplify the bvec iter count calculation"
selftests: ublk: improve test usability
selftests: ublk: add stress test for covering IO vs. killing ublk server
selftests: ublk: add one stress test for covering IO vs. removing device
selftests: ublk: load/unload ublk_drv when preparing & cleaning up tests
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers updates from Ilpo Järvinen:
- alienware-wmi:
- Refactor and split WMAX/legacy drivers
- dell-ddv:
- Correct +0.1 offset in temperature
- Use the power supply extension mechanism for battery temperatures
- intel/pmc:
- Refactor init to mostly use a common init function
- Add support for Arrow Lake U/H
- Add support for Panther Lake
- intel/sst:
- Improve multi die handling
- Prefix header search path with sysroot (fixes cross-compiling)
- lenovo-wmi-hotkey-utilities:
- Support for mic & audio mute LEDs
- samsung-galaxybook:
- Add driver for Samsung Galaxy Book series
- wmi:
- Rework WCxx/WExx ACPI method handling
- Enable data block collection when the data block is set
- platform/arm:
- Add Huawei Matebook E Go EC driver
- platform/mellanox:
- Relocate to drivers/platform/mellanox/
- mlxbf-bootctl:
- RTC battery status sysfs support
- Miscellaneous cleanups / refactoring / improvements
* tag 'platform-drivers-x86-v6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (75 commits)
platform/x86: x86-android-tablets: Add select POWER_SUPPLY to Kconfig
platform/x86/amd/pmf: convert timeouts to secs_to_jiffies()
platform/x86: thinkpad_acpi: convert timeouts to secs_to_jiffies()
irqdomain: platform/x86: Switch to irq_domain_create_linear()
platform/x86/amd/pmc: fix leak in probe()
tools/power/x86/intel-speed-select: v1.22 release
tools/power/x86/intel-speed-select: Prefix header search path with sysroot
tools/power/x86/intel-speed-select: Die ID for IO dies
tools/power/x86/intel-speed-select: Fix the condition to check multi die system
tools/power/x86/intel-speed-select: Prevent increasing MAX_DIE_PER_PACKAGE
platform/x86/amd/pmc: Use managed APIs for mutex
platform/x86/amd/pmc: Remove unnecessary line breaks
platform/x86/amd/pmc: Move macros and structures to the PMC header file
platform/x86/amd/pmc: Notify user when platform does not support s0ix transition
platform/x86: dell-ddv: Use the power supply extension mechanism
platform/x86: dell-ddv: Use devm_battery_hook_register
platform/x86: dell-ddv: Fix temperature calculation
platform/x86: thinkpad_acpi: check the return value of devm_mutex_init()
platform/x86: samsung-galaxybook: Fix block_recording not supported logic
platform/x86: dell-uart-backlight: Make dell_uart_bl_serdev_driver static
...
|
|
- Add the ability to create and remove ctags and etags, using the following
make tags
make TAGS
make tags_clean
- fix a comment in Makefile.rtla with the correct spelling and don't
imply that the ability to create an rtla tarball will be removed
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: "Luis Claudio R . Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250321175053.29048-1-jkacur@redhat.com
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Add function to test engine to test with pre-set osnoise options, and
use it to test whether osnoise period (as an example) is set correctly.
The test works by pre-setting a high period of 10 minutes and stop on
threshold. Thus, it is easy to check whether rtla is properly resetting
the period to default: if it is, the test will complete on time, since
the first sample will overflow the threshold. If not, it will time out.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-7-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Remove any dangling tracing instances from previous improperly exited
runs of rtla, and reset osnoise options to default before running a test
case.
This ensures that the test results are deterministic. Specific test
cases checked that rtla behaves correctly even when the tracer state is
not clean will be added later.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-6-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
rtla currently only sets tracer options that are explicitly set by the
user, with the exception of OSNOISE_WORKLOAD.
This leads to improper behavior in case rtla is run with those options
not set to the default value. rtla does reset them to the original
value upon exiting, but that does not protect it from starting with
non-default values set either by an improperly exited rtla or by another
user of the tracers.
For example, after running this command:
$ echo 1 > /sys/kernel/tracing/osnoise/stop_tracing_us
all runs of rtla will stop at the 1us threshold, even if not requested
by the user:
$ rtla osnoise hist
Index CPU-000 CPU-001
1 8 5
2 5 9
3 1 2
4 6 1
5 2 1
6 0 1
8 1 1
12 0 1
14 1 0
15 1 0
over: 0 0
count: 25 21
min: 1 1
avg: 3.68 3.05
max: 15 12
rtla osnoise hit stop tracing
Fix the problem by setting the default value for all tracer options if
the user has not provided their own value.
For most of the options, it's enough to just drop the if clause checking
for the value being set. For cpus, "all" is used as the default value,
and for osnoise default period and runtime, default values of
the osnoise_data variable in trace_osnoise.c are used.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-5-tglozar@redhat.com
Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Fixes: a828cd18bc4a ("rtla: Add timerlat tool and timelart top mode")
Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
If running rtla osnoise with NO_OSNOISE_WORKLOAD, it reports no samples:
$ echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
$ rtla osnoise hist -d 10s
Index
over: 0
count: 0
min: 0
avg: 0
max: 0
This situation can also happen when running rtla-osnoise after an
improperly exited rtla-timerlat run.
Set OSNOISE_WORKLOAD in rtla-osnoise, too, similarly to what we
already did for timerlat in commit 217f0b1e990e ("rtla/timerlat_top: Set
OSNOISE_WORKLOAD for kernel threads") and commit d8d866171a41
("rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads").
Note that there is no user workload mode for rtla-osnoise yet, so
OSNOISE_WORKLOAD is always set to true.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-4-tglozar@redhat.com
Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The functions osnoise_top_apply_config and osnoise_hist_apply_config, as
well as timerlat_top_apply_config and timerlat_hist_apply_config, are
mostly the same.
Move common part from them into separate functions osnoise_apply_config
and timerlat_apply_config.
For rtla-timerlat, also unify params->user_hist and params->user_top
into one field called params->user_data, and move several fields used
only by timerlat-top into the top-only section of struct
timerlat_params.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-3-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Instead of having separate structs osnoise_top_params and
osnoise_hist_params, use one struct osnoise_params for both.
This allows code using the structs to be shared between osnoise-top and
osnoise-hist.
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250320092500.101385-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Running rtla with exit on threshold, but without saving trace leads to a
segmenetation fault:
$ rtla timerlat hist -T 10
...
Max timerlat IRQ latency from idle: 4.29 us in cpu 0
Segmentation fault
This is caused by null pointer deference in the call of
save_trace_to_file, which attempts to dereference an uninitialized
osnoise_tool variable:
save_trace_to_file(record->trace.inst, params->trace_output);
^ this is uninitialized if params->trace_output is
not set
Fix this by not attempting to dereference "record" if it is NULL and
passing NULL instead. As a safety measure, the first field is also
checked for NULL inside save_trace_to_file.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Costa Shulyupin <costa.shul@redhat.com>
Link: https://lore.kernel.org/20250313141034.299117-1-tglozar@redhat.com
Fixes: dc4d4e7c72d1 ("rtla: Refactor save_trace_to_file")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The feature test for system bpftool uses BPFTOOL as the variable to set
its path, defaulting to just "bpftool" if not set by the user.
This conflicts with selftests and a few other utilities, which expect
BPFTOOL to be set to the in-tree bpftool path by default. For example,
bpftool selftests fail to build:
$ make -C tools/testing/selftests/bpf/
make: Entering directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'
make: *** No rule to make target 'bpftool', needed by '/home/tglozar/dev/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h'. Stop.
make: Leaving directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf'
Fix the problem by renaming the variable used for system bpftool from
BPFTOOL to SYSTEM_BPFTOOL, so that the new usage does not conflict with
the existing one of BPFTOOL.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250326004018.248357-1-tglozar@redhat.com
Fixes: 8a635c3856dd ("tools/build: Add bpftool-skeletons feature test")
Closes: https://lore.kernel.org/linux-kernel/5df6968a-2e5f-468e-b457-fc201535dd4c@linux.ibm.com/
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Suggested-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Test all network blockers:
- net.bind_tcp
- net.connect_tcp
Test coverage for security/landlock is 94.0% of 1525 lines according to
gcc/gcov-14.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-28-mic@digikod.net
[mic: Update test coverage]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Test all filesystem blockers, including events with several records, and
record with several blockers:
- fs.execute
- fs.write_file
- fs.read_file
- fs_read_dir
- fs.remove_dir
- fs.remove_file
- fs.make_char
- fs.make_dir
- fs.make_reg
- fs.make_sock
- fs.make_fifo
- fs.make_block
- fs.make_sym
- fs.refer
- fs.truncate
- fs.ioctl_dev
- fs.change_topology
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-27-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add a new scoped_audit.connect_to_child test to check the abstract UNIX
socket blocker.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-26-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add tests for all ptrace actions checking "blockers=ptrace" records.
This also improves PTRACE_TRACEME and PTRACE_ATTACH tests by making sure
that the restrictions comes from Landlock, and with the expected
process. These extended tests are like enhanced errno checks that make
sure Landlock enforcement is consistent.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-25-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit_exec tests to filter Landlock denials according to
cross-execution or muted subdomains.
Add a wait-pipe-sandbox.c test program to sandbox itself and send a
(denied) signals to its parent.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-24-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add audit_test.c to check with and without LANDLOCK_RESTRICT_SELF_*
flags against the two Landlock audit record types:
AUDIT_LANDLOCK_ACCESS and AUDIT_LANDLOCK_DOMAIN.
Check consistency of domain IDs per layer in AUDIT_LANDLOCK_ACCESS and
AUDIT_LANDLOCK_DOMAIN messages: denied access, domain allocation, and
domain deallocation.
These tests use signal scoping to make it simple. They are not in the
scoped_signal_test.c file but in the new dedicated audit_test.c file.
Tests are run with audit filters to ensure the audit records come from
the test program. Moreover, because there can only be one audit
process, tests would failed if run in parallel. Because of audit
limitations, tests can only be run in the initial namespace.
The audit test helpers were inspired by libaudit and
tools/testing/selftests/net/netfilter/audit_logread.c
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Phil Sutter <phil@nwl.cc>
Link: https://lore.kernel.org/r/20250320190717.2287696-23-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add the base_test's restrict_self_fd_flags tests to align with previous
restrict_self_fd tests but with the new
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF flag.
Add the restrict_self_flags tests to check that
LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF,
LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON, and
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF are valid but not the next
bit. Some checks are similar to restrict_self_checks_ordering's ones.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-22-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
To align with fs_test's layout1.inval and layout0.proc_nsfs which test
EBADFD for landlock_add_rule(2), create a new base_test's
restrict_self_fd which test EBADFD for landlock_restrict_self(2).
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-21-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Most of the time we want to log denied access because they should not
happen and such information helps diagnose issues. However, when
sandboxing processes that we know will try to access denied resources
(e.g. unknown, bogus, or malicious binary), we might want to not log
related access requests that might fill up logs.
By default, denied requests are logged until the task call execve(2).
If the LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF flag is set, denied
requests will not be logged for the same executed file.
If the LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON flag is set, denied
requests from after an execve(2) call will be logged.
The rationale is that a program should know its own behavior, but not
necessarily the behavior of other programs.
Because LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF is set for a specific
Landlock domain, it makes it possible to selectively mask some access
requests that would be logged by a parent domain, which might be handy
for unprivileged processes to limit logs. However, system
administrators should still use the audit filtering mechanism. There is
intentionally no audit nor sysctl configuration to re-enable these logs.
This is delegated to the user space program.
Increment the Landlock ABI version to reflect this interface change.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-18-mic@digikod.net
[mic: Rename variables and fix __maybe_unused]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Landlock IDs can be generated to uniquely identify Landlock objects.
For now, only Landlock domains get an ID at creation time. These IDs
map to immutable domain hierarchies.
Landlock IDs have important properties:
- They are unique during the lifetime of the running system thanks to
the 64-bit values: at worse, 2^60 - 2*2^32 useful IDs.
- They are always greater than 2^32 and must then be stored in 64-bit
integer types.
- The initial ID (at boot time) is randomly picked between 2^32 and
2^33, which limits collisions in logs across different boots.
- IDs are sequential, which enables users to order them.
- IDs may not be consecutive but increase with a random 2^4 step, which
limits side channels.
Such IDs can be exposed to unprivileged processes, even if it is not the
case with this audit patch series. The domain IDs will be useful for
user space to identify sandboxes and get their properties.
These Landlock IDs are more secure that other absolute kernel IDs such
as pipe's inodes which rely on a shared global counter.
For checkpoint/restore features (i.e. CRIU), we could easily implement a
privileged interface (e.g. sysfs) to set the next ID counter.
IDR/IDA are not used because we only need a bijection from Landlock
objects to Landlock IDs, and we must not recycle IDs. This enables us
to identify all Landlock objects during the lifetime of the system (e.g.
in logs), but not to access an object from an ID nor know if an ID is
assigned. Using a counter is simpler, it scales (i.e. avoids growing
memory footprint), and it does not require locking. We'll use proper
file descriptors (with IDs used as inode numbers) to access Landlock
objects.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250320190717.2287696-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The new signal_scoping_thread_setuid tests check that the libc's
setuid() function works as expected even when a thread is sandboxed with
scoped signal restrictions.
Before the signal scoping fix, this test would have failed with the
setuid() call:
[pid 65] getpid() = 65
[pid 65] tgkill(65, 66, SIGRT_1) = -1 EPERM (Operation not permitted)
[pid 65] futex(0x40a66cdc, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 65] setuid(1001) = 0
After the fix, tgkill(2) is successfully leveraged to synchronize
credentials update across threads:
[pid 65] getpid() = 65
[pid 65] tgkill(65, 66, SIGRT_1) = 0
[pid 66] <... read resumed>0x40a65eb7, 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 66] --- SIGRT_1 {si_signo=SIGRT_1, si_code=SI_TKILL, si_pid=65, si_uid=1000} ---
[pid 66] getpid() = 65
[pid 66] setuid(1001) = 0
[pid 66] futex(0x40a66cdc, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 66] rt_sigreturn({mask=[]}) = 0
[pid 66] read(3, <unfinished ...>
[pid 65] setuid(1001) = 0
Test coverage for security/landlock is 92.9% of 1137 lines according to
gcc/gcov-14.
Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
Cc: Günther Noack <gnoack@google.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-8-mic@digikod.net
[mic: Update test coverage]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Split signal_scoping_threads tests into signal_scoping_thread_before
and signal_scoping_thread_after.
Use local variables for thread synchronization. Fix exported function.
Replace some asserts with expects.
Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
Cc: Günther Noack <gnoack@google.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-7-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Because Linux credentials are managed per thread, user space relies on
some hack to synchronize credential update across threads from the same
process. This is required by the Native POSIX Threads Library and
implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to
synchronize threads. See nptl(7) and libpsx(3). Furthermore, some
runtimes like Go do not enable developers to have control over threads
[1].
To avoid potential issues, and because threads are not security
boundaries, let's relax the Landlock (optional) signal scoping to always
allow signals sent between threads of the same process. This exception
is similar to the __ptrace_may_access() one.
hook_file_set_fowner() now checks if the target task is part of the same
process as the caller. If this is the case, then the related signal
triggered by the socket will always be allowed.
Scoping of abstract UNIX sockets is not changed because kernel objects
(e.g. sockets) should be tied to their creator's domain at creation
time.
Note that creating one Landlock domain per thread puts each of these
threads (and their future children) in their own scope, which is
probably not what users expect, especially in Go where we do not control
threads. However, being able to drop permissions on all threads should
not be restricted by signal scoping. We are working on a way to make it
possible to atomically restrict all threads of a process with the same
domain [2].
Add erratum for signal scoping.
Closes: https://github.com/landlock-lsm/go-landlock/issues/36
Fixes: 54a6e6bbf3be ("landlock: Add signal scoping")
Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies")
Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1]
Link: https://github.com/landlock-lsm/linux/issues/2 [2]
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net
[mic: Add extra pointer check and RCU guard, and ease backport]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
For PCITEST_MSI we really want to set PCITEST_SET_IRQTYPE explicitly
to PCITEST_IRQ_TYPE_MSI, since we want to test if MSI works.
For PCITEST_MSIX we really want to set PCITEST_SET_IRQTYPE explicitly
to PCITEST_IRQ_TYPE_MSIX, since we want to test if MSI works.
For PCITEST_LEGACY_IRQ we really want to set PCITEST_SET_IRQTYPE
explicitly to PCITEST_IRQ_TYPE_INTX, since we want to test if INTx
works.
However, for PCITEST_WRITE, PCITEST_READ, PCITEST_COPY, we really don't
care which IRQ type that is used, we just want to use a IRQ type that is
supported by the EPC.
The old behavior was to always use MSI for PCITEST_WRITE, PCITEST_READ,
PCITEST_COPY, was to always set IRQ type to MSI before doing the actual
test, however, there are EPC drivers that do not support MSI.
Add a new PCITEST_IRQ_TYPE_AUTO, that will use the CAPS register to see
which IRQ types the endpoint supports, and use one of the supported IRQ
types.
For backwards compatibility, if the endpoint does not expose any supported
IRQ type in the CAPS register, simply fallback to using MSI, as it was
unconditionally done before.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://lore.kernel.org/r/20250310111016.859445-16-cassel@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
"Another set of improvements to the kernel's CRC (cyclic redundancy
check) code:
- Rework the CRC64 library functions to be directly optimized, like
what I did last cycle for the CRC32 and CRC-T10DIF library
functions
- Rewrite the x86 PCLMULQDQ-optimized CRC code, and add VPCLMULQDQ
support and acceleration for crc64_be and crc64_nvme
- Rewrite the riscv Zbc-optimized CRC code, and add acceleration for
crc_t10dif, crc64_be, and crc64_nvme
- Remove crc_t10dif and crc64_rocksoft from the crypto API, since
they are no longer needed there
- Rename crc64_rocksoft to crc64_nvme, as the old name was incorrect
- Add kunit test cases for crc64_nvme and crc7
- Eliminate redundant functions for calculating the Castagnoli CRC32,
settling on just crc32c()
- Remove unnecessary prompts from some of the CRC kconfig options
- Further optimize the x86 crc32c code"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (36 commits)
x86/crc: drop the avx10_256 functions and rename avx10_512 to avx512
lib/crc: remove unnecessary prompt for CONFIG_CRC64
lib/crc: remove unnecessary prompt for CONFIG_LIBCRC32C
lib/crc: remove unnecessary prompt for CONFIG_CRC8
lib/crc: remove unnecessary prompt for CONFIG_CRC7
lib/crc: remove unnecessary prompt for CONFIG_CRC4
lib/crc7: unexport crc7_be_syndrome_table
lib/crc_kunit.c: update comment in crc_benchmark()
lib/crc_kunit.c: add test and benchmark for crc7_be()
x86/crc32: optimize tail handling for crc32c short inputs
riscv/crc64: add Zbc optimized CRC64 functions
riscv/crc-t10dif: add Zbc optimized CRC-T10DIF function
riscv/crc32: reimplement the CRC32 functions using new template
riscv/crc: add "template" for Zbc optimized CRC functions
x86/crc: add ANNOTATE_NOENDBR to suppress objtool warnings
x86/crc32: improve crc32c_arch() code generation with clang
x86/crc64: implement crc64_be and crc64_nvme using new template
x86/crc-t10dif: implement crc_t10dif using new template
x86/crc32: implement crc32_le using new template
x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are dominated by cpufreq updates which in turn are dominated by
updates related to boost support in the core and drivers and
amd-pstate driver optimizations.
Apart from the above, there are some cpuidle updates including a
rework of the most recent idle intervals handling in the venerable
menu governor that leads to significant improvements in some
performance benchmarks, as the governor is now more likely to predict
a shorter idle duration in some cases, and there are updates of the
core device power management code, mostly related to system suspend
and resume, that should help to avoid potential issues arising when
the drivers of devices depending on one another want to use different
optimizations.
There is also a usual collection of assorted fixes and cleanups,
including removal of some unused code.
Specifics:
- Manage sysfs attributes and boost frequencies efficiently from
cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)
- Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
Dhananjay Ugwekar, Imran Shaik, zuoqian)
- Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
Bai)
- cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)
- Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)
- Optimize the amd-pstate driver to avoid cases where call paths end
up calling the same writes multiple times and needlessly caching
variables through code reorganization, locking overhaul and tracing
adjustments (Mario Limonciello, Dhananjay Ugwekar)
- Make it possible to avoid enabling capacity-aware scheduling (CAS)
in the intel_pstate driver and relocate a check for out-of-band
(OOB) platform handling in it to make it detect OOB before checking
HWP availability (Rafael Wysocki)
- Fix dbs_update() to avoid inadvertent conversions of negative
integer values to unsigned int which causes CPU frequency selection
to be inaccurate in some cases when the "conservative" cpufreq
governor is in use (Jie Zhan)
- Update the handling of the most recent idle intervals in the menu
cpuidle governor to prevent useful information from being discarded
by it in some cases and improve the prediction accuracy (Rafael
Wysocki)
- Make it possible to tell the intel_idle driver to ignore its
built-in table of idle states for the given processor, clean up the
handling of auto-demotion disabling on Baytrail and Cherrytrail
chips in it, and update its MAINTAINERS entry (David Arcari, Artem
Bityutskiy, Rafael Wysocki)
- Make some cpuidle drivers use for_each_present_cpu() instead of
for_each_possible_cpu() during initialization to avoid issues
occurring when nosmp or maxcpus=0 are used (Jacky Bai)
- Clean up the Energy Model handling code somewhat (Rafael Wysocki)
- Use kfree_rcu() to simplify the handling of runtime Energy Model
updates (Li RongQing)
- Add an entry for the Energy Model framework to MAINTAINERS as
properly maintained (Lukasz Luba)
- Address RCU-related sparse warnings in the Energy Model code
(Rafael Wysocki)
- Remove ENERGY_MODEL dependency on SMP and allow it to be selected
when DEVFREQ is set without CPUFREQ so it can be used on a wider
range of systems (Jeson Gao)
- Unify error handling during runtime suspend and runtime resume in
the core to help drivers to implement more consistent runtime PM
error handling (Rafael Wysocki)
- Drop a redundant check from pm_runtime_force_resume() and rearrange
documentation related to __pm_runtime_disable() (Rafael Wysocki)
- Rework the handling of the "smart suspend" driver flag in the PM
core to avoid issues hat may occur when drivers using it depend on
some other drivers and clean up the related PM core code (Rafael
Wysocki, Colin Ian King)
- Fix the handling of devices with the power.direct_complete flag set
if device_suspend() returns an error for at least one device to
avoid situations in which some of them may not be resumed (Rafael
Wysocki)
- Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
possible deadlock that may occur if the "compressor" hibernation
module parameter is accessed during the registration of a new
ieee80211 device (Lizhi Xu)
- Suppress sleeping parent warning in device_pm_add() in the case
when new children are added under a device with the
power.direct_complete set after it has been processed by
device_resume() (Xu Yang)
- Remove needless return in three void functions related to system
wakeup (Zijun Hu)
- Replace deprecated kmap_atomic() with kmap_local_page() in the
hibernation core code (David Reaver)
- Remove unused helper functions related to system sleep (David Alan
Gilbert)
- Clean up s2idle_enter() so it does not lock and unlock CPU offline
in vain and update comments in it (Ulf Hansson)
- Clean up broken white space in dpm_wait_for_children() (Geert
Uytterhoeven)
- Update the cpupower utility to fix lib version-ing in it and memory
leaks in error legs, remove hard-coded values, and implement CPU
physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
Khan, Yiwei Lin, Zhongqiu Han)"
* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
PM: sleep: Fix bit masking operation
dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
cpufreq: Init cpufreq only for present CPUs
PM: sleep: Fix handling devices with direct_complete set on errors
cpuidle: Init cpuidle only for present CPUs
PM: clk: Remove unused pm_clk_remove()
PM: sleep: core: Fix indentation in dpm_wait_for_children()
PM: s2idle: Extend comment in s2idle_enter()
PM: s2idle: Drop redundant locks when entering s2idle
PM: sleep: Remove unused pm_generic_ wrappers
cpufreq: tegra186: Share policy per cluster
cpupower: Make lib versioning scheme more obvious and fix version link
PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
PM: EM: Address RCU-related sparse warnings
cpupower: Implement CPU physical core querying
pm: cpupower: remove hard-coded topology depth values
pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
...
|
|
__stack_chk_fail() can be called from uaccess-enabled code. Make sure
uaccess gets disabled before calling panic().
Fixes the following warning:
kernel/trace/trace_branch.o: error: objtool: ftrace_likely_update+0x1ea: call to __stack_chk_fail() with UACCESS enabled
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/a3e97e0119e1b04c725a8aa05f7bc83d98e657eb.1742852847.git.jpoimboe@kernel.org
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Nested virtualization support for VGICv3, giving the nested
hypervisor control of the VGIC hardware when running an L2 VM
- Removal of 'late' nested virtualization feature register masking,
making the supported feature set directly visible to userspace
- Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
- Paravirtual interface for discovering the set of CPU
implementations where a VM may run, addressing a longstanding issue
of guest CPU errata awareness in big-little systems and
cross-implementation VM migration
- Userspace control of the registers responsible for identifying a
particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
allowing VMs to be migrated cross-implementation
- pKVM updates, including support for tracking stage-2 page table
allocations in the protected hypervisor in the 'SecPageTable' stat
- Fixes to vPMU, ensuring that userspace updates to the vPMU after
KVM_RUN are reflected into the backing perf events
LoongArch:
- Remove unnecessary header include path
- Assume constant PGD during VM context switch
- Add perf events support for guest VM
RISC-V:
- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
x86:
- Add support for aging of SPTEs without holding mmu_lock.
Not taking mmu_lock allows multiple aging actions to run in
parallel, and more importantly avoids stalling vCPUs. This includes
an implementation of per-rmap-entry locking; aging the gfn is done
with only a per-rmap single-bin spinlock taken, whereas locking an
rmap for write requires taking both the per-rmap spinlock and the
mmu_lock.
Note that this decreases slightly the accuracy of accessed-page
information, because changes to the SPTE outside aging might not
use atomic operations even if they could race against a clear of
the Accessed bit.
This is deliberate because KVM and mm/ tolerate false
positives/negatives for accessed information, and testing has shown
that reducing the latency of aging is far more beneficial to
overall system performance than providing "perfect" young/old
information.
- Defer runtime CPUID updates until KVM emulates a CPUID instruction,
to coalesce updates when multiple pieces of vCPU state are
changing, e.g. as part of a nested transition
- Fix a variety of nested emulation bugs, and add VMX support for
synthesizing nested VM-Exit on interception (instead of injecting
#UD into L2)
- Drop "support" for async page faults for protected guests that do
not set SEND_ALWAYS (i.e. that only want async page faults at CPL3)
- Bring a bit of sanity to x86's VM teardown code, which has
accumulated a lot of cruft over the years. Particularly, destroy
vCPUs before the MMU, despite the latter being a VM-wide operation
- Add common secure TSC infrastructure for use within SNP and in the
future TDX
- Block KVM_CAP_SYNC_REGS if guest state is protected. It does not
make sense to use the capability if the relevant registers are not
available for reading or writing
- Don't take kvm->lock when iterating over vCPUs in the suspend
notifier to fix a largely theoretical deadlock
- Use the vCPU's actual Xen PV clock information when starting the
Xen timer, as the cached state in arch.hv_clock can be stale/bogus
- Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across
different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as
KVM's suspend notifier only accounts for kvmclock, and there's no
evidence that the flag is actually supported by Xen guests
- Clean up the per-vCPU "cache" of its reference pvclock, and instead
only track the vCPU's TSC scaling (multipler+shift) metadata (which
is moderately expensive to compute, and rarely changes for modern
setups)
- Don't write to the Xen hypercall page on MSR writes that are
initiated by the host (userspace or KVM) to fix a class of bugs
where KVM can write to guest memory at unexpected times, e.g.
during vCPU creation if userspace has set the Xen hypercall MSR
index to collide with an MSR that KVM emulates
- Restrict the Xen hypercall MSR index to the unofficial synthetic
range to reduce the set of possible collisions with MSRs that are
emulated by KVM (collisions can still happen as KVM emulates
Hyper-V MSRs, which also reside in the synthetic range)
- Clean up and optimize KVM's handling of Xen MSR writes and
xen_hvm_config
- Update Xen TSC leaves during CPUID emulation instead of modifying
the CPUID entries when updating PV clocks; there is no guarantee PV
clocks will be updated between TSC frequency changes and CPUID
emulation, and guest reads of the TSC leaves should be rare, i.e.
are not a hot path
x86 (Intel):
- Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and
thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1
- Pass XFD_ERR as the payload when injecting #NM, as a preparatory
step for upcoming FRED virtualization support
- Decouple the EPT entry RWX protection bit macros from the EPT
Violation bits, both as a general cleanup and in anticipation of
adding support for emulating Mode-Based Execution Control (MBEC)
- Reject KVM_RUN if userspace manages to gain control and stuff
invalid guest state while KVM is in the middle of emulating nested
VM-Enter
- Add a macro to handle KVM's sanity checks on entry/exit VMCS
control pairs in anticipation of adding sanity checks for secondary
exit controls (the primary field is out of bits)
x86 (AMD):
- Ensure the PSP driver is initialized when both the PSP and KVM
modules are built-in (the initcall framework doesn't handle
dependencies)
- Use long-term pins when registering encrypted memory regions, so
that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and
don't lead to excessive fragmentation
- Add macros and helpers for setting GHCB return/error codes
- Add support for Idle HLT interception, which elides interception if
the vCPU has a pending, unmasked virtual IRQ when HLT is executed
- Fix a bug in INVPCID emulation where KVM fails to check for a
non-canonical address
- Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is
invalid, e.g. because the vCPU was "destroyed" via SNP's AP
Creation hypercall
- Reject SNP AP Creation if the requested SEV features for the vCPU
don't match the VM's configured set of features
Selftests:
- Fix again the Intel PMU counters test; add a data load and do
CLFLUSH{OPT} on the data instead of executing code. The theory is
that modern Intel CPUs have learned new code prefetching tricks
that bypass the PMU counters
- Fix a flaw in the Intel PMU counters test where it asserts that an
event is counting correctly without actually knowing what the event
counts on the underlying hardware
- Fix a variety of flaws, bugs, and false failures/passes
dirty_log_test, and improve its coverage by collecting all dirty
entries on each iteration
- Fix a few minor bugs related to handling of stats FDs
- Add infrastructure to make vCPU and VM stats FDs available to tests
by default (open the FDs during VM/vCPU creation)
- Relax an assertion on the number of HLT exits in the xAPIC IPI test
when running on a CPU that supports AMD's Idle HLT (which elides
interception of HLT if a virtual IRQ is pending and unmasked)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits)
RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
RISC-V: KVM: Teardown riscv specific bits after kvm_exit
LoongArch: KVM: Register perf callbacks for guest
LoongArch: KVM: Implement arch-specific functions for guest perf
LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel()
LoongArch: KVM: Remove PGD saving during VM context switch
LoongArch: KVM: Remove unnecessary header include path
KVM: arm64: Tear down vGIC on failed vCPU creation
KVM: arm64: PMU: Reload when resetting
KVM: arm64: PMU: Reload when user modifies registers
KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
KVM: arm64: Initialize HCRX_EL2 traps in pKVM
KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
KVM: x86: Add infrastructure for secure TSC
KVM: x86: Push down setting vcpu.arch.user_set_tsc
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 speculation mitigation updates from Borislav Petkov:
- Some preparatory work to convert the mitigations machinery to
mitigating attack vectors instead of single vulnerabilities
- Untangle and remove a now unneeded X86_FEATURE_USE_IBPB flag
- Add support for a Zen5-specific SRSO mitigation
- Cleanups and minor improvements
* tag 'x86_bugs_for_v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Make spectre user default depend on MITIGATION_SPECTRE_V2
x86/bugs: Use the cpu_smt_possible() helper instead of open-coded code
x86/bugs: Add AUTO mitigations for mds/taa/mmio/rfds
x86/bugs: Relocate mds/taa/mmio/rfds defines
x86/bugs: Add X86_BUG_SPECTRE_V2_USER
x86/bugs: Remove X86_FEATURE_USE_IBPB
KVM: nVMX: Always use IBPB to properly virtualize IBRS
x86/bugs: Use a static branch to guard IBPB on vCPU switch
x86/bugs: Remove the X86_FEATURE_USE_IBPB check in ib_prctl_set()
x86/mm: Remove X86_FEATURE_USE_IBPB checks in cond_mitigation()
x86/bugs: Move the X86_FEATURE_USE_IBPB check into callers
x86/bugs: KVM: Add support for SRSO_MSR_FIX
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Nothing major this time around.
Apart from the usual perf/PMU updates, some page table cleanups, the
notable features are average CPU frequency based on the AMUv1
counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
the uaccess routines.
Perf and PMUs:
- Support for the 'Rainier' CPU PMU from Arm
- Preparatory driver changes and cleanups that pave the way for BRBE
support
- Support for partial virtualisation of the Apple-M1 PMU
- Support for the second event filter in Arm CSPMU designs
- Minor fixes and cleanups (CMN and DWC PMUs)
- Enable EL2 requirements for FEAT_PMUv3p9
Power, CPU topology:
- Support for AMUv1-based average CPU frequency
- Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
adds a generic topology_is_primary_thread() function overridden by
x86 and powerpc
New(ish) features:
- MOPS (memcpy/memset) support for the uaccess routines
Security/confidential compute:
- Fix the DMA address for devices used in Realms with Arm CCA. The
CCA architecture uses the address bit to differentiate between
shared and private addresses
- Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
default
Memory management clean-ups:
- Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
- Some minor page table accessor clean-ups
- PIE/POE (permission indirection/overlay) helpers clean-up
Kselftests:
- MTE: skip hugetlb tests if MTE is not supported on such mappings
and user correct naming for sync/async tag checking modes
Miscellaneous:
- Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
request)
- Sysreg updates for new register fields
- CPU type info for some Qualcomm Kryo cores"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
arm64: mm: Don't use %pK through printk
perf/arm_cspmu: Fix missing io.h include
arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
arm64: cputype: Add MIDR_CORTEX_A76AE
arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
arm64/sysreg: Enforce whole word match for open/close tokens
arm64/sysreg: Fix unbalanced closing block
arm64: Kconfig: Enable HOTPLUG_SMT
arm64: topology: Support SMT control on ACPI based system
arch_topology: Support SMT control for OF based system
cpu/SMT: Provide a default topology_is_primary_thread()
arm64/mm: Define PTDESC_ORDER
perf/arm_cspmu: Add PMEVFILT2R support
perf/arm_cspmu: Generalise event filtering
perf/arm_cspmu: Move register definitons to header
arm64/kernel: Always use level 2 or higher for early mappings
arm64/mm: Drop PXD_TABLE_BIT
arm64/mm: Check pmd_table() in pmd_trans_huge()
...
|
|
'for-next/sysreg', 'for-next/misc', 'for-next/pgtable-cleanups', 'for-next/kselftest', 'for-next/uaccess-mops', 'for-next/pie-poe-cleanup', 'for-next/cputype-kryo', 'for-next/cca-dma-address', 'for-next/drop-pxd_table_bit' and 'for-next/spectre-bhb-assume-vulnerable', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
perf/arm_cspmu: Fix missing io.h include
perf/arm_cspmu: Add PMEVFILT2R support
perf/arm_cspmu: Generalise event filtering
perf/arm_cspmu: Move register definitons to header
drivers/perf: apple_m1: Support host/guest event filtering
drivers/perf: apple_m1: Refactor event select/filter configuration
perf/dwc_pcie: fix duplicate pci_dev devices
perf/dwc_pcie: fix some unreleased resources
perf/arm-cmn: Minor event type housekeeping
perf: arm_pmu: Move PMUv3-specific data
perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
perf: arm_pmu: Don't disable counter in armpmu_add()
perf: arm_pmuv3: Call kvm_vcpu_pmu_resync_el0() before enabling counters
perf: arm_pmuv3: Add support for ARM Rainier PMU
* for-next/amuv1-avg-freq:
: Add support for AArch64 AMUv1-based average freq
arm64: Utilize for_each_cpu_wrap for reference lookup
arm64: Update AMU-based freq scale factor on entering idle
arm64: Provide an AMU-based version of arch_freq_get_on_cpu
cpufreq: Introduce an optional cpuinfo_avg_freq sysfs entry
cpufreq: Allow arch_freq_get_on_cpu to return an error
arch_topology: init capacity_freq_ref to 0
* for-next/pkey_unrestricted:
: mm/pkey: Add PKEY_UNRESTRICTED macro
selftest/powerpc/mm/pkey: fix build-break introduced by commit 00894c3fc917
selftests/powerpc: Use PKEY_UNRESTRICTED macro
selftests/mm: Use PKEY_UNRESTRICTED macro
mm/pkey: Add PKEY_UNRESTRICTED macro
* for-next/sysreg:
: arm64 sysreg updates
arm64/sysreg: Enforce whole word match for open/close tokens
arm64/sysreg: Fix unbalanced closing block
arm64/sysreg: Add register fields for HFGWTR2_EL2
arm64/sysreg: Add register fields for HFGRTR2_EL2
arm64/sysreg: Add register fields for HFGITR2_EL2
arm64/sysreg: Add register fields for HDFGWTR2_EL2
arm64/sysreg: Add register fields for HDFGRTR2_EL2
arm64/sysreg: Update register fields for ID_AA64MMFR0_EL1
* for-next/misc:
: Miscellaneous arm64 patches
arm64: mm: Don't use %pK through printk
arm64/fpsimd: Remove unused declaration fpsimd_kvm_prepare()
* for-next/pgtable-cleanups:
: arm64 pgtable accessors cleanup
arm64/mm: Define PTDESC_ORDER
arm64/kernel: Always use level 2 or higher for early mappings
arm64/hugetlb: Consistently use pud_sect_supported()
arm64/mm: Convert __pte_to_phys() and __phys_to_pte_val() as functions
* for-next/kselftest:
: arm64 kselftest updates
kselftest/arm64: mte: Skip the hugetlb tests if MTE not supported on such mappings
kselftest/arm64: mte: Use the correct naming for tag check modes in check_hugetlb_options.c
* for-next/uaccess-mops:
: Implement the uaccess memory copy/set using MOPS instructions
arm64: lib: Use MOPS for usercopy routines
arm64: mm: Handle PAN faults on uaccess CPY* instructions
arm64: extable: Add fixup handling for uaccess CPY* instructions
* for-next/pie-poe-cleanup:
: PIE/POE helpers cleanup
arm64/sysreg: Move POR_EL0_INIT to asm/por.h
arm64/sysreg: Rename POE_RXW to POE_RWX
arm64/sysreg: Improve PIR/POR helpers
* for-next/cputype-kryo:
: Add cputype info for some Qualcomm Kryo cores
arm64: cputype: Add comments about Qualcomm Kryo 5XX and 6XX cores
arm64: cputype: Add QCOM_CPU_PART_KRYO_3XX_GOLD
* for-next/cca-dma-address:
: Fix DMA address for devices used in realms with Arm CCA
arm64: realm: Use aliased addresses for device DMA to shared buffers
dma: Introduce generic dma_addr_*crypted helpers
dma: Fix encryption bit clearing for dma_to_phys
* for-next/drop-pxd_table_bit:
: Drop the arm64 PXD_TABLE_BIT (clean-up in preparation for 128-bit PTEs)
arm64/mm: Drop PXD_TABLE_BIT
arm64/mm: Check pmd_table() in pmd_trans_huge()
arm64/mm: Check PUD_TYPE_TABLE in pud_bad()
arm64/mm: Check PXD_TYPE_TABLE in [p4d|pgd]_bad()
arm64/mm: Clear PXX_TYPE_MASK and set PXD_TYPE_SECT in [pmd|pud]_mkhuge()
arm64/mm: Clear PXX_TYPE_MASK in mk_[pmd|pud]_sect_prot()
arm64/ptdump: Test PMD_TYPE_MASK for block mapping
KVM: arm64: ptdump: Test PMD_TYPE_MASK for block mapping
* for-next/spectre-bhb-assume-vulnerable:
: Rework Spectre BHB mitigations to not assume "safe"
arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
arm64: cputype: Add MIDR_CORTEX_A76AE
arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull VDSO infrastructure updates from Thomas Gleixner:
- Consolidate the VDSO storage
The VDSO data storage and data layout has been largely architecture
specific for historical reasons. That increases the maintenance
effort and causes inconsistencies over and over.
There is no real technical reason for architecture specific layouts
and implementations. The architecture specific details can easily be
integrated into a generic layout, which also reduces the amount of
duplicated code for managing the mappings.
Convert all architectures over to a unified layout and common mapping
infrastructure. This splits the VDSO data layout into subsystem
specific blocks, timekeeping, random and architecture parts, which
provides a better structure and allows to improve and update the
functionalities without conflict and interaction.
- Rework the timekeeping data storage
The current implementation is designed for exposing system
timekeeping accessors, which was good enough at the time when it was
designed.
PTP and Time Sensitive Networking (TSN) change that as there are
requirements to expose independent PTP clocks, which are not related
to system timekeeping.
Replace the monolithic data storage by a structured layout, which
allows to add support for independent PTP clocks on top while reusing
both the data structures and the time accessor implementations.
* tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
sparc/vdso: Always reject undefined references during linking
x86/vdso: Always reject undefined references during linking
vdso: Rework struct vdso_time_data and introduce struct vdso_clock
vdso: Move architecture related data before basetime data
powerpc/vdso: Prepare introduction of struct vdso_clock
arm64/vdso: Prepare introduction of struct vdso_clock
x86/vdso: Prepare introduction of struct vdso_clock
time/namespace: Prepare introduction of struct vdso_clock
vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct
vdso/vsyscall: Prepare introduction of struct vdso_clock
vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock
vdso/gettimeofday: Prepare introduction of struct vdso_clock
vdso/helpers: Prepare introduction of struct vdso_clock
vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks
vdso: Make vdso_time_data cacheline aligned
arm64: Make asm/cache.h compatible with vDSO
...
|
|
icsk->icsk_timeout can be replaced by icsk->icsk_retransmit_timer.expires
This saves 8 bytes in TCP/DCCP sockets and helps for better cache locality.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250324203607.703850-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core updates from Thomas Gleixner:
- Fix a memory ordering issue in posix-timers
Posix-timer lookup is lockless and reevaluates the timer validity
under the timer lock, but the update which validates the timer is not
protected by the timer lock. That allows the store to be reordered
against the initialization stores, so that the lookup side can
observe a partially initialized timer. That's mostly a theoretical
problem, but incorrect nevertheless.
- Fix a long standing inconsistency of the coarse time getters
The coarse time getters read the base time of the current update
cycle without reading the actual hardware clock. NTP frequency
adjustment can set the base time backwards. The fine grained
interfaces compensate this by reading the clock and applying the new
conversion factor, but the coarse grained time getters use the base
time directly. That allows the user to observe time going backwards.
Cure it by always forwarding base time, when NTP changes the
frequency with an immediate step.
- Rework of posix-timer hashing
The posix-timer hash is not scalable and due to the CRIU timer
restore mechanism prone to massive contention on the global hash
bucket lock.
Replace the global hash lock with a fine grained per bucket locking
scheme to address that.
- Rework the proc/$PID/timers interface.
/proc/$PID/timers is provided for CRIU to be able to restore a timer.
The printout happens with sighand lock held and interrupts disabled.
That's not required as this can be done with RCU protection as well.
- Provide a sane mechanism for CRIU to restore a timer ID
CRIU restores timers by creating and deleting them until the kernel
internal per process ID counter reached the requested ID. That's
horribly slow for sparse timer IDs.
Provide a prctl() which allows CRIU to restore a timer with a given
ID. When enabled the ID pointer is used as input pointer to read the
requested ID from user space. When disabled, the normal allocation
scheme (next ID) is active as before. This is backwards compatible
for both kernel and user space.
- Make hrtimer_update_function() less expensive.
The sanity checks are valuable, but expensive for high frequency
usage in io/uring. Make the debug checks conditional and enable them
only when lockdep is enabled.
- Small updates, cleanups and improvements
* tag 'timers-core-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
selftests/timers: Improve skew_consistency by testing with other clockids
timekeeping: Fix possible inconsistencies in _COARSE clockids
posix-timers: Drop redundant memset() invocation
selftests/timers/posix-timers: Add a test for exact allocation mode
posix-timers: Provide a mechanism to allocate a given timer ID
posix-timers: Dont iterate /proc/$PID/timers with sighand:: Siglock held
posix-timers: Make per process list RCU safe
posix-timers: Avoid false cacheline sharing
posix-timers: Switch to jhash32()
posix-timers: Improve hash table performance
posix-timers: Make signal_struct:: Next_posix_timer_id an atomic_t
posix-timers: Make lock_timer() use guard()
posix-timers: Rework timer removal
posix-timers: Simplify lock/unlock_timer()
posix-timers: Use guards in a few places
posix-timers: Remove SLAB_PANIC from kmem cache
posix-timers: Remove a few paranoid warnings
posix-timers: Cleanup includes
posix-timers: Add cond_resched() to posix_timer_add() search loop
posix-timers: Initialise timer before adding it to the hash table
...
|
|
I had to spend some (a lot;) time to understand why pidfd_info_test
(and more) fails with my patch under qemu on my machine ;) Until I
applied the patch below.
I think it is a bad idea to do the things like
#ifndef __NR_clone3
#define __NR_clone3 -1
#endif
because this can hide a problem. My working laptop runs Fedora-23 which
doesn't have __NR_clone3/etc in /usr/include/. So "make" happily succeeds,
but everything fails and it is not clear why.
Link: https://lore.kernel.org/r/20250323174518.GB834@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This new test makes sure that ftrace can trace a
function that was introduced by a livepatch.
Signed-off-by: Filipe Xavier <felipeaggger@gmail.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20250324-ftrace-sftest-livepatch-v3-2-d9d7cc386c75@gmail.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|