Age | Commit message (Collapse) | Author |
|
Probe and display L3 Cache topology
Add ability to average an added counter
(useful for pre-integrated "counters", such as Watts)
Break the limit of 64 built-in counters.
Assorted bug fixes and minor feature tweaks
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
/sys/devices/system/cpu/intel_uncore_frequency/package_X_die_Y/
may be readable by all, but
/sys/devices/system/cpu/intel_uncore_frequency/package_X_die_Y/current_freq_khz
may be readable only by root.
Non-root turbostat users see complaints in this scenario.
Fail probe of the interface if we can't read current_freq_khz.
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Original-patch-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
use a macro for PER_THREAD_PARAMS to make adding one later more clear.
no functional change
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Together with the RAPL MSRs, there are more MSRs gone on DMR, including
PLR (Perf Limit Reasons), and IRTL (Package cstate Interrupt Response
Time Limit) MSRs. The configurable TDP info should also be retrieved
from TPMI based Intel Speed Select Technology feature.
Remove the access of these MSRs for DMR. Improve the DMR platform
feature table to make it more readable at the same time.
Fixes: 83075bd59de2 ("tools/power turbostat: Add initial support for DMR")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
External atributes with format "raw" are not printed in summary lines
for nodes/packages (or with option -S). The new format "average"
behaves like "raw" but also adds the summary data
Signed-off-by: Michael Hebenstreit <michael.hebenstreit@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
pkg_base[pkg_id] is a simple array of structure pointers,
let the compiler treat it that way.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
We have out-grown the ability to use a 64-bit memory location
to inventory every possible built-in counter.
Leverage the the CPU_SET(3) macros to break this barrier.
Also, break the Joules & Watts counters into two,
since we can no longer 'or' them together...
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Explain the meaning of the Totl%C0, Any%C0, GFX%C0, CPUGFX% columns.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Similar to delta_cpu(), delta_platform() is called in turbostat main
loop. This ensures accurate SysWatt readings in periodic monitoring mode
$ sudo turbostat -S -q --show power -i 1
CoreTmp PkgTmp PkgWatt CorWatt GFXWatt RAMWatt PKG_% RAM_% SysWatt
60 61 6.21 1.13 0.16 0.00 0.00 0.00 13.07
58 61 6.00 1.07 0.18 0.00 0.00 0.00 12.75
58 61 5.74 1.05 0.17 0.00 0.00 0.00 12.22
58 60 6.27 1.11 0.24 0.00 0.00 0.00 13.55
However, delta_platform() is missing for forked program and causes bogus
SysWatt reporting,
$ sudo turbostat -S -q --show power sleep 1
1.004736 sec
CoreTmp PkgTmp PkgWatt CorWatt GFXWatt RAMWatt PKG_% RAM_% SysWatt
57 58 6.05 1.02 0.16 0.00 0.00 0.00 0.03
Add missing delta_platform() for forked program.
Fixes: e5f687b89bc2 ("tools/power turbostat: Add RAPL psys as a built-in counter")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Kernels configured with CONFIG_MULTIUSER=n have no cap_get_proc().
Check for ENOSYS to recognize this case, and continue on to
attempt to access the requested MSRs (such as temperature).
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
turbostat.c: In function 'parse_int_file':
turbostat.c:5567:19: error: 'PATH_MAX' undeclared (first use in this function)
5567 | char path[PATH_MAX];
| ^~~~~~~~
turbostat.c: In function 'probe_graphics':
turbostat.c:6787:19: error: 'PATH_MAX' undeclared (first use in this function)
6787 | char path[PATH_MAX];
| ^~~~~~~~
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
$ sudo turbostat --quiet --show junk
turbostat: Counter 'junk' can not be added.
Previously, invalid arguments to --show and --hide were silently ignored
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
The new default idle counter groupings broke "--show C1E%" (or any other C-state %)
Also delete a stray debug printf from the same offending commit.
Reported-by: Zhang Rui <rui.zhang@intel.com>
Fixes: ec4acd3166d8 ("tools/power turbostat: disable "cpuidle" invocation counters, by default")
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add initial support for BartlettLake.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Add initial support for DMR.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
for example:
intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W
[lenb: simplified format]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
squish me
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.
Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.
As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.
Fix the problem by skipping the already probed counter.
Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.
In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
cleared
platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.
Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.
With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.
Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.
No functional change intended.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.
Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.
Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.
Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Quit early for unsupported RAPL counters.
No functional change.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
commit 05a2f07db888 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.
However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.
Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.
Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.
Fixes: 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
Fix typo in the currently unused RAPL_GFX_ALL macro definition.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
idle_pct should be pct_idle
Signed-off-by: Len Brown <len.brown@intel.com>
|
|
This test reproduces a scenario where HFSC queue length and backlog accounting
can become inconsistent when a peek operation triggers a dequeue and possible
drop before the parent qdisc updates its counters. The test sets up a DRR root
qdisc with an HFSC class, netem, and blackhole children, and uses Scapy to
inject a packet. It helps to verify that HFSC correctly tracks qlen and backlog
even when packets are dropped during peek-induced dequeue.
Cc: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-3-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth and wireless.
A few more fixes for the locking changes trickling in. Nothing too
alarming, I suspect those will continue for another release. Other
than that things are slowing down nicely.
Current release - fix to a fix:
- Bluetooth: hci_event: use key encryption size when its known
- tools: ynl-gen: allow multi-attr without nested-attributes again
Current release - regressions:
- locking fixes:
- lock lower level devices when updating features
- eth: bnxt_en: bring back rtnl_lock() in the bnxt_open() path
- devmem: fix panic when Netlink socket closes after module unload
Current release - new code bugs:
- eth: txgbe: fixes for FW communication on new AML devices
Previous releases - always broken:
- sched: flush gso_skb list too during ->change(), avoid potential
null-deref on reconfig
- wifi: mt76: disable NAPI on driver removal
- hv_netvsc: fix error 'nvsp_rndis_pkt_complete error status: 2'"
* tag 'net-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
net: devmem: fix kernel panic when netlink socket close after module unload
tsnep: fix timestamping with a stacked DSA driver
net/tls: fix kernel panic when alloc_page failed
bnxt_en: bring back rtnl_lock() in the bnxt_open() path
mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
octeontx2-pf: Do not reallocate all ntuple filters
wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
wifi: mt76: disable napi on driver removal
Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
hv_netvsc: Remove rmsg_pgcnt
hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
octeontx2-af: Fix CGX Receive counters
net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
net: libwx: Fix FW mailbox unknown command
net: libwx: Fix FW mailbox reply timeout
net: txgbe: Fix to calculate EEPROM checksum for AML devices
octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
...
|
|
These tests:
"SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes"
"SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes"
output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)".
They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data
have been received by the other side. However, sometimes there is a delay
in updating this "unsent bytes" counter, and the test fails even though
the counter properly goes to 0 several milliseconds later.
The delay occurs in the kernel because the used buffer notification
callback virtio_vsock_tx_done(), called upon receipt of the data by the
other side, doesn't update the counter itself. It delegates that to
a kernel thread (via vsock->tx_work). Sometimes that thread is delayed
more than the test expects.
Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs.
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test")
Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit ce6cb8113c84 ("tools: ynl-gen: individually free previous
values on double set"), specifying the "multi-attr" property raises an
error unless the "nested-attributes" property is specified as well:
File "tools/net/ynl/./pyynl/ynl_gen_c.py", line 1147, in _load_nested_sets
child = self.pure_nested_structs.get(nested)
^^^^^^
UnboundLocalError: cannot access local variable 'nested' where it is not associated with a value
This appears to be a bug since there are existing specs which omit
"nested-attributes" on "multi-attr" attributes. Also, according to
Documentation/userspace-api/netlink/specs.rst, multi-attr "is the
recommended way of implementing arrays (no extra nesting)", suggesting
that nesting should even be avoided in favor of multi-attr.
Fix the indentation of the if-block introduced by the commit to avoid
the error.
Fixes: ce6cb8113c84 ("tools: ynl-gen: individually free previous values on double set")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/d6b58684b7e5bfb628f7313e6893d0097904e1d1.1746940107.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 ITS mitigation from Dave Hansen:
"Mitigate Indirect Target Selection (ITS) issue.
I'd describe this one as a good old CPU bug where the behavior is
_obviously_ wrong, but since it just results in bad predictions it
wasn't wrong enough to notice. Well, the researchers noticed and also
realized that thus bug undermined a bunch of existing indirect branch
mitigations.
Thus the unusually wide impact on this one. Details:
ITS is a bug in some Intel CPUs that affects indirect branches
including RETs in the first half of a cacheline. Due to ITS such
branches may get wrongly predicted to a target of (direct or indirect)
branch that is located in the second half of a cacheline. Researchers
at VUSec found this behavior and reported to Intel.
Affected processors:
- Cascade Lake, Cooper Lake, Whiskey Lake V, Coffee Lake R, Comet
Lake, Ice Lake, Tiger Lake and Rocket Lake.
Scope of impact:
- Guest/host isolation:
When eIBRS is used for guest/host isolation, the indirect branches
in the VMM may still be predicted with targets corresponding to
direct branches in the guest.
- Intra-mode using cBPF:
cBPF can be used to poison the branch history to exploit ITS.
Realigning the indirect branches and RETs mitigates this attack
vector.
- User/kernel:
With eIBRS enabled user/kernel isolation is *not* impacted by ITS.
- Indirect Branch Prediction Barrier (IBPB):
Due to this bug indirect branches may be predicted with targets
corresponding to direct branches which were executed prior to IBPB.
This will be fixed in the microcode.
Mitigation:
As indirect branches in the first half of cacheline are affected, the
mitigation is to replace those indirect branches with a call to thunk that
is aligned to the second half of the cacheline.
RETs that take prediction from RSB are not affected, but they may be
affected by RSB-underflow condition. So, RETs in the first half of
cacheline are also patched to a return thunk that executes the RET aligned
to second half of cacheline"
* tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftest/x86/bugs: Add selftests for ITS
x86/its: FineIBT-paranoid vs ITS
x86/its: Use dynamic thunks for indirect branches
x86/ibt: Keep IBT disabled during alternative patching
mm/execmem: Unify early execmem_cache behaviour
x86/its: Align RETs in BHB clear sequence to avoid thunking
x86/its: Add support for RSB stuffing mitigation
x86/its: Add "vmexit" option to skip mitigation on some CPUs
x86/its: Enable Indirect Target Selection mitigation
x86/its: Add support for ITS-safe return thunk
x86/its: Add support for ITS-safe indirect thunk
x86/its: Enumerate Indirect Target Selection (ITS) bug
Documentation: x86/bugs/its: Add ITS documentation
|
|
Pull KVM fixes from Paolo Bonzini:
"ARM:
- Avoid use of uninitialized memcache pointer in user_mem_abort()
- Always set HCR_EL2.xMO bits when running in VHE, allowing
interrupts to be taken while TGE=0 and fixing an ugly bug on
AmpereOne that occurs when taking an interrupt while clearing the
xMO bits (AC03_CPU_36)
- Prevent VMMs from hiding support for AArch64 at any EL virtualized
by KVM
- Save/restore the host value for HCRX_EL2 instead of restoring an
incorrect fixed value
- Make host_stage2_set_owner_locked() check that the entire requested
range is memory rather than just the first page
RISC-V:
- Add missing reset of smstateen CSRs
x86:
- Forcibly leave SMM on SHUTDOWN interception on AMD CPUs to avoid
causing problems due to KVM stuffing INIT on SHUTDOWN (KVM needs to
sanitize the VMCB as its state is undefined after SHUTDOWN,
emulating INIT is the least awful choice).
- Track the valid sync/dirty fields in kvm_run as a u64 to ensure KVM
KVM doesn't goof a sanity check in the future.
- Free obsolete roots when (re)loading the MMU to fix a bug where
pre-faulting memory can get stuck due to always encountering a
stale root.
- When dumping GHCB state, use KVM's snapshot instead of the raw GHCB
page to print state, so that KVM doesn't print stale/wrong
information.
- When changing memory attributes (e.g. shared <=> private), add
potential hugepage ranges to the mmu_invalidate_range_{start,end}
set so that KVM doesn't create a shared/private hugepage when the
the corresponding attributes will become mixed (the attributes are
commited *after* KVM finishes the invalidation).
- Rework the SRSO mitigation to enable BP_SPEC_REDUCE only when KVM
has at least one active VM. Effectively BP_SPEC_REDUCE when KVM is
loaded led to very measurable performance regressions for non-KVM
workloads"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions
KVM: arm64: Fix memory check in host_stage2_set_owner_locked()
KVM: arm64: Kill HCRX_HOST_FLAGS
KVM: arm64: Properly save/restore HCRX_EL2
KVM: arm64: selftest: Don't try to disable AArch64 support
KVM: arm64: Prevent userspace from disabling AArch64 support at any virtualisable EL
KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
KVM: arm64: Fix uninitialized memcache pointer in user_mem_abort()
KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing
KVM: SVM: Update dump_ghcb() to use the GHCB snapshot fields
KVM: RISC-V: reset smstateen CSRs
KVM: x86/mmu: Check and free obsolete roots in kvm_mmu_reload()
KVM: x86: Check that the high 32bits are clear in kvm_arch_vcpu_ioctl_run()
KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interception
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"22 hotfixes. 13 are cc:stable and the remainder address post-6.14
issues or aren't considered necessary for -stable kernels.
About half are for MM. Five OCFS2 fixes and a few MAINTAINERS updates"
* tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
mm: fix folio_pte_batch() on XEN PV
nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()
mm/hugetlb: copy the CMA flag when demoting
mm, swap: fix false warning for large allocation with !THP_SWAP
selftests/mm: fix a build failure on powerpc
selftests/mm: fix build break when compiling pkey_util.c
mm: vmalloc: support more granular vrealloc() sizing
tools/testing/selftests: fix guard region test tmpfs assumption
ocfs2: stop quota recovery before disabling quotas
ocfs2: implement handshaking with ocfs2 recovery thread
ocfs2: switch osb->disable_recovery to enum
mailmap: map Uwe's BayLibre addresses to a single one
MAINTAINERS: add mm THP section
mm/userfaultfd: fix uninitialized output field for -EAGAIN race
selftests/mm: compaction_test: support platform with huge mount of memory
MAINTAINERS: add core mm section
ocfs2: fix panic in failed foilio allocation
mm/huge_memory: fix dereferencing invalid pmd migration entry
MAINTAINERS: add reverse mapping section
x86: disable image size check for test builds
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.15, round #3
- Avoid use of uninitialized memcache pointer in user_mem_abort()
- Always set HCR_EL2.xMO bits when running in VHE, allowing interrupts
to be taken while TGE=0 and fixing an ugly bug on AmpereOne that
occurs when taking an interrupt while clearing the xMO bits
(AC03_CPU_36)
- Prevent VMMs from hiding support for AArch64 at any EL virtualized by
KVM
- Save/restore the host value for HCRX_EL2 instead of restoring an
incorrect fixed value
- Make host_stage2_set_owner_locked() check that the entire requested
range is memory rather than just the first page
|
|
netdev_bind_rx takes ownership of the queue array passed as parameter
and frees it, so a queue array buffer cannot be reused across multiple
netdev_bind_rx calls.
This commit fixes that by always passing in a newly created queue array
to all netdev_bind_rx calls in ncdevmem.
Fixes: 85585b4bc8d8 ("selftests: add ncdevmem, netcat for devmem TCP")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250508084434.1933069-1-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a crash in the ethtool YNL implementation when Hardware Clock information
is not present in the response. This ensures graceful handling of devices or
drivers that do not provide this optional field. e.g.
Traceback (most recent call last):
File "/net/tools/net/ynl/pyynl/./ethtool.py", line 438, in <module>
main()
~~~~^^
File "/net/tools/net/ynl/pyynl/./ethtool.py", line 341, in main
print(f'PTP Hardware Clock: {tsinfo["phc-index"]}')
~~~~~~^^^^^^^^^^^^^
KeyError: 'phc-index'
Fixes: f3d07b02b2b8 ("tools: ynl: ethtool testing tool")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508035414.82974-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rust fixes from Miguel Ojeda:
- Make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88.0
- Clean Rust (and Clippy) lints for the upcoming Rust 1.87.0 and 1.88.0
releases
- Clean objtool warning for the upcoming Rust 1.87.0 release by adding
one more noreturn function
* tag 'rust-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88
rust: clean Rust 1.88.0's `clippy::uninlined_format_args` lint
rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration
rust: clean Rust 1.88.0's `unnecessary_transmutes` lint
rust: allow Rust 1.87.0's `clippy::ptr_eq` lint
objtool/rust: add one more `noreturn` Rust function for Rust 1.87.0
|
|
Below are the tests added for Indirect Target Selection (ITS):
- its_sysfs.py - Check if sysfs reflects the correct mitigation status for
the mitigation selected via the kernel cmdline.
- its_permutations.py - tests mitigation selection with cmdline
permutations with other bugs like spectre_v2 and retbleed.
- its_indirect_alignment.py - verifies that for addresses in
.retpoline_sites section that belong to lower half of cacheline are
patched to ITS-safe thunk. Typical output looks like below:
Site 49: function symbol: __x64_sys_restart_syscall+0x1f <0xffffffffbb1509af>
# vmlinux: 0xffffffff813509af: jmp 0xffffffff81f5a8e0
# kcore: 0xffffffffbb1509af: jmpq *%rax
# ITS thunk NOT expected for site 49
# PASSED: Found *%rax
#
Site 50: function symbol: __resched_curr+0xb0 <0xffffffffbb181910>
# vmlinux: 0xffffffff81381910: jmp 0xffffffff81f5a8e0
# kcore: 0xffffffffbb181910: jmp 0xffffffffc02000fc
# ITS thunk expected for site 50
# PASSED: Found 0xffffffffc02000fc -> jmpq *%rax <scattered-thunk?>
- its_ret_alignment.py - verifies that for addresses in .return_sites
section that belong to lower half of cacheline are patched to
its_return_thunk. Typical output looks like below:
Site 97: function symbol: collect_event+0x48 <0xffffffffbb007f18>
# vmlinux: 0xffffffff81207f18: jmp 0xffffffff81f5b500
# kcore: 0xffffffffbb007f18: jmp 0xffffffffbbd5b560
# PASSED: Found jmp 0xffffffffbbd5b560 <its_return_thunk>
#
Site 98: function symbol: collect_event+0xa4 <0xffffffffbb007f74>
# vmlinux: 0xffffffff81207f74: jmp 0xffffffff81f5b500
# kcore: 0xffffffffbb007f74: retq
# PASSED: Found retq
Some of these tests have dependency on tools like virtme-ng[1] and drgn[2].
When the dependencies are not met, the test will be skipped.
[1] https://github.com/arighi/virtme-ng
[2] https://github.com/osandov/drgn
Co-developed-by: Tao Zhang <tao1.zhang@linux.intel.com>
Signed-off-by: Tao Zhang <tao1.zhang@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
|
|
FineIBT-paranoid was using the retpoline bytes for the paranoid check,
disabling retpolines, because all parts that have IBT also have eIBRS
and thus don't need no stinking retpolines.
Except... ITS needs the retpolines for indirect calls must not be in
the first half of a cacheline :-/
So what was the paranoid call sequence:
<fineibt_paranoid_start>:
0: 41 ba 78 56 34 12 mov $0x12345678, %r10d
6: 45 3b 53 f7 cmp -0x9(%r11), %r10d
a: 4d 8d 5b <f0> lea -0x10(%r11), %r11
e: 75 fd jne d <fineibt_paranoid_start+0xd>
10: 41 ff d3 call *%r11
13: 90 nop
Now becomes:
<fineibt_paranoid_start>:
0: 41 ba 78 56 34 12 mov $0x12345678, %r10d
6: 45 3b 53 f7 cmp -0x9(%r11), %r10d
a: 4d 8d 5b f0 lea -0x10(%r11), %r11
e: 2e e8 XX XX XX XX cs call __x86_indirect_paranoid_thunk_r11
Where the paranoid_thunk looks like:
1d: <ea> (bad)
__x86_indirect_paranoid_thunk_r11:
1e: 75 fd jne 1d
__x86_indirect_its_thunk_r11:
20: 41 ff eb jmp *%r11
23: cc int3
[ dhansen: remove initialization to false ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
|
|
Added new test cases for FQ, FQ_CODEL, FQ_PIE, and HHF qdiscs to verify queue
trimming behavior when the qdisc limit is dynamically reduced.
Each test injects packets, reduces the qdisc limit, and checks that the new
limit is enforced. This is still best effort since timing qdisc backlog
is not easy.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN, WiFi and netfilter.
We have still a comple of regressions open due to the recent
drivers locking refactor. The patches are in-flight, but not
ready yet.
Current release - regressions:
- core: lock netdevices during dev_shutdown
- sch_htb: make htb_deactivate() idempotent
- eth: virtio-net: don't re-enable refill work too early
Current release - new code bugs:
- eth: icssg-prueth: fix kernel panic during concurrent Tx queue
access
Previous releases - regressions:
- gre: fix again IPv6 link-local address generation.
- eth: b53: fix learning on VLAN unaware bridges
Previous releases - always broken:
- wifi: fix out-of-bounds access during multi-link element
defragmentation
- can:
- initialize spin lock on device probe
- fix order of unregistration calls
- openvswitch: fix unsafe attribute parsing in output_userspace()
- eth:
- virtio-net: fix total qstat values
- mtk_eth_soc: reset all TX queues on DMA free
- fbnic: firmware IPC mailbox fixes"
* tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
virtio-net: fix total qstat values
net: export a helper for adding up queue stats
fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready
fbnic: Cleanup handling of completions
fbnic: Actually flush_tx instead of stalling out
fbnic: Add additional handling of IRQs
fbnic: Gate AXI read/write enabling on FW mailbox
fbnic: Fix initialization of mailbox descriptor rings
net: dsa: b53: do not set learning and unicast/multicast on up
net: dsa: b53: fix learning on VLAN unaware bridges
net: dsa: b53: fix toggling vlan_filtering
net: dsa: b53: do not program vlans when vlan filtering is off
net: dsa: b53: do not allow to configure VLAN 0
net: dsa: b53: always rejoin default untagged VLAN on bridge leave
net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
net: dsa: b53: fix flushing old pvid VLAN on pvid change
net: dsa: b53: fix clearing PVID of a port
net: dsa: b53: keep CPU port always tagged again
...
|
|
The compiler is unaware of the size of code generated by the ".rept"
assembler directive. This results in the compiler emitting branch
instructions where the offset to branch to exceeds the maximum allowed
value, resulting in build failures like the following:
CC protection_keys
/tmp/ccypKWAE.s: Assembler messages:
/tmp/ccypKWAE.s:2073: Error: operand out of range (0x0000000000020158
is not between 0xffffffffffff8000 and 0x0000000000007ffc)
/tmp/ccypKWAE.s:2509: Error: operand out of range (0x0000000000020130
is not between 0xffffffffffff8000 and 0x0000000000007ffc)
Fix the issue by manually adding nop instructions using the preprocessor.
Link: https://lkml.kernel.org/r/20250428131937.641989-2-nysal@linux.ibm.com
Fixes: 46036188ea1f ("selftests/mm: build with -O2")
Reported-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Tested-by: Donet Tom <donettom@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 50910acd6f615 ("selftests/mm: use sys_pkey helpers consistently")
added a pkey_util.c to refactor some of the protection_keys functions
accessible by other tests. But this broken the build in powerpc in two
ways,
pkey-powerpc.h: In function `arch_is_powervm':
pkey-powerpc.h:73:21: error: storage size of `buf' isn't known
73 | struct stat buf;
| ^~~
pkey-powerpc.h:75:14: error: implicit declaration of function `stat'; did you mean `strcat'? [-Wimplicit-function-declaration]
75 | if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) &&
| ^~~~
| strcat
Since pkey_util.c includes pkeys-helper.h, which in turn includes pkeys-powerpc.h,
stat.h including is missing for "struct stat". This is fixed by adding "sys/stat.h"
in pkeys-powerpc.h
Secondly,
pkey-powerpc.h:55:18: warning: format `%llx' expects argument of type `long long unsigned int', but argument 3 has type `u64' {aka `long unsigned int'} [-Wformat=]
55 | dprintf4("%s() changing %016llx to %016llx\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56 | __func__, __read_pkey_reg(), pkey_reg);
| ~~~~~~~~~~~~~~~~~
| |
| u64 {aka long unsigned int}
pkey-helpers.h:63:32: note: in definition of macro `dprintf_level'
63 | sigsafe_printf(args); \
| ^~~~
These format specifier related warning are removed by adding
"__SANE_USERSPACE_TYPES__" to pkeys_utils.c.
Link: https://lkml.kernel.org/r/20250428131937.641989-1-nysal@linux.ibm.com
Fixes: 50910acd6f61 ("selftests/mm: use sys_pkey helpers consistently")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The current implementation of the guard region tests assume that /tmp is
mounted as tmpfs, that is shmem.
This isn't always the case, and at least one instance of a spurious test
failure has been reported as a result.
This assumption is unsafe, rushed and silly - and easily remedied by
simply using memfd, so do so.
We also have to fixup the readonly_file test to explicitly only be
applicable to file-backed cases.
Link: https://lkml.kernel.org/r/20250425162436.564002-1-lorenzo.stoakes@oracle.com
Fixes: 272f37d3e99a ("tools/selftests: expand all guard region tests to file-backed")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/linux-mm/a2d2766b-0ab4-437b-951a-8595a7506fe9@arm.com/
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|