summaryrefslogtreecommitdiff
path: root/drivers/hwtracing/coresight
AgeCommit message (Collapse)Author
2025-02-26coresight: etm4x: don't include '<linux/pm_wakeup.h>' directlyWolfram Sang
The header clearly states that it does not want to be included directly, only via '<linux/(platform_)?device.h>'. Which is already present, so delete the superfluous include. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250210113635.51935-2-wsa+renesas@sang-engineering.com
2025-02-24coresight: tpdm: Constify amba_id tableKrzysztof Kozlowski
'struct amba_id' table is not modified so can be changed to const for more safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250222-coresight-const-arm-id-v1-3-69a377cd098b@linaro.org
2025-02-24coresight: tpda: Constify amba_id tableKrzysztof Kozlowski
'struct amba_id' table is not modified so can be changed to const for more safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250222-coresight-const-arm-id-v1-2-69a377cd098b@linaro.org
2025-02-24coresight: catu: Constify amba_id tableKrzysztof Kozlowski
'struct amba_id' table is not modified so can be changed to const for more safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250222-coresight-const-arm-id-v1-1-69a377cd098b@linaro.org
2025-02-21coresight: config: Add preloaded configurationLinu Cherian
Add a preloaded configuration for generating external trigger on address match. This can be used by CTI and ETR blocks to stop trace capture on kernel panic. Kernel address for "panic" function is used as the default trigger address. This new configuration is available as, /sys/kernel/config/cs-syscfg/configurations/panicstop Signed-off-by: Linu Cherian <lcherian@marvell.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-8-lcherian@marvell.com
2025-02-21coresight: tmc: Stop trace capture on FlInLinu Cherian
Configure TMC ETR and ETF to flush and stop trace capture on FlIn event based on sysfs attribute, /sys/bus/coresight/devices/tmc_etXn/stop_on_flush. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-7-lcherian@marvell.com
2025-02-21coresight: tmc: Add support for reading crash dataLinu Cherian
* Add support for reading crashdata using special device files. The special device files /dev/crash_tmc_xxx would be available for read file operation only when the crash data is valid. * User can read the crash data as below For example, for reading crash data from tmc_etf sink #dd if=/dev/crash_tmc_etfXX of=~/cstrace.bin Signed-off-by: Anil Kumar Reddy <areddy3@marvell.com> Signed-off-by: Tanmay Jagdale <tanmay@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-6-lcherian@marvell.com
2025-02-21coresight: tmc: Enable panic sync handlingLinu Cherian
- Get reserved region from device tree node for metadata - Define metadata format for TMC - Add TMC ETR panic sync handler that syncs register snapshot to metadata region - Add TMC ETF panic sync handler that syncs register snapshot to metadata region and internal SRAM to reserved trace buffer region. Signed-off-by: Linu Cherian <lcherian@marvell.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-5-lcherian@marvell.com
2025-02-21coresight: core: Add provision for panic callbacksLinu Cherian
Panic callback handlers allows coresight device drivers to sync relevant trace data and trace metadata to reserved memory regions so that they can be retrieved later in the subsequent boot or in the crashdump kernel. Signed-off-by: Linu Cherian <lcherian@marvell.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-4-lcherian@marvell.com
2025-02-21coresight: tmc-etr: Add support to use reserved trace memoryLinu Cherian
Add support to use reserved memory for coresight ETR trace buffer. Introduce a new ETR buffer mode called ETR_MODE_RESRV, which becomes available when ETR device tree node is supplied with a valid reserved memory region. ETR_MODE_RESRV can be selected only by explicit user request. $ echo resrv >/sys/bus/coresight/devices/tmc_etr<N>/buf_mode_preferred Signed-off-by: Anil Kumar Reddy <areddy3@marvell.com> Signed-off-by: Linu Cherian <lcherian@marvell.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250212114918.548431-3-lcherian@marvell.com
2025-02-21coresight: catu: Fix number of pages while using 64k pagesIlkka Koskinen
Trying to record a trace on kernel with 64k pages resulted in -ENOMEM. This happens due to a bug in calculating the number of table pages, which returns zero. Fix the issue by rounding up. $ perf record --kcore -e cs_etm/@tmc_etr55,cycacc,branch_broadcast/k --per-thread taskset --cpu-list 1 dd if=/dev/zero of=/dev/null failed to mmap with 12 (Cannot allocate memory) Fixes: 8ed536b1e283 ("coresight: catu: Add support for scatter gather tables") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250109215348.5483-1-ilkka@os.amperecomputing.com
2025-01-28Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull KVM/arm64 updates from Will Deacon: "New features: - Support for non-protected guest in protected mode, achieving near feature parity with the non-protected mode - Support for the EL2 timers as part of the ongoing NV support - Allow control of hardware tracing for nVHE/hVHE Improvements, fixes and cleanups: - Massive cleanup of the debug infrastructure, making it a bit less awkward and definitely easier to maintain. This should pave the way for further optimisations - Complete rewrite of pKVM's fixed-feature infrastructure, aligning it with the rest of KVM and making the code easier to follow - Large simplification of pKVM's memory protection infrastructure - Better handling of RES0/RES1 fields for memory-backed system registers - Add a workaround for Qualcomm's Snapdragon X CPUs, which suffer from a pretty nasty timer bug - Small collection of cleanups and low-impact fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits) arm64/sysreg: Get rid of TRFCR_ELx SysregFields KVM: arm64: nv: Fix doc header layout for timers KVM: arm64: nv: Apply RESx settings to sysreg reset values KVM: arm64: nv: Always evaluate HCR_EL2 using sanitising accessors KVM: arm64: Fix selftests after sysreg field name update coresight: Pass guest TRFCR value to KVM KVM: arm64: Support trace filtering for guests KVM: arm64: coresight: Give TRBE enabled state to KVM coresight: trbe: Remove redundant disable call arm64/sysreg/tools: Move TRFCR definitions to sysreg tools: arm64: Update sysreg.h header files KVM: arm64: Drop pkvm_mem_transition for host/hyp donations KVM: arm64: Drop pkvm_mem_transition for host/hyp sharing KVM: arm64: Drop pkvm_mem_transition for FF-A KVM: arm64: Explicitly handle BRBE traps as UNDEFINED KVM: arm64: vgic: Use str_enabled_disabled() in vgic_v3_probe() arm64: kvm: Introduce nvhe stack size constants KVM: arm64: Fix nVHE stacktrace VA bits mask KVM: arm64: Fix FEAT_MTE in pKVM Documentation: Update the behaviour of "kvm-arm.mode" ...
2025-01-17arm64/sysreg: Get rid of TRFCR_ELx SysregFieldsMarc Zyngier
There is no such thing as TRFCR_ELx in the architecture. What we have is TRFCR_EL1, for which TRFCR_EL12 is an accessor. Rename TRFCR_ELx_* to TRFCR_EL1_*, and fix the bit of code using these names. Similarly, TRFCR_EL12 is redefined as a mapping to TRFCR_EL1. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/87cygsqgkh.wl-maz@kernel.org Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com>
2025-01-12coresight: Pass guest TRFCR value to KVMJames Clark
Currently the userspace and kernel filters for guests are never set, so no trace will be generated for them. Add support for tracing guests by passing the desired TRFCR value to KVM so it can be applied to the guest. By writing either E1TRE or E0TRE, filtering on either guest kernel or guest userspace is also supported. And if both E1TRE and E0TRE are cleared when exclude_guest is set, that option is supported too. This change also brings exclude_host support which is difficult to add as a separate commit without excess churn and resulting in no trace at all. cpu_prohibit_trace() gets moved to TRBE because the ETM driver doesn't need the read, it already has the base TRFCR value. TRBE only needs the read to disable it and then restore. Testing ======= The addresses were counted with the following: $ perf report -D | grep -Eo 'EL2|EL1|EL0' | sort | uniq -c Guest kernel only: $ perf record -e cs_etm//Gk -a -- true 535 EL1 1 EL2 Guest user only (only 5 addresses because the guest runs slowly in the model): $ perf record -e cs_etm//Gu -a -- true 5 EL0 Host kernel only: $ perf record -e cs_etm//Hk -a -- true 3501 EL2 Host userspace only: $ perf record -e cs_etm//Hu -a -- true 408 EL0 1 EL2 Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20250106142446.628923-8-james.clark@linaro.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-12KVM: arm64: coresight: Give TRBE enabled state to KVMJames Clark
Currently in nVHE, KVM has to check if TRBE is enabled on every guest switch even if it was never used. Because it's a debug feature and is more likely to not be used than used, give KVM the TRBE buffer status to allow a much simpler and faster do-nothing path in the hyp. Protected mode now disables trace regardless of TRBE (because trfcr_while_in_guest is always 0), which was not previously done. However, it continues to flush whenever the buffer is enabled regardless of the filter status. This avoids the hypothetical case of a host that had disabled the filter but not flushed which would arise if only doing the flush when the filter was enabled. Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250106142446.628923-6-james.clark@linaro.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-01-12coresight: trbe: Remove redundant disable callJames Clark
trbe_drain_and_disable_local() just clears TRBLIMITR and drains. TRBLIMITR is already cleared on the next line after this call, so replace it with only drain. This is so we can make a kvm call that has a preempt enabled warning from set_trbe_disabled() in the next commit, where trbe_reset_local() is called from a preemptible hotplug path. Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250106142446.628923-5-james.clark@linaro.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-12-13coresight-tpda: Optimize the function of reading element sizeTao Zhang
Since the new funnel device supports multi-port output scenarios, there may be more than one TPDM connected to one TPDA. In this way, when reading the element size of the TPDM, TPDA driver needs to find the expected TPDM corresponding to the filter source. When TPDA finds a TPDM or a filter source from a input connection, it will read the Devicetree to get the expected TPDM's element size. Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241213100731.25914-5-quic_taozha@quicinc.com
2024-12-13coresight: Add support for trace filtering by sourceTao Zhang
Some replicators have hard coded filtering of "trace" data, based on the source device. This is different from the trace filtering based on TraceID, available in the standard programmable replicators. e.g., Qualcomm replicators have filtering based on custom trace protocol format and is not programmable. The source device could be connected to the replicator via intermediate components (e.g., a funnel). Thus we need platform information from the firmware tables to decide the source device corresponding to a given output port from the replicator. Given this affects "trace path building" and traversing the path back from the sink to source, add the concept of "filtering by source" to the generic coresight connection. The specified source will be marked like below in the Devicetree. test-replicator { ... ... ... ... out-ports { ... ... ... ... port@0 { reg = <0>; xyz: endpoint { remote-endpoint = <&zyx>; filter-source = <&source_1>; <-- To specify the source to }; be filtered out here. }; port@1 { reg = <1>; abc: endpoint { remote-endpoint = <&cba>; filter-source = <&source_2>; <-- To specify the source to }; be filtered out here. }; }; }; Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241213100731.25914-4-quic_taozha@quicinc.com
2024-12-13coresight: Add a helper to check if a device is sourceTao Zhang
Since there are a lot of places in the code to check whether the device is source, add a helper to check it. Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241213100731.25914-3-quic_taozha@quicinc.com
2024-12-11coresight: Fix dsb_mode_store() unsigned val is never less than zeroPei Xiao
dsb_mode_store() warn: unsigned 'val' is never less than zero. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410150702.UaZ7kvet-lkp@intel.com/ Fixes: 018e43ad1eee ("coresight-tpdm: Add node to set dsb programming mode") Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/122503017ada249fbf12be3fa4ee6ccb8f8c78cc.1732156624.git.xiaopei01@kylinos.cn
2024-12-11coresight: dummy: Add static trace id support for dummy sourceMao Jinlong
Some dummy source has static trace id configured in HW and it cannot be changed via software programming. Configure the trace id in device tree and reserve the id when device probe. Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com> Link: https://lore.kernel.org/r/20241121062829.11571-4-quic_jinlmao@quicinc.com [ Fix Date and Version to December 2024, v6.14 ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
2024-12-11coresight: Add support to get static id for system trace sourcesMao Jinlong
Dynamic trace id was introduced in coresight subsystem, so trace id is allocated dynamically. However, some hardware ATB source has static trace id and it cannot be changed via software programming. For such source, it can call coresight_get_static_trace_id to get the fixed trace id from device node and pass id to coresight_trace_id_get_static_system_id to reserve the id. Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241121062829.11571-3-quic_jinlmao@quicinc.com
2024-12-11coresight: Drop atomics in connection refcountsJames Clark
These belong to the device being enabled or disabled and are only ever used inside the device's spinlock. Remove the atomics to not imply that there are any other concurrent accesses. If atomics were necessary I don't think they would have been enough anyway. There would be nothing to prevent an enable or disable running concurrently if not for the spinlock. Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241128121414.2425119-1-james.clark@linaro.org
2024-12-11Coresight: Narrow down the matching range of tpdmSongwei Chai
The format of tpdm's peripheral id is 1f0exx. To avoid potential conflicts in the future, update the .id_table's id to 0x001f0e00. This update will narrow down the matching range and prevent incorrect matches. For example, another component's peripheral id might be f0e00, which would incorrectly match the old id. Fixes: b3c71626a933 ("Coresight: Add coresight TPDM source driver") Signed-off-by: Songwei Chai <quic_songchai@quicinc.com> [ trimmmed Fixes commit sha to 12 chars ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241009091728.1638-1-quic_songchai@quicinc.com
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-09-27[tree-wide] finally take no_llseek outAl Viro
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-20coresight: Make trace ID map spinlock local to the mapJames Clark
Reduce contention on the lock by replacing the global lock with one for each map. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-18-james.clark@linaro.org
2024-08-20coresight: Emit sink ID in the HW_ID packetsJames Clark
For Perf to be able to decode when per-sink trace IDs are used, emit the sink that's being written to for each ETM. Perf currently errors out if it sees a newer packet version so instead of bumping it, add a new minor version field. This can be used to signify new versions that have backwards compatible fields. Considering this change is only for high core count machines, it doesn't make sense to make a breaking change for everyone. Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-17-james.clark@linaro.org
2024-08-20coresight: Remove pending trace ID release mechanismJames Clark
Pending the release of IDs was a way of managing concurrent sysfs and Perf sessions in a single global ID map. Perf may have finished while sysfs hadn't, and Perf shouldn't release the IDs in use by sysfs and vice versa. Now that Perf uses its own exclusive ID maps, pending release doesn't result in any different behavior than just releasing all IDs when the last Perf session finishes. As part of the per-sink trace ID change, we would have still had to make the pending mechanism work on a per-sink basis, due to the overlapping ID allocations, so instead of making that more complicated, just remove it. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-16-james.clark@linaro.org
2024-08-20coresight: Use per-sink trace ID maps for Perf sessionsJames Clark
This will allow sessions with more than CORESIGHT_TRACE_IDS_MAX ETMs as long as there are fewer than that many ETMs connected to each sink. Each sink owns its own trace ID map, and any Perf session connecting to that sink will allocate from it, even if the sink is currently in use by other users. This is similar to the existing behavior where the dynamic trace IDs are constant as long as there is any concurrent Perf session active. It's not completely optimal because slightly more IDs will be used than necessary, but the optimal solution involves tracking the PIDs of each session and allocating ID maps based on the session owner. This is difficult to do with the combination of per-thread and per-cpu modes and some scheduling issues. The complexity of this isn't likely to worth it because even with multiple users they'd just see a difference in the ordering of ID allocations rather than hitting any limits (unless the hardware does have too many ETMs connected to one sink). Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-15-james.clark@linaro.org
2024-08-20coresight: Make CPU id map a property of a trace ID mapJames Clark
The global CPU ID mappings won't work for per-sink ID maps so move it to the ID map struct. coresight_trace_id_release_all_pending() is hard coded to operate on the default map, but once Perf sessions use their own maps the pending release mechanism will be deleted. So it doesn't need to be extended to accept a trace ID map argument at this point. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-14-james.clark@linaro.org
2024-08-20coresight: Expose map arguments in trace ID APIJames Clark
The trace ID API is currently hard coded to always use the global map. Add public versions that allow the map to be passed in so that Perf mode can use per-sink maps. Keep the non-map versions so that sysfs mode can continue to use the default global map. System ID functions are unchanged because they will always use the default map. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-13-james.clark@linaro.org
2024-08-20coresight: Move struct coresight_trace_id_map to common headerJames Clark
The trace ID maps will need to be created and stored by the core and Perf code so move the definition up to the common header. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-12-james.clark@linaro.org
2024-08-20coresight: Clarify comments around the PID of the sink ownerJames Clark
"Process being monitored" and "pid of the process to monitor" imply that this would be the same PID if there were two sessions targeting the same process. But this is actually the PID of the process that did the Perf event open call, rather than the target of the session. So update the comments to make this clearer. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-11-james.clark@linaro.org
2024-08-20coresight: Remove unused ETM Perf stubsJames Clark
This file is never included anywhere if CONFIG_CORESIGHT is not set so they are unused and aren't currently compile tested with any config so remove them. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@arm.com> Tested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240722101202.26915-10-james.clark@linaro.org
2024-08-20coresight: tmc: sg: Do not leak sg_tableSuzuki K Poulose
Running perf with cs_etm on Juno triggers the following kmemleak warning ! :~# cat /sys/kernel/debug/kmemleak unreferenced object 0xffffff8806b6d720 (size 96): comm "perf", pid 562, jiffies 4297810960 hex dump (first 32 bytes): 38 d8 13 07 88 ff ff ff 00 d0 9e 85 c0 ff ff ff 8............... 00 10 00 88 c0 ff ff ff 00 f0 ff f7 ff 00 00 00 ................ backtrace (crc 1dbf6e00): [<ffffffc08107381c>] kmemleak_alloc+0xbc/0xd8 [<ffffffc0802f9798>] kmalloc_trace_noprof+0x220/0x2e8 [<ffffffc07bb71948>] tmc_alloc_sg_table+0x48/0x208 [coresight_tmc] [<ffffffc07bb71cbc>] tmc_etr_alloc_sg_buf+0xac/0x240 [coresight_tmc] [<ffffffc07bb72538>] tmc_alloc_etr_buf.constprop.0+0x1f0/0x260 [coresight_tmc] [<ffffffc07bb7280c>] alloc_etr_buf.constprop.0.isra.0+0x74/0xa8 [coresight_tmc] [<ffffffc07bb72950>] tmc_alloc_etr_buffer+0x110/0x260 [coresight_tmc] [<ffffffc07bb38afc>] etm_setup_aux+0x204/0x3b0 [coresight] [<ffffffc08025837c>] rb_alloc_aux+0x20c/0x318 [<ffffffc08024dd84>] perf_mmap+0x2e4/0x7a0 [<ffffffc0802cceb0>] mmap_region+0x3b0/0xa08 [<ffffffc0802cd8a8>] do_mmap+0x3a0/0x500 [<ffffffc080295328>] vm_mmap_pgoff+0x100/0x1d0 [<ffffffc0802cadf8>] ksys_mmap_pgoff+0xb8/0x110 [<ffffffc080020688>] __arm64_sys_mmap+0x38/0x58 [<ffffffc080028fc0>] invoke_syscall.constprop.0+0x58/0x100 This due to the fact that we do not free the "sg_table" itself while freeing up the SG table and data pages. Fix this by freeing the sg_table in tmc_free_sg_table(). Fixes: 99443ea19e8b ("coresight: Add generic TMC sg table framework") Cc: Mike Leach <mike.leach@linaro.org> Cc: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240702132846.1677261-1-suzuki.poulose@arm.com
2024-08-19Coresight: Set correct cs_mode for dummy source to fix disable issueJie Gan
The coresight_disable_source_sysfs function should verify the mode of the coresight device before disabling the source. However, the mode for the dummy source device is always set to CS_MODE_DISABLED, resulting in the check consistently failing. As a result, dummy source cannot be properly disabled. Configure CS_MODE_SYSFS/CS_MODE_PERF during the enablement. Configure CS_MODE_DISABLED during the disablement. Fixes: 9d3ba0b6c056 ("Coresight: Add coresight dummy driver") Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240812042844.2890115-1-quic_jiegan@quicinc.com
2024-08-19Coresight: Set correct cs_mode for TPDM to fix disable issueJie Gan
The coresight_disable_source_sysfs function should verify the mode of the coresight device before disabling the source. However, the mode for the TPDM device is always set to CS_MODE_DISABLED, resulting in the check consistently failing. As a result, TPDM cannot be properly disabled. Configure CS_MODE_SYSFS/CS_MODE_PERF during the enablement. Configure CS_MODE_DISABLED during the disablement. Fixes: b3c71626a933 ("Coresight: Add coresight TPDM source driver") Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240812043043.2890694-1-quic_jiegan@quicinc.com
2024-08-19coresight: cti: use device_* to iterate over device child nodesJavier Carrasco
Drop the manual access to the fwnode of the device to iterate over its child nodes. `device_for_each_child_node` macro provides direct access to the child nodes, and given that they are only required within the loop, the scoped variant of the macro can be used. Use the `device_for_each_child_node_scoped` macro to iterate over the direct child nodes of the device. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240808-device_child_node_access-v2-1-fc757cc76650@gmail.com
2024-07-01hwtracing: use for_each_endpoint_of_node()Kuninori Morimoto
We already have for_each_endpoint_of_node(), don't use of_graph_get_next_endpoint() directly. Replace it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/878qyl970c.wl-kuninori.morimoto.gx@renesas.com
2024-06-21coresight: constify the struct device_type usageRicardo B. Marliere
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver core can properly handle constant struct device_type. Move the coresight_dev_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240219-device_cleanup-coresight-v1-1-4a8a0b816183@marliere.net Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
2024-06-10coresight: tmc: Remove duplicated include in coresight-tmc-core.cYang Li
The header files linux/acpi.h is included twice in coresight-tmc-core.c, so one inclusion of each can be removed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8937 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240506011121.39179-1-yang.lee@linux.alibaba.com
2024-06-07coresight: Fix ref leak when of_coresight_parse_endpoint() failsJames Clark
of_graph_get_next_endpoint() releases the reference to the previous endpoint on each iteration, but when parsing fails the loop exits early meaning the last reference is never dropped. Fix it by dropping the refcount in the exit condition. Fixes: d375b356e687 ("coresight: Fix support for sparsely populated ports") Signed-off-by: James Clark <james.clark@arm.com> Reported-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240529133626.90080-1-james.clark@arm.com
2024-05-22Merge tag 'char-misc-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver subsystem updates from Greg KH: "Here is the big set of char/misc and other driver subsystem updates for 6.10-rc1. Nothing major here, just lots of new drivers and updates for apis and new hardware types. Included in here are: - big IIO driver updates with more devices and drivers added - fpga driver updates - hyper-v driver updates - uio_pruss driver removal, no one uses it, other drivers control the same hardware now - binder minor updates - mhi driver updates - excon driver updates - counter driver updates - accessability driver updates - coresight driver updates - other hwtracing driver updates - nvmem driver updates - slimbus driver updates - spmi driver updates - other smaller misc and char driver updates All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (319 commits) misc: ntsync: mark driver as "broken" to prevent from building spmi: pmic-arb: Add multi bus support spmi: pmic-arb: Register controller for bus instead of arbiter spmi: pmic-arb: Make core resources acquiring a version operation spmi: pmic-arb: Make the APID init a version operation spmi: pmic-arb: Fix some compile warnings about members not being described dt-bindings: spmi: Deprecate qcom,bus-id dt-bindings: spmi: Add X1E80100 SPMI PMIC ARB schema spmi: pmic-arb: Replace three IS_ERR() calls by null pointer checks in spmi_pmic_arb_probe() spmi: hisi-spmi-controller: Do not override device identifier dt-bindings: spmi: hisilicon,hisi-spmi-controller: clean up example dt-bindings: spmi: hisilicon,hisi-spmi-controller: fix binding references spmi: make spmi_bus_type const extcon: adc-jack: Document missing struct members extcon: realtek: Remove unused of_gpio.h extcon: usbc-cros-ec: Convert to platform remove callback returning void extcon: usb-gpio: Convert to platform remove callback returning void extcon: max77843: Convert to platform remove callback returning void extcon: max3355: Convert to platform remove callback returning void extcon: intel-mrfld: Convert to platform remove callback returning void ...
2024-05-19Merge tag 'mm-stable-2024-05-17-19-19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull mm updates from Andrew Morton: "The usual shower of singleton fixes and minor series all over MM, documented (hopefully adequately) in the respective changelogs. Notable series include: - Lucas Stach has provided some page-mapping cleanup/consolidation/ maintainability work in the series "mm/treewide: Remove pXd_huge() API". - In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series: "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"". - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended. - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes". - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups". - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM". - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters". - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups". - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation". - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things. - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback". - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback". - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test. - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" "selftests/damon: add DAMOS quota goal test" - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" "mm/damon: misc fixes and improvements" - David Hildenbrand has disabled some known-to-fail selftests ni the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats". - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking"" * tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits) memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault selftests: cgroup: add tests to verify the zswap writeback path mm: memcg: make alloc_mem_cgroup_per_node_info() return bool mm/damon/core: fix return value from damos_wmark_metric_value mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED selftests: cgroup: remove redundant enabling of memory controller Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT Docs/mm/damon/design: use a list for supported filters Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file selftests/damon: classify tests for functionalities and regressions selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None' selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts selftests/damon/_damon_sysfs: check errors from nr_schemes file reads mm/damon/core: initialize ->esz_bp from damos_quota_init_priv() selftests/damon: add a test for DAMOS quota goal ...
2024-05-01coresight: tmc: Enable SG capability on ACPI based SoC-400 TMC ETR devicesAnshuman Khandual
This detects and enables the scatter gather capability (SG) on ACPI based Soc-400 TMC ETR devices via a new property called 'arm-armhc97c-sg-enable'. The updated ACPI spec can be found below, which contains this new property. https://developer.arm.com/documentation/den0067/latest/ This preserves current handling for the property 'arm,scatter-gather' both on ACPI and DT based platforms i.e the presence of the property is checked instead of the value. Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20240404072934.940760-1-anshuman.khandual@arm.com
2024-04-25fix missing vmalloc.h includesKent Overstreet
Patch series "Memory allocation profiling", v6. Overview: Low overhead [1] per-callsite memory allocation profiling. Not just for debug kernels, overhead low enough to be deployed in production. Example output: root@moria-kvm:~# sort -rn /proc/allocinfo 127664128 31168 mm/page_ext.c:270 func:alloc_page_ext 56373248 4737 mm/slub.c:2259 func:alloc_slab_page 14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded 14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash 13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs 11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio 9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node 4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable 4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start 3940352 962 mm/memory.c:4214 func:alloc_anon_folio 2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node ... Usage: kconfig options: - CONFIG_MEM_ALLOC_PROFILING - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT - CONFIG_MEM_ALLOC_PROFILING_DEBUG adds warnings for allocations that weren't accounted because of a missing annotation sysctl: /proc/sys/vm/mem_profiling Runtime info: /proc/allocinfo Notes: [1]: Overhead To measure the overhead we are comparing the following configurations: (1) Baseline with CONFIG_MEMCG_KMEM=n (2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) (3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) (4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1) (5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT (6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y (7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y Performance overhead: To evaluate performance we implemented an in-kernel test executing multiple get_free_page/free_page and kmalloc/kfree calls with allocation sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU affinity set to a specific CPU to minimize the noise. Below are results from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on 56 core Intel Xeon: kmalloc pgalloc (1 baseline) 6.764s 16.902s (2 default disabled) 6.793s (+0.43%) 17.007s (+0.62%) (3 default enabled) 7.197s (+6.40%) 23.666s (+40.02%) (4 runtime enabled) 7.405s (+9.48%) 23.901s (+41.41%) (5 memcg) 13.388s (+97.94%) 48.460s (+186.71%) (6 def disabled+memcg) 13.332s (+97.10%) 48.105s (+184.61%) (7 def enabled+memcg) 13.446s (+98.78%) 54.963s (+225.18%) Memory overhead: Kernel size: text data bss dec diff (1) 26515311 18890222 17018880 62424413 (2) 26524728 19423818 16740352 62688898 264485 (3) 26524724 19423818 16740352 62688894 264481 (4) 26524728 19423818 16740352 62688898 264485 (5) 26541782 18964374 16957440 62463596 39183 Memory consumption on a 56 core Intel CPU with 125GB of memory: Code tags: 192 kB PageExts: 262144 kB (256MB) SlabExts: 9876 kB (9.6MB) PcpuExts: 512 kB (0.5MB) Total overhead is 0.2% of total memory. Benchmarks: Hackbench tests run 100 times: hackbench -s 512 -l 200 -g 15 -f 25 -P baseline disabled profiling enabled profiling avg 0.3543 0.3559 (+0.0016) 0.3566 (+0.0023) stdev 0.0137 0.0188 0.0077 hackbench -l 10000 baseline disabled profiling enabled profiling avg 6.4218 6.4306 (+0.0088) 6.5077 (+0.0859) stdev 0.0933 0.0286 0.0489 stress-ng tests: stress-ng --class memory --seq 4 -t 60 stress-ng --class cpu --seq 4 -t 60 Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/ [2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/ This patch (of 37): The next patch drops vmalloc.h from a system header in order to fix a circular dependency; this adds it to all the files that were pulling it in implicitly. [kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c] Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev [surenb@google.com: fix arch/x86/mm/numa_32.c] Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com [kent.overstreet@linux.dev: a few places were depending on sizes.h] Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev [arnd@arndb.de: fix mm/kasan/hw_tags.c] Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org [surenb@google.com: fix arc build] Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25coresight: tpiu: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Fixes: 3d83d4d4904a ("coresight: tpiu: Move ACPI support from AMBA driver to platform driver") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/ad5e0d3ec081444a5ad04a7be277dde3afcb696b.1713858615.git.u.kleine-koenig@pengutronix.de
2024-04-25coresight: tmc: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Fixes: 70750e257aab ("coresight: tmc: Move ACPI support from AMBA driver to platform driver") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/3cf26d85a8d45f0efb07e07f3307a1b435ebf61e.1713858615.git.u.kleine-koenig@pengutronix.de
2024-04-25coresight: stm: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Fixes: 057256aaacc8 ("coresight: stm: Move ACPI support from AMBA driver to platform driver") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/3fefa60744fc68c9c4b40aeb69e34cda22582c4b.1713858615.git.u.kleine-koenig@pengutronix.de