summaryrefslogtreecommitdiff
path: root/drivers/perf/hisilicon
AgeCommit message (Collapse)Author
2025-07-14drivers/perf: hisi: Support PMUs with no interruptYicong Yang
We'll have PMUs don't have an interrupt to indicate the counter overflow, but the Uncore PMU core assume all the PMUs have interrupt. So handle this case in the core. The existing PMUs won't be affected. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Relax the event number check of v2 PMUsJunhao He
The supported event number range of each Uncore PMUs is provided by each driver in hisi_pmu::check_event and out of range events will be rejected. A later version with expanded event number range needs to register the PMU with updated hisi_pmu::check_event even if it's the only update, which means the expanded events cannot be used unless the driver's updated. However the unsupported events won't be counted by the hardware so we can relax the event number check to allow the use the expanded events. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driverJunhao He
SLLC v3 PMU has the following changes compared to previous version: a) update the register layout b) update the definition of SRCID_CTRL and TGTID_CTRL registers. To be compatible with v2, we use maximum width (11 bits) and mask the extra length for themselves. c) remove latency events (driver does not need to be adapted). SLLC v3 PMU is identified with HID HISI0264. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU informationJunhao He
Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. Factor out SLLC register definition to struct hisi_sllc_pmu_regs. No functional changes intended. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driverJunhao He
HiSilicon DDRC v3 PMU has the different interrupt register offset compared to the v2. Add device information of v3 PMU with ACPI HID HISI0235. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14drivers/perf: hisi: Simplify the probe process for each DDRC versionJunhao He
Version 1 and 2 of DDRC PMU also use different HID. Make use of struct acpi_device_id::driver_data for version specific information rather than judge the version register. This will help to simplify the probe process and also a bit easier for extension. In order to support this extend struct hisi_pmu_dev_info for version specific counter bits and event range. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-01-07drivers/perf: hisi: Set correct IRQ affinity for PMUs with no associationYicong Yang
For PMUs with no association, the hisi_pmu->on_cpu is initialized according to the NUMA locality but use a wrong CPU for the interrupt affinity. The CPU selected from cpumask_local_spread() can be different from the CPU by the cpuhp callback. Fix this by setting the IRQ affinity to hisi_pmu->on_cpu. Fixes: 6cd137088fdf ("drivers/perf: hisi: Refactor the detection of associated CPUs") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241223125134.57885-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Delete redundant blank line of DDRC PMUJunhao He
Do not add blank line at the end of a code block defined by braces. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-11-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Fix incorrect variable name "hha_pmu" in DDRC PMU driverJunhao He
In the callback function write_evtype(), the variable name of struct hisi_pmu should be "ddrc_pmu" instead of "hha_pmu". Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-10-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Export associated CPUs of each PMU through sysfsYicong Yang
Although the event of the uncore PMU can only be opened on a single CPU, some PMU does have the affinity on a range of CPUs. For example the L3C PMU is associated to the CPUs sharing the L3T it monitors. Users may infer this affinity by the PMU name which may have SCCL ID and CCL ID encoded (for L3C etc), but it's not that straightforward. So export this information by adding an "associated_cpus" sysfs attribute then user can get this directly. Reviewed-by: Jonathan Cameron <Joanthan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-9-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Provide a generic implementation of cpumask/identifierYicong Yang
Each type of HiSilicon Uncore PMU has the following sysfs attributes: - format: bitmask in perf_event_attr::config[012] of corresponding attribute - event: events name and corresponding event code - cpumask: range of CPUs the events can be opened on - identifier: the version of this PMU Different types of PMU have different implementations of the "format" and "event" but all share the same implementation of the "cpumask" and "identifier". Thus we can move cpumask and identifier to the hisi_uncore_pmu framework and drivers can use the generic implementation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Add a common function to retrieve topology from firmwareYicong Yang
Currently each type of uncore PMU driver uses almost the same routine and the same firmware interface (or properties) to retrieve the topology information, then reset the unused IDs to -1. Extract the common parts to the framework in hisi_uncore_pmu_init_topology(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Extract topology information to a separate structureYicong Yang
HiSilicon Uncore PMUs are identified by the IDs of the topology element on which the PMUs are located. Add a new separate struct hisi_pmu_toplogy to encapsulate this information. Add additional documentation on the meaning of each ID. - make sccl_id and sicl_id into a union since they're exclusive. It can also be accessed by scl_id if the SICL/SCCL distinction is not relevant - make index_id and sub_id signed so -1 may be used to indicate the PMU doesn't have this topology element or it could not be retrieved. This patch should have no functional changes. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Refactor the detection of associated CPUsYicong Yang
There are two type of PMUs supported currently: 1) PMUs locate on SCCL (Super CPU Cluster [1]), associated with certain CCL (CPU cluster [1])(e.g. L3C PMU) or not (e.g. DDRC PMU) 2) PMUs locate on the SICL (Super IO Cluster [1]), which has no association with certain CPU topology (e.g. CPA PMU) Currently the associated CPUs of the PMU is detected in the cpuhp online callback as below: - for type 1) the CPUs match PMU's sccl_id and ccl_id - for type 2) PMU's sccl_id is -1 and all online CPUs will be associated Since uncore PMUs are not bound to certain CPU context and event could be counting started by any online CPU, the associated CPUs are just a preference. Below disadvantages are observed in current implementation: - the PMU cannot be used if its associated CPUs are offline - SICL PMUs are associated to all the online CPUs implicitly without the consideration of locality So refactor the way we detect the associated CPUs in below aspects: - add a clear definition of hisi_pmu::associated_cpus - initialize hisi_pmu::on_cpu based on locality if no associated CPU found, otherwise update it from associated CPUs - drop the detection with a sccl_id of -1 for SICL PMUs [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/perf/hisi-pmu.rst?h=v6.12-rc1 Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Migrate to one online CPU if no associated one onlineYicong Yang
If the selected CPU hisi_pmu::on_cpu goes offline, driver will select a new online CPU from hisi_pmu::associated_cpus, or if no online CPU found the PMU context won't be migrated. However for uncore PMUs the associated CPUs are just a peference and it also works to schedule the events on any online CPUs. So add a fallback to choose an online CPU if no associated CPUs found. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Don't update the associated_cpus on CPU offlineYicong Yang
Event will be scheduled on CPU of hisi_pmu::on_cpu which is selected from the intersection of hisi_pmu::associated_cpus and online CPUs. So the associated_cpus don't need to be maintained with online CPUs. This will save one update operation if one associated CPU is offlined. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-10drivers/perf: hisi: Define a symbol namespace for HiSilicon Uncore PMUsYicong Yang
The HiSilicon Uncore PMU framework implements some common functions and exported them to the drivers. Use a specific HISI_PMU namespace for the exported symbols to avoid pollute the generic ones. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241210141525.37788-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-11-06perf: Switch back to struct platform_driver::remove()Uwe Kleine-König
After commit 0edb555a65d1 ("platform: Make platform_driver::remove() return void") .remove() is (again) the right callback to implement for platform drivers. Convert all platform drivers below drivers/perf to use .remove(), with the eventual goal to drop struct platform_driver::remove_new(). As .remove() and .remove_new() have the same prototypes, conversion is done by just changing the structure member name in the driver initializer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20241027180313.410964-2-u.kleine-koenig@baylibre.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]Yicong Yang
Currently users can get the Root Ports supported by the PCIe PMU by "bus" sysfs attributes which indicates the PCIe bus number where Root Ports are located. This maybe insufficient since Root Ports supported by different PCIe PMUs may be located on the same PCIe bus. So export the BDF range the Root Ports additionally. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Fix TLP headers bandwidth countingYicong Yang
We make the initial value of event ctrl register as HISI_PCIE_INIT_SET and modify according to the user options. This will make TLP headers bandwidth only counting never take effect since HISI_PCIE_INIT_SET configures to count the TLP payloads bandwidth. Fix this by making the initial value of event ctrl register as 0. Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-08-30drivers/perf: hisi_pcie: Record hardware counts correctlyYicong Yang
Currently we set the period and record it as the initial value of the counter without checking it's set to the hardware successfully or not. However the counter maybe unwritable if the target event is unsupported by the device. In such case we will pass user a wrong count: [start counts when setting the period] hwc->prev_count = 0x8000000000000000 device.counter_value = 0 // the counter is not set as the period [when user reads the counter] event->count = device.counter_value - hwc->prev_count = 0x8000000000000000 // wrong. should be 0. Fix this by record the hardware counter counts correctly when setting the period. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240829090332.28756-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-10perf: add missing MODULE_DESCRIPTION() macrosJeff Johnson
With ARCH=x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o Add the missing invocation of the MODULE_DESCRIPTION() macro to all files which have a MODULE_LICENSE(). This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although it did not produce a warning with the x86 allmodconfig configuration, may cause this warning with arm64 configurations. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-05-22Merge tag 'driver-core-6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the small set of driver core and kernfs changes for 6.10-rc1. Nothing major here at all, just a small set of changes for some driver core apis, and minor fixups. Included in here are: - sysfs_bin_attr_simple_read() helper added and used - device_show_string() helper added and used All usages of these were acked by the various maintainers. Also in here are: - kernfs minor cleanup - removed unused functions - typo fix in documentation - pay attention to sysfs_create_link() failures in module.c finally All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: device property: Fix a typo in the description of device_get_child_node_count() kernfs: mount: Remove unnecessary ‘NULL’ values from knparent scsi: Use device_show_string() helper for sysfs attributes platform/x86: Use device_show_string() helper for sysfs attributes perf: Use device_show_string() helper for sysfs attributes IB/qib: Use device_show_string() helper for sysfs attributes hwmon: Use device_show_string() helper for sysfs attributes driver core: Add device_show_string() helper for sysfs attributes treewide: Use sysfs_bin_attr_simple_read() helper sysfs: Add sysfs_bin_attr_simple_read() helper module: don't ignore sysfs_create_link() failures driver core: Remove unused platform_notify, platform_notify_remove
2024-05-04perf: Use device_show_string() helper for sysfs attributesLukas Wunner
Deduplicate sysfs ->show() callbacks which expose a string at a static memory location. Use the newly introduced device_show_string() helper in the driver core instead by declaring those sysfs attributes with DEVICE_STRING_ATTR_RO(). No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/3a297850312b4ecb62d6872121de04496900f502.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-28drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()Hao Chen
pci_alloc_irq_vectors() allocates an irq vector. When devm_add_action() fails, the irq vector is not freed, which leads to a memory leak. Replace the devm_add_action with devm_add_action_or_reset to ensure the irq vector can be destroyed when it fails. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi: hns3: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/} Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Hao Chen <chenhao418@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28drivers/perf: hisi_pcie: Fix out-of-bound access when valid event groupJunhao He
The perf tool allows users to create event groups through following cmd [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HISI_PCIE_MAX_COUNTERS, the memory write overflow of event_group array occurs. Add array index check to fix the possible array out of bounds violation, and return directly when write new events are written to array bounds. There are 9 different events in an event_group. [1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}' Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jijie Shao <shaojijie@huawei.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240425124627.13764-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-hns3: Assign parents for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-6-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-uncore: Assign parents for event_source devicesJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-4-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-pcie: Assign parent for event_source deviceJonathan Cameron
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-2-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_uncore: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-9-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_pcie: Avoid placing cpumask on the stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-8-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()Junhao He
The function xxx_find_related_event() scan all working events to find related events. During this process, we also can find the idle counters. If not found related events, return the first idle counter to simplify the code. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Relax the check on related eventsJunhao He
If we use two events with the same filter and related event type (see the following example), the driver check whether they are related events and are in the same group, otherwise the function hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event cannot count but the 1st event is running, although the PCIe PMU has other idle counters. In this case, The perf event scheduler will make the two events to multiplex a counter, if the user use the formula (1st event_value / 2nd event_value) to calculate the bandwidth, he/she won't get the correct value, because they are not counting at the same period. This patch tries to fix this by making the related events to use different idle counters if they are not in the same event group. And finally, I'm going to say. The related events are best used in the same group [1]. There are two ways to know if they are related events. a) By event name, such as the latency events "xxx_latency, xxx_cnt" or bandwidth events "xxx_flux, xxx_time". b) By event type, such as "event=0xXXXX, event=0x1XXXX". Use group to count the related events: [1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}" example: 1st event: hisi_pcie0_core1/event=0x804,port=1 2nd event: hisi_pcie0_core1/event=0x10804,port=1 test cmd: perf stat -e hisi_pcie0_core1/event=0x804,port=1/ \ -e hisi_pcie0_core1/event=0x10804,port=1/ before patch: 25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%) 470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%) after patch: 24,147 hisi_pcie0_core1/event=0x804,port=1/ 474,558 hisi_pcie0_core1/event=0x10804,port=1/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/r/20240223103359.18669-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Check the target filter properlyJunhao He
The PMU can monitor traffic of certain target Root Port or downstream target Endpoint. User can specify the target filter by the "port" or "bdf" option respectively. The PMU can only monitor the Root Port or Endpoint on the same PCIe core so the value of "port" or "bdf" should be valid and will be checked by the driver. Currently at least and only one of "port" and "bdf" option must be set. If "port" filter is not set or is set explicitly to zero (default), driver will regard the user specifies a "bdf" option since "port" option is a bitmask of the target Root Ports and zero is not a valid value. If user not explicitly set "port" or "bdf" filter, the driver uses "bdf" default value (zero) to set target filter, but driver will skip the check of bdf=0, although it's a valid value (meaning 0000:000:00.0). Then the user just gets zero. Therefore, we need to check if both "port" and "bdf" are invalid, then return failure and report warning. Testing: before the patch: 0 hisi_pcie0_core1/rx_mrd_flux/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ after the patch: <not supported> hisi_pcie0_core1/rx_mrd_flux/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Add more events for counting TLP bandwidthYicong Yang
A typical PCIe transaction is consisted of various TLP packets in both direction. For counting bandwidth only memory read events are exported currently. Add memory write and completion counting events of both direction to complete the bandwidth counting. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Fix incorrect counting under metric modeYicong Yang
The metric counting shows incorrect results if the events in the metric group using the same event but different filter options. This is because we only judge the event code to decide whether the event in the metric group should share the same hardware counter, but ignore the settings of the filter. For example, on a platform of 2 ports 0x1 and 0x2 but only port 0x1 has a downstream PCIe NVME device. The metric counting shows both ports have the same counts because we misassign these two events to one same hardware counter: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/ 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.153863691 seconds time elapsed Fix this by using the whole config rather than the event only to judge whether two events are the same and should share the same hardware counter. With this patch, the metric counting in the above case tends to be corrected: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 0 hisi_pcie0_core1/event=0x0104,port=0x2/ 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.152875631 seconds time elapsed Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()Yicong Yang
Factor out retrieving of the register value for the corresponding event from hisi_pcie_config_event_ctrl() into a new function hisi_pcie_pmu_get_event_ctrl_val() allowing future reuse. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()Yicong Yang
hisi_pcie_pmu_{config,clear}_filter() are config/clear HISI_PCIE_EVENT_CTRL register which contains not only the filter but also the event code. The function names are bit misleading. Rename it to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflects their functions more accurately. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09Junhao He
HiSilicon UC PMU v2 suffers the erratum 162700402 that the PMU counter cannot be set due to the lack of clock under power saving mode. This will lead to error or inaccurate counts. The clock can be enabled by the PMU global enabling control. This patch tries to fix this by set the UC PMU enable before set event period to turn on the clock, and then restore the UC PMU configuration. The counter register can hold its value without a clock. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240227125231.53127-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf: hisilicon: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert these drivers from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/33a8be0641b9447469fb7f6af0a10fb65efa97a3.1702648125.git.u.kleine-koenig@pengutronix.de Signed-off-by: Will Deacon <will@kernel.org>
2023-12-05drivers/perf: hisi: Fix some event id for HiSilicon UC pmuJunhao He
Some event id of HiSilicon uncore UC PMU driver is incorrect, fix them. Fixes: 312eca95e28d ("drivers/perf: hisi: Add support for HiSilicon UC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231204110425.20354-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24perf: hisi: Fix use-after-free when register pmu failsJunhao He
When we fail to register the uncore pmu, the pmu context may not been allocated. The error handing will call cpuhp_state_remove_instance() to call uncore pmu offline callback, which migrate the pmu context. Since that's liable to lead to some kind of use-after-free. Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been failed to register. Fixes: a0ab25cd82ee ("drivers/perf: hisi: Add support for HiSilicon PA PMU driver") FIxes: 3bf30882c3c7 ("drivers/perf: hisi: Add support for HiSilicon SLLC PMU driver") Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20231024113630.13472-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Initialize event->cpu only on successYicong Yang
Initialize the event->cpu only on success. To be more reasonable and keep consistent with other PMUs. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-24drivers/perf: hisi_pcie: Check the type first in pmu::event_init()Yicong Yang
Check whether the event type matches the PMU type firstly in pmu::event_init() before touching the event. Otherwise we'll change the events of others and lead to incorrect results. Since in perf_init_event() we may call every pmu's event_init() in a certain case, we should not modify the event if it's not ours. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231024092954.42297-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-10-19drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for ↵Hao Chen
hisi_hns3_pmu uninit process When tearing down a 'hisi_hns3' PMU, we mistakenly run the CPU hotplug callbacks after the device has been unregistered, leading to fireworks when we try to execute empty function callbacks within the driver: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 | CPU: 0 PID: 15 Comm: cpuhp/0 Tainted: G W O 5.12.0-rc4+ #1 | Hardware name: , BIOS KpxxxFPGA 1P B600 V143 04/22/2021 | pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--) | pc : perf_pmu_migrate_context+0x98/0x38c | lr : perf_pmu_migrate_context+0x94/0x38c | | Call trace: | perf_pmu_migrate_context+0x98/0x38c | hisi_hns3_pmu_offline_cpu+0x104/0x12c [hisi_hns3_pmu] Use cpuhp_state_remove_instance_nocalls() instead of cpuhp_state_remove_instance() so that the notifiers don't execute after the PMU device has been unregistered. Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20231019091352.998964-1-shaojijie@huawei.com [will: Rewrote commit message] Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16drivers/perf: hisi: Schedule perf session according to localityYicong Yang
The PCIe PMUs locate on different NUMA node but currently we don't consider it and likely stack all the sessions on the same CPU: [root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask 0 0 0 0 0 0 This can be optimize a bit to use a local CPU for the PMU. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230815131010.2147-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16drivers/perf: hisi: Add support for HiSilicon UC PMU driverJunhao He
On HiSilicon Hip09 platform, there are 4 UC (unified cache) modules on each chip CCL (CPU Cluster). UC is a cache that provides coherence between NUMA and UMA domains. It is located between L2 and Memory System. Many PMU events are supported. Let's support the UC PMU driver using the HiSilicon uncore PMU framework. * rd_req_en : rd_req_en is the abbreviation of read request tracetag enable and allows user to count only read operations. Details are listed in the hisi-pmu document at Documentation/admin-guide/perf/hisi-pmu.rst * srcid_en & srcid: Allows users to filter statistical information based on specific CPU/ICL by srcid. srcid_en depends on rd_req_en being enabled. * uring_channel: Allows users to filter statistical information based on the specified tx request uring channel. uring_channel only supported events: [0x47 ~ 0x59]. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driverJunhao He
Compared to the original PA device, H60PA offers higher bandwidth. The H60PA is a new device and we use HID to differentiate them. The events supported by PAv3 and PAv2 are different. The PAv3 PMU removed some events which are supported by PAv2 PMU. The older PA PMU driver will probe v3 as v2. Therefore PA events displayed by "perf list" cannot work properly. We add the HISI0275 HID for PAv3 PMU to distinguish different. For each H60PA PMU, except for the overflow interrupt register, other functions of the H60PA PMU are the same as the original PA PMU module. It has 8-programable counters and each counter is free-running. Interrupt is supported to handle counter (64-bits) overflow. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-09drivers/perf: hisi: Don't migrate perf to the CPU going to teardownJunhao He
The driver needs to migrate the perf context if the current using CPU going to teardown. By the time calling the cpuhp::teardown() callback the cpu_online_mask() hasn't updated yet and still includes the CPU going to teardown. In current driver's implementation we may migrate the context to the teardown CPU and leads to the below calltrace: ... [ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008 [ 368.113699][ T932] Call trace: [ 368.116834][ T932] __switch_to+0x7c/0xbc [ 368.120924][ T932] __schedule+0x338/0x6f0 [ 368.125098][ T932] schedule+0x50/0xe0 [ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24 [ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc [ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30 [ 368.144573][ T932] mutex_lock+0x50/0x60 [ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0 [ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu] [ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650 [ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190 [ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0 [ 368.175099][ T932] kthread+0x108/0x13c [ 368.179012][ T932] ret_from_fork+0x10/0x18 ... Use function cpumask_any_but() to find one correct active cpu to fixes this issue. Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230608114326.27649-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>