linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-12-14	Merge branch 'for-next/perf-user-counter-access' into for-next/perf	Will Deacon
	* for-next/perf-user-counter-access: Documentation: arm64: Document PMU counters access from userspace arm64: perf: Enable PMU counter userspace access for perf event arm64: perf: Add userspace counter access disable switch perf: Add a counter for number of user access events in context x86: perf: Move RDPMC event flag to a common definition
2021-12-14	Merge branch 'for-next/perf-smmu' into for-next/perf	Will Deacon
	* for-next/perf-smmu: perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding
2021-12-14	Merge branch 'for-next/perf-hisi' into for-next/perf	Will Deacon
	* for-next/perf-hisi: drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver
2021-12-14	Merge branch 'for-next/perf-cn10k' into for-next/perf	Will Deacon
	* for-next/perf-cn10k: dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support
2021-12-14	Merge branch 'for-next/perf-cmn' into for-next/perf	Will Deacon
	* for-next/perf-cmn: perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses perf/arm-cmn: Optimise DTM counter reads perf/arm-cmn: Refactor DTM handling perf/arm-cmn: Streamline node iteration perf/arm-cmn: Refactor node ID handling perf/arm-cmn: Drop compile-test restriction perf/arm-cmn: Account for NUMA affinity perf/arm-cmn: Fix CPU hotplug unregistration
2021-12-14	ASoC: rt5682: fix the wrong jack type detected	Derek Fang
	Some powers were changed during the jack insert detection and clk's enable/disable in CCF. If in parallel, the influence has a chance to detect the wrong jack type, so add a lock. Signed-off-by: Derek Fang <derek.fang@realtek.com> Link: https://lore.kernel.org/r/20211214105033.471-1-derek.fang@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-12-14	arm64: atomics: lse: define RETURN ops in terms of FETCH ops	Mark Rutland
	The FEAT_LSE atomic instructions include LD* instructions which return the original value of a memory location can be used to directly implement FETCH opertations. Each RETURN op is implemented as a copy of the corresponding FETCH op with a trailing instruction to generate the new value of the memory location. We only directly implement _fetch_add(), for which we have a trailing `add` instruction. As the compiler has no visibility of the `add`, this leads to less than optimal code generation when consuming the result. For example, the compiler cannot constant-fold the addition into later operations, and currently GCC 11.1.0 will compile: return __lse_atomic_sub_return(1, v) == 0; As: mov w1, #0xffffffff ldaddal w1, w2, [x0] add w1, w1, w2 cmp w1, #0x0 cset w0, eq // eq = none ret This patch improves this by replacing the `add` with C addition after the inline assembly block, e.g. ret += i; This allows the compiler to manipulate `i`. This permits the compiler to merge the `add` and `cmp` for the above, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x0] cmp w1, #0x1 cset w0, eq // eq = none ret With this change the assembly for each RETURN op is identical to the corresponding FETCH op (including barriers and clobbers) so I've removed the inline assembly and rewritten each RETURN op in terms of the corresponding FETCH op, e.g. \| static inline void __lse_atomic_add_return(int i, atomic_t *v) \| { \| return __lse_atomic_fetch_add(i, v) + i \| } The new construction does not adversely affect the common case, and before and after this patch GCC 11.1.0 can compile: __lse_atomic_add_return(i, v) As: ldaddal w0, w2, [x1] add w0, w0, w2 ... while having the freedom to do better elsewhere. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	arm64: atomics: lse: improve constraints for simple ops	Mark Rutland
	We have overly conservative assembly constraints for the basic FEAT_LSE atomic instructions, and using more accurate and permissive constraints will allow for better code generation. The FEAT_LSE basic atomic instructions have come in two forms: LD{op}{order}{size} <Rs>, <Rt>, [<Rn>] ST{op}{order}{size} <Rs>, [<Rn>] The ST* forms are aliases of the LD* forms where: ST{op}{order}{size} <Rs>, [<Rn>] Is: LD{op}{order}{size} <Rs>, XZR, [<Rn>] For either form, both <Rs> and <Rn> are read but not written back to, and <Rt> is written with the original value of the memory location. Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written after the other register value(s) are consumed. There are no UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or <Rn> are the same register. Our current inline assembly always uses <Rs> == <Rt>, treating this register as both an input and an output (using a '+r' constraint). This forces the compiler to do some unnecessary register shuffling and/or redundant value generation. For example, the compiler cannot reuse the <Rs> value, and currently GCC 11.1.0 will compile: __lse_atomic_add(1, a); __lse_atomic_add(1, b); __lse_atomic_add(1, c); As: mov w3, #0x1 mov w4, w3 stadd w4, [x0] mov w0, w3 stadd w0, [x1] stadd w3, [x2] We can improve this with more accurate constraints, separating <Rs> and <Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an output-only value ('=r'). As <Rt> is written back after <Rs> is consumed, it does not need to be earlyclobber ('=&r'), leaving the compiler free to use the same register for both <Rs> and <Rt> where this is desirable. At the same time, the redundant 'r' constraint for `v` is removed, as the `+Q` constraint is sufficient. With this change, the above example becomes: mov w3, #0x1 stadd w3, [x0] stadd w3, [x1] stadd w3, [x2] I've made this change for the non-value-returning and FETCH ops. The RETURN ops have a multi-instruction sequence for which we cannot use the same constraints, and a subsequent patch will rewrite hte RETURN ops in terms of the FETCH ops, relying on the ability for the compiler to reuse the <Rs> value. This is intended as an optimization. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	arm64: atomics: lse: define ANDs in terms of ANDNOTs	Mark Rutland
	The FEAT_LSE atomic instructions include atomic bit-clear instructions (`ldclr` and `stclr`) which can be used to directly implement ANDNOT operations. Each AND op is implemented as a copy of the corresponding ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the `i` argument. As the compiler has no visibility of the `mvn`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_and(0xf, v) can be compiled to: mov w1, #0xf mvn w1, w1 ldclral w1, w1, [x2] This patch improves this by replacing the `mvn` with NOT in C before the inline assembly block, e.g. i = ~i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xfffffff0 ldclral w1, w1, [x2] With this change the assembly for each AND op is identical to the corresponding ANDNOT op (including barriers and clobbers), so I've removed the inline assembly and rewritten each AND op in terms of the corresponding ANDNOT op, e.g. \| static inline void __lse_atomic_and(int i, atomic_t *v) \| { \| return __lse_atomic_andnot(~i, v); \| } This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	arm64: atomics lse: define SUBs in terms of ADDs	Mark Rutland
	The FEAT_LSE atomic instructions include atomic ADD instructions (`stadd` and `ldadd`), but do not include atomic SUB instructions, so we must build all of the SUB operations using the ADD instructions. We open-code these today, with each SUB op implemented as a copy of the corresponding ADD op with a leading `neg` instruction in the inline assembly to negate the `i` argument. As the compiler has no visibility of the `neg`, this leads to less than optimal code generation when generating `i` into a register. For example, __les_atomic_fetch_sub(1, v) can be compiled to: mov w1, #0x1 neg w1, w1 ldaddal w1, w1, [x2] This patch improves this by replacing the `neg` with negation in C before the inline assembly block, e.g. i = -i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x2] With this change the assembly for each SUB op is identical to the corresponding ADD op (including barriers and clobbers), so I've removed the inline assembly and rewritten each SUB op in terms of the corresponding ADD op, e.g. \| static inline void __lse_atomic_sub(int i, atomic_t *v) \| { \| __lse_atomic_add(-i, v); \| } For clarity I've moved the definition of each SUB op immediately after the corresponding ADD op, and used a single macro to create the RETURN forms of both ops. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	arm64: atomics: format whitespace consistently	Mark Rutland
	The code for the atomic ops is formatted inconsistently, and while this is not a functional problem it is rather distracting when working on them. Some have ops have consistent indentation, e.g. \| #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ \| static inline int __lse_atomic_add_return##name(int i, atomic_t v) \ \| { \ \| u32 tmp; \ \| \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ \| " add %w[i], %w[i], %w[tmp]" \ \| : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ \| : "r" (v) \ \| : cl); \ \| \ \| return i; \ \| } While others have negative indentation for some lines, and/or have misaligned trailing backslashes, e.g. \| static inline void __lse_atomic_##op(int i, atomic_t v) \ \| { \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " " #asm_op " %w[i], %[v]\n" \ \| : [i] "+r" (i), [v] "+Q" (v->counter) \ \| : "r" (v)); \ \| } This patch makes the indentation consistent and also aligns the trailing backslashes. This makes the code easier to read for those (like myself) who are easily distracted by these inconsistencies. This is intended as a cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	Merge branch 'mlxsw-fixes'	David S. Miller
	Ido Schimmel says: ==================== mlxsw: MAC profiles occupancy fix Patch #1 fixes a router interface (RIF) MAC profiles occupancy bug that was merged in the last cycle. Patch #2 adds a selftest that fails without the fix. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	selftests: mlxsw: Add a test case for MAC profiles consolidation	Danielle Ratson
	Add a test case to cover the bug fixed by the previous patch. Edit the MAC address of one netdev so that it matches the MAC address of the second netdev. Verify that the two MAC profiles were consolidated by testing that the MAC profiles occupancy decreased by one. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	mlxsw: spectrum_router: Consolidate MAC profiles when possible	Danielle Ratson
	Currently, when setting a router interface (RIF) MAC address while the MAC profile is not shared with other RIFs, the profile is edited so that the new MAC address is assigned to it. This does not take into account a situation in which the new MAC address already matches an existing MAC profile. In that situation, two MAC profiles will be occupied even though they hold MAC addresses from the same profile. In order to prevent that, add a check to ensure that editing a MAC profile takes place only when the new MAC address does not match an existing profile. Fixes: 605d25cd782a6 ("mlxsw: spectrum_router: Add RIF MAC profiles support") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Tested-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	rds: memory leak in __rds_conn_create()	Hangyu Hua
	__rds_conn_create() did not release conn->c_path when loop_trans != 0 and trans->t_prefer_loopback != 0 and is_outgoing == 0. Fixes: aced3ce57cd3 ("RDS tcp loopback connection can hang") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	Merge tag 'mac80211-for-net-2021-12-14' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A fairly large number of fixes this time: * fix a station info memory leak on insert collisions * a rate control fix for retransmissions * two aggregation setup fixes * reload current regdomain when reloading database * a locking fix in regulatory work * a probe request allocation size fix in mac80211 * apply TCP vs. aggregation (sk pacing) on mesh * fix ordering of channel context update vs. station state * set up skb->dev for mesh forwarding properly * track QoS data frames only for admission control to avoid out-of-bounds read (found by syzbot) * validate extended element ID vs. existing data to avoid out-of-bounds read (found by syzbot) * fix locking in mac80211 aggregation TX setup * fix traffic stall after HW restart when TXQs are used * fix ordering of reconfig/restart after HW restart * fix interface type for extended aggregation capability lookup ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	Merge branch '40GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-13 This series contains updates to iavf driver only. Dan Carpenter fixes some missing mutex unlocking. Stefan Assmann restores stopping watchdog from overriding to reset state. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	flow_offload: return EOPNOTSUPP for the unsupported mpls action type	Baowen Zheng
	We need to return EOPNOTSUPP for the unsupported mpls action type when setup the flow action. In the original implement, we will return 0 for the unsupported mpls action type, actually we do not setup it and the following actions to the flow action entry. Fixes: 9838b20a7fb2 ("net: sched: take rtnl lock in tc_setup_flow_action()") Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	mmc: sdhci-tegra: Fix switch to HS400ES mode	Prathamesh Shete
	When CMD13 is sent after switching to HS400ES mode, the bus is operating at either MMC_HIGH_26_MAX_DTR or MMC_HIGH_52_MAX_DTR. To meet Tegra SDHCI requirement at HS400ES mode, force SDHCI interface clock to MMC_HS200_MAX_DTR (200 MHz) so that host controller CAR clock and the interface clock are rate matched. Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: dfc9700cef77 ("mmc: tegra: Implement HS400 enhanced strobe") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211214113653.4631-1-pshete@nvidia.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-12-14	net: stmmac: fix tc flower deletion for VLAN priority Rx steering	Ong Boon Leong
	To replicate the issue:- 1) Add 1 flower filter for VLAN Priority based frame steering:- $ IFDEVNAME=eth0 $ tc qdisc add dev $IFDEVNAME ingress $ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0 $ tc filter add dev $IFDEVNAME parent ffff: protocol 802.1Q \ flower vlan_prio 0 hw_tc 0 2) Get the 'pref' id $ tc filter show dev $IFDEVNAME ingress 3) Delete a specific tc flower record (say pref 49151) $ tc filter del dev $IFDEVNAME parent ffff: pref 49151 From dmesg, we will observe kernel NULL pointer ooops [ 197.170464] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 197.171367] #PF: supervisor read access in kernel mode [ 197.171367] #PF: error_code(0x0000) - not-present page [ 197.171367] PGD 0 P4D 0 [ 197.171367] Oops: 0000 [#1] PREEMPT SMP NOPTI <snip> [ 197.171367] RIP: 0010:tc_setup_cls+0x20b/0x4a0 [stmmac] <snip> [ 197.171367] Call Trace: [ 197.171367] <TASK> [ 197.171367] ? __stmmac_disable_all_queues+0xa8/0xe0 [stmmac] [ 197.171367] stmmac_setup_tc_block_cb+0x70/0x110 [stmmac] [ 197.171367] tc_setup_cb_destroy+0xb3/0x180 [ 197.171367] fl_hw_destroy_filter+0x94/0xc0 [cls_flower] The above issue is due to previous incorrect implementation of tc_del_vlan_flow(), shown below, that uses flow_cls_offload_flow_rule() to get struct flow_rule rule which is no longer valid for tc filter delete operation. struct flow_rule rule = flow_cls_offload_flow_rule(cls); struct flow_dissector *dissector = rule->match.dissector; So, to ensure tc_del_vlan_flow() deletes the right VLAN cls record for earlier configured RX queue (configured by hw_tc) in tc_add_vlan_flow(), this patch introduces stmmac_rfs_entry as driver-side flow_cls_offload record for 'RX frame steering' tc flower, currently used for VLAN priority. The implementation has taken consideration for future extension to include other type RX frame steering such as EtherType based. v2: - Clean up overly extensive backtrace and rewrite git message to better explain the kernel NULL pointer issue. Fixes: 0e039f5cf86c ("net: stmmac: add RX frame steering based on VLAN priority in tc flower") Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14	drivers/perf: hisi: Add driver for HiSilicon PCIe PMU	Qi Liu
	PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported to sample bandwidth, latency, buffer occupation etc. Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is registered as a PMU in /sys/bus/event_source/devices, so users can select target PMU, and use filter to do further sets. Filtering options contains: event - select the event. port - select target Root Ports. Information of Root Ports are shown under sysfs. bdf - select requester_id of target EP device. trig_len - set trigger condition for starting event statistics. trig_mode - set trigger mode. 0 means starting to statistic when bigger than trigger condition, and 1 means smaller. thr_len - set threshold for statistics. thr_mode - set threshold mode. 0 means count when bigger than threshold, and 1 means smaller. Acked-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	docs: perf: Add description for HiSilicon PCIe PMU driver	Qi Liu
	PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported on HiSilicon HIP09 platform. Document it to provide guidance on how to use it. Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-2-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	PCI/MSI: Clear PCI_MSIX_FLAGS_MASKALL on error	Thomas Gleixner
	PCI_MSIX_FLAGS_MASKALL is set in the MSI-X control register at MSI-X interrupt setup time. It's cleared on success, but the error handling path only clears the PCI_MSIX_FLAGS_ENABLE bit. That's incorrect as the reset state of the PCI_MSIX_FLAGS_MASKALL bit is zero. That can be observed via lspci: Capabilities: [b0] MSI-X: Enable- Count=67 Masked+ Clear the bit in the error path to restore the reset state. Fixes: 438553958ba1 ("PCI/MSI: Enable and mask MSI-X early") Reported-by: Stefan Roese <sr@denx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Stefan Roese <sr@denx.de> Cc: linux-pci@vger.kernel.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Marek Vasut <marex@denx.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87tufevoqx.ffs@tglx
2021-12-14	PCI/MSI: Mask MSI-X vectors only on success	Stefan Roese
	Masking all unused MSI-X entries is done to ensure that a crash kernel starts from a clean slate, which correponds to the reset state of the device as defined in the PCI-E specificion 3.0 and later: Vector Control for MSI-X Table Entries -------------------------------------- "00: Mask bit: When this bit is set, the function is prohibited from sending a message using this MSI-X Table entry. ... This bit’s state after reset is 1 (entry is masked)." A Marvell NVME device fails to deliver MSI interrupts after trying to enable MSI-X interrupts due to that masking. It seems to take the MSI-X mask bits into account even when MSI-X is disabled. While not specification compliant, this can be cured by moving the masking into the success path, so that the MSI-X table entries stay in device reset state when the MSI-X setup fails. [ tglx: Move it into the success path, add comment and amend changelog ] Fixes: aa8092c1d1f1 ("PCI/MSI: Mask all unused MSI-X entries") Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-pci@vger.kernel.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Marek Vasut <marex@denx.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211210161025.3287927-1-sr@denx.de
2021-12-14	dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings	Bhaskara Budiredla
	Add device tree bindings for Last-level-cache Tag-and-data (LLC-TAD) unit PMU for Marvell CN10K SoCs. Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211115043506.6679-3-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	drivers: perf: Add LLC-TAD perf counter support	Bhaskara Budiredla
	This driver adds support for Last-level cache tag-and-data unit (LLC-TAD) PMU that is featured in some of the Marvell's CN10K infrastructure silicons. The LLC is divided into 2N slices distributed across N Mesh tiles in a single-socket configuration. The driver always configures the same counter for all of the TADs. The user would end up effectively reserving one of eight counters in every TAD to look across all TADs. The occurrences of events are aggregated and presented to the user at the end of an application run. The driver does not provide a way for the user to partition TADs so that different TADs are used for different applications. The event counters are zeroed to start event counting to avoid any rollover issues. TAD perf counters are 64-bit, so it's not currently possible to overflow event counters at current mesh and core frequencies. To measure tad pmu events use perf tool stat command. For instance: perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application> perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application> Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	arm64/xor: use EOR3 instructions when available	Ard Biesheuvel
	Use the EOR3 instruction to implement xor_blocks() if the instruction is available, which is the case if the CPU implements the SHA-3 extension. This is about 20% faster on Apple M1 when using the 5-way version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-14	powerpc/module_64: Fix livepatching for RO modules	Russell Currey
	Livepatching a loaded module involves applying relocations through apply_relocate_add(), which attempts to write to read-only memory when CONFIG_STRICT_MODULE_RWX=y. Work around this by performing these writes through the text poke area by using patch_instruction(). R_PPC_REL24 is the only relocation type generated by the kpatch-build userspace tool or klp-convert kernel tree that I observed applying a relocation to a post-init module. A more comprehensive solution is planned, but using patch_instruction() for R_PPC_REL24 on should serve as a sufficient fix. This does have a performance impact, I observed ~15% overhead in module_load() on POWER8 bare metal with checksum verification off. Fixes: c35717c71e98 ("powerpc: Set ARCH_HAS_STRICT_MODULE_RWX") Cc: stable@vger.kernel.org # v5.14+ Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Tested-by: Joe Lawrence <joe.lawrence@redhat.com> [mpe: Check return codes from patch_instruction()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211214121248.777249-1-mpe@ellerman.id.au
2021-12-14	perf/smmuv3: Synthesize IIDR from CoreSight ID registers	Robin Murphy
	The SMMU_PMCG_IIDR register was not present in older revisions of the Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of fields from several PIDR registers, allowing us to present a standardized identifier to userspace. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/smmuv3: Add devicetree support	Jean-Philippe Brucker
	Add device-tree support to the SMMUv3 PMCG driver. Signed-off-by: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	dt-bindings: Add Arm SMMUv3 PMCG binding	Jean-Philippe Brucker
	Add binding for the Arm SMMUv3 PMU. Each node represents a PMCG, and is placed as a sibling node of the SMMU. Although the PMCGs registers may be within the SMMU MMIO region, they are separate devices, and there can be multiple PMCG devices for each SMMU (for example one for the TCU and one for each TBU). Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-2-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Add debugfs topology info	Robin Murphy
	In general, detailed performance analysis will require knoweldge of the the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc. are connected to each node. However for certain development and bringup tasks it can be useful to have a quick overview of the CMN internal topology to hand too. Add a debugfs file to map this out. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Add CI-700 Support	Robin Murphy
	Add the identifiers and events for the CI-700 coherent interconnect. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/28f566ab23a83733c6c9ef9414c010b760b4549c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	dt-bindings: perf: arm-cmn: Add CI-700	Robin Murphy
	CI-700 is a new client-level coherent interconnect derived from the enterprise-level CMN family, and shares the same PMU design. CC: devicetree@vger.kernel.org Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/5f0b372f808f1468e6d9500cedafbecd10254674.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Support new IP features	Robin Murphy
	The second generation of CMN IPs add new node types and significantly expand the configuration space with options for extra device ports on edge XPs, either plumbed into the regular DTM or with extra dedicated DTMs to monitor them, plus larger (and smaller) mesh sizes. Add basic support for pulling this new information out of the hardware, piping it around as necessary, and handling (most of) the new choices. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e58b495bcc7deec3882be4bac910ed0bf6979674.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Demarcate CMN-600 specifics	Robin Murphy
	In preparation for supporting newer CMN products, let's introduce a means to differentiate the features and events which are specific to a particular IP from those which remain common to the whole family. The newer designs have also smoothed off some of the rough edges in terms of discoverability, so separate out the parts of the flow which have effectively now become CMN-600 quirks. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9f6368cdca4c821d801138939508a5bba54ccabb.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Move group validation data off-stack	Robin Murphy
	With the value of CMN_MAX_DTMS increasing significantly, our validation data structure is set to get quite big. Technically we could pack it at least twice as densely, since we only need around 19 bits of information per DTM, but that makes the code even more mind-bogglingly impenetrable, and even half of "quite big" may still be uncomfortably large for a stack frame (~1KB). Just move it to an off-stack allocation instead. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0cabff2e5839ddc0979e757c55515966f65359e4.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Optimise DTC counter accesses	Robin Murphy
	In cases where we do know which DTC domain a node belongs to, we can skip initialising or reading the global count in DTCs where we know it won't change. The machinery to achieve that is mostly in place already, so finish hooking it up by converting the vestigial domain tracking to propagate suitable bitmaps all the way through to events. Note that this does not allow allocating such an unused counter to a different event on that DTC, because that is a flippin' nightmare. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/51d930fd945ef51c81f5889ccca055c302b0a1d0.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Optimise DTM counter reads	Robin Murphy
	When multiple nodes of the same type are connected to the same XP (particularly in CAL configurations), it seems that they are likely to be consecutive in logical ID. Therefore, we're likely to gain a small benefit from an easy tweak to optimise out consecutive reads of the same set of DTM counters for an aggregated event. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Refactor DTM handling	Robin Murphy
	Untangle DTMs from XPs into a dedicated abstraction. This helps make things a little more obvious and robust, but primarily paves the way for further development where new IPs can grow extra DTMs per XP. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9cca18b1b98f482df7f1aaf3d3213e7f39500423.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Streamline node iteration	Robin Murphy
	Refactor the places where we scan through the set of nodes to switch from explicit array indexing to pointer-based iteration. This leads to slightly simpler object code, but also makes the source less dense and more pleasant for further development. It also unearths an almost-bug in arm_cmn_event_init() where we've been depending on the "array index" of NULL relative to cmn->dns being a sufficiently large number, yuck. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ee0c9eda9a643f46001ac43aadf3f0b1fd5660dd.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Refactor node ID handling	Robin Murphy
	Add a bit more abstraction for the places where we decompose node IDs. This will help keep things nice and manageable when we come to add yet more variables which affect the node ID format. Also use the opportunity to move the rest of the low-level node management helpers back up to the logical place they were meant to be - how they ended up buried right in the middle of the event-related definitions is somewhat of a mystery... Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/a2242a8c3c96056c13a04ae87bf2047e5e64d2d9.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Drop compile-test restriction	Robin Murphy
	Although CMN is currently (and overwhelmingly likely to remain) deployed in arm64-only (modulo userspace) systems, the 64-bit "dependency" for compile-testing was just laziness due to heavy reliance on readq/writeq accessors. Since we only need one extra include for robustness in that regard, let's pull that in, widen the compile-test coverage, and fix up the smattering of type laziness that that brings to light. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Account for NUMA affinity	Robin Murphy
	On a system with multiple CMN meshes, ideally we'd want to access each PMU from within its own mesh, rather than with a long CML round-trip, wherever feasible. Since such a system is likely to be presented as multiple NUMA nodes, let's also hope a proximity domain is specified for each CMN programming interface, and use that to guide our choice of IRQ affinity to favour a node-local CPU where possible. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/32438b0d016e0649d882d47d30ac2000484287b9.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf/arm-cmn: Fix CPU hotplug unregistration	Robin Murphy
	Attempting to migrate the PMU context after we've unregistered the PMU device, or especially if we never successfully registered it in the first place, is a woefully bad idea. It's also fundamentally pointless anyway. Make sure to unregister an instance from the hotplug handler without invoking the teardown callback. Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/2c221d745544774e4b07583b65b5d4d94f7e0fe4.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	Documentation: arm64: Document PMU counters access from userspace	Raphael Gault
	Add documentation to describe the access to the pmu hardware counters from userspace. Signed-off-by: Raphael Gault <raphael.gault@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-6-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	arm64: perf: Enable PMU counter userspace access for perf event	Rob Herring
	Arm PMUs can support direct userspace access of counters which allows for low overhead (i.e. no syscall) self-monitoring of tasks. The same feature exists on x86 called 'rdpmc'. Unlike x86, userspace access will only be enabled for thread bound events. This could be extended if needed, but simplifies the implementation and reduces the chances for any information leaks (which the x86 implementation suffers from). PMU EL0 access will be enabled when an event with userspace access is part of the thread's context. This includes when the event is not scheduled on the PMU. There's some additional overhead clearing dirty counters when access is enabled in order to prevent leaking disabled counter data from other tasks. Unlike x86, enabling of userspace access must be requested with a new attr bit: config1:1. If the user requests userspace access with 64-bit counters, then the event open will fail if the h/w doesn't support 64-bit counters. Chaining is not supported with userspace access. The modes for config1 are as follows: config1 = 0 : user access disabled and always 32-bit config1 = 1 : user access disabled and always 64-bit (using chaining if needed) config1 = 2 : user access enabled and always 32-bit config1 = 3 : user access enabled and always 64-bit Based on work by Raphael Gault <raphael.gault@arm.com>, but has been completely re-written. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-5-robh@kernel.org [will: Made armv8pmu_proc_user_access_handler() static] Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	arm64: perf: Add userspace counter access disable switch	Rob Herring
	Like x86, some users may want to disable userspace PMU counter altogether. Add a sysctl 'perf_user_access' file to control userspace counter access. The default is '0' which is disabled. Writing '1' enables access. Note that x86 supports globally enabling user access by writing '2' to /sys/bus/event_source/devices/cpu/rdpmc. As there's not existing userspace support to worry about, this shouldn't be necessary for Arm. It could be added later if the need arises. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-perf-users@vger.kernel.org Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-4-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	perf: Add a counter for number of user access events in context	Rob Herring
	On arm64, user space counter access will be controlled differently compared to x86. On x86, access in the strictest mode is enabled for all tasks in an MM when any event is mmap'ed. For arm64, access is explicitly requested for an event and only enabled when the event's context is active. This avoids hooks into the arch context switch code and gives better control of when access is enabled. In order to configure user space access when the PMU is enabled, it is necessary to know if any event (currently active or not) in the current context has user space accessed enabled. Add a counter similar to other counters in the context to avoid walking the event list every time. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-3-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14	x86: perf: Move RDPMC event flag to a common definition	Rob Herring
	In preparation to enable user counter access on arm64 and to move some of the user access handling to perf core, create a common event flag for user counter access and convert x86 to use it. Since the architecture specific flags start at the LSB, starting at the MSB for common flags. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-perf-users@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211208201124.310740-2-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>