linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-05-12	x86/msr-index: Define INTEGRITY_CAPABILITIES MSR	Tony Luck
	The INTEGRITY_CAPABILITIES MSR is enumerated by bit 2 of the CORE_CAPABILITIES MSR. Add defines for the CORE_CAPS enumeration as well as for the integrity MSR. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220506225410.1652287-3-tony.luck@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	x86/microcode/intel: Expose collect_cpu_info_early() for IFS	Jithu Joseph
	IFS is a CPU feature that allows a binary blob, similar to microcode, to be loaded and consumed to perform low level validation of CPU circuitry. In fact, it carries the same Processor Signature (family/model/stepping) details that are contained in Intel microcode blobs. In support of an IFS driver to trigger loading, validation, and running of these tests blobs, make the functionality of cpu_signatures_match() and collect_cpu_info_early() available outside of the microcode driver. Add an "intel_" prefix and drop the "_early" suffix from collect_cpu_info_early() and EXPORT_SYMBOL_GPL() it. Add declaration to x86 <asm/cpu.h> Make cpu_signatures_match() an inline function in x86 <asm/cpu.h>, and also give it an "intel_" prefix. No functional change intended. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Co-developed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220506225410.1652287-2-tony.luck@intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	platform/x86: asus-nb-wmi: Add keymap for MyASUS key	Luca Stefani
	This event is triggered by pressing Fn+F12 on ASUS Zenbook UX425JA Map it to KEY_PROG1 to allow userspace to configure it Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> Link: https://lore.kernel.org/r/20220506122536.113566-2-luca.stefani.ge1@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	platform/x86: asus-wmi: Update unknown code message	Luca Stefani
	Prepend 0x to the actual key code to specify it is already an hex value Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> Link: https://lore.kernel.org/r/20220506122536.113566-1-luca.stefani.ge1@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	Documentation/ABI: Add new attributes for mlxreg-io sysfs interfaces	Michael Shych
	Add documentation for the new attributes: - "phy_reset" - Reset PHY. - "mac_reset" - Reset MAC. - "qsfp_pwr_good" - The power status of QSFP ports. Signed-off-by: Michael Shych <michaelsh@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20220430115809.54565-4-michaelsh@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	platform/mellanox: Add support for new SN2201 system	Michael Shych
	The SN2201 is a highly integrated for one rack unit system with L3 management switches. It has 48 x 1Gbps RJ45 + 4 x 100G QSFP28 ports in a compact 1RU form factor. The system also including a serial port (RS-232 interface), an OOB port (1G/100M MDI interface) and USB ports for management functions. The processor used on SN2201 is Intel Atom®Processor C Series, C3338R which is one of the Denverton product families. System equipped with Nvidia®Spectrum-1 32x100GbE Ethernet switch. Features: - 48 ports RJ45 support 10/100/1000M speed. - Support 4 QSFP28 ports with 10/25/40/50/100G. - A USB port is available on SN2201. This port is used for image and File Management purposes - backing up and restoring images and config files - Provides flow control mechanism to ensure zero packet loss. Uses backpressure for half-duplex operation and IEEE802.3x for full duplex operation. - Cut-through and Store-and-Forward free switching mechanism. By default the mode is cut-through. - Standard 1U chassis height. - 19" rack mountable. - Extensive system LED and per port LEDs. - Redundant power supply. - 2 x AC Power Supply (one PSU is default, second PSU is optional). Signed-off-by: Michael Shych <michaelsh@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20220430115809.54565-3-michaelsh@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-12	perf/arm-cmn: Decode CAL devices properly in debugfs	Robin Murphy
	The debugfs code is lazy, and since it only keeps the bottom byte of each connect_info register to save space, it also treats the whole thing as the device_type since the other bits were reserved anyway. Upon closer inspection, though, this is no longer true on newer IP versions, so let's be good and decode the exact field properly. This should help it not get confused when a Component Aggregation Layer is present (which is already implied if Node IDs are found for both device addresses represented by the next two lines of the table). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/6a13a6128a28cfe2eec6d09cf372a167ec9c3b65.1652274773.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-12	block: Fix the bio.bi_opf comment	Bart Van Assche
	Commit ef295ecf090d modified the Linux kernel such that the bottom bits of the bi_opf member contain the operation instead of the topmost bits. That commit did not update the comment next to bi_opf. Hence this patch. From commit ef295ecf090d: -#define bio_op(bio) ((bio)->bi_opf >> BIO_OP_SHIFT) +#define bio_op(bio) ((bio)->bi_opf & REQ_OP_MASK) Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Fixes: ef295ecf090d ("block: better op and flags encoding") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220511235152.1082246-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12	mtd: spi-nor: debugfs: fix format specifier	Michael Walle
	The intention was to print the JEDEC ID in the following format: nn nn nn In this case format specifier has to be "%*ph". Fix it. Fixes: 0257be79fc4a ("mtd: spi-nor: expose internal parameters via debugfs") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Michael Walle <michael@walle.cc> Signed-off-by: Pratyush Yadav <p.yadav@ti.com> Link: https://lore.kernel.org/r/20220512112027.3771734-1-michael@walle.cc
2022-05-12	block: reorder the REQ_ flags	Christoph Hellwig
	Keep the op-specific flag last so that they are clearly separate from the generic flags. Various recent commits just kept adding new flags at the end. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220512061408.1826595-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12	io_uring: fix ordering of args in io_uring_queue_async_work	Dylan Yudaken
	Fix arg ordering in TP_ARGS macro, which fixes the output. Fixes: 502c87d65564c ("io-uring: Make tracepoints consistent.") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220512091834.728610-2-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12	arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs	Shreyas K K
	Add KRYO4XX gold/big cores to the list of CPUs that need the repeat TLBI workaround. Apply this to the affected KRYO4XX cores (rcpe to rfpe). The variant and revision bits are implementation defined and are different from the their Cortex CPU counterparts on which they are based on, i.e., (r0p0 to r3p0) is equivalent to (rcpe to rfpe). Signed-off-by: Shreyas K K <quic_shrekk@quicinc.com> Reviewed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Link: https://lore.kernel.org/r/20220512110134.12179-1-quic_shrekk@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-12	mlxsw: Avoid warning during ip6gre device removal	Amit Cohen
	IPv6 addresses which are used for tunnels are stored in a hash table with reference counting. When a new GRE tunnel is configured, the driver is notified and configures it in hardware. Currently, any change in the tunnel is not applied in the driver. It means that if the remote address is changed, the driver is not aware of this change and the first address will be used. This behavior results in a warning [1] in scenarios such as the following: # ip link add name gre1 type ip6gre local 2000::3 remote 2000::fffe tos inherit ttl inherit # ip link set name gre1 type ip6gre local 2000::3 remote 2000::ffff ttl inherit # ip link delete gre1 The change of the address is not applied in the driver. Currently, the driver uses the remote address which is stored in the 'parms' of the overlay device. When the tunnel is removed, the new IPv6 address is used, the driver tries to release it, but as it is not aware of the change, this address is not configured and it warns about releasing non existing IPv6 address. Fix it by using the IPv6 address which is cached in the IPIP entry, this address is the last one that the driver used, so even in cases such the above, the first address will be released, without any warning. [1]: WARNING: CPU: 1 PID: 2197 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2920 mlxsw_sp_ipv6_addr_put+0x146/0x220 [mlxsw_spectrum] ... CPU: 1 PID: 2197 Comm: ip Not tainted 5.17.0-rc8-custom-95062-gc1e5ded51a9a #84 Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021 RIP: 0010:mlxsw_sp_ipv6_addr_put+0x146/0x220 [mlxsw_spectrum] ... Call Trace: <TASK> mlxsw_sp2_ipip_rem_addr_unset_gre6+0xf1/0x120 [mlxsw_spectrum] mlxsw_sp_netdevice_ipip_ol_event+0xdb/0x640 [mlxsw_spectrum] mlxsw_sp_netdevice_event+0xc4/0x850 [mlxsw_spectrum] raw_notifier_call_chain+0x3c/0x50 call_netdevice_notifiers_info+0x2f/0x80 unregister_netdevice_many+0x311/0x6d0 rtnl_dellink+0x136/0x360 rtnetlink_rcv_msg+0x12f/0x380 netlink_rcv_skb+0x49/0xf0 netlink_unicast+0x233/0x340 netlink_sendmsg+0x202/0x440 ____sys_sendmsg+0x1f3/0x220 ___sys_sendmsg+0x70/0xb0 __sys_sendmsg+0x54/0xa0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: e846efe2737b ("mlxsw: spectrum: Add hash table for IPv6 address mapping") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220511115747.238602-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-12	arm64: cpufeature: remove duplicate ID_AA64ISAR2_EL1 entry	Kristina Martsenko
	The ID register table should have one entry per ID register but currently has two entries for ID_AA64ISAR2_EL1. Only one entry has an override, and get_arm64_ftr_reg() can end up choosing the other, causing the override to be ignored. Fix this by removing the duplicate entry. While here, also make the check in sort_ftr_regs() more strict so that duplicate entries can't be added in the future. Fixes: def8c222f054 ("arm64: Add support of PAuth QARMA3 architected algorithm") Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20220511162030.1403386-1-kristina.martsenko@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-05-12	drm/vc4: hdmi: Fix build error for implicit function declaration	Hui Tang
	drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_connector_detect’: drivers/gpu/drm/vc4/vc4_hdmi.c:228:7: error: implicit declaration of function ‘gpiod_get_value_cansleep’; did you mean ‘gpio_get_value_cansleep’? [-Werror=implicit-function-declaration] if (gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) ^~~~~~~~~~~~~~~~~~~~~~~~ gpio_get_value_cansleep CC [M] drivers/gpu/drm/vc4/vc4_validate.o CC [M] drivers/gpu/drm/vc4/vc4_v3d.o CC [M] drivers/gpu/drm/vc4/vc4_validate_shaders.o CC [M] drivers/gpu/drm/vc4/vc4_debugfs.o drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’: drivers/gpu/drm/vc4/vc4_hdmi.c:2883:23: error: implicit declaration of function ‘devm_gpiod_get_optional’; did you mean ‘devm_clk_get_optional’? [-Werror=implicit-function-declaration] vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN); ^~~~~~~~~~~~~~~~~~~~~~~ devm_clk_get_optional drivers/gpu/drm/vc4/vc4_hdmi.c:2883:59: error: ‘GPIOD_IN’ undeclared (first use in this function); did you mean ‘GPIOF_IN’? vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN); ^~~~~~~~ GPIOF_IN drivers/gpu/drm/vc4/vc4_hdmi.c:2883:59: note: each undeclared identifier is reported only once for each function it appears in cc1: all warnings being treated as errors Fixes: 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod") Signed-off-by: Hui Tang <tanghui20@huawei.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220510135148.247719-1-tanghui20@huawei.com
2022-05-12	net: bcmgenet: Check for Wake-on-LAN interrupt probe deferral	Florian Fainelli
	The interrupt controller supplying the Wake-on-LAN interrupt line maybe modular on some platforms (irq-bcm7038-l1.c) and might be probed at a later time than the GENET driver. We need to specifically check for -EPROBE_DEFER and propagate that error to ensure that we eventually fetch the interrupt descriptor. Fixes: 9deb48b53e7f ("bcmgenet: add WOL IRQ check") Fixes: 5b1f0e62941b ("net: bcmgenet: Avoid touching non-existent interrupt") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Stefan Wahren <stefan.wahren@i2se.com> Link: https://lore.kernel.org/r/20220511031752.2245566-1-f.fainelli@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-12	net: ethernet: mediatek: ppe: fix wrong size passed to memset()	Yang Yingliang
	'foe_table' is a pointer, the real size of struct mtk_foe_entry should be pass to memset(). Fixes: ba37b7caf1ed ("net: ethernet: mtk_eth_soc: add support for initializing the PPE") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20220511030829.3308094-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-11	Merge tag 'for-net-2022-05-11' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix the creation of hdev->name when index is greater than 9999 * tag 'for-net-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: Fix the creation of hdev->name ==================== Link: https://lore.kernel.org/r/20220512002901.823647-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-11	Merge tag 'wireless-2022-05-11' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v5.18 Second set of fixes for v5.18 and hopefully the last one. We have a new iwlwifi maintainer, a fix to rfkill ioctl interface and important fixes to both stack and two drivers. * tag 'wireless-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: rfkill: uapi: fix RFKILL_IOCTL_MAX_SIZE ioctl request definition nl80211: fix locking in nl80211_set_tx_bitrate_mask() mac80211_hwsim: call ieee80211_tx_prepare_skb under RCU protection mac80211_hwsim: fix RCU protected chanctx access mailmap: update Kalle Valo's email mac80211: Reset MBSSID parameters upon connection cfg80211: retrieve S1G operating channel number nl80211: validate S1G channel width mac80211: fix rx reordering with non explicit / psmp ack policy ath11k: reduce the wait time of 11d scan and hw scan while add interface MAINTAINERS: update iwlwifi driver maintainer iwlwifi: iwl-dbg: Use del_timer_sync() before freeing ==================== Link: https://lore.kernel.org/r/20220511154535.A1A12C340EE@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-11	Bluetooth: Fix the creation of hdev->name	Itay Iellin
	Set a size limit of 8 bytes of the written buffer to "hdev->name" including the terminating null byte, as the size of "hdev->name" is 8 bytes. If an id value which is greater than 9999 is allocated, then the "snprintf(hdev->name, sizeof(hdev->name), "hci%d", id)" function call would lead to a truncation of the id value in decimal notation. Set an explicit maximum id parameter in the id allocation function call. The id allocation function defines the maximum allocated id value as the maximum id parameter value minus one. Therefore, HCI_MAX_ID is defined as 10000. Signed-off-by: Itay Iellin <ieitayie@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-05-11	blk-iocost: combine local_stat and desc_stat to stat	Chengming Zhou
	When we flush usage, wait, indebt stat in iocg_flush_stat(), we use local_stat and desc_stat, which has no point since the leaf iocg only has local_stat and the inner iocg only has desc_stat. Also we don't need to flush percpu abs_vusage for these inner iocgs. This patch combine local_stat and desc_stat to stat, only flush percpu abs_vusage for active leaf iocgs, then build inner walk list to propagate. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220510034757.21761-1-zhouchengming@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-12	sched/tracing: Append prev_state to tp args instead	Delyan Kratunov
	Commit fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting sched_switch event, 2022-01-20) added a new prev_state argument to the sched_switch tracepoint, before the prev task_struct pointer. This reordering of arguments broke BPF programs that use the raw tracepoint (e.g. tp_btf programs). The type of the second argument has changed and existing programs that assume a task_struct* argument (e.g. for bpf_task_storage access) will now fail to verify. If we instead append the new argument to the end, all existing programs would continue to work and can conditionally extract the prev_state argument on supported kernel versions. Fixes: fa2c3254d7cf (sched/tracing: Don't re-read p->state when emitting sched_switch event, 2022-01-20) Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/c8a6930dfdd58a4a5755fc01732675472979732b.camel@fb.com
2022-05-11	i40e: i40e_main: fix a missing check on list iterator	Xiaomeng Tong
	The bug is here: ret = i40e_add_macvlan_filter(hw, ch->seid, vdev->dev_addr, &aq_err); The list iterator 'ch' will point to a bogus position containing HEAD if the list is empty or no element is found. This case must be checked before any use of the iterator, otherwise it will lead to a invalid memory access. To fix this bug, use a new variable 'iter' as the list iterator, while use the origin variable 'ch' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: 1d8d80b4e4ff6 ("i40e: Add macvlan support on i40e") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220510204846.2166999-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-11	net/sched: act_pedit: really ensure the skb is writable	Paolo Abeni
	Currently pedit tries to ensure that the accessed skb offset is writable via skb_unclone(). The action potentially allows touching any skb bytes, so it may end-up modifying shared data. The above causes some sporadic MPTCP self-test failures, due to this code: tc -n $ns2 filter add dev ns2eth$i egress \ protocol ip prio 1000 \ handle 42 fw \ action pedit munge offset 148 u8 invert \ pipe csum tcp \ index 100 The above modifies a data byte outside the skb head and the skb is a cloned one, carrying a TCP output packet. This change addresses the issue by keeping track of a rough over-estimate highest skb offset accessed by the action and ensuring such offset is really writable. Note that this may cause performance regressions in some scenarios, but hopefully pedit is not in the critical path. Fixes: db2c24175d14 ("act_pedit: access skb->data safely") Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/1fcf78e6679d0a287dd61bb0f04730ce33b3255d.1652194627.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-11	Merge branch 'exp.2022.05.11a' into HEAD	Paul E. McKenney
	exp.2022.05.11a: Expedited-grace-period latency-reduction updates.
2022-05-11	spi: stm32-qspi: flags management fixes	Mark Brown
	Merge series from patrice.chotard@foss.st.com <patrice.chotard@foss.st.com> Patrice Chotard <patrice.chotard@foss.st.com>: From: Patrice Chotard <patrice.chotard@foss.st.com> This series update flags management in the following cases: - In APM mode, don't take care of TCF and TEF flags - Always check TCF flag in stm32_qspi_wait_cmd() - Don't check BUSY flag when sending new command
2022-05-11	rcu: Move expedited grace period (GP) work to RT kthread_worker	Kalesh Singh
	Enabling CONFIG_RCU_BOOST did not reduce RCU expedited grace-period latency because its workqueues run at SCHED_OTHER, and thus can be delayed by normal processes. This commit avoids these delays by moving the expedited GP work items to a real-time-priority kthread_worker. This option is controlled by CONFIG_RCU_EXP_KTHREAD and disabled by default on PREEMPT_RT=y kernels which disable expedited grace periods after boot by unconditionally setting rcupdate.rcu_normal_after_boot=1. The results were evaluated on arm64 Android devices (6GB ram) running 5.10 kernel, and capturing trace data in critical user-level code. The table below shows the resulting order-of-magnitude improvements in synchronize_rcu_expedited() latency: ------------------------------------------------------------------------ \| \| workqueues \| kthread_worker \| Diff \| ------------------------------------------------------------------------ \| Count \| 725 \| 688 \| \| ------------------------------------------------------------------------ \| Min Duration (ns) \| 326 \| 447 \| 37.12% \| ------------------------------------------------------------------------ \| Q1 (ns) \| 39,428 \| 38,971 \| -1.16% \| ------------------------------------------------------------------------ \| Q2 - Median (ns) \| 98,225 \| 69,743 \| -29.00% \| ------------------------------------------------------------------------ \| Q3 (ns) \| 342,122 \| 126,638 \| -62.98% \| ------------------------------------------------------------------------ \| Max Duration (ns) \| 372,766,967 \| 2,329,671 \| -99.38% \| ------------------------------------------------------------------------ \| Avg Duration (ns) \| 2,746,353 \| 151,242 \| -94.49% \| ------------------------------------------------------------------------ \| Standard Deviation (ns) \| 19,327,765 \| 294,408 \| \| ------------------------------------------------------------------------ The below table show the range of maximums/minimums for synchronize_rcu_expedited() latency from all experiments: ------------------------------------------------------------------------ \| \| workqueues \| kthread_worker \| Diff \| ------------------------------------------------------------------------ \| Total No. of Experiments \| 25 \| 23 \| \| ------------------------------------------------------------------------ \| Largest Maximum (ns) \| 372,766,967 \| 2,329,671 \| -99.38% \| ------------------------------------------------------------------------ \| Smallest Maximum (ns) \| 38,819 \| 86,954 \| 124.00% \| ------------------------------------------------------------------------ \| Range of Maximums (ns) \| 372,728,148 \| 2,242,717 \| \| ------------------------------------------------------------------------ \| Largest Minimum (ns) \| 88,623 \| 27,588 \| -68.87% \| ------------------------------------------------------------------------ \| Smallest Minimum (ns) \| 326 \| 447 \| 37.12% \| ------------------------------------------------------------------------ \| Range of Minimums (ns) \| 88,297 \| 27,141 \| \| ------------------------------------------------------------------------ Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Tejun Heo <tj@kernel.org> Reported-by: Tim Murray <timmurray@google.com> Reported-by: Wei Wang <wvw@google.com> Tested-by: Kyle Lin <kylelin@google.com> Tested-by: Chunwei Lu <chunweilu@google.com> Tested-by: Lulu Wang <luluw@google.com> Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-05-11	x86/vsyscall: Remove CONFIG_LEGACY_VSYSCALL_EMULATE	Andy Lutomirski
	CONFIG_LEGACY_VSYSCALL_EMULATE is, as far as I know, only needed for the combined use of exotic and outdated debugging mechanisms with outdated binaries. At this point, no one should be using it. Eventually, dynamic switching of vsyscalls will be implemented, but this is much more complicated to support in EMULATE mode than XONLY mode. So let's force all the distros off of EMULATE mode. If anyone actually needs it, they can set vsyscall=emulate, and the kernel can then get away with refusing to support newer security models if that option is set. [ bp: Remove "we"s. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Florian Weimer <fweimer@redhat.com> Link: https://lore.kernel.org/r/898932fe61db6a9d61bc2458fa2f6049f1ca9f5c.1652290558.git.luto@kernel.org
2022-05-11	rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT	Uladzislau Rezki
	Currently both expedited and regular grace period stall warnings use a single timeout value that with units of seconds. However, recent Android use cases problem require a sub-100-millisecond expedited RCU CPU stall warning. Given that expedited RCU grace periods normally complete in far less than a single millisecond, especially for small systems, this is not unreasonable. Therefore introduce the CONFIG_RCU_EXP_CPU_STALL_TIMEOUT kernel configuration that defaults to 20 msec on Android and remains the same as that of the non-expedited stall warnings otherwise. It also can be changed in run-time via: /sys/.../parameters/rcu_exp_cpu_stall_timeout. [ paulmck: Default of zero to use CONFIG_RCU_STALL_TIMEOUT. ] Signed-off-by: Uladzislau Rezki <uladzislau.rezki@sony.com> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-05-11	Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes	Maarten Lankhorst
	Requested by Zack for vmwgfx fixes. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2022-05-11	thermal: int340x: Mode setting with new OS handshake	Srinivas Pandruvada
	With the new OS handshake introduced by commit: "c7ff29763989 ("thermal: int340x: Update OS policy capability handshake")", the "enabled" thermal zone mode doesn't work in the same way as previously. The "enabled" mode fails with -EINVAL when the new handshake is used. To address this issue, when the new OS UUID mask is set: - When the mode is "enabled", return 0 as the firmware already has the latest policy mask. - When the mode is "disabled", update the firmware with the UUID mask of zero. This way, the firmware can take over the thermal control. Also reset the OS UUID mask, which allows user space to update with new set of policies. Fixes: c7ff29763989 ("thermal: int340x: Update OS policy capability handshake") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog edits, removed unneeded parens ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-11	regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt	Miaoqian Lin
	of_node_get() returns a node with refcount incremented. Calling of_node_put() to drop the reference when not needed anymore. Fixes: 3784b6d64dc5 ("regulator: pfuze100: add pfuze100 regulator driver") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20220511113506.45185-1-linmq006@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-11	spi: stm32-qspi: Remove SR_BUSY bit check before sending command	Patrice Chotard
	Waiting for SR_BUSY bit when receiving a new command is not needed. SR_BUSY bit is already managed in the previous command treatment. Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Link: https://lore.kernel.org/r/20220511074644.558874-4-patrice.chotard@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-11	spi: stm32-qspi: Always check SR_TCF flags in stm32_qspi_wait_cmd()	Patrice Chotard
	Currently, SR_TCF flag is checked in case there is data, this criteria is not correct. SR_TCF flags is set when programmed number of bytes has been transferred to the memory device ("bytes" comprised command and data send to the SPI device). So even if there is no data, we must check SR_TCF flag. Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Link: https://lore.kernel.org/r/20220511074644.558874-3-patrice.chotard@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-11	spi: stm32-qspi: Fix wait_cmd timeout in APM mode	Patrice Chotard
	In APM mode, TCF and TEF flags are not set. To avoid timeout in stm32_qspi_wait_cmd(), don't check if TCF/TEF are set. Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Reported-by: eberhard.stoll@kontron.de Link: https://lore.kernel.org/r/20220511074644.558874-2-patrice.chotard@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-11	drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2)	Alex Deucher
	Check if the requested stable pstate matches the current one before changing it. This avoids changing the stable pstate on context destroy if the user never changed it in the first place via the IOCTL. v2: compare the current and requested rather than setting a flag (Lijo) Fixes: 8cda7a4f96e435 ("drm/amdgpu/UAPI: add new CTX OP to get/set stable pstates") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-05-11	Revert "drm/amd/pm: keep the BACO feature enabled for suspend"	Alex Deucher
	This reverts commit eaa090538e8d21801c6d5f94590c3799e6a528b5. Commit ebc002e3ee78 ("drm/amdgpu: don't use BACO for reset in S3") stops using BACO for reset during suspend, so it's no longer necessary to leave BACO enabled during suspend. This fixes resume from suspend on the navy flounder dGPU in the ASUS ROG Strix G513QY. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2008 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1982 Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-05-11	platform_data/mlxreg: Add field for notification callback	Michael Shych
	Add notification callback to inform caller that platform driver probing has been completed. It allows to caller to perform some initialization flow steps depending on specific driver probing completion. Signed-off-by: Michael Shych <michaelsh@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20220430115809.54565-2-michaelsh@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-05-11	sched/deadline: Remove superfluous rq clock update in push_dl_task()	Hao Jia
	The change to call update_rq_clock() before activate_task() commit 840d719604b0 ("sched/deadline: Update rq_clock of later_rq when pushing a task") is no longer needed since commit f4904815f97a ("sched/deadline: Fix double accounting of rq/running bw in push & pull") removed the add_running_bw() before the activate_task(). So we remove some comments that are no longer needed and update rq clock in activate_task(). Signed-off-by: Hao Jia <jiahao.os@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> Link: https://lore.kernel.org/r/20220430085843.62939-3-jiahao.os@bytedance.com
2022-05-11	sched/core: Avoid obvious double update_rq_clock warning	Hao Jia
	When we use raw_spin_rq_lock() to acquire the rq lock and have to update the rq clock while holding the lock, the kernel may issue a WARN_DOUBLE_CLOCK warning. Since we directly use raw_spin_rq_lock() to acquire rq lock instead of rq_lock(), there is no corresponding change to rq->clock_update_flags. In particular, we have obtained the rq lock of other CPUs, the rq->clock_update_flags of this CPU may be RQCF_UPDATED at this time, and then calling update_rq_clock() will trigger the WARN_DOUBLE_CLOCK warning. So we need to clear RQCF_UPDATED of rq->clock_update_flags to avoid the WARN_DOUBLE_CLOCK warning. For the sched_rt_period_timer() and migrate_task_rq_dl() cases we simply replace raw_spin_rq_lock()/raw_spin_rq_unlock() with rq_lock()/rq_unlock(). For the {pull,push}_{rt,dl}_task() cases, we add the double_rq_clock_clear_update() function to clear RQCF_UPDATED of rq->clock_update_flags, and call double_rq_clock_clear_update() before double_lock_balance()/double_rq_lock() returns to avoid the WARN_DOUBLE_CLOCK warning. Some call trace reports: Call Trace 1: <IRQ> sched_rt_period_timer+0x10f/0x3a0 ? enqueue_top_rt_rq+0x110/0x110 __hrtimer_run_queues+0x1a9/0x490 hrtimer_interrupt+0x10b/0x240 __sysvec_apic_timer_interrupt+0x8a/0x250 sysvec_apic_timer_interrupt+0x9a/0xd0 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 Call Trace 2: <TASK> activate_task+0x8b/0x110 push_rt_task.part.108+0x241/0x2c0 push_rt_tasks+0x15/0x30 finish_task_switch+0xaa/0x2e0 ? __switch_to+0x134/0x420 __schedule+0x343/0x8e0 ? hrtimer_start_range_ns+0x101/0x340 schedule+0x4e/0xb0 do_nanosleep+0x8e/0x160 hrtimer_nanosleep+0x89/0x120 ? hrtimer_init_sleeper+0x90/0x90 __x64_sys_nanosleep+0x96/0xd0 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Call Trace 3: <TASK> deactivate_task+0x93/0xe0 pull_rt_task+0x33e/0x400 balance_rt+0x7e/0x90 __schedule+0x62f/0x8e0 do_task_dead+0x3f/0x50 do_exit+0x7b8/0xbb0 do_group_exit+0x2d/0x90 get_signal+0x9df/0x9e0 ? preempt_count_add+0x56/0xa0 ? __remove_hrtimer+0x35/0x70 arch_do_signal_or_restart+0x36/0x720 ? nanosleep_copyout+0x39/0x50 ? do_nanosleep+0x131/0x160 ? audit_filter_inodes+0xf5/0x120 exit_to_user_mode_prepare+0x10f/0x1e0 syscall_exit_to_user_mode+0x17/0x30 do_syscall_64+0x40/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Call Trace 4: update_rq_clock+0x128/0x1a0 migrate_task_rq_dl+0xec/0x310 set_task_cpu+0x84/0x1e4 try_to_wake_up+0x1d8/0x5c0 wake_up_process+0x1c/0x30 hrtimer_wakeup+0x24/0x3c __hrtimer_run_queues+0x114/0x270 hrtimer_interrupt+0xe8/0x244 arch_timer_handler_phys+0x30/0x50 handle_percpu_devid_irq+0x88/0x140 generic_handle_domain_irq+0x40/0x60 gic_handle_irq+0x48/0xe0 call_on_irq_stack+0x2c/0x60 do_interrupt_handler+0x80/0x84 Steps to reproduce: 1. Enable CONFIG_SCHED_DEBUG when compiling the kernel 2. echo 1 > /sys/kernel/debug/clear_warn_once echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched/features echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched/features 3. Run some rt/dl tasks that periodically work and sleep, e.g. Create 2*n rt or dl (90% running) tasks via rt-app (on a system with n CPUs), and Dietmar Eggemann reports Call Trace 4 when running on PREEMPT_RT kernel. Signed-off-by: Hao Jia <jiahao.os@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20220430085843.62939-2-jiahao.os@bytedance.com
2022-05-11	perf/ibs: Fix comment	Ravi Bangoria
	s/IBS Op Data 2/IBS Op Data 1/ for MSR 0xc0011035. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-9-ravi.bangoria@amd.com
2022-05-11	perf/amd/ibs: Advertise zen4_ibs_extensions as pmu capability attribute	Ravi Bangoria
	PMU driver can advertise certain feature via capability attribute('caps' sysfs directory) which can be consumed by userspace tools like perf. Add zen4_ibs_extensions capability attribute for IBS pmus. This attribute will be enabled when CPUID_Fn8000001B_EAX[11] is set. With patch on Zen4: $ ls /sys/bus/event_source/devices/ibs_op/caps zen4_ibs_extensions Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-5-ravi.bangoria@amd.com
2022-05-11	perf/amd/ibs: Add support for L3 miss filtering	Ravi Bangoria
	IBS L3 miss filtering works by tagging an instruction on IBS counter overflow and generating an NMI if the tagged instruction causes an L3 miss. Samples without an L3 miss are discarded and counter is reset with random value (between 1-15 for fetch pmu and 1-127 for op pmu). This helps in reducing sampling overhead when user is interested only in such samples. One of the use case of such filtered samples is to feed data to page-migration daemon in tiered memory systems. Add support for L3 miss filtering in IBS driver via new pmu attribute "l3missonly". Example usage: # perf record -a -e ibs_op/l3missonly=1/ --raw-samples sleep 5 Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-4-ravi.bangoria@amd.com
2022-05-11	perf/amd/ibs: Use ->is_visible callback for dynamic attributes	Ravi Bangoria
	Currently, some attributes are added at build time whereas others at boot time depending on IBS pmu capabilities. Instead, we can just add all attribute groups at build time but hide individual group at boot time using more appropriate ->is_visible() callback. Also, struct perf_ibs has bunch of fields for pmu attributes which just pass on the pointer, does not do anything else. Remove them. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-3-ravi.bangoria@amd.com
2022-05-11	perf/amd/ibs: Cascade pmu init functions' return value	Ravi Bangoria
	IBS pmu initialization code ignores return value provided by callee functions. Fix it. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-2-ravi.bangoria@amd.com
2022-05-11	perf/x86/uncore: Add new Alder Lake and Raptor Lake support	Kan Liang
	From the perspective of the uncore PMU, there is nothing changed for the new Alder Lake N and Raptor Lake P. Add new PCIIDs of IMC. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220504194413.1003071-5-kan.liang@linux.intel.com
2022-05-11	perf/x86/uncore: Clean up uncore_pci_ids[]	Kan Liang
	The initialization code to assign PCI IDs for different platforms is similar. Add the new macros to reduce the redundant code. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220504194413.1003071-4-kan.liang@linux.intel.com
2022-05-11	perf/x86/cstate: Add new Alder Lake and Raptor Lake support	Kan Liang
	From the perspective of Intel cstate residency counters, there is nothing changed for the new Alder Lake N and Raptor Lake P. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220504194413.1003071-3-kan.liang@linux.intel.com
2022-05-11	perf/x86/msr: Add new Alder Lake and Raptor Lake support	Kan Liang
	The new Alder Lake N and Raptor Lake P also support PPERF and SMI_COUNT MSRs. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220504194413.1003071-2-kan.liang@linux.intel.com
2022-05-11	perf/x86: Add new Alder Lake and Raptor Lake support	Kan Liang
	From PMU's perspective, there is no difference for the new Alder Lake N and Raptor Lake P. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220504194413.1003071-1-kan.liang@linux.intel.com