linux.git - Linus' kernel tree

Age	Commit message	Author
2024-10-10	net-shapers: implement NL get operation	Paolo Abeni
	Introduce the basic infrastructure to implement the net-shaper core functionality. Each network device carries a net-shaper cache, and the NL get() operation fetches the data from that cache. The cache is initially empty, will be filled by the set()/group() operations implemented later, and is destroyed at device cleanup time. The net_shaper_fill_handle(), net_shaper_ctx_init(), and net_shaper_generic_pre() implementations handle generic index type attributes, even though the current callers always pass a constant value, to avoid more noise in later patches using them with different attributes. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/ddd10fd645a9367803ad02fca4a5664ea5ace170.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10	netlink: spec: add shaper YAML spec	Paolo Abeni
	Define the user-space visible interface to query, configure and delete network shapers via yaml definition. Add dummy implementations for the relevant NL callbacks. set() and delete() operations touch a single shaper creating/updating or deleting it. The group() operation creates a shaper's group, nesting multiple input shapers under the specified output shaper. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/7a33a1ff370bdbcd0cd3f909575c912cd56f41da.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10	genetlink: extend info user-storage to match NL cb ctx	Paolo Abeni
	This allows a more uniform implementation of non-dump and dump operations, and will be used later in the series to avoid some per-operation allocation. Additionally rename the NL_ASSERT_DUMP_CTX_FITS macro, to fit a more extended usage. Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/1130cc2896626b84587a2a5f96a5c6829638f4da.1728460186.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10	sound: Make CONFIG_SND depend on INDIRECT_IOMEM instead of UML	Julian Vetter
	When building for the UM arch and neither INDIRECT_IOMEM=y, nor HAS_IOMEM=y is selected, it will fall back to the implementations from asm-generic/io.h for IO memcpy. But these fall-back functions just do a memcpy. So, instead of depending on UML, add dependency on 'HAS_IOMEM || INDIRECT_IOMEM'. Reviewed-by: Yann Sionneau <ysionneau@kalrayinc.com> Signed-off-by: Julian Vetter <jvetter@kalrayinc.com> Link: https://patch.msgid.link/20241010124601.700528-1-jvetter@kalrayinc.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-10	Merge branch 'rtnetlink-handle-error-of-rtnl_register_module'	Paolo Abeni
	Kuniyuki Iwashima says: ==================== rtnetlink: Handle error of rtnl_register_module(). While converting phonet to per-netns RTNL, I found a weird comment /* Further rtnl_register_module() cannot fail */ that was true but no longer true after commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"). Many callers of rtnl_register_module() just ignore the returned value but should handle it properly. This series introduces two helpers, rtnl_register_many() and rtnl_unregister_many(), to do that easily and fix such callers. All rtnl_register() and rtnl_register_module() will be converted to _many() variant and some rtnl_lock() will be saved in _many() later in net-next. Changes: v4: Add more context in changelog of each patch v3: https://lore.kernel.org/all/20241007124459.5727-1-kuniyu@amazon.com/ * Move module owner to struct rtnl_msg_handler * Make struct rtnl_msg_handler args/vars const * Update mctp goto labels v2: https://lore.kernel.org/netdev/20241004222358.79129-1-kuniyu@amazon.com/ * Remove __exit from mctp_neigh_exit(). v1: https://lore.kernel.org/netdev/20241003205725.5612-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20241008184737.9619-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	phonet: Handle error of rtnl_register_module().	Kuniyuki Iwashima
	Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"), once the first rtnl_register_module() allocated rtnl_msg_handlers[PF_PHONET], the following calls never failed. However, after the commit, rtnl_register_module() could fail silently to allocate rtnl_msg_handlers[PF_PHONET][msgtype] and requires error handling for each call. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's use rtnl_register_many() to handle the errors easily. Fixes: addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Rémi Denis-Courmont <courmisch@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	mpls: Handle error of rtnl_register_module().	Kuniyuki Iwashima
	Since introduced, mpls_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 03c0566542f4 ("mpls: Netlink commands to add, remove, and dump routes") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	mctp: Handle error of rtnl_register_module().	Kuniyuki Iwashima
	Since introduced, mctp has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 583be982d934 ("mctp: Add device handling and netlink interface") Fixes: 831119f88781 ("mctp: Add neighbour netlink interface") Fixes: 06d2f4c583a7 ("mctp: Add netlink route management") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	bridge: Handle error of rtnl_register_module().	Kuniyuki Iwashima
	Since introduced, br_vlan_rtnl_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: 8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support") Fixes: f26b296585dc ("net: bridge: vlan: add new rtm message support") Fixes: adb3ce9bcb0f ("net: bridge: vlan: add del rtm message support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	vxlan: Handle error of rtnl_register_module().	Kuniyuki Iwashima
	Since introduced, vxlan_vnifilter_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	rtnetlink: Add bulk registration helpers for rtnetlink message handlers.	Kuniyuki Iwashima
	Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers"), once rtnl_msg_handlers[protocol] was allocated, the following rtnl_register_module() for the same protocol never failed. However, after the commit, rtnl_msg_handlers[protocol][msgtype] needs to be allocated in each rtnl_register_module(), so each call could fail. Many callers of rtnl_register_module() do not handle the returned error, and we need to add error handling in many places. To handle that easily, let's add wrapper functions for bulk registration of rtnetlink message handlers. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
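The all-or-nothing registration described above can be illustrated with a small userspace sketch. It is not the kernel implementation: `struct handler`, `table`, `register_one()`, `register_many()`, `unregister_many()`, and the `fail_on_msgtype` test hook are hypothetical names invented for this example. It only shows the general pattern of rolling back earlier registrations when a later one fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
#define MAX_MSGTYPE 8

struct handler {
	int msgtype;        /* slot this handler wants to occupy */
	int (*doit)(void);  /* callback, unused by the sketch */
};

const struct handler *table[MAX_MSGTYPE]; /* registered handlers */
int fail_on_msgtype = -1; /* test hook: simulate an allocation failure */

/* Register a single handler; every call may fail independently. */
int register_one(const struct handler *h)
{
	assert(h != NULL);
	if (h->msgtype < 0 || h->msgtype >= MAX_MSGTYPE)
		return -EINVAL;
	if (h->msgtype == fail_on_msgtype)
		return -ENOMEM;
	table[h->msgtype] = h;
	return 0;
}

/* Remove the first n handlers of the array, last one first. */
void unregister_many(const struct handler *hs, size_t n)
{
	while (n--)
		table[hs[n].msgtype] = NULL;
}

/* Register all handlers, or none: undo earlier ones on the first error. */
int register_many(const struct handler *hs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = register_one(&hs[i]);

		if (err) {
			unregister_many(hs, i);
			return err;
		}
	}
	return 0;
}

/* Count occupied slots, so a caller can check for leftovers. */
size_t registered_count(void)
{
	size_t count = 0;

	for (size_t i = 0; i < MAX_MSGTYPE; i++)
		count += (table[i] != NULL);
	return count;
}
```

With this shape, a module init function needs a single error check for the whole array instead of one per handler.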
2024-10-10	KVM: s390: Change virtual to physical address access in diag 0x258 handler	Michael Mueller
	The parameters for the diag 0x258 are real addresses, not virtual, but KVM was using them as virtual addresses. This only happened to work, since the Linux kernel as a guest used to have a 1:1 mapping for physical vs virtual addresses. Fix KVM so that it correctly uses the addresses as real addresses. Cc: stable@vger.kernel.org Fixes: 8ae04b8f500b ("KVM: s390: Guest's memory access functions get access registers") Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-3-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-10	KVM: s390: gaccess: Check if guest address is in memslot	Nico Boehr
	Previously, access_guest_page() did not check whether the given guest address is inside of a memslot. This is not a problem, since kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case. However, -EFAULT is also returned when copy_to/from_user fails. When emulating a guest instruction, the address being outside a memslot usually means that an addressing exception should be injected into the guest. Failure in copy_to/from_user however indicates that something is wrong in userspace and hence should be handled there. To be able to distinguish these two cases, return PGM_ADDRESSING in access_guest_page() when the guest address is outside guest memory. In access_guest_real(), populate vcpu->arch.pgm.code such that kvm_s390_inject_prog_cond() can be used in the caller for injecting into the guest (if applicable). Since this adds a new return value to access_guest_page(), we need to make sure that other callers are not confused by the new positive return value.
	There are the following users of access_guest_page():
	- access_guest_with_key() does the checking itself (in guest_range_to_gpas()), so this case should never happen. Even if it does, the handling is set up properly.
	- access_guest_real() just passes the return code to its callers, which are:
	  - read_guest_real() - see below
	  - write_guest_real() - see below
	There are the following users of read_guest_real():
	- ar_translation() in gaccess.c, which already returns PGM_*
	- setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c, which always return -EFAULT on a nonzero read_guest_real() return - no change
	- shadow_crycb(), handle_stfle() always present this as validity, this could be handled better but doesn't change current behaviour - no change
	There are the following users of write_guest_real():
	- kvm_s390_store_status_unloaded() always returns -EFAULT on write_guest_real() failure.
	Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions") Cc: stable@vger.kernel.org Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-10	s390/ap: Fix CCA crypto card behavior within protected execution environment	Harald Freudenberger
	A crypto card comes in 3 flavors: accelerator, CCA co-processor or EP11 co-processor. Within a protected execution environment only the accelerator and EP11 co-processor are supported. However, it is possible to set up a KVM guest with a CCA card and run it as a protected execution guest. There is nothing at the host side which prevents this. Within such a guest, a CCA card is shown as "illicit" and you can't do anything with such a crypto card. Although CCA cards are unsupported within a protected execution guest, there are a couple of user space applications which unconditionally try to run crypto requests to the zcrypt device driver. There was a bug within the AP bus code which allowed such a request to be forwarded to a CCA card, where it is finally rejected and the driver reacts with -ENODEV but also triggers an AP bus scan. Together with a retry loop this caused some kind of "hang" of the KVM guest. On startup it caused timeouts and finally caused the KVM guest startup to fail. Fix that by closing the gap and making sure a CCA card is not usable within a protected execution environment. Another behavior within a protected execution environment with CCA cards was that the se_bind and se_associate AP queue sysfs attributes were shown. The implementation unconditionally added these attributes. Fix that by checking if the card mode is supported within a protected execution environment and only if valid, add the attribute group. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-10	s390/pci: Handle PCI error codes other than 0x3a	Niklas Schnelle
	The Linux implementation of PCI error recovery for s390 was based on the understanding that firmware error recovery is a two-step process: an optional initial error event indicating the cause of the error, if known, followed by either error event 0x3A (Success) or 0x3B (Failure) to indicate whether firmware was able to recover. While this has been the case in testing and in the error cases seen in the wild, it turns out this is not correct. Instead, firmware only generates 0x3A for some error and service scenarios and expects the OS to perform recovery for all PCI event codes except for those indicating permanent error (0x3B, 0x40) and those indicating errors on the function measurement block (0x2A, 0x2B, 0x2C). Align Linux behavior with these expectations. Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
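The dispatch rule stated in this commit message can be written down as a small classifier. This is a sketch under stated assumptions, not the driver's code: the enum and function names are hypothetical, and the function assumes its argument is already known to be a PCI error event code. Only the numeric codes come from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome categories for this sketch. */
enum sketch_pci_action {
	SKETCH_RECOVER,          /* the OS is expected to run recovery */
	SKETCH_PERMANENT_ERROR,  /* 0x3B, 0x40: no recovery possible */
	SKETCH_FMB_ERROR,        /* 0x2A, 0x2B, 0x2C: function measurement block */
};

/* Classify an error event code according to the rule in the commit text. */
enum sketch_pci_action sketch_classify_pec(uint16_t pec)
{
	switch (pec) {
	case 0x3B:
	case 0x40:
		return SKETCH_PERMANENT_ERROR;
	case 0x2A:
	case 0x2B:
	case 0x2C:
		return SKETCH_FMB_ERROR;
	default:
		/* Includes 0x3A: recovery is no longer tied to that code alone. */
		return SKETCH_RECOVER;
	}
}
```

The key difference from the older understanding is the `default` branch: recovery becomes the rule, and the listed codes are the exceptions.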
2024-10-10	ALSA: line6: update contact information	Markus Grabner
	The Line6 driver source code files contain an outdated email address of the original author. This patch updates the contact information. Signed-off-by: Markus Grabner <line6@grabner-graz.at> Link: https://patch.msgid.link/20241009194251.15662-1-line6@grabner-graz.at Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-10	ALSA: usb-audio: Fix NULL pointer deref in snd_usb_power_domain_set()	Karol Kosik
	Commit adding support for multiple control interfaces expanded struct snd_usb_power_domain with pointer to control interface for proper control message routing but missed one initialization point of this structure, which has left new field with NULL value. Standard mandates that each device has at least one control interface and code responsible for power domain does not check for NULL values when querying for control interface. This caused some USB devices to crash the kernel. Fixes: 6aa8700150f7 ("ALSA: usb-audio: Support multiple control interfaces") Signed-off-by: Karol Kosik <k.kosik@outlook.com> Link: https://patch.msgid.link/AS8P190MB1285B563C6B5394DB274813FEC782@AS8P190MB1285.EURP190.PROD.OUTLOOK.COM Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-10	PM: domains: Fix alloc/free in dev_pm_domain_attach|detach_list()	Ulf Hansson
	The dev_pm_domain_attach|detach_list() functions are not resource managed, hence they should not use devm_* helpers to manage allocation/freeing of data. Let's fix this by converting to the traditional alloc/free functions. Fixes: 161e16a5e50a ("PM: domains: Add helper functions to attach/detach multiple PM domains") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-3-ulf.hansson@linaro.org
2024-10-10	net: phy: Validate PHY LED OPs presence before registering	Christian Marangi
	Validate PHY LED OPs presence before registering and parsing them. Defining LED nodes for a PHY driver that doesn't actually support them is redundant and useless. The same applies when the Generic PHY driver is used and the DT has LED nodes for the specific PHY. Skip it and report the error with debug print enabled. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241008194718.9682-1-ansuelsmth@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	Revert "drm/tegra: gr3d: Convert into dev_pm_domain_attach|detach_list()"	Ulf Hansson
	This reverts commit f790b5c09665cab0d51dfcc84832d79d2b1e6c0e. The reverted commit was not ready to be applied due to dependency on other OPP/pmdomain changes that didn't make it for the last release cycle. Let's revert it to fix the behaviour. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20241002122232.194245-2-ulf.hansson@linaro.org
2024-10-10	Merge tag 'nf-24-10-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf	Paolo Abeni
	Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Restrict xtables extensions to families that are safe, syzbot found a way to combine ebtables with extensions that are never used by userspace tools. From Florian Westphal. 2) Set l3mdev unconditionally whenever possible in nft_fib to fix lookup mismatch, also from Florian. netfilter pull request 24-10-09 * tag 'nf-24-10-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: netfilter: conntrack_vrf.sh: add fib test case netfilter: fib: check correct rtable in vrf setups netfilter: xtables: avoid NFPROTO_UNSPEC where needed ==================== Link: https://patch.msgid.link/20241009213858.3565808-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	mmc: sdhci-of-dwcmshc: Prevent stale command interrupt handling	Michal Wilczynski
	While working with the T-Head 1520 LicheePi4A SoC, certain conditions arose that allowed me to reproduce a race issue in the sdhci code. To reproduce the bug, you need to enable the sdio1 controller in the device tree file `arch/riscv/boot/dts/thead/th1520-lichee-module-4a.dtsi` as follows:
	&sdio1 {
		bus-width = <4>;
		max-frequency = <100000000>;
		no-sd;
		no-mmc;
		broken-cd;
		cap-sd-highspeed;
		post-power-on-delay-ms = <50>;
		status = "okay";
		wakeup-source;
		keep-power-in-suspend;
	};
	When resetting the SoC using the reset button, the following messages appear in the dmesg log:
	[ 8.164898] mmc2: Got command interrupt 0x00000001 even though no command operation was in progress.
	[ 8.174054] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
	[ 8.180503] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00000005
	[ 8.186950] mmc2: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
	[ 8.193395] mmc2: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
	[ 8.199841] mmc2: sdhci: Present: 0x03da0000 | Host ctl: 0x00000000
	[ 8.206287] mmc2: sdhci: Power: 0x0000000f | Blk gap: 0x00000000
	[ 8.212733] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x0000decf
	[ 8.219178] mmc2: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
	[ 8.225622] mmc2: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
	[ 8.232068] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
	[ 8.238513] mmc2: sdhci: Caps: 0x3f69c881 | Caps_1: 0x08008177
	[ 8.244959] mmc2: sdhci: Cmd: 0x00000502 | Max curr: 0x00191919
	[ 8.254115] mmc2: sdhci: Resp[0]: 0x00001009 | Resp[1]: 0x00000000
	[ 8.260561] mmc2: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
	[ 8.267005] mmc2: sdhci: Host ctl2: 0x00001000
	[ 8.271453] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x0000000000000000
	[ 8.278594] mmc2: sdhci: ============================================
	I also enabled some traces to better understand the problem:
	kworker/3:1-62 [003] ..... 8.163538: mmc_request_start: mmc2: start struct mmc_request[000000000d30cc0c]: cmd_opcode=5 cmd_arg=0x0 cmd_flags=0x2e1 cmd_retries=0 stop_opcode=0 stop_arg=0x0 stop_flags=0x0 stop_retries=0 sbc_opcode=0 sbc_arg=0x0 sbc_flags=0x0 sbc_retires=0 blocks=0 block_size=0 blk_addr=0 data_flags=0x0 tag=0 can_retune=0 doing_retune=0 retune_now=0 need_retune=0 hold_retune=1 retune_period=0
	<idle>-0 [000] d.h2. 8.164816: sdhci_cmd_irq: hw_name=ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x10000 intmask_p=0x18000
	irq/24-mmc2-96 [000] ..... 8.164840: sdhci_thread_irq: msg=
	irq/24-mmc2-96 [000] d.h2. 8.164896: sdhci_cmd_irq: hw_name=ffe70a0000.mmc quirks=0x2008008 quirks2=0x8 intmask=0x1 intmask_p=0x1
	irq/24-mmc2-96 [000] ..... 8.285142: mmc_request_done: mmc2: end struct mmc_request[000000000d30cc0c]: cmd_opcode=5 cmd_err=-110 cmd_resp=0x0 0x0 0x0 0x0 cmd_retries=0 stop_opcode=0 stop_err=0 stop_resp=0x0 0x0 0x0 0x0 stop_retries=0 sbc_opcode=0 sbc_err=0 sbc_resp=0x0 0x0 0x0 0x0 sbc_retries=0 bytes_xfered=0 data_err=0 tag=0 can_retune=0 doing_retune=0 retune_now=0 need_retune=0 hold_retune=1 retune_period=0
	Here's what happens: the __mmc_start_request function is called with opcode 5. Since the power to the Wi-Fi card, which resides on this SDIO bus, is initially off after the reset, an interrupt SDHCI_INT_TIMEOUT is triggered. Immediately after that, a second interrupt SDHCI_INT_RESPONSE is triggered. Depending on the exact timing, these conditions can trigger the following race problem:
	1) The sdhci_cmd_irq top half handles the command as an error. It sets host->cmd to NULL and host->pending_reset to true.
	2) The sdhci_thread_irq bottom half is scheduled next and executes faster than the second interrupt handler for SDHCI_INT_RESPONSE. It clears host->pending_reset before the SDHCI_INT_RESPONSE handler runs.
	3) The pending interrupt SDHCI_INT_RESPONSE handler gets called, triggering a code path that prints: "mmc2: Got command interrupt 0x00000001 even though no command operation was in progress."
	To solve this issue, we need to clear pending interrupts when resetting host->pending_reset. This ensures that after sdhci_threaded_irq restores interrupts, there are no pending stale interrupts. The behavior observed here is non-compliant with the SDHCI standard. Place the code in the sdhci-of-dwcmshc driver to account for a hardware-specific quirk instead of the core SDHCI code.
	Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 43658a542ebf ("mmc: sdhci-of-dwcmshc: Add support for T-Head TH1520") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241008100327.4108895-1-m.wilczynski@samsung.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-10	Merge branch 'net-mlx5-qos-refactor-esw-qos-to-support-new-features'	Paolo Abeni
	Tariq Toukan says: ==================== net/mlx5: qos: Refactor esw qos to support new features This patch series by Cosmin and Carolina prepares the mlx5 qos infra for the upcoming feature of cross E-Switch scheduling. Noop cleanups: net/mlx5: qos: Flesh out element_attributes in mlx5_ifc.h net/mlx5: qos: Rename vport 'tsar' into 'sched_elem'. net/mlx5: qos: Consistently name vport vars as 'vport' net/mlx5: qos: Refactor and document bw_share calculation net/mlx5: qos: Rename rate group 'list' as 'parent_entry' Refactor the code with the goal of moving groups out of E-Switches: net/mlx5: qos: Maintain rate group vport members in a list net/mlx5: qos: Always create group0 net/mlx5: qos: Drop 'esw' param from vport qos functions net/mlx5: qos: Store the eswitch in a mlx5_esw_rate_group Move groups from an E-Switch into an mlx5_qos_domain: net/mlx5: qos: Store rate groups in a qos domain Refactor locking to use a new mutex in the qos domain: net/mlx5: qos: Refactor locking to a qos domain mutex In follow-up patchsets, we'll allow qos domains to be shared between E-Switches of the same NIC. The two top patches are simple enhancements. ==================== Link: https://patch.msgid.link/20241008183222.137702-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: Add support check for TSAR types in QoS scheduling	Carolina Jubran
	Introduce a new function, mlx5_qos_tsar_type_supported(), to handle the validation of TSAR types within QoS scheduling contexts. Refactor the existing code to use this new function, replacing direct checks for TSAR type support in the NIC scheduling hierarchy. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: Unify QoS element type checks across NIC and E-Switch	Carolina Jubran
	Refactor the QoS element type support check by introducing a new function, mlx5_qos_element_type_supported(), which handles element type validation for both NIC and E-Switch schedulers. This change removes the redundant esw_qos_element_type_supported() function and unifies the element type checks into a single implementation. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Refactor locking to a qos domain mutex	Cosmin Ratiu
	E-Switch qos changes used the esw state_lock to serialize qos changes. With the introduction of cross-esw scheduling, multiple E-Switches might be involved in a qos operation, so prepare for that by switching locking to use a qos domain mutex. Add three helper functions: - esw_qos_lock - esw_qos_unlock - esw_assert_qos_lock_held Convert existing direct lock/unlock/lockdep calls to them. Also call esw_assert_qos_lock_held in a couple more places. mlx5_esw_qos_set_vport_rate expected to be called with the esw state_lock already held. Change it to instead acquire the qos lock directly. mlx5_eswitch_get_vport_config also accessed qos properties with the esw state lock. Introduce a new function mlx5_esw_qos_get_vport_rate to access those with the correct lock and change get_vport_config to use it. Finally, mlx5_vport_disable is called from the cleanup path with the esw state_lock held, so have it additionally acquire the qos lock to make sure there are no races. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Store rate groups in a qos domain	Cosmin Ratiu
	Groups are currently maintained as a list in their corresponding eswitch, protected by the esw state_lock. The upcoming cross-eswitch scheduling feature cannot work with this approach, as it would require acquiring multiple eswitch locks (in the correct order) in order to maintain group membership. This commit moves the rate groups into a new 'qos domain' struct and adds explicit qos init/cleanup steps to the eswitch init/cleanup. Upcoming patches will expand the qos domain struct and allow it to be shared between eswitches. For now, qos domains are private to each esw so there's only an extra indirection. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Rename rate group 'list' as 'parent_entry'	Cosmin Ratiu
	'list' is not very descriptive, I prefer list membership to clearly specify which list the entry belongs to. This commit renames the list entry into the esw groups list as 'parent_entry' to make the code more readable. This is a no-op change. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Add an explicit 'dev' to vport trace calls	Cosmin Ratiu
	vport qos trace calls used vport->dev implicitly as the device to which the command was sent (and thus the device logged in traces). But that will no longer be the case for cross-esw scheduling, where the commands have to be sent to the group esw device instead. This commit corrects that. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Store the eswitch in a mlx5_esw_rate_group	Cosmin Ratiu
	The rate groups are about to be moved out of eswitches, so store a reference to the eswitch they belong to so things can still work later. This allows dropping the esw parameter from a couple of functions and simplifying some of the code. Use this opportunity to make sure that vport scheduling element commands are always sent to the group eswitch, because that will be relevant for cross-esw scheduling. For now though, the eswitches are not different. There is no functionality change here. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Drop 'esw' param from vport qos functions	Cosmin Ratiu
	The vport has a pointer to its own eswitch in vport->dev->priv.eswitch, so passing the same eswitch as a parameter to the various functions manipulating vport qos is superfluous at best and prone to errors at worst. More importantly, with the upcoming cross-esw scheduling changes, the eswitch that should receive the various scheduling element commands is NOT the same as the vport's eswitch, so the current code's assumptions will break. To avoid confusion and bugs, this commit drops the 'esw' parameter from all vport qos functions and uses the vport's own eswitch pointer instead. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Always create group0	Cosmin Ratiu
	All vports not explicitly members of a group with QoS enabled are part of the internal esw group0, except when the hw reports that groups aren't supported (log_esw_max_sched_depth == 0). This creates corner cases in the code, which has to make sure that this case is supported. Additionally, the groups are about to be moved out of eswitches, and group0 being NULL creates additional complications there. This patch makes sure to always create group0, even if max sched depth is 0. In that case, a software-only group0 is created referencing the root TSAR. Vports can point to this group when their QoS is enabled and they'll be attached to the root TSAR directly. This eliminates corner cases in the code by offering the guarantee that if qos is enabled, vport->qos.group is non-NULL. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Maintain rate group vport members in a list	Cosmin Ratiu
	Previously, finding group members was done by iterating over all vports of an eswitch and comparing their group with the required one, but that approach will break down when a group can contain vports from multiple eswitches. Solve that by maintaining a list of vport members. Instead of iterating over esw vports, loop over the members list. Use this opportunity to provide two new functions to allocate and free a group, so that the number of state transitions is smaller. This will also be used in a future patch. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
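The member-list idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct rate_group`, `struct vport`, `group_add_vport()` and the small list helpers are hypothetical stand-ins, not the mlx5 driver's actual types or the kernel's `list.h` API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the mlx5 driver's types. */
struct list_node {
	struct list_node *prev, *next;
};

struct rate_group {
	struct list_node members;     /* head of the member list */
};

struct vport {
	int id;
	struct rate_group *group;     /* NULL while not in any group */
	struct list_node group_entry; /* links this vport into group->members */
};

/* Recover the vport that embeds a given list node. */
#define vport_from_entry(ptr) \
	((struct vport *)((char *)(ptr) - offsetof(struct vport, group_entry)))

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Attach a vport to a group; the vport may come from any eswitch. */
static void group_add_vport(struct rate_group *g, struct vport *v)
{
	assert(v->group == NULL);
	v->group = g;
	list_add_tail(&v->group_entry, &g->members);
}

static void group_remove_vport(struct vport *v)
{
	list_del(&v->group_entry);
	v->group = NULL;
}

/* Walk only the members, instead of scanning every vport of an eswitch. */
static int group_count_members(const struct rate_group *g)
{
	int n = 0;

	for (const struct list_node *p = g->members.next; p != &g->members; p = p->next)
		n++;
	return n;
}
```

Walking `members` costs time proportional to the group size and does not care which eswitch a vport belongs to, which is the property the commit message relies on.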
2024-10-10	net/mlx5: qos: Refactor and document bw_share calculation	Cosmin Ratiu
	The previous function (esw_qos_calculate_group_min_rate_divider) had two completely different modes of execution, depending on the 'group_level' parameter. Split it into two separate functions: - esw_qos_calculate_min_rate_divider - computes min across groups. - esw_qos_calculate_group_min_rate_divider - computes min in a group. Fold the divider calculation into the corresponding normalize functions to avoid having the caller compute the corresponding divider. Also rename the normalize functions to better indicate what level they're operating on. Finally, document everything so that this topic can more easily be understood by future maintainers. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Consistently name vport vars as 'vport'	Cosmin Ratiu
	The current mixture of 'vport' and 'evport' can be improved. There is no functional change. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Rename vport 'tsar' into 'sched_elem'.	Cosmin Ratiu
	Vports do not use TSARs (Transmit Scheduling ARbiters), which are used for grouping multiple entities together. Use the correct name in variables and functions for clarity. Also move the scheduling context to a local variable in the esw_qos_sched_elem_config function instead of an empty parameter that needs to be provided by all callers. There is no functional change here. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	net/mlx5: qos: Flesh out element_attributes in mlx5_ifc.h	Cosmin Ratiu
	This is used for multiple purposes, depending on the scheduling element created. There are a few helper structs defined a long time ago, but they are not easy to find in the file and they are about to get new members. This commit cleans up this area a bit by: - moving the helper structs closer to where they are relevant. - defining a helper union to include all of them to help discoverability. - making use of it everywhere element_attributes is used. - using a consistent 'attr' name. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	Merge branch 'eth-fbnic-add-timestamping-support'	Paolo Abeni
	Vadim Fedorenko says: ==================== eth: fbnic: add timestamping support The series is to add timestamping support for Meta's NIC driver. Changelog: v3 -> v4: - use adjust_by_scaled_ppm() instead of open coding it - adjust cached value of high bits of timestamp to be sure it is older than incoming timestamps v2 -> v3: - rebase on top of net-next - add doc to describe return value of fbnic_ts40_to_ns() v1 -> v2: - adjust comment about using u64 stats locking primitive - fix typo in the first patch - Cc Richard ==================== Link: https://patch.msgid.link/20241008181436.4120604-1-vadfed@meta.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
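The changelog mentions replacing open-coded arithmetic with adjust_by_scaled_ppm(). As a rough userspace sketch of what such a frequency adjustment computes, assuming that scaled_ppm is parts per million with a 16-bit binary fraction: the function name `sketch_adjust_by_scaled_ppm()` is made up for illustration and this is not the kernel helper itself. The 128-bit intermediate relies on a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: scale 'base' by (1 + scaled_ppm / (10^6 * 2^16)).
 * Assumes scaled_ppm carries a 16-bit fractional part.
 */
static uint64_t sketch_adjust_by_scaled_ppm(uint64_t base, long scaled_ppm)
{
	int negative = scaled_ppm < 0;
	/* Magnitude of scaled_ppm, computed without signed overflow. */
	uint64_t mag = negative ? -(uint64_t)scaled_ppm : (uint64_t)scaled_ppm;
	/* 128-bit product so that large 'base' values do not overflow. */
	unsigned __int128 prod = (unsigned __int128)base * mag;
	uint64_t diff = (uint64_t)(prod / ((uint64_t)1000000 << 16));

	return negative ? base - diff : base + diff;
}
```

With these assumptions, a scaled_ppm of 65536 (exactly 1 ppm) applied to a base of 1000000 moves it by one unit in either direction, depending on the sign.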
2024-10-10	eth: fbnic: add ethtool timestamping statistics	Vadim Fedorenko
	Add counters of packets with HW timestamp requests and lost timestamps with no associated skbs. Use ethtool interface to report these counters. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	eth: fbnic: add TX packets timestamping support	Vadim Fedorenko
	Add TX configuration to ethtool interface. Add processing of TX timestamp completions as well as configuration to request HW to create TX timestamp completion. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	eth: fbnic: add RX packets timestamping support	Vadim Fedorenko
	Add callbacks to support timestamping configuration via ethtool. Add processing of RX timestamps. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	eth: fbnic: add initial PHC support	Vadim Fedorenko
	Create PHC device and provide callbacks needed for ptp_clock device. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	eth: fbnic: add software TX timestamping support	Vadim Fedorenko
	Add software TX timestamping support. RX software timestamping is implemented in the core and there is no need to provide a special flag in the driver anymore. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	openat2: explicitly return -E2BIG for (usize > PAGE_SIZE)	Aleksa Sarai
	While we do currently return -EFAULT in this case, it seems prudent to follow the behaviour of other syscalls like clone3. It seems quite unlikely that anyone depends on this error code being EFAULT, but we can always revert this if it turns out to be an issue. Cc: stable@vger.kernel.org # v5.6+ Fixes: fddb5d430ad9 ("open: introduce openat2(2) syscall") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/r/20241010-extensible-structs-check_fields-v3-3-d2833dfe6edd@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
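The size check described above can be modelled in userspace roughly as follows. This is a sketch, not the kernel's code: `check_open_how_size()` is a hypothetical name, the 4 KiB page size and the 24-byte minimum struct size are assumptions made for the example, and the lower-bound check is included only to show where the new upper-bound check would sit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE     4096u /* assumption: 4 KiB pages */
#define SKETCH_HOW_SIZE_VER0 24u   /* assumption: size of the first struct version */

/*
 * Validate the user-supplied struct size before any copy is attempted.
 * Returns 0 if the size is acceptable, or a negative errno value.
 */
static int check_open_how_size(size_t usize)
{
	if (usize < SKETCH_HOW_SIZE_VER0)
		return -EINVAL; /* too small to be any known version */
	if (usize > SKETCH_PAGE_SIZE)
		return -E2BIG;  /* explicit error, instead of failing later with -EFAULT */
	return 0;
}
```

Rejecting oversized structs up front gives callers one predictable error code for this case, which is the behaviour the commit message attributes to other extensible-struct syscalls such as clone3.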
2024-10-10	net: Remove likely from l3mdev_master_ifindex_by_index	Breno Leitao
	The likely() annotation in l3mdev_master_ifindex_by_index() has been found to be incorrect 100% of the time in real-world workloads (e.g., web servers). Annotated branch profiling shows the following on these servers: correct: 0, incorrect: 169053813, %: 100, Function: l3mdev_master_ifindex_by_index, File: l3mdev.h, Line: 81. This is happening because l3mdev_master_ifindex_by_index() is called from __inet_check_established(), which calls l3mdev_master_ifindex_by_index() passing the socket's bound interface: l3mdev_master_ifindex_by_index(net, sk->sk_bound_dev_if); Since most sockets are not going to be bound to a network device, the likely() is giving the wrong assumption. Remove the likely() annotation to ensure more accurate branch prediction. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241008163205.3939629-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
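To illustrate what removing the annotation means, here is a small userspace sketch. The `likely()` macro is defined on top of the GCC/Clang `__builtin_expect` builtin; `lookup_master_ifindex()` and both wrapper functions are hypothetical stand-ins, not the kernel's l3mdev code.

```c
#include <assert.h>

/* Branch hint built on the GCC/Clang builtin. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical stand-in for the device lookup; not a kernel function. */
static int lookup_master_ifindex(int ifindex)
{
	assert(ifindex > 0); /* only called for bound sockets in this sketch */
	return ifindex + 100;
}

/* Before: tells the compiler that ifindex is usually non-zero. */
static int master_ifindex_with_hint(int ifindex)
{
	int rc = 0;

	if (likely(ifindex))
		rc = lookup_master_ifindex(ifindex);
	return rc;
}

/* After: no hint, so code layout is not biased toward the lookup path. */
static int master_ifindex_plain(int ifindex)
{
	int rc = 0;

	if (ifindex)
		rc = lookup_master_ifindex(ifindex);
	return rc;
}
```

Both variants return the same values; the hint only influences how the compiler lays out the two paths, which is why a hint that is wrong for unbound sockets (ifindex of 0) costs performance without affecting correctness.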
2024-10-10	net: do not delay dst_entries_add() in dst_release()	Eric Dumazet
	dst_entries_add() uses per-cpu data that might be freed at netns dismantle from ip6_route_net_exit() calling dst_entries_destroy(). Before ip6_route_net_exit() can be called, we release all the dsts associated with this netns, via calls to dst_release(), which waits an rcu grace period before calling dst_destroy(). The use of dst_entries_add() in dst_destroy() is racy, because dst_entries_destroy() could have been called already. Decrementing the number of dsts must happen sooner. Notes: 1) in CONFIG_XFRM case, dst_destroy() can call dst_release_immediate(child), this might also cause UAF if the child does not have DST_NOCOUNT set. IPSEC maintainers might take a look and see how to address this. 2) There is also discussion about removing this count of dst, which might happen in future kernels. Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()") Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/ Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20241008143110.1064899-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-10	crypto: marvell/cesa - Disable hash algorithms	Herbert Xu
	Disable cesa hash algorithms by lowering the priority because they appear to be broken when invoked in parallel. This allows them to still be tested for debugging purposes. Reported-by: Klaus Kudielka <klaus.kudielka@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-10	crypto: testmgr - Hide ENOENT errors better	Herbert Xu
	The previous patch removed the ENOENT warning at the point of allocation, but the overall self-test warning is still there. Fix all of them by returning zero as the test result. This is safe because if the algorithm has gone away, then it cannot be marked as tested. Fixes: 4eded6d14f5b ("crypto: testmgr - Hide ENOENT errors") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-10	crypto: api - Fix liveliness check in crypto_alg_tested	Herbert Xu
	As algorithm testing is carried out without holding the main crypto lock, it is always possible for the algorithm to go away during the test. So before crypto_alg_tested updates the status of the tested alg, it checks whether it's still on the list of all algorithms. This is inaccurate because it may be off the main list but still on the list of algorithms to be removed. Updating the algorithm status is safe per se as the larval still holds a reference to it. However, killing spawns of other algorithms that are of lower priority is clearly a deficiency as it adds unnecessary churn. Fix the test by checking whether the algorithm is dead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-10	ata: libata: Update MAINTAINERS file	Damien Le Moal
	Modify the entry for the ahci_platform driver (LIBATA SATA AHCI PLATFORM devices support) in the MAINTAINERS file to remove Jens as maintainer. Also remove all references to Jens' block tree from the various LIBATA driver entries as the tree reference for these is defined by the LIBATA SUBSYSTEM entry. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20241010020117.416333-1-dlemoal@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>