summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2025-04-16crypto: acomp - Use request flag helpers and add acomp_request_flagsHerbert Xu
Use the newly added request flag helpers to manage the request flags. Also add acomp_request_flags which lets bottom-level users to access the request flags without the bits private to the acomp API. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16crypto: api - Add helpers to manage request flagsHerbert Xu
Add helpers so that the ON_STACK request flag management is not duplicated all over the place. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16crypto: ahash - Remove request chainingHerbert Xu
Request chaining requires the user to do too much book keeping. Remove it from ahash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-16crypto: acomp - Remove request chainingHerbert Xu
Request chaining requires the user to do too much book keeping. Remove it from acomp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-04-15net: ptp: introduce .supported_perout_flags to ptp_clock_infoJacob Keller
The PTP_PEROUT_REQUEST2 ioctl has gained support for flags specifying specific output behavior including PTP_PEROUT_ONE_SHOT, PTP_PEROUT_DUTY_CYCLE, PTP_PEROUT_PHASE. Driver authors are notorious for not checking the flags of the request. This results in misinterpreting the request, generating an output signal that does not match the requested value. It is anticipated that even more flags will be added in the future, resulting in even more broken requests. Expecting these issues to be caught during review or playing whack-a-mole after the fact is not a great solution. Instead, introduce the supported_perout_flags field in the ptp_clock_info structure. Update the core character device logic to explicitly reject any request which has a flag not on this list. This ensures that drivers must 'opt in' to the flags they support. Drivers which don't set the .supported_perout_flags field will not need to check that unsupported flags aren't passed, as the core takes care of this. Update the drivers which do support flags to set this new field. Note the following driver files set n_per_out to a non-zero value but did not check the flags at all: • drivers/ptp/ptp_clockmatrix.c • drivers/ptp/ptp_idt82p33.c • drivers/ptp/ptp_fc3.c • drivers/net/ethernet/ti/am65-cpts.c • drivers/net/ethernet/aquantia/atlantic/aq_ptp.c • drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c • drivers/net/dsa/sja1105/sja1105_ptp.c • drivers/net/ethernet/freescale/dpaa2/dpaa2-ptp.c • drivers/net/ethernet/mscc/ocelot_vsc7514.c • drivers/net/ethernet/intel/i40e/i40e_ptp.c Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250414-jk-supported-perout-flags-v2-2-f6b17d15475c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: ptp: introduce .supported_extts_flags to ptp_clock_infoJacob Keller
The PTP_EXTTS_REQUEST(2) ioctl has a flags field which specifies how the external timestamp request should behave. This includes which edge of the signal to timestamp, as well as a specialized "offset" mode. It is expected that more flags will be added in the future. Driver authors routinely do not check the flags, often accepting requests with flags which they do not support. Even drivers which do check flags may not be future-proofed to reject flags not yet defined. Thus, any future flag additions often require manually updating drivers to reject these flags. This approach of hoping we catch flag checks during review, or playing whack-a-mole after the fact is the wrong approach. Introduce the "supported_extts_flags" field to the ptp_clock_info structure. This field defines the set of flags the device actually supports. Update the core character device logic to check this field and reject unsupported requests. Getting this right is somewhat tricky. First, to avoid unnecessary repetition and make basic functionality work when .supported_extts_flags is 0, the core always accepts the PTP_ENABLE_FEATURE flag. This flag is used to set the 'on' parameter to the .enable function and is thus always 'supported' by all drivers. For backwards compatibility, the PTP_RISING_EDGE and PTP_FALLING_EDGE flags are merely "hints" when using the old PTP_EXTTS_REQUEST ioctl, and are not expected to be enforced. If the user issues PTP_EXTTS_REQUEST2, the PTP_STRICT_FLAGS flag is added which is supposed to inform the driver to strictly validate the flags and reject unsupported requests. To handle this, first check if the driver reports PTP_STRICT_FLAGS support. If it does not, then always allow the PTP_RISING_EDGE and PTP_FALLING_EDGE flags. This keeps backwards compatibility with the original PTP_EXTTS_REQUEST ioctl where these flags are not guaranteed to be honored. This way, drivers which do not set the supported_extts_flags will continue to accept requests for the original PTP_EXTTS_REQUEST ioctl. The core will automatically reject requests with new flags, and correctly reject requests with PTP_STRICT_FLAGS, where the driver is supposed to strictly validate the flags. Update the various drivers, refactoring their validation logic into the .supported_extts_flags field. For consistency and readability, PTP_ENABLE_FEATURE is not set in the supported flags list, and PTP_EXTTS_EDGES is expanded to PTP_RISING_EDGE | PTP_FALLING_EDGE in all cases. Note the following driver files set n_ext_ts to a non-zero value but did not check flags at all: • drivers/net/ethernet/freescale/dpaa2/dpaa2-ptp.c • drivers/net/ethernet/freescale/enetc/enetc_ptp.c • drivers/net/ethernet/intel/i40e/i40e_ptp.c • drivers/net/ethernet/marvell/octeontx2/nic/otx2_ptp.c • drivers/net/ethernet/renesas/ravb_ptp.c • drivers/net/ethernet/renesas/rtsn.c • drivers/net/ethernet/renesas/rtsn.h • drivers/net/ethernet/ti/am65-cpts.c • drivers/net/ethernet/ti/cpts.h • drivers/net/ethernet/ti/icssg/icss_iep.c • drivers/net/ethernet/xscale/ptp_ixp46x.c • drivers/net/phy/bcm-phy-ptp.c • drivers/ptp/ptp_ocp.c • drivers/ptp/ptp_pch.c • drivers/ptp/ptp_qoriq.c These drivers behavior does change slightly: they will now reject the PTP_EXTTS_REQUEST2 ioctl, because they do not strictly validate their flags. This also makes them no longer incorrectly accept PTP_EXT_OFFSET. Also note that the renesas ravb driver does not support PTP_STRICT_FLAGS. We could leave the .supported_extts_flags as 0, but I added the PTP_RISING_EDGE | PTP_FALLING_EDGE since the driver previously manually validated these flags. This is equivalent to 0 because the core will allow these flags regardless unless PTP_STRICT_FLAGS is also set. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250414-jk-supported-perout-flags-v2-1-f6b17d15475c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: fib_rules: Fix iif / oif matching on L3 master deviceIdo Schimmel
Before commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") it was possible to use FIB rules to match on a L3 domain. This was done by having a FIB rule match on iif / oif being a L3 master device. It worked because prior to the FIB rule lookup the iif / oif fields in the flow structure were reset to the index of the L3 master device to which the input / output device was enslaved to. The above scheme made it impossible to match on the original input / output device. Therefore, cited commit stopped overwriting the iif / oif fields in the flow structure and instead stored the index of the enslaving L3 master device in a new field ('flowi_l3mdev') in the flow structure. While the change enabled new use cases, it broke the original use case of matching on a L3 domain. Fix this by interpreting the iif / oif matching on a L3 master device as a match against the L3 domain. In other words, if the iif / oif in the FIB rule points to a L3 master device, compare the provided index against 'flowi_l3mdev' rather than 'flowi_{i,o}if'. Before cited commit, a FIB rule that matched on 'iif vrf1' would only match incoming traffic from devices enslaved to 'vrf1'. With the proposed change (i.e., comparing against 'flowi_l3mdev'), the rule would also match traffic originating from a socket bound to 'vrf1'. Avoid that by adding a new flow flag ('FLOWI_FLAG_L3MDEV_OIF') that indicates if the L3 domain was derived from the output interface or the input interface (when not set) and take this flag into account when evaluating the FIB rule against the flow structure. Avoid unnecessary checks in the data path by detecting that a rule matches on a L3 master device when the rule is installed and marking it as such. Tested using the following script [1]. Output before 40867d74c374 (v5.4.291): default dev dummy1 table 100 scope link default dev dummy1 table 200 scope link Output after 40867d74c374: default dev dummy1 table 300 scope link default dev dummy1 table 300 scope link Output with this patch: default dev dummy1 table 100 scope link default dev dummy1 table 200 scope link [1] #!/bin/bash ip link add name vrf1 up type vrf table 10 ip link add name dummy1 up master vrf1 type dummy sysctl -wq net.ipv4.conf.all.forwarding=1 sysctl -wq net.ipv4.conf.all.rp_filter=0 ip route add table 100 default dev dummy1 ip route add table 200 default dev dummy1 ip route add table 300 default dev dummy1 ip rule add prio 0 oif vrf1 table 100 ip rule add prio 1 iif vrf1 table 200 ip rule add prio 2 table 300 ip route get 192.0.2.1 oif dummy1 fibmatch ip route get 192.0.2.1 iif dummy1 from 198.51.100.1 fibmatch Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices") Reported-by: hanhuihui <hanhuihui5@huawei.com> Closes: https://lore.kernel.org/netdev/ec671c4f821a4d63904d0da15d604b75@huawei.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250414172022.242991-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15PCI: pciehp: Ignore Link Down/Up caused by Secondary Bus ResetLukas Wunner
When a Secondary Bus Reset is issued at a hotplug port, it causes a Data Link Layer State Changed event as a side effect. On hotplug ports using in-band presence detect, it additionally causes a Presence Detect Changed event. These spurious events should not result in teardown and re-enumeration of the device in the slot. Hence commit 2e35afaefe64 ("PCI: pciehp: Add reset_slot() method") masked the Presence Detect Changed Enable bit in the Slot Control register during a Secondary Bus Reset. Commit 06a8d89af551 ("PCI: pciehp: Disable link notification across slot reset") additionally masked the Data Link Layer State Changed Enable bit. However masking those bits only disables interrupt generation (PCIe r6.2 sec 6.7.3.1). The events are still visible in the Slot Status register and picked up by the IRQ handler if it runs during a Secondary Bus Reset. This can happen if the interrupt is shared or if an unmasked hotplug event occurs, e.g. Attention Button Pressed or Power Fault Detected. The likelihood of this happening used to be small, so it wasn't much of a problem in practice. That has changed with the recent introduction of bandwidth control in v6.13-rc1 with commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller"): Bandwidth control shares the interrupt with PCIe hotplug. A Secondary Bus Reset causes a Link Bandwidth Notification, so the hotplug IRQ handler runs, picks up the masked events and tears down the device in the slot. As a result, Joel reports VFIO passthrough failure of a GPU, which Ilpo root-caused to the incorrect handling of masked hotplug events. Clearly, a more reliable way is needed to ignore spurious hotplug events. For Downstream Port Containment, a new ignore mechanism was introduced by commit a97396c6eb13 ("PCI: pciehp: Ignore Link Down/Up caused by DPC"). It has been working reliably for the past four years. Adapt it for Secondary Bus Resets. Introduce two helpers to annotate code sections which cause spurious link changes: pci_hp_ignore_link_change() and pci_hp_unignore_link_change() Use those helpers in lieu of masking interrupts in the Slot Control register. Introduce a helper to check whether such a code section is executing concurrently and if so, await it: pci_hp_spurious_link_change() Invoke the helper in the hotplug IRQ thread pciehp_ist(). Re-use the IRQ thread's existing code which ignores DPC-induced link changes unless the link is unexpectedly down after reset recovery or the device was replaced during the bus reset. That code block in pciehp_ist() was previously only executed if a Data Link Layer State Changed event has occurred. Additionally execute it for Presence Detect Changed events. That's necessary for compatibility with PCIe r1.0 hotplug ports because Data Link Layer State Changed didn't exist before PCIe r1.1. DPC was added with PCIe r3.1 and thus DPC-capable hotplug ports always support Data Link Layer State Changed events. But the same cannot be assumed for Secondary Bus Reset, which already existed in PCIe r1.0. Secondary Bus Reset is only one of many causes of spurious link changes. Others include runtime suspend to D3cold, firmware updates or FPGA reconfiguration. The new pci_hp_{,un}ignore_link_change() helpers may be used by all kinds of drivers to annotate such code sections, hence their declarations are publicly visible in <linux/pci.h>. A case in point is the Mellanox Ethernet driver which disables a firmware reset feature if the Ethernet card is attached to a hotplug port, see commit 3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled"). Going forward, PCIe hotplug will be able to cope gracefully with all such use cases once the code sections are properly annotated. The new helpers internally use two bits in struct pci_dev's priv_flags as well as a wait_queue. This mirrors what was done for DPC by commit a97396c6eb13 ("PCI: pciehp: Ignore Link Down/Up caused by DPC"). That may be insufficient if spurious link changes are caused by multiple sources simultaneously. An example might be a Secondary Bus Reset issued by AER during FPGA reconfiguration. If this turns out to happen in real life, support for it can easily be added by replacing the PCI_LINK_CHANGING flag with an atomic_t counter incremented by pci_hp_ignore_link_change() and decremented by pci_hp_unignore_link_change(). Instead of awaiting a zero PCI_LINK_CHANGING flag, the pci_hp_spurious_link_change() helper would then simply await a zero counter. Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") Reported-by: Joel Mathew Thomas <proxy0@tutamail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219765 Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Joel Mathew Thomas <proxy0@tutamail.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/d04deaf49d634a2edf42bf3c06ed81b4ca54d17b.1744298239.git.lukas@wunner.de
2025-04-15ASoC: Add codec driver for Cirrus Logic CS48L32 DSPMark Brown
Merge series from Richard Fitzgerald <rf@opensource.cirrus.com>: formance low-power audio DSP with analog and PDM digital inputs and support for low-power always-on voice-trigger functionality. This series adds the devicetree bindings and the ASoC codec driver.
2025-04-15ASoC: skip the endpoint that doesn't present andMark Brown
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: A codec endpoint may not be used. We could check the present SDCA functions to know if the endpoint is used or not. Skip the endpoint which is not used. And load the topology dynamically for each endpoint. With this feature, we don't need to use the quirk to determine the existence of the optional codec DAIs.
2025-04-15sysfs: constify attribute_group::bin_attrsThomas Weißschuh
All users of this field have been migrated to bin_attrs_new. It can now be constified. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250313-sysfs-const-bin_attr-final-v2-2-96284e1e88ce@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15sysfs: constify bin_attribute argument of bin_attribute::read/write()Thomas Weißschuh
All callback implementers have been moved to the const variant of the callbacks. The signature of the original callbacks can now be changed. Also remove the now unnecessary transition machinery inside __BIN_ATTR(). Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250313-sysfs-const-bin_attr-final-v2-1-96284e1e88ce@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15device property: Add a note to the fwnode.hAndy Shevchenko
Add a note to the fwnode.h that the header should not be used directly in the leaf drivers, they all should use the higher level APIs and the respective headers. The purpose of this note is to give guidance to driver writers to avoid repeating a common mistake. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250408095229.1298005-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15netlink: Introduce nlmsg_payload helperBreno Leitao
Create a new helper function, nlmsg_payload(), to simplify checking and retrieving Netlink message payloads. This reduces boilerplate code for users who need to verify the message length before accessing its data. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250414-nlmsg-v2-1-3d90cb42c6af@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15net: phy: remove device_phy_find_deviceHeiner Kallweit
AFAICS this function has never had a user. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/ab7b8094-2eea-4e82-a047-fd60117f220b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15mptcp: sched: remove mptcp_sched_dataMatthieu Baerts (NGI0)
This is a follow-up of commit b68b106b0f15 ("mptcp: sched: reduce size for unused data"), now removing the mptcp_sched_data structure. Now is a good time to do that, because the previously mentioned WIP work has been updated, no longer depending on this structure. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250413-net-next-mptcp-sched-mib-sft-misc-v2-1-0f83a4350150@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15platform/mellanox: Rename field to improve code readabilityVadim Pasternak
Rename field 'counter' in 'mlxreg_core_hotplug_platform_data' to count. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20250412091843.33943-2-vadimp@nvidia.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-15io_uring/zcrx: return ifq id to the userPavel Begunkov
IORING_OP_RECV_ZC requests take a zcrx object id via sqe::zcrx_ifq_idx, which binds it to the corresponding if / queue. However, we don't return that id back to the user. It's fine as currently there can be only one zcrx and the user assumes that its id should be 0, but as we'll need multiple zcrx objects in the future let's explicitly pass it back on registration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8714667d370651962f7d1a169032e5f02682a73e.1744722517.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-15driver core: auxiliary bus: add device creation helpersJerome Brunet
Add helper functions to create a device on the auxiliary bus. This is meant for fairly simple usage of the auxiliary bus, to avoid having the same code repeated in the different drivers. Suggested-by: Stephen Boyd <sboyd@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20250218-aux-device-create-helper-v4-1-c3d7dfdea2e6@baylibre.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-15ASoC: cs48l32: Add driver for Cirrus Logic CS48L32 audio DSPRichard Fitzgerald
Add a codec driver for the Cirrus Logic CS48L32 audio DSP. The CS48L32 is a low-power audio DSP with microphone inputs for "Always on Voice" (i.e. voice trigger) and voice command processing. It has a programmable Halo Core DSP and a variety of power-efficient fixed-function audio processors, with configurable digital mixing and routing. There are two I2S/TDM audio serial ports. Four analogue inputs are available through IN1. These feed into a 2-channel ADC through an analogue mux. There is an ALSA control for each IN1 ADC channel to select which analogue input to use. A dedicated digital mic (DMIC) PDM input is available on IN2. Two PDM outputs can feed DMIC inputs on another codec or a host DMIC/PDM input. An on-board regulator provides a power supply or bias voltage to attached microphones. Three switchable MICBIAS outputs are fed from this allowing only the microphone in use to be powered-up. There are DAPM widgets for these outputs: MICBIAS1A, MICBIAS1B and MICBIAS1C. The machine driver must create a DAPM route from the required MICBIAS1x widget to the INn input widgets to make the MICBIAS switch on when the audio input is powered-up. For example if the microphone feeding CS48L32 pin IN1LN_1 is powered from MICBIAS1A, the machine driver must create the path: (sink) IN1LN_1 <----- (source) MICBIAS1A Co-developed-by: Stuart Henderson <stuarth@opensource.cirrus.com> Signed-off-by: Stuart Henderson <stuarth@opensource.cirrus.com> Co-developed-by: Qi Zhou <qi.zhou@cirrus.com> Signed-off-by: Qi Zhou <qi.zhou@cirrus.com> Co-developed-by: Piotr Stankiewicz <piotrs@opensource.cirrus.com> Signed-off-by: Piotr Stankiewicz <piotrs@opensource.cirrus.com> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250415115016.505777-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-15ASoC: dt-bindings: Add Cirrus Logic CS48L32 audio DSPRichard Fitzgerald
The CS48L32 is an Audio DSP with microphone inputs and SPI control interface. It has a programmable DSP and a variety of power-efficient fixed-function audio processors, with configurable digital mixing and routing. Most properties are core properties: supply regulators, gpios, clocks, interrupt parent and SPI interface. The custom properties define the configuration of the microphone inputs to match what is physically attached to them. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Reviewed-by: "Rob Herring (Arm)" <robh@kernel.org> Link: https://patch.msgid.link/20250415115016.505777-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-15mm: skip folio reclaim in legacy memcg contexts for deadlockable mappingsJoanne Koong
Currently in shrink_folio_list(), reclaim for folios under writeback falls into 3 different cases: 1) Reclaim is encountering an excessive number of folios under writeback and this folio has both the writeback and reclaim flags set 2) Dirty throttling is enabled (this happens if reclaim through cgroup is not enabled, if reclaim through cgroupv2 memcg is enabled, or if reclaim is on the root cgroup), or if the folio is not marked for immediate reclaim, or if the caller does not have __GFP_FS (or __GFP_IO if it's going to swap) set 3) Legacy cgroupv1 encounters a folio that already has the reclaim flag set and the caller did not have __GFP_FS (or __GFP_IO if swap) set In cases 1) and 2), we activate the folio and skip reclaiming it while in case 3), we wait for writeback to finish on the folio and then try to reclaim the folio again. In case 3, we wait on writeback because cgroupv1 does not have dirty folio throttling, as such this is a mitigation against the case where there are too many folios in writeback with nothing else to reclaim. If a filesystem (eg fuse) may deadlock due to reclaim waiting on writeback, then the filesystem needs to add inefficient messy workarounds to prevent this. To improve the performance of these filesystems, this commit adds two things: a) a AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM mapping flag that filesystems may set to indicate that reclaim should not wait on writeback b) if legacy memcg encounters a folio with this AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM flag set (eg case 3), the folio will be activated and skip reclaim (eg default to behavior in case 2) instead. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-04-15ASoC: cs-amp-lib: Annotate struct cirrus_amp_efi_data with __counted_by()Thorsten Blum
Add the __counted_by() compiler attribute to the flexible array member 'data' to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. No functional changes intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250415090354.92211-2-thorsten.blum@linux.dev Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-15interconnect: core: Add dynamic id allocation supportRaviteja Laggyshetty
The current interconnect framework relies on static IDs for node creation and registration, which limits topologies with multiple instances of the same interconnect provider. To address this, introduce icc_node_create_dyn() and icc_link_nodes() APIs to dynamically allocate IDs for interconnect nodes during creation and link. This change removes the dependency on static IDs, allowing multiple instances of the same hardware, such as EPSS L3. Signed-off-by: Raviteja Laggyshetty <quic_rlaggysh@quicinc.com> Link: https://lore.kernel.org/r/20250415095343.32125-3-quic_rlaggysh@quicinc.com Signed-off-by: Georgi Djakov <djakov@kernel.org>
2025-04-15Merge drm/drm-next into drm-misc-nextThomas Zimmermann
Backmerging to get fixes from v6.15-rc2 into drm-misc-next. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-04-15drm/display: hdmi: provide central data authority for ACR paramsDmitry Baryshkov
HDMI standard defines recommended N and CTS values for Audio Clock Regeneration. Currently each driver implements those, frequently in somewhat unique way. Provide a generic helper for getting those values to be used by the HDMI drivers. The helper is added to drm_hdmi_helper.c rather than drm_hdmi_audio.c since HDMI drivers can be using this helper function even without switching to DRM HDMI Audio helpers. Note: currently this only handles the values per HDMI 1.4b Section 7.2 and HDMI 2.0 Section 9.2.1. Later the table can be expanded to accommodate for Deep Color TMDS char rates per HDMI 1.4 Appendix D and/or HDMI 2.0 / 2.1 Appendix C). Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://lore.kernel.org/r/20250408-drm-hdmi-acr-v2-1-dee7298ab1af@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-04-15fuse: add more control over cache invalidation behaviourLuis Henriques
Currently userspace is able to notify the kernel to invalidate the cache for an inode. This means that, if all the inodes in a filesystem need to be invalidated, then userspace needs to iterate through all of them and do this kernel notification separately. This patch adds the concept of 'epoch': each fuse connection will have the current epoch initialized and every new dentry will have it's d_time set to the current epoch value. A new operation will then allow userspace to increment the epoch value. Every time a dentry is d_revalidate()'ed, it's epoch is compared with the current connection epoch and invalidated if it's value is different. Signed-off-by: Luis Henriques <luis@igalia.com> Tested-by: Laura Promberger <laura.promberger@cern.ch> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-04-15fs: add kern_path_locked_negative()Christian Brauner
The audit code relies on the fact that kern_path_locked() returned a path even for a negative dentry. If it doesn't find a valid dentry it immediately calls: audit_find_parent(d_backing_inode(parent_path.dentry)); which assumes that parent_path.dentry is still valid. But it isn't since kern_path_locked() has been changed to path_put() also for a negative dentry. Fix this by adding a helper that implements the required audit semantics and allows us to fix the immediate bleeding. We can find a unified solution for this afterwards. Link: https://lore.kernel.org/20250414-rennt-wimmeln-f186c3a780f1@brauner Fixes: 1c3cb50b58c3 ("VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry") Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-15PCI/MSI: Add an option to write MSIX ENTRY_DATA before any readsJonathan Currier
Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") introduced a readl() from ENTRY_VECTOR_CTRL before the writel() to ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune chips, the NIU module, will cause an error and/or fatal trap if any MSIX table entry is read before the corresponding ENTRY_DATA field is written to. Add an optional early writel() in msix_prepare_msi_desc(). Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") Signed-off-by: Jonathan Currier <dullfire@yahoo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241117234843.19236-2-dullfire@yahoo.com
2025-04-14rxrpc: Add more CHALLENGE/RESPONSE packet tracingDavid Howells
Add more tracing for CHALLENGE and RESPONSE packets. Currently, rxrpc only has client-relevant tracepoints (rx_challenge and tx_response), but add the server-side ones too. Further, record the service ID in the rx_challenge tracepoint as well. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-14-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: Display security params in the afs_cb_call tracepointDavid Howells
Make the afs_cb_call tracepoint display some security parameters to make debugging easier. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-12-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: rxgk: Implement connection rekeyingDavid Howells
Implement rekeying of connections with the RxGK security class. This involves regenerating the keys with a different key number as part of the input data after a certain amount of time or a certain amount of bytes encrypted. Rekeying may be triggered by either end. The LSW of the key number is inserted into the security-specific field in the RX header, and we try and expand it to 32-bits to make it last longer. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Chuck Lever <chuck.lever@oracle.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-10-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: rxgk: Implement the yfs-rxgk security class (GSSAPI)David Howells
Implement the basic parts of the yfs-rxgk security class (security index 6) to support GSSAPI-negotiated security. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Chuck Lever <chuck.lever@oracle.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-9-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: Add YFS RxGK (GSSAPI) security classDavid Howells
Add support for the YFS-variant RxGK security class to support GSSAPI-derived authentication. This also allows the use of better crypto over the rxkad security class. The key payload is XDR encoded of the form: typedef int64_t opr_time; const AFSTOKEN_RK_TIX_MAX = 12000; /* Matches entry in rxkad.h */ struct token_rxkad { afs_int32 viceid; afs_int32 kvno; afs_int64 key; afs_int32 begintime; afs_int32 endtime; afs_int32 primary_flag; opaque ticket<AFSTOKEN_RK_TIX_MAX>; }; struct token_rxgk { opr_time begintime; opr_time endtime; afs_int64 level; afs_int64 lifetime; afs_int64 bytelife; afs_int64 enctype; opaque key<>; opaque ticket<>; }; const AFSTOKEN_UNION_NOAUTH = 0; const AFSTOKEN_UNION_KAD = 2; const AFSTOKEN_UNION_YFSGK = 6; union ktc_tokenUnion switch (afs_int32 type) { case AFSTOKEN_UNION_KAD: token_rxkad kad; case AFSTOKEN_UNION_YFSGK: token_rxgk gk; }; const AFSTOKEN_LENGTH_MAX = 16384; typedef opaque token_opaque<AFSTOKEN_LENGTH_MAX>; const AFSTOKEN_MAX = 8; const AFSTOKEN_CELL_MAX = 64; struct ktc_setTokenData { afs_int32 flags; string cell<AFSTOKEN_CELL_MAX>; token_opaque tokens<AFSTOKEN_MAX>; }; The parser for the basic token struct is already present, as is the rxkad token type. This adds a parser for the rxgk token type. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Chuck Lever <chuck.lever@oracle.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-7-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: Add the security index for yfs-rxgkDavid Howells
Add the security index and abort codes for the YFS variant of rxgk. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20250411095303.2316168-6-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: Allow CHALLENGEs to the passed to the app for a RESPONSEDavid Howells
Allow the app to request that CHALLENGEs be passed to it through an out-of-band queue that allows recvmsg() to pick it up so that the app can add data to it with sendmsg(). This will allow the application (AFS or userspace) to interact with the process if it wants to and put values into user-defined fields. This will be used by AFS when talking to a fileserver to supply that fileserver with a crypto key by which callback RPCs can be encrypted (ie. notifications from the fileserver to the client). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-5-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14rxrpc: Pull out certain app callback funcs into an ops tableDavid Howells
A number of functions separately furnish an AF_RXRPC socket with callback function pointers into a kernel app (such as the AFS filesystem) that is using it. Replace most of these with an ops table for the entire socket. This makes it easier to add more callback functions. Note that the call incoming data processing callback is retaind as that gets set to different things, depending on the type of op. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Simon Horman <horms@kernel.org> cc: linux-afs@lists.infradead.org Link: https://patch.msgid.link/20250411095303.2316168-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: stmmac: remove eee_usecs_rateRussell King (Oracle)
plat_dat->eee_users_rate is now unused, so remove this member. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3Vuv-000E7y-9k@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: Remove ->exit_batch_rtnl().Kuniyuki Iwashima
There are no ->exit_batch_rtnl() users remaining. Let's remove the hook. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20250411205258.63164-15-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14ipv4: ip_tunnel: Convert ip_tunnel_delete_nets() callers to ->exit_rtnl().Kuniyuki Iwashima
ip_tunnel_delete_nets() iterates the dying netns list and performs the same operations for each. Let's export ip_tunnel_destroy() as ip_tunnel_delete_net() and call it from ->exit_rtnl(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20250411205258.63164-7-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: Add ->exit_rtnl() hook to struct pernet_operations.Kuniyuki Iwashima
struct pernet_operations provides two batching hooks; ->exit_batch() and ->exit_batch_rtnl(). The batching variant is beneficial if ->exit() meets any of the following conditions: 1) ->exit() repeatedly acquires a global lock for each netns 2) ->exit() has a time-consuming operation that can be factored out (e.g. synchronize_rcu(), smp_mb(), etc) 3) ->exit() does not need to repeat the same iterations for each netns (e.g. inet_twsk_purge()) Currently, none of the ->exit_batch_rtnl() functions satisfy any of the above conditions because RTNL is factored out and held by the caller and all of these functions iterate over the dying netns list. Also, we want to hold per-netns RTNL there but avoid spreading __rtnl_net_lock() across multiple locations. Let's add ->exit_rtnl() hook and run it under __rtnl_net_lock(). The following patches will convert all ->exit_batch_rtnl() users to ->exit_rtnl(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20250411205258.63164-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: ethtool: fix get_ts_stats() documentationRussell King (Oracle)
Commit 0e9c127729be ("ethtool: add interface to read Tx hardware timestamping statistics") added documentation for timestamping statistics, but added the detailed explanation for this method to the get_ts_info() rather than get_ts_stats(). Move it to the correct entry. Cc: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u3MTz-000Crx-IW@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14page_pool: Track DMA-mapped pages and unmap them when destroying the poolToke Høiland-Jørgensen
When enabling DMA mapping in page_pool, pages are kept DMA mapped until they are released from the pool, to avoid the overhead of re-mapping the pages every time they are used. This causes resource leaks and/or crashes when there are pages still outstanding while the device is torn down, because page_pool will attempt an unmap through a non-existent DMA device on the subsequent page return. To fix this, implement a simple tracking of outstanding DMA-mapped pages in page pool using an xarray. This was first suggested by Mina[0], and turns out to be fairly straight forward: We simply store pointers to pages directly in the xarray with xa_alloc() when they are first DMA mapped, and remove them from the array on unmap. Then, when a page pool is torn down, it can simply walk the xarray and unmap all pages still present there before returning, which also allows us to get rid of the get/put_device() calls in page_pool. Using xa_cmpxchg(), no additional synchronisation is needed, as a page will only ever be unmapped once. To avoid having to walk the entire xarray on unmap to find the page reference, we stash the ID assigned by xa_alloc() into the page structure itself, using the upper bits of the pp_magic field. This requires a couple of defines to avoid conflicting with the POINTER_POISON_DELTA define, but this is all evaluated at compile-time, so does not affect run-time performance. The bitmap calculations in this patch gives the following number of bits for different architectures: - 23 bits on 32-bit architectures - 21 bits on PPC64 (because of the definition of ILLEGAL_POINTER_VALUE) - 32 bits on other 64-bit architectures Stashing a value into the unused bits of pp_magic does have the effect that it can make the value stored there lie outside the unmappable range (as governed by the mmap_min_addr sysctl), for architectures that don't define ILLEGAL_POINTER_VALUE. This means that if one of the pointers that is aliased to the pp_magic field (such as page->lru.next) is dereferenced while the page is owned by page_pool, that could lead to a dereference into userspace, which is a security concern. The risk of this is mitigated by the fact that (a) we always clear pp_magic before releasing a page from page_pool, and (b) this would need a use-after-free bug for struct page, which can have many other risks since page->lru.next is used as a generic list pointer in multiple places in the kernel. As such, with this patch we take the position that this risk is negligible in practice. For more discussion, see[1]. Since all the tracking added in this patch is performed on DMA map/unmap, no additional code is needed in the fast path, meaning the performance overhead of this tracking is negligible there. A micro-benchmark shows that the total overhead of the tracking itself is about 400 ns (39 cycles(tsc) 395.218 ns; sum for both map and unmap[2]). Since this cost is only paid on DMA map and unmap, it seems like an acceptable cost to fix the late unmap issue. Further optimisation can narrow the cases where this cost is paid (for instance by eliding the tracking when DMA map/unmap is a no-op). The extra memory needed to track the pages is neatly encapsulated inside xarray, which uses the 'struct xa_node' structure to track items. This structure is 576 bytes long, with slots for 64 items, meaning that a full node occurs only 9 bytes of overhead per slot it tracks (in practice, it probably won't be this efficient, but in any case it should be an acceptable overhead). [0] https://lore.kernel.org/all/CAHS8izPg7B5DwKfSuzz-iOop_YRbk3Sd6Y4rX7KBG9DcVJcyWg@mail.gmail.com/ [1] https://lore.kernel.org/r/20250320023202.GA25514@openwall.com [2] https://lore.kernel.org/r/ae07144c-9295-4c9d-a400-153bb689fe9e@huawei.com Reported-by: Yonglong Liu <liuyonglong@huawei.com> Closes: https://lore.kernel.org/r/8743264a-9700-4227-a556-5f931c720211@huawei.com Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code") Suggested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Jesper Dangaard Brouer <hawk@kernel.org> Tested-by: Jesper Dangaard Brouer <hawk@kernel.org> Tested-by: Qiuling Ren <qren@redhat.com> Tested-by: Yuying Ma <yuma@redhat.com> Tested-by: Yonglong Liu <liuyonglong@huawei.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409-page-pool-track-dma-v9-2-6a9ef2e0cba8@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14page_pool: Move pp_magic check into helper functionsToke Høiland-Jørgensen
Since we are about to stash some more information into the pp_magic field, let's move the magic signature checks into a pair of helper functions so it can be changed in one place. Reviewed-by: Mina Almasry <almasrymina@google.com> Tested-by: Yonglong Liu <liuyonglong@huawei.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409-page-pool-track-dma-v9-1-6a9ef2e0cba8@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: bridge: Add offload_fail_notification boptJoseph Huang
Add BR_BOOLOPT_MDB_OFFLOAD_FAIL_NOTIFICATION bool option. Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250411150323.1117797-3-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: bridge: mcast: Add offload failed mdb flagJoseph Huang
Add MDB_FLAGS_OFFLOAD_FAILED and MDB_PG_FLAGS_OFFLOAD_FAILED to indicate that an attempt to offload the MDB entry to switchdev has failed. Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250411150323.1117797-2-Joseph.Huang@garmin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14net: convert dev->rtnl_link_state to a boolJakub Kicinski
netdevice reg_state was split into two 16 bit enums back in 2010 in commit a2835763e130 ("rtnetlink: handle rtnl_link netlink notifications manually"). Since the split the fields have been moved apart, and last year we converted reg_state to a normal u8 in commit 4d42b37def70 ("net: convert dev->reg_state to u8"). rtnl_link_state being a 16 bitfield makes no sense. Convert it to a single bool, it seems very unlikely after 15 years that we'll need more values in it. We could drop dev->rtnl_link_ops from the conditions but feels like having it there more clearly points at the reason for this hack. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250410014246.780885-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14udp: properly deal with xfrm encap and ADDRFORMPaolo Abeni
UDP GRO accounting assumes that the GRO receive callback is always set when the UDP tunnel is enabled, but syzkaller proved otherwise, leading tot the following splat: WARNING: CPU: 0 PID: 5837 at net/ipv4/udp_offload.c:123 udp_tunnel_update_gro_rcv+0x28d/0x4c0 net/ipv4/udp_offload.c:123 Modules linked in: CPU: 0 UID: 0 PID: 5837 Comm: syz-executor850 Not tainted 6.14.0-syzkaller-13320-g420aabef3ab5 #0 PREEMPT(full) Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 RIP: 0010:udp_tunnel_update_gro_rcv+0x28d/0x4c0 net/ipv4/udp_offload.c:123 Code: 00 00 e8 c6 5a 2f f7 48 c1 e5 04 48 8d b5 20 53 c7 9a ba 10 00 00 00 4c 89 ff e8 ce 87 99 f7 e9 ce 00 00 00 e8 a4 5a 2f f7 90 <0f> 0b 90 e9 de fd ff ff bf 01 00 00 00 89 ee e8 cf 5e 2f f7 85 ed RSP: 0018:ffffc90003effa88 EFLAGS: 00010293 RAX: ffffffff8a93fc9c RBX: 0000000000000000 RCX: ffff8880306f9e00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffffffff8a93fabe R09: 1ffffffff20bfb2e R10: dffffc0000000000 R11: fffffbfff20bfb2f R12: ffff88814ef21738 R13: dffffc0000000000 R14: ffff88814ef21778 R15: 1ffff11029de42ef FS: 0000000000000000(0000) GS:ffff888124f96000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f04eec760d0 CR3: 000000000eb38000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> udp_tunnel_cleanup_gro include/net/udp_tunnel.h:205 [inline] udpv6_destroy_sock+0x212/0x270 net/ipv6/udp.c:1829 sk_common_release+0x71/0x2e0 net/core/sock.c:3896 inet_release+0x17d/0x200 net/ipv4/af_inet.c:435 __sock_release net/socket.c:647 [inline] sock_close+0xbc/0x240 net/socket.c:1391 __fput+0x3e9/0x9f0 fs/file_table.c:465 task_work_run+0x251/0x310 kernel/task_work.c:227 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0xa11/0x27f0 kernel/exit.c:953 do_group_exit+0x207/0x2c0 kernel/exit.c:1102 __do_sys_exit_group kernel/exit.c:1113 [inline] __se_sys_exit_group kernel/exit.c:1111 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1111 x64_sys_call+0x26c3/0x26d0 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f04eebfac79 Code: Unable to access opcode bytes at 0x7f04eebfac4f. RSP: 002b:00007fffdcaa34a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04eebfac79 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 RBP: 00007f04eec75270 R08: ffffffffffffffb8 R09: 00007fffdcaa36c8 R10: 0000200000000000 R11: 0000000000000246 R12: 00007f04eec75270 R13: 0000000000000000 R14: 00007f04eec75cc0 R15: 00007f04eebcca70 Address the issue moving the accounting hook into setup_udp_tunnel_sock() and set_xfrm_gro_udp_encap_rcv(), where the GRO callback is actually set. set_xfrm_gro_udp_encap_rcv() is prone to races with IPV6_ADDRFORM, run the relevant setsockopt under the socket lock to ensure using consistent values of sk_family and up->encap_type. Refactor the GRO callback selection code, to make it clear that the function pointer is always initialized. Reported-by: syzbot+8c469a2260132cd095c1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8c469a2260132cd095c1 Fixes: 172bf009c18d ("xfrm: Support GRO for IPv4 ESP in UDP encapsulation") Fixes: 5d7f5b2f6b935 ("udp_tunnel: use static call for GRO hooks when possible") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/92bcdb6899145a9a387c8fa9e3ca656642a43634.1744228733.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-15Merge tag 'drm-intel-next-2025-04-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Cross-subsystem Changes: - Update GVT MAINTAINERS (Jani) Driver Changes: - Updates for xe3lpd display (Gustavo) - Fix link training interrupted by HPD pulse (Imre) - Watermark bound checks for DSC (Ankit) - VRR Refactor and other fixes and improvements (Ankit) - More conversions towards intel_display struct (Gustavo, Jani) - Other clean-up patches towards a display separation (Jani) - Maintain asciibetical order for HAS_* macros (Ankit) - Fixes around probe/initialization (Janusz) - Fix build and doc build issue (Yue, Rodrigo) - DSI related fixes (Suraj, William, Jani) - Improve DC6 entry counter (Mohammed) - Fix xe2hpd memory type identification (Vivek) - PSR related fixes and improvements (Animesh, Jouni) - DP MST related fixes and improvements (Imre) - Fix scanline_offset for LNL+/BMG+ (Ville) - Some gvt related fixes and changes (Ville, Jani) - Some PLL code adjustment (Ville) - Display wa addition (Vinod) - DRAM type logging (Lucas) - Pimp the initial FB readout (Ville) - Some sagv/bw cleanup (Ville) - Remove i915_display_capabilities debugfs entry (Jani) - Move PCH type to display caps debugfs entry (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/Z_kTqPX5Mjruq1pL@intel.com
2025-04-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix hang in bnxt_re due to miscomputing the budget - Avoid a -Wformat-security message in dev_set_name() - Avoid an unused definition warning in fs.c with some kconfigs - Fix error handling in usnic and remove IS_ERR_OR_NULL() usage - Regression in RXE support foudn by blktests due to missing ODP exclusions - Set the dma_segment_size on HNS so it doesn't corrupt DMA when using very large IOs - Move a INIT_WORK to near when the work is allocated in cm.c to fix a racey crash where work in progress was being init'd - Use __GFP_NOWARN to not dump in kvcalloc() if userspace requests a very big MR * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Remove unusable nq variable RDMA/core: Silence oversized kvmalloc() warning RDMA/cma: Fix workqueue crash in cma_netevent_work_handler RDMA/hns: Fix wrong maximum DMA segment size RDMA/rxe: Fix null pointer dereference in ODP MR check RDMA/mlx5: Fix compilation warning when USER_ACCESS isn't set RDMA/usnic: Fix passing zero to PTR_ERR in usnic_ib_pci_probe() RDMA/ucaps: Avoid format-security warning RDMA/bnxt_re: Fix budget handling of notification queue