summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2 daysnet: phy: bcm84881: provide phy_query_inband() implementationRussell King (Oracle)
BCM84881 has no support for inband signalling, so this is a trivial implementation that returns no support for inband. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: phylink: add pcs_query_inband()Russell King (Oracle)
Add a pcs_query_inband() interface which reflects phy_query_inband() for PHYs. This can be used to determine for the specified interface mode whether in-band signalling is supported by the PCS, and whether the PCS requires in-band signalling. This is used to determine whether we should use inband autonegotiation in inband mode, which may be required or may be unsupported in various interface modes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: phy: add phy_query_inband()Russell King (Oracle)
Add a method to query the PHY's in-band capabilities for a PHY interface mode. This can be used to determine for the specified interface mode whether in-band signalling is supported, and whether the PHY requires in-band signalling. When not implemented, or the PHY driver doesn't report any modes for the interface, LINK_INBAND_VALID will not be set. When set, the remainder of the flags can be interpreted. LINK_INBAND_POSSIBLE means that the device can be configured to use or not use in-band signalling. Later patches may add support to configure this at the PHY. LINK_INBAND_REQUIRED means that the device uses in-band signalling which can not be disabled. When only LINK_INBAND_VALID has been set, this means that the device does not support any in-band signalling, and can't be configured to do so. "Bypass" mode (where the device may be configured for in-band, but may still bring the link up if there is no in-band received from the link partner) is not considered in this patch. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: phylink allow EEE management for SFPs without PHYsRussell King (Oracle)
Allow EEE management for SFPs without accessible PHYs. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: dsa: mt7530: convert to phylink managed EEERussell King (Oracle)
Fixme: doesn't bit 25 and 26 also need to be set in the PMCR for PMCR_FORCE_EEE100 and PMCR_FORCE_EEE1G to take effect? Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: dsa: add support for phylink managed EEERussell King (Oracle)
Add support to allow DSA drivers to use phylink managed EEE, with only needing support for controlling the LPI state. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: dsa: mv88e6xxx: add EEE controlsRussell King (Oracle)
Add phylink EEE control methods to allow EEE to be configured. When LPI is to be disabled, we force the port to have EEE disabled, but when enabling EEE, if the port is under the control of the PPU, we stop forcing it, otherwise we force-enable EEE. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: dsa: mv88e6xxx: add support for EEE forcingRussell King (Oracle)
Add support for EEE forcing using the MAC control register. Replace the 88e6393x errata 4.5 EEE disable code with a call to the new EEE forcing code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: dsa: mv88e6xxx: add port_set_eee() methodRussell King (Oracle)
Add a port_set_eee() method to allow the EEE settings for a port to be configured. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: mvpp2: add EEE implementationRussell King (Oracle)
Add EEE support for mvpp2, using phylink's EEE implementation, which means we just need to implement the two methods for LPI control, and with the initial configuration. Only the GMAC is supported, so only 100M, 1G and 2.5G speeds. Disabling LPI requires clearing a single bit. Enabling LPI needs a full configuration of several values, as the timer values are dependent on the MAC operating speed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: mvneta: convert to phylink EEE implementationRussell King (Oracle)
Convert mvneta to use phylink's EEE implementation, which means we just need to implement the two methods for LPI control, and adding the initial configuration. Disabling LPI requires clearing a single bit. Enabling LPI needs a full configuration of several values, as the timer values are dependent on the MAC operating speed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: phylink: add EEE managementRussell King (Oracle)
Add EEE management to phylink. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 daysnet: add helpers for EEE configurationRussell King (Oracle)
Add helpers that phylib and phylink can use to manage EEE configuration and determine whether the MAC should be permitted to use LPI based on that configuration. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: felix: provide own phylink MAC operationsRussell King (Oracle)
Convert felix to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: rzn1_a5psw: provide own phylink MAC operationsnet-nextRussell King (Oracle)
Convert rzn1_a5psw to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. We need to provide a stub for the mac_config() method which is mandatory. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: lan9303: provide own phylink MAC operationsRussell King (Oracle)
Convert lan9303 to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. We need to provide stubs for the mac_link_down() and mac_config() methods which are mandatory. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: xrs700x: provide own phylink MAC operationsRussell King (Oracle)
Convert xrs700x to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. We need to provide stubs for the mac_link_down() and mac_config() methods which are mandatory. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: bcm_sf2: provide own phylink MAC operationsRussell King (Oracle)
Convert bcm_sf2 to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: lantiq_gswip: provide own phylink MAC operationsnet-mergedRussell King (Oracle)
Convert lantiq_gswip to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. For lantiq_gswip, it means we end up with a common instance of phylink MAC operations that are shared between the different variants, rather than having duplicated initialisers in dsa_switch_ops. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: qca8k: provide own phylink MAC operationsRussell King (Oracle)
Convert qca8k to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: ar9331: provide own phylink MAC operationsRussell King (Oracle)
Convert ar9331 to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
3 daysnet: dsa: sja1105: provide own phylink MAC operationsRussell King (Oracle)
Convert sja1105 to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
6 daysnet: dsa: convert dsa_user_phylink_fixed_state() to use dsa_phylink_to_port()Russell King (Oracle)
Convert dsa_user_phylink_fixed_state() to use the newly introduced dsa_phylink_to_port() helper. Suggested-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
8 daysnet: dsa: mv88e6xxx: provide own phylink MAC operationsRussell King (Oracle)
Convert mv88e6xxx to provide its own phylink MAC operations, thus avoiding the shim layer in DSA's port.c Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
8 daysnet: dsa: allow DSA switch drivers to provide their own phylink mac opsRussell King (Oracle)
Rather than having a shim for each and every phylink MAC operation, allow DSA switch drivers to provide their own ops structure. When a DSA driver provides the phylink MAC operations, the shimmed ops must not be provided, so fail an attempt to register a switch with both the phylink_mac_ops in struct dsa_switch and the phylink_mac_* operations populated in dsa_switch_ops populated. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
8 daysnet: dsa: introduce dsa_phylink_to_port()Russell King (Oracle)
We convert from a phylink_config struct to a dsa_port struct in many places, let's provide a helper for this. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-04-04net: pcs: rzn1-miic: update PCS driver to use neg_modeRussell King (Oracle)
Update the RZN1-MIIC PCS driver to use neg_mode rather than the mode argument to match the other updated PCS drivers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-04-03net: dsa: mv88e6xxx: update 88e6185 PCS driver to use neg_modeRussell King (Oracle)
Update the Marvell 88e6185 PCS driver to use neg_mode rather than the mode argument to match the other updated PCS drivers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: phy: marvell: add comment about m88e1111_config_init_1000basex()Russell King (Oracle)
The comment in m88e1111_config_init_1000basex() is wrong - it claims that Autoneg will be enabled, but this doesn't actually happen. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: dsa: b53: remove eee_enabled/eee_active in b53_get_mac_eee()Russell King (Oracle)
b53_get_mac_eee() sets both eee_enabled and eee_active, and then returns zero. dsa_user_get_eee(), which calls this function, will then continue to call phylink_ethtool_get_eee(), which will return -EOPNOTSUPP if there is no PHY present, otherwise calling phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Thus, when there is no PHY, dsa_user_get_eee() will fail with -EOPNOTSUPP, meaning eee_enabled and eee_active will not be returned to userspace. When there is a PHY, eee_enabled and eee_active will be overwritten by phylib, making the setting of these members in b53_get_mac_eee() entirely unnecessary. Remove this code, thus simplifying b53_get_mac_eee(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: bcmasp: remove eee_enabled/eee_active in bcmasp_get_eee()Russell King (Oracle)
bcmasp_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in bcmasp_get_eee() is redundant, and can be removed. This also makes intf->eee.eee_active unnecessary, so remove this and use a local variable where appropriate. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: bcmgenet: remove eee_enabled/eee_active in bcmgenet_get_eee()Russell King (Oracle)
bcmgenet_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in bcmgenet_get_eee() is redundant, and can be removed. This also makes priv->eee.eee_active unnecessary, so remove this and use a local variable where appropriate. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: fec: remove eee_enabled/eee_active in fec_enet_get_eee()Russell King (Oracle)
fec_enet_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in fec_enet_get_eee() is redundant. Remove this, and remove the setting of fep->eee.eee_active member which becomes a write-only variable. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: sxgbe: remove eee_enabled/eee_active in sxgbe_get_eee()Russell King (Oracle)
sxgbe_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in sxgbe_get_eee() is redundant. Remove this, and remove the priv->eee_active member which then becomes a write-only variable. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: stmmac: remove eee_enabled/eee_active in stmmac_ethtool_op_get_eee()Russell King (Oracle)
stmmac_ethtool_op_get_eee() sets both eee_enabled and eee_active, and then goes on to call phylink_ethtool_get_eee(). phylink_ethtool_get_eee() will return -EOPNOTSUPP if there is no PHY present, otherwise calling phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Thus, when there is no PHY, stmmac_ethtool_op_get_eee() will fail with -EOPNOTSUPP, meaning eee_enabled and eee_active will not be returned to userspace. When there is a PHY, eee_enabled and eee_active will be overwritten by phylib, making the setting of these members in stmmac_ethtool_op_get_eee() entirely unnecessary. Remove this code, thus simplifying stmmac_ethtool_op_get_eee(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-26net: phy: constify phydev->drvRussell King (Oracle)
Device driver structures are shared between all devices that they match, and thus nothing should never write to the device driver structure through the phydev->drv pointer. Let's make this pointer const to catch code that attempts to do so. Suggested-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-03-10Linux 6.8Linus Torvalds
2024-03-10Merge tag 'trace-ring-buffer-v6.8-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Do not allow large strings (> 4096) as single write to trace_marker The size of a string written into trace_marker was determined by the size of the sub-buffer in the ring buffer. That size is dependent on the PAGE_SIZE of the architecture as it can be mapped into user space. But on PowerPC, where PAGE_SIZE is 64K, that made the limit of the string of writing into trace_marker 64K. One of the selftests looks at the size of the ring buffer sub-buffers and writes that plus more into the trace_marker. The write will take what it can and report back what it consumed so that the user space application (like echo) will write the rest of the string. The string is stored in the ring buffer and can be read via the "trace" or "trace_pipe" files. The reading of the ring buffer uses vsnprintf(), which uses a precision "%.*s" to make sure it only reads what is stored in the buffer, as a bug could cause the string to be non terminated. With the combination of the precision change and the PAGE_SIZE of 64K allowing huge strings to be added into the ring buffer, plus the test that would actually stress that limit, a bug was reported that the precision used was too big for "%.*s" as the string was close to 64K in size and the max precision of vsnprintf is 32K. Linus suggested not to have that precision as it could hide a bug if the string was again stored without a nul byte. Another issue that was brought up is that the trace_seq buffer is also based on PAGE_SIZE even though it is not tied to the architecture limit like the ring buffer sub-buffer is. Having it be 64K * 2 is simply just too big and wasting memory on systems with 64K page sizes. It is now hardcoded to 8K which is what all other architectures with 4K PAGE_SIZE has. Finally, the write to trace_marker is now limited to 4K as there is no reason to write larger strings into trace_marker. - ring_buffer_wait() should not loop. The ring_buffer_wait() does not have the full context (yet) on if it should loop or not. Just exit the loop as soon as its woken up and let the callers decide to loop or not (they already do, so it's a bit redundant). - Fix shortest_full field to be the smallest amount in the ring buffer that a waiter is waiting for. The "shortest_full" field is updated when a new waiter comes in and wants to wait for a smaller amount of data in the ring buffer than other waiters. But after all waiters are woken up, it's not reset, so if another waiter comes in wanting to wait for more data, it will be woken up when the ring buffer has a smaller amount from what the previous waiters were waiting for. - The wake up all waiters on close is incorrectly called frome .release() and not from .flush() so it will never wake up any waiters as the .release() will not get called until all .read() calls are finished. And the wakeup is for the waiters in those .read() calls. * tag 'trace-ring-buffer-v6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Use .flush() call to wake up readers ring-buffer: Fix resetting of shortest_full ring-buffer: Fix waking up ring buffer readers tracing: Limit trace_marker writes to just 4K tracing: Limit trace_seq size to just 8K and not depend on architecture PAGE_SIZE tracing: Remove precision vsnprintf() check from print event
2024-03-10Merge tag 'phy-fixes3-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - fixes for Qualcomm qmp-combo driver for ordering of drm and type-c switch registartion due to drivers might not probe defer after having registered child devices to avoid triggering a probe deferral loop. This fixes internal display on Lenovo ThinkPad X13s * tag 'phy-fixes3-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom-qmp-combo: fix type-c switch registration phy: qcom-qmp-combo: fix drm bridge registration
2024-03-10tracing: Use .flush() call to wake up readersSteven Rostedt (Google)
The .release() function does not get called until all readers of a file descriptor are finished. If a thread is blocked on reading a file descriptor in ring_buffer_wait(), and another thread closes the file descriptor, it will not wake up the other thread as ring_buffer_wake_waiters() is called by .release(), and that will not get called until the .read() is finished. The issue originally showed up in trace-cmd, but the readers are actually other processes with their own file descriptors. So calling close() would wake up the other tasks because they are blocked on another descriptor then the one that was closed(). But there's other wake ups that solve that issue. When a thread is blocked on a read, it can still hang even when another thread closed its descriptor. This is what the .flush() callback is for. Have the .flush() wake up the readers. Link: https://lore.kernel.org/linux-trace-kernel/20240308202432.107909457@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: f3ddb74ad0790 ("tracing: Wake up ring buffer waiters on closing of the file") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-10ring-buffer: Fix resetting of shortest_fullSteven Rostedt (Google)
The "shortest_full" variable is used to keep track of the waiter that is waiting for the smallest amount on the ring buffer before being woken up. When a tasks waits on the ring buffer, it passes in a "full" value that is a percentage. 0 means wake up on any data. 1-100 means wake up from 1% to 100% full buffer. As all waiters are on the same wait queue, the wake up happens for the waiter with the smallest percentage. The problem is that the smallest_full on the cpu_buffer that stores the smallest amount doesn't get reset when all the waiters are woken up. It does get reset when the ring buffer is reset (echo > /sys/kernel/tracing/trace). This means that tasks may be woken up more often then when they want to be. Instead, have the shortest_full field get reset just before waking up all the tasks. If the tasks wait again, they will update the shortest_full before sleeping. Also add locking around setting of shortest_full in the poll logic, and change "work" to "rbwork" to match the variable name for rb_irq_work structures that are used in other places. Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.948914369@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: 2c2b0a78b3739 ("ring-buffer: Add percentage of ring buffer full to wake up reader") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating an inconsistent ABI (KVM_MEM_GUEST_MEMFD is not writable from userspace, so there would be no way to write to a read-only guest_memfd). - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely for development and testing. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD dirty logging test that caused false passes. x86 fixes: - Fix missing marking of a guest page as dirty when emulating an atomic access. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention with preemptible kernels (including CONFIG_PREEMPT_DYNAMIC in non-preemptible mode). - Disable AMD DebugSwap by default, it breaks VMSA signing and will be re-enabled with a better VM creation API in 6.10. - Do the cache flush of converted pages in svm_register_enc_region() before dropping kvm->lock, to avoid a race with unregistering of the same region and the consequent use-after-free issue" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: SEV: disable SEV-ES DebugSwap by default KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing KVM: SVM: Flush pages under kvm->lock to fix UAF in svm_register_enc_region() KVM: selftests: Add a testcase to verify GUEST_MEMFD and READONLY are exclusive KVM: selftests: Create GUEST_MEMFD for relevant invalid flags testcases KVM: x86/mmu: Restrict KVM_SW_PROTECTED_VM to the TDP MMU KVM: x86: Update KVM_SW_PROTECTED_VM docs to make it clear they're a WIP KVM: Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY KVM: x86: Mark target gfn of emulated atomic instruction as dirty
2024-03-10ring-buffer: Fix waking up ring buffer readersSteven Rostedt (Google)
A task can wait on a ring buffer for when it fills up to a specific watermark. The writer will check the minimum watermark that waiters are waiting for and if the ring buffer is past that, it will wake up all the waiters. The waiters are in a wait loop, and will first check if a signal is pending and then check if the ring buffer is at the desired level where it should break out of the loop. If a file that uses a ring buffer closes, and there's threads waiting on the ring buffer, it needs to wake up those threads. To do this, a "wait_index" was used. Before entering the wait loop, the waiter will read the wait_index. On wakeup, it will check if the wait_index is different than when it entered the loop, and will exit the loop if it is. The waker will only need to update the wait_index before waking up the waiters. This had a couple of bugs. One trivial one and one broken by design. The trivial bug was that the waiter checked the wait_index after the schedule() call. It had to be checked between the prepare_to_wait() and the schedule() which it was not. The main bug is that the first check to set the default wait_index will always be outside the prepare_to_wait() and the schedule(). That's because the ring_buffer_wait() doesn't have enough context to know if it should break out of the loop. The loop itself is not needed, because all the callers to the ring_buffer_wait() also has their own loop, as the callers have a better sense of what the context is to decide whether to break out of the loop or not. Just have the ring_buffer_wait() block once, and if it gets woken up, exit the function and let the callers decide what to do next. Link: https://lore.kernel.org/all/CAHk-=whs5MdtNjzFkTyaUy=vHi=qwWgPi0JgTe6OYUYMNSRZfg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240308202431.792933613@goodmis.org Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linke li <lilinke99@qq.com> Cc: Rabin Vincent <rabin@rab.in> Fixes: e30f53aad2202 ("tracing: Do not busy wait in buffer splice") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-09Merge tag 'i2c-for-6.8-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two patches from Heiner for the i801 are targeting muxes discovered while working on some other features. Essentially, there is a reordering when adding optional slaves and proper cleanup upon registering a mux device. Christophe fixes the exit path in the wmt driver that was leaving the clocks hanging, and the last fix from Tommy avoids false error reports in IRQ" * tag 'i2c-for-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: aspeed: Fix the dummy irq expected print i2c: wmt: Fix an error handling path in wmt_i2c_probe() i2c: i801: Avoid potential double call to gpiod_remove_lookup_table i2c: i801: Fix using mux_pdev before it's set
2024-03-09Merge tag 'firewire-fixes-6.8-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "A fix to suppress a warning about unreleased IRQ for 1394 OHCI hardware when disabling MSI. In Linux kernel v6.5, a PCI driver for 1394 OHCI hardware was optimized into the managed device resources. Edmund Raile points out that the change brings the warning about unreleased IRQ at the call of pci_disable_msi(), since the API expects that the relevant IRQ has already been released in advance. As long as the API is called in .remove callback of PCI device operation, it is prohibited to maintain the IRQ as the part of managed device resource. As a workaround, the IRQ is explicitly released at .remove callback, before the call of pci_disable_msi(). pci_disable_msi() is legacy API nowadays in PCI MSI implementation. I have a plan to replace it with the modern API in the development for the future version of Linux kernel. So at present I keep them as is" * tag 'firewire-fixes-6.8-final' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: ohci: prevent leak of left-over IRQ on unbind
2024-03-09SEV: disable SEV-ES DebugSwap by defaultPaolo Bonzini
The DebugSwap feature of SEV-ES provides a way for confidential guests to use data breakpoints. However, because the status of the DebugSwap feature is recorded in the VMSA, enabling it by default invalidates the attestation signatures. In 6.10 we will introduce a new API to create SEV VMs that will allow enabling DebugSwap based on what the user tells KVM to do. Contextually, we will change the legacy KVM_SEV_ES_INIT API to never enable DebugSwap. For compatibility with kernels that pre-date the introduction of DebugSwap, as well as with those where KVM_SEV_ES_INIT will never enable it, do not enable the feature by default. If anybody wants to use it, for now they can enable the sev_es_debug_swap_enabled module parameter, but this will result in a warning. Fixes: d1f85fbe836e ("KVM: SEV: Enable data breakpoints in SEV-ES") Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-03-09Merge tag 'kvm-x86-guest_memfd_fixes-6.8' of ↵Paolo Bonzini
https://github.com/kvm-x86/linux into HEAD KVM GUEST_MEMFD fixes for 6.8: - Make KVM_MEM_GUEST_MEMFD mutually exclusive with KVM_MEM_READONLY to avoid creating ABI that KVM can't sanely support. - Update documentation for KVM_SW_PROTECTED_VM to make it abundantly clear that such VMs are purely a development and testing vehicle, and come with zero guarantees. - Limit KVM_SW_PROTECTED_VM guests to the TDP MMU, as the long term plan is to support confidential VMs with deterministic private memory (SNP and TDX) only in the TDP MMU. - Fix a bug in a GUEST_MEMFD negative test that resulted in false passes when verifying that KVM_MEM_GUEST_MEMFD memslots can't be dirty logged.
2024-03-09Merge tag 'kvm-x86-fixes-6.8-2' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 fixes for 6.8, round 2: - When emulating an atomic access, mark the gfn as dirty in the memslot to fix a bug where KVM could fail to mark the slot as dirty during live migration, ultimately resulting in guest data corruption due to a dirty page not being re-copied from the source to the target. - Check for mmu_notifier invalidation events before faulting in the pfn, and before acquiring mmu_lock, to avoid unnecessary work and lock contention. Contending mmu_lock is especially problematic on preemptible kernels, as KVM may yield mmu_lock in response to the contention, which severely degrades overall performance due to vCPUs making it difficult for the task that triggered invalidation to make forward progress. Note, due to another kernel bug, this fix isn't limited to preemtible kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y will yield contended rwlocks and spinlocks. https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com
2024-03-08Merge tag 'ceph-for-6.8-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A follow-up for sparse read fixes that went into -rc4 -- msgr2 case was missed and is corrected here" * tag 'ceph-for-6.8-rc8' of https://github.com/ceph/ceph-client: libceph: init the cursor when preparing sparse read in msgr2
2024-03-08Merge tag 'char-misc-6.8-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few small char/misc and other driver subsystem fixes for reported issues that have been in my tree. Included in here are fixes for: - iio driver fixes for reported problems - much reported bugfix for a lis3lv02d_i2c regression - comedi driver bugfix - mei new device ids - mei driver fixes - counter core fix All of these have been in linux-next with no reported issues, some for many weeks" * tag 'char-misc-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: gsc_proxy: match component when GSC is on different bus misc: fastrpc: Pass proper arguments to scm call comedi: comedi_test: Prevent timers rescheduling during deletion comedi: comedi_8255: Correct error in subdevice initialization misc: lis3lv02d_i2c: Fix regulators getting en-/dis-abled twice on suspend/resume iio: accel: adxl367: fix I2C FIFO data register iio: accel: adxl367: fix DEVID read after reset iio: pressure: dlhl60d: Initialize empty DLH bytes iio: imu: inv_mpu6050: fix frequency setting when chip is off iio: pressure: Fixes BMP38x and BMP390 SPI support iio: imu: inv_mpu6050: fix FIFO parsing when empty mei: Add Meteor Lake support for IVSC device mei: me: add arrow lake point H DID mei: me: add arrow lake point S DID counter: fix privdata alignment