summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-11f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percentDaeho Jeong
Added control knobs for gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: do FG_GC when GC boosting is required for zoned devicesDaeho Jeong
Under low free section count, we need to use FG_GC instead of BG_GC to recover free sections. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: increase BG GC migration window granularity when boosted for zoned devicesDaeho Jeong
Need bigger BG GC migration window granularity when free section is running low. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: add reserved_segments sysfs nodeDaeho Jeong
For the fine tuning of GC behavior, add reserved_segments sysfs node. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: introduce migration_window_granularityDaeho Jeong
We can control the scanning window granularity for GC migration. For more frequent scanning and GC on zoned devices, we need a fine grained control knob for it. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: make BG GC more aggressive for zoned devicesDaeho Jeong
Since we don't have any GC on device side for zoned devices, need more aggressive BG GC. So, tune the parameters for that. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: avoid unused block when dio write in LFS modeDaejun Park
This patch addresses the problem that when using LFS mode, unused blocks may occur in f2fs_map_blocks() during block allocation for dio writes. If a new section is allocated during block allocation, it will not be included in the map struct by map_is_mergeable() if the LBA of the allocated block is not contiguous. However, the block already allocated in this process will remain unused due to the LFS mode. This patch avoids the possibility of unused blocks by escaping f2fs_map_blocks() when allocating the last block in a section. Signed-off-by: Daejun Park <daejun7.park@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: fix to check atomic_file in f2fs ioctl interfacesChao Yu
Some f2fs ioctl interfaces like f2fs_ioc_set_pin_file(), f2fs_move_file_range(), and f2fs_defragment_range() missed to check atomic_write status, which may cause potential race issue, fix it. Cc: stable@vger.kernel.org Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: get rid of online repaire on corrupted directoryChao Yu
syzbot reports a f2fs bug as below: kernel BUG at fs/f2fs/inode.c:896! RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896 Call Trace: evict+0x532/0x950 fs/inode.c:704 dispose_list fs/inode.c:747 [inline] evict_inodes+0x5f9/0x690 fs/inode.c:797 generic_shutdown_super+0x9d/0x2d0 fs/super.c:627 kill_block_super+0x44/0x90 fs/super.c:1696 kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898 deactivate_locked_super+0xc4/0x130 fs/super.c:473 cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373 task_work_run+0x24f/0x310 kernel/task_work.c:228 ptrace_notify+0x2d2/0x380 kernel/signal.c:2402 ptrace_report_syscall include/linux/ptrace.h:415 [inline] ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline] syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173 syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline] syscall_exit_to_user_mode+0x279/0x370 kernel/entry/common.c:218 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896 Online repaire on corrupted directory in f2fs_lookup() can generate dirty data/meta while racing w/ readonly remount, it may leave dirty inode after filesystem becomes readonly, however, checkpoint() will skips flushing dirty inode in a state of readonly mode, result in above panic. Let's get rid of online repaire in f2fs_lookup(), and leave the work to fsck.f2fs. Fixes: 510022a85839 ("f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries") Reported-by: syzbot+ebea2790904673d7c618@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000a7b20f061ff2d56a@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11f2fs: prevent atomic file from being dirtied before commitDaeho Jeong
Keep atomic file clean while updating and make it dirtied during commit in order to avoid unnecessary and excessive inode updates in the previous fix. Fixes: 4bf78322346f ("f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag") Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-10drivers: perf: Fix smp_processor_id() use in preemptible codeAlexandre Ghiti
As reported in [1], the use of smp_processor_id() in pmu_sbi_device_probe() must be protected by disabling the preemption, so simple use get_cpu()/put_cpu() instead. Reported-by: Nam Cao <namcao@linutronix.de> Closes: https://lore.kernel.org/linux-riscv/20240820074925.ReMKUPP3@linutronix.de/ [1] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Nam Cao <namcao@linutronix.de> Fixes: a8625217a054 ("drivers/perf: riscv: Implement SBI PMU snapshot function") Reported-by: Andrea Parri <parri.andrea@gmail.com> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20240826165210.124696-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-11Merge tag 'drm-misc-next-fixes-2024-09-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Short summary of fixes pull: tegra: - Fix uninitialized variable in EDID code Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240905113836.GA292407@linux.fritz.box
2024-09-11Merge tag 'exynos-drm-next-for-v6.12' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next Three cleanups - Drop stale exynos file pattern from MAINTAINERS file The old "exynos" directory is removed from MAINTAINERS as Samsung Exynos display bindings have been relocated. This resolves a warning from get_maintainers.pl about no files matching the outdated directory. - Constify struct exynos_drm_ipp_funcs By making struct exynos_drm_ipp_funcs constant, the patch enhances security by moving the structure to a read-only section of memory. This change results in a slight reduction in the data section size. - Remove unnecessary code The function exynos_atomic_commit is removed as it became redundant after a previous update. This cleans up the code and eliminates unused function declarations. One fixup - Fix wrong assignment in gsc_bind() A double assignment in gsc_bind() was flagged by the cocci tool and corrected to fix an incorrect assignment, addressing a potential issue introduced in a prior commit. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240909004641.406858-1-inki.dae@samsung.com
2024-09-10Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-09-09 (ice, igb) This series contains updates to ice and igb drivers. Martyna moves LLDP rule removal to the proper uninitialization function for ice. Jake corrects accounting logic for FWD_TO_VSI_LIST switch filters on ice. Przemek removes incorrect, explicit calls to pci_disable_device() for ice. Michal Schmidt stops incorrect use of VSI list for VLAN use on ice. Sriram Yagnaraman adjusts igb_xdp_ring_update_tail() to be called under Tx lock on igb. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igb: Always call igb_xdp_ring_update_tail() under Tx lock ice: fix VSI lists confusion when adding VLANs ice: stop calling pci_disable_device() as we use pcim ice: fix accounting for filters shared by multiple VSIs ice: Fix lldp packets dropping after changing the number of channels ==================== Link: https://patch.msgid.link/20240909203842.3109822-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge tag 'mlx5-fixes-2024-09-09' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2024-09-09 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Fix bridge mode operations when there are no VFs net/mlx5: Verify support for scheduling element and TSAR type net/mlx5: Add missing masks and QoS bit masks for scheduling elements net/mlx5: Explicitly set scheduling element and TSAR type net/mlx5e: Add missing link mode to ptys2ext_ethtool_map net/mlx5e: Add missing link modes to ptys2ethtool_map net/mlx5: Update the list of the PCI supported devices ==================== Link: https://patch.msgid.link/20240909194505.69715-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: support devlink subfunction Michal Swiatkowski says: Currently ice driver does not allow creating more than one networking device per physical function. The only way to have more hardware backed netdev is to use SR-IOV. Following patchset adds support for devlink port API. For each new pcisf type port, driver allocates new VSI, configures all resources needed, including dynamically MSIX vectors, program rules and registers new netdev. This series supports only one Tx/Rx queue pair per subfunction. Example commands: devlink port add pci/0000:31:00.1 flavour pcisf pfnum 1 sfnum 1000 devlink port function set pci/0000:31:00.1/1 hw_addr 00:00:00:00:03:14 devlink port function set pci/0000:31:00.1/1 state active devlink port function del pci/0000:31:00.1/1 Make the port representor and eswitch code generic to support subfunction representor type. VSI configuration is slightly different between VF and SF. It needs to be reflected in the code. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: subfunction activation and base devlink ops ice: basic support for VLAN in subfunctions ice: support subfunction devlink Tx topology ice: implement netdevice ops for SF representor ice: check if SF is ready in ethtool ops ice: don't set target VSI for subfunction ice: create port representor for SF ice: make representor code generic ice: implement netdev for subfunction ice: base subfunction aux driver ice: allocate devlink for subfunction ice: treat subfunction VSI the same as PF VSI ice: add basic devlink subfunctions support ice: export ice ndo_ops functions ice: add new VSI type for subfunctions ==================== Link: https://patch.msgid.link/20240906223010.2194591-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge tag 'mlx5-updates-2024-09-02' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2024-08-29 HW-Managed Flow Steering in mlx5 driver Yevgeny Kliteynik says: ======================= 1. Overview ----------- ConnectX devices support packet matching, modification, and redirection. This functionality is referred as Flow Steering. To configure a steering rule, the rule is written to the device-owned memory. This memory is accessed and cached by the device when processing a packet. The first implementation of Flow Steering was done in FW, and it is referred in the mlx5 driver as Device-Managed Flow Steering (DMFS). Later we introduced SW-managed Flow Steering (SWS or SMFS), where the driver is writing directly to the device's configuration memory (ICM) through RC QP using RDMA operations (RDMA-read and RDAM-write), thus achieving higher rates of rule insertion/deletion. Now we introduce a new flow steering implementation: HW-Managed Flow Steering (HWS or HMFS). In this new approach, the driver is configuring steering rules directly to the HW using the WQs with a special new type of WQE. This way we can reach higher rule insertion/deletion rate with much lower CPU utilization compared to SWS. The key benefits of HWS as opposed to SWS: + HW manages the steering decision tree - HW calculates CRC for each entry - HW handles tree hash collisions - HW & FW manage objects refcount + HW keeps cache coherency: - HW provides tree access locking and synchronization - HW provides notification on completion + Insertion rate isn’t affected by background traffic - Dedicated HW components that handle insertion 2. Performance -------------- Measuring Connection Tracking with simple IPv4 flows w/o NAT, we are able to get ~5 times more flows offloaded per second using HWS. 3. Configuration ---------------- The enablement of HWS mode in eswitch manager is done using the same devlink param that is already used for switching between FW-managed steering and SW-managed steering modes: # devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs 4. Upstream Submission ---------------------- HWS support consists of 3 main components: + Steering: - The lower layer that exposes HWS API to upper layers and implements all the management of flow steering building blocks + FS-Core - Implementation of fs_hws layer to enable fs_core to use HWS instead of FW or SW steering - Create HW steering action pools to utilize the ability of HWS to share steering actions among different rules - Add support for configuring HWS mode through devlink command, similar to configuring SWS mode + Connection Tracking - Implementation of CT support for HW steering - Hooks up the CT ops for the new steering mode and uses the HWS API to implement connection tracking. Because of the large number of patches, we need to perform the submission in several separate patch series. This series is the first submission that lays the ground work for the next submissions, where an actual user of HWS will be added. 5. Patches in this series ------------------------- This patch series contains implementation of the first bullet from above. ======================= * tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: HWS, added API and enabled HWS support net/mlx5: HWS, added send engine and context handling net/mlx5: HWS, added debug dump and internal headers net/mlx5: HWS, added backward-compatible API handling net/mlx5: HWS, added memory management handling net/mlx5: HWS, added vport handling net/mlx5: HWS, added modify header pattern and args handling net/mlx5: HWS, added FW commands handling net/mlx5: HWS, added matchers functionality net/mlx5: HWS, added definers handling net/mlx5: HWS, added rules handling net/mlx5: HWS, added tables handling net/mlx5: HWS, added actions handling net/mlx5: Added missing definitions in preparation for HW Steering net/mlx5: Added missing mlx5_ifc definition for HW Steering ==================== Link: https://patch.msgid.link/20240909181250.41596-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge tag 'ipsec-next-2024-09-10' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2024-09-10 1) Remove an unneeded WARN_ON on packet offload. From Patrisious Haddad. 2) Add a copy from skb_seq_state to buffer function. This is needed for the upcomming IPTFS patchset. From Christian Hopps. 3) Spelling fix in xfrm.h. From Simon Horman. 4) Speed up xfrm policy insertions. From Florian Westphal. 5) Add and revert a patch to support xfrm interfaces for packet offload. This patch was just half cooked. 6) Extend usage of the new xfrm_policy_is_dead_or_sk helper. From Florian Westphal. 7) Update comments on sdb and xfrm_policy. From Florian Westphal. 8) Fix a null pointer dereference in the new policy insertion code From Florian Westphal. 9) Fix an uninitialized variable in the new policy insertion code. From Nathan Chancellor. * tag 'ipsec-next-2024-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: policy: Restore dir assignments in xfrm_hash_rebuild() xfrm: policy: fix null dereference Revert "xfrm: add SA information to the offloaded packet" xfrm: minor update to sdb and xfrm_policy comments xfrm: policy: use recently added helper in more places xfrm: add SA information to the offloaded packet xfrm: policy: remove remaining use of inexact list xfrm: switch migrate to xfrm_policy_lookup_bytype xfrm: policy: don't iterate inexact policies twice at insert time selftests: add xfrm policy insertion speed test script xfrm: Correct spelling in xfrm.h net: add copy from skb_seq_state to buffer function xfrm: Remove documentation WARN_ON to limit return values for offloaded SA ==================== Link: https://patch.msgid.link/20240910065507.2436394-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge branch 'bnxt_en-msix-improvements'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: MSIX improvements This patchset makes some improvements related to MSIX. The first patch adjusts the default MSIX vectors assigned for RoCE. On the PF, the number of MSIX is increased to 64 from the current 9. The second patch allocates additional MSIX vectors ahead of time when changing ethtool channels if dynamic MSIX is supported. The 3rd patch makes sure that the IRQ name is not truncated. ==================== Link: https://patch.msgid.link/20240909202737.93852-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10bnxt_en: resize bnxt_irq name field to fit format stringEdwin Peer
The name field of struct bnxt_irq is written using snprintf in bnxt_setup_msix(). Make the field large enough to fit the maximal formatted string to prevent truncation. Truncated IRQ names are less meaningful to the user. For example, "enp4s0f0np0-TxRx-0" gets truncated to "enp4s0f0np0-TxRx-" with the existing code. Make sure we have space for the extra characters added to the IRQ names: - the characters introduced by the static format string: hyphens - the maximal static substituted ring type string: "TxRx" - the maximum length of an integer formatted as a string, even though reasonable ring numbers would never be as long as this. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909202737.93852-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10bnxt_en: Add MSIX check in bnxt_check_rings()Michael Chan
bnxt_check_rings() is called to ensure that we have the hardware ring resources before committing to reinitialize with the new number of rings. MSIX vectors are never checked at this point, because up until recently we must first disable MSIX before we can allocate the new set of MSIX vectors. Now that we support dynamic MSIX allocation, check to make sure we can dynamically allocate the new MSIX vectors as the last step in bnxt_check_rings() if dynamic MSIX is supported. For example, the IOMMU group may limit the number of MSIX vectors for the device. With this patch, the ring change will fail more gracefully when there is not enough MSIX vectors. It is also better to move bnxt_check_rings() to be called as the last step when changing ethtool rings. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909202737.93852-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10bnxt_en: Increase the number of MSIX vectors for RoCE deviceMichael Chan
If RocE is supported on the device, set the number of RoCE MSIX vectors to the number of online CPUs + 1 and capped at these maximums: VF: 2 NPAR: 5 PF: 64 For the PF, the maximum is now increased from the previous value of 9 to get better performance for kernel applications. Remove the unnecessary check for BNXT_FLAG_ROCE_CAP. bnxt_set_dflt_ulp_msix() will only be called if the flag is set. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909202737.93852-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: amlogic,meson-dwmac: Fix "amlogic,tx-delay-ns" schemaRob Herring (Arm)
The "amlogic,tx-delay-ns" property schema has unnecessary type reference as it's a standard unit suffix, and the constraints are in freeform text rather than schema. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://patch.msgid.link/20240909172342.487675-2-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge branch 'net-xilinx-axienet-partial-checksum-offload-improvements'Jakub Kicinski
Sean Anderson says: ==================== net: xilinx: axienet: Partial checksum offload improvements Partial checksum offload is not always used when it could be. Enable it in more cases. ==================== Link: https://patch.msgid.link/20240909161016.1149119-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: xilinx: axienet: Relax partial rx checksum checksSean Anderson
The partial rx checksum feature computes a checksum over the entire packet, regardless of the L3 protocol. Remove the check for IPv4. Additionally, testing with csum.py (from kselftests) shows no anomalies with 64-byte packets, so we can remove that check as well. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909161016.1149119-5-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: xilinx: axienet: Set RXCSUM in featuresSean Anderson
When it is supported by hardware, we enable receive checksum offload unconditionally. Update features to reflect this. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20240909161016.1149119-4-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: xilinx: axienet: Enable NETIF_F_HW_CSUM for partial tx checksummingSean Anderson
Partial tx chechsumming is completely generic and does not depend on the L3/L4 protocol. Signal this to the net subsystem by enabling the more-generic offload feature (instead of restricting ourselves to TCP/UDP over IPv4 checksumming only like is necessary with full checksumming). Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909161016.1149119-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: xilinx: axienet: Remove unused checksum variablesSean Anderson
These variables are set but never used. Remove them. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20240909161016.1149119-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10rtase: Fix spelling mistake: "tx_underun" -> "tx_underrun"Colin Ian King
There is a spelling mistake in the struct field tx_underun, rename it to tx_underrun. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909134612.63912-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"Colin Ian King
There is a spelling mistake in the struct field tx_underun, rename it to tx_underrun. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/20240909140021.64884-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10sch_cake: constify inverse square root cacheDave Taht
sch_cake uses a cache of the first 16 values of the inverse square root calculation for the Cobalt AQM to save some cycles on the fast path. This cache is populated when the qdisc is first loaded, but there's really no reason why it can't just be pre-populated. So change it to be pre-populated with constants, which also makes it possible to constify it. This gives a modest space saving for the module (not counting debug data): .text: -224 bytes .rodata: +80 bytes .bss: -64 bytes Total: -192 bytes Signed-off-by: Dave Taht <dave.taht@gmail.com> [ fixed up comment, rewrote commit message ] Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20240909091630.22177-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVERKory Maincent
Add net/ethtool/pse-pd.c to PSE NETWORK DRIVER to receive emails concerning modifications to the ethtool part. Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240909114336.362174-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11Merge tag 'amd-drm-next-6.12-2024-09-06' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.12-2024-09-06: amdgpu: - IPS updates - Post divider fix - DML2 updates - Misc static checker fixes - DCN 3.5 fixes - Replay fixes - DMCUB updates - SWSMU fixes - DP MST fixes - Add debug flag for per queue resets - devcoredump updates - SR-IOV fixes - MES fixes - Always allocate cleared VRAM for GEM - Pipe reset for GC 9.4.3 - ODM policy fixes - Per queue reset support for GC 10 - Per queue reset support for GC 11 - Per queue reset support for GC 12 - Display flickering fixes - MPO fixes - Display sharpening updates amdkfd: - SVM fix for IH for APUs Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240906211008.3072097-1-alexander.deucher@amd.com
2024-09-10PCI: Rename CRS Completion Status to RRSBjorn Helgaas
PCIe r6.0 changed the abbreviation for "Configuration Request Retry Status" Completion Status from "CRS" to "RRS" and uses the terminology of "Configuration RRS Software Visibility" instead of "CRS Software Visibility". Align the Linux usage with the r6.0 spec language. No functional change intended. It's confusing to make this change, but I think "RRS" *is* a better abbreviation because it was easy to interpret "CRS" as "Completion Retry Status", which really didn't make any sense. Link: https://lore.kernel.org/r/20240827234848.4429-4-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-09-10PCI: aardvark: Correct Configuration RRS checkingBjorn Helgaas
Per PCIe r6.0, sec 2.3.2, when a Root Complex handles a Completion with Request Retry Status for a Configuration Read Request that includes both bytes of the Vendor ID field, it must complete the Request to the host by returning 0001h for the Vendor ID and all 1's for any additional bytes. Previously we only returned the 0001h Vendor ID value if we got an RRS completion for reads of exactly 4 bytes. A read of 2 bytes would not qualify, although the spec says it should. Check for reads of 2 or more bytes including the Vendor ID. I don't think this will fix any observable problems because RRS only applies to the first config reads after reset, and those are all currently dword (4-byte) reads. Link: https://lore.kernel.org/r/20240827234848.4429-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-09-10PCI: Wait for device readiness with Configuration RRSBjorn Helgaas
After a device reset, delays are required before the device can successfully complete config accesses. PCIe r6.0, sec 6.6, specifies some delays required before software can perform config accesses. Devices that require more time after those delays may respond to config accesses with Configuration Request Retry Status (RRS) completions. Callers of pci_dev_wait() are responsible for delays until the device can respond to config accesses. pci_dev_wait() waits any additional time until the device can successfully complete config accesses. Reading config space of devices that are not present or not ready typically returns ~0 (PCI_ERROR_RESPONSE). Previously we polled the Command register until we got a value other than ~0. This is sometimes a problem because Root Complex handling of RRS completions may include several retries and implementation-specific behavior that is invisible to software (see sec 2.3.2), so the exponential backoff in pci_dev_wait() may not work as intended. Linux enables Configuration RRS Software Visibility on all Root Ports that support it. If it is enabled, read the Vendor ID instead of the Command register. RRS completions cause immediate return of the 0x0001 reserved Vendor ID value, so the pci_dev_wait() backoff works correctly. When a read of Vendor ID eventually completes successfully by returning a non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be initialized and ready to respond to config requests. For conventional PCI devices or devices below Root Ports that don't support Configuration RRS Software Visibility, poll the Command register as before. This was developed independently, but is very similar to Stanislav Spassov's previous work at https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com Link: https://lore.kernel.org/r/20240827234848.4429-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Duc Dang <ducdang@google.com>
2024-09-10net: dsa: microchip: update tag_ksz masks for KSZ9477 familyPieter Van Trappen
Remove magic number 7 by introducing a GENMASK macro instead. Remove magic number 0x80 by using the BIT macro instead. Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240909134301.75448-1-vtpieter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10dt-bindings: net: tja11xx: fix the broken bindingWei Fang
As Rob pointed in another mail thread [1], the binding of tja11xx PHY is completely broken, the schema cannot catch the error in the DTS. A compatiable string must be needed if we want to add a custom propety. So extract known PHY IDs from the tja11xx PHY drivers and convert them into supported compatible string list to fix the broken binding issue. Fixes: 52b2fe4535ad ("dt-bindings: net: tja11xx: add nxp,refclk_in property") Link: https://lore.kernel.org/31058f49-bac5-49a9-a422-c43b121bf049@kernel.org # [1] Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20240909012152.431647-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge branch ↵Jakub Kicinski
'net-timestamp-introduce-a-flag-to-filter-out-rx-software-and-hardware-report' Jason Xing says: ==================== net-timestamp: introduce a flag to filter out rx software and hardware report When one socket is set SOF_TIMESTAMPING_RX_SOFTWARE which means the whole system turns on the netstamp_needed_key button, other sockets that only have SOF_TIMESTAMPING_SOFTWARE will be affected and then print the rx timestamp information even without setting SOF_TIMESTAMPING_RX_SOFTWARE generation flag. How to solve it without breaking users? We introduce a new flag named SOF_TIMESTAMPING_OPT_RX_FILTER. Using it together with SOF_TIMESTAMPING_SOFTWARE can stop reporting the rx software timestamp. Similarly, we also filter out the hardware case where one process enables the rx hardware generation flag, then another process only passing SOF_TIMESTAMPING_RAW_HARDWARE gets the timestamp. So we can set both SOF_TIMESTAMPING_RAW_HARDWARE and SOF_TIMESTAMPING_OPT_RX_FILTER to stop reporting rx hardware timestamp after this patch applied. v6: https://lore.kernel.org/20240906095640.77533-1-kerneljasonxing@gmail.com v5: https://lore.kernel.org/20240905071738.3725-1-kerneljasonxing@gmail.com v4: https://lore.kernel.org/20240830153751.86895-1-kerneljasonxing@gmail.com v3: https://lore.kernel.org/20240828160145.68805-1-kerneljasonxing@gmail.com v2: https://lore.kernel.org/20240825152440.93054-1-kerneljasonxing@gmail.com ==================== Link: https://patch.msgid.link/20240909015612.3856-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net-timestamp: add selftests for SOF_TIMESTAMPING_OPT_RX_FILTERJason Xing
Test a few possible cases where we use SOF_TIMESTAMPING_OPT_RX_FILTER with software or hardware report/generation flag. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240909015612.3856-3-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net-timestamp: introduce SOF_TIMESTAMPING_OPT_RX_FILTER flagJason Xing
introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter out rx software timestamp report, especially after a process turns on netstamp_needed_key which can time stamp every incoming skb. Previously, we found out if an application starts first which turns on netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE could also get rx timestamp. Now we handle this case by introducing this new flag without breaking users. Quoting Willem to explain why we need the flag: "why a process would want to request software timestamp reporting, but not receive software timestamp generation. The only use I see is when the application does request SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE." Similarly, this new flag could also be used for hardware case where we can set it with SOF_TIMESTAMPING_RAW_HARDWARE, then we won't receive hardware receive timestamp. Another thing about errqueue in this patch I have a few words to say: In this case, we need to handle the egress path carefully, or else reporting the tx timestamp will fail. Egress path and ingress path will finally call sock_recv_timestamp(). We have to distinguish them. Errqueue is a good indicator to reflect the flow direction. Suggested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240909015612.3856-2-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net-timestamp: correct the use of SOF_TIMESTAMPING_RAW_HARDWAREJason Xing
SOF_TIMESTAMPING_RAW_HARDWARE is a report flag which passes the timestamps generated by either SOF_TIMESTAMPING_TX_HARDWARE or SOF_TIMESTAMPING_RX_HARDWARE to the userspace all the time. So let us revise the doc here. Link: https://lore.kernel.org/all/66d8c21d3042a_163d93294cb@willemb.c.googlers.com.notmuch/ Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jason Xing <kernelxing@tencent.com> Link: https://patch.msgid.link/20240908124141.39628-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10Merge branch 'net-stmmac-fpe-via-ethtool-tc'Jakub Kicinski
Furong Xu says: ==================== net: stmmac: FPE via ethtool + tc Move the Frame Preemption(FPE) over to the new standard API which uses ethtool-mm/tc-mqprio/tc-taprio. ==================== Link: https://patch.msgid.link/cover.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: silence FPE kernel logsFurong Xu
ethtool --show-mm can get real-time state of FPE. fpe_irq_status logs should keep quiet. tc-taprio can always query driver state, delete unbalanced logs. Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/39943d7967f291674a97ef0572878aca273087e9.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: support fp parameter of tc-taprioFurong Xu
tc-taprio can select whether traffic classes are express or preemptible. 0) tc qdisc add dev eth1 parent root handle 100 taprio \ num_tc 4 \ map 0 1 2 3 2 2 2 2 2 2 2 2 2 2 2 3 \ queues 1@0 1@1 1@2 1@3 \ base-time 1000000000 \ sched-entry S 03 10000000 \ sched-entry S 0e 10000000 \ flags 0x2 fp P E E E 1) After some traffic tests, MAC merge layer statistics are all good. Local device: [ { "ifname": "eth1", "pmac-enabled": true, "tx-enabled": true, "tx-active": true, "tx-min-frag-size": 60, "rx-min-frag-size": 60, "verify-enabled": true, "verify-time": 100, "max-verify-time": 128, "verify-status": "SUCCEEDED", "statistics": { "MACMergeFrameAssErrorCount": 0, "MACMergeFrameSmdErrorCount": 0, "MACMergeFrameAssOkCount": 0, "MACMergeFragCountRx": 0, "MACMergeFragCountTx": 17837, "MACMergeHoldCount": 18639 } } ] Remote device: [ { "ifname": "end1", "pmac-enabled": true, "tx-enabled": true, "tx-active": true, "tx-min-frag-size": 60, "rx-min-frag-size": 60, "verify-enabled": true, "verify-time": 100, "max-verify-time": 128, "verify-status": "SUCCEEDED", "statistics": { "MACMergeFrameAssErrorCount": 0, "MACMergeFrameSmdErrorCount": 0, "MACMergeFrameAssOkCount": 17189, "MACMergeFragCountRx": 17837, "MACMergeFragCountTx": 0, "MACMergeHoldCount": 0 } } ] Tested on DWMAC CORE 5.10a Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/0d21ae356fb3cab77337527e87d46748a4852055.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: support fp parameter of tc-mqprioFurong Xu
tc-mqprio can select whether traffic classes are express or preemptible. After some traffic tests, MAC merge layer statistics are all good. Local device: ethtool --include-statistics --json --show-mm eth1 [ { "ifname": "eth1", "pmac-enabled": true, "tx-enabled": true, "tx-active": true, "tx-min-frag-size": 60, "rx-min-frag-size": 60, "verify-enabled": true, "verify-time": 100, "max-verify-time": 128, "verify-status": "SUCCEEDED", "statistics": { "MACMergeFrameAssErrorCount": 0, "MACMergeFrameSmdErrorCount": 0, "MACMergeFrameAssOkCount": 0, "MACMergeFragCountRx": 0, "MACMergeFragCountTx": 35105, "MACMergeHoldCount": 0 } } ] Remote device: ethtool --include-statistics --json --show-mm end1 [ { "ifname": "end1", "pmac-enabled": true, "tx-enabled": true, "tx-active": true, "tx-min-frag-size": 60, "rx-min-frag-size": 60, "verify-enabled": true, "verify-time": 100, "max-verify-time": 128, "verify-status": "SUCCEEDED", "statistics": { "MACMergeFrameAssErrorCount": 0, "MACMergeFrameSmdErrorCount": 0, "MACMergeFrameAssOkCount": 35105, "MACMergeFragCountRx": 35105, "MACMergeFragCountTx": 0, "MACMergeHoldCount": 0 } } ] Tested on DWMAC CORE 5.10a Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/592965ea93ed8240f0a1b8f6f8ebb8914f69419b.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: configure FPE via ethtool-mmFurong Xu
Implement ethtool --show-mm and --set-mm callbacks. NIC up/down, link up/down, suspend/resume, kselftest-ethtool_mm, all tested okay. Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/06ed409314fe0ee37b78b800922f2c0cce762532.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: refactor FPE verification processFurong Xu
Drop driver defined stmmac_fpe_state, and switch to common ethtool_mm_verify_status for local TX verification status. Local side and remote side verification processes are completely independent. There is no reason at all to keep a local state and a remote state. Add a spinlock to avoid races among ISR, timer, link update and register configuration. This patch is based on Vladimir Oltean's proposal. Vladimir Oltean says: ==================== In the INITIAL state, the timer sends MPACKET_VERIFY. Eventually the stmmac_fpe_event_status() IRQ fires and advances the state to VERIFYING, then rearms the timer after verify_time ms. If a subsequent IRQ comes in and modifies the state to SUCCEEDED after getting MPACKET_RESPONSE, the timer sees this. It must enable the EFPE bit now. Otherwise, it decrements the verify_limit counter and tries again. Eventually it moves the status to FAILED, from which the IRQ cannot move it anywhere else, except for another stmmac_fpe_apply() call. ==================== Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/151f86c8428eba967039718c6bf90a7d841e703b.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: drop stmmac_fpe_handshakeFurong Xu
ethtool --set-mm can trigger FPE verification process by calling stmmac_fpe_send_mpacket, stmmac_fpe_handshake should be gone. Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/42018b1a15eb3ced567fd6a73798c7cd4e08799a.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10net: stmmac: move stmmac_fpe_cfg to stmmac_priv dataFurong Xu
By moving the fpe_cfg field to the stmmac_priv data, stmmac_fpe_cfg becomes platform-data eventually, instead of a run-time config. Suggested-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Link: https://patch.msgid.link/d9b3d7ecb308c5e39778a4c8ae9df288a2754379.1725631883.git.0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>