summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-06-07workqueue: Increase worker desc's length to 32Wenchao Hao
Commit 31c89007285d ("workqueue.c: Increase workqueue name length") increased WQ_NAME_LEN from 24 to 32, but forget to increase WORKER_DESC_LEN, which would cause truncation when setting kworker's desc from workqueue_struct's name, process_one_work() for example. Fixes: 31c89007285d ("workqueue.c: Increase workqueue name length") Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> CC: Audra Mitchell <audra@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-06-07libfs: Introduce case-insensitive string comparison helperGabriel Krisman Bertazi
generic_ci_match can be used by case-insensitive filesystems to compare strings under lookup with dirents in a case-insensitive way. This function is currently reimplemented by each filesystem supporting casefolding, so this reduces code duplication in filesystem-specific code. [eugen.hristev@collabora.com: rework to first test the exact match, cleanup and add error message] Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Link: https://lore.kernel.org/r/20240606073353.47130-4-eugen.hristev@collabora.com Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-07srcu: Add an API for a memory barrier after SRCU read lockYan Zhao
To avoid redundant memory barriers, add smp_mb__after_srcu_read_lock() to pair with smp_mb__after_srcu_read_unlock() for use in paths that need to emit a memory barrier, but already do srcu_read_lock(), which includes a full memory barrier. Provide an API, e.g. as opposed to having callers document the behavior via a comment, as the full memory barrier provided by srcu_read_lock() is an implementation detail that shouldn't bleed into random subsystems. KVM will use smp_mb__after_srcu_read_lock() in it's VM-Exit path to ensure a memory barrier is emitted, which is necessary to ensure correctness of mixed memory types on CPUs that support self-snoop. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> [sean: massage changelog] Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org Link: https://lore.kernel.org/r/20240309010929.1403984-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-07drm/mipi-dbi: Add support for DRM_FORMAT_RGB888Noralf Trønnes
DRM_FORMAT_RGB888 is 24 bits per pixel and it would be natural to send it on the SPI bus using a 24 bits per word transfer. The problem with this is that not all SPI controllers support 24 bpw. Since DRM_FORMAT_RGB888 is stored in memory as little endian and the SPI bus is big endian we use 8 bpw to always get the same pixel format on the bus: b8g8r8. The MIPI DCS specification lists the standard commands that can be sent over the MIPI DBI interface. The set_address_mode (36h) command has one bit in the parameter that controls RGB/BGR order. This means that the controller can be configured to receive the pixel as BGR. RGB888 is rarely supported on these controllers but RGB666 is very common. All datasheets I have seen do at least support the pixel format option where each color is sent as one byte and the 6 MSB's are used. All this put together means that we can send each pixel as b8g8r8 and an RGB666 capable controller sees this as b6x2g6x2r6x2. v4: - s/emulation_format/pixel_format/ (Dmitry) Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240604-panel-mipi-dbi-rgb666-v4-4-d7c2bcb9b78d@tronnes.org Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
2024-06-07drm/mipi-dbi: Make bits per word configurable for pixel transfersNoralf Trønnes
MIPI DCS write/set commands have 8 bit parameters except for the write_memory commands where it depends on the pixel format. drm_mipi_dbi does currently only support RGB565 which is 16-bit and it has to make sure that the pixels enters the SPI bus in big endian format since the MIPI DBI spec doesn't have support for little endian. drm_mipi_dbi is optimized for DBI interface option 3 which means that the 16-bit bytes are swapped by the upper layer if the SPI bus does not support 16 bits per word, signified by the swap_bytes member. In order to support both 16-bit and 24-bit pixel transfers we need a way to tell the DBI command layer the format of the buffer. Add a write_memory_bpw member that the upper layer can use to tell how many bits per word to use for the SPI transfer. v4: - Expand the commit message (Dmitry) Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240604-panel-mipi-dbi-rgb666-v4-3-d7c2bcb9b78d@tronnes.org Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
2024-06-07crypto: sm2 - Remove sm2 algorithmHerbert Xu
The SM2 algorithm has a single user in the kernel. However, it's never been integrated properly with that user: asymmetric_keys. The crux of the issue is that the way it computes its digest with sm3 does not fit into the architecture of asymmetric_keys. As no solution has been proposed, remove this algorithm. It can be resubmitted when it is integrated properly into the asymmetric_keys subsystem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-07crypto: ccp - Add support for getting security attributes on some older systemsMario Limonciello
Older systems will not populate the security attributes in the capabilities register. The PSP on these systems, however, does have a command to get the security attributes. Use this command during ccp startup to populate the attributes if they're missing. Closes: https://github.com/fwupd/fwupd/issues/5284 Closes: https://github.com/fwupd/fwupd/issues/5675 Closes: https://github.com/fwupd/fwupd/issues/6253 Closes: https://github.com/fwupd/fwupd/issues/7280 Closes: https://github.com/fwupd/fwupd/issues/6323 Closes: https://github.com/fwupd/fwupd/discussions/5433 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-07crypto: ccp - align psp_platform_access_msgMario Limonciello
Align the whitespace so that future messages will also be better aligned. Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-06-07dt-bindings: net: dp8386x: Add MIT license along with GPL-2.0Udit Kumar
Modify license to include dual licensing as GPL-2.0-only OR MIT license for TI specific phy header files. This allows for Linux kernel files to be used in other Operating System ecosystems such as Zephyr or FreeBSD. While at this, update the GPL-2.0 to be GPL-2.0-only to be in sync with latest SPDX conventions (GPL-2.0 is deprecated). While at this, update the TI copyright year to sync with current year to indicate license change. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Trent Piepho <tpiepho@impinj.com> Cc: Wadim Egorov <w.egorov@phytec.de> Cc: Kip Broadhurst <kbroadhurst@ti.com> Signed-off-by: Udit Kumar <u-kumar1@ti.com> Acked-by: Wadim Egorov <w.egorov@phytec.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-07input: Add support for "Do Not Disturb"Aseda Aboagye
HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-07input: Add event code for accessibility keyAseda Aboagye
HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-06-06Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The core change is to detect unusually large number of VPD pages (caused by device manufacturers having an endiannes issue) and reject them rather than trying to parse a huge non-existent array. The remaining fixes are in drivers the most user visible of which is the ALUA state transition recognition (leads to intermittent I/O errors in some situations otherwise)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort() scsi: core: Handle devices which return an unusually large VPD page count scsi: mpt3sas: Add missing kerneldoc parameter descriptions scsi: qedf: Set qed_slowpath_params to zero before use scsi: qedf: Wait for stag work during unload scsi: qedf: Don't process stag work during unload and recovery scsi: sr: Fix unintentional arithmetic wraparound scsi: core: alua: I/O errors for ALUA state transitions scsi: mpi3mr: Use proper format specifier in mpi3mr_sas_port_add()
2024-06-06linux/interrupt.h: allow "guard" notation to disable and reenable IRQDmitry Torokhov
Drivers often need to first disable an interrupt, carry out some action, and then reenable the interrupt. Introduce support for the "guard" notation for this so that the following is possible: ... scoped_cond_guard(mutex_intr, return -EINTR, &data->sysfs_mutex) { guard(disable_irq)(&client->irq); error = elan_acquire_baseline(data); if (error) return error; } ... Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/ZljAV6HjkPSEhWSw@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-06-06Merge tag 'pci-v6.10-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Revert lockdep checking on locking that protects device resets from user-space config accesses; it exposed issues for which fixes are in the works but are too risky for this cycle (Dan Williams) * tag 'pci-v6.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Revert the cfg_access_lock lockdep mechanism
2024-06-06ftrace: Declare function_trace_op in header to quiet sparse warningSteven Rostedt (Google)
Sparse complains that function_trace_op is not static but is not declared in a header file. It is used only in assembly code. But add it to a header so that sparse no longer complains: kernel/trace/ftrace.c:99:19: warning: symbol 'function_trace_op' was not declared. Should it be static? Link: https://lore.kernel.org/linux-trace-kernel/20240605202708.289105647@goodmis.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/pensando/ionic/ionic_txrx.c d9c04209990b ("ionic: Mark error paths in the data path as unlikely") 491aee894a08 ("ionic: fix kernel panic in XDP_TX action") net/ipv6/ip6_fib.c b4cb4a1391dc ("net: use unrcu_pointer() helper") b01e1c030770 ("ipv6: fix possible race in __fib6_drop_pcpu_from()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06drm/print: Improve drm_dbg_printerMichal Wajdeczko
With recent introduction of a generic drm dev printk function, we can now store and use location where drm_dbg_printer was invoked and output it's symbolic name like we do for all drm debug prints. Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517163406.2348-3-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-06drm/print: Kill ___drm_dbg()Michal Wajdeczko
There is no point in maintaining a separate print function, while there is __drm_dev_dbg() function that can work with a NULL device. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240516160015.2260-1-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-06drm/print: Add missing [drm] prefix to drm based WARNMichal Wajdeczko
All drm_device based logging macros, except those related to WARN, include the [drm] prefix. Fix that. [ ] 0000:00:00.0: this is a warning [ ] 0000:00:00.0: drm_WARN_ON(true) vs [ ] 0000:00:00.0: [drm] this is a warning [ ] 0000:00:00.0: [drm] drm_WARN_ON(true) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240523174429.800-1-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-06Merge tag 'net-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from BPF and big collection of fixes for WiFi core and drivers. Current release - regressions: - vxlan: fix regression when dropping packets due to invalid src addresses - bpf: fix a potential use-after-free in bpf_link_free() - xdp: revert support for redirect to any xsk socket bound to the same UMEM as it can result in a corruption - virtio_net: - add missing lock protection when reading return code from control_buf - fix false-positive lockdep splat in DIM - Revert "wifi: wilc1000: convert list management to RCU" - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config Previous releases - regressions: - rtnetlink: make the "split" NLM_DONE handling generic, restore the old behavior for two cases where we started coalescing those messages with normal messages, breaking sloppily-coded userspace - wifi: - cfg80211: validate HE operation element parsing - cfg80211: fix 6 GHz scan request building - mt76: mt7615: add missing chanctx ops - ath11k: move power type check to ASSOC stage, fix connecting to 6 GHz AP - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS - iwlwifi: mvm: fix a crash on 7265 Previous releases - always broken: - ncsi: prevent multi-threaded channel probing, a spec violation - vmxnet3: disable rx data ring on dma allocation failure - ethtool: init tsinfo stats if requested, prevent unintentionally reporting all-zero stats on devices which don't implement any - dst_cache: fix possible races in less common IPv6 features - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED - ax25: fix two refcounting bugs - eth: ionic: fix kernel panic in XDP_TX action Misc: - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB" * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) selftests: net: lib: set 'i' as local selftests: net: lib: avoid error removing empty netns name selftests: net: lib: support errexit with busywait net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() ipv6: fix possible race in __fib6_drop_pcpu_from() af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. af_unix: Annotate data-races around sk->sk_sndbuf. af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). af_unix: Annotate data-race of sk->sk_state in unix_accept(). af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). af_unix: Annodate data-races around sk->sk_state for writers. af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer. ...
2024-06-06mm/util: Swap kmemdup_array() argumentsJean-Philippe Brucker
GCC 14.1 complains about the argument usage of kmemdup_array(): drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args] 130 | fuse->lookups = kmemdup_array(fuse->soc->lookups, sizeof(*fuse->lookups), | ^ drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should specify number of elements, later size of each element The annotation introduced by commit 7d78a7773355 ("string: Add additional __realloc_size() annotations for "dup" helpers") lets the compiler think that kmemdup_array() follows the same format as calloc(), with the number of elements preceding the size of one element. So we could simply swap the arguments to __realloc_size() to get rid of that warning, but it seems cleaner to instead have kmemdup_array() follow the same format as krealloc_array(), memdup_array_user(), calloc() etc. Fixes: 7d78a7773355 ("string: Add additional __realloc_size() annotations for "dup" helpers") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240606144608.97817-2-jean-philippe@linaro.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-06-06irqchip/gic-v3: Enable non-coherent redistributors/ITSes ACPI probingLorenzo Pieralisi
The GIC architecture specification defines a set of registers for redistributors and ITSes that control the sharebility and cacheability attributes of redistributors/ITSes initiator ports on the interconnect (GICR_[V]PROPBASER, GICR_[V]PENDBASER, GITS_BASER<n>). Architecturally the GIC provides a means to drive shareability and cacheability attributes signals but it is not mandatory for designs to wire up the corresponding interconnect signals that control the cacheability/shareability of transactions. Redistributors and ITSes interconnect ports can be connected to non-coherent interconnects that are not able to manage the shareability/cacheability attributes; this implicitly makes the redistributors and ITSes non-coherent observers. To enable non-coherent GIC designs on ACPI based systems, parse the MADT GICC/GICR/ITS subtables non-coherent flags to determine whether the respective components are non-coherent observers and force the shareability attributes to be programmed into the redistributors and ITSes registers. An ACPI global function (acpi_get_madt_revision()) is added to retrieve the MADT revision, in that it is essential to check the MADT revision before checking for flags that were added with MADT revision 7 so that if the kernel is booted with an ACPI MADT table with revision < 7 it skips parsing the newly added flags (that should be zeroed reserved values for MADT versions < 7 but they could turn out to be buggy and should be ignored). Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Link: https://lore.kernel.org/r/20240606094238.757649-2-lpieralisi@kernel.org
2024-06-06drm/mm: Remove unused drm_mm_replace_nodeRodrigo Vivi
Last caller was removed with commit 078a5b498d6a ("drm/tests: Remove slow tests"). Cc: Maxime Ripard <mripard@kernel.org> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240604175438.48125-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-06tcp: move reqsk_alloc() to inet_connection_sock.cEric Dumazet
reqsk_alloc() has a single caller, no need to expose it in include/net/request_sock.h. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06tcp: small changes in reqsk_put() and reqsk_free()Eric Dumazet
In reqsk_free(), use DEBUG_NET_WARN_ON_ONCE() instead of WARN_ON_ONCE() for a condition which never fired. In reqsk_put() directly call __reqsk_free(), there is no point checking rsk_refcnt again right after a transition to zero. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06net: use unrcu_pointer() helperEric Dumazet
Toke mentioned unrcu_pointer() existence, allowing to remove some of the ugly casts we have when using xchg() for rcu protected pointers. Also make inet_rcv_compat const. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20240604111603.45871-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-06mm/mm_init.c: not always search next deferred_init_pfn from very beginningWei Yang
In function deferred_init_memmap(), we call deferred_init_mem_pfn_range_in_zone() to get the next deferred_init_pfn. But we always search it from the very beginning. Since we save the index in i, we can leverage this to search from i next time. [rppt refine the comment] Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Link: https://lore.kernel.org/all/20240605071339.15330-1-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-06-05net/mlx5e: SHAMPO, Re-enable HW-GROYoray Zack
Add back HW-GRO to the reported features. As the current implementation of HW-GRO uses KSMs with a specific fixed buffer size (256B) to map its headers buffer, we reported the feature only if the NIC is supporting KSM and the minimum value for buffer size is below the requested one. iperf3 bandwidth comparison: +---------+--------+--------+-----------+ | streams | SW GRO | HW GRO | Unit | |---------+--------+--------+-----------| | 1 | 36 | 42 | Gbits/sec | | 4 | 34 | 39 | Gbits/sec | | 8 | 31 | 35 | Gbits/sec | +---------+--------+--------+-----------+ A downstream patch will add skb fragment coalescing which will improve performance considerably. Benchmark details: VM based setup CPU: Intel(R) Xeon(R) Platinum 8380 CPU, 24 cores NIC: ConnectX-7 100GbE iperf3 and irq running on same CPU over a single receive queue Signed-off-by: Yoray Zack <yorayz@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240603212219.1037656-14-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05net/mlx5e: SHAMPO, Use KSMs instead of KLMsYoray Zack
KSM Mkey is KLM Mkey with a fixed buffer size. Due to this fact, it is a faster mechanism than KLM. SHAMPO feature used KLMs Mkeys for memory mappings of its headers buffer. As it used KLMs with the same buffer size for each entry, we can use KSMs instead. This commit changes the Mkeys that map the SHAMPO headers buffer from KLMs to KSMs. Signed-off-by: Yoray Zack <yorayz@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240603212219.1037656-13-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05mm/ksm: fix ksm_zero_pages accountingChengming Zhou
We normally ksm_zero_pages++ in ksmd when page is merged with zero page, but ksm_zero_pages-- is done from page tables side, where there is no any accessing protection of ksm_zero_pages. So we can read very exceptional value of ksm_zero_pages in rare cases, such as -1, which is very confusing to users. Fix it by changing to use atomic_long_t, and the same case with the mm->ksm_zero_pages. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-2-34bb358fdc13@linux.dev Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM") Fixes: 6080d19f0704 ("ksm: add ksm zero pages for each process") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-05mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=nBarry Song
if CONFIG_SYSFS is not enabled in config, we get the below error, All errors (new ones prefixed by >>): s390-linux-ld: mm/memory.o: in function `count_mthp_stat': >> include/linux/huge_mm.h:285:(.text+0x191c): undefined reference to `mthp_stats' s390-linux-ld: mm/huge_memory.o:(.rodata+0x10): undefined reference to `mthp_stats' vim +285 include/linux/huge_mm.h 279 280 static inline void count_mthp_stat(int order, enum mthp_stat_item item) 281 { 282 if (order <= 0 || order > PMD_ORDER) 283 return; 284 > 285 this_cpu_inc(mthp_stats.stats[order][item]); 286 } 287 Link: https://lkml.kernel.org/r/20240523210045.40444-1-21cnbao@gmail.com Fixes: ec33687c6749 ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback counters") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405231728.tCAogiSI-lkp@intel.com/ Signed-off-by: Barry Song <v-songbaohua@oppo.com> Tested-by: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-05mm: drop the 'anon_' prefix for swap-out mTHP countersBaolin Wang
The mTHP swap related counters: 'anon_swpout' and 'anon_swpout_fallback' are confusing with an 'anon_' prefix, since the shmem can swap out non-anonymous pages. So drop the 'anon_' prefix to keep consistent with the old swap counter names. This is needed in 6.10-rcX to avoid having an inconsistent ABI out in the field. Link: https://lkml.kernel.org/r/7a8989c13299920d7589007a30065c3e2c19f0e0.1716431702.git.baolin.wang@linux.alibaba.com Fixes: d0f048ac39f6 ("mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters") Fixes: 42248b9d34ea ("mm: add docs for per-order mTHP counters and transhuge_page ABI") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Suggested-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-05Merge tag 'acpi-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix the ACPI EC and AC drivers, the ACPI APEI error injection driver and build issues related to the dev_is_pnp() macro referring to pnp_bus_type that is not exported to modules. Specifics: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf) - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams) - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko) - Update the ACPI AC driver to use power_supply_changed() to let the power supply core handle configuration changes properly (Thomas Weißschuh)" * tag 'acpi-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: AC: Properly notify powermanagement core about changes PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error ACPI: APEI: EINJ: Fix einj_dev release leak
2024-06-05Merge tag 'pm-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the intel_pstate and amd-pstate cpufreq drivers and the cpupower utility. Specifics: - Fix a recently introduced unchecked HWP MSR access in the intel_pstate driver (Srinivas Pandruvada) - Add missing conversion from MHz to KHz to amd_pstate_set_boost() to address sysfs inteface inconsistency and fix P-state frequency reporting on AMD Family 1Ah CPUs in the cpupower utility (Dhananjay Ugwekar) - Get rid of an excess global header file used by the amd-pstate cpufreq driver (Arnd Bergmann)" * tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Fix unchecked HWP MSR access cpufreq: amd-pstate: Fix the inconsistency in max frequency units cpufreq: amd-pstate: remove global header file tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs
2024-06-05vfs: retire user_path_at_empty and drop empty arg from getname_flagsMateusz Guzik
No users after do_readlinkat started doing the job on its own. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20240604155257.109500-3-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-05drm/amdgpu: define new gfx12 uapi flagsMarek Olšák
define new gfx12 uapi flags Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05sched/core: Drop spinlocks on contention iff kernel is preemptibleSean Christopherson
Use preempt_model_preemptible() to detect a preemptible kernel when deciding whether or not to reschedule in order to drop a contended spinlock or rwlock. Because PREEMPT_DYNAMIC selects PREEMPTION, kernels built with PREEMPT_DYNAMIC=y will yield contended locks even if the live preemption model is "none" or "voluntary". In short, make kernels with dynamically selected models behave the same as kernels with statically selected models. Somewhat counter-intuitively, NOT yielding a lock can provide better latency for the relevant tasks/processes. E.g. KVM x86's mmu_lock, a rwlock, is often contended between an invalidation event (takes mmu_lock for write) and a vCPU servicing a guest page fault (takes mmu_lock for read). For _some_ setups, letting the invalidation task complete even if there is mmu_lock contention provides lower latency for *all* tasks, i.e. the invalidation completes sooner *and* the vCPU services the guest page fault sooner. But even KVM's mmu_lock behavior isn't uniform, e.g. the "best" behavior can vary depending on the host VMM, the guest workload, the number of vCPUs, the number of pCPUs in the host, why there is lock contention, etc. In other words, simply deleting the CONFIG_PREEMPTION guard (or doing the opposite and removing contention yielding entirely) needs to come with a big pile of data proving that changing the status quo is a net positive. Opportunistically document this side effect of preempt=full, as yielding contended spinlocks can have significant, user-visible impact. Fixes: c597bfddc9e9 ("sched: Provide Kconfig support for default dynamic preempt mode") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Link: https://lore.kernel.org/kvm/ef81ff36-64bb-4cfe-ae9b-e3acf47bff24@proxmox.com
2024-06-05sched/core: Move preempt_model_*() helpers from sched.h to preempt.hSean Christopherson
Move the declarations and inlined implementations of the preempt_model_*() helpers to preempt.h so that they can be referenced in spinlock.h without creating a potential circular dependency between spinlock.h and sched.h. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com> Link: https://lkml.kernel.org/r/20240528003521.979836-2-ankur.a.arora@oracle.com
2024-06-05locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldocCarlos Llamas
For ${atomic}_sub_and_test() the @i parameter is the value to subtract, not add. Fix the typo in the kerneldoc template and generate the headers with this update. Fixes: ad8110706f38 ("locking/atomic: scripts: generate kerneldoc comments") Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com
2024-06-05KVM: x86: Add a capability to configure bus frequency for APIC timerIsaku Yamahata
Add KVM_CAP_X86_APIC_BUS_CYCLES_NS capability to configure the APIC bus clock frequency for APIC timer emulation. Allow KVM_ENABLE_CAPABILITY(KVM_CAP_X86_APIC_BUS_CYCLES_NS) to set the frequency in nanoseconds. When using this capability, the user space VMM should configure CPUID leaf 0x15 to advertise the frequency. Vishal reported that the TDX guest kernel expects a 25MHz APIC bus frequency but ends up getting interrupts at a significantly higher rate. The TDX architecture hard-codes the core crystal clock frequency to 25MHz and mandates exposing it via CPUID leaf 0x15. The TDX architecture does not allow the VMM to override the value. In addition, per Intel SDM: "The APIC timer frequency will be the processor’s bus clock or core crystal clock frequency (when TSC/core crystal clock ratio is enumerated in CPUID leaf 0x15) divided by the value specified in the divide configuration register." The resulting 25MHz APIC bus frequency conflicts with the KVM hardcoded APIC bus frequency of 1GHz. The KVM doesn't enumerate CPUID leaf 0x15 to the guest unless the user space VMM sets it using KVM_SET_CPUID. If the CPUID leaf 0x15 is enumerated, the guest kernel uses it as the APIC bus frequency. If not, the guest kernel measures the frequency based on other known timers like the ACPI timer or the legacy PIT. As reported by Vishal the TDX guest kernel expects a 25MHz timer frequency but gets timer interrupt more frequently due to the 1GHz frequency used by KVM. To ensure that the guest doesn't have a conflicting view of the APIC bus frequency, allow the userspace to tell KVM to use the same frequency that TDX mandates instead of the default 1Ghz. Reported-by: Vishal Annapurve <vannapurve@google.com> Closes: https://lore.kernel.org/lkml/20231006011255.4163884-1-vannapurve@google.com Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/6748a4c12269e756f0c48680da8ccc5367c31ce7.1714081726.git.reinette.chatre@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-05tcp: add sysctl_tcp_rto_min_usKevin Yang
Adding a sysctl knob to allow user to specify a default rto_min at socket init time, other than using the hard coded 200ms default rto_min. Note that the rto_min route option has the highest precedence for configuring this setting, followed by the TCP_BPF_RTO_MIN socket option, followed by the tcp_rto_min_us sysctl. Signed-off-by: Kevin Yang <yyd@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05rtnetlink: make the "split" NLM_DONE handling genericJakub Kicinski
Jaroslav reports Dell's OMSA Systems Management Data Engine expects NLM_DONE in a separate recvmsg(), both for rtnl_dump_ifinfo() and inet_dump_ifaddr(). We already added a similar fix previously in commit 460b0d33cf10 ("inet: bring NLM_DONE out to a separate recv() again") Instead of modifying all the dump handlers, and making them look different than modern for_each_netdev_dump()-based dump handlers - put the workaround in rtnetlink code. This will also help us move the custom rtnl-locking from af_netlink in the future (in net-next). Note that this change is not touching rtnl_dump_all(). rtnl_dump_all() is different kettle of fish and a potential problem. We now mix families in a single recvmsg(), but NLM_DONE is not coalesced. Tested: ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_addr.yaml \ --dump getaddr --json '{"ifa-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \ --dump getroute --json '{"rtm-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_link.yaml \ --dump getlink Fixes: 3e41af90767d ("rtnetlink: use xarray iterator to implement rtnl_dump_ifinfo()") Fixes: cdb2f80f1c10 ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Link: https://lore.kernel.org/all/CAK8fFZ7MKoFSEzMBDAOjoUt+vTZRRQgLDNXEOfdCCXSoXXKE0g@mail.gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05dt-bindings: power: add Amlogic A4 power domainsXianwei Zhao
Add devicetree binding document and related header file for Amlogic A4 secure power domains. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com> Link: https://lore.kernel.org/r/20240529-a4_secpowerdomain-v2-1-47502fc0eaf3@amlogic.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-06-05devlink: Constify the 'table_ops' parameter of devl_dpipe_table_register()Christophe JAILLET
"struct devlink_dpipe_table_ops" only contains some function pointers. Update "struct devlink_dpipe_table" and the 'table_ops' parameter of devl_dpipe_table_register() so that structures in drivers can be constified. Constifying these structures will move some data to a read-only section, so increase overall security. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05dma-buf: align fd_flags and heap_flags with dma_heap_allocation_dataBarry Song
dma_heap_allocation_data defines the UAPI as follows: struct dma_heap_allocation_data { __u64 len; __u32 fd; __u32 fd_flags; __u64 heap_flags; }; However, dma_heap_buffer_alloc() casts both fd_flags and heap_flags into unsigned int. We're inconsistent with types in the non UAPI arguments. This patch fixes it. Signed-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240605012605.5341-1-21cnbao@gmail.com
2024-06-05net: caif: remove unused structsDr. David Alan Gilbert
'cfpktq' has been unused since commit 73d6ac633c6c ("caif: code cleanup"). 'caif_packet_funcs' is declared but never defined. Remove both of them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: remove NULL-pointer net parameter in ip_metrics_convertJason Xing
When I was doing some experiments, I found that when using the first parameter, namely, struct net, in ip_metrics_convert() always triggers NULL pointer crash. Then I digged into this part, realizing that we can remove this one due to its uselessness. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05fsnotify: clear PARENT_WATCHED flags lazilyAmir Goldstein
In some setups directories can have many (usually negative) dentries. Hence __fsnotify_update_child_dentry_flags() function can take a significant amount of time. Since the bulk of this function happens under inode->i_lock this causes a significant contention on the lock when we remove the watch from the directory as the __fsnotify_update_child_dentry_flags() call from fsnotify_recalc_mask() races with __fsnotify_update_child_dentry_flags() calls from __fsnotify_parent() happening on children. This can lead upto softlockup reports reported by users. Fix the problem by calling fsnotify_update_children_dentry_flags() to set PARENT_WATCHED flags only when parent starts watching children. When parent stops watching children, clear false positive PARENT_WATCHED flags lazily in __fsnotify_parent() for each accessed child. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-05mm/memblock: fix a typo in description of for_each_mem_region()Wei Yang
No functional change. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Link: https://lore.kernel.org/r/20240525023040.13509-2-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-06-05platform/chrome: cros_ec_proto: Upgrade get_next_event to v3Daisuke Nojiri
Upgrade EC_CMD_GET_NEXT_EVENT to version 3. The max supported version will be v3. So, we speak v3 even if the EC says it supports v4+. Signed-off-by: Daisuke Nojiri <dnojiri@chromium.org> Link: https://lore.kernel.org/r/20240604230837.2878737-1-dnojiri@chromium.org [tzungbi: uint32_t -> u32 per suggested by checkpatch.pl] Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>