linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-05	coresight: Store in-connections as well as out-connections	James Clark
	This will allow CATU to get its associated ETR in a generic way where currently the enable path has some hard coded searches which avoid the need to store input connections. This also means that the full search for connected devices on removal can be replaced with a loop through only the input and output devices. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-10-james.clark@arm.com
2023-06-05	coresight: Store pointers to connections rather than an array of them	James Clark
	This will allow the same connection object to be referenced via the input connection list in a later commit rather than duplicating them. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-8-james.clark@arm.com
2023-06-05	coresight: Dynamically add connections	James Clark
	Add a function for adding connections dynamically. This also removes the 1:1 mapping between port number and the index into the connections array. The only place this mapping was used was in the warning for duplicate output ports, which has been replaced by a search. Other uses of the port number already use the port member variable. Being able to dynamically add connections will allow other devices like CTI to re-use the connection mechanism despite not having explicit connections described in the DT. The connections array is now no longer sparse, so child_fwnode doesn't need to be checked as all connections have a target node. Because the array is no longer sparse, the high in and out port numbers are required for the refcount arrays. But these will also be removed in a later commit when the refcount is made a property of the connection. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-7-james.clark@arm.com
2023-06-05	coresight: Rename connection members to make the direction explicit	James Clark
	When input connections are added they will use the same connection object as the output so parent and child could be misinterpreted. Making the direction unambiguous in the names should improve readability. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-6-james.clark@arm.com
2023-06-05	coresight: Rename nr_outports to nr_outconns	James Clark
	Rename to avoid confusion between port number and the index in the connection array. The port number is already stored in the connection, and in a later commit the connection array will be appended to, so the length of it will no longer reflect the number of ports. No functional changes. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-5-james.clark@arm.com
2023-06-05	coresight: Change name of pdata->conns	James Clark
	conns is actually for output connections. Change the name to make it clearer and so that we can add input connections later. No functional changes. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-4-james.clark@arm.com
2023-06-05	coresight: Use enum type for cs_mode wherever possible	James Clark
	mode is stored as a local_t, but it is also passed around a lot as a plain u32, so use the correct type wherever local_t isn't currently used. This helps a little bit with readability. Reviewed-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20230425143542.2305069-3-james.clark@arm.com
2023-06-05	drivers/perf: apple_m1: Force 63bit counters for M2 CPUs	Marc Zyngier
	Sidharth reports that on M2, the PMU never generates any interrupt when using 'perf record', which is a annoying as you get no sample. I'm temped to say "no sample, no problem", but others may have a different opinion. Upon investigation, it appears that the counters on M2 are significantly different from the ones on M1, as they count on 64 bits instead of 48. Which of course, in the fine M1 tradition, means that we can only use 63 bits, as the top bit is used to signal the interrupt... This results in having to introduce yet another flag to indicate yet another odd counter width. Who knows what the next crazy implementation will do... With this, perf can work out the correct offset, and 'perf record' works as intended. Tested on M2 and M2-Pro CPUs. Cc: Janne Grunau <j@jannau.net> Cc: Hector Martin <marcan@marcan.st> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Fixes: 7d0bfb7c9977 ("drivers/perf: apple_m1: Add Apple M2 support") Reported-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Tested-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230528080205.288446-1-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-06-05	iopoll: Do not use timekeeping in read_poll_timeout_atomic()	Geert Uytterhoeven
	read_poll_timeout_atomic() uses ktime_get() to implement the timeout feature, just like its non-atomic counterpart. However, there are several issues with this, due to its use in atomic contexts: 1. When called in the s2ram path (as typically done by clock or PM domain drivers), timekeeping may be suspended, triggering the WARN_ON(timekeeping_suspended) in ktime_get(): WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78 Calling ktime_get_mono_fast_ns() instead of ktime_get() would get rid of that warning. However, that would break timeout handling, as (at least on systems with an ARM architectured timer), the time returned by ktime_get_mono_fast_ns() does not advance while timekeeping is suspended. Interestingly, (on the same ARM systems) the time returned by ktime_get() does advance while timekeeping is suspended, despite the warning. 2. Depending on the actual clock source, and especially before a high-resolution clocksource (e.g. the ARM architectured timer) becomes available, time may not advance in atomic contexts, thus breaking timeout handling. Fix this by abandoning the idea that one can rely on timekeeping to implement timeout handling in all atomic contexts, and switch from a global time-based to a locally-estimated timeout handling. In most (all?) cases the timeout condition is exceptional and an error condition, hence any additional delays due to underestimating wall clock time are irrelevant. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/3d2a2f4e553489392d871108797c3be08f88300b.1685692810.git.geert+renesas@glider.be
2023-06-05	iopoll: Call cpu_relax() in busy loops	Geert Uytterhoeven
	It is considered good practice to call cpu_relax() in busy loops, see Documentation/process/volatile-considered-harmful.rst. This can not only lower CPU power consumption or yield to a hyperthreaded twin processor, but also allows an architecture to mitigate hardware issues (e.g. ARM Erratum 754327 for Cortex-A9 prior to r2p0) in the architecture-specific cpu_relax() implementation. In addition, cpu_relax() is also a compiler barrier. It is not immediately obvious that the @op argument "function" will result in an actual function call (e.g. in case of inlining). Where a function call is a C sequence point, this is lost on inlining. Therefore, with agressive enough optimization it might be possible for the compiler to hoist the: (val) = op(args); "load" out of the loop because it doesn't see the value changing. The addition of cpu_relax() would inhibit this. As the iopoll helpers lack calls to cpu_relax(), people are sometimes reluctant to use them, and may fall back to open-coded polling loops (including cpu_relax() calls) instead. Fix this by adding calls to cpu_relax() to the iopoll helpers: - For the non-atomic case, it is sufficient to call cpu_relax() in case of a zero sleep-between-reads value, as a call to usleep_range() is a safe barrier otherwise. However, it doesn't hurt to add the call regardless, for simplicity, and for similarity with the atomic case below. - For the atomic case, cpu_relax() must be called regardless of the sleep-between-reads value, as there is no guarantee all architecture-specific implementations of udelay() handle this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/45c87bec3397fdd704376807f0eec5cc71be440f.1685692810.git.geert+renesas@glider.be
2023-06-05	NFSD: Ensure that xdr_write_pages updates rq_next_page	Chuck Lever
	All other NFSv[23] procedures manage to keep page_ptr and rq_next_page in lock step. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-05	SUNRPC: Trace struct svc_sock lifetime events	Chuck Lever
	Capture a timestamp and pointer address during the creation and destruction of struct svc_sock to record its lifetime. This helps to diagnose transport reference counting issues. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-05	fs.h: Optimize file struct to prevent false sharing	chenzhiyin
	In the syscall test of UnixBench, performance regression occurred due to false sharing. The lock and atomic members, including file::f_lock, file::f_count and file::f_pos_lock are highly contended and frequently updated in the high-concurrency test scenarios. perf c2c indentified one affected read access, file::f_op. To prevent false sharing, the layout of file struct is changed as following (A) f_lock, f_count and f_pos_lock are put together to share the same cache line. (B) The read mostly members, including f_path, f_inode, f_op are put into a separate cache line. (C) f_mode is put together with f_count, since they are used frequently at the same time. Due to '__randomize_layout' attribute of file struct, the updated layout only can be effective when CONFIG_RANDSTRUCT_NONE is 'y'. The optimization has been validated in the syscall test of UnixBench. performance gain is 30~50%. Furthermore, to confirm the optimization effectiveness on the other codes path, the results of fsdisk, fsbuffer and fstime are also shown. Here are the detailed test results of unixbench. Command: numactl -C 3-18 ./Run -c 16 syscall fsbuffer fstime fsdisk Without Patch ------------------------------------------------------------------------ File Copy 1024 bufsize 2000 maxblocks 875052.1 KBps (30.0 s, 2 samples) File Copy 256 bufsize 500 maxblocks 235484.0 KBps (30.0 s, 2 samples) File Copy 4096 bufsize 8000 maxblocks 2815153.5 KBps (30.0 s, 2 samples) System Call Overhead 5772268.3 lps (10.0 s, 7 samples) System Benchmarks Partial Index BASELINE RESULT INDEX File Copy 1024 bufsize 2000 maxblocks 3960.0 875052.1 2209.7 File Copy 256 bufsize 500 maxblocks 1655.0 235484.0 1422.9 File Copy 4096 bufsize 8000 maxblocks 5800.0 2815153.5 4853.7 System Call Overhead 15000.0 5772268.3 3848.2 ======== System Benchmarks Index Score (Partial Only) 2768.3 With Patch ------------------------------------------------------------------------ File Copy 1024 bufsize 2000 maxblocks 1009977.2 KBps (30.0 s, 2 samples) File Copy 256 bufsize 500 maxblocks 264765.9 KBps (30.0 s, 2 samples) File Copy 4096 bufsize 8000 maxblocks 3052236.0 KBps (30.0 s, 2 samples) System Call Overhead 8237404.4 lps (10.0 s, 7 samples) System Benchmarks Partial Index BASELINE RESULT INDEX File Copy 1024 bufsize 2000 maxblocks 3960.0 1009977.2 2550.4 File Copy 256 bufsize 500 maxblocks 1655.0 264765.9 1599.8 File Copy 4096 bufsize 8000 maxblocks 5800.0 3052236.0 5262.5 System Call Overhead 15000.0 8237404.4 5491.6 ======== System Benchmarks Index Score (Partial Only) 3295.3 Signed-off-by: chenzhiyin <zhiyin.chen@intel.com> Message-Id: <20230601092400.27162-1-zhiyin.chen@intel.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-05	highmem: Rename put_and_unmap_page() to unmap_and_put_page()	Fabio M. De Francesco
	With commit 849ad04cf562a ("new helper: put_and_unmap_page()"), Al Viro introduced the put_and_unmap_page() to use in those many places where we have a common pattern consisting of calls to kunmap_local() + put_page(). Obviously, first we unmap and then we put pages. Instead, the original name of this helper seems to imply that we first put and then unmap. Therefore, rename the helper and change the only known upstreamed user (i.e., fs/sysv) before this helper enters common use and might become difficult to find all call sites and instead easy to break the builds. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Message-Id: <20230602103307.5637-1-fmdefrancesco@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-06-05	dt-bindings: reset: nuvoton: Document ma35d1 reset control	Jacky Huang
	Add the dt-bindings header for Nuvoton ma35d1, that gets shared between the reset controller and reset references in the dts. Add documentation to describe nuvoton ma35d1 reset driver. Signed-off-by: Jacky Huang <ychuang3@nuvoton.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-05	dt-bindings: clock: nuvoton: add binding for ma35d1 clock controller	Jacky Huang
	Add the dt-bindings header for Nuvoton ma35d1, that gets shared between the clock controller and clock references in the dts. Add documentation to describe nuvoton ma35d1 clock driver. Signed-off-by: Jacky Huang <ychuang3@nuvoton.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-05	x86/amd_nb: Add MI200 PCI IDs	Yazen Ghannam
	The AMD MI200 series accelerators are data center GPUs. They include unified memory controllers and a data fabric similar to those used in AMD x86 CPU products. The memory controllers report errors using MCA, though these errors are generally handled through GPU drivers that directly manage the accelerator device. In some configurations, memory errors from these devices will be reported through MCA and managed by x86 CPUs. The OS is expected to handle these errors in similar fashion to MCA errors originating from memory controllers on the CPUs. In Linux, this flow includes passing MCA errors to a notifier chain with handlers in the EDAC subsystem. The AMD64 EDAC module requires information from the memory controllers and data fabric in order to provide detailed decoding of memory errors. The information is read from hardware registers accessed through interfaces in the data fabric. The accelerator data fabrics are visible to the host x86 CPUs as PCI devices just like x86 CPU data fabrics are already. However, the accelerator fabrics have new and unique PCI IDs. Add PCI IDs for the MI200 series of accelerator devices in order to enable EDAC support. The data fabrics of the accelerator devices will be enumerated as any other fabric already supported. System-specific implementation details will be handled within the AMD64 EDAC module. [ bp: Scrub off marketing speak. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230515113537.1052146-2-muralimk@amd.com
2023-06-05	net: pcs: xpcs: remove xpcs_create() from public view	Russell King (Oracle)
	There are now no callers of xpcs_create(), so let's remove it from public view to discourage future direct usage. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-05	net: pcs: Drop the TSE PCS driver	Maxime Chevallier
	Now that we can easily create a mdio-device that represents a memory-mapped device that exposes an MDIO-like register layout, we don't need the Altera TSE PCS anymore, since we can use the Lynx PCS instead. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-05	net: mdio: Introduce a regmap-based mdio driver	Maxime Chevallier
	There exists several examples today of devices that embed an ethernet PHY or PCS directly inside an SoC. In this situation, either the device is controlled through a vendor-specific register set, or sometimes exposes the standard 802.3 registers that are typically accessed over MDIO. As phylib and phylink are designed to use mdiodevices, this driver allows creating a virtual MDIO bus, that translates mdiodev register accesses to regmap accesses. The reason we use regmap is because there are at least 3 such devices known today, 2 of them are Altera TSE PCS's, memory-mapped, exposed with a 4-byte stride in stmmac's dwmac-socfpga variant, and a 2-byte stride in altera-tse. The other one (nxp,sja1110-base-tx-mdio) is exposed over SPI. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-05	locking/atomic: scripts: generate kerneldoc comments	Mark Rutland
	Currently the atomics are documented in Documentation/atomic_t.txt, and have no kerneldoc comments. There are a sufficient number of gotchas (e.g. semantics, noinstr-safety) that it would be nice to have comments to call these out, and it would be nice to have kerneldoc comments such that these can be collated. While it's possible to derive the semantics from the code, this can be painful given the amount of indirection we currently have (e.g. fallback paths), and it's easy to be mislead by naming, e.g. * The unconditional void-returning ops only have relaxed variants without a _relaxed suffix, and can easily be mistaken for being fully ordered. It would be nice to give these a _relaxed() suffix, but this would result in significant churn throughout the kernel. * Our naming of conditional and unconditional+test ops is rather inconsistent, and it can be difficult to derive the name of an operation, or to identify where an op is conditional or unconditional+test. Some ops are clearly conditional: - dec_if_positive - add_unless - dec_unless_positive - inc_unless_negative Some ops are clearly unconditional+test: - sub_and_test - dec_and_test - inc_and_test However, what exactly those test is not obvious. A _test_zero suffix might be clearer. Others could be read ambiguously: - inc_not_zero // conditional - add_negative // unconditional+test It would probably be worth renaming these, e.g. to inc_unless_zero and add_test_negative. As a step towards making this more consistent and easier to understand, this patch adds kerneldoc comments for all generated atomic_() functions. These are generated from templates, with some common text shared, making it easy to extend these in future if necessary. I've tried to make these as consistent and clear as possible, and I've deliberately ensured: All ops have their ordering explicitly mentioned in the short and long description. * All test ops have "test" in their short description. * All ops are described as an expression using their usual C operator. For example: andnot: "Atomically updates @v to (@v & ~@i)" inc: "Atomically updates @v to (@v + 1)" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All conditional ops have their condition described as an expression using the usual C operators. For example: add_unless: "If (@v != @u), atomically updates @v to (@v + @i)" cmpxchg: "If (@v == @old), atomically updates @v to @new" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All bitwise ops (and,andnot,or,xor) explicitly mention that they are bitwise in their short description, so that they are not mistaken for performing their logical equivalents. * The noinstr safety of each op is explicitly described, with a description of whether or not to use the raw_ form of the op. There should be no functional change as a result of this patch. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-26-mark.rutland@arm.com
2023-06-05	locking/atomic: scripts: simplify raw_atomic*() definitions	Mark Rutland
	Currently each ordering variant has several potential definitions, with a mixture of preprocessor and C definitions, including several copies of its C prototype, e.g. \| #if defined(arch_atomic_fetch_andnot_acquire) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| } \| #elif defined(arch_atomic_fetch_andnot) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot \| #else \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| return raw_atomic_fetch_and_acquire(~i, v); \| } \| #endif Make this a bit simpler by defining the C prototype once, and writing the various potential definitions as plain C code guarded by ifdeffery. For example, the above becomes: \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| #if defined(arch_atomic_fetch_andnot_acquire) \| return arch_atomic_fetch_andnot_acquire(i, v); \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| #elif defined(arch_atomic_fetch_andnot) \| return arch_atomic_fetch_andnot(i, v); \| #else \| return raw_atomic_fetch_and_acquire(~i, v); \| #endif \| } Which is far easier to read. As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. At the same time, the fallbacks for raw_atomic_xhcg() are made to use 'new' rather than 'i' as the name of the new value. This is what the existing fallback template used, and is more consistent with the raw_atomic{_try,}cmpxchg() fallbacks. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-24-mark.rutland@arm.com
2023-06-05	locking/atomic: scripts: simplify raw_atomic_long*() definitions	Mark Rutland
	Currently, atomic-long is split into two sections, one defining the raw_atomic_long_() ops for CONFIG_64BIT, and one defining the raw atomic_long_() ops for !CONFIG_64BIT. With many lines elided, this looks like: \| #ifdef CONFIG_64BIT \| ... \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| return raw_atomic64_try_cmpxchg(v, (s64 )old, new); \| } \| ... \| #else / CONFIG_64BIT / \| ... \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| return raw_atomic_try_cmpxchg(v, (int )old, new); \| } \| ... \| #endif The two definitions are spread far apart in the file, and duplicate the prototype, making it hard to have a legible set of kerneldoc comments. Make this simpler by defining the C prototype once, and writing the two definitions inline. For example, the above becomes: \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| #ifdef CONFIG_64BIT \| return raw_atomic64_try_cmpxchg(v, (s64 )old, new); \| #else \| return raw_atomic_try_cmpxchg(v, (int )old, new); \| #endif \| } As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. As a bonus, both the script and the generated file are somewhat shorter. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-23-mark.rutland@arm.com
2023-06-05	locking/atomic: scripts: restructure fallback ifdeffery	Mark Rutland
	Currently the various ordering variants of an atomic operation are defined in groups of full/acquire/release/relaxed ordering variants with some shared ifdeffery and several potential definitions of each ordering variant in different branches of the shared ifdeffery. As an ordering variant can have several potential definitions down different branches of the shared ifdeffery, it can be painful for a human to find a relevant definition, and we don't have a good location to place anything common to all definitions of an ordering variant (e.g. kerneldoc). Historically the grouping of full/acquire/release/relaxed ordering variants was necessary as we filled in the missing atomics in the same namespace as the architecture used. It would be easy to accidentally define one ordering fallback in terms of another ordering fallback with redundant barriers, and avoiding that would otherwise require a lot of baroque ifdeffery. With recent changes we no longer need to fill in the missing atomics in the arch_atomic_<op>() namespace, and only need to fill in the raw_atomic_<op>() namespace. Due to this, there's no risk of a namespace collision, and we can define each raw_atomic_<op> ordering variant with its own ifdeffery checking for the arch_atomic_<op> ordering variants. Restructure the fallbacks in this way, with each ordering variant having its own ifdeffery of the form: \| #if defined(arch_atomic_fetch_andnot_acquire) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| } \| #elif defined(arch_atomic_fetch_andnot) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot \| #else \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| return raw_atomic_fetch_and_acquire(~i, v); \| } \| #endif Note that where there's no relevant arch_atomic_<op>() ordering variant, we'll define the operation in terms of a distinct raw_atomic_<otherop>(), as this itself might have been filled in with a fallback. As we now generate the raw_atomic*_<op>() implementations directly, we no longer need the trivial wrappers, so they are removed. This makes the ifdeffery easier to follow, and will allow for further improvements in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-21-mark.rutland@arm.com
2023-06-05	locking/atomic: scripts: build raw_atomic_long*() directly	Mark Rutland
	Now that arch_atomic() usage is limited to the atomic headers, we no longer have any users of arch_atomic_long_(), and can generate raw_atomic_long_() directly. Generate the raw_atomic_long_() ops directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-20-mark.rutland@arm.com
2023-06-05	locking/atomic: treewide: use raw_atomic*_<op>()	Mark Rutland
	Now that we have raw_atomic_<op>() definitions, there's no need to use arch_atomic_<op>() definitions outside of the low-level atomic definitions. Move treewide users of arch_atomic_<op>() over to the equivalent raw_atomic_<op>(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-19-mark.rutland@arm.com
2023-06-05	locking/atomic: scripts: add trivial raw_atomic*_<op>()	Mark Rutland
	Currently a number of arch_atomic_<op>() functions are optional, and where an arch does not provide a given arch_atomic_<op>() we will define an implementation of arch_atomic*_<op>() in atomic-arch-fallback.h. Filling in the missing ops requires special care as we want to select the optimal definition of each op (e.g. preferentially defining ops in terms of their relaxed form rather than their fully-ordered form). The ifdeffery necessary for this requires us to group ordering variants together, which can be a bit painful to read, and is painful for kerneldoc generation. It would be easier to handle this if we generated ops into a separate namespace, as this would remove the need to take special care with the ifdeffery, and allow each ordering variant to be generated separately. This patch adds a new set of raw_atomic_<op>() definitions, which are currently trivial wrappers of their arch_atomic_<op>() equivalent. This will allow us to move treewide users of arch_atomic_<op>() over to raw atomic op before we rework the fallback generation to generate raw_atomic_<op> directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-18-mark.rutland@arm.com
2023-06-05	locking/atomic: make atomic*_{cmp,}xchg optional	Mark Rutland
	Most architectures define the atomic/atomic64 xchg and cmpxchg operations in terms of arch_xchg and arch_cmpxchg respectfully. Add fallbacks for these cases and remove the trivial cases from arch code. On some architectures the existing definitions are kept as these are used to build other arch_atomic*() operations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
2023-06-05	locking/atomic: remove fallback comments	Mark Rutland
	Currently a subset of the fallback templates have kerneldoc comments, resulting in a haphazard set of generated kerneldoc comments as only some operations have fallback templates to begin with. We'd like to generate more consistent kerneldoc comments, and to do so we'll need to restructure the way the fallback code is generated. To minimize churn and to make it easier to restructure the fallback code, this patch removes the existing kerneldoc comments from the fallback templates. We can add new kerneldoc comments in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-3-mark.rutland@arm.com
2023-06-05	arch: Remove cmpxchg_double	Peter Zijlstra
	No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.991907085@infradead.org
2023-06-05	slub: Replace cmpxchg_double()	Peter Zijlstra
	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.924677086@infradead.org
2023-06-05	x86,intel_iommu: Replace cmpxchg_double()	Peter Zijlstra
	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.855976804@infradead.org
2023-06-05	percpu: Wire up cmpxchg128	Peter Zijlstra
	In order to replace cmpxchg_double() with the newly minted cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.654945124@infradead.org
2023-06-05	percpu: Add {raw,this}_cpu_try_cmpxchg()	Peter Zijlstra
	Add the try_cmpxchg() form to the per-cpu ops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.587480729@infradead.org
2023-06-05	instrumentation: Wire up cmpxchg128()	Peter Zijlstra
	Wire up the cmpxchg128 family in the atomic wrapper scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed version, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.519237070@infradead.org
2023-06-05	types: Introduce [us]128	Peter Zijlstra
	Introduce [us]128 (when available). Unlike [us]64, ensure they are always naturally aligned. This also enables 128bit wide atomics (which require natural alignment) such as cmpxchg128(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.385005581@infradead.org
2023-06-05	cyrpto/b128ops: Remove struct u128	Peter Zijlstra
	Per git-grep u128_xor() and its related struct u128 are unused except to implement {be,le}128_xor(). Remove them to free up the namespace. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.314826687@infradead.org
2023-06-05	ALSA: emu10k1: vastly improve usefulness of info in /proc	Oswald Buddenhagen
	- Include the FX bus map, without which the already present send routing info would require looking up the documentation. - Include the physical I/O channels as known to the driver - Make the multi-channel capture map actually name the mapped input channels rather than "FXBUS" (Audigy) or even "???" (SbLive) - The latter two are omitted for E-MU cards, as their physical I/O is routed through the FPGA - While at it, make the "Card" field somewhat more useful This includes de-duplicating the label tables between emuproc and emufx, updating/improving the FX bus label table, and making the SB Live! 5.1 multi-track capture channel mapping hack data-driven. Tested-by: Jonathan Dowland <jon@dow.land> Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Link: https://lore.kernel.org/r/20230526101659.437969-7-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-05	ALSA: emu10k1: make E-MU FPGA register dump in /proc more useful	Oswald Buddenhagen
	Include the routing information, which can be actually read back. Somewhat as a drive-by, make the register dump format less obscure - the previous one made no sense at all. Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de> Link: https://lore.kernel.org/r/20230526101659.437969-6-oswald.buddenhagen@gmx.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-06-05	Merge 6.4-rc5 into tty-next	Greg Kroah-Hartman
	We need the tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-05	Merge 6.4-rc5 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here are well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-05	Merge 6.4-rc5 into driver-core-next	Greg Kroah-Hartman
	We need the driver core fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-05	Merge 6.4-rc5 into char-misc-next	Greg Kroah-Hartman
	We need the char/misc fixes in here as well for mergeing and testing. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-05	ata: libata-sata: Simplify ata_change_queue_depth()	Damien Le Moal
	Commit 141f3d6256e5 ("ata: libata-sata: Fix device queue depth control") added a struct ata_device argument to ata_change_queue_depth() to address problems with changing the queue depth of ATA devices managed through libsas. This was due to problems with ata_scsi_find_dev() which are now fixed with commit 7f875850f20a ("ata: libata-scsi: Use correct device no in ata_find_dev()"). Undo some of the changes of commit 141f3d6256e5: remove the added struct ata_device aregument and use again ata_scsi_find_dev() to find the target ATA device structure. While doing this, also make sure that ata_scsi_find_dev() is called with ap->lock held, as it should. libsas and libata call sites of ata_change_queue_depth() are updated to match the modified function arguments. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com>
2023-06-04	ublk: add control command of UBLK_U_CMD_GET_FEATURES	Ming Lei
	Add control command of UBLK_U_CMD_GET_FEATURES for returning driver's feature set or capability. This way can simplify userspace for maintaining compatibility because userspace doesn't need to send command to one device for querying driver feature set any more. Such as, with the queried feature set, userspace can choose to use: - UBLK_CMD_GET_DEV_INFO2 or UBLK_CMD_GET_DEV_INFO, - UBLK_U_CMD_* or UBLK_CMD_* Userspace code: https://github.com/ming1/ubdsrv/commits/features-cmd Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230603040601.775227-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-04	Merge tag 'media/v6.4-4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some driver fixes: - a regression fix for the verisilicon driver - uvcvideo: don't expose unsupported video formats to userspace - camss-video: don't zero subdev format after init - mediatek: some fixes for 4K decoder formats - fix a Sphinx build warning (missing doc for client_caps) - some fixes for imx and atomisp staging drivers And two CEC core fixes: - don't set last_initiator if TX in progress - disable adapter in cec_devnode_unregister" * tag 'media/v6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: uvcvideo: Don't expose unsupported formats to userspace media: v4l2-subdev: Fix missing kerneldoc for client_caps media: staging: media: imx: initialize hs_settle to avoid warning media: v4l2-mc: Drop subdev check in v4l2_create_fwnode_links_to_pad() media: staging: media: atomisp: init high & low vars media: cec: core: don't set last_initiator if tx in progress media: cec: core: disable adapter in cec_devnode_unregister media: mediatek: vcodec: Only apply 4K frame sizes on decoder formats media: camss: camss-video: Don't zero subdev format again after initialization media: verisilicon: Additional fix for the crash when opening the driver
2023-06-04	Merge tag 'char-misc-6.4-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that resolve a number of reported issues. Included in here are: - iio driver fixes - fpga driver fixes - test_firmware bugfixes - fastrpc driver tiny bugfixes - MAINTAINERS file updates for some subsystems All of these have been in linux-next this past week with no reported issues" * tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits) test_firmware: fix the memory leak of the allocated firmware buffer test_firmware: fix a memory leak with reqs buffer test_firmware: prevent race conditions by a correct implementation of locking firmware_loader: Fix a NULL vs IS_ERR() check MAINTAINERS: Vaibhav Gupta is the new ipack maintainer dt-bindings: fpga: replace Ivan Bornyakov maintainership MAINTAINERS: update Microchip MPF FPGA reviewers misc: fastrpc: reject new invocations during device removal misc: fastrpc: return -EPIPE to invocations on device removal misc: fastrpc: Reassign memory ownership only for remote heap misc: fastrpc: Pass proper scm arguments for secure map request iio: imu: inv_icm42600: fix timestamp reset iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value iio: dac: mcp4725: Fix i2c_master_send() return value handling iio: accel: kx022a fix irq getting iio: bu27034: Ensure reset is written iio: dac: build ad5758 driver when AD5758 is selected iio: addac: ad74413: fix resistance input processing iio: light: vcnl4035: fixed chip ID check ...
2023-06-04	Merge tag 'usb-6.4-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB driver and core fixes for 6.4-rc5. Most of these are tiny driver fixes, including: - udc driver bugfix - f_fs gadget driver bugfix - cdns3 driver bugfix - typec bugfixes But the "big" thing in here is a fix yet-again for how the USB buffers are handled from userspace when dealing with DMA issues. The changes were discussed a lot, and tested a lot, on the list, and acked by the relevant mm maintainers and have been in linux-next all this past week with no reported problems" * tag 'usb-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: tps6598x: Fix broken polling mode after system suspend/resume mm: page_table_check: Ensure user pages are not slab pages mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM usb: usbfs: Use consistent mmap functions usb: usbfs: Enforce page requirements for mmap dt-bindings: usb: snps,dwc3: Fix "snps,hsphy_interface" type usb: gadget: udc: fix NULL dereference in remove() usb: gadget: f_fs: Add unbind event before functionfs_unbind usb: cdns3: fix NCM gadget RX speed 20x slow than expection at iMX8QM
2023-06-03	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Five fixes, all in drivers. The most extensive is the target change to fix the hang in the login code, which involves changing timers from per login to per connection" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: stex: Fix gcc 13 warnings scsi: qla2xxx: Fix NULL pointer dereference in target mode scsi: target: iscsi: Prevent login threads from racing between each other scsi: target: iscsi: Remove unused transport_timer scsi: target: iscsi: Fix hang in the iSCSI login code
2023-06-02	net: phylib: fix phy_read*_poll_timeout()	Russell King (Oracle)
	Dan Carpenter reported a signedness bug in genphy_loopback(). Andrew reports that: "It is common to get this wrong in general with PHY drivers. Dan regularly posts fixes like this soon after a PHY driver patch it merged. I really wish we could somehow get the compiler to warn when the result from phy_read() is stored into a unsigned type. It would save Dan a lot of work." Let's make phy_read*_poll_timeout() immune to further issues when "val" is an unsigned type by storing the read function's result in a signed int as well as "val", and using the signed variable both to check for an error and for propagating that error to the caller. The advantage of this method is we don't change where the cast from the signed return code to the user's variable occurs - so users will see no change. Previously Heiner changed phy_read_poll_timeout() to check for an error before evaluating the user supplied condition, but didn't update phy_read_mmd_poll_timeout(). Make that change there too. Link: https://lore.kernel.org/r/d7bb312e-2428-45f6-b9b3-59ba544e8b94@kili.mountain Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1q4kX6-00BNuM-Mx@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>