git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-11-12	ARM: fix cacheflush with PAN	Russell King (Oracle)
	It seems that the cacheflush syscall got broken when PAN for LPAE was implemented. User access was not enabled around the cache maintenance instructions, causing them to fault. Fixes: 7af5b901e847 ("ARM: 9358/2: Implement PAN for LPAE by TTBR0 page table walks disablement") Reported-by: Michał Pecio <michal.pecio@gmail.com> Tested-by: Michał Pecio <michal.pecio@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-11-12	ARM: 9435/1: ARM/nommu: Fix typo "absence"	WangYuli
	There is a spelling mistake of 'absense' in comments which should be 'absence'. Link: https://lore.kernel.org/all/fca25741-c89f-43ea-95af-5e3232d513fc@arm.com/ Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-11-12	ARM: 9434/1: cfi: Fix compilation corner case	Linus Walleij
	When enabling expert mode CONFIG_EXPERT and using that power user mode to disable the branch prediction hardening !CONFIG_HARDEN_BRANCH_PREDICTOR, the assembly linker in CLANG notices that some assembly in proc-v7.S does not have corresponding C call sites, i.e. the prototypes in proc-v7-bugs.c are enclosed in ifdef CONFIG_HARDEN_BRANCH_PREDICTOR so this assembly: SYM_TYPED_FUNC_START(cpu_v7_smc_switch_mm) SYM_TYPED_FUNC_START(cpu_v7_hvc_switch_mm) Results in: ld.lld: error: undefined symbol: __kcfi_typeid_cpu_v7_smc_switch_mm >>> referenced by proc-v7.S:94 (.../arch/arm/mm/proc-v7.S:94) >>> arch/arm/mm/proc-v7.o:(.text+0x108) in archive vmlinux.a ld.lld: error: undefined symbol: __kcfi_typeid_cpu_v7_hvc_switch_mm >>> referenced by proc-v7.S:105 (.../arch/arm/mm/proc-v7.S:105) >>> arch/arm/mm/proc-v7.o:(.text+0x124) in archive vmlinux.a Fix this by adding an additional requirement that CONFIG_HARDEN_BRANCH_PREDICTOR has to be enabled to compile these assembly calls. Closes: https://lore.kernel.org/oe-kbuild-all/202411041456.ZsoEiD7T-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-11-12	md/raid5: Increase r5conf.cache_name size	John Garry
	For compiling with W=1, the following warning can be seen: drivers/md/raid5.c: In function ‘setup_conf’: drivers/md/raid5.c:2423:12: error: ‘%s’ directive output may be truncated writing up to 31 bytes into a region of size between 16 and 26 [-Werror=format-truncation=] "raid%d-%s", conf->level, mdname(conf->mddev)); ^~ drivers/md/raid5.c:2422:3: note: ‘snprintf’ output between 7 and 48 bytes into a destination of size 32 snprintf(conf->cache_name[0], namelen, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "raid%d-%s", conf->level, mdname(conf->mddev)); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Increase the array size to avoid this warning. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241112161019.4154616-2-john.g.garry@oracle.com Signed-off-by: Song Liu <song@kernel.org>
2024-11-12	vdpa/mlx5: Fix PA offset with unaligned starting iotlb map	Si-Wei Liu
	When calculating the physical address range based on the iotlb and mr [start,end) ranges, the offset of mr->start relative to map->start is not taken into account. This leads to some incorrect and duplicate mappings. For the case when mr->start < map->start the code is already correct: the range in [mr->start, map->start) was handled by a different iteration. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Cc: stable@vger.kernel.org Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2024-11-12	drm/amd: Fix initialization mistake for NBIO 7.7.0	Vijendar Mukunda
	There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME events while in the D0 state. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 447a54a0f79c9a409ceaa17804bdd2e0206397b9) Cc: stable@vger.kernel.org
2024-11-12	Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"	Alex Deucher
	This reverts commit 694c79769cb384bca8b1ec1d1e84156e726bd106. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 3c2296b1eec55b50c64509ba15406142d4a958dc) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12	drm/amd/display: Fix failure to read vram info due to static BP_RESULT	Hamish Claxton
	The static declaration causes the check to fail. Remove it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Fixes: 00c391102abc ("drm/amd/display: Add misc DC changes for DCN401") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 91314e7dfd83345b8b820b782b2511c9c32866cd) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12	drm/amdgpu: enable GTT fallback handling for dGPUs only	Christian König
	That is just a waste of time on APUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704 Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM\|GTT") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e8fc090d322346e5ce4c4cfe03a8100e31f61c3c) Cc: stable@vger.kernel.org
2024-11-12	srcu: Check for srcu_read_lock_lite() across all CPUs	Paul E. McKenney
	If srcu_read_lock_lite() is used on a given srcu_struct structure, then the grace-period processing must do synchronize_rcu() instead of smp_mb() between the scans of the ->srcu_unlock_count[] and ->srcu_lock_count[] counters. Currently, it does that by testing the SRCU_READ_FLAVOR_LITE bit of the ->srcu_reader_flavor mask, which works well. But only if the CPU running that srcu_struct structure's grace period has previously executed srcu_read_lock_lite(), which might not be the case, especially just after that srcu_struct structure has been created and initialized. This commit therefore updates the srcu_readers_unlock_idx() function to OR together the ->srcu_reader_flavor masks from all CPUs, and then make the srcu_readers_active_idx_check() function that test the SRCU_READ_FLAVOR_LITE bit in the resulting mask. Note that the srcu_readers_unlock_idx() function is already scanning all the CPUs to sum up the ->srcu_unlock_count[] fields and that this is on the grace-period slow path, hence no concerns about the small amount of extra work. Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/ Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	srcu: Remove smp_mb() from srcu_read_unlock_lite()	Paul E. McKenney
	The srcu_read_unlock_lite() function invokes __srcu_read_unlock() instead of __srcu_read_unlock_lite(), which means that it is doing an unnecessary smp_mb(). This is harmless other than the performance degradation. This commit therefore switches to __srcu_read_unlock_lite(). Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/ Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcutorture: Avoid printing cpu=-1 for no-fault RCU boost failure	Paul E. McKenney
	If a CPU runs throughout the stalled grace period without passing through a quiescent state, RCU priority boosting cannot help. The rcu_torture_boost_failed() function therefore prints a message flagging the first such CPU. However, if the stall was instead due to (for example) RCU's grace-period kthread being starved of CPU, there will be no such CPU, causing rcu_check_boost_fail() to instead pass back -1 through its cpup CPU-pointer parameter. Therefore, the current message complains about a mythical CPU -1. This commit therefore checks for this situation, and notes that all CPUs have passed through a quiescent state. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcuscale: Add guest_os_delay module parameter	Paul E. McKenney
	This commit adds a guest_os_delay module parameter that extends warm-up and cool-down the specified number of seconds before and after the series of test runs. This allows the data-collection intervals from any given rcuscale guest OSes to line up with active periods in the other rcuscale guest OSes, and also allows the thermal warm-up period required to obtain consistent results from one test to the next. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	refscale: Correct affinity check	Paul E. McKenney
	The current affinity check works fine until there are more reader processes than CPUs, at which point the affinity check is looking for non-existent CPUs. This commit therefore applies the same modulus to the check as is present in the set_cpus_allowed_ptr() call. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	torture: Add --no-affinity parameter to kvm.sh	Paul E. McKenney
	In performance tests, it can be counter-productive to spread torture-test guest OSes across sockets. Plus the experimenter might have ideas about what CPUs individual guest OSes are to run on. This commit therefore adds a --no-affinity parameter to kvm.sh to prevent it from running taskset on its guest OSes. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	hwmon: (tmp108) Add support for I3C device	Frank Li
	Add support for I3C device in the tmp108 driver to handle the P3T1085 sensor. Register the I3C device driver to enable I3C functionality for the sensor. Signed-off-by: Frank Li <Frank.Li@nxp.com> Message-ID: <20241112-p3t1085-v4-2-a1334314b1e6@nxp.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (tmp108) Add helper function tmp108_common_probe() to prepare I3C support	Frank Li
	Add help function tmp108_common_probe() to pave road to support i3c for P3T1085(NXP) chip. Use dev_err_probe() to simplify the code. Signed-off-by: Frank Li <Frank.Li@nxp.com> Message-ID: <20241112-p3t1085-v4-1-a1334314b1e6@nxp.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (acpi_power_meter) Fix fail to load module on platform without _PMD ↵	Huisong Li
	method According to the ACPI specification, the _PMD method is optional. The acpi_power_meter driver shouldn't fail to load if the platform has no _PMD method. Signed-off-by: Huisong Li <lihuisong@huawei.com> Message-ID: <20241112021228.22914-1-lihuisong@huawei.com> [groeck: Reworded commit description] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (nct6775-core) Fix overflows seen when writing limit attributes	Pei Xiao
	DIV_ROUND_CLOSEST() after kstrtoul() results in an overflow if a large number such as 18446744073709551615 is provided by the user. Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations. Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Fixes: c3963bc0a0cf ("hwmon: (nct6775) Split core and platform driver") Message-ID: <7d5084cea33f7c0fd0578c59adfff71f93de94d9.1731375425.git.xiaopei01@kylinos.cn> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (pwm-fan) Introduce start from stopped state handling	Marek Vasut
	Delta AFC0612DB-F00 fan has to be set to at least 30% PWM duty cycle to spin up from a stopped state, and can be afterward throttled down to lower PWM duty cycle. Introduce support for operating such fans which need to start at higher PWM duty cycle first and can slow down next. Introduce two new DT properties, "fan-stop-to-start-percent" and "fan-stop-to-start-us". The former describes the minimum percent of fan RPM at which it will surely spin up from stopped state. This value can be found in the fan datasheet and can be converted to PWM duty cycle easily. The "fan-stop-to-start-us" describes the minimum time in microseconds for which the fan has to be set to stopped state start RPM for the fan to surely spin up. Adjust the PWM setting code such that if the PWM duty cycle is below the minimum duty cycle needed by the fan to spin up from stopped state, then first set the PWM duty cycle to the minimum duty cycle needed by the fan to spin up from stopped state, then wait the time necessary for the fan to spin up from stopped state, and finally set the PWM duty cycle to the one desired by user. Signed-off-by: Marek Vasut <marex@denx.de> Message-ID: <20241106185925.223736-2-marex@denx.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	dt-bindings: hwmon: pwm-fan: Document start from stopped state properties	Marek Vasut
	Delta AFC0612DB-F00 fan has to be set to at least 30% PWM duty cycle to spin up from a stopped state, and can be afterward throttled down to lower PWM duty cycle. Introduce support for operating such fans which need to start at higher PWM duty cycle first and can slow down next. Document two new DT properties, "fan-stop-to-start-percent" and "fan-stop-to-start-usec". The former describes the minimum percent of fan RPM at which it will surely spin up from stopped state. This value can be found in the fan datasheet and can be converted to PWM duty cycle easily. The "fan-stop-to-start-usec" describes the minimum time in microseconds for which the fan has to be set to stopped state start RPM for the fan to surely spin up. Signed-off-by: Marek Vasut <marex@denx.de> Acked-by: Conor Dooley <conor.dooley@microchip.com> Message-ID: <20241106185925.223736-1-marex@denx.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (tmp108) Add NXP p3t1085 support	Frank Li
	Add compatible string 'nxp,p3t1085' since p3t1085's register layout is the same as tmp108. The p3t1085 supports I3C interface. Update document tmp108.rst and Kconfig's help context. Signed-off-by: Frank Li <Frank.Li@nxp.com> Message-ID: <20241111-p3t1085-v3-2-bff511550aad@nxp.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	dt-bindings: hwmon: ti,tmp108: Add nxp,p3t1085 compatible string	Frank Li
	The register layout of P3T1085 is the same as ti,tmp108. Add compatible string nxp,p3t1085 for it. The difference of P3T1085 is support I3C. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Message-ID: <20241111-p3t1085-v3-1-bff511550aad@nxp.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (sch5627, max31827) Fix typos in driver documentation	Abhinav Saxena
	Fix some typos in hwmon/sch5627 and hwmon/max31827 reported by checkpatch.pl. These changes are purely documentation cleanup with no functional modifications. Signed-off-by: Abhinav Saxena <xandfury@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Message-ID: <20241108212201.144482-1-xandfury@gmail.com> [groeck: Updated subject] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (jc42) Drop of_match_ptr() protection	Andy Shevchenko
	This prevents use of this driver with ACPI via PRP0001 and is an example of an anti pattern I'm trying to remove from the kernel. Hence drop from this driver. Also switch of.h for mod_devicetable.h include given use of struct of_device_id which is defined in that header. Reported-by: Konstantin Aladyshev <aladyshev22@gmail.com> Closes: https://lore.kernel.org/r/CACSj6VW7WKv5tiAkLCvSujENJvXq1Mc7_7vtkQsRSz3JGY0i3Q@mail.gmail.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Message-ID: <20241108124348.1392473-1-andriy.shevchenko@linux.intel.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (f71882fg) Fix grammar in fan speed trip points explanation	Abhinav Saxena
	Fix several grammatical issues in the fan speed trip points documentation: - Replace awkward "which % the fan should run at at" construction with clearer "that specify the percentage at which" - Fix incorrect "is chip depended" to "are chip dependent" for correct verb agreement and adjective form - Improve readability by reorganizing first sentence and separating the complex explanation into simpler parts - Add hyphen before "see" to improve readability - Remove redundant "at" in temperature description No functional changes, documentation only. Signed-off-by: Abhinav Saxena <xandfury@gmail.com> Message-ID: <20241107013849.47833-1-xandfury@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	dt-bindings: hwmon: pmbus: add ti tps25990 support	Jerome Brunet
	Add DT binding for the Texas Instruments TPS25990 eFuse Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Message-ID: <20241105-tps25990-v4-6-0e312ac70b62@baylibre.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (pmbus/core) clear faults after setting smbalert mask	Jerome Brunet
	pmbus_write_smbalert_mask() ignores the errors if the chip can't set smbalert mask the standard way. It is not necessarily a problem for the irq support if the chip is otherwise properly setup but it may leave an uncleared fault behind. pmbus_core will pick the fault on the next register_check(). The register check will fails regardless of the actual register support by the chip. This leads to missing attributes or debugfs entries for chips that should provide them. We cannot rely on register_check() as PMBUS_SMBALERT_MASK may be read-only. Unconditionally clear the page fault after setting PMBUS_SMBALERT_MASK to avoid the problem. Suggested-by: Guenter Roeck <linux@roeck-us.net> Fixes: 221819ca4c36 ("hwmon: (pmbus/core) Add interrupt support") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Message-ID: <20241105-tps25990-v4-5-0e312ac70b62@baylibre.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	hwmon: (pmbus/core) allow drivers to override WRITE_PROTECT	Jerome Brunet
	Use _pmbus_read_byte_data() rather than calling smbus directly to check the write protection status. This give a chance to device implementing write protection differently to report back on the actual write protection status. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Message-ID: <20241105-tps25990-v4-2-0e312ac70b62@baylibre.com> [groeck: Fix page parameter of _pmbus_read_byte_data()] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-11-12	rcu/nocb: Fix missed RCU barrier on deoffloading	Zqiang
	Currently, running rcutorture test with torture_type=rcu fwd_progress=8 n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60 test_boost=2, will trigger the following warning: WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0 RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0 Call Trace: <TASK> ? __warn+0x7e/0x120 ? rcu_nocb_rdp_deoffload+0x292/0x2a0 ? report_bug+0x18e/0x1a0 ? handle_bug+0x3d/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? rcu_nocb_rdp_deoffload+0x292/0x2a0 rcu_nocb_cpu_deoffload+0x70/0xa0 rcu_nocb_toggle+0x136/0x1c0 ? __pfx_rcu_nocb_toggle+0x10/0x10 kthread+0xd1/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> CPU0 CPU2 CPU3 //rcu_nocb_toggle //nocb_cb_wait //rcutorture // deoffload CPU1 // process CPU1's rdp rcu_barrier() rcu_segcblist_entrain() rcu_segcblist_add_len(1); // len == 2 // enqueue barrier // callback to CPU1's // rdp->cblist rcu_do_batch() // invoke CPU1's rdp->cblist // callback rcu_barrier_callback() rcu_barrier() mutex_lock(&rcu_state.barrier_mutex); // still see len == 2 // enqueue barrier callback // to CPU1's rdp->cblist rcu_segcblist_entrain() rcu_segcblist_add_len(1); // len == 3 // decrement len rcu_segcblist_add_len(-2); kthread_parkme() // CPU1's rdp->cblist len == 1 // Warn because there is // still a pending barrier // trigger warning WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); cpus_read_unlock(); // wait CPU1 to comes online and // invoke barrier callback on // CPU1 rdp's->cblist wait_for_completion(&rcu_state.barrier_completion); // deoffload CPU4 cpus_read_lock() rcu_barrier() mutex_lock(&rcu_state.barrier_mutex); // block on barrier_mutex // wait rcu_barrier() on // CPU3 to unlock barrier_mutex // but CPU3 unlock barrier_mutex // need to wait CPU1 comes online // when CPU1 going online will block on cpus_write_lock The above scenario will not only trigger a WARN_ON_ONCE(), but also trigger a deadlock. Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU will either observe the decremented callback counter down to 0 and spare the callback enqueue, or rcuo will observe the new callback and keep rdp->nocb_cb_sleep to false. Therefore check rdp->nocb_cb_sleep before parking to make sure no further rcu_barrier() is waiting on the rdp. Fixes: 1fcb932c8b5c ("rcu/nocb: Simplify (de-)offloading state machine") Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	block: remove the ioprio field from struct request	Christoph Hellwig
	The request ioprio is only initialized from the first attached bio, so requests without a bio already never set it. Directly use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-12	block: remove the write_hint field from struct request	Christoph Hellwig
	The write_hint is only used for read/write requests, which must have a bio attached to them. Just use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-12	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "x86 and selftests fixes. x86: - When emulating a guest TLB flush for a nested guest, flush vpid01, not vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if L2 and L1 are sharing VPID '0' (from L1's perspective). - Fix a bug in the SNP initialization flow where KVM would return '0' to userspace instead of -errno on failure. - Move the Intel PT virtualization (i.e. outputting host trace to host buffer and guest trace to guest buffer) behind CONFIG_BROKEN. - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START - Fix a bug where KVM fails to inject an interrupt from the IRR after KVM_SET_LAPIC. Selftests: - Increase the timeout for the memslot performance selftest to avoid false failures on arm64 and nested x86 platforms. - Fix a goof in the guest_memfd selftest where a for-loop initialized a bit mask to zero instead of BIT(0). - Disable strict aliasing when building KVM selftests to prevent the compiler from treating things like "u64 " to "uint64_t " cases as undefined behavior, which can lead to nasty, hard to debug failures. - Force -march=x86-64-v2 for KVM x86 selftests if and only if the uarch is supported by the compiler. - Fix broken compilation of kvm selftests after a header sync in tools/" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN KVM: x86: Unconditionally set irr_pending when updating APICv state kvm: svm: Fix gctx page leak on invalid inputs KVM: selftests: use X86_MEMTYPE_WB instead of VMX_BASIC_MEM_TYPE_WB KVM: SVM: Propagate error from snp_guest_req_init() to userspace KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled KVM: selftests: Don't force -march=x86-64-v2 if it's unsupported KVM: selftests: Disable strict aliasing KVM: selftests: fix unintentional noop test in guest_memfd_test.c KVM: selftests: memslot_perf_test: increase guest sync timeout
2024-11-12	Merge tag 'for-6.12/dm-fixes-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - fix warnings about duplicate slab cache names * tag 'for-6.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-cache: fix warnings about duplicate slab caches dm-bufio: fix warnings about duplicate slab caches
2024-11-12	Merge tag 'integrity-v6.12' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fixes from Mimi Zohar: "One bug fix, one performance improvement, and the use of static_assert: - The bug fix addresses "only a cosmetic change" commit, which didn't take into account the original 'ima' template definition. - The performance improvement limits the atomic_read()" * tag 'integrity-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Use static_assert() to check struct sizes evm: stop avoidably reading i_writecount in evm_file_release ima: fix buffer overrun in ima_eventdigest_init_common
2024-11-12	Merge tag 'landlock-6.12-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "This fixes issues in the Landlock's sandboxer sample and documentation, slightly refactors helpers (required for ongoing patch series), and improve/fix a feature merged in v6.12 (signal and abstract UNIX socket scoping)" * tag 'landlock-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Optimize scope enforcement landlock: Refactor network access mask management landlock: Refactor filesystem access mask management samples/landlock: Clarify option parsing behaviour samples/landlock: Refactor help message samples/landlock: Fix port parsing in sandboxer landlock: Fix grammar issues in documentation landlock: Improve documentation of previous limitations
2024-11-12	rcu/kvfree: Fix data-race in __mod_timer / kvfree_call_rcu	Uladzislau Rezki (Sony)
	KCSAN reports a data race when access the krcp->monitor_work.timer.expires variable in the schedule_delayed_monitor_work() function: <snip> BUG: KCSAN: data-race in __mod_timer / kvfree_call_rcu read to 0xffff888237d1cce8 of 8 bytes by task 10149 on cpu 1: schedule_delayed_monitor_work kernel/rcu/tree.c:3520 [inline] kvfree_call_rcu+0x3b8/0x510 kernel/rcu/tree.c:3839 trie_update_elem+0x47c/0x620 kernel/bpf/lpm_trie.c:441 bpf_map_update_value+0x324/0x350 kernel/bpf/syscall.c:203 generic_map_update_batch+0x401/0x520 kernel/bpf/syscall.c:1849 bpf_map_do_batch+0x28c/0x3f0 kernel/bpf/syscall.c:5143 __sys_bpf+0x2e5/0x7a0 __do_sys_bpf kernel/bpf/syscall.c:5741 [inline] __se_sys_bpf kernel/bpf/syscall.c:5739 [inline] __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5739 x64_sys_call+0x2625/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:322 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f write to 0xffff888237d1cce8 of 8 bytes by task 56 on cpu 0: __mod_timer+0x578/0x7f0 kernel/time/timer.c:1173 add_timer_global+0x51/0x70 kernel/time/timer.c:1330 __queue_delayed_work+0x127/0x1a0 kernel/workqueue.c:2523 queue_delayed_work_on+0xdf/0x190 kernel/workqueue.c:2552 queue_delayed_work include/linux/workqueue.h:677 [inline] schedule_delayed_monitor_work kernel/rcu/tree.c:3525 [inline] kfree_rcu_monitor+0x5e8/0x660 kernel/rcu/tree.c:3643 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3310 worker_thread+0x51d/0x6f0 kernel/workqueue.c:3391 kthread+0x1d1/0x210 kernel/kthread.c:389 ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Reported by Kernel Concurrency Sanitizer on: CPU: 0 UID: 0 PID: 56 Comm: kworker/u8:4 Not tainted 6.12.0-rc2-syzkaller-00050-g5b7c893ed5ed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: events_unbound kfree_rcu_monitor <snip> kfree_rcu_monitor() rearms the work if a "krcp" has to be still offloaded and this is done without holding krcp->lock, whereas the kvfree_call_rcu() holds it. Fix it by acquiring the "krcp->lock" for kfree_rcu_monitor() so both functions do not race anymore. Reported-by: syzbot+061d370693bdd99f9d34@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/ZxZ68KmHDQYU0yfD@pc636/T/ Fixes: 8fc5494ad5fa ("rcu/kvfree: Move need_offload_krc() out of krcp->lock") Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	thermal: testing: Simplify tt_get_tt_zone()	Rafael J. Wysocki
	Notice that tt_get_tt_zone() need not use the ret variable in the tt_thermal_zones list walk and make it return right after incrementing the reference counter of the matching entry. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/6120304.lOV4Wx5bFT@rjwysocki.net
2024-11-12	rcu/srcutiny: don't return before reenabling preemption	Michal Schmidt
	Code after the return statement is dead. Enable preemption before returning from srcu_drive_gp(). This will be important when/if PREEMPT_AUTO (lazy resched) gets merged. Fixes: 65b4a59557f6 ("srcu: Make Tiny SRCU explicitly disable preemption") Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcu-tasks: Remove open-coded one-byte cmpxchg() emulation	Paul E. McKenney
	This commit removes the open-coded one-byte cmpxchg() emulation from rcu_trc_cmpxchg_need_qs(), replacing it with just cmpxchg() given the latter's new-found ability to handle single-byte arguments across all architectures. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	doc: Remove kernel-parameters.txt entry for rcutorture.read_exit	Paul E. McKenney
	There is only ever the one read-exit task, and there is no module parameter named rcutorture.read_exit, so remove the bogus documentation. Instead, use rcutorture.read_exit_burst to enable/disable read-exit race testing. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcutorture: Test start-poll primitives with interrupts disabled	Paul E. McKenney
	This commit tests the ->start_poll() and ->start_poll_full() functions with interrupts disabled, but only for RCU variants setting the ->start_poll_irqsoff flag. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcu: Permit start_poll_synchronize_rcu*() with interrupts disabled	Paul E. McKenney
	The header comment for both start_poll_synchronize_rcu() and start_poll_synchronize_rcu_full() state that interrupts must be enabled when calling these two functions, and there is a lockdep assertion in start_poll_synchronize_rcu_common() enforcing this restriction. However, there is no need for this restrictions, as can be seen in call_rcu(), which does wakeups when interrupts are disabled. This commit therefore removes the lockdep assertion and the comments. Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcu: Allow short-circuiting of synchronize_rcu_tasks_rude()	Paul E. McKenney
	There are now architectures for which all deep-idle and entry-exit functions are properly inlined or marked noinstr. Such architectures do not need synchronize_rcu_tasks_rude(), or will not once RCU Tasks has been modified to pay attention to idle tasks. This commit therefore allows a CONFIG_ARCH_HAS_NOINSTR_MARKINGS Kconfig option to turn synchronize_rcu_tasks_rude() into a no-op. To facilitate testing, kernels built by rcutorture scripting will enable RCU Tasks Trace even on systems that do not need it. [ paulmck: Apply Peter Zijlstra feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	doc: Add rcuog kthreads to kernel-per-CPU-kthreads.rst	Paul E. McKenney
	This commit adds the rcuog kthreads to the list of callback-offloading kthreads that can be affinitied away from worker CPUs. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	cpufreq: ACPI: Simplify MSR read on the boot CPU	Chang S. Bae
	Replace the 32-bit MSR access function with a 64-bit variant to simplify the call site, eliminating unnecessary 32-bit value manipulations. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Link: https://patch.msgid.link/20241106182313.165297-1-chang.seok.bae@intel.com [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-11-12	rcu: Add rcuog kthreads to RCU_NOCB_CPU help text	Paul E. McKenney
	The RCU_NOCB_CPU help text currently fails to mention rcuog kthreads, so this commit adds this information. Reported-by: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcu: Use the BITS_PER_LONG macro	Jinjie Ruan
	sizeof(unsigned long) * 8 is the number of bits in an unsigned long variable, replace it with BITS_PER_LONG macro to make it simpler. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	rcu: Use bitwise instead of arithmetic operator for flags	Hongbo Li
	This silences the following coccinelle warning: WARNING: sum of probable bitmasks, consider \| Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12	doc: rcu: update printed dynticks counter bits	Baruch Siach
	The stall warning prints 16 bits since commit 171476775d32 ("context_tracking: Convert state to atomic_t"). Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t") Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>