linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2023-03-22	tools/power/x86/intel-speed-select: Introduce TPMI interface support	Zhang Rui
	TPMI (Topology Aware Register and PM Capsule Interface) creates a flexible, extendable and software-PCIe-driver-enumerable MMIO interface for PM features. SST feature is exposed via the TPMI interface on newer Xeon platforms. Kernel TPMI based SST driver provides a series of new IOCTLs for userspace to use. Introduce support for the platforms that do SST control via TPMI interface. Compared with previous platforms, Newer Xeons also supports multi-punit in a package/die, including cpu punit and non-cpu punit. These have already been handled in the generic code. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Get punit core mapping information	Srinivas Pandruvada
	Get punit core mapping information using format of MSR 0x54. Based on the API version, decode is done using new format. The new format also include a power domain ID. TPMI SST information is for each power domain. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce api_version helper	Zhang Rui
	In some cases, the output format may be different with different api_version because of different capabilities or for backward capabilities reason. Introduce api_version() to get the api_version of the platform running. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Support large clos_min/max	Zhang Rui
	clos_min/max in TPMI interface is frequency in MHz, thus clos_min/max needs to support larger values. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce is_debug_enabled()	Zhang Rui
	Platform specific code also needs to give debug output. Introduce is_debug_enabled() for this purpose. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Allow api_version based platform callbacks	Zhang Rui
	Different api_version suggests different kernel driver used and different interface is used to communication with the hardware. Allow setting platform specific callbacks based on api_version. Currently, all platforms with api_version 1 uses Mbox/MMIO interfaces. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Move send_mbox_cmd to isst-core-mbox.c	Zhang Rui
	After the previous cleanup, there is no user of send_mbox_cmd outside of isst-core-mbox.c. Thus move send_mbox_cmd to isst-core-mbox.c as internal functions. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract adjust_uncore_freq	Zhang Rui
	Allow platform specific implementation to adjust the uncore frequency. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract read_pm_config	Zhang Rui
	Allow platform specific implementation to get SST-CP capability and current state. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract clos_associate	Zhang Rui
	Allow platform specific implementation to set per core CLOS setting. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract clos_get_assoc_status	Zhang Rui
	Allow platform specific implementation to get per core CLOS setting. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract set_clos	Zhang Rui
	Allow platform specific implementation to set CLOS priority setting. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract pm_get_clos	Zhang Rui
	Allow platform specific implementation to get CLOS priority setting. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract pm_qos_config	Zhang Rui
	Allow platform specific implementation to set CLOS config settings. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_clos_information	Zhang Rui
	Allow platform specific implementation to get CLOS config setting. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_get_trls	Zhang Rui
	Allow platform specific implementation to get turbo ratio limits of each AVX level, for a selected SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Enhance get_tdp_info	Zhang Rui
	mbox_get_uncore_p0_p1_info/get_p1_info/get_uncore_mem_freq can be done inside get_tdp_info(). Fold the code into get_tdp_info(). No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_uncore_p0_p1_info	Zhang Rui
	Allow platform specific implementation to get uncore frequency info. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_fact_info	Zhang Rui
	Allow platform specific implementation to get SST-TF info. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract set_pbf_fact_status	Zhang Rui
	Allow platform specific implementation to enable/disable SST-TF/BF. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Remove isst_get_pbf_info_complete	Zhang Rui
	isst_get_pbf_info_complete does nothing but just free the core_mask. Remove the function and do free core_mask directly and free core mask in the caller. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_pbf_info	Zhang Rui
	Allow platform specific implementation to get SST-BF information. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract set_tdp_level	Zhang Rui
	Allow platform specific implementation to set a SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_trl_bucket_info	Zhang Rui
	Allow platform specific implementation to get buckets info. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_get_trl	Zhang Rui
	Allow platform specific implementation to get turbo ratio limit of the selected SST-PP level, and AVX level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_coremask_info	Zhang Rui
	Allow platform specific implementation to get the core mask for a given SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_tjmax_info	Zhang Rui
	Allow platform specific implementation to get the Tjmax info for a given SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Move code right before its caller	Zhang Rui
	Some functions are defined far from its only caller. Rearrange the code. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_pwr_info	Zhang Rui
	Allow platform specific implementation to get min and max power for a given SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_tdp_info	Zhang Rui
	Allow platform specific implementation to get TDP information. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_ctdp_control	Zhang Rui
	Allow platform specific implementation to get SST-TF/BF/CP capabilities and status. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract get_config_levels	Zhang Rui
	Allow platform specific implementation to get SST-PP level. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Abstract is_punit_valid	Zhang Rui
	Allow platform specific implementation to identify a valid punit. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce isst-core-mbox.c	Zhang Rui
	isst-core.c should contain generic core APIs only. Platform specific implementations/configurations should be removed from this file. Introduce isst-core-mbox.c and move all mbox/mmio specific functions to this file. Introduce struct isst_platform_ops which contains a series of callbacks that used by the core APIs but need platform specific implementation. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Always invoke isst_fill_platform_info	Zhang Rui
	isst_fill_platform_info fills platform specific information. And it is the proper place to set platform specific callbacks, as done in next patch. As the platform specific callbacks are needed in all cases, including isst_print_platform_information. The best way to achieve both is to invoke isst_fill_platform_info unconditionally, and make isst_print_platform_information leverage the data already filled. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce isst_get_disp_freq_multiplier	Zhang Rui
	Remove hardcoded DISP_FREQ_MULTIPLIER in the code and use isst_get_disp_freq_multiplier() instead. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Move mbox functions to isst-core.c	Zhang Rui
	isst-config.c should only contain generic code. Move mbox functions which are platform specific code to isst-core.c. As there are some platform specific parameters set via generic application options, introduce isst_update_platform_param to pass these parameters to platform specific code. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Improve isst_print_extended_platform_info	Zhang Rui
	The main thing done in isst_print_extended_platform_info is to get the isst feature status by checking one of the power domains of the platform. This can be done using the for_each_online_power_domain_in_set() function, which makes the code clean and easier to read. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Rename for_each_online_package_in_set	Zhang Rui
	for_each_online_package_in_set is actually used to invoke callback for each power domain. This is not a problem when there is a single power domain within a package/die, but it does not reflect the truth in multi-punit case. Rename for_each_online_package_in_set to for_each_online_power_domain_in_set. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce support for multi-punit	Zhang Rui
	New platforms may have more than 1 punit in a Package/Die, thus it can have multiple power domains in a Package/Die. Package id and die id is not sufficient to refer to a specific Power domain. Introduce support for multi-punit per package/die. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce isst_is_punit_valid()	Zhang Rui
	Introduce isst_is_punit_valid() for checking a valid domain. For current platforms, it requires a punit 0 in a valid Package/Die. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Introduce punit to isst_id	Zhang Rui
	Punit id can also be retrieved from ISST_IF_GET_PHY_ID. punit id is unique within a Package/Die, and together with Package id and Die id, they can be used to refer to a specific SST power domain. For current platforms, Punit id is always Zero. So no functional changes are expected for the current platforms. While here, prevent issuing IOCTL if the file /dev/isst_interface can't be opened. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Follow TRL nameing for FACT info	Zhang Rui
	SST-TF high priority core count and ratios and low priority core ratios are also per TRL level. Cleanup the code to follow the same nameing convention as TRL. This removes hardcoded TRL level names and variables. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/power/x86/intel-speed-select: Unify TRL levels	Zhang Rui
	TRL supports different levels including SSE/AVX2/AVX512. Avoid using hardcoded level name and structure fields, so that a loop can be used to parse each TRL level instead. This reduces several lines of source code. No functional changes are expected. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-03-22	tools/memory-model: Add documentation about SRCU read-side critical sections	Alan Stern
	Expand the discussion of SRCU and its read-side critical sections in the Linux Kernel Memory Model documentation file explanation.txt. The new material discusses recent changes to the memory model made in commit 6cd244c87428 ("tools/memory-model: Provide exact SRCU semantics"). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Akira Yokosawa <akiyks@gmail.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Jonas Oberhauser <jonas.oberhauser@huawei.com> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> CC: Will Deacon <will@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-22	tools/memory-model: Make ppo a subrelation of po	Jonas Oberhauser
	As stated in the documentation and implied by its name, the ppo (preserved program order) relation is intended to link po-earlier to po-later instructions under certain conditions. However, a corner case currently allows instructions to be linked by ppo that are not executed by the same thread, i.e., instructions are being linked that have no po relation. This happens due to the mb/strong-fence/fence relations, which (as one case) provide order when locks are passed between threads followed by an smp_mb__after_unlock_lock() fence. This is illustrated in the following litmus test (as can be seen when using herd7 with `doshow ppo`): P0(spinlock_t x, spinlock_t y) { spin_lock(x); spin_unlock(x); } P1(spinlock_t x, spinlock_t y) { spin_lock(x); smp_mb__after_unlock_lock(); y = 1; } The ppo relation will link P0's spin_lock(x) and P1's y=1, because P0 passes a lock to P1 which then uses this fence. The patch makes ppo a subrelation of po by letting fence contribute to ppo only in case the fence links events of the same thread. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-22	tools/memory-model: Provide exact SRCU semantics	Alan Stern
	LKMM has long provided only approximate handling of SRCU read-side critical sections. This has not been a pressing problem because LKMM's traditional handling is correct for the common cases of non-overlapping and properly nested critical sections. However, LKMM's traditional handling of partially overlapping critical sections incorrectly fuses them into one large critical section. For example, consider the following litmus test: ------------------------------------------------------------------------ C C-srcu-nest-5 (* * Result: Sometimes * * This demonstrates non-nested overlapping of SRCU read-side critical * sections. Unlike RCU, SRCU critical sections do not unconditionally * nest. ) {} P0(int x, int y, struct srcu_struct s1) { int r1; int r2; int r3; int r4; r3 = srcu_read_lock(s1); r2 = READ_ONCE(y); r4 = srcu_read_lock(s1); srcu_read_unlock(s1, r3); r1 = READ_ONCE(x); srcu_read_unlock(s1, r4); } P1(int x, int y, struct srcu_struct s1) { WRITE_ONCE(y, 1); synchronize_srcu(s1); WRITE_ONCE(x, 1); } locations [0:r1] exists (0:r1=1 /\ 0:r2=0) ------------------------------------------------------------------------ Current mainline incorrectly flattens the two critical sections into one larger critical section, giving "Never" instead of the correct "Sometimes": ------------------------------------------------------------------------ $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus Test C-srcu-nest-5 Allowed States 3 0:r1=0; 0:r2=0; 0:r1=0; 0:r2=1; 0:r1=1; 0:r2=1; No Witnesses Positive: 0 Negative: 3 Flag srcu-bad-nesting Condition exists (0:r1=1 /\ 0:r2=0) Observation C-srcu-nest-5 Never 0 3 Time C-srcu-nest-5 0.01 Hash=e692c106cf3e84e20f12991dc438ff1b ------------------------------------------------------------------------ To its credit, it does complain about bad nesting. But with this commit we get the following result, which has the virtue of being correct: ------------------------------------------------------------------------ $ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus Test C-srcu-nest-5 Allowed States 4 0:r1=0; 0:r2=0; 0:r1=0; 0:r2=1; 0:r1=1; 0:r2=0; 0:r1=1; 0:r2=1; Ok Witnesses Positive: 1 Negative: 3 Condition exists (0:r1=1 /\ 0:r2=0) Observation C-srcu-nest-5 Sometimes 1 3 Time C-srcu-nest-5 0.05 Hash=e692c106cf3e84e20f12991dc438ff1b ------------------------------------------------------------------------ In addition, there are new srcu_down_read() and srcu_up_read() functions on their way to mainline. Roughly speaking, these are to srcu_read_lock() and srcu_read_unlock() as down() and up() are to mutex_lock() and mutex_unlock(). The key point is that srcu_down_read() can execute in one process and the matching srcu_up_read() in another, as shown in this litmus test: ------------------------------------------------------------------------ C C-srcu-nest-6 ( * Result: Never * * This would be valid for srcu_down_read() and srcu_up_read(). ) {} P0(int x, int y, struct srcu_struct s1, int idx, int f) { int r2; int r3; r3 = srcu_down_read(s1); WRITE_ONCE(idx, r3); r2 = READ_ONCE(y); smp_store_release(f, 1); } P1(int x, int y, struct srcu_struct s1, int idx, int f) { int r1; int r3; int r4; r4 = smp_load_acquire(f); r1 = READ_ONCE(x); r3 = READ_ONCE(idx); srcu_up_read(s1, r3); } P2(int x, int y, struct srcu_struct s1) { WRITE_ONCE(y, 1); synchronize_srcu(s1); WRITE_ONCE(x, 1); } locations [0:r1] filter (1:r4=1) exists (1:r1=1 /\ 0:r2=0) ------------------------------------------------------------------------ When run on current mainline, this litmus test gets a complaint about an unknown macro srcu_down_read(). With this commit: ------------------------------------------------------------------------ herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus Test C-srcu-nest-6 Allowed States 3 0:r1=0; 0:r2=0; 1:r1=0; 0:r1=0; 0:r2=1; 1:r1=0; 0:r1=0; 0:r2=1; 1:r1=1; No Witnesses Positive: 0 Negative: 3 Condition exists (1:r1=1 /\ 0:r2=0) Observation C-srcu-nest-6 Never 0 3 Time C-srcu-nest-6 0.02 Hash=c1f20257d052ca5e899be508bedcb2a1 ------------------------------------------------------------------------ Note that the user must supply the flag "f" and the "filter" clause, similar to what must be done to emulate call_rcu(). The commit works by treating srcu_read_lock()/srcu_down_read() as loads and srcu_read_unlock()/srcu_up_read() as stores. This allows us to determine which unlock matches which lock by looking for a data dependency between them. In order for this to work properly, the data dependencies have to be tracked through stores to intermediate variables such as "idx" in the litmus test above; this is handled by the new carry-srcu-data relation. But it's important here (and in the existing carry-dep relation) to avoid tracking the dependencies through SRCU unlock stores. Otherwise, in situations resembling: A: r1 = srcu_read_lock(s); B: srcu_read_unlock(s, r1); C: r2 = srcu_read_lock(s); D: srcu_read_unlock(s, r2); it would look as if D was dependent on both A and C, because "s" would appear to be an intermediate variable written by B and read by C. This explains the complications in the definitions of carry-srcu-dep and carry-dep. As a debugging aid, the commit adds a check for errors in which the value returned by one call to srcu_read_lock()/srcu_down_read() is passed to more than one instance of srcu_read_unlock()/srcu_up_read(). Finally, since these SRCU-related primitives are now treated as ordinary reads and writes, we have to add them into the lists of marked accesses (i.e., not subject to data races) and lock-related accesses (i.e., one shouldn't try to access an srcu_struct with a non-lock-related primitive such as READ_ONCE() or a plain write). Portions of this approach were suggested by Boqun Feng and Jonas Oberhauser. [ paulmck: Fix space-before-tab whitespace nit. ] Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reviewed-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-22	tools/memory-model: Restrict to-r to read-read address dependency	Joel Fernandes (Google)
	During a code-reading exercise of linux-kernel.cat CAT file, I generated a graph to show the to-r relations. While likely not problematic for the model, I found it confusing that a read-write address dependency would show as a to-r edge on the graph. This patch therefore restricts the to-r links derived from addr to only read-read address dependencies, so that read-write address dependencies don't show as to-r in the graphs. This should also prevent future users of to-r from deriving incorrect relations. Note that a read-write address dep, obviously, still ends up in the ppo relation via the to-w relation. I verified that a read-read address dependency still shows up as a to-r link in the graph, as it did before. For reference, the problematic graph was generated with the following command: herd7 -conf linux-kernel.cfg \ -doshow dep -doshow to-r -doshow to-w ./foo.litmus -show all -o OUT/ Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-22	tools/memory-model: Add smp_mb__after_srcu_read_unlock()	Paul E. McKenney
	This commit adds support for smp_mb__after_srcu_read_unlock(), which, when combined with a prior srcu_read_unlock(), implies a full memory barrier. No ordering is guaranteed to accesses between the two, and placing accesses between is bad practice in any case. Tests may be found at https://github.com/paulmckrcu/litmus in files matching manual/kernel/C-srcu-mb-*.litmus. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-22	tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-po	Jonas Oberhauser
	LKMM uses two relations for talking about UNLOCK+LOCK pairings: 1) po-unlock-lock-po, which handles UNLOCK+LOCK pairings on the same CPU or immediate lock handovers on the same lock variable 2) po;[UL];(co\|po);[LKW];po, which handles UNLOCK+LOCK pairs literally as described in rcupdate.h#L1002, i.e., even after a sequence of handovers on the same lock variable. The latter relation is used only once, to provide the guarantee defined in rcupdate.h#L1002 by smp_mb__after_unlock_lock(), which makes any UNLOCK+LOCK pair followed by the fence behave like a full barrier. This patch drops this use in favor of using po-unlock-lock-po everywhere, which unifies the way the model talks about UNLOCK+LOCK pairings. At first glance this seems to weaken the guarantee given by LKMM: When considering a long sequence of lock handovers such as below, where P0 hands the lock to P1, which hands it to P2, which finally executes such an after_unlock_lock fence, the mb relation currently links any stores in the critical section of P0 to instructions P2 executes after its fence, but not so after the patch. P0(int x, int y, spinlock_t mylock) { spin_lock(mylock); WRITE_ONCE(x, 2); spin_unlock(mylock); WRITE_ONCE(y, 1); } P1(int y, int z, spinlock_t mylock) { int r0 = READ_ONCE(y); // reads 1 spin_lock(mylock); spin_unlock(mylock); WRITE_ONCE(z,1); } P2(int z, int d, spinlock_t mylock) { int r1 = READ_ONCE(z); // reads 1 spin_lock(mylock); spin_unlock(mylock); smp_mb__after_unlock_lock(); WRITE_ONCE(d,1); } P3(int x, int d) { WRITE_ONCE(d,2); smp_mb(); WRITE_ONCE(*x,1); } exists (1:r0=1 /\ 2:r1=1 /\ x=2 /\ d=2) Nevertheless, the ordering guarantee given in rcupdate.h is actually not weakened. This is because the unlock operations along the sequence of handovers are A-cumulative fences. They ensure that any stores that propagate to the CPU performing the first unlock operation in the sequence must also propagate to every CPU that performs a subsequent lock operation in the sequence. Therefore any such stores will also be ordered correctly by the fence even if only the final handover is considered a full barrier. Indeed this patch does not affect the behaviors allowed by LKMM at all. The mb relation is used to define ordering through: 1) mb/.../ppo/hb, where the ordering is subsumed by hb+ where the lock-release, rfe, and unlock-acquire orderings each provide hb 2) mb/strong-fence/cumul-fence/prop, where the rfe and A-cumulative lock-release orderings simply add more fine-grained cumul-fence edges to substitute a single strong-fence edge provided by a long lock handover sequence 3) mb/strong-fence/pb and various similar uses in the definition of data races, where as discussed above any long handover sequence can be turned into a sequence of cumul-fence edges that provide the same ordering. Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>