Age | Commit message (Collapse) | Author |
|
TPMI (Topology Aware Register and PM Capsule Interface) creates a
flexible, extendable and software-PCIe-driver-enumerable MMIO interface
for PM features.
SST feature is exposed via the TPMI interface on newer Xeon platforms.
Kernel TPMI based SST driver provides a series of new IOCTLs for userspace
to use.
Introduce support for the platforms that do SST control via TPMI interface.
Compared with previous platforms, Newer Xeons also supports multi-punit in a
package/die, including cpu punit and non-cpu punit. These have already
been handled in the generic code.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Get punit core mapping information using format of MSR 0x54. Based
on the API version, decode is done using new format. The new format
also include a power domain ID. TPMI SST information is for each
power domain.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
In some cases, the output format may be different with different
api_version because of different capabilities or for backward
capabilities reason.
Introduce api_version() to get the api_version of the platform running.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
clos_min/max in TPMI interface is frequency in MHz, thus clos_min/max
needs to support larger values.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Platform specific code also needs to give debug output.
Introduce is_debug_enabled() for this purpose.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Different api_version suggests different kernel driver used and
different interface is used to communication with the hardware.
Allow setting platform specific callbacks based on api_version.
Currently, all platforms with api_version 1 uses Mbox/MMIO interfaces.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
After the previous cleanup, there is no user of send_mbox_cmd outside of
isst-core-mbox.c.
Thus move send_mbox_cmd to isst-core-mbox.c as internal functions.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to adjust the uncore frequency.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get SST-CP capability and
current state.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to set per core CLOS setting.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get per core CLOS setting.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to set CLOS priority setting.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get CLOS priority setting.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to set CLOS config settings.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get CLOS config setting.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get turbo ratio limits of each
AVX level, for a selected SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
mbox_get_uncore_p0_p1_info/get_p1_info/get_uncore_mem_freq can be done
inside get_tdp_info().
Fold the code into get_tdp_info().
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get uncore frequency info.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get SST-TF info.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to enable/disable SST-TF/BF.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
isst_get_pbf_info_complete does nothing but just free the core_mask.
Remove the function and do free core_mask directly and free core mask in
the caller.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get SST-BF information.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to set a SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get buckets info.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get turbo ratio limit of the
selected SST-PP level, and AVX level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get the core mask for a given
SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get the Tjmax info for a
given SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Some functions are defined far from its only caller.
Rearrange the code.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get min and max power for a
given SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get TDP information.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get SST-TF/BF/CP capabilities
and status.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to get SST-PP level.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Allow platform specific implementation to identify a valid punit.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
isst-core.c should contain generic core APIs only.
Platform specific implementations/configurations should be removed from
this file.
Introduce isst-core-mbox.c and move all mbox/mmio specific functions to
this file.
Introduce struct isst_platform_ops which contains a series of callbacks
that used by the core APIs but need platform specific implementation.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
isst_fill_platform_info fills platform specific information.
And it is the proper place to set platform specific callbacks, as done in
next patch.
As the platform specific callbacks are needed in all cases, including
isst_print_platform_information.
The best way to achieve both is to invoke isst_fill_platform_info
unconditionally, and make isst_print_platform_information leverage the
data already filled.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Remove hardcoded DISP_FREQ_MULTIPLIER in the code and use
isst_get_disp_freq_multiplier() instead.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
isst-config.c should only contain generic code.
Move mbox functions which are platform specific code to isst-core.c.
As there are some platform specific parameters set via generic
application options, introduce isst_update_platform_param to pass these
parameters to platform specific code.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
The main thing done in isst_print_extended_platform_info is to get the
isst feature status by checking one of the power domains of the
platform.
This can be done using the for_each_online_power_domain_in_set()
function, which makes the code clean and easier to read.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
for_each_online_package_in_set is actually used to invoke callback for
each power domain.
This is not a problem when there is a single power domain within a
package/die, but it does not reflect the truth in multi-punit case.
Rename for_each_online_package_in_set to
for_each_online_power_domain_in_set.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
New platforms may have more than 1 punit in a Package/Die, thus it can
have multiple power domains in a Package/Die. Package id and die id is not
sufficient to refer to a specific Power domain.
Introduce support for multi-punit per package/die.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Introduce isst_is_punit_valid() for checking a valid domain.
For current platforms, it requires a punit 0 in a valid Package/Die.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Punit id can also be retrieved from ISST_IF_GET_PHY_ID.
punit id is unique within a Package/Die, and together with Package id and
Die id, they can be used to refer to a specific SST power domain.
For current platforms, Punit id is always Zero. So no functional changes
are expected for the current platforms.
While here, prevent issuing IOCTL if the file /dev/isst_interface can't be
opened.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
SST-TF high priority core count and ratios and low priority core ratios
are also per TRL level.
Cleanup the code to follow the same nameing convention as TRL.
This removes hardcoded TRL level names and variables.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
TRL supports different levels including SSE/AVX2/AVX512.
Avoid using hardcoded level name and structure fields, so that a loop can
be used to parse each TRL level instead. This reduces several lines of
source code.
No functional changes are expected.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Expand the discussion of SRCU and its read-side critical sections in
the Linux Kernel Memory Model documentation file explanation.txt. The
new material discusses recent changes to the memory model made in
commit 6cd244c87428 ("tools/memory-model: Provide exact SRCU
semantics").
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Jonas Oberhauser <jonas.oberhauser@huawei.com>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
CC: Will Deacon <will@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
As stated in the documentation and implied by its name, the ppo
(preserved program order) relation is intended to link po-earlier
to po-later instructions under certain conditions. However, a
corner case currently allows instructions to be linked by ppo that
are not executed by the same thread, i.e., instructions are being
linked that have no po relation.
This happens due to the mb/strong-fence/fence relations, which (as
one case) provide order when locks are passed between threads
followed by an smp_mb__after_unlock_lock() fence. This is
illustrated in the following litmus test (as can be seen when using
herd7 with `doshow ppo`):
P0(spinlock_t *x, spinlock_t *y)
{
spin_lock(x);
spin_unlock(x);
}
P1(spinlock_t *x, spinlock_t *y)
{
spin_lock(x);
smp_mb__after_unlock_lock();
*y = 1;
}
The ppo relation will link P0's spin_lock(x) and P1's *y=1, because
P0 passes a lock to P1 which then uses this fence.
The patch makes ppo a subrelation of po by letting fence contribute
to ppo only in case the fence links events of the same thread.
Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
LKMM has long provided only approximate handling of SRCU read-side
critical sections. This has not been a pressing problem because LKMM's
traditional handling is correct for the common cases of non-overlapping
and properly nested critical sections. However, LKMM's traditional
handling of partially overlapping critical sections incorrectly fuses
them into one large critical section.
For example, consider the following litmus test:
------------------------------------------------------------------------
C C-srcu-nest-5
(*
* Result: Sometimes
*
* This demonstrates non-nested overlapping of SRCU read-side critical
* sections. Unlike RCU, SRCU critical sections do not unconditionally
* nest.
*)
{}
P0(int *x, int *y, struct srcu_struct *s1)
{
int r1;
int r2;
int r3;
int r4;
r3 = srcu_read_lock(s1);
r2 = READ_ONCE(*y);
r4 = srcu_read_lock(s1);
srcu_read_unlock(s1, r3);
r1 = READ_ONCE(*x);
srcu_read_unlock(s1, r4);
}
P1(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}
locations [0:r1]
exists (0:r1=1 /\ 0:r2=0)
------------------------------------------------------------------------
Current mainline incorrectly flattens the two critical sections into
one larger critical section, giving "Never" instead of the correct
"Sometimes":
------------------------------------------------------------------------
$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 3
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=1;
No
Witnesses
Positive: 0 Negative: 3
Flag srcu-bad-nesting
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Never 0 3
Time C-srcu-nest-5 0.01
Hash=e692c106cf3e84e20f12991dc438ff1b
------------------------------------------------------------------------
To its credit, it does complain about bad nesting. But with this
commit we get the following result, which has the virtue of being
correct:
------------------------------------------------------------------------
$ herd7 -conf linux-kernel.cfg C-srcu-nest-5.litmus
Test C-srcu-nest-5 Allowed
States 4
0:r1=0; 0:r2=0;
0:r1=0; 0:r2=1;
0:r1=1; 0:r2=0;
0:r1=1; 0:r2=1;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists (0:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-5 Sometimes 1 3
Time C-srcu-nest-5 0.05
Hash=e692c106cf3e84e20f12991dc438ff1b
------------------------------------------------------------------------
In addition, there are new srcu_down_read() and srcu_up_read()
functions on their way to mainline. Roughly speaking, these are to
srcu_read_lock() and srcu_read_unlock() as down() and up() are to
mutex_lock() and mutex_unlock(). The key point is that
srcu_down_read() can execute in one process and the matching
srcu_up_read() in another, as shown in this litmus test:
------------------------------------------------------------------------
C C-srcu-nest-6
(*
* Result: Never
*
* This would be valid for srcu_down_read() and srcu_up_read().
*)
{}
P0(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r2;
int r3;
r3 = srcu_down_read(s1);
WRITE_ONCE(*idx, r3);
r2 = READ_ONCE(*y);
smp_store_release(f, 1);
}
P1(int *x, int *y, struct srcu_struct *s1, int *idx, int *f)
{
int r1;
int r3;
int r4;
r4 = smp_load_acquire(f);
r1 = READ_ONCE(*x);
r3 = READ_ONCE(*idx);
srcu_up_read(s1, r3);
}
P2(int *x, int *y, struct srcu_struct *s1)
{
WRITE_ONCE(*y, 1);
synchronize_srcu(s1);
WRITE_ONCE(*x, 1);
}
locations [0:r1]
filter (1:r4=1)
exists (1:r1=1 /\ 0:r2=0)
------------------------------------------------------------------------
When run on current mainline, this litmus test gets a complaint about
an unknown macro srcu_down_read(). With this commit:
------------------------------------------------------------------------
herd7 -conf linux-kernel.cfg C-srcu-nest-6.litmus
Test C-srcu-nest-6 Allowed
States 3
0:r1=0; 0:r2=0; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=0;
0:r1=0; 0:r2=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (1:r1=1 /\ 0:r2=0)
Observation C-srcu-nest-6 Never 0 3
Time C-srcu-nest-6 0.02
Hash=c1f20257d052ca5e899be508bedcb2a1
------------------------------------------------------------------------
Note that the user must supply the flag "f" and the "filter" clause,
similar to what must be done to emulate call_rcu().
The commit works by treating srcu_read_lock()/srcu_down_read() as
loads and srcu_read_unlock()/srcu_up_read() as stores. This allows us
to determine which unlock matches which lock by looking for a data
dependency between them. In order for this to work properly, the data
dependencies have to be tracked through stores to intermediate
variables such as "idx" in the litmus test above; this is handled by
the new carry-srcu-data relation. But it's important here (and in the
existing carry-dep relation) to avoid tracking the dependencies
through SRCU unlock stores. Otherwise, in situations resembling:
A: r1 = srcu_read_lock(s);
B: srcu_read_unlock(s, r1);
C: r2 = srcu_read_lock(s);
D: srcu_read_unlock(s, r2);
it would look as if D was dependent on both A and C, because "s" would
appear to be an intermediate variable written by B and read by C.
This explains the complications in the definitions of carry-srcu-dep
and carry-dep.
As a debugging aid, the commit adds a check for errors in which the
value returned by one call to srcu_read_lock()/srcu_down_read() is
passed to more than one instance of srcu_read_unlock()/srcu_up_read().
Finally, since these SRCU-related primitives are now treated as
ordinary reads and writes, we have to add them into the lists of
marked accesses (i.e., not subject to data races) and lock-related
accesses (i.e., one shouldn't try to access an srcu_struct with a
non-lock-related primitive such as READ_ONCE() or a plain write).
Portions of this approach were suggested by Boqun Feng and Jonas
Oberhauser.
[ paulmck: Fix space-before-tab whitespace nit. ]
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
During a code-reading exercise of linux-kernel.cat CAT file, I generated
a graph to show the to-r relations. While likely not problematic for the
model, I found it confusing that a read-write address dependency would
show as a to-r edge on the graph.
This patch therefore restricts the to-r links derived from addr to only
read-read address dependencies, so that read-write address dependencies don't
show as to-r in the graphs. This should also prevent future users of to-r from
deriving incorrect relations. Note that a read-write address dep, obviously,
still ends up in the ppo relation via the to-w relation.
I verified that a read-read address dependency still shows up as a to-r
link in the graph, as it did before.
For reference, the problematic graph was generated with the following
command:
herd7 -conf linux-kernel.cfg \
-doshow dep -doshow to-r -doshow to-w ./foo.litmus -show all -o OUT/
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
This commit adds support for smp_mb__after_srcu_read_unlock(), which,
when combined with a prior srcu_read_unlock(), implies a full memory
barrier. No ordering is guaranteed to accesses between the two, and
placing accesses between is bad practice in any case.
Tests may be found at https://github.com/paulmckrcu/litmus in files
matching manual/kernel/C-srcu-mb-*.litmus.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
LKMM uses two relations for talking about UNLOCK+LOCK pairings:
1) po-unlock-lock-po, which handles UNLOCK+LOCK pairings
on the same CPU or immediate lock handovers on the same
lock variable
2) po;[UL];(co|po);[LKW];po, which handles UNLOCK+LOCK pairs
literally as described in rcupdate.h#L1002, i.e., even
after a sequence of handovers on the same lock variable.
The latter relation is used only once, to provide the guarantee
defined in rcupdate.h#L1002 by smp_mb__after_unlock_lock(), which
makes any UNLOCK+LOCK pair followed by the fence behave like a full
barrier.
This patch drops this use in favor of using po-unlock-lock-po
everywhere, which unifies the way the model talks about UNLOCK+LOCK
pairings. At first glance this seems to weaken the guarantee given
by LKMM: When considering a long sequence of lock handovers
such as below, where P0 hands the lock to P1, which hands it to P2,
which finally executes such an after_unlock_lock fence, the mb
relation currently links any stores in the critical section of P0
to instructions P2 executes after its fence, but not so after the
patch.
P0(int *x, int *y, spinlock_t *mylock)
{
spin_lock(mylock);
WRITE_ONCE(*x, 2);
spin_unlock(mylock);
WRITE_ONCE(*y, 1);
}
P1(int *y, int *z, spinlock_t *mylock)
{
int r0 = READ_ONCE(*y); // reads 1
spin_lock(mylock);
spin_unlock(mylock);
WRITE_ONCE(*z,1);
}
P2(int *z, int *d, spinlock_t *mylock)
{
int r1 = READ_ONCE(*z); // reads 1
spin_lock(mylock);
spin_unlock(mylock);
smp_mb__after_unlock_lock();
WRITE_ONCE(*d,1);
}
P3(int *x, int *d)
{
WRITE_ONCE(*d,2);
smp_mb();
WRITE_ONCE(*x,1);
}
exists (1:r0=1 /\ 2:r1=1 /\ x=2 /\ d=2)
Nevertheless, the ordering guarantee given in rcupdate.h is actually
not weakened. This is because the unlock operations along the
sequence of handovers are A-cumulative fences. They ensure that any
stores that propagate to the CPU performing the first unlock
operation in the sequence must also propagate to every CPU that
performs a subsequent lock operation in the sequence. Therefore any
such stores will also be ordered correctly by the fence even if only
the final handover is considered a full barrier.
Indeed this patch does not affect the behaviors allowed by LKMM at
all. The mb relation is used to define ordering through:
1) mb/.../ppo/hb, where the ordering is subsumed by hb+ where the
lock-release, rfe, and unlock-acquire orderings each provide hb
2) mb/strong-fence/cumul-fence/prop, where the rfe and A-cumulative
lock-release orderings simply add more fine-grained cumul-fence
edges to substitute a single strong-fence edge provided by a long
lock handover sequence
3) mb/strong-fence/pb and various similar uses in the definition of
data races, where as discussed above any long handover sequence
can be turned into a sequence of cumul-fence edges that provide
the same ordering.
Signed-off-by: Jonas Oberhauser <jonas.oberhauser@huaweicloud.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|