Age | Commit message (Collapse) | Author |
|
COMPILE_TEST
Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it
is possible to test-build any driver which depends on OF on any
architecture by explicitly selecting OF. Therefore depending on
COMPILE_TEST as an alternative is no longer needed.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230121182911.4e47a5ff@endymion.delvare
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
|
|
RISC-V provides an architectural clock source via the time CSR. This
clock source exposes a 64-bit counter synchronized across all CPUs.
Because it is accessed using a CSR, it is much more efficient to read
than MMIO clock sources. For example, on the Allwinner D1, reading the
sun4i timer in a loop takes 131 cycles/iteration, while reading the
RISC-V time CSR takes only 5 cycles/iteration.
Adjust the RISC-V clock source rating so it is preferred over the
various platform-specific MMIO clock sources.
Signed-off-by: Samuel Holland <samuel@sholland.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20221228004444.61568-1-samuel@sholland.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
|
|
We should set CLOCK_EVT_FEAT_C3STOP for a clock_event_device only
when riscv,timer-cannot-wake-cpu DT property is present in the RISC-V
timer DT node.
This way CLOCK_EVT_FEAT_C3STOP feature is set for clock_event_device
based on RISC-V platform capabilities rather than having it set for
all RISC-V platforms.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230103141102.772228-4-apatel@ventanamicro.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
|
|
We add DT bindings for a separate RISC-V timer DT node which can
be used to describe implementation specific behaviour (such as
timer interrupt not triggered during non-retentive suspend).
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230103141102.772228-3-apatel@ventanamicro.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
|
|
Similarly to commit 022eb8ae8b5e ("ARM: 8938/1: kernel: initialize
broadcast hrtimer based clock event device"), RISC-V needs to initiate
hrtimer based broadcast clock event device before C3STOP can be used.
Otherwise, the introduction of C3STOP for the RISC-V arch timer in
commit 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped
during CPU suspend") leaves us without any broadcast timer registered.
This prevents the kernel from entering oneshot mode, which breaks timer
behaviour, for example clock_nanosleep().
A test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250
& C3STOP enabled, the sleep times are rounded up to the next jiffy:
== CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 ==
Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179
Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193
Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000
Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000
Samples: 521 Samples: 521 Samples: 521 Samples: 521
Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
Suggested-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230103141102.772228-2-apatel@ventanamicro.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
|
|
Add rockchip timer compatible string for rockchip rv1126.
Signed-off-by: Jagan Teki <jagan@edgeble.ai>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221123183124.6911-3-jagan@edgeble.ai
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
|
|
With the tokens for all implemented RTAS functions now available via
rtas_function_token(), which is optimal and safe for arbitrary
contexts, there is no need to use rtas_token() or cache its result.
Most conversions are trivial, but a few are worth describing in more
detail:
* Error injection token comparisons for lockdown purposes are
consolidated into a simple predicate: token_is_restricted_errinjct().
* A couple of special cases in block_rtas_call() do not use
rtas_token() but perform string comparisons against names in the
function table. These are converted to compare against token values
instead, which is logically equivalent but less expensive.
* The lookup for the ibm,os-term token can be deferred until needed,
instead of caching it at boot to avoid device tree traversal during
panic.
* Since rtas_function_token() accesses a read-only data structure
without taking any locks, xmon's lookup of set-indicator can be
performed as needed instead of cached at startup.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-20-26929c8cce78@linux.ibm.com
|
|
Users of rtas_token() supply a string argument that can't be validated
at build time. A typo or misspelling has to be caught by inspection or
by observing wrong behavior at runtime.
Since the core RTAS code now has consolidated the names of all
possible RTAS functions and mapped them to their tokens, token lookup
can be implemented using symbolic constants to index a static array.
So introduce rtas_function_token(), a replacement API which does that,
along with a rtas_service_present()-equivalent helper,
rtas_function_implemented(). Callers supply an opaque predefined
function handle which is used internally to index the function
table. Typos or other inappropriate arguments yield build errors, and
the function handle is a type that can't be easily confused with RTAS
tokens or other integer types.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-19-26929c8cce78@linux.ibm.com
|
|
Convert the TLB block invalidate characteristics discovery to the new
papr_sysparm API. This occurs too early in boot to use
papr_sysparm_buf_alloc(), so use a static buffer.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-18-26929c8cce78@linux.ibm.com
|
|
The new papr_sysparm API handles the details of system parameter
retrieval. Use that instead of open-coding the RTAS call, work area
management, and retries.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-17-26929c8cce78@linux.ibm.com
|
|
/proc/powerpc/lparcfg derives the LPAR name and SPLPAR characteristics
it reports using bare calls to the RTAS ibm,get-system-parameter
function. Convert these to the higher-level papr_sysparm API, which
handles the tedious details.
While the SPLPAR string parsing code could stand to be updated, that
should be done in a separate change. It is minimally modified here to
reduce the risk of changing behavior.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-16-26929c8cce78@linux.ibm.com
|
|
Convert the direct invocation of the ibm,get-system-parameter RTAS
function to papr_sysparm_get().
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-15-26929c8cce78@linux.ibm.com
|
|
Introduce a set of APIs for retrieving and updating PAPR system
parameters. This encapsulates the toil of temporary RTAS work area
management, RTAS function call retries, and translation of RTAS call
statuses to conventional error values.
There are several places in the kernel that already retrieve system
parameters by calling the RTAS ibm,get-system-parameter function
directly. These will be converted to papr_sysparm_get() in changes to
follow.
As for updating system parameters, current practice is to use
sys_rtas() from user space; there are no in-kernel users of the RTAS
ibm,set-system-parameter function. However this will become deprecated
in time because it is not compatible with lockdown.
The papr_sysparm_* APIs will form the common basis for in-kernel
and user space access to system parameters. The code to expose the
set/get capabilities to user space will follow.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-14-26929c8cce78@linux.ibm.com
|
|
Hold a work area object for the duration of the RTAS
ibm,configure-connector sequence, eliminating locking and copying
around each RTAS call.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-13-26929c8cce78@linux.ibm.com
|
|
Various pseries-specific RTAS functions take a temporary "work area"
parameter - a buffer in memory accessible to RTAS. Typically such
functions are passed the statically allocated rtas_data_buf buffer as
the argument. This buffer is protected by a global spinlock. So users
of rtas_data_buf cannot perform sleeping operations while accessing
the buffer.
Most RTAS functions that have a work area parameter can return a
status (-2/990x) that indicates that the caller should retry. Before
retrying, the caller may need to reschedule or sleep (see
rtas_busy_delay() for details). This combination of factors
leads to uncomfortable constructions like this:
do {
spin_lock(&rtas_data_buf_lock);
rc = rtas_call(token, __pa(rtas_data_buf, ...);
if (rc == 0) {
/* parse or copy out rtas_data_buf contents */
}
spin_unlock(&rtas_data_buf_lock);
} while (rtas_busy_delay(rc));
Another unfortunately common way of handling this is for callers to
blithely ignore the possibility of a -2/990x status and hope for the
best.
If users were allowed to perform blocking operations while owning a
work area, the programming model would become less tedious and
error-prone. Users could schedule away, sleep, or perform other
blocking operations without having to release and re-acquire
resources.
We could continue to use a single work area buffer, and convert
rtas_data_buf_lock to a mutex. But that would impose an unnecessarily
coarse serialization on all users. As awkward as the current design
is, it prevents longer running operations that need to repeatedly use
rtas_data_buf from blocking the progress of others.
There are more considerations. One is that while 4KB is fine for all
current in-kernel uses, some RTAS calls can take much smaller buffers,
and some (VPD, platform dumps) would likely benefit from larger
ones. Another is that at least one RTAS function (ibm,get-vpd)
has *two* work area parameters. And finally, we should expect the
number of work area users in the kernel to increase over time as we
introduce lockdown-compatible ABIs to replace less safe use cases
based on sys_rtas/librtas.
So a special-purpose allocator for RTAS work area buffers seems worth
trying.
Properties:
* The backing memory for the allocator is reserved early in boot in
order to satisfy RTAS addressing requirements, and then managed with
genalloc.
* Allocations can block, but they never fail (mempool-like).
* Prioritizes first-come, first-serve fairness over throughput.
* Early boot allocations before the allocator has been initialized are
served via an internal static buffer.
Intended to replace rtas_data_buf. New code that needs RTAS work area
buffers should prefer this API.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
|
|
Decompose the RTAS entry C code into tracing and non-tracing variants,
calling the just-added tracepoints in the tracing-enabled path. Skip
tracing in contexts known to be unsafe (real mode, CPU offline).
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-11-26929c8cce78@linux.ibm.com
|
|
Add two sets of tracepoints to be used around RTAS entry:
* rtas_input/rtas_output, which emit the function name, its inputs,
the returned status, and any other outputs. These produce an API-level
record of OS<->RTAS activity.
* rtas_ll_entry/rtas_ll_exit, which are lower-level and emit the
entire contents of the parameter block (aka rtas_args) on entry and
exit. Likely useful only for debugging.
With uses of these tracepoints in do_enter_rtas() to be added in the
following patch, examples of get-time-of-day and event-scan functions
as rendered by trace-cmd (with some multi-line formatting manually
imposed on the rtas_ll_* entries to avoid extremely long lines in the
commit message):
cat-36800 [059] 4978.518303: rtas_input: get-time-of-day arguments:
cat-36800 [059] 4978.518306: rtas_ll_entry: token=3 nargs=0 nret=8
params: [0]=0x00000000 [1]=0x00000000 [2]=0x00000000 [3]=0x00000000
[4]=0x00000000 [5]=0x00000000 [6]=0x00000000 [7]=0x00000000
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
cat-36800 [059] 4978.518366: rtas_ll_exit: token=3 nargs=0 nret=8
params: [0]=0x00000000 [1]=0x000007e6 [2]=0x0000000b [3]=0x00000001
[4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
cat-36800 [059] 4978.518366: rtas_output: get-time-of-day status: 0, other outputs: 2022 11 1 0 14 8 772648000
kworker/39:1-336 [039] 4982.731623: rtas_input: event-scan arguments: 4294967295 0 80484920 2048
kworker/39:1-336 [039] 4982.731626: rtas_ll_entry: token=6 nargs=4 nret=1
params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
[4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
kworker/39:1-336 [039] 4982.731676: rtas_ll_exit: token=6 nargs=4 nret=1
params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800
[4]=0x00000001 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40
[8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000
[12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000
kworker/39:1-336 [039] 4982.731677: rtas_output: event-scan status: 1, other outputs:
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-10-26929c8cce78@linux.ibm.com
|
|
Make do_enter_rtas() take a pointer to struct rtas_args and do the
__pa() conversion in one place instead of leaving it to callers. This
also makes it possible to introduce enter/exit tracepoints that access
the rtas_args struct fields.
There's no apparent reason to force inlining of do_enter_rtas()
either, and it seems to bloat the code a bit. Let the compiler decide.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-9-26929c8cce78@linux.ibm.com
|
|
The core RTAS support code and its clients perform two types of lookup
for RTAS firmware function information.
First, mapping a known function name to a token. The typical use case
invokes rtas_token() to retrieve the token value to pass to
rtas_call(). rtas_token() relies on of_get_property(), which performs
a linear search of the /rtas node's property list under a lock with
IRQs disabled.
Second, and less common: given a token value, looking up some
information about the function. The primary example is the sys_rtas
filter path, which linearly scans a small table to match the token to
a rtas_filter struct. Another use case to come is RTAS entry/exit
tracepoints, which will require efficient lookup of function names
from token values. Currently there is no general API for this.
We need something much like the existing rtas_filters table, but more
general and organized to facilitate efficient lookups.
Introduce:
* A new rtas_function type, aggregating function name, token,
and filter. Other function characteristics could be added in the
future.
* An array of rtas_function, where each element corresponds to a known
RTAS function. All information in the table is static save the token
values, which are derived from the device tree at boot. The array is
sorted by function name to allow binary search.
* A named constant for each known RTAS function, used to index the
function array. These also will be used in a client-facing API to be
added later.
* An xarray that maps valid tokens to rtas_function objects.
Fold the existing rtas_filter table into the new rtas_function array,
with the appropriate adjustments to block_rtas_call(). Remove
now-redundant fields from struct rtas_filter. Preserve the function of
the CONFIG_CPU_BIG_ENDIAN guard in the current filter table by
introducing a per-function flag that is set for the function entries
related to pseries LPAR migration. These have never had working users
via sys_rtas on ppc64le; see commit de0f7349a0dd ("powerpc/rtas:
prevent suspend-related sys_rtas use on LE").
Convert rtas_token() to use a lockless binary search on the function
table. Fall back to the old behavior for lookups against names that
are not known to be RTAS functions, but issue a warning. rtas_token()
is for function names; it is not a general facility for accessing
arbitrary properties of the /rtas node. All known misuses of
rtas_token() have been converted to more appropriate of_ APIs in
preceding changes.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-8-26929c8cce78@linux.ibm.com
|
|
The pseries platform has been LPAR-only for several generations, and
the PAPR spec:
* Guarantees that timebase synchronization is performed by
the platform ("The timebase registers are synchronized by the
platform before CPUs are given to the OS" - 7.3.8 SMP Support).
* Completely omits the RTAS freeze-time-base and thaw-time-base RTAS
functions, which are CHRP artifacts.
This code is effectively unused on currently supported models, so drop
it.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-7-26929c8cce78@linux.ibm.com
|
|
Some RTAS functions that have work area parameters impose alignment
requirements on the work area passed to them by the OS. Examples
include:
- ibm,configure-connector
- ibm,update-nodes
- ibm,update-properties
4KB is the greatest alignment required by PAPR for such
buffers. rtas_data_buf used to have a __page_aligned attribute in the
arch/ppc64 days, but that was changed to __cacheline_aligned for
unknown reasons by commit 033ef338b6e0 ("powerpc: Merge rtas.c into
arch/powerpc/kernel"). That works out to 128-byte alignment
on ppc64, which isn't right.
This was found by inspection and I'm not aware of any real problems
caused by this. Either current RTAS implementations don't enforce the
alignment constraints, or rtas_data_buf is always being placed at a
4KB boundary by accident (or both, perhaps).
Use __aligned(SZ_4K) to ensure the rtas_data_buf has alignment
appropriate for all users.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-6-26929c8cce78@linux.ibm.com
|
|
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
pSeries_cmo_feature_init() ignores this, making it possible to fail to
detect cooperative memory overcommit capabilities during boot.
Move the RTAS call into a conventional rtas_busy_delay()-based
loop, dropping unnecessary clearing of rtas_data_buf.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-5-26929c8cce78@linux.ibm.com
|
|
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
lparcfg's parse_system_parameter_string() ignores this, making it
possible to intermittently report incorrect SPLPAR characteristics.
Move the RTAS call into a coventional rtas_busy_delay()-based loop.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-4-26929c8cce78@linux.ibm.com
|
|
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again.
pseries_lpar_read_hblkrm_characteristics() ignores this, making it
possible to incorrectly detect TLB block invalidation characteristics
at boot.
Move the RTAS call into a coventional rtas_busy_delay()-based loop.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 1211ee61b4a8 ("powerpc/pseries: Read TLB Block Invalidate Characteristics")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-3-26929c8cce78@linux.ibm.com
|
|
The ibm,get-system-parameter RTAS function may return -2 or 990x,
which indicate that the caller should try again. read_24x7_sys_info()
ignores this, allowing transient failures in reporting processor
module information.
Move the RTAS call into a coventional rtas_busy_delay()-based loop,
along with the parsing of results on success.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 8ba214267382 ("powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-2-26929c8cce78@linux.ibm.com
|
|
Some code that runs early in boot calls RTAS functions that can return
-2 or 990x statuses, which mean the caller should retry. An example is
pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but
treats these benign statuses as errors instead of retrying.
pSeries_cmo_feature_init() and similar code should be made to retry
until they succeed or receive a real error, using the usual pattern:
do {
rc = rtas_call(token, etc...);
} while (rtas_busy_delay(rc));
But rtas_busy_delay() will perform a timed sleep on any 990x
status. This isn't safe so early in boot, before the CPU scheduler and
timer subsystem have initialized.
The -2 RTAS status is much more likely to occur during single-threaded
boot than 990x in practice, at least on PowerVM. This is because -2
usually means that RTAS made progress but exhausted its self-imposed
timeslice, while 990x is associated with concurrent requests from the
OS causing internal contention. Regardless, according to the language
in PAPR, the OS should be prepared to handle either type of status at
any time.
Add a fallback path to rtas_busy_delay() to handle this as safely as
possible, performing a small delay on 990x. Include a counter to
detect retry loops that aren't making progress and bail out. Add __ref
to rtas_busy_delay() since it now conditionally calls an __init
function.
This was found by inspection and I'm not aware of any real
failures. However, the implementation of rtas_busy_delay() before
commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
was not susceptible to this problem, so let's treat this as a
regression.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
|
|
Add support for loading keys from the PLPKS on pseries machines, with the
"ibm,plpks-sb-v1" format.
The object format is expected to be the same, so there shouldn't be any
functional differences between objects retrieved on powernv or pseries.
Unlike on powernv, on pseries the format string isn't contained in the
device tree. Use secvar_ops->format() to fetch the format string in a
generic manner, rather than searching the device tree ourselves.
(The current code searches the device tree for a node compatible with
"ibm,edk2-compat-v1". This patch switches to calling secvar_ops->format(),
which in the case of OPAL/powernv means opal_secvar_format(), which
searches the device tree for a node compatible with "ibm,secvar-backend"
and checks its "format" property. These are equivalent, as skiboot creates
a node with both "ibm,edk2-compat-v1" and "ibm,secvar-backend" as
compatible strings.)
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-27-ajd@linux.ibm.com
|
|
A few improvements to load_powerpc.c:
- include integrity.h for the pr_fmt()
- move all error reporting out of get_cert_list()
- use ERR_PTR() to better preserve error detail
- don't use pr_err() for missing keys
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-26-ajd@linux.ibm.com
|
|
The pseries platform can support dynamic secure boot (i.e. secure boot
using user-defined keys) using variables contained with the PowerVM LPAR
Platform KeyStore (PLPKS). Using the powerpc secvar API, expose the
relevant variables for pseries dynamic secure boot through the existing
secvar filesystem layout.
The relevant variables for dynamic secure boot are signed in the
keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall.
Object labels in the keystore are encoded using ucs2 format. With our
fixed variable names we don't have to care about encoding outside of the
necessary byte padding.
When a user writes to a variable, the first 8 bytes of data must contain
the signed update flags as defined by the hypervisor.
When a user reads a variable, the first 4 bytes of data contain the
policies defined for the object.
Limitations exist due to the underlying implementation of sysfs binary
attributes, as is the case for the OPAL secvar implementation -
partial writes are unsupported and writes cannot be larger than PAGE_SIZE.
(Even when using bin_attributes, which can be larger than a single page,
sysfs only gives us one page's worth of write buffer at a time, and the
hypervisor does not expose an interface for partial writes.)
Co-developed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
[mpe: Add NLS dependency to fix build errors, squash fix from ajd]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230210080401.345462-25-ajd@linux.ibm.com
|
|
pci_msix_alloc_irq_at() and pci_msix_free_irq() are not declared when
CONFIG_PCI_MSI is disabled.
Users of these two calls do not yet exist but when users do appear (shown
below is an attempt to use the new API in vfio-pci) the following errors
will be encountered when compiling with CONFIG_PCI_MSI disabled:
drivers/vfio/pci/vfio_pci_intrs.c:461:4: error: implicit declaration of\
function 'pci_msix_free_irq' is invalid in C99\
[-Werror,-Wimplicit-function-declaration]
pci_msix_free_irq(pdev, msix_map);
^
drivers/vfio/pci/vfio_pci_intrs.c:511:15: error: implicit declaration of\
function 'pci_msix_alloc_irq_at' is invalid in C99\
[-Werror,-Wimplicit-function-declaration]
msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL);
Provide definitions for pci_msix_alloc_irq_at() and pci_msix_free_irq() in
preparation for users that need to compile when CONFIG_PCI_MSI is
disabled.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 34026364df8e ("PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/158e40e1cfcfc58ae30ecb2bbfaf86e5bba7a1ef.1675978686.git.reinette.chatre@intel.com
|
|
Oleksij Rempel says:
====================
net: add EEE support for KSZ9477 switch family
changes v8:
- fix comment for linkmode_to_mii_eee_cap1_t() function
- add Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
- add Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
changes v7:
- update documentation for genphy_c45_eee_is_active()
- address review comments on "net: dsa: microchip: enable EEE support"
patch
changes v6:
- split patch set and send only first 9 patches
- Add Reviewed-by: Andrew Lunn <andrew@lunn.ch>
- use 0xffff instead of GENMASK
- Document @supported_eee
- use "()" with function name in comments
changes v5:
- spell fixes
- move part of genphy_c45_read_eee_abilities() to
genphy_c45_read_eee_cap1()
- validate MDIO_PCS_EEE_ABLE register against 0xffff val.
- rename *eee_100_10000* to *eee_cap1*
- use linkmode_intersects(phydev->supported, PHY_EEE_CAP1_FEATURES)
instead of !linkmode_empty()
- add documentation to linkmode/register helpers
changes v4:
- remove following helpers:
mmd_eee_cap_to_ethtool_sup_t
mmd_eee_adv_to_ethtool_adv_t
ethtool_adv_to_mmd_eee_adv_t
and port drivers from this helpers to linkmode helpers.
- rebase against latest net-next
- port phy_init_eee() to genphy_c45_eee_is_active()
changes v3:
- rework some parts of EEE infrastructure and move it to c45 code.
- add supported_eee storage and start using it in EEE code and by the
micrel driver.
- add EEE support for ar8035 PHY
- add SmartEEE support to FEC i.MX series.
changes v2:
- use phydev->supported instead of reading MII_BMSR regiaster
- fix @get_eee > @set_eee
With this patch series we provide EEE control for KSZ9477 family of
switches and
AR8035 with i.MX6 configuration.
According to my tests, on a system with KSZ8563 switch and 100Mbit idle
link,
we consume 0,192W less power per port if EEE is enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All preparations are done. Now we can start using new functions and remove
the old code.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reduce code duplicated by migrating phy_init_eee() to
genphy_c45_eee_is_active().
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv().
It should work as before except write operation to the EEE adv registers
will be done only if some EEE abilities was detected.
If some driver will have a regression, related driver should provide own
.get_features callback. See micrel.c:ksz9477_get_features() as example.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv().
It should work as before except write operation to the EEE adv registers
will be done only if some EEE abilities was detected.
If some driver will have a regression, related driver should provide own
.get_features callback. See micrel.c:ksz9477_get_features() as example.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add replacement for phy_ethtool_get/set_eee() functions.
Current phy_ethtool_get/set_eee() implementation is great and it is
possible to make it even better:
- this functionality is for devices implementing parts of IEEE 802.3
specification beyond Clause 22. The better place for this code is
phy-c45.c
- currently it is able to do read/write operations on PHYs with
different abilities to not existing registers. It is better to
use stored supported_eee abilities to avoid false read/write
operations.
- the eee_active detection will provide wrong results on not supported
link modes. It is better to validate speed/duplex properties against
supported EEE link modes.
- it is able to support only limited amount of link modes. We have more
EEE link modes...
By refactoring this code I address most of this point except of the last
one. Adding additional EEE link modes will need more work.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This function will be needed for genphy_c45_ethtool_get_eee() provided
by next patch.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
KSZ8563R, which has same PHYID as KSZ9477 family, will change "EEE control
and capability 1" (Register 3.20) content depending on configuration of
"EEE advertisement 1" (Register 7.60). Changes on the 7.60 will affect
3.20 register.
So, instead of depending on register 3.20, driver should set supported_eee.
Proper supported_eee configuration is needed to make use of generic
PHY c45 set/get_eee functions provided by next patches.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add generic function for EEE abilities defined by IEEE 802.3
specification. For now following registers are supported:
- IEEE 802.3-2018 45.2.3.10 EEE control and capability 1 (Register 3.20)
- IEEE 802.3cg-2019 45.2.1.186b 10BASE-T1L PMA status register
(Register 1.2295)
Since I was not able to find any flag signaling support of these
registers, we should detect link mode abilities first and then based on
these abilities doing EEE link modes detection.
Results of EEE ability detection will be stored into new variable
phydev->supported_eee.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some of KSZ9477 family switches provides EEE support. To enable it, we
just need to register set_mac_eee/set_mac_eee handlers and validate
supported chip version and port.
Currently supported chip variants are: KSZ8563, KSZ9477, KSZ9563,
KSZ9567, KSZ9893, KSZ9896, KSZ9897. KSZ8563 supports EEE only with
100BaseTX/Full. Other chips support 100BaseTX/Full and 1000BaseTX/Full.
Low Power Idle configuration is not supported and currently not
documented in the datasheets.
EEE PHY specific tunings are not documented in the switch datasheets, but can
overlap with KSZ9131 specification.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During probe, both sensor buffers need to be queried to
initialize the hwmon channels. This might be slow on some
machines, causing a unnecessary delay during boot.
Mark the driver with PROBE_PREFER_ASYNCHRONOUS so that it
can be probed asynchronously.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20230209211503.2739-3-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Thanks to bugreport 216655 on bugzilla triggered by the
dell-smm-hwmon driver, the contents of the sensor buffers
could be almost completely decoded.
Add an hwmon interface for exposing the fan and thermal
sensor values. Since the WMI interface can be quite slow
on some machines, the sensor buffers are cached for 1 second
to lessen the performance impact.
The debugfs interface remains in place to aid in reverse-engineering
of unknown sensor types and the thermal buffer.
Tested-by: Antonín Skala <skala.antonin@gmail.com>
Tested-by: Gustavo Walbon <gustavowalbon@gmail.com>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230209211503.2739-2-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Add description for new attributes added for rack manager switch and
NG800 family systems.
Attributes related to power converter board:
- reset_pwr_converter_fail;
- pwr_converter_prog_en;
Attributes related to External Root of Trust (EROT) devices recovery:
- erot1_ap_reset;
- erot2_ap_reset;
- erot1_recovery;
- erot2_recovery;
- erot1_reset;
- erot2_reset;
- erot1_wp;
- erot2_wp;
- spi_chnl_select;
Attributes related to clock board failures and recovery:
- clk_brd1_boot_fail;
- clk_brd2_boot_fail;
- clk_brd_fail;
- clk_brd_prog_en;
Attributes related to power failures:
- reset_ac_ok_fail;
- asic_pg_fail;
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-14-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Move assignment of bus shift setting out of the loop to avoid redundant
operation.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-13-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Extend writeable, readable, volatile registers of the 'regmap' object
with for I2C mux selector registers.
The motivation is to pass this object extended with selector registers
to I2C mux driver working over ‘regmap’.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-12-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Add field with PCIe remapped based address for passing it across
relevant platform drivers sharing common system resources.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-11-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
configuration
Currently hotplug configuration in logic device assumes that all items
are provided with no holes.
Thus, any group of hotplug events, associated with the specific
status/event/mask registers is configured in those registers
successively from bit zero to bit #n (#n < 8).
This logic is changed int order to allow non-successive definition to
support configuration with the skipped bits – for example bits 3, 5, 7
in status/event/mask registers can be associated with hotplug events,
while others can be skipped.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-10-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Motivation is to provide synchronization between I2C main bus and other
platform drivers using this notification callback.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-9-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Split logic in mlxplat_init()/mlxplat_exit() routines.
Separate initialization of I2C infrastructure and others platform
drivers.
Motivation is to provide synchronization between I2C bus and mux
drivers and other drivers using this infrastructure.
I2C main bus and MUX busses are implemented in FPGA logic. On some new
systems the numbers allocated for these busses could be variable
depending on order of initialization of I2C native busses. Since bus
numbers are passed to some other platform drivers during initialization
flow, it is necessary to synchronize completion of I2C infrastructure
drivers and activation of rest of drivers.
Thus initialization flow will be performed in synchronized order.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-8-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
Shannon Nelson says:
====================
ionic: on-chip descriptors
We start with a couple of house-keeping patches that were originally
presented for 'net', then we add support for on-chip descriptor rings
for tx-push, as well as adding support for rx-push.
I have a patch for the ethtool userland utility that I can send out
once this has been accepted.
v4: added rx-push attributes to ethtool netlink
converted CMB feature from using a priv-flag to using ethtool tx/rx-push
v3: edited commit message to describe interface-down limitation
added warn msg if cmb_inuse alloc fails
removed unnecessary clearing of phy_cmb_pages and cmb_npages
changed cmb_rings_toggle to use cmb_inuse
removed unrelated pci_set_drvdata()
removed unnecessary (u32) cast
added static inline func for writing CMB descriptors
v2: dropped the rx buffers patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|