summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-13clocksource/drivers/timer-microchip-pit64b: Drop obsolete dependency on ↵Jean Delvare
COMPILE_TEST Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it is possible to test-build any driver which depends on OF on any architecture by explicitly selecting OF. Therefore depending on COMPILE_TEST as an alternative is no longer needed. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230121182911.4e47a5ff@endymion.delvare Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-02-13clocksource/drivers/riscv: Increase the clock source ratingSamuel Holland
RISC-V provides an architectural clock source via the time CSR. This clock source exposes a 64-bit counter synchronized across all CPUs. Because it is accessed using a CSR, it is much more efficient to read than MMIO clock sources. For example, on the Allwinner D1, reading the sun4i timer in a loop takes 131 cycles/iteration, while reading the RISC-V time CSR takes only 5 cycles/iteration. Adjust the RISC-V clock source rating so it is preferred over the various platform-specific MMIO clock sources. Signed-off-by: Samuel Holland <samuel@sholland.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20221228004444.61568-1-samuel@sholland.org Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-02-13clocksource/drivers/timer-riscv: Set CLOCK_EVT_FEAT_C3STOP based on DTAnup Patel
We should set CLOCK_EVT_FEAT_C3STOP for a clock_event_device only when riscv,timer-cannot-wake-cpu DT property is present in the RISC-V timer DT node. This way CLOCK_EVT_FEAT_C3STOP feature is set for clock_event_device based on RISC-V platform capabilities rather than having it set for all RISC-V platforms. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230103141102.772228-4-apatel@ventanamicro.com Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-02-13dt-bindings: timer: Add bindings for the RISC-V timer deviceAnup Patel
We add DT bindings for a separate RISC-V timer DT node which can be used to describe implementation specific behaviour (such as timer interrupt not triggered during non-retentive suspend). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230103141102.772228-3-apatel@ventanamicro.com Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-02-13RISC-V: time: initialize hrtimer based broadcast clock event deviceConor Dooley
Similarly to commit 022eb8ae8b5e ("ARM: 8938/1: kernel: initialize broadcast hrtimer based clock event device"), RISC-V needs to initiate hrtimer based broadcast clock event device before C3STOP can be used. Otherwise, the introduction of C3STOP for the RISC-V arch timer in commit 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") leaves us without any broadcast timer registered. This prevents the kernel from entering oneshot mode, which breaks timer behaviour, for example clock_nanosleep(). A test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & C3STOP enabled, the sleep times are rounded up to the next jiffy: == CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 == Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179 Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193 Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000 Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000 Samples: 521 Samples: 521 Samples: 521 Samples: 521 Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/ Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") Suggested-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Samuel Holland <samuel@sholland.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20230103141102.772228-2-apatel@ventanamicro.com Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-02-13dt-bindings: timer: rk-timer: Add rktimer for rv1126Jagan Teki
Add rockchip timer compatible string for rockchip rv1126. Signed-off-by: Jagan Teki <jagan@edgeble.ai> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221123183124.6911-3-jagan@edgeble.ai Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
2023-02-13powerpc/rtas: arch-wide function token lookup conversionsNathan Lynch
With the tokens for all implemented RTAS functions now available via rtas_function_token(), which is optimal and safe for arbitrary contexts, there is no need to use rtas_token() or cache its result. Most conversions are trivial, but a few are worth describing in more detail: * Error injection token comparisons for lockdown purposes are consolidated into a simple predicate: token_is_restricted_errinjct(). * A couple of special cases in block_rtas_call() do not use rtas_token() but perform string comparisons against names in the function table. These are converted to compare against token values instead, which is logically equivalent but less expensive. * The lookup for the ibm,os-term token can be deferred until needed, instead of caching it at boot to avoid device tree traversal during panic. * Since rtas_function_token() accesses a read-only data structure without taking any locks, xmon's lookup of set-indicator can be performed as needed instead of cached at startup. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-20-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: introduce rtas_function_token() APINathan Lynch
Users of rtas_token() supply a string argument that can't be validated at build time. A typo or misspelling has to be caught by inspection or by observing wrong behavior at runtime. Since the core RTAS code now has consolidated the names of all possible RTAS functions and mapped them to their tokens, token lookup can be implemented using symbolic constants to index a static array. So introduce rtas_function_token(), a replacement API which does that, along with a rtas_service_present()-equivalent helper, rtas_function_implemented(). Callers supply an opaque predefined function handle which is used internally to index the function table. Typos or other inappropriate arguments yield build errors, and the function handle is a type that can't be easily confused with RTAS tokens or other integer types. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-19-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/lpar: convert to papr_sysparm APINathan Lynch
Convert the TLB block invalidate characteristics discovery to the new papr_sysparm API. This occurs too early in boot to use papr_sysparm_buf_alloc(), so use a static buffer. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-18-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/hv-24x7: convert to papr_sysparm APINathan Lynch
The new papr_sysparm API handles the details of system parameter retrieval. Use that instead of open-coding the RTAS call, work area management, and retries. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-17-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/lparcfg: convert to papr_sysparm APINathan Lynch
/proc/powerpc/lparcfg derives the LPAR name and SPLPAR characteristics it reports using bare calls to the RTAS ibm,get-system-parameter function. Convert these to the higher-level papr_sysparm API, which handles the tedious details. While the SPLPAR string parsing code could stand to be updated, that should be done in a separate change. It is minimally modified here to reduce the risk of changing behavior. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-16-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries: convert CMO probe to papr_sysparm APINathan Lynch
Convert the direct invocation of the ibm,get-system-parameter RTAS function to papr_sysparm_get(). Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-15-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries: PAPR system parameter APINathan Lynch
Introduce a set of APIs for retrieving and updating PAPR system parameters. This encapsulates the toil of temporary RTAS work area management, RTAS function call retries, and translation of RTAS call statuses to conventional error values. There are several places in the kernel that already retrieve system parameters by calling the RTAS ibm,get-system-parameter function directly. These will be converted to papr_sysparm_get() in changes to follow. As for updating system parameters, current practice is to use sys_rtas() from user space; there are no in-kernel users of the RTAS ibm,set-system-parameter function. However this will become deprecated in time because it is not compatible with lockdown. The papr_sysparm_* APIs will form the common basis for in-kernel and user space access to system parameters. The code to expose the set/get capabilities to user space will follow. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-14-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/dlpar: use RTAS work area APINathan Lynch
Hold a work area object for the duration of the RTAS ibm,configure-connector sequence, eliminating locking and copying around each RTAS call. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-13-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries: add RTAS work area allocatorNathan Lynch
Various pseries-specific RTAS functions take a temporary "work area" parameter - a buffer in memory accessible to RTAS. Typically such functions are passed the statically allocated rtas_data_buf buffer as the argument. This buffer is protected by a global spinlock. So users of rtas_data_buf cannot perform sleeping operations while accessing the buffer. Most RTAS functions that have a work area parameter can return a status (-2/990x) that indicates that the caller should retry. Before retrying, the caller may need to reschedule or sleep (see rtas_busy_delay() for details). This combination of factors leads to uncomfortable constructions like this: do { spin_lock(&rtas_data_buf_lock); rc = rtas_call(token, __pa(rtas_data_buf, ...); if (rc == 0) { /* parse or copy out rtas_data_buf contents */ } spin_unlock(&rtas_data_buf_lock); } while (rtas_busy_delay(rc)); Another unfortunately common way of handling this is for callers to blithely ignore the possibility of a -2/990x status and hope for the best. If users were allowed to perform blocking operations while owning a work area, the programming model would become less tedious and error-prone. Users could schedule away, sleep, or perform other blocking operations without having to release and re-acquire resources. We could continue to use a single work area buffer, and convert rtas_data_buf_lock to a mutex. But that would impose an unnecessarily coarse serialization on all users. As awkward as the current design is, it prevents longer running operations that need to repeatedly use rtas_data_buf from blocking the progress of others. There are more considerations. One is that while 4KB is fine for all current in-kernel uses, some RTAS calls can take much smaller buffers, and some (VPD, platform dumps) would likely benefit from larger ones. Another is that at least one RTAS function (ibm,get-vpd) has *two* work area parameters. And finally, we should expect the number of work area users in the kernel to increase over time as we introduce lockdown-compatible ABIs to replace less safe use cases based on sys_rtas/librtas. So a special-purpose allocator for RTAS work area buffers seems worth trying. Properties: * The backing memory for the allocator is reserved early in boot in order to satisfy RTAS addressing requirements, and then managed with genalloc. * Allocations can block, but they never fail (mempool-like). * Prioritizes first-come, first-serve fairness over throughput. * Early boot allocations before the allocator has been initialized are served via an internal static buffer. Intended to replace rtas_data_buf. New code that needs RTAS work area buffers should prefer this API. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: add tracepoints around RTAS entryNathan Lynch
Decompose the RTAS entry C code into tracing and non-tracing variants, calling the just-added tracepoints in the tracing-enabled path. Skip tracing in contexts known to be unsafe (real mode, CPU offline). Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-11-26929c8cce78@linux.ibm.com
2023-02-13powerpc/tracing: tracepoints for RTAS entry and exitNathan Lynch
Add two sets of tracepoints to be used around RTAS entry: * rtas_input/rtas_output, which emit the function name, its inputs, the returned status, and any other outputs. These produce an API-level record of OS<->RTAS activity. * rtas_ll_entry/rtas_ll_exit, which are lower-level and emit the entire contents of the parameter block (aka rtas_args) on entry and exit. Likely useful only for debugging. With uses of these tracepoints in do_enter_rtas() to be added in the following patch, examples of get-time-of-day and event-scan functions as rendered by trace-cmd (with some multi-line formatting manually imposed on the rtas_ll_* entries to avoid extremely long lines in the commit message): cat-36800 [059] 4978.518303: rtas_input: get-time-of-day arguments: cat-36800 [059] 4978.518306: rtas_ll_entry: token=3 nargs=0 nret=8 params: [0]=0x00000000 [1]=0x00000000 [2]=0x00000000 [3]=0x00000000 [4]=0x00000000 [5]=0x00000000 [6]=0x00000000 [7]=0x00000000 [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000 [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000 cat-36800 [059] 4978.518366: rtas_ll_exit: token=3 nargs=0 nret=8 params: [0]=0x00000000 [1]=0x000007e6 [2]=0x0000000b [3]=0x00000001 [4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40 [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000 [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000 cat-36800 [059] 4978.518366: rtas_output: get-time-of-day status: 0, other outputs: 2022 11 1 0 14 8 772648000 kworker/39:1-336 [039] 4982.731623: rtas_input: event-scan arguments: 4294967295 0 80484920 2048 kworker/39:1-336 [039] 4982.731626: rtas_ll_entry: token=6 nargs=4 nret=1 params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800 [4]=0x00000000 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40 [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000 [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000 kworker/39:1-336 [039] 4982.731676: rtas_ll_exit: token=6 nargs=4 nret=1 params: [0]=0xffffffff [1]=0x00000000 [2]=0x04cc1a38 [3]=0x00000800 [4]=0x00000001 [5]=0x0000000e [6]=0x00000008 [7]=0x2e0dac40 [8]=0x00000000 [9]=0x00000000 [10]=0x00000000 [11]=0x00000000 [12]=0x00000000 [13]=0x00000000 [14]=0x00000000 [15]=0x00000000 kworker/39:1-336 [039] 4982.731677: rtas_output: event-scan status: 1, other outputs: Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-10-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: strengthen do_enter_rtas() type safety, drop inlineNathan Lynch
Make do_enter_rtas() take a pointer to struct rtas_args and do the __pa() conversion in one place instead of leaving it to callers. This also makes it possible to introduce enter/exit tracepoints that access the rtas_args struct fields. There's no apparent reason to force inlining of do_enter_rtas() either, and it seems to bloat the code a bit. Let the compiler decide. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-9-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: improve function information lookupsNathan Lynch
The core RTAS support code and its clients perform two types of lookup for RTAS firmware function information. First, mapping a known function name to a token. The typical use case invokes rtas_token() to retrieve the token value to pass to rtas_call(). rtas_token() relies on of_get_property(), which performs a linear search of the /rtas node's property list under a lock with IRQs disabled. Second, and less common: given a token value, looking up some information about the function. The primary example is the sys_rtas filter path, which linearly scans a small table to match the token to a rtas_filter struct. Another use case to come is RTAS entry/exit tracepoints, which will require efficient lookup of function names from token values. Currently there is no general API for this. We need something much like the existing rtas_filters table, but more general and organized to facilitate efficient lookups. Introduce: * A new rtas_function type, aggregating function name, token, and filter. Other function characteristics could be added in the future. * An array of rtas_function, where each element corresponds to a known RTAS function. All information in the table is static save the token values, which are derived from the device tree at boot. The array is sorted by function name to allow binary search. * A named constant for each known RTAS function, used to index the function array. These also will be used in a client-facing API to be added later. * An xarray that maps valid tokens to rtas_function objects. Fold the existing rtas_filter table into the new rtas_function array, with the appropriate adjustments to block_rtas_call(). Remove now-redundant fields from struct rtas_filter. Preserve the function of the CONFIG_CPU_BIG_ENDIAN guard in the current filter table by introducing a per-function flag that is set for the function entries related to pseries LPAR migration. These have never had working users via sys_rtas on ppc64le; see commit de0f7349a0dd ("powerpc/rtas: prevent suspend-related sys_rtas use on LE"). Convert rtas_token() to use a lockless binary search on the function table. Fall back to the old behavior for lookups against names that are not known to be RTAS functions, but issue a warning. rtas_token() is for function names; it is not a general facility for accessing arbitrary properties of the /rtas node. All known misuses of rtas_token() have been converted to more appropriate of_ APIs in preceding changes. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-8-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries: drop RTAS-based timebase synchronizationNathan Lynch
The pseries platform has been LPAR-only for several generations, and the PAPR spec: * Guarantees that timebase synchronization is performed by the platform ("The timebase registers are synchronized by the platform before CPUs are given to the OS" - 7.3.8 SMP Support). * Completely omits the RTAS freeze-time-base and thaw-time-base RTAS functions, which are CHRP artifacts. This code is effectively unused on currently supported models, so drop it. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-7-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: ensure 4KB alignment for rtas_data_bufNathan Lynch
Some RTAS functions that have work area parameters impose alignment requirements on the work area passed to them by the OS. Examples include: - ibm,configure-connector - ibm,update-nodes - ibm,update-properties 4KB is the greatest alignment required by PAPR for such buffers. rtas_data_buf used to have a __page_aligned attribute in the arch/ppc64 days, but that was changed to __cacheline_aligned for unknown reasons by commit 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel"). That works out to 128-byte alignment on ppc64, which isn't right. This was found by inspection and I'm not aware of any real problems caused by this. Either current RTAS implementations don't enforce the alignment constraints, or rtas_data_buf is always being placed at a 4KB boundary by accident (or both, perhaps). Use __aligned(SZ_4K) to ensure the rtas_data_buf has alignment appropriate for all users. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-6-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/setup: add missing RTAS retry status handlingNathan Lynch
The ibm,get-system-parameter RTAS function may return -2 or 990x, which indicate that the caller should try again. pSeries_cmo_feature_init() ignores this, making it possible to fail to detect cooperative memory overcommit capabilities during boot. Move the RTAS call into a conventional rtas_busy_delay()-based loop, dropping unnecessary clearing of rtas_data_buf. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-5-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/lparcfg: add missing RTAS retry status handlingNathan Lynch
The ibm,get-system-parameter RTAS function may return -2 or 990x, which indicate that the caller should try again. lparcfg's parse_system_parameter_string() ignores this, making it possible to intermittently report incorrect SPLPAR characteristics. Move the RTAS call into a coventional rtas_busy_delay()-based loop. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-4-26929c8cce78@linux.ibm.com
2023-02-13powerpc/pseries/lpar: add missing RTAS retry status handlingNathan Lynch
The ibm,get-system-parameter RTAS function may return -2 or 990x, which indicate that the caller should try again. pseries_lpar_read_hblkrm_characteristics() ignores this, making it possible to incorrectly detect TLB block invalidation characteristics at boot. Move the RTAS call into a coventional rtas_busy_delay()-based loop. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 1211ee61b4a8 ("powerpc/pseries: Read TLB Block Invalidate Characteristics") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-3-26929c8cce78@linux.ibm.com
2023-02-13powerpc/perf/hv-24x7: add missing RTAS retry status handlingNathan Lynch
The ibm,get-system-parameter RTAS function may return -2 or 990x, which indicate that the caller should try again. read_24x7_sys_info() ignores this, allowing transient failures in reporting processor module information. Move the RTAS call into a coventional rtas_busy_delay()-based loop, along with the parsing of results on success. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 8ba214267382 ("powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-2-26929c8cce78@linux.ibm.com
2023-02-13powerpc/rtas: handle extended delays safely in early bootNathan Lynch
Some code that runs early in boot calls RTAS functions that can return -2 or 990x statuses, which mean the caller should retry. An example is pSeries_cmo_feature_init(), which invokes ibm,get-system-parameter but treats these benign statuses as errors instead of retrying. pSeries_cmo_feature_init() and similar code should be made to retry until they succeed or receive a real error, using the usual pattern: do { rc = rtas_call(token, etc...); } while (rtas_busy_delay(rc)); But rtas_busy_delay() will perform a timed sleep on any 990x status. This isn't safe so early in boot, before the CPU scheduler and timer subsystem have initialized. The -2 RTAS status is much more likely to occur during single-threaded boot than 990x in practice, at least on PowerVM. This is because -2 usually means that RTAS made progress but exhausted its self-imposed timeslice, while 990x is associated with concurrent requests from the OS causing internal contention. Regardless, according to the language in PAPR, the OS should be prepared to handle either type of status at any time. Add a fallback path to rtas_busy_delay() to handle this as safely as possible, performing a small delay on 990x. Include a counter to detect retry loops that aren't making progress and bail out. Add __ref to rtas_busy_delay() since it now conditionally calls an __init function. This was found by inspection and I'm not aware of any real failures. However, the implementation of rtas_busy_delay() before commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") was not susceptible to this problem, so let's treat this as a regression. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-1-26929c8cce78@linux.ibm.com
2023-02-13integrity/powerpc: Support loading keys from PLPKSRussell Currey
Add support for loading keys from the PLPKS on pseries machines, with the "ibm,plpks-sb-v1" format. The object format is expected to be the same, so there shouldn't be any functional differences between objects retrieved on powernv or pseries. Unlike on powernv, on pseries the format string isn't contained in the device tree. Use secvar_ops->format() to fetch the format string in a generic manner, rather than searching the device tree ourselves. (The current code searches the device tree for a node compatible with "ibm,edk2-compat-v1". This patch switches to calling secvar_ops->format(), which in the case of OPAL/powernv means opal_secvar_format(), which searches the device tree for a node compatible with "ibm,secvar-backend" and checks its "format" property. These are equivalent, as skiboot creates a node with both "ibm,edk2-compat-v1" and "ibm,secvar-backend" as compatible strings.) Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-27-ajd@linux.ibm.com
2023-02-13integrity/powerpc: Improve error handling & reporting when loading certsRussell Currey
A few improvements to load_powerpc.c: - include integrity.h for the pr_fmt() - move all error reporting out of get_cert_list() - use ERR_PTR() to better preserve error detail - don't use pr_err() for missing keys Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-26-ajd@linux.ibm.com
2023-02-13powerpc/pseries: Implement secvars for dynamic secure bootRussell Currey
The pseries platform can support dynamic secure boot (i.e. secure boot using user-defined keys) using variables contained with the PowerVM LPAR Platform KeyStore (PLPKS). Using the powerpc secvar API, expose the relevant variables for pseries dynamic secure boot through the existing secvar filesystem layout. The relevant variables for dynamic secure boot are signed in the keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall. Object labels in the keystore are encoded using ucs2 format. With our fixed variable names we don't have to care about encoding outside of the necessary byte padding. When a user writes to a variable, the first 8 bytes of data must contain the signed update flags as defined by the hypervisor. When a user reads a variable, the first 4 bytes of data contain the policies defined for the object. Limitations exist due to the underlying implementation of sysfs binary attributes, as is the case for the OPAL secvar implementation - partial writes are unsupported and writes cannot be larger than PAGE_SIZE. (Even when using bin_attributes, which can be larger than a single page, sysfs only gives us one page's worth of write buffer at a time, and the hypervisor does not expose an interface for partial writes.) Co-developed-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Add NLS dependency to fix build errors, squash fix from ajd] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210080401.345462-25-ajd@linux.ibm.com
2023-02-13PCI/MSI: Provide missing stubs for CONFIG_PCI_MSI=nReinette Chatre
pci_msix_alloc_irq_at() and pci_msix_free_irq() are not declared when CONFIG_PCI_MSI is disabled. Users of these two calls do not yet exist but when users do appear (shown below is an attempt to use the new API in vfio-pci) the following errors will be encountered when compiling with CONFIG_PCI_MSI disabled: drivers/vfio/pci/vfio_pci_intrs.c:461:4: error: implicit declaration of\ function 'pci_msix_free_irq' is invalid in C99\ [-Werror,-Wimplicit-function-declaration] pci_msix_free_irq(pdev, msix_map); ^ drivers/vfio/pci/vfio_pci_intrs.c:511:15: error: implicit declaration of\ function 'pci_msix_alloc_irq_at' is invalid in C99\ [-Werror,-Wimplicit-function-declaration] msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL); Provide definitions for pci_msix_alloc_irq_at() and pci_msix_free_irq() in preparation for users that need to compile when CONFIG_PCI_MSI is disabled. Reported-by: kernel test robot <lkp@intel.com> Fixes: 34026364df8e ("PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/158e40e1cfcfc58ae30ecb2bbfaf86e5bba7a1ef.1675978686.git.reinette.chatre@intel.com
2023-02-13Merge branch 'ksz9477-eee-support'David S. Miller
Oleksij Rempel says: ==================== net: add EEE support for KSZ9477 switch family changes v8: - fix comment for linkmode_to_mii_eee_cap1_t() function - add Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> - add Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> changes v7: - update documentation for genphy_c45_eee_is_active() - address review comments on "net: dsa: microchip: enable EEE support" patch changes v6: - split patch set and send only first 9 patches - Add Reviewed-by: Andrew Lunn <andrew@lunn.ch> - use 0xffff instead of GENMASK - Document @supported_eee - use "()" with function name in comments changes v5: - spell fixes - move part of genphy_c45_read_eee_abilities() to genphy_c45_read_eee_cap1() - validate MDIO_PCS_EEE_ABLE register against 0xffff val. - rename *eee_100_10000* to *eee_cap1* - use linkmode_intersects(phydev->supported, PHY_EEE_CAP1_FEATURES) instead of !linkmode_empty() - add documentation to linkmode/register helpers changes v4: - remove following helpers: mmd_eee_cap_to_ethtool_sup_t mmd_eee_adv_to_ethtool_adv_t ethtool_adv_to_mmd_eee_adv_t and port drivers from this helpers to linkmode helpers. - rebase against latest net-next - port phy_init_eee() to genphy_c45_eee_is_active() changes v3: - rework some parts of EEE infrastructure and move it to c45 code. - add supported_eee storage and start using it in EEE code and by the micrel driver. - add EEE support for ar8035 PHY - add SmartEEE support to FEC i.MX series. changes v2: - use phydev->supported instead of reading MII_BMSR regiaster - fix @get_eee > @set_eee With this patch series we provide EEE control for KSZ9477 family of switches and AR8035 with i.MX6 configuration. According to my tests, on a system with KSZ8563 switch and 100Mbit idle link, we consume 0,192W less power per port if EEE is enabled. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: start using genphy_c45_ethtool_get/set_eee()Oleksij Rempel
All preparations are done. Now we can start using new functions and remove the old code. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: migrate phy_init_eee() to genphy_c45_eee_is_active()Oleksij Rempel
Reduce code duplicated by migrating phy_init_eee() to genphy_c45_eee_is_active(). Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: c45: migrate to genphy_c45_write_eee_adv()Oleksij Rempel
Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv(). It should work as before except write operation to the EEE adv registers will be done only if some EEE abilities was detected. If some driver will have a regression, related driver should provide own .get_features callback. See micrel.c:ksz9477_get_features() as example. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: c22: migrate to genphy_c45_write_eee_adv()Oleksij Rempel
Migrate from genphy_config_eee_advert() to genphy_c45_write_eee_adv(). It should work as before except write operation to the EEE adv registers will be done only if some EEE abilities was detected. If some driver will have a regression, related driver should provide own .get_features callback. See micrel.c:ksz9477_get_features() as example. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: add genphy_c45_ethtool_get/set_eee() supportOleksij Rempel
Add replacement for phy_ethtool_get/set_eee() functions. Current phy_ethtool_get/set_eee() implementation is great and it is possible to make it even better: - this functionality is for devices implementing parts of IEEE 802.3 specification beyond Clause 22. The better place for this code is phy-c45.c - currently it is able to do read/write operations on PHYs with different abilities to not existing registers. It is better to use stored supported_eee abilities to avoid false read/write operations. - the eee_active detection will provide wrong results on not supported link modes. It is better to validate speed/duplex properties against supported EEE link modes. - it is able to support only limited amount of link modes. We have more EEE link modes... By refactoring this code I address most of this point except of the last one. Adding additional EEE link modes will need more work. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: export phy_check_valid() functionOleksij Rempel
This function will be needed for genphy_c45_ethtool_get_eee() provided by next patch. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: micrel: add ksz9477_get_features()Oleksij Rempel
KSZ8563R, which has same PHYID as KSZ9477 family, will change "EEE control and capability 1" (Register 3.20) content depending on configuration of "EEE advertisement 1" (Register 7.60). Changes on the 7.60 will affect 3.20 register. So, instead of depending on register 3.20, driver should set supported_eee. Proper supported_eee configuration is needed to make use of generic PHY c45 set/get_eee functions provided by next patches. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: phy: add genphy_c45_read_eee_abilities() functionOleksij Rempel
Add generic function for EEE abilities defined by IEEE 802.3 specification. For now following registers are supported: - IEEE 802.3-2018 45.2.3.10 EEE control and capability 1 (Register 3.20) - IEEE 802.3cg-2019 45.2.1.186b 10BASE-T1L PMA status register (Register 1.2295) Since I was not able to find any flag signaling support of these registers, we should detect link mode abilities first and then based on these abilities doing EEE link modes detection. Results of EEE ability detection will be stored into new variable phydev->supported_eee. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: dsa: microchip: enable EEE supportOleksij Rempel
Some of KSZ9477 family switches provides EEE support. To enable it, we just need to register set_mac_eee/set_mac_eee handlers and validate supported chip version and port. Currently supported chip variants are: KSZ8563, KSZ9477, KSZ9563, KSZ9567, KSZ9893, KSZ9896, KSZ9897. KSZ8563 supports EEE only with 100BaseTX/Full. Other chips support 100BaseTX/Full and 1000BaseTX/Full. Low Power Idle configuration is not supported and currently not documented in the datasheets. EEE PHY specific tunings are not documented in the switch datasheets, but can overlap with KSZ9131 specification. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13platform/x86: dell-ddv: Prefer asynchronous probingArmin Wolf
During probe, both sensor buffers need to be queried to initialize the hwmon channels. This might be slow on some machines, causing a unnecessary delay during boot. Mark the driver with PROBE_PREFER_ASYNCHRONOUS so that it can be probed asynchronously. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20230209211503.2739-3-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform/x86: dell-ddv: Add hwmon supportArmin Wolf
Thanks to bugreport 216655 on bugzilla triggered by the dell-smm-hwmon driver, the contents of the sensor buffers could be almost completely decoded. Add an hwmon interface for exposing the fan and thermal sensor values. Since the WMI interface can be quite slow on some machines, the sensor buffers are cached for 1 second to lessen the performance impact. The debugfs interface remains in place to aid in reverse-engineering of unknown sensor types and the thermal buffer. Tested-by: Antonín Skala <skala.antonin@gmail.com> Tested-by: Gustavo Walbon <gustavowalbon@gmail.com> Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230209211503.2739-2-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13Documentation/ABI: Add new attribute for mlxreg-io sysfs interfacesVadim Pasternak
Add description for new attributes added for rack manager switch and NG800 family systems. Attributes related to power converter board: - reset_pwr_converter_fail; - pwr_converter_prog_en; Attributes related to External Root of Trust (EROT) devices recovery: - erot1_ap_reset; - erot2_ap_reset; - erot1_recovery; - erot2_recovery; - erot1_reset; - erot2_reset; - erot1_wp; - erot2_wp; - spi_chnl_select; Attributes related to clock board failures and recovery: - clk_brd1_boot_fail; - clk_brd2_boot_fail; - clk_brd_fail; - clk_brd_prog_en; Attributes related to power failures: - reset_ac_ok_fail; - asic_pg_fail; Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-14-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform: mellanox: mlx-platform: Move bus shift assignment out of the loopVadim Pasternak
Move assignment of bus shift setting out of the loop to avoid redundant operation. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-13-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform: mellanox: mlx-platform: Add mux selection register to regmapVadim Pasternak
Extend writeable, readable, volatile registers of the 'regmap' object with for I2C mux selector registers. The motivation is to pass this object extended with selector registers to I2C mux driver working over ‘regmap’. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-12-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform_data/mlxreg: Add field with mapped resource addressVadim Pasternak
Add field with PCIe remapped based address for passing it across relevant platform drivers sharing common system resources. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-11-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform/mellanox: mlxreg-hotplug: Allow more flexible hotplug events ↵Vadim Pasternak
configuration Currently hotplug configuration in logic device assumes that all items are provided with no holes. Thus, any group of hotplug events, associated with the specific status/event/mask registers is configured in those registers successively from bit zero to bit #n (#n < 8). This logic is changed int order to allow non-successive definition to support configuration with the skipped bits – for example bits 3, 5, 7 in status/event/mask registers can be associated with hotplug events, while others can be skipped. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-10-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform: mellanox: Extend all systems with I2C notification callbackVadim Pasternak
Motivation is to provide synchronization between I2C main bus and other platform drivers using this notification callback. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-9-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13platform: mellanox: Split logic in init and exit flowVadim Pasternak
Split logic in mlxplat_init()/mlxplat_exit() routines. Separate initialization of I2C infrastructure and others platform drivers. Motivation is to provide synchronization between I2C bus and mux drivers and other drivers using this infrastructure. I2C main bus and MUX busses are implemented in FPGA logic. On some new systems the numbers allocated for these busses could be variable depending on order of initialization of I2C native busses. Since bus numbers are passed to some other platform drivers during initialization flow, it is necessary to synchronize completion of I2C infrastructure drivers and activation of rest of drivers. Thus initialization flow will be performed in synchronized order. Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Michael Shych <michaelsh@nvidia.com> Link: https://lore.kernel.org/r/20230208063331.15560-8-vadimp@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-13Merge branch 'ionic-on-chip-desc'David S. Miller
Shannon Nelson says: ==================== ionic: on-chip descriptors We start with a couple of house-keeping patches that were originally presented for 'net', then we add support for on-chip descriptor rings for tx-push, as well as adding support for rx-push. I have a patch for the ethtool userland utility that I can send out once this has been accepted. v4: added rx-push attributes to ethtool netlink converted CMB feature from using a priv-flag to using ethtool tx/rx-push v3: edited commit message to describe interface-down limitation added warn msg if cmb_inuse alloc fails removed unnecessary clearing of phy_cmb_pages and cmb_npages changed cmb_rings_toggle to use cmb_inuse removed unrelated pci_set_drvdata() removed unnecessary (u32) cast added static inline func for writing CMB descriptors v2: dropped the rx buffers patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>