Age | Commit message (Collapse) | Author |
|
Trigger full device recovery when the driver fails to restore device state
via engine reset and resume operations. This is necessary because, even if
submissions from a faulty context are blocked, the NPU may still process
previously submitted faulty jobs if the engine reset fails to abort them.
Such jobs can continue to generate faults and occupy device resources.
When engine reset is ineffective, the only way to recover is to perform
a full device recovery.
Fixes: dad945c27a42 ("accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250528154253.500556-1-jacek.lawrynowicz@linux.intel.com
|
|
Add debugfs interface to modify following priority bands properties:
* grace period
* process grace period
* process quantum
This allows for the adjustment of hardware scheduling algorithm parameters
for each existing priority band, facilitating validation and fine-tuning.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204084622.2422544-4-jacek.lawrynowicz@linux.intel.com
|
|
Copy engine was deprecated by the FW and is no longer supported.
Compute engine includes all copy engine functionality and should be used
instead.
This change does not affect user space as the copy engine was never
used outside of a couple of tests.
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241017145817.121590-4-jacek.lawrynowicz@linux.intel.com
|
|
IVPU_TEST_MODE_HWS_EXTRA_EVENTS was never used and can be safely removed
Reviewed-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-30-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
Use correct channel for dyndbg JSM message.
Reviewed-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-29-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
Refactor IPC send and receive functions to allow correct
handling of operations that should not trigger a recovery process.
Expose ivpu_send_receive_internal(), which is now utilized by the D0i3
entry, DCT initialization, and HWS initialization functions.
These functions have been modified to return error codes gracefully,
rather than initiating recovery.
The updated functions are invoked within ivpu_probe() and ivpu_resume(),
ensuring that any errors encountered during these stages result in a proper
teardown or shutdown sequence. The previous approach of triggering recovery
within these functions could lead to a race condition, potentially causing
undefined behavior and kernel crashes due to null pointer dereferences.
Fixes: 45e45362e095 ("accel/ivpu: Introduce ivpu_ipc_send_receive_active()")
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-23-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
Send JSM state dump message at the beginning of TDR handler. This allows
FW to collect debug info in the FW log before the state of the NPU is
lost allowing to analyze the cause of a TDR.
Wait a predefined timeout (10 ms) so the FW has a chance to write debug
logs. We cannot wait for JSM response at this point because IRQs are
already disabled before TDR handler is invoked.
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-9-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
This commit bumps:
- Boot API from 3.24.0 to 3.26.3
- JSM API from 3.16.0 to 3.25.0
Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Co-developed-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240930195322.461209-2-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
Remove duplicated debug messages from ivpu_jsm_(un)register_db().
Debug messages are already printed one level higher.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-15-jacek.lawrynowicz@linux.intel.com
|
|
When host system is under heavy load and the NPU is already running
on the lowest frequency, PUNIT may request Duty Cycle Throttling (DCT).
This will further reduce NPU power usage.
PUNIT requests DCT mode using Survabilty IRQ and mailbox register.
The driver then issues a JSM message to the FW that enables
the DCT mode. If the NPU resets while in DCT mode, the driver request
DCT mode during FW boot.
Also add debugfs "dct" file that allows to set arbitrary DCT percentage,
which is used by driver tests.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Wachowski, Karol <karol.wachowski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-7-jacek.lawrynowicz@linux.intel.com
|
|
Abort all jobs that belong to contexts generating MMU faults in order
to avoid flooding host with MMU IRQs.
Jobs are cancelled with:
- SSID_RELEASE command when OS scheduling is enabled
- DESTROY_CMDQ command when HW scheduling is enabled
Signed-off-by: Maciej Falkowski <maciej.falkowski@intel.com>
Co-developed-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611120433.1012423-3-jacek.lawrynowicz@linux.intel.com
|
|
Implement time based Metric Streamer profiling UAPI.
This is a generic mechanism allowing user mode tools to sample
NPU metrics. These metrics are defined by the FW and transparent to
the driver.
The user space can check for this feature by checking
DRM_IVPU_CAP_METRIC_STREAMER driver capability.
Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513120431.3187212-9-jacek.lawrynowicz@linux.intel.com
|
|
Add JSM messages that will be used to implement hardware scheduler.
Most of these messages are used to create and manage HWS specific
command queues.
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513120431.3187212-6-jacek.lawrynowicz@linux.intel.com
|
|
Currently the VPU firmware prepares for D0i3 every time the VPU
is entering D0i2 Idle state. This is not optimal as we might not
enter D0i3 every time we enter D0i2 Idle and this preparation
is quite costly.
This optimization moves D0i3 preparation to a dedicated
message sent from the host driver only when the driver is about
to enter D0i3 - this reduces power consumption and latency for
certain workloads, for example audio workloads that submit
inference every 10 ms.
The VPU needs non zero time to enter IDLE state after responding to
D0i3 entry message. If the driver does not wait for the VPU to enter
IDLE state it could cause warm boot failures.
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-12-stanislaw.gruszka@linux.intel.com
|
|
Bump boot API to 4.20
Bump JSM API to 3.15
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028133415.1169975-2-stanislaw.gruszka@linux.intel.com
|
|
Introduce ivpu_jsm_msg_type_to_str() helper to print type of IPC
message. This will make reading logs and debugging IPC issues easier.
Co-developed-by: Maciej Falkowski <maciej.falkowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@intel.com>
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231020104501.697763-5-stanislaw.gruszka@linux.intel.com
|
|
Quite often during test corner cases IPC, JSM functions can flood
dmesg with warn or err messages. With that lost dmesg history.
Change warn, err to ratelimited versions in IPC, JSM to suppress
dmesg spam occurrence during fail test scenarios.
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231020104501.697763-2-stanislaw.gruszka@linux.intel.com
|
|
`strncpy` is deprecated for use on NUL-terminated destination strings [1].
A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on its destination buffer argument which is
_not_ the case for `strncpy`!
Also remove extraneous if-statement as it can never be entered. The
return value from `strncpy` is it's first argument. In this case,
`...dyndbg_cmd` is an array:
| char dyndbg_cmd[VPU_DYNDBG_CMD_MAX_LEN];
^^^^^^^^^^
This can never be NULL which means `strncpy`'s return value cannot be
NULL here. Just use `strscpy` which is more robust and results in
simpler and less ambiguous code.
Moreover, remove needless `... - 1` as `strscpy`'s implementation
ensures NUL-termination and we do not need to carefully dance around
ending boundaries with a "- 1" anymore.
Fixes: 5d7422cfb498 ("accel/ivpu: Add IPC driver and JSM messages")
Link: www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230824-strncpy-drivers-accel-ivpu-ivpu_jsm_msg-c-v1-1-12d9b52d2dff@google.com
|
|
The VPU_JSM_MSG_CONTEXT_DELETE will remove any resources associated
with the SSID, that included any blobs create by the user space
application.
The command can also remove doorbell registrations, but since this
does not work in HW scheduling case, we do not depend on this
capability and unregister the doorbells explicitly.
Fixes: cd7272215c44 ("accel/ivpu: Add command buffer submission logic")
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230202092114.2637452-3-stanislaw.gruszka@linux.intel.com
(cherry picked from commit 38257f514d85cd9d3f7586e768cec4f246635f77)
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
|
|
The IPC driver is used to send and receive messages to/from firmware
running on the VPU.
The only supported IPC message format is Job Submission Model (JSM)
defined in vpu_jsm_api.h header.
Co-developed-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Co-developed-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20230117092723.60441-5-jacek.lawrynowicz@linux.intel.com
|