Age | Commit message (Collapse) | Author |
|
This fixes the build here locally on my 32-bit arm build.
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit f439a959dcfb6b39d6fd4b85ca1110a1d1de1587)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The PPU contains a series of identical MMIO register ranges, one for
each power domain. Each range contains control/status bits for a clock
gate, reset line, output gates, and a power switch. (The clock and reset
are separate from, and in addition to, the bits in the CCU.) It also
contains a hardware power sequence engine to control the other bits.
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20230126063419.15971-3-samuel@sholland.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
|
|
Commit 2aeaf663b85e introduced strict checking for variable length
payload size validation. The payload length of received data must
match the size of the requested data by the caller except for the case
where the min_out value is set.
The Get Log command does not have a header with a length field set.
The Log size is determined by the Get Supported Logs command (CXL 3.0,
8.2.9.5.1). However, the actual size can be smaller and the number of
valid bytes in the payload output must be determined reading the
Payload Length field (CXL 3.0, Table 8-36, Note 2).
Two issues arise: The command can successfully complete with a payload
length of zero. And, the valid payload length must then also be
consumed by the caller.
Change cxl_xfer_log() to pass the number of payload bytes back to the
caller to determine the number of log entries. Implement the payload
handling as a special case where mbox_cmd->size_out is consulted when
cxl_internal_send_cmd() returns -EIO. A WARN_ONCE() is added to check
that -EIO is only returned in case of an unexpected output size.
Logs can be bigger than the maximum payload length and multiple Get
Log commands can be issued. If the received payload size is smaller
than the maximum payload size we can assume all valid bytes have been
fetched. Stop sending further Get Log commands then.
On that occasion, change debug messages to also report the opcodes of
supported commands.
The variable payload commands GET_LSA and SET_LSA are not affected by
this strict check: SET_LSA cannot be broken because SET_LSA does not
return an output payload, and GET_LSA never expects short reads.
Fixes: 2aeaf663b85e ("cxl/mbox: Add variable output size validation for internal commands")
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230119094934.86067-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A bunch of driver fixes with a tiny bit of new IDs"
* tag 'i2c-for-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rk3x: fix a bunch of kernel-doc warnings
i2c: axxia: use 'struct' for kernel-doc notation
dt-bindings: i2c: renesas,rzv2m: Fix SoC specific string
i2c: mxs: suppress probe-deferral error message
i2c: designware-pci: Add new PCI IDs for AMD NAVI GPU
i2c: designware: Fix unbalanced suspended flag
i2c: designware: use casting of u64 in clock multiplication to avoid overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix the -c option in the gpio-event-mode user-space example program
- fix the irq number translation in gpio-ep93xx and make its irqchip
immutable
- add a missing spin_unlock in error path in gpio-mxc
- fix a suspend breakage on System76 and Lenovo Gen2a introduced in
GPIO ACPI
* tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
tools: gpio: fix -c option of gpio-event-mon
gpio: ep93xx: remove unused variable
gpio: ep93xx: Make irqchip immutable
gpio: ep93xx: Fix port F hwirq numbers in handler
gpio: mxc: Unlock on error path in mxc_flip_edge()
gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
|
|
Pull drm fixes from Dave Airlie:
"Fairly small this week as well, i915 has a memory leak fix and some
minor changes, and amdgpu has some MST fixes, and some other minor
ones:
drm:
- DP MST kref fix
- fb_helper: check return value
i915:
- Fix BSC default context for Meteor Lake
- Fix selftest-scheduler's modify_type
- memory leak fix
amdgpu:
- GC11.x fixes
- SMU13.0.0 fix
- Freesync video fix
- DP MST fixes
- build fix"
* tag 'drm-fixes-2023-01-27' of git://anongit.freedesktop.org/drm/drm:
amdgpu: fix build on non-DCN platforms.
drm/amd/display: Fix timing not changning when freesync video is enabled
drm/display/dp_mst: Correct the kref of port.
drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
drm/amdgpu/display/mst: limit payload to be updated one by one
drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
drm/amdgpu: declare firmware for new MES 11.0.4
drm/amdgpu: enable imu firmware for GC 11.0.4
drm/amd/pm: add missing AllowIHInterrupt message mapping for SMU13.0.0
drm/amdgpu: remove unconditional trap enable on add gfx11 queues
drm/fb-helper: Use a per-driver FB deferred I/O handler
drm/fb-helper: Check fb_deferred_io_init() return value
drm/i915/selftest: fix intel_selftest_modify_policy argument types
drm/i915/mtl: Fix bcs default context
drm/i915: Fix a memory leak with reused mmap_offset
drm/drm_vma_manager: Add drm_vma_node_allow_once()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"Add ACPI backlight handling quirks for 3 machines (Hans de Goede)"
* tag 'acpi-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: video: Add backlight=native DMI quirk for Asus U46E
ACPI: video: Add backlight=native DMI quirk for HP EliteBook 8460p
ACPI: video: Add backlight=native DMI quirk for HP Pavilion g6-1d80nr
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
"Add locking to the Intel int340x thermal control driver to prevent its
thermal zone callbacks from racing with firmware-induced thermal trip
point updates (Srinivas Pandruvada, Rafael Wysocki)"
* tag 'thermal-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()
thermal: intel: int340x: Protect trip temperature from concurrent updates
|
|
The GuC specific register state entry in the error capture object was
just called 'capture'. Although the companion 'node' entry was called
'guc_capture_node'. Rename the base entry to be 'guc_capture' instead
so that it is a) more consistent and b) more obvious what it is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
|
|
For understanding bug reports, it can be useful to have an explicit
dmesg print when a reset notification is received from GuC. As opposed
to simply inferring that this happened from other messages.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com
|
|
Engine resets are supposed to never fail. But in the case when one
does (due to unknown reasons that normally come down to a missing
w/a), it is useful to get as much information out of the system as
possible. Given that the GuC intentionally dies on such a situation,
it is not possible to get a guilty context notification back. So do a
manual search instead. Given that GuC is dead, this is safe because
GuC won't be changing the engine state asynchronously.
v2: Change comment to be less alarming (Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com
|
|
A hang situation has been observed where the only requests on the
context were either completed or not yet started according to the
breaadcrumbs. However, the register state claimed a batch was (maybe)
in progress. So, allow capture of the pending request on the grounds
that this might be better than nothing.
v2: Reword 'not started' warning message (Tvrtko)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-6-John.C.Harrison@Intel.com
|
|
There was a report of error captures occurring without any hung
context being indicated despite the capture being initiated by a 'hung
context notification' from GuC. The problem was not reproducible.
However, it is possible to happen if the context in question has no
active requests. For example, if the hang was in the context switch
itself then the breadcrumb write would have occurred and the KMD would
see an idle context.
In the interests of attempting to provide as much information as
possible about a hang, it seems wise to include the engine info
regardless of whether a request was found or not. As opposed to just
prentending there was no hang at all.
So update the error capture code to always record engine information
if a context is given. Which means updating record_context() to take a
context instead of a request (which it only ever used to find the
context anyway). And split the request agnostic parts of
intel_engine_coredump_add_request() out into a seaprate function.
v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null
pointer.
v3: Tidy up request locking code flow (Tvrtko)
v4: Pull in improved info message from next patch and fix up potential
leak of GuC register state (Daniele)
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2)
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-5-John.C.Harrison@Intel.com
|
|
The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.
So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.
v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
|
|
When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.
The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.
The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.
v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
|
|
intel_guc_find_hung_context() was not acquiring the correct spinlock
before searching the request list. So fix that up. While at it, add
some extra whitespace padding for readability.
Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
- Fix event counting regression in Arm CMN PMU driver due to broken
optimisation
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Partially revert "perf/arm-cmn: Optimise DTC counter accesses"
|
|
To work around a Clang __builtin_object_size bug that shows up under
CONFIG_FORTIFY_SOURCE and UBSAN_BOUNDS, move the per-loop-iteration
mem_block wipe into a single wipe of the entire pool structure after
the loop.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1780
Cc: Weili Qian <qianweili@huawei.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Link: https://lore.kernel.org/r/20230106041945.never.831-kees@kernel.org
|
|
Zero-length arrays are deprecated[1]. Replace struct i40e_lump_tracking's
"list" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:
In function 'i40e_put_lump',
inlined from 'i40e_clear_interrupt_scheme' at drivers/net/ethernet/intel/i40e/i40e_main.c:5145:2:
drivers/net/ethernet/intel/i40e/i40e_main.c:278:27: warning: array subscript <unknown> is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Warray-bounds=]
278 | pile->list[i] = 0;
| ~~~~~~~~~~^~~
drivers/net/ethernet/intel/i40e/i40e.h: In function 'i40e_clear_interrupt_scheme':
drivers/net/ethernet/intel/i40e/i40e.h:179:13: note: while referencing 'list'
179 | u16 list[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230105234557.never.799-kees@kernel.org
|
|
One-element arrays are deprecated, and we are replacing them with
flexible array members instead. So, replace one-element array with
flexible-array member in struct gvt_firmware_header and refactor the
rest of the code accordingly.
Additionally, previous implementation was allocating 8 bytes more than
required to represent firmware_header + cfg_space data + mmio data.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].
To make reviewing this patch easier, I'm pasting before/after struct
sizes.
pahole -C gvt_firmware_header before/drivers/gpu/drm/i915/gvt/firmware.o
struct gvt_firmware_header {
u64 magic; /* 0 8 */
u32 crc32; /* 8 4 */
u32 version; /* 12 4 */
u64 cfg_space_size; /* 16 8 */
u64 cfg_space_offset; /* 24 8 */
u64 mmio_size; /* 32 8 */
u64 mmio_offset; /* 40 8 */
unsigned char data[1]; /* 48 1 */
/* size: 56, cachelines: 1, members: 8 */
/* padding: 7 */
/* last cacheline: 56 bytes */
};
pahole -C gvt_firmware_header after/drivers/gpu/drm/i915/gvt/firmware.o
struct gvt_firmware_header {
u64 magic; /* 0 8 */
u32 crc32; /* 8 4 */
u32 version; /* 12 4 */
u64 cfg_space_size; /* 16 8 */
u64 cfg_space_offset; /* 24 8 */
u64 mmio_size; /* 32 8 */
u64 mmio_offset; /* 40 8 */
unsigned char data[]; /* 48 0 */
/* size: 48, cachelines: 1, members: 8 */
/* last cacheline: 48 bytes */
};
As you can see the additional byte of the fake-flexible array (data[1])
forced the compiler to pad the struct but those bytes aren't actually used
as first & last bytes (of both cfg_space and mmio) are controlled by the
<>_size and <>_offset members present in the gvt_firmware_header struct.
Link: https://github.com/KSPP/linux/issues/79
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/Y6Eu2604cqtryP4g@mail.google.com
|
|
Both Coverity and GCC with -Wstringop-overflow noticed that
nvif_outp_acquire_dp() accidentally defined its second argument with 1
additional element:
drivers/gpu/drm/nouveau/dispnv50/disp.c: In function 'nv50_pior_atomic_enable':
drivers/gpu/drm/nouveau/dispnv50/disp.c:1813:17: error: 'nvif_outp_acquire_dp' accessing 16 bytes in a region of size 15 [-Werror=stringop-overflow=]
1813 | nvif_outp_acquire_dp(&nv_encoder->outp, nv_encoder->dp.dpcd, 0, 0, false, false);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/nouveau/dispnv50/disp.c:1813:17: note: referencing argument 2 of type 'u8[16]' {aka 'unsigned char[16]'}
drivers/gpu/drm/nouveau/include/nvif/outp.h:24:5: note: in a call to function 'nvif_outp_acquire_dp'
24 | int nvif_outp_acquire_dp(struct nvif_outp *, u8 dpcd[16],
| ^~~~~~~~~~~~~~~~~~~~
Avoid these warnings by defining the argument size using the matching
define (DP_RECEIVER_CAP_SIZE, 15) instead of having it be a literal
(and incorrect) value (16).
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527269 ("Memory - corruptions")
Addresses-Coverity-ID: 1527268 ("Memory - corruptions")
Link: https://lore.kernel.org/lkml/202211100848.FFBA2432@keescook/
Link: https://lore.kernel.org/lkml/202211100848.F4C2819BB@keescook/
Fixes: 813443721331 ("drm/nouveau/disp: move DP link config into acquire")
Reviewed-by: Lyude Paul <lyude@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221127183036.never.139-kees@kernel.org
|
|
The PF controls the set of queues that the RDMA auxiliary_driver requests
resources from. The set_channel command will alter that pool and trigger a
reconfiguration of the VSI, which breaks RDMA functionality.
Prevent set_channel from executing when RDMA driver bound to auxiliary
device.
Adding a locked variable to pass down the call chain to avoid double
locking the device_lock.
Fixes: 348048e724a0 ("ice: Implement iidc operations")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The non-cache mkeys are stored in the cache only to shorten restarting
application time. Don't store them longer than needed.
Configure cache entries that store non-cache MRs as temporary entries. If
30 seconds have passed and no user reclaimed the temporarily cached mkeys,
an asynchronous work will destroy the mkeys entries.
Link: https://lore.kernel.org/r/20230125222807.6921-7-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently, when dereging an MR, if the mkey doesn't belong to a cache
entry, it will be destroyed. As a result, the restart of applications
with many non-cached mkeys is not efficient since all the mkeys are
destroyed and then recreated. This process takes a long time (for 100,000
MRs, it is ~20 seconds for dereg and ~28 seconds for re-reg).
To shorten the restart runtime, insert all cacheable mkeys to the cache.
If there is no fitting entry to the mkey properties, create a temporary
entry that fits it.
After a predetermined timeout, the cache entries will shrink to the
initial high limit.
The mkeys will still be in the cache when consuming them again after an
application restart. Therefore, the registration will be much faster
(for 100,000 MRs, it is ~4 seconds for dereg and ~5 seconds for re-reg).
The temporary cache entries created to store the non-cache mkeys are not
exposed through sysfs like the default cache entries.
Link: https://lore.kernel.org/r/20230125222807.6921-6-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Switch from using the mkey order to using the new struct as the key to the
RB tree of cache entries.
The key is all the mkey properties that UMR operations can't modify.
Using this key to define the cache entries and to search and create cache
mkeys.
Link: https://lore.kernel.org/r/20230125222807.6921-5-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently, the cache structure is a static linear array. Therefore, his
size is limited to the number of entries in it and is not expandable. The
entries are dedicated to mkeys of size 2^x and no access_flags. Mkeys with
different properties are not cacheable.
In this patch, we change the cache structure to an RB-tree. This will
allow to extend the cache to support more entries with different mkey
properties.
Link: https://lore.kernel.org/r/20230125222807.6921-4-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Implicit ODP mkey doesn't have unique properties. It shares the same
properties as the order 18 cache entry. There is no need to devote a
special entry for that.
Link: https://lore.kernel.org/r/20230125222807.6921-3-michaelgur@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mkc.log_page_size can be changed using UMR. Therefore, don't treat it as a
cache entry property.
Removing it from struct mlx5_cache_ent.
All cache mkeys will be created with default PAGE_SHIFT, and updated with
the needed page_shift using UMR when passing them to a user.
Link: https://lore.kernel.org/r/20230125222807.6921-2-michaelgur@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Use the newly implemented tegra_dev_iommu_get_stream_id() helper to
encapsulate and centralize the IOMMU stream ID access.
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
When calling spidev_message() from the one of the ioctl() callbacks, the
spi_lock is already taken. When we then end up calling spidev_sync(), we
get the following splat:
[ 214.047619]
[ 214.049198] ============================================
[ 214.054533] WARNING: possible recursive locking detected
[ 214.059858] 6.2.0-rc3-0.0.0-devel+git.97ec4d559d93 #1 Not tainted
[ 214.065969] --------------------------------------------
[ 214.071290] spidev_test/1454 is trying to acquire lock:
[ 214.076530] c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x8e0/0xab8
[ 214.084164]
[ 214.084164] but task is already holding lock:
[ 214.090007] c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x44/0xab8
[ 214.097537]
[ 214.097537] other info that might help us debug this:
[ 214.104075] Possible unsafe locking scenario:
[ 214.104075]
[ 214.110004] CPU0
[ 214.112461] ----
[ 214.114916] lock(&spidev->spi_lock);
[ 214.118687] lock(&spidev->spi_lock);
[ 214.122457]
[ 214.122457] *** DEADLOCK ***
[ 214.122457]
[ 214.128386] May be due to missing lock nesting notation
[ 214.128386]
[ 214.135183] 2 locks held by spidev_test/1454:
[ 214.139553] #0: c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x44/0xab8
[ 214.147524] #1: c4925e14 (&spidev->buf_lock){+.+.}-{3:3}, at: spidev_ioctl+0x70/0xab8
[ 214.155493]
[ 214.155493] stack backtrace:
[ 214.159861] CPU: 0 PID: 1454 Comm: spidev_test Not tainted 6.2.0-rc3-0.0.0-devel+git.97ec4d559d93 #1
[ 214.169012] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 214.175555] unwind_backtrace from show_stack+0x10/0x14
[ 214.180819] show_stack from dump_stack_lvl+0x60/0x90
[ 214.185900] dump_stack_lvl from __lock_acquire+0x874/0x2858
[ 214.191584] __lock_acquire from lock_acquire+0xfc/0x378
[ 214.196918] lock_acquire from __mutex_lock+0x9c/0x8a8
[ 214.202083] __mutex_lock from mutex_lock_nested+0x1c/0x24
[ 214.207597] mutex_lock_nested from spidev_ioctl+0x8e0/0xab8
[ 214.213284] spidev_ioctl from sys_ioctl+0x4d0/0xe2c
[ 214.218277] sys_ioctl from ret_fast_syscall+0x0/0x1c
[ 214.223351] Exception stack(0xe75cdfa8 to 0xe75cdff0)
[ 214.228422] dfa0: 00000000 00001000 00000003 40206b00 bee266e8 bee266e0
[ 214.236617] dfc0: 00000000 00001000 006a71a0 00000036 004c0040 004bfd18 00000000 00000003
[ 214.244809] dfe0: 00000036 bee266c8 b6f16dc5 b6e8e5f6
Fix it by introducing an unlocked variant of spidev_sync() and calling it
from spidev_message() while other users who don't check the spidev->spi's
existence keep on using the locking flavor.
Reported-by: Francesco Dolcini <francesco@dolcini.it>
Fixes: 1f4d2dd45b6e ("spi: spidev: fix a race condition when accessing spidev->spi")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Max Krummenacher <max.krummenacher@toradex.com>
Link: https://lore.kernel.org/r/20230116144149.305560-1-brgl@bgdev.pl
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Due to using the u16 type in the min_t() macros the SPI transfer length
will be cast to word before participating in the conditional statement
implied by the macro. Thus if the transfer length is greater than 64KB the
Tx/Rx FIFO threshold level value will be determined by the leftover of the
truncated after the type-case length. In the worst case it will cause the
dramatical performance drop due to the "Tx FIFO Empty" or "Rx FIFO Full"
interrupts triggered on each xfer word sent/received to/from the bus.
The problem can be easily fixed by specifying the unsigned int type in the
min_t() macros thus preventing the possible data loss.
Fixes: ea11370fffdf ("spi: dw: get TX level without an additional variable")
Reported-by: Sergey Nazarov <Sergey.Nazarov@baikalelectronics.ru>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230113185942.2516-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
DW eDMA v4.70a and older have the read and write channels context CSRs
indirectly accessible, which means CSRs like Channel Control, Xfer size,
SAR, DAR and LLP address are accessed at a fixed MMIO address, with their
reference to the corresponding channel determined by the Viewport CSR. To
have a coherent access to these registers the CSR IOs are supposed to be
protected with a spinlock. DW eDMA v4.80a and newer normally have unrolled
Read/Write channel context registers, with these CSRs directly mapped in
the controller MMIO space.
Both normal and viewport-based registers are exposed via debugfs nodes, and
the original algorithm was based on the unrolled CSRs mapping and
recalculated the viewport addresses when required. This is unscalable (it
only supports a platform with a single eDMA since a base address is
statically preserved) and also needlessly overcomplicated (it loops over
all Rd/Wr context addresses and recalculates the viewport base address on
each debugfs node access).
Simplify the algorithm by adding the channel ID and its direction fields in
the eDMA debugfs node descriptor. These new fields can be used to find a
CSR offset in the channel register space. The DW eDMA debugfs node getter
will also use them to activate the respective context CSRs viewport before
reading data from the specified register. For the unrolled CSR mapping, no
spinlock or viewport activation is needed.
Note: this replaces some REGISTER() uses with CTX_REGISTER(), which avoids
an implicit dependency on a local variable name. The same problem with the
rest of the macro will be fixed in the next commit.
Link: https://lore.kernel.org/r/20230113171409.30470-16-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Since we are about to add the eDMA channels direction support to the
debugfs module it will be confusing to have both the debugfs directory and
the channels direction short names used in the same code.
Rename the debugfs dentry 'dir' variables to 'dent' to prevent confusion.
Suggested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20230113171409.30470-15-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Currently DW eDMA debugfs node descriptors are allocated on the stack,
which won't work for multi-eDMA platforms. As a preparation to supporting
multi-eDMA systems, allocate each debugfs node separately. Afterwards
we'll add info like Read/Write channel flag, channel ID, DW eDMA private
data reference.
Note: this conversion is mainly required due to having the legacy DW eDMA
controllers with indirect Read/Write channels context CSRs access. If we
didn't need to synchronize access to these registers, the debugfs code of
the driver would have been much simpler.
Link: https://lore.kernel.org/r/20230113171409.30470-14-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Other local names include a "dw_edma" prefix.
Add a "dw_edma" prefix to the debugfs_entries structure, too, so it won't
be confused with global debugfs things.
Link: https://lore.kernel.org/r/20230113171409.30470-13-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The debugfs_create_*() functions never return NULL, so checking their
return value for NULL is pointless. Secondly, the debugfs subsystem is
designed to be as simple as possible, so if one of the debugfs_create_*()
method in a hierarchy fails, the following methods should silently return
the passed erroneous parental dentry. Finally, the code should work no
matter whether anything debugfs-related fails.
To make code simpler and debugfs-independent, stop checking the
debugfs_create_*() return values.
If the debugfs file system is unavailable, skip the debugfs node
initialization altogether to preserve some memory space.
Link: https://lore.kernel.org/r/20230113171409.30470-12-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The debugfs_entries structure declared in dw-edma-v0-debugfs.c contains the
debugfs node register address. The address is declared as dma_addr_t type,
but is cast to "void *".
Change the type to "void __iomem *" and drop the unnecessary casts.
Link: https://lore.kernel.org/r/20230113171409.30470-11-Sergey.Semin@baikalelectronics.ru
Fixes: 305aebeff879 ("dmaengine: Add Synopsys eDMA IP version 0 debugfs support")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The DMA engine core manages dma_device.chancnt itself, e.g., in
dma_async_device_register(). DMA device drivers should not initialize
chancnt because it causes the wrong number of channels printed in the
device summary.
Drop the dw-edma chancnt initialization.
Link: https://lore.kernel.org/r/20230113171409.30470-10-Sergey.Semin@baikalelectronics.ru
Fixes: e63d79d1ffcd ("dmaengine: Add Synopsys eDMA IP core driver")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The Synopsys PCIe Endpoint IP prototype kit can be attached via any PCI
host controller, including one where the PCI bus address space is different
from the CPU address space. Therefore, we need to make sure the source and
destination addresses of the DMA slave devices are converted to the PCI bus
address space; otherwise DMA transactions may cause memory corruption.
Add a new dw_edma_pcie_address() interface to perform this translation by
using pcibios_resource_to_bus().
Link: https://lore.kernel.org/r/20230113171409.30470-9-Sergey.Semin@baikalelectronics.ru
Fixes: 41aaff2a2ac0 ("dmaengine: Add Synopsys eDMA IP PCIe glue-logic")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Since 9575632052ba ("dmaengine: make slave address physical"), the source
and destination addresses of the DMA slave device have been converted to
physical addresses in the CPU address space. It's the DMA device driver's
responsibility to convert them to the DMA bus address space. In case of the
DW eDMA device, the source or destination peripheral (slave) devices reside
in PCI bus space. Thus we need to perform the PCI Host/Endpoint windows-
based (i.e. DT "ranges" property) address translation; otherwise the eDMA
transactions won't work as expected (or can be even harmful) if the CPU and
PCI address spaces don't match.
Note 1: Even though the DMA interleaved template has both source and
destination addresses declared as dma_addr_t, only the CPU memory range
should be mapped to be seen by the DMA device since it's a subject of the
DMA getting towards the system side. The device part must not be mapped
since the slave device resides in the PCI bus space, which isn't affected
by IOMMUs or iATU translations. DW PCIe eDMA generates corresponding
MWr/MRd TLPs on its own.
Note 2: This functionality is mainly required for the remote eDMA setup
since the CPU address must be manually translated into the PCI bus space
before being written to LLI.{SAR,DAR}. If eDMA is embedded in the locally
accessible DW PCIe Root Port/Endpoint, software-based translation isn't
required since hardware will translate it via the Outbound iATU as long as
the DMA_BYPASS flag is cleared. If DMA_BYPASS is set or there is no
Outbound iATU entry that contains the SAR or DAR (for Read and Write
channel respectively), there won't be any translation performed but DMA
will proceed with the corresponding source/destination address as-is.
Link: https://lore.kernel.org/r/20230113171409.30470-8-Sergey.Semin@baikalelectronics.ru
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The interleaved DMA transfer support added by 85e7518f42c8 ("dmaengine:
dw-edma: Add device_prep_interleave_dma() support") seems contradictory to
what the DMA engine defines. The next conditional statements:
if (!xfer->xfer.il->numf)
return NULL;
if (xfer->xfer.il->numf > 0 && xfer->xfer.il->frame_size > 0)
return NULL;
mean that numf can't be zero and frame_size must always be zero, otherwise
the transfer won't be executed. Furthermore, the transfer execution method
takes the frame size from the dma_interleaved_template.sgl[] array for each
frame. That array in accordance with [1] is supposed to be of
dma_interleaved_template.frame_size size, which as we discovered before the
code expects to be zero. So judging by the dw_edma_device_transfer()
implementation, the method implies the dma_interleaved_template.sgl[] array
being of dma_interleaved_template.numf size, which is wrong. Since the
dw_edma_device_transfer() method doesn't permit
dma_interleaved_template.frame_size being non-zero, the multi-chunk
interleaved transfer turns to be unsupported even though the code implies
having it supported.
Add fully functioning support of interleaved DMA transfers.
First of all, dma_interleaved_template.frame_size is supposed to be greater
or equal to one thus having at least simple linear chunked frames.
Secondly, we can create a walk-through over all the chunks and frames by
initializing the number of the eDMA burst transactions as a multiple of
dma_interleaved_template.numf and dma_interleaved_template.frame_size and
getting the frame_size-modulo of the iteration step as an index of the
dma_interleaved_template.sgl[] array.
[1] include/linux/dmaengine.h: doc struct dma_interleaved_template
Link: https://lore.kernel.org/r/20230113171409.30470-7-Sergey.Semin@baikalelectronics.ru
Fixes: 85e7518f42c8 ("dmaengine: dw-edma: Add device_prep_interleave_dma() support")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The DW eDMA controller always increments both source and destination
addresses. Permitting DMA interleaved transfers with no src_inc/dst_inc
flags set may lead to unexpected behaviour for the device users.
Terminate interleaved transfers if at least one of the
dma_interleaved_template.{src_inc,dst_inc} flag is initialized to "false".
Note that in addition, we need to increase the source and destination
addresses after each iteration.
Link: https://lore.kernel.org/r/20230113171409.30470-6-Sergey.Semin@baikalelectronics.ru
Fixes: 85e7518f42c8 ("dmaengine: dw-edma: Add device_prep_interleave_dma() support")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Interleaved DMA transfer support was added by 85e7518f42c8 ("dmaengine:
dw-edma: Add device_prep_interleave_dma() support"), but depending on the
selected channel, either source or destination address are left
uninitialized which was obviously wrong.
Initialize the destination address of the eDMA burst descriptors for
DEV_TO_MEM interleaved operations and the source address for MEM_TO_DEV
operations.
Link: https://lore.kernel.org/r/20230113171409.30470-5-Sergey.Semin@baikalelectronics.ru
Fixes: 85e7518f42c8 ("dmaengine: dw-edma: Add device_prep_interleave_dma() support")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
The dw_edma_region.paddr field should be a memory base address visible by
the DW eDMA controller. If the DMA engine is embedded in the DW PCIe
Host/Endpoint controller, the address should belong to the Local CPU/
Application memory. If eDMA is remotely accessible across the PCI bus via
PCI memory IOs, the address should be part of the PCI bus memory space.
The latter case hasn't been well covered in the corresponding glue-driver.
Since pci_dev.resource[] contains resources defined in the CPU memory
space, they need to be converted to the PCI bus address space. Convert the
LL, DT and CSRs PCI memory ranges with pci_bus_address().
In addition, extend the dw_edma_region.paddr field size. The field normally
contains a memory range base address to be set in the DW eDMA Linked-List
pointer register or as a base address of the Linked-List data buffer. In
accordance with [1] the LL range is supposed to be created in the Local
CPU/Application memory, but depending on the DW eDMA utilization the memory
can be created as a part of the PCI bus address space (as in the case of
the DW PCIe Endpoint prototype kit).
In the former case dw_edma_region.paddr should be a dma_addr_t, while in
the latter one it should be a pci_bus_addr_t. Since the corresponding CSRs
are always 64 bits wide, convert dw_edma_region.paddr to be u64, and let
the client make sure it has a valid address visible by the DW eDMA
controller. For instance, the DW eDMA PCIe glue-driver initializes the
field with addresses from the PCI bus memory space.
[1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port,
v.5.40a, March 2019, p.1103
Link: https://lore.kernel.org/r/20230113171409.30470-4-Sergey.Semin@baikalelectronics.ru
Fixes: 41aaff2a2ac0 ("dmaengine: Add Synopsys eDMA IP PCIe glue-logic")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
If dw_edma_irq_request() fails to initialize an IRQ handler, any previously
requested IRQs will be left initialized.
Release the previously requested IRQs in the cleanup-on-error path of
dw_edma_irq_request().
Link: https://lore.kernel.org/r/20230113171409.30470-3-Sergey.Semin@baikalelectronics.ru
Fixes: e63d79d1ffcd ("dmaengine: Add Synopsys eDMA IP core driver")
Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
|
|
Replace struct rxe-phys_buf and struct rxe_map by struct xarray
in rxe_verbs.h. This allows using rcu locking on reads for
the memory maps stored in each mr.
This is based off of a sketch of a patch from Jason Gunthorpe in the
link below. Some changes were needed to make this work. It applies
cleanly to the current for-next and passes the pyverbs, perftest
and the same blktests test cases which run today.
Link: https://lore.kernel.org/r/20230119235936.19728-7-rpearsonhpe@gmail.com
Link: https://lore.kernel.org/linux-rdma/Y3gvZr6%2FNCii9Avy@nvidia.com/
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
mtk_drm_bind() can fail, in which case drm_dev_put() is called,
destroying the drm_device object. However a pointer to it was still
being held in the private object, and that pointer would be passed along
to DRM in mtk_drm_sys_prepare() if a suspend were triggered at that
point, resulting in a panic. Clean the pointer when destroying the
object in the error path to prevent this from happening.
Fixes: 119f5173628a ("drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20221122143949.3493104-1-nfraprado@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|