Age | Commit message (Collapse) | Author |
|
The maximum DMA transfer size is limited by the maximum values that can
be written to the word count fields (WDLENx) in the Transmit and Control
Data Registers (SITDR2/SIRDR2). As all MSIOF variants support
transferring data of multiple (two or four) groups, the maximum size can
be doubled by using two groups instead of one, thus reducing setup
overhead for very large SPI transfers.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/bad522c76b8d225c195433977b22f95015cf2612.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
As FIELD_PREP() masks the value to be stored in the field, the Baud Rate
Generator's Division Ratio handling can be simplified from a look-up
table to a single subtraction.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/e736221942b0381fb53dc64109a1389f7ec5f44a.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The MSIOF transmit FIFOs on R-Car V4H and V4M have 256 stages.
Add a new family-specific match entry to handle this.
Add quirk match entries for older R-Car Gen4 Socs (R-Car V3U and S4-8)
that have transmit FIFOs with only 64 stages, just like on R-Car Gen3.
Update the (unused) definition of SIFCTR_TFUA for consistency.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/69cb5fc48f034d37484fa127b9864a1971a83417.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
According to the R-Car Gen3 Hardware Manual Errata for Rev 0.55 of
September 28, 2017, the MSIOF receive FIFOs on R-Car Gen3 SoCs have room
for 256 words of 32 bits.
Note that this change has no actual impact on the behavior of the
driver, as SPI_CONTROLLER_MUST_TX is set, and transfer size is currenty
limited to the minimum of the transmit and receive FIFO sizes.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/6f74508ea4681aa0b7c6bf6810eab026725e75a3.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
According to Renesas Technical Updates TN-RCS-S068A/E, the MSIOF receive
FIFOs on R-Car Gen2 SoCs have room for 128 words of 32 bits.
Note that this change has no actual impact on the behavior of the
driver, as SPI_CONTROLLER_MUST_TX is set, and transfer size is currenty
limited to the minimum of the transmit and receive FIFO sizes.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/fd11933f932df81d84f417a21e2179bd4fdcfdc1.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
All MSIOF variants support transferring data of multiple (2 or 4)
groups. Add definitions for the register bits related to multiple
groups, and enhance sh_msiof_spi_set_mode_regs() to accept a second
group size.
For now the second group is unused.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/be75e20cfcd2a6c0d73ab09e0126f902911adc69.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The Group Output Mask is not a single bit, but a bit field, containing
one bit for each of the four possible groups. Correct the definition.
Note that this change has no direct impact, as the driver only uses
the first group.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/ad268d67807cb7e544eddaf7a056793482a965d4.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert MSIOF FIFO Control Register field accesses to use the
FIELD_PREP() bitfield access macro.
This gets rid of explicit shifts and custom field preparation macros.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/0bf4c366381a8999c9755285272897300852bc18.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert MSIOF Control Register field accesses to use the FIELD_PREP()
bitfield access macro.
This gets rid of explicit shifts.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/4511c678c8fce5969eb50ffa7372d53396ff80ff.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert MSIOF Transmit and Receive Clock Select Register field accesses
to use the FIELD_PREP() bitfield access macro.
This gets rid of explicit shifts and custom field preparation macros.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/f2462c99b6ea2e45b995ab4509c2f039043da032.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert MSIOF Transmit and Receive Mode Register 2 field accesses to use
the FIELD_PREP() bitfield access macro.
This gets rid of explicit shifts and custom field preparation macros.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/135b92d010a71e2c224feab3a5792724b4e60ff1.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert MSIOF Transmit and Receive Mode Register 1 field accesses to use
the FIELD_PREP() bitfield access macro.
This gets rid of explicit shifts.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/9685c54e752b8ef4256c9b281e9d8292e71d222e.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Make the words and fs parameters of the various FIFO filler and
emptier functions unsigned, as they can never be negative.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/a7b13ecb1811148227ec8c883079085ed1ea6eac.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Make the words and bits parameters of sh_msiof_spi_txrx_once() unsigned,
as that matches what is passed by the caller.
This allows us to replace min_t() by the safer min().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/30eff1052642a4bcb0f1bc4bed7aae25d355a7dc.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use bools instead of integers for boolean flags, which allows us to
remove the "!!" idiom from several expressions.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/35cd51bdfb3c810911a5be757e0ce5bb29dcc755.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Commit c4887bd4b35b225f ("spi: sh-msiof: use dev in
sh_msiof_spi_probe()") forgot to convert one instance.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/88d271b2d16c6ad7f174858894573f91cec1bc90.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The maximum amount of data to transfer in a single DMA request is
calculated from the FIFO sizes (which is technically not 100% correct,
but a simplification, as it is limited by the maximum word count values
in the Transmit and Control Data Registers). However, in case there is
both data to transmit and to receive, the transmit limit is overwritten
by the receive limit.
Fix this by using the minimum applicable FIFO size instead. Move the
calculation outside the loop, so it is not repeated for each individual
DMA transfer.
As currently tx_fifo_size is always equal to rx_fifo_size, this bug had
no real impact.
Fixes: fe78d0b7691c0274 ("spi: sh-msiof: Fix FIFO size to 64 word from 256 word")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/d9961767a97758b2614f2ee8afe1bd56dc900a60.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The Clock-Synchronized Serial Interfaces with FIFO (MSIOF) driver
matches against both SoC-specific and family-specific compatible values
to maintain backwards-compatibility with old DTBs predating the
introduction of the family-specific compatible values.
For RZ/G1, the SoC-specific compatible match entry can be removed from
the driver: their DT always had the family-specific compatible values,
and thus there was never a need to add the SoC-specific compatible
values to the driver.
For R-Car Gen2 and M3-W, the SoC-specific compatible match entries can
be removed, too, as there are a few points in time where DT
backwards-compatibility was broken for other reasons:
- Legacy DT clock support is no longer supported since commit
58256143cff7c2e0 ("clk: renesas: Remove R-Car Gen2 legacy DT clock
support") in v5.5, and the addition of "renesas,rcar-gen2-msiof" to
DTS in v4.11 predates the completion of the clock conversion in
v4.15,
- Legacy DT LVDS support is no longer supported since commit
841281fe52a769fe ("drm: rcar-du: Drop LVDS device tree backward
compatibility") in v5.18, and the addition of
"renesas,rcar-gen3-msiof" in commit 8b51f97138ca22b6 ("arm64: dts:
r8a7796: Use R-Car Gen 3 fallback binding for msiof nodes") in v4.11
predates the LVDS conversion in commit 58e8ed2ee9abe718 ("arm64:
dts: renesas: Convert to new LVDS DT bindings") in v4.20.
For R-Car H3, the SoC-specific compatible match entry cannot be removed,
as its purpose is to handle an SoC-specific quirk.
Suggested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/d33393ac7536bc3f0f624b079f70d80dd19843db.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
It does not make sense to have a comma after a sentinel, as any new
elements must be added before the sentinel.
Add a comment to clarify the purpose of the empty element.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/2ab5745407339ba54b63c3e6410082c7c566bf95.1747401908.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
'struct spmi_voltage_range' are only modified at runtime to compile a
field, n_voltages, that could be computed at compile time.
So, simplify spmi_calculate_num_voltages() and compute n_voltages at
compile time within the SPMI_VOLTAGE_RANGE macro.
Constifying these structures moves some data to a read-only section, so
increase overall security.
On a x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
85437 26776 512 112725 1b855 drivers/regulator/qcom_spmi-regulator.o
After:
=====
text data bss dec hex filename
86857 24760 512 112129 1b601 drivers/regulator/qcom_spmi-regulator.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://patch.msgid.link/ef2a4b6df61e19470ddf6cbd1f3ca1ce88a3c1a0.1747570556.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use kmemdup_array() to avoid multiplication and possible overflows.
Signed-off-by: Zhang Enpei <zhang.enpei@zte.com.cn>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
When DP connected to a device with HDR capability,
the hdr structure was filled.Then connected to another
sink device without hdr capability, but the hdr info
still exist.
Fixes: e85959d6cbe0 ("drm: Parse HDR metadata info from EDID")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: "feijuan.li" <feijuan.li@samsung.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250514063511.4151780-1-feijuan.li@samsung.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The tee info reg for teev2 is the same as teev1.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The bootloader info reg for pspv5 is the same as pspv4.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
'struct ahash_request' has a flexible array at the end, so it must be the
last member in a struct, to avoid overwriting other struct members.
Therefore, move 'fallback_req' to the end of the 'sun8i_ce_hash_reqctx'
struct.
Fixes: 56f6d5aee88d ("crypto: sun8i-ce - support hash algorithms")
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Current driver uses static LMTST region allocated by firmware.
Firmware allocated memory for LMTST is available in PF/VF BAR2.
Using this memory have performance impact as this is mapped as
device memory. There is another option to allocate contiguous
memory at run time and map this in LMT MAP table with the
help of AF driver. With this patch dynamic allocated memory
is used for LMTST.
Also add myself as maintainer for crypto octeontx2 driver
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Function otx2_cptlf_set_dev_info() initializes common
fields of cptlfs data-struct. This function is called
every time a cptlf is initialized but this needs be done
once for a cptlf block. So this initialization is moved
to early device probe code to avoid redundant initialization.
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Enable the reporting of error counters through sysfs for QAT GEN6
devices and update the ABI documentation.
This enables the reporting of the following:
- errors_correctable - hardware correctable errors that allow the
system to recover without data loss.
- errors_nonfatal: errors that can be isolated to specific in-flight
requests.
- errors_fatal: errors that cannot be contained to a request,
requiring a Function Level Reset (FLR) upon occurrence.
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Enable the reporting and handling of errors for QAT GEN6 devices.
Errors are categorized as correctable, non-fatal, or fatal. Error
handling involves reading the error source registers (ERRSOU0 to ERRSOU3)
to determine the source of the error and then decoding the actual source
reading specific registers.
The action taken depends on the error type:
- Correctable and Non-Fatal errors. These error are logged, cleared and
the corresponding counter is incremented.
- Fatal errors. These errors are logged, cleared and a Function Level
Reset (FLR) is scheduled.
This reports and handles the following errors:
- Accelerator engine (AE) correctable errors
- Accelerator engine (AE) uncorrectable errors
- Chassis push-pull (CPP) errors
- Host interface (HI) parity errors
- Internal memory parity errors
- Receive interface (RI) errors
- Transmit interface (TI) errors
- Interface for system-on-chip (SoC) fabric (IOSF) primary command
parity errors
- Shared RAM and slice module (SSM) errors
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add a new CCP/PSP PCI device ID.
Signed-off-by: John Allen <john.allen@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-next
Updates for v6.16
CI:
- uprev mesa
GPU:
- ACD (Adaptive Clock Distribution) support for X1-85. This is required
enable the higher frequencies.
- Drop fictional `address_space_size`. For some older devices, the address
space size is limited to 4GB to avoid potential 64b rollover math problems
in the fw. For these, an `ADRENO_QUIRK_4GB_VA` quirk is added. For
everyone else we get the address space size from the SMMU `ias` (input
address sizes), which is usually 48b.
- Improve robustness when GMU HFI responses time out
- Fix crash when throttling GPU immediately during boot
- Fix for rgb565_predicator on Adreno 7c3
- Remove `MODULE_FIRMWARE()`s for GPU, the GPU can load the firmware after
probe and having partial set of fw (ie. sqe+gmu but not zap) causes problems
MDSS:
- Added SAR2130P support to MDSS driver
DPU:
- Changed to use single CTL path for flushing on DPU 5.x+
- Improved SSPP allocation code to allow sharing of SSPP between planes
- Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
- Added SAR2130P support
- Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
- Misc fixes
DP:
- Switch to use new helpers for DP Audio / HDMI codec handling
- Fixed LTTPR handling
DSI:
- Added support for SA8775P
- Added SAR2130P support
MDP4:
- Fixed LCDC / LVDS controller on
HDMI:
- Switched to use new helpers for ACR data
- Fixed old standing issue of HPD not working in some cases
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/CAF6AEGv2Go+nseaEwRgeZbecet-h+Pf2oBKw1CobCF01xu2XVg@mail.gmail.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amdgpu:
- Misc code cleanups
- UserQ fixes
- MALL reporting fix
- DP AUX fixes
- DCN 3.5 fixes
- DP MST fixes
- DC DMI quirks cleanup
- RAS fixes
- SR-IOV updates
- GC 9.5 updates
- Misc display fixes
- VCN 4.0.5 powergating race fix
- SMU 13.x updates
- Paritioning fixes
- VCN 5.0.1 SR-IOV updates
- JPEG 5.0.1 SR-IOV updates
amdkfd:
- Fix spurious warning in interrupt code
- XNACK fixes
radeon:
- CIK doorbell cleanup
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250516204609.2437472-1-alexander.deucher@amd.com
|
|
Log write descriptors for the event queue, leveraging vhost_get_vq_desc()
to retrieve the array of write descriptors to obtain the log buffer.
There is only one path for event queue.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20250403063028.16045-9-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Log write descriptors for the control queue, leveraging
vhost_scsi_get_desc() and vhost_get_vq_desc() to retrieve the array of
write descriptors to obtain the log buffer.
For Task Management Requests, similar to the I/O queue, store the log
buffer during the submission path and log it in the completion or error
handling path.
For Asynchronous Notifications, only the submission path is involved.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20250403063028.16045-8-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
|
|
Log write descriptors for the I/O queue, leveraging vhost_scsi_get_desc()
and vhost_get_vq_desc() to retrieve the array of write descriptors to
obtain the log buffer.
In addition, introduce a vhost-scsi specific function to log vring
descriptors. In this function, the 'partial' argument is set to false, and
the 'len' argument is set to 0, because vhost-scsi always logs all pages
shared by a vring descriptor. Add WARN_ON_ONCE() since vhost-scsi doesn't
support VIRTIO_F_ACCESS_PLATFORM.
The per-cmd log buffer is allocated on demand in the submission path after
VHOST_F_LOG_ALL is set. Return -ENOMEM on allocation failure, in order to
send SAM_STAT_TASK_SET_FULL to the guest.
It isn't reclaimed in the completion path. Instead, it is reclaimed when
VHOST_F_LOG_ALL is removed, or during VHOST_SCSI_SET_ENDPOINT when all
commands are destroyed.
Store the log buffer during the submission path and log it in the
completion path. Logging is also required in the error handling path of the
submission process.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20250403063028.16045-7-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
|
|
Adjust vhost_scsi_get_desc() to facilitate logging of vring descriptors.
Add new arguments to allow passing the log buffer and length to
vhost_get_vq_desc().
In addition, reset 'log_num' since vhost_get_vq_desc() may reset it only
after certain condition checks.
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20250403063028.16045-6-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Currently, the only user of vhost_log_write() is vhost-net. The 'len'
argument prevents logging of pages that are not tainted by the RX path.
Adjustments are needed since more drivers (i.e. vhost-scsi) begin using
vhost_log_write(). So far vhost-net RX path may only partially use pages
shared via the last vring descriptor. Unlike vhost-net, vhost-scsi always
logs all pages shared via vring descriptors. To accommodate this,
use (len == U64_MAX) to indicate whether the driver would log all pages of
vring descriptors, or only pages that are tainted by the driver.
In addition, removes BUG().
Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20250403063028.16045-5-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Adds basic support for the new display classes available on GB20x GPUs.
Most of the changes here deal with HW method moves, with the only other
change of note being tweaks to skip allocation of CTXDMA objects, which
aren't required on Blackwell display.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Some older NVIDIA and some newer NVIDIA hardware/firmware seems to
have issues with address only transactions (firmware rejects them).
Add an option to the core drm dp to avoid address only transactions,
This just puts the MOT flag removal on the last message of the transfer
and avoids the start of transfer transaction.
This with the flag set in nouveau, allows eDP probing on GB203 device.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit adds support for the GB20x GPUs found on GeForce RTX 50xx
series boards.
Beyond a few miscellaneous register moves and HW class ID plumbing,
this reuses most of the code added to support GH100/GB10x.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The doorbell register on GB20x GPUs has additional fields.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit enables basic support for the GB100/GB102 Blackwell GPUs.
Beyond HW class ID plumbing there's very little change here vs GH100.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
From VOLTA_CHANNEL_GPFIFO_A onwards, HW no longer updates the GET/GP_GET
pointers in USERD following channel progress, but instead updates on a
timer for compatibility, and SW is expected to implement its own method
of tracking channel progress (typically via non-WFI semaphore release).
Nouveau has been making use of the compatibility mode up until now,
however, from BLACKWELL_CHANNEL_GPFIFO_A HW no longer supports USERD
writeback at all.
Allocate a per-channel buffer in system memory, and append a non-WFI
semaphore release to the end of each push buffer segment to simulate
the pointers previously read from USERD.
This change is implemented for Fermi (which is the first to support non-
WFI semaphore release) onwards, as readback from system memory is likely
faster than BAR1 reads.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Primarily a cleanup to allow for changes in newer CHANNEL_GPFIFO classes
to be more easily implemented.
Compared to the prior implementation, this submits userspace push buffer
segments as subroutines and uses the NV_RAMUSERD_TOP_LEVEL_GET registers
to track the main (kernel) push buffer progress.
Fixes a number of sporadic failures seen during piglit runs.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Replace some awkward sequences that are repeated in a number of places
with helper functions.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit enables basic support for Hopper GPUs, and is intended
primarily as a base supporting Blackwell GPUs, which reuse most of
the code added here.
Advanced features such as Confidential Compute are not supported.
Beyond a few miscellaneous register moves and HW class ID plumbing,
the bulk of the changes implemented here are to support the GSP-RM
boot sequence used on Hopper/Blackwell GPUs, as well as a new page
table layout.
There should be no changes here that impact prior GPUs.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Co-developed-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
GPUs exist now with a 64-bit BAR0, which mean that BAR1 and BAR2's
indices (as passed to pci_resource_len() etc) are bumped up by one.
Modify nvkm_device.resource_addr/size() to take an enum instead of
an integer bar index, and take IORESOURCE_MEM_64 into account when
translating to the "raw" bar id.
[airlied: fixup ERR_PTR]
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
HOPPER_CHANNEL_GPFIFO_A removes the SEMAPHORE[A-D] methods that are
currently used by nouveau to implement fences on GF100 and newer.
Switch to the newer SEM methods available from VOLTA_CHANNEL_GPFIFO,
which are also available on the Hopper/Blackwell host classes.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Use data from 'struct nvkm_vmm_page/desc' to determine which PDEs need
to be mirrored to RM instead of hardcoded values for pre-Hopper page
tables.
Needed to support Hopper/Blackwell.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The current code using NV90F1_CTRL_CMD_VASPACE_COPY_SERVER_RESERVED_PDES
not only requires changes to support the new page table layout used on
Hopper/Blackwell GPUs, but is also broken in that it always mirrors the
PDEs used for virtual address 0, rather than the area reserved for RM.
This works fine for the non-NVK case where the kernel has full control
of the VMM layout and things end up in the right place, but NVK puts
its kernel reserved area much higher in the address space.
Fixing the code to work at any VA is not enough as some parts of RM want
the reserved area in a specific location, and NVK would then hit other
assertions in RM instead.
Fortunately, it appears that RM never needs to allocate anything within
its reserved area for DRM clients, and the COPY_SERVER_RESERVED_PDES
control call primarily serves to allow RM to locate the root page table
when initialising a channel's instance block.
Flag VMMs allocated by the DRM driver as externally owned, and use
NV0080_CTRL_CMD_DMA_SET_PAGE_DIRECTORY to inform RM of the root page
table in a similar way to NVIDIA's UVM driver.
The COPY_SERVER_RESERVED_PDES paths are kept for the golden context
image and gr scrubber channel, where RM needs the reserved area.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|