Age | Commit message (Collapse) | Author |
|
The shash desc needs to be zeroed after use in setkey as it is
not finalised (finalisation automatically zeroes it).
Also remove the final function as it's been superseded by finup.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Provide an option to handle the partial blocks in the ahash API.
Almost every hash algorithm has a block size and are only able
to hash partial blocks on finalisation.
As a first step disable virtual address support for algorithms
with state sizes larger than HASH_MAX_STATESIZE. This is OK as
virtual addresses are currently only used on synchronous fallbacks.
This means ahash_do_req_chain only needs to handle synchronous
fallbacks, removing the complexities of saving the request state.
Also move the saved request state into the ahash_request object
as nesting is no longer possible.
Add a scatterlist to ahash_request to store the partial block.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add export_core and import_core hooks. These are intended to be
used by algorithms which are wrappers around block-only algorithms,
but are not themselves block-only, e.g., hmac.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The core export and import functions are targeted at implementors
so move them into internal/hash.h.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the Marvell CESA binding to DT schema format. The
marvell-cesa.txt and mv_cesa.txt are duplicate bindings.
The clock properties are quite varied for each platform hence the
if/then schemas. The old binding was fairly accurate with reality.
The original binding didn't mention there is 1 interrupt per CESA
engine. Based on users, there's a maximum of 2 engines.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the Imagination Technologies hardware hash accelerator binding
to DT schema format. It's a straight forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the HiSilicon HIP06/7 Security Accelerator binding to DT schema
format. It's a straight forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the Broadcom SPUM/SPU2 binding to DT schema format. It's a
straight forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the Axis Crypto engine binding to DT schema format. It's a
straight forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Convert the AMD Cryptographic Coprocessor binding to DT schema format.
It's a straight forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The mediatek,eip97-crypto binding is half abandoned. The driver was
dropped in 2020 as the Mediatek platforms use InsideSecure block and
the driver for it. All the platforms except MT7623 were updated. A
patch to update it was submitted, but never addressed the review
comments.
Link: https://lore.kernel.org/all/20210303080923.16761-1-vic.wu@mediatek.com/
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The fsl,sec-v6.0 binding is the same as the fsl,sec-v4.0 binding, so add
it to the existing schema and drop the old .txt binding.
The compatibles in the .txt binding don't match the 1 user. Follow the
user for the schema.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Enable the reporting of error counters through sysfs for QAT GEN6
devices and update the ABI documentation.
This enables the reporting of the following:
- errors_correctable - hardware correctable errors that allow the
system to recover without data loss.
- errors_nonfatal: errors that can be isolated to specific in-flight
requests.
- errors_fatal: errors that cannot be contained to a request,
requiring a Function Level Reset (FLR) upon occurrence.
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Enable the reporting and handling of errors for QAT GEN6 devices.
Errors are categorized as correctable, non-fatal, or fatal. Error
handling involves reading the error source registers (ERRSOU0 to ERRSOU3)
to determine the source of the error and then decoding the actual source
reading specific registers.
The action taken depends on the error type:
- Correctable and Non-Fatal errors. These error are logged, cleared and
the corresponding counter is incremented.
- Fatal errors. These errors are logged, cleared and a Function Level
Reset (FLR) is scheduled.
This reports and handles the following errors:
- Accelerator engine (AE) correctable errors
- Accelerator engine (AE) uncorrectable errors
- Chassis push-pull (CPP) errors
- Host interface (HI) parity errors
- Internal memory parity errors
- Receive interface (RI) errors
- Transmit interface (TI) errors
- Interface for system-on-chip (SoC) fabric (IOSF) primary command
parity errors
- Shared RAM and slice module (SSM) errors
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Add a new CCP/PSP PCI device ID.
Signed-off-by: John Allen <john.allen@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
If accept(2) is called on socket type algif_hash with
MSG_MORE flag set and crypto_ahash_import fails,
sk2 is freed. However, it is also freed in af_alg_release,
leading to slab-use-after-free error.
Fixes: fe869cdb89c9 ("crypto: algif_hash - User-space interface for hash operations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A recent patch that addressed a UAF introduced a reference count leak:
the parallel_data refcount is incremented unconditionally, regardless
of the return value of queue_work(). If the work item is already queued,
the incremented refcount is never decremented.
Fix this by checking the return value of queue_work() and decrementing
the refcount when necessary.
Resolves:
Unreferenced object 0xffff9d9f421e3d80 (size 192):
comm "cryptomgr_probe", pid 157, jiffies 4294694003
hex dump (first 32 bytes):
80 8b cf 41 9f 9d ff ff b8 97 e0 89 ff ff ff ff ...A............
d0 97 e0 89 ff ff ff ff 19 00 00 00 1f 88 23 00 ..............#.
backtrace (crc 838fb36):
__kmalloc_cache_noprof+0x284/0x320
padata_alloc_pd+0x20/0x1e0
padata_alloc_shell+0x3b/0xa0
0xffffffffc040a54d
cryptomgr_probe+0x43/0xc0
kthread+0xf6/0x1f0
ret_from_fork+0x2f/0x50
ret_from_fork_asm+0x1a/0x30
Fixes: dd7d37ccf6b1 ("padata: avoid UAF for reorder_work")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This wasn't checking indirect extents.
Fixes: https://github.com/koverstreet/bcachefs/issues/887
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-next
Updates for v6.16
CI:
- uprev mesa
GPU:
- ACD (Adaptive Clock Distribution) support for X1-85. This is required
enable the higher frequencies.
- Drop fictional `address_space_size`. For some older devices, the address
space size is limited to 4GB to avoid potential 64b rollover math problems
in the fw. For these, an `ADRENO_QUIRK_4GB_VA` quirk is added. For
everyone else we get the address space size from the SMMU `ias` (input
address sizes), which is usually 48b.
- Improve robustness when GMU HFI responses time out
- Fix crash when throttling GPU immediately during boot
- Fix for rgb565_predicator on Adreno 7c3
- Remove `MODULE_FIRMWARE()`s for GPU, the GPU can load the firmware after
probe and having partial set of fw (ie. sqe+gmu but not zap) causes problems
MDSS:
- Added SAR2130P support to MDSS driver
DPU:
- Changed to use single CTL path for flushing on DPU 5.x+
- Improved SSPP allocation code to allow sharing of SSPP between planes
- Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
- Added SAR2130P support
- Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
- Misc fixes
DP:
- Switch to use new helpers for DP Audio / HDMI codec handling
- Fixed LTTPR handling
DSI:
- Added support for SA8775P
- Added SAR2130P support
MDP4:
- Fixed LCDC / LVDS controller on
HDMI:
- Switched to use new helpers for ACR data
- Fixed old standing issue of HPD not working in some cases
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/CAF6AEGv2Go+nseaEwRgeZbecet-h+Pf2oBKw1CobCF01xu2XVg@mail.gmail.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amdgpu:
- Misc code cleanups
- UserQ fixes
- MALL reporting fix
- DP AUX fixes
- DCN 3.5 fixes
- DP MST fixes
- DC DMI quirks cleanup
- RAS fixes
- SR-IOV updates
- GC 9.5 updates
- Misc display fixes
- VCN 4.0.5 powergating race fix
- SMU 13.x updates
- Paritioning fixes
- VCN 5.0.1 SR-IOV updates
- JPEG 5.0.1 SR-IOV updates
amdkfd:
- Fix spurious warning in interrupt code
- XNACK fixes
radeon:
- CIK doorbell cleanup
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250516204609.2437472-1-alexander.deucher@amd.com
|
|
There is a race condition in the readdir concurrency process, which may
access the rsp buffer after it has been released, triggering the
following KASAN warning.
==================================================================
BUG: KASAN: slab-use-after-free in cifs_fill_dirent+0xb03/0xb60 [cifs]
Read of size 4 at addr ffff8880099b819c by task a.out/342975
CPU: 2 UID: 0 PID: 342975 Comm: a.out Not tainted 6.15.0-rc6+ #240 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_report+0xce/0x640
kasan_report+0xb8/0xf0
cifs_fill_dirent+0xb03/0xb60 [cifs]
cifs_readdir+0x12cb/0x3190 [cifs]
iterate_dir+0x1a1/0x520
__x64_sys_getdents+0x134/0x220
do_syscall_64+0x4b/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f996f64b9f9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 0d f7 c3 0c 00 f7 d8 64 89 8
RSP: 002b:00007f996f53de78 EFLAGS: 00000207 ORIG_RAX: 000000000000004e
RAX: ffffffffffffffda RBX: 00007f996f53ecdc RCX: 00007f996f64b9f9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007f996f53dea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000207 R12: ffffffffffffff88
R13: 0000000000000000 R14: 00007ffc8cd9a500 R15: 00007f996f51e000
</TASK>
Allocated by task 408:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
__kasan_slab_alloc+0x6e/0x70
kmem_cache_alloc_noprof+0x117/0x3d0
mempool_alloc_noprof+0xf2/0x2c0
cifs_buf_get+0x36/0x80 [cifs]
allocate_buffers+0x1d2/0x330 [cifs]
cifs_demultiplex_thread+0x22b/0x2690 [cifs]
kthread+0x394/0x720
ret_from_fork+0x34/0x70
ret_from_fork_asm+0x1a/0x30
Freed by task 342979:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x37/0x50
kmem_cache_free+0x2b8/0x500
cifs_buf_release+0x3c/0x70 [cifs]
cifs_readdir+0x1c97/0x3190 [cifs]
iterate_dir+0x1a1/0x520
__x64_sys_getdents64+0x134/0x220
do_syscall_64+0x4b/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ffff8880099b8000
which belongs to the cache cifs_request of size 16588
The buggy address is located 412 bytes inside of
freed 16588-byte region [ffff8880099b8000, ffff8880099bc0cc)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99b8
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
anon flags: 0x80000000000040(head|node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
head: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
head: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
head: 0080000000000003 ffffea0000266e01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880099b8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880099b8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880099b8180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880099b8200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880099b8280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
POC is available in the link [1].
The problem triggering process is as follows:
Process 1 Process 2
-----------------------------------------------------------------
cifs_readdir
/* file->private_data == NULL */
initiate_cifs_search
cifsFile = kzalloc(sizeof(struct cifsFileInfo), GFP_KERNEL);
smb2_query_dir_first ->query_dir_first()
SMB2_query_directory
SMB2_query_directory_init
cifs_send_recv
smb2_parse_query_directory
srch_inf->ntwrk_buf_start = (char *)rsp;
srch_inf->srch_entries_start = (char *)rsp + ...
srch_inf->last_entry = (char *)rsp + ...
srch_inf->smallBuf = true;
find_cifs_entry
/* if (cfile->srch_inf.ntwrk_buf_start) */
cifs_small_buf_release(cfile->srch_inf // free
cifs_readdir ->iterate_shared()
/* file->private_data != NULL */
find_cifs_entry
/* in while (...) loop */
smb2_query_dir_next ->query_dir_next()
SMB2_query_directory
SMB2_query_directory_init
cifs_send_recv
compound_send_recv
smb_send_rqst
__smb_send_rqst
rc = -ERESTARTSYS;
/* if (fatal_signal_pending()) */
goto out;
return rc
/* if (cfile->srch_inf.last_entry) */
cifs_save_resume_key()
cifs_fill_dirent // UAF
/* if (rc) */
return -ENOENT;
Fix this by ensuring the return code is checked before using pointers
from the srch_inf.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220131 [1]
Fixes: a364bc0b37f1 ("[CIFS] fix saving of resume key before CIFSFindNext")
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
https://gitlab.freedesktop.org/drm/kernel into drm-next
drm/nouveau: r570 and hopper/blackwell support
This series implements support for booting GSP-RM firmware version
570.144, and adds support for GH100, GB10x, and GB20x GPUs.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Adds basic support for the new display classes available on GB20x GPUs.
Most of the changes here deal with HW method moves, with the only other
change of note being tweaks to skip allocation of CTXDMA objects, which
aren't required on Blackwell display.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Some older NVIDIA and some newer NVIDIA hardware/firmware seems to
have issues with address only transactions (firmware rejects them).
Add an option to the core drm dp to avoid address only transactions,
This just puts the MOT flag removal on the last message of the transfer
and avoids the start of transfer transaction.
This with the flag set in nouveau, allows eDP probing on GB203 device.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit adds support for the GB20x GPUs found on GeForce RTX 50xx
series boards.
Beyond a few miscellaneous register moves and HW class ID plumbing,
this reuses most of the code added to support GH100/GB10x.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The doorbell register on GB20x GPUs has additional fields.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit enables basic support for the GB100/GB102 Blackwell GPUs.
Beyond HW class ID plumbing there's very little change here vs GH100.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
From VOLTA_CHANNEL_GPFIFO_A onwards, HW no longer updates the GET/GP_GET
pointers in USERD following channel progress, but instead updates on a
timer for compatibility, and SW is expected to implement its own method
of tracking channel progress (typically via non-WFI semaphore release).
Nouveau has been making use of the compatibility mode up until now,
however, from BLACKWELL_CHANNEL_GPFIFO_A HW no longer supports USERD
writeback at all.
Allocate a per-channel buffer in system memory, and append a non-WFI
semaphore release to the end of each push buffer segment to simulate
the pointers previously read from USERD.
This change is implemented for Fermi (which is the first to support non-
WFI semaphore release) onwards, as readback from system memory is likely
faster than BAR1 reads.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Primarily a cleanup to allow for changes in newer CHANNEL_GPFIFO classes
to be more easily implemented.
Compared to the prior implementation, this submits userspace push buffer
segments as subroutines and uses the NV_RAMUSERD_TOP_LEVEL_GET registers
to track the main (kernel) push buffer progress.
Fixes a number of sporadic failures seen during piglit runs.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Replace some awkward sequences that are repeated in a number of places
with helper functions.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This commit enables basic support for Hopper GPUs, and is intended
primarily as a base supporting Blackwell GPUs, which reuse most of
the code added here.
Advanced features such as Confidential Compute are not supported.
Beyond a few miscellaneous register moves and HW class ID plumbing,
the bulk of the changes implemented here are to support the GSP-RM
boot sequence used on Hopper/Blackwell GPUs, as well as a new page
table layout.
There should be no changes here that impact prior GPUs.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Co-developed-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
GPUs exist now with a 64-bit BAR0, which mean that BAR1 and BAR2's
indices (as passed to pci_resource_len() etc) are bumped up by one.
Modify nvkm_device.resource_addr/size() to take an enum instead of
an integer bar index, and take IORESOURCE_MEM_64 into account when
translating to the "raw" bar id.
[airlied: fixup ERR_PTR]
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
HOPPER_CHANNEL_GPFIFO_A removes the SEMAPHORE[A-D] methods that are
currently used by nouveau to implement fences on GF100 and newer.
Switch to the newer SEM methods available from VOLTA_CHANNEL_GPFIFO,
which are also available on the Hopper/Blackwell host classes.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Use data from 'struct nvkm_vmm_page/desc' to determine which PDEs need
to be mirrored to RM instead of hardcoded values for pre-Hopper page
tables.
Needed to support Hopper/Blackwell.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The current code using NV90F1_CTRL_CMD_VASPACE_COPY_SERVER_RESERVED_PDES
not only requires changes to support the new page table layout used on
Hopper/Blackwell GPUs, but is also broken in that it always mirrors the
PDEs used for virtual address 0, rather than the area reserved for RM.
This works fine for the non-NVK case where the kernel has full control
of the VMM layout and things end up in the right place, but NVK puts
its kernel reserved area much higher in the address space.
Fixing the code to work at any VA is not enough as some parts of RM want
the reserved area in a specific location, and NVK would then hit other
assertions in RM instead.
Fortunately, it appears that RM never needs to allocate anything within
its reserved area for DRM clients, and the COPY_SERVER_RESERVED_PDES
control call primarily serves to allow RM to locate the root page table
when initialising a channel's instance block.
Flag VMMs allocated by the DRM driver as externally owned, and use
NV0080_CTRL_CMD_DMA_SET_PAGE_DIRECTORY to inform RM of the root page
table in a similar way to NVIDIA's UVM driver.
The COPY_SERVER_RESERVED_PDES paths are kept for the golden context
image and gr scrubber channel, where RM needs the reserved area.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
When mirroring BAR2 page tables to RM, we need to know the level shift
for the root page table (which is currently hardcoded), as well as the
raw PDE value (which is currently hardcoded in GP1xx-AD1xx format).
In order to support GH100/GBxxx, modify the code to determine the page
shift from per-GPU info in nvkm_vmm_page, as well as read the relevant
PDE back from the root page table rather than recalculating it.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
GH100/GBxxx have 6-level page tables.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
GH100/GBxxx have moved the register that controls where in VRAM the
the BAR0 NV_PRAMIN window points.
Add a HAL for this, as the BAR0 window is needed for BAR2 bootstrap.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
These registers have moved on GH100/GBxxx, and the GSP-RM init code uses
hardcoded values from earlier GPUs to fill GspSystemInfo.
Replace the per-GPU accessors in nvkm_pci_func with region info, and use
it when initialising GspSystemInfo.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Add r570-specific HAL routines, and support loading of GSP-RM version
570.144 if firmware is available.
There should be no impact on r535, or non-GSP paths.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
570.144 has incompatible changes to NV0000_ALLOC_PARAMETERS.
Factor out the common code so it can be shared.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
570.86.15 uses a slightly different calculation for the size of the
sysmem buffer needed to store GSP-RM's vidmem data across suspend.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
565.57.01 has incompatible changes to
NV50VAIO_CHANNELDMA_ALLOCATION_PARAMETERS.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
565.57.01 has incompatible changes to rpc_rc_triggered_v17_02.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
555.42.02 reserves some CHIDs for internal use.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
570.86.16 has incompatible changes to NV_CHANNEL_ALLOC_PARAMS.
At the same time, remove the duplicated channel allocation code from
golden context init.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
555.42.02 has incompatible changes to NV0073_CTRL_CMD_DP_GET_CAPS.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
555.42.02 has incompatible changes to NV0073_CTRL_CMD_SYSTEM_GET_ACTIVE.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
555.42.02 has incompatible changes to
NV0073_CTRL_CMD_SYSTEM_GET_CONNECT_STATE.
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|