summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2021-07-01drm/i915/display/dg1: Correctly map DPLLs during state readoutJosé Roberto de Souza
_DG1_DPCLKA0_CFGCR0 maps between DPLL 0 and 1 with one bit for phy A and B while _DG1_DPCLKA1_CFGCR0 maps between DPLL 2 and 3 with one bit for phy C and D. Reusing _cnl_ddi_get_pll() don't take that into cosideration returing DPLL 0 and 1 for phy C and D. That is a regression introduced in the refactor done in commit 351221ffc5e5 ("drm/i915: Move DDI clock readout to encoder->get_config()"). While at it also dropping the macros previously used, not reusing it to improve readability. BSpec: 50286 Fixes: 351221ffc5e5 ("drm/i915: Move DDI clock readout to encoder->get_config()") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630210522.162674-1-jose.souza@intel.com
2021-07-01drm/amdgpu: Avoid printing of stack contents on firmware load errorJiri Kosina
In case when psp_init_asd_microcode() fails to load ASD microcode file, psp_v12_0_init_microcode() tries to print the firmware filename that failed to load before bailing out. This is wrong because: - the firmware filename it would want it print is an incorrect one as psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading different filenames - it tries to print fw_name, but that's not yet been initialized by that time, so it prints random stack contents, e.g. amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2 amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff" Fix that by bailing out immediately, instead of priting the bogus error message. Reported-by: Vojtech Pavlik <vojtech@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Fix resource leak on probe error pathJiri Kosina
This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980. It is not true (as stated in the reverted commit changelog) that we never unmap the BAR on failure; it actually does happen properly on amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> amdgpu_device_fini() error path. What's worse, this commit actually completely breaks resource freeing on probe failure (like e.g. failure to load microcode), as amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too early, leaving all the resources that'd normally be freed in amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading to all sorts of oopses when someone tries to, for example, access the sysfs and procfs resources which are still around while the driver is gone. Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure") Reported-by: Vojtech Pavlik <vojtech@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01Merge drm/drm-next into drm-intel-nextJani Nikula
Bring drm-intel-next closer to drm-next and drm-intel-gt-next for a more feasible baseline for topic branches. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-07-01drm/aperture: Pass DRM driver structure instead of driver nameThomas Zimmermann
Print the name of the DRM driver when taking over fbdev devices. Makes the output to dmesg more consistent. Note that the driver name is only used for printing a string to the kernel log. No UAPI is affected by this change. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Chen-Yu Tsai <wens@csie.org> # sun4i Acked-by: Neil Armstrong <narmstrong@baylibre.com> # meson Link: https://patchwork.freedesktop.org/patch/msgid/20210629135833.22679-1-tzimmermann@suse.de
2021-07-01drm/panfrost: Increase the AS_ACTIVE polling timeoutBoris Brezillon
Experience has shown that 1ms is sometimes not enough, even when the GPU is running at its maximum frequency, not to mention that an MMU operation might take longer if the GPU is running at a lower frequency, which is likely to be the case if devfreq is active. Let's pick a significantly bigger timeout value (1ms -> 100ms) to be on the safe side. v5: * New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-17-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Queue jobs on the hardwareSteven Price
The hardware has a set of '_NEXT' registers that can hold a second job while the first is executing. Make use of these registers to enqueue a second job per slot. v5: * Fix a comment in panfrost_job_init() v3: * Fix the done/err job dequeuing logic to get a valid active state * Only enable the second slot on GPUs supporting jobchain disambiguation * Split interrupt handling in sub-functions Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-16-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Kill in-flight jobs on FD closeBoris Brezillon
If the process who submitted these jobs decided to close the FD before the jobs are done it probably means it doesn't care about the result. v5: * Add a panfrost_exception_is_fault() helper and the DRM_PANFROST_EXCEPTION_MAX_NON_FAULT value v4: * Don't disable/restore irqs when taking the job_lock (not needed since this lock is never taken from an interrupt context) v3: * Set fence error to ECANCELED when a TERMINATED exception is received Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-15-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Don't reset the GPU on job faults unless we really have toBoris Brezillon
If we can recover from a fault without a reset there's no reason to issue one. v3: * Drop the mention of Valhall requiring a reset on JOB_BUS_FAULT * Set the fence error to -EINVAL instead of having per-exception error codes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-14-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Reset the GPU when the AS_ACTIVE bit is stuckBoris Brezillon
Things are unlikely to resolve until we reset the GPU. Let's not wait for other faults/timeout to happen to trigger this reset. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-13-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Disable the AS on unhandled page faultsBoris Brezillon
If we don't do that, we have to wait for the job timeout to expire before the fault jobs gets killed. v3: * Make sure the AS is re-enabled when new jobs are submitted to the context Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-12-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Make sure job interrupts are masked before resettingBoris Brezillon
This is not yet needed because we let active jobs be killed during by the reset and we don't really bother making sure they can be restarted. But once we start adding soft-stop support, controlling when we deal with the remaining interrrupts and making sure those are handled before the reset is issued gets tricky if we keep job interrupts active. Let's prepare for that and mask+flush job IRQs before issuing a reset. v4: * Add a comment explaining why we WARN_ON(!job) in the irq handler * Keep taking the job_lock when evicting stalled jobs v3: * New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-11-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Simplify the reset serialization logicBoris Brezillon
Now that we can pass our own workqueue to drm_sched_init(), we can use an ordered workqueue on for both the scheduler timeout tdr and our own reset work (which we use when the reset is not caused by a fault/timeout on a specific job, like when we have AS_ACTIVE bit stuck). This guarantees that the timeout handlers and reset handler can't run concurrently which drastically simplifies the locking. v5: * Don't call cancel_delayed_timeout() in the reset path (those works are canceled in drm_sched_stop()) v4: * Actually pass the reset workqueue to drm_sched_init() * Don't call cancel_work_sync() in panfrost_reset(). It will deadlock since it might be called from the reset work, which is executing and cancel_work_sync() will wait for the handler to return. Checking the reset pending status should avoid spurious resets v3: * New patch Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-10-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Use a threaded IRQ for job interruptsBoris Brezillon
This should avoid switching to interrupt context when the GPU is under heavy use. v3: * Don't take the job_lock in panfrost_job_handle_irq() Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-9-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Expose a helper to trigger a GPU resetBoris Brezillon
Expose a helper to trigger a GPU reset so we can easily trigger reset operations outside the job timeout handler. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-8-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Do the exception -> string translation using a tableBoris Brezillon
Do the exception -> string translation using a table. This way we get rid of those magic numbers and can easily add new fields if we need to attach extra information to exception types. v4: * Don't expose exception type to userspace * Merge the enum definition and the enum -> string table declaration in the same patch v3: * Drop the error field Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-7-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name()Boris Brezillon
Currently unused. We'll add it back if we need per-GPU definitions. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-6-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Get rid of the unused JS_STATUS_EVENT_ACTIVE definitionBoris Brezillon
Exception types will be defined as an enum. v4: * Fix typo in the commit message Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-5-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriateBoris Brezillon
If the fence creation fail, we can return the error pointer directly. The core will update the fence error accordingly. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-4-boris.brezillon@collabora.com
2021-07-01drm/sched: Allow using a dedicated workqueue for the timeout/fault tdrBoris Brezillon
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU reset. This leads to extra complexity when we need to synchronize timeout works with the reset work. One solution to address that is to have an ordered workqueue at the driver level that will be used by the different schedulers to queue their timeout work. Thanks to the serialization provided by the ordered workqueue we are guaranteed that timeout handlers are executed sequentially, and can thus easily reset the GPU from the timeout handler without extra synchronization. v5: * Add a new paragraph to the timedout_job() method v3: * New patch v4: * Actually use the timeout_wq to queue the timeout work Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Emma Anholt <emma@anholt.net> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-07-01drm/amdgpu: show explicit name instead of id in psp_cmd_submit_bufLang Yu
Use amdgpu_ucode_name to show ucode name and psp_gfx_cmd_name to show psp_gfx_cmd name in psp_cmd_submit_buf. v2: adjust function name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: add function to show psp_gfx_cmd name via idLang Yu
Implement function psp_gfx_cmd_name to show cmd name via cmd id. v2: rename it to psp_gfx_cmd_name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: add function to show ucode name via idLang Yu
Implement function amdgpu_ucode_name to show ucode name via ucode id. v2: rename it to amdgpu_ucode_name Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: add license to umc_8_7_0_sh_mask.hAlex Deucher
Was missing. Add it. Fixes: 6b36fa6143f6ca ("drm/amdgpu: add umc v8_7_0 IP headers") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: rectify line endings in umc v8_7_0 IP headersLukas Bulwahn
Commit 6b36fa6143f6 ("drm/amdgpu: add umc v8_7_0 IP headers") adds the new file ./drivers/gpu/drm/amd/include/asic_reg/umc/umc_8_7_0_sh_mask.h with DOS line endings, which is very uncommon for the kernel repository. Rectify the line endings in this file with dos2unix. Identified by a checkpatch evaluation on the whole kernel repository and spot-checking for really unexpected checkpatch rule violations. Reported-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amd/pm: Simplify managed I2C transfer of AldebaranLuben Tuikov
Simplify Aldebaran managed I2C transfer function to correctly play with the upper I2C layers. This gets it in line with Navi10, Acturus, and Sienna Cichlid. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Correctly disable the I2C IP blockLuben Tuikov
On long transfers to the EEPROM device, i.e. write, it is observed that the driver aborts the transfer. The reason for this is that the driver isn't patient enough--the IC_STATUS register's contents is 0x27, which is MST_ACTIVITY | TFE | TFNF | ACTIVITY. That is, while the transmission FIFO is empty, we, the I2C master device, are still driving the bus. Implement the correct procedure to disable the block, as described in the DesignWare I2C Databook, section 3.8.3 Disabling DW_apb_i2c on page 56. Now there are no premature aborts on long data transfers. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Use a single loopLuben Tuikov
In smu_v11_0_i2c_transmit() use a single loop to transmit bytes, instead of two nested loops. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Fix koops when accessing RAS EEPROMLuben Tuikov
Debugfs RAS EEPROM files are available when the ASIC supports RAS, and when the debugfs is enabled, an also when "ras_enable" module parameter is set to 0. However in this case, we get a kernel oops when accessing some of the "ras_..." controls in debugfs. The reason for this is that struct amdgpu_ras::adev is unset. This commit sets it, thus enabling access to those facilities. Note that this facilitates EEPROM access and not necessarily RAS features or functionality. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: fix 64 bit divide in eeprom codeAlex Deucher
pos is 64 bits. Fixes: c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs") Cc: luben.tuikov@amd.com Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: RAS EEPROM table is now in debugfsLuben Tuikov
Add "ras_eeprom_size" file in debugfs, which reports the maximum size allocated to the RAS table in EEROM, as the number of bytes and the number of records it could store. For instance, $cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size 262144 bytes or 10921 records $_ Add "ras_eeprom_table" file in debugfs, which dumps the RAS table stored EEPROM, in a formatted way. For instance, $cat ras_eeprom_table Signature Version FirstOffs Size Checksum 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage 0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000 1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001 2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002 3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003 4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004 5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005 6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006 7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007 8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008 $_ Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Xinhui Pan <xinhui.pan@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Optimize EEPROM RAS table I/OLuben Tuikov
Split functionality between read and write, which simplifies the code and exposes areas of optimization and more or less complexity, and take advantage of that. Read and write the table in one go; use a separate stage to decode or encode the data, as opposed to on the fly, which keeps the I2C bus busy. Use a single read/write to read/write the table or at most two if the number of records we're reading/writing wraps around. Check the check-sum of a table in EEPROM on init. Update the checksum at the same time as when updating the table header signature, when the threshold was increased on boot. Take advantage of arithmetic modulo 256, that is, use a byte!, to greatly simplify checksum arithmetic. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Get rid of test functionLuben Tuikov
The code is now tested from userspace. Remove already macroed out test function. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Some renamesLuben Tuikov
Qualify with "ras_". Use kernel's own--don't redefine your own. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Nerf buffLuben Tuikov
buff --> buf. Essentially buffer abbreviates to buf, remove 1/2 of it, or just the iron part, as opposed to just the Er, Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Use explicit cardinality for clarityLuben Tuikov
RAS_MAX_RECORD_NUM may mean the maximum record number, as in the maximum house number on your street, or it may mean the maximum number of records, as in the count of records, which is also a number. To make this distinction whether the number is ordinal (index) or cardinal (count), rename this macro to RAS_MAX_RECORD_COUNT. This makes it easy to understand what it refers to, especially when we compute quantities such as, how many records do we have left in the table, especially when there are so many other numbers, quantities and numerical macros around. Also rename the long, amdgpu_ras_eeprom_get_record_max_length() to the more succinct and clear, amdgpu_ras_eeprom_max_record_count(). When computing the threshold, which also deals with counts, i.e. "how many", use cardinal "max_eeprom_records_count", than the quantitative "max_eeprom_records_len". Simplify the logic here and there, as well. Cc: Guchun Chen <guchun.chen@amd.com> Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Simplify RAS EEPROM checksum calculationsLuben Tuikov
Rename update_table_header() to write_table_header() as this function is actually writing it to EEPROM. Use kernel types; use u8 to carry around the checksum, in order to take advantage of arithmetic modulo 8-bits (256). Tidy up to 80 columns. When updating the checksum, just recalculate the whole thing. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Fix amdgpu_ras_eeprom_init()Luben Tuikov
No need to account for the 2 bytes of EEPROM address--this is now well abstracted away by the fixes the the lower layers. Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Return result fix in RASLuben Tuikov
The low level EEPROM write method, doesn't return 1, but the number of bytes written. Thus do not compare to 1, instead, compare to greater than 0 for success. Other cleanup: if the lower layers returned -errno, then return that, as opposed to overwriting the error code with one-fits-all -EINVAL. For instance, some return -EAGAIN. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Fix width of I2C addressLuben Tuikov
The I2C address is kept as a 16-bit quantity in the kernel. The I2C_TAR::I2C_TAR field is 10-bit wide. Fix the width of the I2C address for Vega20 from 8 bits to 16 bits to accommodate the full spectrum of I2C address space. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amd/pm: Simplify managed I2C transfer functionsLuben Tuikov
Now that we have an I2C quirk table for SMU-managed I2C controllers, the I2C core does the checks for us, so we don't need to do them, and so simplify the managed I2C transfer functions. Also, for Arcturus and Navi10, fix setting the command type from "cmd->CmdConfig" to "cmd->Cmd". The latter is what appears to be taking in the enumeration I2C_CMD_... as an integer, not a bit-flag. For Sienna, the "Cmd" field seems to have been eliminated, and command type and flags all live in the "CmdConfig" field--this is left untouched. Fix: Detect and add changing of direction bit-flag, as this is necessary for the SMU to detect the direction change in the 1-d array of data it gets. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amd/pm: Extend the I2C quirk tableLuben Tuikov
Extend the I2C quirk table for SMU access controlled I2C adapters. Let the kernel I2C layer check that the messages all have the same address, and that their combined size doesn't exceed the maximum size of a SMU software I2C request. Suggested-by: Jean Delvare <jdelvare@suse.de> Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: EEPROM: add explicit read and writeLuben Tuikov
Add explicit amdgpu_eeprom_read() and amdgpu_eeprom_write() for clarity. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: RAS xfer to read/writeLuben Tuikov
Wrap amdgpu_ras_eeprom_xfer(..., bool write), into amdgpu_ras_eeprom_read() and amdgpu_ras_eeprom_write(), as that makes reading and understanding the code clearer. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Rename misspelled functionLuben Tuikov
Instead of fixing the spelling in amdgpu_ras_eeprom_process_recods(), rename it to, amdgpu_ras_eeprom_xfer(), to look similar to other I2C and protocol transfer (read/write) functions. Also to keep the column span to within reason by using a shorter name. Change the "num" function parameter from "int" to "const u32" since it is the number of items (records) to xfer, i.e. their count, which cannot be a negative number. Also swap the order of parameters, keeping the pointer to records and their number next to each other, while the direction now becomes the last parameter. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: RAS: EEPROM --> RASLuben Tuikov
In amdgpu_ras_eeprom.c--the interface from RAS to EEPROM, rename macros from EEPROM to RAS, to indicate that the quantities and objects are RAS specific, not EEPROM. We can decrease the RAS table, or put it in different offset of EEPROM as needed in the future. Remove EEPROM_ADDRESS_SIZE macro definition, equal to 2, from the file and calculations, as that quantity is computed and added on the stack, in the lower layer, amdgpu_eeprom_xfer(). Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: I2C class is HWMONLuben Tuikov
Set the auto-discoverable class of I2C bus to HWMON. Remove SPD. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: Fix wrap-around bugs in RASLuben Tuikov
Fix the size of the EEPROM from 256000 bytes to 262144 bytes (256 KiB). Fix a couple or wrap around bugs. If a valid value/address is 0 <= addr < size, the inverse of this inequality (barring negative values which make no sense here) is addr >= size. Fix this in the RAS code. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: RAS and FRU now use 19-bit I2C addressLuben Tuikov
Convert RAS and FRU code to use the 19-bit I2C memory address and remove all "slave_addr", as this is now absolved into the 19-bit address. Cc: Jean Delvare <jdelvare@suse.de> Cc: John Clements <john.clements@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01drm/amdgpu: I2C EEPROM full memory addressingLuben Tuikov
* "eeprom_addr" is now 32-bit wide. * Remove "slave_addr" from the I2C EEPROM driver interface. The I2C EEPROM Device Type Identifier is fixed at 1010b, and the rest of the bits of the Device Address Byte/Device Select Code, are memory address bits, where the first three of those bits are the hardware selection bits. All this is now a 19-bit address and passed as "eeprom_addr". This abstracts the I2C bus for EEPROM devices for this I2C EEPROM driver. Now clients only pass the 19-bit EEPROM memory address, to the I2C EEPROM driver, as the 32-bit "eeprom_addr", from which they want to read from or write to. Cc: Jean Delvare <jdelvare@suse.de> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>