Age | Commit message (Collapse) | Author |
|
[Why]
OPTC_BYTES_PER_PIXEL calculation for 4:2:2 and 4:2:0 could have error.
[How]
Change to use following formula:
OPTC_DSC_BYTES_PER_PIXEL = ceiling((chunk size * 2^28) / slice width)
v2: squash in 64 bit divide fix (Alex)
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Bing Guo <Bing.Guo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
- Adding a DM interface to enable DSC over eDP on Linux
- DSC over eDP will allow to power savings by reducing
the bandwidth required to support panel's modes
- Apply link optimization algorithm to reduce link bandwidth
when DSC is enabled
[how]
- Read eDP panel's DSC capabilities
- Apply DSC policy on eDP panel based on its DSC capabilities
- Enable DSC encoder's on the pipe
- Enable DSC on panel's side by setting DSC_ENABLE DPCD register
- Adding link optimization algorithm to reduce link rate or lane
count based
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Coverity discovers holes in logic that needs to be addressed for
improved code integrity.
[How]
Address issues found by coverity without changing the actual logic.
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Chris Park <Chris.Park@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
As part of the FPU isolation work documented in
https://patchwork.freedesktop.org/series/93042/, isolate code that uses
FPU in DSC to DML, where all FPU code should locate.
This change does not refactor any functions but move code around.
Cc: Christian König <christian.koenig@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Anson Jacob <Anson.Jacob@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Before get dsc bw range is used to compute DSC bw range
based on the given fixed bpp min/max input.
The new change will merge any specs, signal, timing specific
bpp range decision into this function. So the function needs to make
a decision with all aspects considered.
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
DM needs to know how much overhead is added to DSC as result
of AMD internal DSC limitation.
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: George Shen <george.shen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[HOW]
Added #ifdefs and refactored various parts of dc to
allow dc_link to be built by AMD EDID UTILITY
[WHY]
dc_dsc was refactored moving some of the code that AMD EDID UTILITY needed
to dc_link, so now dc_link needs to be included by AMD EDID UTILITY
Squash in DCN config fix (Alex)
Reviewed-by: Leung Martin <Martin.Leung@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Mark Morra <MarkAlbert.Morra@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Based on hardware team recommendation this additional dsc overhead
is only required for DP DSC.
[how]
Add a check for is_dp and only apply the overhead if this flag is set.
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
As hardware team suggested that we need to add a max dsc bw overhead
into existing stream bandwidth when DSC is used.
The formula as below:
max_dsc_bw_overhead =
v_addressable * slice_count * 256 bit * pixel clock / v_total / h_total
effective stream bandwidth = pixel clock * bpp
stream bandwidth = effective stream bandwidth + dsc stream overhead
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Wayne Lin <waynelin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why?]
Some code still expected bpp to be used in whole bits, not 16ths. dsc.c uses
redundant function now found in dc to calculate stream bandwidth from timing.
[How?]
Fix code to work with 16ths instead of whole bits for dsc bpp.
Refactor get_dsc_bandwidth to accept inputs in 16ths of a bit.
Use dc function to calculate bandwidth from timing, and make dsc bw calculation
a part of dsc.c.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
3x4K60 displays over MST with DSC enabled was not able to light up
due to this patch.
Signed-off-by: Jun Lei <jun.lei@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Some panels contain active converters (e.g. DP to MIPI) which only support
restricted DSC configurations. DID2.0 adds support for such displays to
explicitly define per timing BPP restrictions on DSC. Ignoring these
restrictions leads to blackscreen.
[How]
Add parsing in DID2.0 parser to get this bpp info.
Add support in DSC module to constraint target bpp based
on this info.
Signed-off-by: Jun Lei <jun.lei@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
From Ard:
"Simply disabling -mgeneral-regs-only left and right is risky, given that
the standard AArch64 ABI permits the use of FP/SIMD registers anywhere,
and GCC is known to use SIMD registers for spilling, and may invent
other uses of the FP/SIMD register file that have nothing to do with the
floating point code in question. Note that putting kernel_neon_begin()
and kernel_neon_end() around the code that does use FP is not sufficient
here, the problem is in all the other code that may be emitted with
references to SIMD registers in it.
So the only way to do this properly is to put all floating point code in
a separate compilation unit, and only compile that unit with
-mgeneral-regs-only."
Disable support until the code can be properly refactored to support this
properly on aarch64.
Acked-by: Will Deacon <will@kernel.org>
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
The C standard does not specify whether an enum is signed or unsigned.
In the function prototype, one of the argument is defined as an enum
but its declaration was unit32_t. Fix this by changing the function
argument to enum in the declaration.
Signed-off-by: Tao.Huang <Tao.Huang@amd.com>
Signed-off-by: Florin Iucha <florin.iucha@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why] Can be used for debug purposes
[How] Add max target bpp override field and related handling
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
DSC will be disabled if DPCD 0006F[6:3] is set to a non-zero value
because bits 6:3 are not currently supported. When 6:3 is populated, an
unsupported INCREMENT OF bits_per_pixel value is read (DPCD 0006F[2:0])
[How]
Mask the INCREMENT OF bits_per_pixel field so that values in the
unsupported field are ignored.
Signed-off-by: Jaehyun Chung <jaehyun.chung@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Previously when force enabling DSC on SST display we unknowingly
supressed lane count, which caused DSC to be enabled automatically.
[how]
By adding an additional flag to force enable DSC in dc_dsc.c DSC can
always be enabled with debugfs dsc_clock_en forced to 1
Cc: stable@vger.kernel.org
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This adds ARM64 support into the DCN. This mainly enables support
for Navi graphics cards. The dcn10 changes haven't been tested,
since I don't have the relevant hardware available, but there
is no way to conditionally disable them, so I've done them anyway.
Signed-off-by: Daniel Kolesa <daniel@octaforge.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
The capability fields are reserved for DSC branch
only to report the capability related to the
branch's DSC decoder.
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
DSC calculations fail because the u16 bits_per_pixel from
the DRM struct is being casted to the u8 drm_bpp parameters
and locals. Integer wraparound is happening because this
value is greater than 255.
[How]
Use u16 to match what's in the structure instead of u8.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When we want to use float point operation on Linux
we need to use within special kernel protection
(`kernel_fpu_{begin,end}()`.), otherwise the kernel
can clobber userspace FPU register state. For detecting
these issues we use a tool named objtool (with -Ffa
flags) to highlight the FPU problems, all warnings can
be summed up as follows:
./tools/objtool/objtool check -Ffa
drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.o
[..] dc/dsc/rc_calc.o: warning: objtool: get_qp_set()+0x2f8:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_roundf()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: dsc_ceil()+0x5:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: get_ofs_set()+0x3eb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc.o: warning: objtool: calc_rc_params()+0x3c:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool:
get_dsc_bandwidth_range.isra.0()+0x8d:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/dc_dsc.o: warning: objtool: setup_dsc_config()+0x2ef:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:copy_pps_fields()+0xbb:
FPU instruction outside of kernel_fpu_{begin,end}()
[..] dc/dsc/rc_calc_dpi.o: warning: objtool:
dscc_compute_dsc_parameters()+0x7b:
FPU instruction outside of kernel_fpu_{begin,end}()
This commit fixes the above issues by rework DSC as described:
1. Isolate all FPU operations in a single file;
2. Use FPU flags only in the file that handles FPU operations;
3. Isolate all functions that require float point operation in static
functions;
4. Add a mid-layer function that does not use any float point operation,
and that could be safely invoked in other parts of the code.
5. Keep float point operation under DC_FP_{START/END} macro.
CC: Christian König <christian.koenig@amd.com>
CC: Alexander Deucher <Alexander.Deucher@amd.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Tony Cheng <tony.cheng@amd.com>
CC: Harry Wentland <hwentlan@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In the file drm_dp_helper.h we have a macro named
DP_DSC_THROUGHPUT_MODE_{0,1}_UPSUPPORTED, the correct name should be
DP_DSC_THROUGHPUT_MODE_{0,1}_UNSUPPORTED. This commits adjusts this typo
in the header file and in other places that attempt to access this
macro.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429184142.1867987-1-Rodrigo.Siqueira@amd.com
|
|
[how]
Empty dsc enc caps when debug option is set to disable DSC.
Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
It's required for debug purposes.
[how]
Add a dsc_bpp_increment_div debug option that overrides DPCD
BITS_PER_PIXEL_INCREMENT value. The value dsc_bpp_increment_div should
be set to is the one after parsing, i.e. it could be 1, 2, 4, 8 or 16
(meaning 1pix, 1/2pix, ..., 1/16pix).
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
For some use cases we need to be able to adjust the maximum target bpp
allowed by DSC policy.
[How]
New interface dc_dsc_policy_set_max_target_bpp_limit
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
DCN requires floating point support to operate. Add the appropriate
x86/ppc64 guards and FPU / AltiVec / VSX context switches to DCN.
Note that the current DC20 code doesn't contain all required FPU
wrappers on x86 or POWER, so this patch is insufficient to fully
enable DC20 on POWER.
v2: s/X86_64/X86/g to retain previous behavior.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.6-2019-12-11:
amdgpu:
- Add MST atomic routines
- Add support for DMCUB (new helper microengine for displays)
- Add OEM i2c support in DC
- Use vstartup for vblank events on DCN
- Simplify Kconfig for DC
- Renoir fixes for DC
- Clean up function pointers in DC
- Initial support for HDCP 2.x
- Misc code cleanups
- GFX10 fixes
- Rework JPEG engine handling for VCN
- Add clock and power gating support for JPEG
- BACO support for Arcturus
- Cleanup PSP ring handling
- Add framework for using BACO with runtime pm to save power
- Move core pci state handling out of the driver for pm ops
- Allow guest power control in 1 VF case with SR-IOV
- SR-IOV fixes
- RAS fixes
- Support for power metrics on renoir
- Golden settings updates for gfx10
- Enable gfxoff on supported navi10 skus
- Update MAINTAINERS
amdkfd:
- Clean up generational gfx code
- Fixes for gfx10
- DIQ fixes
- Share more code with amdgpu
radeon:
- PPC DMA fix
- Register checker fixes for r1xx/r2xx
- Misc cleanups
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191211223020.7510-1-alexander.deucher@amd.com
|
|
amdgpu is MIT licensed.
Fixes: ec8f24b7faaf3d ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu is MIT licensed.
Fixes: ec8f24b7faaf3d ("treewide: Add SPDX license identifier - Makefile/Kconfig")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
dc needs to expose its internal dsc policy.
Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Need to support 6 bpp for 420 pixel encoding only.
[how]
Add a dc function to determine what bpp range can be supported
for given pixel encoding.
Signed-off-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Need to be able to enable native 422 for debugging purposes.
[How]
Add new dc_debug_options bool and check it in the get_dsc_enc_caps
function.
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
DCN2 and DSC are stable enough to be build by default. So drop the flags.
[How]
Remove them using the unifdef tool. The following commands were executed
in sequence:
$ find -name '*.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'
$ find -name '*.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';'
In addition:
* Remove from kconfig, and replace any dependencies with DCN1_0.
* Remove from any makefiles.
* Fix and cleanup NV defninitions in dal_asic_id.h
* Expand DCN1 ifdef to include DCN2 code in the following files:
* clk_mgr/clk_mgr.c: dc_clk_mgr_create()
* core/dc_resources.c: dc_create_resource_pool()
* dce/dce_dmcu.c: dcn20_*lock_phy()
* dce/dce_dmcu.c: dcn20_funcs
* dce/dce_dmcu.c: dcn20_dmcu_create()
* gpio/hw_factory.c: dal_hw_factory_init()
* gpio/hw_translate.c: dal_hw_translate_init()
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
A final attempt at enabling sse2 for GCC users.
Orininally attempted in:
commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines")
Reverted due to "reported instability" in:
commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")
Re-added just for Clang in:
commit 0f0727d971f6 ("drm/amd/display: readd -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines")
The original report didn't have enough information to know if the GPF
was due to misalignment, but I suspect that it was. (The missing
information was the disassembly of the function at the bottom of the
trace, to see if the instruction pointer pointed to an instruction with
16B alignment memory operand requirements. The stack trace does show
the stack was only 8B but not 16B aligned though, which makes this a
strong possibility).
Now that the stack misalignment issue has been fixed for users of GCC
7.1+, reattempt adding -msse2. This matches Clang.
It will likely never be safe to enable this for pre-GCC 7.1 AND use a
16B aligned stack in these translation units.
This is only a functional change for GCC 7.1+ users, and should be boot
tested.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
GCC earlier than 7.1 errors when compiling code that makes use of
`double`s and sets a stack alignment outside of the range of [2^4-2^12]:
$ cat foo.c
double foo(double x, double y) {
return x + y;
}
$ gcc-4.9 -mpreferred-stack-boundary=3 foo.c
error: -mpreferred-stack-boundary=3 is not between 4 and 12
This is likely why the AMDGPU driver was ever compiled with a different
stack alignment (and thus different ABI) than the rest of the x86
kernel. The kernel uses 8B stack alignment, while the driver was using
16B stack alignment in a few places.
Since GCC 7.1+ doesn't error, fix the ABI mismatch for users of newer
versions of GCC.
There was discussion about whether to mark the driver broken or not for
users of GCC earlier than 7.1, but since the driver currently is
working, don't explicitly break the driver for them here.
Relying on differing stack alignment is unspecified behavior, and
brittle, and may break in the future.
This patch is no functional change for GCC users earlier than 7.1. It's
been compile tested on GCC 4.9 and 8.3 to check the correct flags. It
should be boot tested when built with GCC 7.1+.
-mincoming-stack-boundary= or -mstackrealign may help keep this code
building for pre-GCC 7.1 users.
The version check for GCC is broken into two conditionals, both because
cc-ifversion is currently GCC specific, and it simplifies a subsequent
patch.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The x86 kernel is compiled with an 8B stack alignment via
`-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via
commit d9b0cde91c60 ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if supported")
or `-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are
compiled with 16B stack alignment.
Generally, the stack alignment is part of the ABI. Linking together two
different translation units with differing stack alignment is dangerous,
particularly when the translation unit with the smaller stack alignment
makes calls into the translation unit with the larger stack alignment.
While 8B aligned stacks are sometimes also 16B aligned, they are not
always.
Multiple users have reported General Protection Faults (GPF) when using
the AMDGPU driver compiled with Clang. Clang is placing objects in stack
slots assuming the stack is 16B aligned, and selecting instructions that
require 16B aligned memory operands.
At runtime, syscall handlers with 8B aligned stack call into code that
assumes 16B stack alignment. When the stack is a multiple of 8B but not
16B, these instructions result in a GPF.
Remove the code that added compatibility between the differing compiler
flags, as it will result in runtime GPFs when built with Clang. Cleanups
for GCC will be sent in later patches in the series.
Link: https://github.com/ClangBuiltLinux/linux/issues/735
Debugged-by: Yuxuan Shui <yshuiv7@gmail.com>
Reported-by: Shirish S <shirish.s@amd.com>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-10-09:
amdgpu:
- Additional RAS enablement for vega20
- RAS page retirement and bad page storage in EEPROM
- No GPU reset with unrecoverable RAS errors
- Reserve vram for page tables rather than trying to evict
- Fix issues with GPU reset and xgmi hives
- DC i2c over aux fixes
- Direct submission for clears, PTE/PDE updates
- Improvements to help support recoverable GPU page faults
- Silence harmless SAD block messages
- Clean up code for creating a bo at a fixed location
- Initial DC HDCP support
- Lots of documentation fixes
- GPU reset for renoir
- Add IH clockgating support for soc15 asics
- Powerplay improvements
- DC MST cleanups
- Add support for MSI-X
- Misc cleanups and bug fixes
amdkfd:
- Query KFD device info by asic type rather than pci ids
- Add navi14 support
- Add renoir support
- Add navi12 support
- gfx10 trap handler improvements
- pasid cleanups
- Check against device cgroup
ttm:
- Return -EBUSY with pipelining with no_gpu_wait
radeon:
- Silence harmless SAD block messages
device_cgroup:
- Export devcgroup_check_permission
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
|
|
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/display/dc/dsc/rc_calc.c: In function calc_rc_params:
drivers/gpu/drm/amd/display/dc/dsc/rc_calc.c:180:6: warning: variable source_bpp set but not used [-Wunused-but-set-variable]
It is not used since commit 97bda0322b8a ("drm/amd/display:
Add DSC support for Navi (v2)")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- add modpost warn exported symbols marked as 'static' because 'static'
and EXPORT_SYMBOL is an odd combination
- break the build early if gold linker is used
- optimize the Bison rule to produce .c and .h files by a single
pattern rule
- handle PREEMPT_RT in the module vermagic and UTS_VERSION
- warn CONFIG options leaked to the user-space except existing ones
- make single targets work properly
- rebuild modules when module linker scripts are updated
- split the module final link stage into scripts/Makefile.modfinal
- fix the missed error code in merge_config.sh
- improve the error message displayed on the attempt of the O= build in
unclean source tree
- remove 'clean-dirs' syntax
- disable -Wimplicit-fallthrough warning for Clang
- add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC
- remove ARCH_{CPP,A,C}FLAGS variables
- add $(BASH) to run bash scripts
- change *CFLAGS_<basetarget>.o to take the relative path to $(obj)
instead of the basename
- stop suppressing Clang's -Wunused-function warnings when W=1
- fix linux/export.h to avoid genksyms calculating CRC of trimmed
exported symbols
- misc cleanups
* tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits)
genksyms: convert to SPDX License Identifier for lex.l and parse.y
modpost: use __section in the output to *.mod.c
modpost: use MODULE_INFO() for __module_depends
export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols
export.h: remove defined(__KERNEL__), which is no longer needed
kbuild: allow Clang to find unused static inline functions for W=1 build
kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN
kbuild: refactor scripts/Makefile.extrawarn
merge_config.sh: ignore unwanted grep errors
kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)
modpost: add NOFAIL to strndup
modpost: add guid_t type definition
kbuild: add $(BASH) to run scripts with bash-extension
kbuild: remove ARCH_{CPP,A,C}FLAGS
kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC
kbuild: Do not enable -Wimplicit-fallthrough for clang for now
kbuild: clean up subdir-ymn calculation in Makefile.clean
kbuild: remove unneeded '+' marker from cmd_clean
kbuild: remove clean-dirs syntax
kbuild: check clean srctree even earlier
...
|
|
[Why]
Edid Utility wishes to include DSC module from driver instead
of doing it's own logic which will need to be updated every time
someone modifies the driver logic.
[How]
Modify some functions such that we dont need to pass the entire
DC structure as parameter.
-Remove DC inclusion from module.
-Filter out problematic types and inclusions
Signed-off-by: Bayan Zabihiyan <bayan.zabihiyan@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
It's sometimes useful to have this option when debugging
[how]
Add a config flag. If the flag is not set, use driver default policy.
If the flag is set, use the value from the flag as the starting DSC slice
height.
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Kbuild provides per-file compiler flag addition/removal:
CFLAGS_<basetarget>.o
CFLAGS_REMOVE_<basetarget>.o
AFLAGS_<basetarget>.o
AFLAGS_REMOVE_<basetarget>.o
CPPFLAGS_<basetarget>.lds
HOSTCFLAGS_<basetarget>.o
HOSTCXXFLAGS_<basetarget>.o
The <basetarget> is the filename of the target with its directory and
suffix stripped.
This syntax comes into a trouble when two files with the same basename
appear in one Makefile, for example:
obj-y += foo.o
obj-y += dir/foo.o
CFLAGS_foo.o := <some-flags>
Here, the <some-flags> applies to both foo.o and dir/foo.o
The real world problem is:
scripts/kconfig/util.c
scripts/kconfig/lxdialog/util.c
Both files are compiled into scripts/kconfig/mconf, but only the
latter should be given with the ncurses flags.
It is more sensible to use the relative path to the Makefile, like this:
obj-y += foo.o
CFLAGS_foo.o := <some-flags>
obj-y += dir/foo.o
CFLAGS_dir/foo.o := <other-flags>
At first, I attempted to replace $(basetarget) with $*. The $* variable
is replaced with the stem ('%') part in a pattern rule. This works with
most of cases, but does not for explicit rules.
For example, arch/ia64/lib/Makefile reuses rule_as_o_S in its own
explicit rules, so $* will be empty, resulting in ignoring the per-file
AFLAGS.
I introduced a new variable, target-stem, which can be used also from
explicit rules.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Marc Zyngier <maz@kernel.org>
|
|
height
[why] Minimum slice height is recommended by VESA DSC Spreadsheet user guide
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This file was accidentally added to the driver during
Navi promotion
Nothing includes it. No makefile attempts to compile it, and
it would fail compilation if they tried
Remove it
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>w
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
undefined SW FP routines
arch/x86/Makefile disables SSE and SSE2 for the whole kernel. The
AMDGPU drivers modified in this patch re-enable SSE but not SSE2. Turn
on SSE2 to support emitting double precision floating point instructions
rather than calls to non-existent (usually available from gcc_s or
compiler_rt) floating point helper routines for Clang.
This was originally landed in:
commit 10117450735c ("drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines")
but reverted in:
commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")
due to bugreports from GCC builds. Add guards to only do so for Clang.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=109487
Link: https://github.com/ClangBuiltLinux/linux/issues/327
Suggested-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
'second_line_offset_adj' was mistakenly left at zero, even though DSC spec
v1.2a recommends setting this field to 512 for 4:2:0.
[how]
Set 'second_line_offset_adj' to 512 for 4:2:0 and leave at zero otherwise
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Some parts of the DSC spec relating to 4.2.0 were not reflected in
drm_dsc_compute_rc_parameters, causing unexpected config failures
[How]
Add nsl_bpg_offset and rbs_min computation
Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Nikola Cornij <Nikola.Cornij@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
'second_line_offset_adj' was mistakenly left at zero, even though DSC spec
v1.2a recommends setting this field to 512 for 4:2:0.
[how]
Set 'second_line_offset_adj' to 512 for 4:2:0 and leave at zero otherwise
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
As previously fixed for dml in commit 4769278e5c7f ("amdgpu/dc/dml:
Support clang option for stack alignment") and calcs in commit
cc32ad8f559c ("amdgpu/dc/calcs: Support clang option for stack
alignment"), dcn20 uses an option that is not available with clang:
clang: error: unknown argument: '-mpreferred-stack-boundary=4'
scripts/Makefile.build:281: recipe for target 'drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.o' failed
Use the same trick that we have in the other two files.
Fixes: 7ed4e6352c16 ("drm/amd/display: Add DCN2 HW Sequencer and Resource")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|