diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 12:53:43 -0700 |
commit | e058a84bfddc42ba356a2316f2cf1141974625c9 (patch) | |
tree | e6a02dd913e83f44ea9f5a779f9b9bd56d06a9e3 /drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | |
parent | c288d9cd710433e5991d58a0764c4d08a933b871 (diff) | |
parent | 8a02ea42bc1d4c448caf1bab0e05899dad503f74 (diff) |
Merge tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- AMD enables two more GPUs, with resulting header files
- i915 has started to move to TTM for discrete GPU and enable DG1
discrete GPU support (not by default yet)
- new HyperV drm driver
- vmwgfx adds arm64 support
- TTM refactoring ongoing
- 16bpc display support for AMD hw
Otherwise it's just the usual insane amounts of work all over the
place in lots of drivers and the core, as mostly summarised below:
Core:
- mark AGP ioctls as legacy
- disable force probing for non-master clients
- HDR metadata property helpers
- HDMI infoframe signal colorimetry support
- remove drm_device.pdev pointer
- remove DRM_KMS_FB_HELPER config option
- remove drm_pci_alloc/free
- drm_err_*/drm_dbg_* helpers
- use drm driver names for fbdev
- leaked DMA handle fix
- 16bpc fixed point format fourcc
- add prefetching memcpy for WC
- Documentation fixes
aperture:
- add aperture ownership helpers
dp:
- aux fixes
- downstream 0 port handling
- use extended base receiver capability DPCD
- Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec
- mst: use khz as link rate during init
- VCPI fixes for StarTech hub
ttm:
- provide tt_shrink file via debugfs
- warn about freeing pinned BOs
- fix swapping error handling
- move page alignment into BO
- cleanup ttm_agp_backend
- add ttm_sys_manager
- don't override vm_ops
- ttm_bo_mmap removed
- make ttm_resource base of all managers
- remove VM_MIXEDMAP usage
panel:
- sysfs_emit support
- simple: runtime PM support
- simple: power up panel when reading EDID + caching
bridge:
- MHDP8546: HDCP support + DT bindings
- MHDP8546: Register DP AUX channel with userspace
- TI SN65DSI83 + SN65DSI84: add driver
- Sil8620: Fix module dependencies
- dw-hdmi: make CEC driver loading optional
- Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
- It66121: Add driver + DT bindings
- Adv7511: Support I2S IEC958 encoding
- Anx7625: fix power-on delay
- Nwi-dsi: Modesetting fixes; Cleanups
- lt6911: add missing MODULE_DEVICE_TABLE
- cdns: fix PM reference leak
hyperv:
- add new DRM driver for HyperV graphics
efifb:
- non-PCI device handling fixes
i915:
- refactor IP/device versioning
- XeLPD Display IP preperation work
- ADL-P enablement patches
- DG1 uAPI behind BROKEN
- disable mmap ioctl for discerte GPUs
- start enabling HuC loading for Gen12+
- major GuC backend rework for new platforms
- initial TTM support for Discrete GPUs
- locking rework for TTM prep
- use correct max source link rate for eDP
- %p4cc format printing
- GLK display fixes
- VLV DSI panel power fixes
- PSR2 disabled for RKL and ADL-S
- ACPI _DSM invalid access fixed
- DMC FW path abstraction
- ADL-S PCI ID update
- uAPI headers converted to kerneldoc
- initial LMEM support for DG1
- x86/gpu: add Jasperlake to gen11 early quirks
amdgpu:
- Aldebaran updates + initial SR-IOV
- new GPU: Beige Goby and Yellow Carp support
- more LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- PCIe ASPM support
- Renoir TMZ enablement
- initial multiple eDP panel support
- use fdinfo to track devices/process info
- pin/unpin TTM fixes
- free resource on fence usage query
- fix fence calculation
- fix hotunplug/suspend issues
- GC/MM register access macro cleanup for SR-IOV
- W=1 fixes
- ACPI ATCS/ATIF handling rework
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes
- new INFO query for additional vbios info
amdkfd:
- SR-IOV aldebaran support
- HMM SVM support
radeon:
- SMU regression fixes
- Oland flickering fix
vmwgfx:
- enable console with fbdev emulation
- fix cpu updates of coherent multisample surfaces
- remove reservation semaphore
- add initial SVGA3 support
- support arm64
msm:
- devcoredump support for display errors
- dpu/dsi: yaml bindings conversion
- mdp5: alpha/blend_mode/zpos support
- a6xx: cached coherent buffer support
- gpu iova fault improvement
- a660 support
rockchip:
- RK3036 win1 scaling support
- RK3066/3188 missing register support
- RK3036/3066/3126/3188 alpha support
mediatek:
- MT8167 HDMI support
- MT8183 DPI dual edge support
tegra:
- fixed YUV support/scaling on Tegra186+
ast:
- use pcim_iomap
- fix DP501 EDID
bochs:
- screen blanking support
etnaviv:
- export more GPU ID values to userspace
- add HWDB entry for GPU on i.MX8MP
- rework linear window calcs
exynos:
- pm runtime changes
imx:
- Annotate dma_fence critical section
- fix PRG modifiers after drmm conversion
- Add 8 pixel alignment fix for 1366x768
- fix YUV advertising
- add color properties
ingenic:
- IPU planes fix
panfrost:
- Mediatek MT8183 support + DT bindings
- export AFBC_FEATURES register to userspace
simpledrm:
- %pr for printing resources
nouveau:
- pin/unpin TTM fixes
qxl:
- unpin shadow BO
virtio:
- create dumb BOs as guest blob
vkms:
- drmm_universal_plane_alloc
- add XRGB plane composition
- overlay support"
* tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm: (1570 commits)
drm/i915: Reinstate the mmap ioctl for some platforms
drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
Revert "drm/msm/mdp5: provide dynamic bandwidth management"
drm/msm/mdp5: provide dynamic bandwidth management
drm/msm/mdp5: add perf blocks for holding fudge factors
drm/msm/mdp5: switch to standard zpos property
drm/msm/mdp5: add support for alpha/blend_mode properties
drm/msm/mdp5: use drm_plane_state for pixel blend mode
drm/msm/mdp5: use drm_plane_state for storing alpha value
drm/msm/mdp5: use drm atomic helpers to handle base drm plane state
drm/msm/dsi: do not enable PHYs when called for the slave DSI interface
drm/msm: Add debugfs to trigger shrinker
drm/msm/dpu: Avoid ABBA deadlock between IRQ modules
drm/msm: devcoredump iommu fault support
iommu/arm-smmu-qcom: Add stall support
drm/msm: Improve the a6xx page fault handler
iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info
iommu/arm-smmu: Add support for driver IOMMU fault handlers
drm/msm: export hangcheck_period in debugfs
drm/msm/a6xx: add support for Adreno 660 GPU
...
Diffstat (limited to 'drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c')
-rw-r--r-- | drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 128 |
1 files changed, 103 insertions, 25 deletions
diff --git a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c index be57088d185d..f403d8e84a8c 100644 --- a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c +++ b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c @@ -37,6 +37,8 @@ static uint32_t dsc_policy_max_target_bpp_limit = 16; /* default DSC policy enables DSC only when needed */ static bool dsc_policy_enable_dsc_when_not_needed; +static bool dsc_policy_disable_dsc_stream_overhead; + static bool dsc_buff_block_size_from_dpcd(int dpcd_buff_block_size, int *buff_block_size) { @@ -250,6 +252,7 @@ static bool intersect_dsc_caps( if (pixel_encoding == PIXEL_ENCODING_YCBCR422 || pixel_encoding == PIXEL_ENCODING_YCBCR420) dsc_common_caps->bpp_increment_div = min(dsc_common_caps->bpp_increment_div, (uint32_t)8); + dsc_common_caps->is_dp = dsc_sink_caps->is_dp; return true; } @@ -258,12 +261,63 @@ static inline uint32_t dsc_div_by_10_round_up(uint32_t value) return (value + 9) / 10; } +static struct fixed31_32 compute_dsc_max_bandwidth_overhead( + const struct dc_crtc_timing *timing, + const int num_slices_h, + const bool is_dp) +{ + struct fixed31_32 max_dsc_overhead; + struct fixed31_32 refresh_rate; + + if (dsc_policy_disable_dsc_stream_overhead || !is_dp) + return dc_fixpt_from_int(0); + + /* use target bpp that can take entire target bandwidth */ + refresh_rate = dc_fixpt_from_int(timing->pix_clk_100hz); + refresh_rate = dc_fixpt_div_int(refresh_rate, timing->h_total); + refresh_rate = dc_fixpt_div_int(refresh_rate, timing->v_total); + refresh_rate = dc_fixpt_mul_int(refresh_rate, 100); + + max_dsc_overhead = dc_fixpt_from_int(num_slices_h); + max_dsc_overhead = dc_fixpt_mul_int(max_dsc_overhead, timing->v_total); + max_dsc_overhead = dc_fixpt_mul_int(max_dsc_overhead, 256); + max_dsc_overhead = dc_fixpt_div_int(max_dsc_overhead, 1000); + max_dsc_overhead = dc_fixpt_mul(max_dsc_overhead, refresh_rate); + + return max_dsc_overhead; +} + +static uint32_t compute_bpp_x16_from_target_bandwidth( + const uint32_t bandwidth_in_kbps, + const struct dc_crtc_timing *timing, + const uint32_t num_slices_h, + const uint32_t bpp_increment_div, + const bool is_dp) +{ + struct fixed31_32 overhead_in_kbps; + struct fixed31_32 effective_bandwidth_in_kbps; + struct fixed31_32 bpp_x16; + + overhead_in_kbps = compute_dsc_max_bandwidth_overhead( + timing, num_slices_h, is_dp); + effective_bandwidth_in_kbps = dc_fixpt_from_int(bandwidth_in_kbps); + effective_bandwidth_in_kbps = dc_fixpt_sub(effective_bandwidth_in_kbps, + overhead_in_kbps); + bpp_x16 = dc_fixpt_mul_int(effective_bandwidth_in_kbps, 10); + bpp_x16 = dc_fixpt_div_int(bpp_x16, timing->pix_clk_100hz); + bpp_x16 = dc_fixpt_from_int(dc_fixpt_floor(dc_fixpt_mul_int(bpp_x16, bpp_increment_div))); + bpp_x16 = dc_fixpt_div_int(bpp_x16, bpp_increment_div); + bpp_x16 = dc_fixpt_mul_int(bpp_x16, 16); + return dc_fixpt_floor(bpp_x16); +} + /* Get DSC bandwidth range based on [min_bpp, max_bpp] target bitrate range, and timing's pixel clock * and uncompressed bandwidth. */ static void get_dsc_bandwidth_range( const uint32_t min_bpp_x16, const uint32_t max_bpp_x16, + const uint32_t num_slices_h, const struct dsc_enc_caps *dsc_caps, const struct dc_crtc_timing *timing, struct dc_dsc_bw_range *range) @@ -272,16 +326,21 @@ static void get_dsc_bandwidth_range( range->stream_kbps = dc_bandwidth_in_kbps_from_timing(timing); /* max dsc target bpp */ - range->max_kbps = dc_dsc_stream_bandwidth_in_kbps(timing->pix_clk_100hz, max_bpp_x16); + range->max_kbps = dc_dsc_stream_bandwidth_in_kbps(timing, + max_bpp_x16, num_slices_h, dsc_caps->is_dp); range->max_target_bpp_x16 = max_bpp_x16; if (range->max_kbps > range->stream_kbps) { /* max dsc target bpp is capped to native bandwidth */ range->max_kbps = range->stream_kbps; - range->max_target_bpp_x16 = calc_dsc_bpp_x16(range->stream_kbps, timing->pix_clk_100hz, dsc_caps->bpp_increment_div); + range->max_target_bpp_x16 = compute_bpp_x16_from_target_bandwidth( + range->max_kbps, timing, num_slices_h, + dsc_caps->bpp_increment_div, + dsc_caps->is_dp); } /* min dsc target bpp */ - range->min_kbps = dc_dsc_stream_bandwidth_in_kbps(timing->pix_clk_100hz, min_bpp_x16); + range->min_kbps = dc_dsc_stream_bandwidth_in_kbps(timing, + min_bpp_x16, num_slices_h, dsc_caps->is_dp); range->min_target_bpp_x16 = min_bpp_x16; if (range->min_kbps > range->max_kbps) { /* min dsc target bpp is capped to max dsc bandwidth*/ @@ -290,7 +349,6 @@ static void get_dsc_bandwidth_range( } } - /* Decides if DSC should be used and calculates target bpp if it should, applying DSC policy. * * Returns: @@ -303,6 +361,7 @@ static bool decide_dsc_target_bpp_x16( const struct dsc_enc_caps *dsc_common_caps, const int target_bandwidth_kbps, const struct dc_crtc_timing *timing, + const int num_slices_h, int *target_bpp_x16) { bool should_use_dsc = false; @@ -311,7 +370,7 @@ static bool decide_dsc_target_bpp_x16( memset(&range, 0, sizeof(range)); get_dsc_bandwidth_range(policy->min_target_bpp * 16, policy->max_target_bpp * 16, - dsc_common_caps, timing, &range); + num_slices_h, dsc_common_caps, timing, &range); if (!policy->enable_dsc_when_not_needed && target_bandwidth_kbps >= range.stream_kbps) { /* enough bandwidth without dsc */ *target_bpp_x16 = 0; @@ -327,7 +386,10 @@ static bool decide_dsc_target_bpp_x16( should_use_dsc = true; } else if (target_bandwidth_kbps >= range.min_kbps) { /* use target bpp that can take entire target bandwidth */ - *target_bpp_x16 = calc_dsc_bpp_x16(target_bandwidth_kbps, timing->pix_clk_100hz, dsc_common_caps->bpp_increment_div); + *target_bpp_x16 = compute_bpp_x16_from_target_bandwidth( + target_bandwidth_kbps, timing, num_slices_h, + dsc_common_caps->bpp_increment_div, + dsc_common_caps->is_dp); should_use_dsc = true; } else { /* not enough bandwidth to fulfill minimum requirement */ @@ -531,18 +593,6 @@ static bool setup_dsc_config( if (!is_dsc_possible) goto done; - if (target_bandwidth_kbps > 0) { - is_dsc_possible = decide_dsc_target_bpp_x16( - &policy, - &dsc_common_caps, - target_bandwidth_kbps, - timing, - &target_bpp); - dsc_cfg->bits_per_pixel = target_bpp; - } - if (!is_dsc_possible) - goto done; - sink_per_slice_throughput_mps = 0; // Validate available DSC settings against the mode timing @@ -690,12 +740,26 @@ static bool setup_dsc_config( dsc_cfg->num_slices_v = pic_height/slice_height; + if (target_bandwidth_kbps > 0) { + is_dsc_possible = decide_dsc_target_bpp_x16( + &policy, + &dsc_common_caps, + target_bandwidth_kbps, + timing, + num_slices_h, + &target_bpp); + dsc_cfg->bits_per_pixel = target_bpp; + } + if (!is_dsc_possible) + goto done; + // Final decission: can we do DSC or not? if (is_dsc_possible) { // Fill out the rest of DSC settings dsc_cfg->block_pred_enable = dsc_common_caps.is_block_pred_supported; dsc_cfg->linebuf_depth = dsc_common_caps.lb_bit_depth; dsc_cfg->version_minor = (dsc_common_caps.dsc_version & 0xf0) >> 4; + dsc_cfg->is_dp = dsc_sink_caps->is_dp; } done: @@ -806,6 +870,7 @@ bool dc_dsc_parse_dsc_dpcd(const struct dc *dc, const uint8_t *dpcd_dsc_basic_da dsc_sink_caps->branch_max_line_width = dpcd_dsc_branch_decoder_caps[DP_DSC_BRANCH_MAX_LINE_WIDTH - DP_DSC_BRANCH_OVERALL_THROUGHPUT_0] * 320; ASSERT(dsc_sink_caps->branch_max_line_width == 0 || dsc_sink_caps->branch_max_line_width >= 5120); + dsc_sink_caps->is_dp = true; return true; } @@ -838,7 +903,8 @@ bool dc_dsc_compute_bandwidth_range( dsc_min_slice_height_override, max_bpp_x16, &config); if (is_dsc_possible) - get_dsc_bandwidth_range(min_bpp_x16, max_bpp_x16, &dsc_common_caps, timing, range); + get_dsc_bandwidth_range(min_bpp_x16, max_bpp_x16, + config.num_slices_h, &dsc_common_caps, timing, range); return is_dsc_possible; } @@ -864,13 +930,20 @@ bool dc_dsc_compute_config( return is_dsc_possible; } -uint32_t dc_dsc_stream_bandwidth_in_kbps(uint32_t pix_clk_100hz, uint32_t bpp_x16) +uint32_t dc_dsc_stream_bandwidth_in_kbps(const struct dc_crtc_timing *timing, + uint32_t bpp_x16, uint32_t num_slices_h, bool is_dp) { - struct fixed31_32 link_bw_kbps; - link_bw_kbps = dc_fixpt_from_int(pix_clk_100hz); - link_bw_kbps = dc_fixpt_div_int(link_bw_kbps, 160); - link_bw_kbps = dc_fixpt_mul_int(link_bw_kbps, bpp_x16); - return dc_fixpt_ceil(link_bw_kbps); + struct fixed31_32 overhead_in_kbps; + struct fixed31_32 bpp; + struct fixed31_32 actual_bandwidth_in_kbps; + + overhead_in_kbps = compute_dsc_max_bandwidth_overhead( + timing, num_slices_h, is_dp); + bpp = dc_fixpt_from_fraction(bpp_x16, 16); + actual_bandwidth_in_kbps = dc_fixpt_from_fraction(timing->pix_clk_100hz, 10); + actual_bandwidth_in_kbps = dc_fixpt_mul(actual_bandwidth_in_kbps, bpp); + actual_bandwidth_in_kbps = dc_fixpt_add(actual_bandwidth_in_kbps, overhead_in_kbps); + return dc_fixpt_ceil(actual_bandwidth_in_kbps); } void dc_dsc_get_policy_for_timing(const struct dc_crtc_timing *timing, uint32_t max_target_bpp_limit_override_x16, struct dc_dsc_policy *policy) @@ -954,3 +1027,8 @@ void dc_dsc_policy_set_enable_dsc_when_not_needed(bool enable) { dsc_policy_enable_dsc_when_not_needed = enable; } + +void dc_dsc_policy_set_disable_dsc_stream_overhead(bool disable) +{ + dsc_policy_disable_dsc_stream_overhead = disable; +} |