summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-10-10drm/etnaviv: add infrastructure to query perf counterChristian Gmeiner
Make it possible that userspace can query all performance domains and its signals. This information is needed to sample those signals via submit ioctl. At the moment no performance domain is available. Changes from v1 -> v2: - use a 16 bit value for signals - fix padding issues - add id member to domain and signal struct Changes v4 -> v5 - provide for each pipe an own set of pm domains Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10drm/etnaviv: make it possible to allocate multiple eventsChristian Gmeiner
This makes it possible to allocate multiple events under the event spinlock. This change is needed to support 'sync'-points. Changes v2 -> v3: - wait for the completion of all events - use 10sec timeout regardless of the number of events - removed validation if there are enough free events - fixed return value evaluation of event_alloc(..) in etnaviv_gpu_submit(..) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10drm/etnaviv: use bitmap to keep track of eventsChristian Gmeiner
This is prep work to be able to allocate multiple events in one go. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10drm/etnaviv: rework clock initializationLucas Stach
The reset path wants to initialize the clock control register regardless of the DYNAMIC_FREQUENCY_SCALING feature, so don't call clock update, but explicitly load the register. Also disabling of the debug registers is moved into the reset function, so we always get to the same state after a GPU reset. This means the clock update function should not touch the bits already set in the clock control register, but instead only update the scaling bits. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-10drm/etnaviv: remove IOMMU dependencyLucas Stach
Using the IOMMU API to manage the internal GPU MMU has been an historical accident and it keeps getting in the way, as well as entangling the driver with the inner workings of the IOMMU subsystem. Clean this up by removing the usage of iommu_domain, which is the last piece linking etnaviv to the IOMMU subsystem. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10drm/etnaviv: mmu: mark local functions staticLucas Stach
And clean up the header file a bit. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: mmu: stop using iommu map/unmap functionsLucas Stach
This is a preparation to remove the etnaviv dependency on the IOMMU subsystem by importing the relevant parts of the iommu map/unamp functions into the driver. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: iommuv1: remove map_lockLucas Stach
It wasn't protecting anything, as the single word writes used to set up or tear down a translation are already inherently atomic, so the spinlock is pure overhead. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: iommuv1: fold pgtable_write into callersLucas Stach
A function doing a single assignment is not really helping the code flow. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: iommuv1: fold pagetable alloc and free into callerLucas Stach
Those functions are simple enough to fold them into the calling function. This also fixes a correctness issue, as the alloc/free functions didn't specifiy the device the memory was allocated for. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: remove iova_to_phys iommu opsLucas Stach
They are not used in any way, so can go away. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/etnaviv: remove iommu fault handlerLucas Stach
The handler has never been used, so it's really just dead code. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10drm/bridge/synopsys: dsi :remove is_panel_bridgebenjamin.gaignard@linaro.org
When using drm_of_panel_bridge_remove() we can simplify the code and remove is_panel_bridge from dw_mipi_dsi structure. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Link: https://patchwork.freedesktop.org/patch/msgid/1506936888-23844-6-git-send-email-benjamin.gaignard@linaro.org
2017-10-10drm/vc4: remove bridge from driver internal structurebenjamin.gaignard@linaro.org
With a call to drm_of_panel_bridge_remove() we could remove the bridge without store it in vc4_dpi internal driver structure. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1506936888-23844-5-git-send-email-benjamin.gaignard@linaro.org
2017-10-10drm/stm: ltdc: remove bridge from driver internal structurebenjamin.gaignard@linaro.org
With a call to drm_of_panel_bridge_remove() we could remove the bridge without store it in ldtc internal driver structure. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Link: https://patchwork.freedesktop.org/patch/msgid/1506936888-23844-4-git-send-email-benjamin.gaignard@linaro.org
2017-10-10drm/drm_of: add drm_of_panel_bridge_remove functionbenjamin.gaignard@linaro.org
This function is the pendant of drm_of_find_panel_or_bridge() to remove a previously allocated panel_bridge. Given a specific port and endpoint it remove the panel bridge. Since drm_panel_bridge_remove() will check that bridge parameter is not NULL and is a real drm_panel_bridge and no a simple bridge it is safe to call it directly. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Link: https://patchwork.freedesktop.org/patch/msgid/1506936888-23844-3-git-send-email-benjamin.gaignard@linaro.org
2017-10-10drm/bridge: make drm_panel_bridge_remove more robustbenjamin.gaignard@linaro.org
Make sure that bridge parameter is not NULL and can be safely cast into a panel_bridge structure. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Philippe Cornu <philippe.cornu@st.com> Tested-by: Philippe Cornu <philippe.cornu@st.com> Link: https://patchwork.freedesktop.org/patch/msgid/1506936755-23625-2-git-send-email-benjamin.gaignard@linaro.org
2017-10-10drm/i915/bios: don't pass bdb to parsers that don't parse VBT directlyJani Nikula
Hint that you're not supposed to look at VBT in these functions. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b82c326be8c796a70bdc2bd1c479bbb6159f5cb0.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: parse SDVO device mapping from pre-parsed child devicesJani Nikula
We parse and store the child devices in parse_general_definitions(). There is no need to parse the VBT block again for SDVO device mapping. Do the same as we do in parse_ddi_ports(). We no longer have access to child device size at this stage, but we also don't need to worry about reading past the child device anymore. Instead of a child device size check, do a mild optimization by limiting the parsing to gens 3 through 7. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c918d4173dd38a165295f1270cb16c2c01bd8cd1.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: merge parse_device_mapping() into parse_general_definitions()Jani Nikula
They're both parsing the same block, and there's no need for them to be split. The former also benefits from the range checks in the latter. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/64a292606ecbb0b8602e6c5523c5746573ec3944.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: cleanup comments and useless returnJani Nikula
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4a95fb9d23d980830e3158d3c57258e6539965ce.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: remove an unnecessary temp variableJani Nikula
Prepare for merging parse_device_mapping() into parse_general_definitions(). No functional changes. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f1c3621e2622f4debdfb4a2f5c1959845754ac04.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: don't initialize fields based on vbt versionJani Nikula
In theory, these might clobber information for older VBT versions. We might have to store the BDB version for later parsing, but currently all code accessing these fields will only use them on newer platforms with new enough BDB versions. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/0232d9cb258e8f83c4180cdb8aad1459a312ec2a.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: refactor parse general definitionsJani Nikula
Early return on failures. Rename the variable for later merging with parse_device_mappings(). Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/785abb904a572752fec68d90d34efeb67774dc1f.1506586821.git.jani.nikula@intel.com
2017-10-10drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channelJani Nikula
While technically CHV isn't DDI, we do look at the VBT based DDI port info for HDMI DDC pin and DP AUX channel. (We call these "alternate", but they're really just something that aren't platform defaults.) In commit e4ab73a13291 ("drm/i915: Respect alternate_ddc_pin for all DDI ports") Ville writes, "IIRC there may be CHV system that might actually need this." I'm not sure why there couldn't be even more platforms that need this, but start conservative, and parse the info for CHV in addition to DDI. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100553 Reported-by: Marek Wilczewski <mw@3cte.pl> Cc: stable@vger.kernel.org Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d0815082cb98487618429b62414854137049b888.1506586821.git.jani.nikula@intel.com
2017-10-09drm/i915: avoid division by zero on cnl_calc_wrpll_linkPaulo Zanoni
If for some unexpected reason the registers all read zero it's better to WARN and return instead of dividing by zero and completely freezing the machine. I don't expect this to happen in the wild with the current code, but I accidentally triggered the division by zero while doing some debugging in an unusual environment. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171005213842.11423-2-paulo.r.zanoni@intel.com
2017-10-09drm/i915: add the BXT and CNL DPLL registers to pipe_config_comparePaulo Zanoni
Looks like we were missing them. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170922205343.16006-2-paulo.r.zanoni@intel.com
2017-10-09drm/amdgpu: add interface for editing a foreign process's priority v3Andres Rodriguez
The AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE ioctls are used to set the priority of a different process in the current system. When a request is dropped, the process's contexts will be restored to the priority specified at context creation time. A request can be dropped by setting the override priority to AMDGPU_CTX_PRIORITY_UNSET. An fd is used to identify the remote process. This is simpler than passing a pid number, which is vulnerable to re-use, etc. This functionality is limited to DRM_MASTER since abuse of this interface can have a negative impact on the system's performance. v2: removed unused output structure v3: change refcounted interface for a regular set operation Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: add plumbing for ctx priority changes v2Andres Rodriguez
Introduce amdgpu_ctx_priority_override(). A mechanism to override a context's priority. An override can be terminated by setting the override to AMD_SCHED_PRIORITY_UNSET. v2: change refcounted interface for a direct set Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: introduce AMDGPU_CTX_PRIORITY_UNSETAndres Rodriguez
Use _INVALID to identify bad parameters and _UNSET to represent the lack of interest in a specific value. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amd/sched: allow clients to edit an entity's rq v2Andres Rodriguez
This is useful for changing an entity's priority at runtime. v2: don't modify the order of amd_sched_entity members Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: make amdgpu_to_sched_priority detect invalid parametersAndres Rodriguez
Returning invalid priorities as _NORMAL is a backwards compatibility quirk of amdgpu_ctx_ioctl(). Move this detail one layer up where it belongs. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: implement ring set_priority for gfx_v8 compute v9Andres Rodriguez
Programming CP_HQD_QUEUE_PRIORITY enables a queue to take priority over other queues on the same pipe. Multiple queues on a pipe are timesliced so this gives us full precedence over other queues. Programming CP_HQD_PIPE_PRIORITY changes the SPI_ARB_PRIORITY of the wave as follows: 0x2: CS_H 0x1: CS_M 0x0: CS_L The SPI block will then dispatch work according to the policy set by SPI_ARB_PRIORITY. In the current policy CS_H is higher priority than gfx. In order to prevent getting stuck in loops of resources bouncing between GFX and high priority compute and introducing further latency, we statically reserve a portion of the pipe. v2: fix srbm_select to ring->queue and use ring->funcs->type v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: switch int to enum amd_sched_priority v5: corresponding changes for srbm_lock v6: change CU reservation to PIPE_PERCENT allocation v7: use kiq instead of MMIO v8: back to MMIO, and make the implementation sleep safe. v9: corresponding changes for splitting HIGH into _HW/_SW Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: add framework for HW specific priority settings v9Andres Rodriguez
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes. v2: set priority before emitting IB, and take a ref on amdgpu_job v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb v5: use atomic for tracking job priorities instead of last_job v6: rename amdgpu_ring_priority_[get/put]() and align parameters v7: replace spinlocks with mutexes for KIQ compatibility v8: raise ring priority during cs_ioctl, instead of job_run v9: priority_get() before push_job() Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: add parameter to allocate high priority contexts v11Andres Rodriguez
Add a new context creation parameter to express a global context priority. The priority ranking in descending order is as follows: * AMDGPU_CTX_PRIORITY_HIGH_HW * AMDGPU_CTX_PRIORITY_HIGH_SW * AMDGPU_CTX_PRIORITY_NORMAL * AMDGPU_CTX_PRIORITY_LOW_SW * AMDGPU_CTX_PRIORITY_LOW_HW The driver will attempt to schedule work to the hardware according to the priorities. No latency or throughput guarantees are provided by this patch. This interface intends to service the EGL_IMG_context_priority extension, and vulkan equivalents. Setting a priority above NORMAL requires CAP_SYS_NICE or DRM_MASTER. v2: Instead of using flags, repurpose __pad v3: Swap enum values of _NORMAL _HIGH for backwards compatibility v4: Validate usermode priority and store it v5: Move priority validation into amdgpu_ctx_ioctl(), headline reword v6: add UAPI note regarding priorities requiring CAP_SYS_ADMIN v7: remove ctx->priority v8: added AMDGPU_CTX_PRIORITY_LOW, s/CAP_SYS_ADMIN/CAP_SYS_NICE v9: change the priority parameter to __s32 v10: split priorities into _SW and _HW v11: Allow DRM_MASTER without CAP_SYS_NICE Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2Andres Rodriguez
Introduce a flag to signal that access to a BO will be synchronized through an external mechanism. Currently all buffers shared between contexts are subject to implicit synchronization. However, this is only required for protocols that currently don't support an explicit synchronization mechanism (DRI2/3). This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that users can specify when it is safe to disable implicit sync. v2: only disable explicit sync in amdgpu_cs_ioctl Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: add helper to convert a ttm bo to amdgpu_boAndres Rodriguez
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: add VM support for huge pages v2Christian König
Convert GTT mappings into linear ones for huge page handling. v2: use fragment size as minimum for linear conversion Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/ttm: DMA map/unmap consecutive pages as a whole v2Christian König
Instead of mapping them bit by bit map/unmap all consecutive pages as in one call. v2: test for consecutive pages instead of using compound page order. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/ttm: allocate/free multiple pages in a single callChristian König
Totally surprisingly this is more efficient than doing it page by page. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: Reserve shared memory on VRAM for SR-IOVHorace Chen
SR-IOV need to reserve a piece of shared VRAM at the exact place to exchange data betweem PF and VF. The start address and size of the shared mem are passed to guest through VBIOS structure VRAM_UsageByFirmware. VRAM_UsageByFirmware is a general feature in VBIOS, it indicates that VBIOS need to reserve a piece of memory on the VRAM. Because the mem address is specified. Reserve it early in amdgpu_ttm_init to make sure that it can monoplize the space. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/amdgpu: Set the correct value for PDEs/PTEs of ATC memory on RavenYong Zhao
Without the additional bits set in PDEs/PTEs, the ATC memory access would have failed on Raven. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09drm/i915: s/sg_mask/sg_page_sizes/Matthew Auld
It's a little unclear what the sg_mask actually is, so prefer the more meaningful name of sg_page_sizes. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009110024.29114-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-09drm/i915: Early rejection of mappable GGTT pin attempts for large boChris Wilson
Currently, we reject attempting to pin a large bo into the mappable aperture, but only after trying to create the vma. Under debug kernels, repeatedly creating and freeing that vma for an oversized bo consumes one-third of the runtime for pwrite/pread tests as it is spent on kmalloc/kfree tracking. If we move the rejection to before creating that vma, we lose some accuracy of checking against the fence_size as opposed to object size, though the fence can never be smaller than the object. Note that the vma creation itself will reject an attempt to create a vma larger than the GTT so we can remove one redundant test. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-7-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Avoid evicting user fault mappable vma for pread/pwriteChris Wilson
Both pread/pwrite GTT paths provide a fast fallback in case we cannot map the whole object at a time. Currently, we use the fallback for very large objects and for active objects that would require remapping, but we can also add active fault mappable objects to the list that we want to avoid evicting. The rationale is that such fault mappable objects are in active use and to evict requires tearing down the CPU PTE and forcing a page fault on the next access; more costly, and intefers with other processes, than our per-page GTT fallback. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Try a minimal attempt to insert the whole object for relocationsChris Wilson
As we have a lightweight fallback to insert a single page into the aperture, try to avoid any heavier evictions when attempting to insert the entire object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Check PIN_NONFAULT overlaps in evict_for_nodeChris Wilson
If the caller says that he doesn't want to evict any other faulting vma, honour that flag. The logic was used in evict_something, but not the more specific evict_for_node, now being used as a preliminary probe since commit 606fec956c0e ("drm/i915: Prefer random replacement before eviction search"). Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search") Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT") References: https://bugs.freedesktop.org/show_bug.cgi?id=102490 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Track user GTT faulting per-vmaChris Wilson
We don't wish to refault the entire object (other vma) when unbinding one partial vma. To do this track which vma have been faulted into the user's address space. v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Consolidate get_fence with pin_fenceChris Wilson
Following the pattern now used for obj->mm.pages, use just pin_fence and unpin_fence to control access to the fence registers. I.e. instead of calling get_fence(); pin_fence(), we now just need to call pin_fence(). This will make it easier to reduce the locking requirements around fence registers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-09drm/i915: Pin fence for iomapChris Wilson
Acquire the fence register for the iomap in i915_vma_pin_iomap() on behalf of the caller. We probably want for the caller to specify whether the fence should be pinned for their usage, but at the moment all callers do want the associated fence, or none, so take it on their behalf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk