summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-17drm/nouveau/fifo/gk104-: preempt recoveryBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: trigger mmu fault before attempting engine recoveryBen Skeggs
Greatly improves the chances of recovering the GPU from a CTXSW_TIMEOUT. Tested with piglit's arb_shader_image_load_store-atomicity, which causes GR to hang in such a way that recovery failed (CTXSW_TIMEOUT continually re-triggers). Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: ACK SCHED_ERROR before attempting CTXSW_TIMEOUT ↵Ben Skeggs
recovery Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: directly use new recovery code for ctxsw timeoutBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: directly use new recovery code for mmu faultsBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: reset all engines a killed channel is still active onBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: refactor recovery codeBen Skeggs
This will serve as a basis for implementing some improvements to how we recover the GPU from channel errors. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: better detection of chid when parsing engine statusBen Skeggs
The previous commit simply changes the interface, but should result in the same behaviour as previously. This commit has been split out from it as it can result in a different channel being selected. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gk104-: separate out engine status parsingBen Skeggs
We'll be wanting to reuse this logic in more places. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo: add an api for initiating channel recoveryBen Skeggs
This will be used by callers outside of fifo interrupt handlers. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/top: add function to translate subdev index to mmu fault idBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/gr/gf100-: implement chsw_load() methodBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/gr: implement chsw_load() methodBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/core: add engine method to assist in determining chsw directionBen Skeggs
FIFO gives us load/save/switch status, and we need to be able to determine which direction a "switch" is failing during channel recovery. In order to do this, we apparently need to query the engine itself. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: check for dead channel before trying to idleBen Skeggs
This prevents *very* long waits while attempting to destroy channels after a fault has occurred. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: request notifications for channels that have been killedBen Skeggs
These will be used to improve error recovery behaviour. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/gf100-: provide notification to user if channel is killedBen Skeggs
There are instances (such as non-recoverable GPU page faults) where NVKM decides that a channel's context is no longer viable, and will be removed from the runlist. This commit notifies the owner of the channel when this happens, so it has the opportunity to take some kind of recovery action instead of hanging. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo/g84-: rename non-stall interrupt eventBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/fifo: tidy up channel creation event codeBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/core: increase maximum number of notifies that a client can requestBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/drm/nouveau/led: prevent a possible use-after-freeMartin Peres
If the led class registration fails, we free drm->led but do not reset it to NULL, which means that the suspend/resume/fini function will act as if everything went well in init() and will likely crash the kernel. This patch adds the missing drm->led = NULL. Reported-by: Emmanuel Pescosta <emmanuelpescosta099@gmail.com> Signed-off-by: Martin Peres <martin.peres@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/pci/g92: Enable changing pcie link speedsKarol Herbst
Tested on a G92, seems to work. Confirmed by 8 mmiotraces. Signed-off-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/pci: Rename g94 to g92Karol Herbst
Signed-off-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/devinit/nv50: return error code if pll calculation failsBen Skeggs
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: fix bug id typo in commentIlia Mirkin
The issue was recorded in fd.o bug 27501, not 25701. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: size is u64 everywhereBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17reset: fix shared reset triggered_count decrement on errorJerome Brunet
For a shared reset, when the reset is successful, the triggered_count is incremented when trying to call the reset callback, so that another device sharing the same reset line won't trigger it again. If the reset has not been triggered successfully, the trigger_count should be decremented. The code does the opposite, and decrements the trigger_count on success. As a consequence, another device sharing the reset will be able to trigger it again. Fixed be removing negation in from of the error code of the reset function. Fixes: 7da33a37b48f ("reset: allow using reset_control_reset with shared reset") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-02-17gpu: ipu-v3: Stop overwriting pdev->dev.of_node of child devicesPhilipp Zabel
Setting dev->of_node changes the modalias and breaks module autoloading. Since there is an of_node field in the platform data passed to child devices, we don't even need this anymore. Suggested-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-02-17gpu: ipu-v3: export ipu_csi_set_downsizePhilipp Zabel
This function will be used by the media drivers and needs to be exported to allow them to be built as modules. Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 867341b95891 ("gpu: ipu-v3: add ipu_csi_set_downsize") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-02-17drm/imx: lift 64x64 pixel minimum framebuffer size requirementPhilipp Zabel
There is no reason to limit framebuffer size to 64x64 pixels at a minimum on creation. The actual scanout limitations (width >= 13 for the base plane and height >= 2) are checked in atomic_check. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-02-17drm/imx: imx-tve: Do not set the regulator voltageFabio Estevam
Commit deb65870b5d9d ("drm/imx: imx-tve: check the value returned by regulator_set_voltage()") exposes the following probe issue: 63ff0000.tve supply dac not found, using dummy regulator imx-drm display-subsystem: failed to bind 63ff0000.tve (ops imx_tve_ops): -22 When the 'dac-supply' is not passed in the device tree a dummy regulator is used and setting its voltage is not allowed. To fix this issue, do not set the dac-supply voltage inside the driver and let its voltage be specified in the device tree. Print a warning if the the 'dac-supply' voltage has a value different from 2.75V. Fixes: deb65870b5d9d ("drm/imx: imx-tve: check the value returned by regulator_set_voltage()") Cc: <stable@vger.kernel.org> # 4.8+ Suggested-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-02-17s390: add missing "do {} while (0)" loop constructs to multiline macrosHeiko Carstens
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390/mm: add cond_resched call to kernel page table dumperHeiko Carstens
Walking kernel page tables within the kernel page table dumper may potentially take a lot of time. This may lead to soft lockup warning messages. To avoid this add a cond_resched call for each pgd_level iteration. This is the same as "x86/mm/ptdump: Fix soft lockup in page table walker" for x86. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: get rid of MACHINE_HAS_PFMF and MACHINE_HAS_HPAGEHeiko Carstens
Both MACHINE_HAS_PFMF and MACHINE_HAS_HPAGE are just an alias for MACHINE_HAS_EDAT1. So simply use MACHINE_HAS_EDAT1 instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390/mm: make memory_block_size_bytes available for !MEMORY_HOTPLUGHeiko Carstens
Fix this compile error for !MEMORY_HOTPLUG && NUMA: arch/s390/built-in.o: In function `emu_setup_size_adjust': arch/s390/numa/mode_emu.c:477: undefined reference to `memory_block_size_bytes' Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: replace ACCESS_ONCE with READ_ONCEChristian Borntraeger
Remove the last places of ACCESS_ONCE in s390 code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: Audit and remove any remaining unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as implicit use of ptrace.h and string.h headers. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: mm: Audit and remove any unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. An instance where module_param was used without moduleparam.h was also fixed, as well as an implict use of asm/elf.h header. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390: kernel: Audit and remove any unnecessary uses of module.hPaul Gortmaker
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each change instance for the presence of either and replace as needed. Build testing revealed some implicit header usage that was fixed up accordingly. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17s390/kdump: Use "LINUX" ELF note name instead of "CORE"Michael Holzheu
In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified as note name. Otherwise the notes are ignored. For /proc/vmcore we currently use "CORE" for these notes. Up to now this has not been a real problem because the dump analysis tool "crash" does not check the note name. But it will break all programs that use libbfd for processing ELF notes. So fix this and use "LINUX" for all s390 specific notes to comply with libbfd. Cc: stable@vger.kernel.org # v4.4+ Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17dm: flush queued bios when process blocks to avoid deadlockMikulas Patocka
Commit df2cb6daa4 ("block: Avoid deadlocks with bio allocation by stacking drivers") created a workqueue for every bio set and code in bio_alloc_bioset() that tries to resolve some low-memory deadlocks by redirecting bios queued on current->bio_list to the workqueue if the system is low on memory. However other deadlocks (see below **) may happen, without any low memory condition, because generic_make_request is queuing bios to current->bio_list (rather than submitting them). ** the related dm-snapshot deadlock is detailed here: https://www.redhat.com/archives/dm-devel/2016-July/msg00065.html Fix this deadlock by redirecting any bios on current->bio_list to the bio_set's rescue workqueue on every schedule() call. Consequently, when the process blocks on a mutex, the bios queued on current->bio_list are dispatched to independent workqueus and they can complete without waiting for the mutex to be available. The structure blk_plug contains an entry cb_list and this list can contain arbitrary callback functions that are called when the process blocks. To implement this fix DM (ab)uses the onstack plug's cb_list interface to get its flush_current_bio_list() called at schedule() time. This fixes the snapshot deadlock - if the map method blocks, flush_current_bio_list() will be called and it redirects bios waiting on current->bio_list to appropriate workqueues. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1267650 Depends-on: df2cb6daa4 ("block: Avoid deadlocks with bio allocation by stacking drivers") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-02-17dm round robin: revert "use percpu 'repeat_count' and 'current_path'"Mike Snitzer
The sloppy nature of lockless access to percpu pointers (s->current_path) in rr_select_path(), from multiple threads, is causing some paths to used more than others -- which results in less IO performance being observed. Revert these upstream commits to restore truly symmetric round-robin IO submission in DM multipath: b0b477c dm round robin: use percpu 'repeat_count' and 'current_path' 802934b dm round robin: do not use this_cpu_ptr() without having preemption disabled There is no benefit to all this complexity if repeat_count = 1 (which is the recommended default). Cc: stable@vger.kernel.org # 4.6+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-02-17drm/nouveau: s/mem/reg/ for struct ttm_mem_reg variablesBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: allocate device object for every clientBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: create userspace clients as subclientsBen Skeggs
This will allow the DRM to share memory objects between clients later down the track. For the moment, the only immediate benefit is less logic required to handle suspend/resume. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: tidy up the client init/fini interfacesBen Skeggs
These were a little insane. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: pass nvif_client to nouveau_gem_new() instead of drm_deviceBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau: pass nvif_client to nouveau_bo_new() instead of drm_deviceBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/core/memory: distinguish between coherent/non-coherent targetsBen Skeggs
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-17drm/nouveau/core/mm: replace region list with next pointerBen Skeggs
We never have any need for a double-linked list here, and as there's generally a large number of these objects, replace it with a single- linked list in order to save some memory. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>