summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-07-12drm/i915/guc: Skip cleaning up the doorbells on error-before-allocateChris Wilson
If we fail the module load, we may try and cleanup before we even allocate the GuC clients. KISS in order to try and re-enable drv_module_reload for BAT. Testcase: igt/drv_module_reload/basic-reload-inject Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180712105830.20390-1-chris@chris-wilson.co.uk
2018-07-12drm/i915: Silence warning for no vlv powercontextChris Wilson
Along a module load error path, we may try to cleanup the powercontext even before we have allocated it. Reorganising GT powermanagement is an on going process, so for simplicity handle it. [ 522.733832] WARN_ON(!dev_priv->vlv_pctx) [ 522.733986] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915] [ 522.733991] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btbcm btintel intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul bluetooth snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core ecdh_generic lpc_ich r8169 snd_pcm mii i2c_hid prime_numbers [last unloaded: i915] [ 522.734105] CPU: 1 PID: 3856 Comm: drv_module_relo Tainted: G U 4.18.0-rc4-CI-CI_DRM_4474+ #1 [ 522.734110] Hardware name: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff/DN2820FYK, BIOS FYBYT10H.86A.0059.2017.0607.2130 06/07/2017 [ 522.734193] RIP: 0010:intel_cleanup_gt_powersave+0x5f/0x70 [i915] [ 522.734197] Code: 00 74 0d 48 c7 83 68 a6 00 00 00 00 00 00 eb c8 e8 36 6f 37 e1 eb ec 48 c7 c6 c5 7a 3d a0 48 c7 c7 b5 78 3d a0 e8 71 04 e0 e0 <0f> 0b eb aa 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 c3 0f 1f 40 [ 522.734445] RSP: 0018:ffffc900004f3af0 EFLAGS: 00010282 [ 522.734453] RAX: 0000000000000000 RBX: ffff880106360000 RCX: 0000000000000001 [ 522.734458] RDX: 0000000080000001 RSI: ffffffff820c65c4 RDI: 00000000ffffffff [ 522.734463] RBP: ffff880106360000 R08: 000000009f79baee R09: 0000000000000000 [ 522.734467] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013b3133f8 [ 522.734472] R13: 00000000ffffffed R14: ffff880106360d58 R15: ffff88013b3133f8 [ 522.734477] FS: 00007f43f70af980(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000 [ 522.734481] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 522.734486] CR2: 000055a13a787580 CR3: 00000001325e6000 CR4: 00000000001006e0 [ 522.734490] Call Trace: [ 522.734595] intel_modeset_cleanup+0xcf/0x140 [i915] [ 522.734682] i915_driver_load+0xc85/0x10a0 [i915] [ 522.734694] ? _raw_spin_unlock_irqrestore+0x4c/0x60 [ 522.734703] ? trace_hardirqs_on_caller+0xe0/0x1b0 [ 522.734790] i915_pci_probe+0x29/0x90 [i915] [ 522.734801] pci_device_probe+0xa1/0x130 [ 522.734813] driver_probe_device+0x306/0x480 [ 522.734824] __driver_attach+0xdb/0x100 [ 522.734830] ? driver_probe_device+0x480/0x480 [ 522.734836] ? driver_probe_device+0x480/0x480 [ 522.734844] bus_for_each_dev+0x74/0xc0 [ 522.734855] bus_add_driver+0x15f/0x250 [ 522.734863] ? 0xffffffffa0793000 [ 522.734870] driver_register+0x56/0xe0 [ 522.734877] ? 0xffffffffa0793000 [ 522.734883] do_one_initcall+0x58/0x370 [ 522.734893] ? do_init_module+0x1d/0x1ea [ 522.734900] ? rcu_read_lock_sched_held+0x6f/0x80 [ 522.734906] ? kmem_cache_alloc_trace+0x282/0x2e0 [ 522.734918] do_init_module+0x56/0x1ea [ 522.734927] load_module+0x2435/0x2b20 [ 522.734965] ? __se_sys_finit_module+0xd3/0xf0 [ 522.734972] __se_sys_finit_module+0xd3/0xf0 [ 522.734995] do_syscall_64+0x55/0x190 [ 522.735003] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 522.735009] RIP: 0033:0x7f43f675d839 [ 522.735014] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 [ 522.735260] RSP: 002b:00007ffe69384238 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 522.735269] RAX: ffffffffffffffda RBX: 000056100e387090 RCX: 00007f43f675d839 [ 522.735273] RDX: 0000000000000000 RSI: 000056100e37bff0 RDI: 0000000000000003 [ 522.735278] RBP: 000056100e37bff0 R08: 0000000000000000 R09: 0000000000000000 [ 522.735282] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 [ 522.735286] R13: 000056100e37c890 R14: 0000000000000020 R15: 0000000000000027 [ 522.735309] irq event stamp: 1389594 [ 522.735316] hardirqs last enabled at (1389593): [<ffffffff810f896c>] console_unlock+0x3fc/0x600 [ 522.735323] hardirqs last disabled at (1389594): [<ffffffff81a0111c>] error_entry+0x7c/0x100 [ 522.735329] softirqs last enabled at (1389356): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505 [ 522.735336] softirqs last disabled at (1389335): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0 [ 522.735432] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915] Testcase: igt/drv_module_reload/basic-reload-inject Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180712105454.16091-1-chris@chris-wilson.co.uk
2018-07-12ARM: DRA7/OMAP5: Enable ACTLR[0] (Enable invalidates of BTB) for secondary coresNishanth Menon
Call secure services to enable ACTLR[0] (Enable invalidates of BTB with ICIALLU) when branch hardening is enabled for kernel. On GP devices OMAP5/DRA7, there is no possibility to update secure side since "secure world" is ROM and there are no override mechanisms possible. On HS devices, appropriate PPA should do the workarounds as well. However, the configuration is only done for secondary core, since it is expected that firmware/bootloader will have enabled the required configuration for the primary boot core (note: bootloaders typically will NOT enable secondary processors, since it has no need to do so). Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2018-07-12xen: remove global bit from __default_kernel_pte_mask for pv guestsJuergen Gross
When removing the global bit from __supported_pte_mask do the same for __default_kernel_pte_mask in order to avoid the WARN_ONCE() in check_pgprot() when setting a kernel pte before having called init_mem_mapping(). Cc: <stable@vger.kernel.org> # 4.17 Reported-by: Michael Young <m.a.young@durham.ac.uk> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-07-12drm/i915/tv: fix strncpy truncation warningDominique Martinet
Change it to use strlcpy instead Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180712074103.21571-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-07-12drm/sun4i: fix build failure with CONFIG_DRM_SUN8I_MIXER=mArnd Bergmann
Having DRM_SUN4I built-in but DRM_SUN8I_MIXER as a loadable module results in a link error, as we try to access a symbol from the sun8i_tcon_top.ko module: ERROR: "sun8i_tcon_top_of_table" [drivers/gpu/drm/sun4i/sun8i-drm-hdmi.ko] undefined! ERROR: "sun8i_tcon_top_of_table" [drivers/gpu/drm/sun4i/sun4i-drm.ko] undefined! This solves the problem by adding a silent symbol for the tcon_top module, building it as a separate module in exactly the cases that we need it, but in a way that it is reachable by the other modules. Fixes: 57e23de02f48 ("drm/sun4i: DW HDMI: Expand algorithm for possible crtcs") Fixes: ef0cf6441fbb ("drm/sun4i: Add support for traversing graph with TCON TOP") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180711144403.1022829-1-arnd@arndb.de
2018-07-12drm/sun4i: mixer: Read id from DTJernej Skrabec
Currently, TCON supports 2 ways to match TCON with engine (mixer in this case). Old way is to just traverse of graph backwards and compare node pointer. New way is to match TCON and engine by their respective ids. All SoCs with DE2 enabled till now used the old way, which means mixer id was never used and thus never implemented. However, for R40, only the new way will be used. To prepare for that, implement mixer id fetching from DT. Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180711112706.30222-1-jernej.skrabec@siol.net
2018-07-12drm/sun4i: DW HDMI: Make symbol sun8i_dw_hdmi_pltfm_driver staticWei Yongjun
Fixes the following sparse warning: drivers/gpu/drm/sun4i/sun8i_dw_hdmi.c:228:24: warning: symbol 'sun8i_dw_hdmi_pltfm_driver' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/1531315367-190647-1-git-send-email-weiyongjun1@huawei.com
2018-07-12Merge tag 'gvt-next-2018-07-11' of https://github.com/intel/gvt-linux into ↵Rodrigo Vivi
drm-intel-next-queued gvt-next-2018-07-11 - vGPU huge page support (Changbin) - BXT display irq warning fix (Colin) - Handle GVT dependency well (Henry) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180711023353.GU1267@zhen-hp.sh.intel.com
2018-07-12Merge branch 'ieee802154-for-davem-2018-07-11' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2018-07-11 An update from ieee802154 for your *net* tree. Build system fix for a missing include from Arnd Bergmann. Setting the IFLA_LINK for the lowpan parent from Lubomir Rintel. Fixes for some RX corner cases in adf7242 driver by Michael Hennerich. And some small patches to cleanup our BUG_ON vs WARN_ON usage. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12ALSA: hda/ca0132: Update a pci quirk device nameAlastair Bridgewater
The PCI subsystem in question for this quirk rule has been identified as a Gigabyte GA-Z170X-Gaming 7 motherboard. Set the device name appropriately. Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com> Reviewed-by: Connor McAdams <conmanx360@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-12ALSA: hda/ca0132: Add Recon3Di quirk for Gigabyte G1.Sniper Z97Alastair Bridgewater
These motherboards have Sound Core3D and apparently "support" Recon3Di. Added to the quirk list as QUIRK_R3DI. Issue report, PCI Subsystem ID, and testing by a contributor on IRC who wished to remain anonymous. Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com> Reviewed-by: Connor McAdams <conmanx360@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-12Merge tag 'gvt-fixes-2018-07-11' of https://github.com/intel/gvt-linux into ↵Rodrigo Vivi
drm-intel-fixes gvt-fixes-2018-07-11 - Fix KBL virtual register update from LRI for GPU hang (Henry) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180711024056.GV1267@zhen-hp.sh.intel.com
2018-07-12qed: fix spelling mistake "successffuly" -> "successfully"Ewan D. Milne
Trivial fix to spelling mistake in qed_probe message. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12drm/vkms: Add vblank events simulated by hrtimersRodrigo Siqueira
This commit adds regular vblank events simulated through hrtimers, which is a feature required by VKMS to mimic real hardware. Additionally, all the vblank event send after pageflip is kept in the atomic_flush function. Changes since V1: - Compute the vblank timer interval per interruption Ville Syrjälä and Daniel Vetter: - Removes hardcoded vblank interval to get it from user space Changes since V2: Chris Wilson - Removes unnecessary algorithm to compute the next period Daniel Vetter: - Uses drm_calc_timestamping_constants to get the vblank interval instead of calculating it manually - Adds disable_vblank helper that turns of crtc - Simplifies implementation by using drm_crtc_arm_vblank_event - Replaces the code in atomic_begin to atomic_flush - Removes unnecessary field in vkms_output Changes since V3: Daniel Vetter: - Squash "drm/vkms: Add atomic helpers functions" into the commit that handling vblank events simulated by hrtimers Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/7709bba40782ec06332d57fff337797b272581fc.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12drm/vkms: Add connectors helpersRodrigo Siqueira
This patch adds the struct drm_connector_helper_funcs with some necessary hooks. Additionally, it also adds some missing hooks at drm_connector_funcs. Changes since V1: - None Change since V2: Daniel Vetter: - Remove vkms_conn_mode_valid Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/c8ee28b889234e866ef18bce4216385661c48041.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12drm: gma500: Changed __attribute__((packed)) to __packedEames Trinh
Signed-off-by: Eames Trinh <eamestrinh@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180710130021.4499-1-eamestrinh@gmail.com
2018-07-12drm/vkms: Add dumb operationsRodrigo Siqueira
VKMS currently does not handle dumb data, and as a consequence, it does not provide mechanisms for handling gem. This commit adds the necessary support for gem object/handler and the dumb functions. Changes since V1: Daniel Vetter: - Add dumb buffer support to the same patchset Changes since V2: Haneen: - Add missing gem_free_object_unlocked callback to fix the warning "Memory manager not clean during takedown" Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/70b7becc91c6a323dbc15cb5fc912cbdfe4ef7d9.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12nvme-pci: fix memory leak on probe failureKeith Busch
The nvme driver specific structures need to be initialized prior to enabling the generic controller so we can unwind on failure with out using the reference counting callbacks so that 'probe' and 'remove' can be symmetric. The newly added iod_mempool is the only resource that was being allocated out of order, and a failure there would leak the generic controller memory. This patch just moves that allocation above the controller initialization. Fixes: 943e942e6266f ("nvme-pci: limit max IO size and segments to avoid high order allocations") Reported-by: Weiping Zhang <zwp10758@gmail.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-11sfp: fix module initialisation with netdev already upRussell King
It was been observed that with a particular order of initialisation, the netdev can be up, but the SFP module still has its TX_DISABLE signal asserted. This occurs when the network device brought up before the SFP kernel module has been inserted by userspace. This occurs because sfp-bus layer does not hear about the change in network device state, and so assumes that it is still down. Set netdev->sfp when the upstream is registered to work around this problem. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11sfp: ensure we clean up properly on bus registration failureRussell King
We fail to correctly clean up after a bus registration failure, which can lead to an incorrect assumption about the registration state of the upstream or sfp cage. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11drm/amdkfd: Clean up reference of radeonYong Zhao
Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Replace mqd with mqd_mgr as the variable name for mqd_managerYong Zhao
This will make reading code much easier. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Use module parameters noretry as the internal variable nameYong Zhao
This makes all module parameters use the same form. Meanwhile clean up the surrounding code. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Introduce KFD module parameter halt_if_hws_hangYong Zhao
This avoids triggering a GPU reset or otherwise changing the HW state. Instead KFD will hang, which allows HW debugging tools to analyze the problem. Signed-off-by: Yong Zhao <yong.zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Add debugfs interface to trigger HWS hangShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Avoid destroy hqd when GPU is on resetShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Avoid invalidate tlbs when gpu is on resetShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculationShaoyun Liu
The bitmap index calculation should reverse the logic used on allocation so it will clear the same bit used on allocation Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Check NULL pointer for job before reset job's ringShaoyun Liu
job could be NULL when amdgpu_device_gpu_recover is called Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Don't use shadow BO for compute contextShaoyun Liu
Compute contexts cannot keep going after a GPU reset. Currently the process must terminate. In the future a process may be able recreate its context from scratch. Either way, there is no need to restore the GPUVM page table from shadow BOs. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Implement hang detection in KFD and call amdgpuShaoyun Liu
The reset will be performed in a new hw_exception work thread to handle HWS hang without blocking the thread that detected the hang. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Enable the gpu reset from KFDShaoyun Liu
Hook up the gpu_recover callback from KFD to amdgpu to enable handling of GPU hangs detected by KFD. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Implement GPU reset handlers in KFDShaoyun Liu
Lock KFD and evict existing queues on reset. Notify user mode by signaling hw_exception events. Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: Call KFD reset handlers during GPU resetShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Add gpu reset interface and place holderShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amd: Add kfd ioctl defines for hw_exception eventShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amd: Add gpu reset interfaces between amdgpu and amdkfdShaoyun Liu
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: fix zero reading of VMID and PASID for HawaiiLan Xiao
Upon VM Fault, the VMID and PASID written by HW are zeros in Hawaii. Instead of reading from ih_ring_entry, read directly from the registers. This workaround fix the soft hang issues caused by mishandled VM Fault in Hawaii. Signed-off-by: Lan Xiao <Lan.Xiao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Handle VM faults in KFDshaoyunl
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per per-vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring 2. amdkfd unmaps all queues from the faulting process and create new run-list without the guilty process 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdgpu: save vm fault information for amdkfdshaoyunl
amdgpu save the vm fault related information for KFD usage and keep the copy until KFD read it. Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORYMoses Reuben
Signed-off-by: Moses Reuben <moses.reuben@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Fix error codes in kfd_get_processWei Lu
Return ERR_PTR(-EINVAL) if kfd_get_process fails to find the process. This fixes kernel oopses when a child process calls KFD ioctls with a file descriptor inherited from the parent process. Signed-off-by: Wei Lu <wei.lu2@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Fix race between scheduler and context restoreJay Cornwall
The scheduler may raise SQ_WAVE_STATUS.SPI_PRIO via SQ_CMD before context restore has completed. Restoring SPI_PRIO=0 after this point may cause context save to fail as the lower priority wavefronts are not selected for execution among spin-waiting wavefronts. Leave SPI_PRIO at its SPI-initialized or scheduler-raised value. v2: Also fix race with exception handler Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Stop using GFP_NOIO explicitlyFelix Kuehling
This is no longer needed with the memalloc_nofs_save/restore in dqm_lock/unlock. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11drm/amdkfd: Reliably prevent reclaim-FS while holding DQM lockFelix Kuehling
This is needed to prevent deadlocks when MMU notifiers run in reclaim-FS context and take the DQM lock for userptr evictions. Previously this was done by making all memory allocations under DQM locks GFP_NOIO. This is error prone. Using memalloc_nofs_save/restore will reliably affect all memory allocations anywhere in the kernel while the DQM lock is held. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-07-11bpf: fix panic due to oob in bpf_prog_test_run_skbDaniel Borkmann
sykzaller triggered several panics similar to the below: [...] [ 248.851531] BUG: KASAN: use-after-free in _copy_to_user+0x5c/0x90 [ 248.857656] Read of size 985 at addr ffff8808017ffff2 by task a.out/1425 [...] [ 248.865902] CPU: 1 PID: 1425 Comm: a.out Not tainted 4.18.0-rc4+ #13 [ 248.865903] Hardware name: Supermicro SYS-5039MS-H12TRF/X11SSE-F, BIOS 2.1a 03/08/2018 [ 248.865905] Call Trace: [ 248.865910] dump_stack+0xd6/0x185 [ 248.865911] ? show_regs_print_info+0xb/0xb [ 248.865913] ? printk+0x9c/0xc3 [ 248.865915] ? kmsg_dump_rewind_nolock+0xe4/0xe4 [ 248.865919] print_address_description+0x6f/0x270 [ 248.865920] kasan_report+0x25b/0x380 [ 248.865922] ? _copy_to_user+0x5c/0x90 [ 248.865924] check_memory_region+0x137/0x190 [ 248.865925] kasan_check_read+0x11/0x20 [ 248.865927] _copy_to_user+0x5c/0x90 [ 248.865930] bpf_test_finish.isra.8+0x4f/0xc0 [ 248.865932] bpf_prog_test_run_skb+0x6a0/0xba0 [...] After scrubbing the BPF prog a bit from the noise, turns out it called bpf_skb_change_head() for the lwt_xmit prog with headroom of 2. Nothing wrong in that, however, this was run with repeat >> 0 in bpf_prog_test_run_skb() and the same skb thus keeps changing until the pskb_expand_head() called from skb_cow() keeps bailing out in atomic alloc context with -ENOMEM. So upon return we'll basically have 0 headroom left yet blindly do the __skb_push() of 14 bytes and keep copying data from there in bpf_test_finish() out of bounds. Fix to check if we have enough headroom and if pskb_expand_head() fails, bail out with error. Another bug independent of this fix (but related in triggering above) is that BPF_PROG_TEST_RUN should be reworked to reset the skb/xdp buffer to it's original state from input as otherwise repeating the same test in a loop won't work for benchmarking when underlying input buffer is getting changed by the prog each time and reused for the next run leading to unexpected results. Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command") Reported-by: syzbot+709412e651e55ed96498@syzkaller.appspotmail.com Reported-by: syzbot+54f39d6ab58f39720a55@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-11ARM: 8780/1: ftrace: Only set kernel memory back to read-only after bootSteven Rostedt (VMware)
Dynamic ftrace requires modifying the code segments that are usually set to read-only. To do this, a per arch function is called both before and after the ftrace modifications are performed. The "before" function will set kernel code text to read-write to allow for ftrace to make the modifications, and the "after" function will set the kernel code text back to "read-only" to keep the kernel code text protected. The issue happens when dynamic ftrace is tested at boot up. The test is done before the kernel code text has been set to read-only. But the "before" and "after" calls are still performed. The "after" call will change the kernel code text to read-only prematurely, and other boot code that expects this code to be read-write will fail. The solution is to add a variable that is set when the kernel code text is expected to be converted to read-only, and make the ftrace "before" and "after" calls do nothing if that variable is not yet set. This is similar to the x86 solution from commit 162396309745 ("ftrace, x86: make kernel text writable only for conversions"). Link: http://lkml.kernel.org/r/20180620212906.24b7b66e@vmware.local.home Reported-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-07-11bpf: btf: Fix bitfield extraction for big endianOkash Khawaja
When extracting bitfield from a number, btf_int_bits_seq_show() builds a mask and accesses least significant byte of the number in a way specific to little-endian. This patch fixes that by checking endianness of the machine and then shifting left and right the unneeded bits. Thanks to Martin Lau for the help in navigating potential pitfalls when dealing with endianess and for the final solution. Fixes: b00b8daec828 ("bpf: btf: Add pretty print capability for data with BTF type info") Signed-off-by: Okash Khawaja <osk@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-11bpf: fix availability probing for seg6 helpersMathieu Xhonneux
bpf_lwt_seg6_* helpers require CONFIG_IPV6_SEG6_BPF, and currently return -EOPNOTSUPP to indicate unavailability. This patch forces the BPF verifier to reject programs using these helpers when !CONFIG_IPV6_SEG6_BPF, allowing users to more easily probe if they are available or not. Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>