linux.git - Linus' kernel tree

Age	Commit message	Author
2018-06-29	Merge tag 'ceph-for-4.18-rc3' of git://github.com/ceph/ceph-client	Linus Torvalds
	Pull ceph fix from Ilya Dryomov: "A trivial dentry leak fix from Zheng" * tag 'ceph-for-4.18-rc3' of git://github.com/ceph/ceph-client: ceph: fix dentry leak in splice_dentry()
2018-06-29	parisc: Build kernel without -ffunction-sections	Helge Deller
	As suggested by Nick Piggin, it seems we can drop the -ffunction-sections compile flag now that the kernel uses thin archives. Testing with 32- and 64-bit kernels showed no difference in kernel size. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
2018-06-29	sg: remove ->sg_magic member	Jens Axboe
	This was introduced more than a decade ago when sg chaining was added, but we never really caught anything with it. The scatterlist entry size can be critical, since drivers allocate it, so remove the magic member. Recently it's been triggering allocation stalls and failures in NVMe. Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-29	Merge tag 'pci-v4.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci	Linus Torvalds
	Pull PCI fixes from Bjorn Helgaas: - Fix crash caused by endpoint library initialization order change (Alan Douglas) - Fix shpchp NULL pointer dereference regression on non-ACPI platforms (Bjorn Helgaas) - Move PCI_DOMAINS selection to fix build regression (Lorenzo Pieralisi) * tag 'pci-v4.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: controller: Move PCI_DOMAINS selection to arch Kconfig PCI: Initialize endpoint library before controllers PCI: shpchp: Manage SHPC unconditionally on non-ACPI systems
2018-06-29	Merge tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm	Linus Torvalds
	Pull power management fixes from Rafael Wysocki: "These fix up recently added features (the Kryo cpufreq driver and performance states coverage in the generic power domains framework), add missing documentation for a recently added sysfs knob in the intel_pstate driver and fix an error in its documentation. Specifics: - Fix the initialization time error handling in the recently added Kryo cpufreq driver (Dan Carpenter). - Fix up the recently added coverage of performance states in the generic power domains (genpd) framework (Viresh Kumar). - Add missing documentation of the new hwp_dynamic_boost sysfs knob in the intel_pstate driver (Rafael Wysocki). - Fix incorrect sysfs path in the intel_pstate driver documentation (Rafael Wysocki)" * tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Documentation: intel_pstate: Describe hwp_dynamic_boost sysfs knob Documentation: admin-guide: intel_pstate: Fix sysfs path PM / Domains: Rename opp_node to np PM / Domains: Fix return value of of_genpd_opp_to_performance_state() cpufreq: qcom-kryo: Fix error handling in probe()
2018-06-29	Merge tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Nothing too major this round: - small set of mali-dp fixes - single meson fix - a bunch of amdgpu fixes (one makes non-4k page sizes not be a bad experience)" * tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: release spinlock before committing updates to stream drm/amdgpu:Support new VCN FW version naming convention drm/amdgpu: fix UBSAN: Undefined behaviour for amdgpu_fence.c drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()' drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping drm/amdgpu: Count disabled CRTCs in commit tail earlier drm/mali-dp: Rectify the width and height passed to rotmem_required() drm/arm/malidp: Preserve LAYER_FORMAT contents when setting format drm: mali-dp: Enable Global SE interrupts mask for DP500 drm/arm/malidp: Ensure that the crtcs are shutdown before removing any encoder/connector
2018-06-29	Merge tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm	Linus Torvalds
	Pull device mapper fixes from Mike Snitzer: - Fix dm core to use more efficient bio_split() instead of bio_clone_bioset(). Also fixes splitting bio that has integrity payload. - Three fixes related to properly validating DAX capabilities of a stacked DM device that will advertise DAX support. - Update DM writecache target to use 2-factor allocator arguments. Kees says this is the last related change for 4.18. - Fix DM zoned target to use GFP_NOIO to avoid triggering reclaim during IO submission (caught by lockdep). - Fix DM thinp to gracefully recover from running out of data space while a previous async discard completes (thereby freeing space). - Fix DM thinp's metadata transaction commit to avoid needless work. * tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: prevent DAX mounts if not supported dax: check for QUEUE_FLAG_DAX in bdev_dax_supported() pmem: only set QUEUE_FLAG_DAX for fsdax mode dm thin: handle running out of data space vs concurrent discard dm raid: don't use 'const' in function return dm zoned: avoid triggering reclaim from inside dmz_map() dm writecache: use 2-factor allocator arguments dm thin metadata: remove needless work from __commit_transaction dm: use bio_split() when splitting out the already processed bio
2018-06-29	Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus	Jens Axboe
	Pull single NVMe fix from Christoph. * 'nvme-4.18' of git://git.infradead.org/nvme: nvme-rdma: fix possible double free of controller async event buffer
2018-06-29	drbd: Fix drbd_request_prepare() discard handling	Bart Van Assche
	Fix the test that verifies whether bio_op(bio) represents a discard or write zeroes operation. Compile-tested only. Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Fixes: 7435e9018f91 ("drbd: zero-out partial unaligned discards on local backend") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-29	blk-mq: don't queue more if we get a busy return	Jens Axboe
	Some devices have different queue limits depending on the type of IO. A classic case is SATA NCQ, where some commands can queue, but others cannot. If we have NCQ commands inflight and encounter a non-queueable command, the driver returns busy. Currently we attempt to dispatch more from the scheduler if we were able to queue some commands. But for the case where we ended up stopping due to BUSY, we should not attempt to retrieve more from the scheduler. If we do, we can get into a situation where we attempt to queue a non-queueable command, get BUSY, then successfully retrieve more commands from that scheduler and queue those. This can repeat forever, starving the non-queueable command indefinitely. Fix this by NOT attempting to pull more commands from the scheduler if we get a BUSY return. This should also be more optimal in terms of letting requests stay in the scheduler for as long as possible, if we get a BUSY due to the regular out-of-tags condition. Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
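The dispatch rule described in this commit can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `dispatch`, `ToyDevice`, and the request dictionaries are hypothetical names invented for illustration, and the device model is deliberately simplistic (a non-queueable request is rejected whenever queueable requests are in flight).

```python
from collections import deque

OK = "ok"
BUSY = "busy"


class ToyDevice:
    """Hypothetical device: rejects non-queueable requests while others are in flight."""

    def __init__(self):
        self.inflight = 0

    def queue_rq(self, rq):
        if rq["queueable"]:
            self.inflight += 1
            return OK
        # A non-queueable command needs the device to itself.
        return BUSY if self.inflight else OK

    def complete_all(self):
        self.inflight = 0


def dispatch(scheduler, queue_rq):
    """Pull requests from the scheduler until it is empty or the device is BUSY.

    On BUSY the rejected request goes back to the head of the scheduler queue
    and nothing further is pulled, so later requests cannot overtake it.
    """
    dispatched = []
    while scheduler:
        rq = scheduler.popleft()
        if queue_rq(rq) == BUSY:
            scheduler.appendleft(rq)
            break  # do not fetch more work after a BUSY return
        dispatched.append(rq)
    return dispatched
```

With a queue of `read1`, `flush` (non-queueable), `read2`, the first call dispatches only `read1`; `read2` stays behind `flush` in the scheduler until the device drains, so the non-queueable request is not starved.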
2018-06-29	aio: mark __aio_sigset::sigmask const	Avi Kivity
	io_pgetevents() will not change the signal mask. Mark it const to make it clear and to reduce the need for casts in user code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Avi Kivity <avi@scylladb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [hch: reapply the patch that got incorrectly reverted] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-29	net: handle NULL ->poll gracefully	Christoph Hellwig
	The big aio poll revert broke various network protocols that don't implement ->poll, as a patch in the aio poll series removed sock_no_poll and made the common code handle this case. Reported-by: syzbot+57727883dbad76db2ef0@syzkaller.appspotmail.com Reported-by: syzbot+cdb0d3176b53d35ad454@syzkaller.appspotmail.com Reported-by: syzbot+2c7e8f74f8b2571c87e8@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Fixes: a11e1d432b51 ("Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-29	Merge tag 'ib-fbdev-drm-v4.19-deferred-console-takeover-fixup' of https://github.com/bzolnier/linux into drm-misc-next	Gustavo Padovan
	Immutable branch between fbdev and drm for the v4.19 merge window (contains build fixup for the deferred console takeover feature) Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com> # gpg: Signature made Fri 29 Jun 2018 06:47:23 AM -03 # gpg: using RSA key 7E33B63FA047C20B # gpg: Can't check signature: public key not found Link: https://patchwork.freedesktop.org/patch/msgid/3340294.YySDL1Tsl7@amdc3058
2018-06-29	console: dummycon: export dummycon_[un]register_output_notifier	Hans de Goede
	Export dummycon_[un]register_output_notifier; the fbcon code needs this and may be built as a module. Fixes: 83d83bebf401 ("console/fbcon: Add support for deferred console takeover") Cc: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2018-06-29	drm/exynos: ipp: use correct enum type	Stefan Agner
	The limit_id_fallback array uses enum drm_ipp_size_id to index its content. The content itself is of type enum drm_exynos_ipp_limit_type. Cc: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: decon5433: Fix WINCONx reset value	Marek Szyprowski
	The only bit that should be preserved in decon_win_set_fmt() is WINCONx_ENWIN_F. All other bits depend on the selected pixel format and are set by the mentioned function. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: decon5433: Fix per-plane global alpha for XRGB modes	Marek Szyprowski
	Set per-plane global alpha to maximum value to get proper blending of XRGB and ARGB planes. This fixes the strange order of overlapping planes. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: fimc: Use real buffer width for configuring the hardware	Marek Szyprowski
	DMA hardware should respect buffer pitch, so use the width calculated from the buffer pitch instead of the virtual one. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
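The distinction between the virtual width and the width implied by the pitch can be shown with a small calculation. This is a sketch under stated assumptions: `real_width` is a hypothetical helper, and it assumes a single-plane buffer whose pitch (bytes per row) is an exact multiple of the pixel size.

```python
def real_width(pitch_bytes, bytes_per_pixel):
    """Return the row length in pixels implied by the buffer pitch.

    Assumes one plane and an integral number of pixels per row; both are
    simplifying assumptions made for this illustration.
    """
    if bytes_per_pixel <= 0:
        raise ValueError("bytes_per_pixel must be positive")
    if pitch_bytes % bytes_per_pixel:
        raise ValueError("pitch is not a whole number of pixels")
    return pitch_bytes // bytes_per_pixel
```

For example, a 1920-pixel-wide 32-bit image stored with an 8192-byte pitch has a real width of `real_width(8192, 4)`, that is 2048 pixels, and it is this value that a DMA engine stepping from row to row has to be given.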
2018-06-29	drm/exynos: gsc: Fix support for NV16/61, YUV420/YVU420 and YUV422 modes	Marek Szyprowski
	Fix following issues related to planar YUV pixel format configuration: - NV16/61 modes were incorrectly programmed as NV12/21, - YVU420 was programmed as YUV420 on source, - YVU420 and YUV422 were programmed as YUV420 on output. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: gsc: Fix DRM_MODE_REFLECT_{X,Y} interpretation	Marek Szyprowski
	Horizontal (DRM_MODE_REFLECT_X) and vertical (DRM_MODE_REFLECT_Y) flips were swapped in the GScaler driver. Fix this by swapping the code that interprets them. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: gsc: Increase Exynos5433 buffer width alignment to 16 pixels	Marek Szyprowski
	Investigation revealed that GScaler hardware requires the real buffer width (pitch) to be aligned to 16 pixels. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
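A 16-pixel alignment requirement of this kind is typically expressed as a modulus check, with rounding up when a buffer is being sized. The helpers below are a hypothetical sketch, not the driver's code; only the value 16 comes from the commit message.

```python
GSC_WIDTH_ALIGN = 16  # pixels, as stated in the commit message


def is_aligned(value, alignment=GSC_WIDTH_ALIGN):
    """True if value is a multiple of alignment."""
    return value % alignment == 0


def align_up(value, alignment=GSC_WIDTH_ALIGN):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment
```

A real width of 1920 pixels passes the check unchanged, while 1366 does not and would have to be padded to `align_up(1366)`, that is 1376 pixels.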
2018-06-29	drm/exynos: gsc: Use real buffer width for configuring the hardware	Marek Szyprowski
	DMA hardware should respect buffer pitch, so use the width calculated from the buffer pitch instead of the virtual one. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: scaler: Fix support for YUV420, YUV422 and YUV444 modes	Marek Szyprowski
	Fix Cb/Cr component order in two-planar YUV420, YUV422 and YUV444 modes. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: scaler: Reset hardware before starting the operation	Andrzej Pietrasiewicz
	Ensure that Scaler hardware is properly reset and interrupts are cleared before processing next image. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: rotator: Fix DRM_MODE_REFLECT_{X,Y} interpretation	Marek Szyprowski
	Horizontal (DRM_MODE_REFLECT_X) and vertical (DRM_MODE_REFLECT_Y) flips were swapped in the Rotator driver. Fix this by swapping the code that interprets them. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/exynos: ipp: Rework checking for the correct buffer formats	Marek Szyprowski
	Prepare a common function for size and scale checks and call it for source and destination buffers. Then also move there the state-less checks from exynos_drm_ipp_task_setup_buffer, so the format information is already available in limits processing. Finally perform the IPP_LIMIT_BUFFER check on the real width of the buffer (the width calculated from the provided buffer pitch). Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2018-06-29	drm/i915: Remove delayed FBC activation.	Maarten Lankhorst
	The only time we should start FBC is when we have waited a vblank after the atomic update. We've already forced a vblank wait by doing wait_for_flip_done before intel_post_plane_update(), so we don't need to wait a second time before enabling. Removing the worker simplifies the code and removes possible race conditions, such as the one reported in bug 103167. Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103167 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180625163758.10871-2-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2018-06-29	drm/i915: Block enabling FBC until flips have been completed	Maarten Lankhorst
	There is a small race window in which FBC can be enabled after pre_plane_update is called, but before the page flip has been queued or completed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103167 Link: https://patchwork.freedesktop.org/patch/msgid/20180625163758.10871-1-maarten.lankhorst@linux.intel.com Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2018-06-29	Merge branch 'pm-domains'	Rafael J. Wysocki
	Merge fixups for the recent extension of the generic power domains (genpd) framework covering performance states. * pm-domains: PM / Domains: Rename opp_node to np PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
2018-06-29	i2c: gpio: initialize SCL to HIGH again	Wolfram Sang
	It seems that during the conversion from gpio* to gpiod*, the initial state of SCL was wrongly switched to LOW. Fix it to be HIGH again. Fixes: 7bb75029ef34 ("i2c: gpio: Enforce open drain through gpiolib") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-06-29	i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers	Peter Rosin
	If DMA safe memory was allocated but the subsequent I2C transfer fails, the memory is leaked. Plug this leak. Fixes: 8a77821e74d6 ("i2c: smbus: use DMA safe buffers for emulated SMBus transactions") Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-06-29	i2c: algos: bit: mention our experience about initial states	Wolfram Sang
	So, if somebody wants to re-implement this in the future, we pinpoint to a problem case. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-06-29	Revert "i2c: algo-bit: init the bus to a known state"	Wolfram Sang
	This reverts commit 3e5f06bed72fe72166a6778f630241a893f67799. As per bugzilla #200045, this caused a regression. I don't really see a way to fix it without having the hardware. So, revert the patch and I will fix the issue I was seeing originally in the i2c-gpio driver itself. I couldn't find new users of this algorithm since, so there should be no one depending on the new behaviour. Reported-by: Sergey Larin <cerg2010cerg2010@mail.ru> Fixes: 3e5f06bed72f ("i2c: algo-bit: init the bus to a known state") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Sergey Larin <cerg2010cerg2010@mail.ru> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-06-28	selinux: move user accesses in selinuxfs out of locked regions	Jann Horn
	If a user is accessing a file in selinuxfs with a pointer to a userspace buffer that is backed by e.g. a userfaultfd, the userspace access can stall indefinitely, which can block fsi->mutex if it is held. For sel_read_policy(), remove the locking, since this method doesn't seem to access anything that requires locking. For sel_read_bool(), move the user access below the locked region. For sel_write_bool() and sel_commit_bools_write(), move the user access up above the locked region. Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: removed an unused variable in sel_read_policy()] Signed-off-by: Paul Moore <paul@paul-moore.com>
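The reordering described here (touch user memory only while no lock is held) can be sketched with an ordinary mutex. This is a toy Python model, not selinuxfs code: `BoolStore`, `read_user`, and `write_user` are hypothetical stand-ins for the kernel state and for the copy-from-user and copy-to-user steps, which are the parts that may stall indefinitely.

```python
import threading


class BoolStore:
    """Toy stand-in for lock-protected state that user requests read and write."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0

    def write_bool(self, read_user):
        # Fetch the user's data first: this step may block for a long time,
        # so it runs before the lock is taken.
        new_value = int(read_user())
        with self.lock:
            self.value = new_value

    def read_bool(self, write_user):
        # Snapshot under the lock, then release it before touching user memory.
        with self.lock:
            snapshot = self.value
        write_user(str(snapshot))
```

Because the lock only ever covers the short in-memory update, a caller whose buffer access stalls can no longer hold up every other user of the same lock.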
2018-06-28	Merge tag 'ib-fbdev-drm-v4.19-deferred-console-takeover' of https://github.com/bzolnier/linux into drm-misc-next	Gustavo Padovan
	Immutable branch between fbdev and drm for the v4.19 merge window (contains the deferred console takeover feature) Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com> # gpg: Signature made Thu 28 Jun 2018 10:24:50 AM -03 # gpg: using RSA key 7E33B63FA047C20B # gpg: Can't check signature: public key not found # Conflicts: # drivers/gpu/drm/i915/i915_gem.c # drivers/gpu/drm/i915/intel_crt.c # drivers/gpu/drm/i915/intel_display.c # drivers/gpu/drm/i915/intel_lrc.c Link: https://patchwork.freedesktop.org/patch/msgid/2462549.rLSfW9kX99@amdc3058
2018-06-28	drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)	Chris Wilson
	Back in commit 27af5eea54d1 ("drm/i915: Move execlists irq handler to a bottom half"), we came to the conclusion that running our CSB processing and ELSP submission from inside the irq handler was a bad idea. A really bad idea as we could impose nearly 1s latency on other users of the system, on average! Deferring our work to a tasklet allowed us to do the processing with irqs enabled, reducing the impact to an average of about 50us. We have since eradicated the use of forcewaked mmio from inside the CSB processing and ELSP submission, bringing the impact down to around 5us (on Kabylake); an order of magnitude better than our measurements 2 years ago on Broadwell and only about 2x worse on average than the gem_syslatency on an unladen system. In this iteration of the tasklet-vs-direct submission debate, we seek a compromise whereby we submit new requests immediately to the HW but defer processing the CS interrupt onto a tasklet. We gain the advantage of low-latency and ksoftirqd avoidance when waking up the HW, while avoiding the system-wide starvation of our CS irq-storms. Comparing the impact on the maximum latency observed (that is the time stolen from an RT process) over a 120s interval, repeated several times (using gem_syslatency, similar to RT's cyclictest) while the system is fully laden with i915 nops, we see that direct submission can actually improve the worst case.
	Maximum latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 2). x: Always using tasklets (a couple of >1000us outliers removed). +: Only using tasklets from CS irq, direct submission of requests. [ASCII histogram omitted]
	    N    Min    Max    Median    Avg          Stddev
	x   118  91     186    124       125.28814    16.279137
	+   120  92     187    109       112.00833    13.458617
	Difference at 95.0% confidence: -13.2798 +/- 3.79219, -10.5994% +/- 3.02677% (Student's t, pooled s = 14.9237)
	However the mean latency is adversely affected:
	Mean latency in microseconds of a third party RT thread (gem_syslatency -t 120 -f 1). x: Always using tasklets. +: Only using tasklets from CS irq, direct submission of requests. [ASCII histogram omitted]
	    N    Min    Max    Median    Avg          Stddev
	x   120  3.506  3.727  3.631     3.6321417    0.02773109
	+   120  3.834  4.149  4.039     4.0375167    0.041221676
	Difference at 95.0% confidence: 0.405375 +/- 0.00888913, 11.1608% +/- 0.244735% (Student's t, pooled s = 0.03513)
	However, since the mean latency corresponds to the amount of irqsoff processing we have to do for a CS interrupt, we only need to speed that up to benefit not just system latency but our own throughput. v2: Remember to defer submissions when under reset. v4: Only use direct submission for new requests v5: Be aware that with mixing direct tasklet evaluation and deferred tasklets, we may end up idling before running the deferred tasklet. v6: Remove the redundant likely() from tasklet_is_enabled(), restrict the annotation to reset_in_progress(). v7: Take the full timeline.lock when enabling perf_pmu stats as the tasklet is no longer a valid guard. A consequence is that the stats are now only valid for engines also using the timeline.lock to process state. Testcase: igt/gem_exec_latency/rthog* References: 27af5eea54d1 ("drm/i915: Move execlists irq handler to a bottom half") Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-9-chris@chris-wilson.co.uk
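The summary figures quoted in this commit message (the percentage difference and the pooled standard deviation) follow from the per-sample N, Avg, and Stddev values. The sketch below recomputes them with the textbook formulas; the function names are hypothetical, and it does not reproduce the confidence intervals, which also need a Student's t quantile.

```python
import math


def pooled_stddev(n1, s1, n2, s2):
    """Pooled standard deviation of two samples with sizes n1, n2 and stddevs s1, s2."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def relative_change(baseline_mean, new_mean):
    """Change of new_mean relative to baseline_mean, in percent."""
    return (new_mean - baseline_mean) / baseline_mean * 100.0


# Maximum latency: x = always tasklets, + = direct submission
max_pooled = pooled_stddev(118, 16.279137, 120, 13.458617)
max_change = relative_change(125.28814, 112.00833)

# Mean latency, same two configurations
mean_pooled = pooled_stddev(120, 0.02773109, 120, 0.041221676)
mean_change = relative_change(3.6321417, 4.0375167)
```

Evaluating these gives a pooled s of about 14.92 and a change of about -10.6% for the maximum latency, and a pooled s of about 0.0351 and a change of about +11.2% for the mean latency, in line with the quoted values.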
2018-06-28	drm/i915/execlists: Trust the CSB	Chris Wilson
	Now that we use the CSB stored in the CPU friendly HWSP, we do not need to track interrupts for when the mmio CSB registers are valid and can just check where we read up to last from the cached HWSP. This means we can forgo the atomic bit tracking from interrupt, and in the next patch it means we can check the CSB at any time. v2: Change the splitting inside reset_prepare, we only want to lose testing the interrupt in this patch, the next patch requires the change in locking Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-8-chris@chris-wilson.co.uk
2018-06-28	drm/i915/execlists: Stop storing the CSB read pointer in the mmio register	Chris Wilson
	As we now never read back our current head position from the CSB pointers register, and the HW itself doesn't use it to prevent overwriting unread CSB entries, we do not need to keep updating the register. As it turns out this register is not listed as being shadowed, and so requires forcewake -- but we haven't been taking forcewake around it so the writes have probably been regularly dropped. Fortuitously, we only read the value after a reset where it did not matter, and zero was the right answer (well, close enough). Mika pointed out that this was how we used to do it (accidentally!) before he fixed it in commit cc53699b25b5 ("drm/i915: Use masked write for Context Status Buffer Pointer"). References: cc53699b25b5 ("drm/i915: Use masked write for Context Status Buffer Pointer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-7-chris@chris-wilson.co.uk
2018-06-28	drm/i915/execlists: Reset CSB write pointer after reset	Chris Wilson
	On HW reset, the HW clears the write pointer (to 0). But since it also writes its first CSB entry to slot 0, we need to reset the write pointer back to the element before (so the first entry we read is 0). This is required for the next patch, where we trust the CSB completely! v2: Use _MASKED_FIELD v3: Store the reset value, so that we differentiate between mmio/hwsp transparently and without pretense. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-6-chris@chris-wilson.co.uk
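The pointer arithmetic behind this commit can be modelled with a small ring buffer. This is a toy sketch, not the i915 implementation: `CsbRing` and the ring size of 6 are assumptions made for illustration. The writer advances its pointer before storing an entry, so after a reset both pointers are parked on the last slot to make slot 0 the first entry consumed.

```python
CSB_ENTRIES = 6  # assumed ring size, for illustration only


class CsbRing:
    """Toy model of a status ring whose writer pre-increments its pointer."""

    def __init__(self, entries=CSB_ENTRIES):
        self.entries = entries
        self.slots = [None] * entries
        self.reset()

    def reset(self):
        # Park both pointers on the element before slot 0, so the first
        # post-reset entry (written to slot 0) is the first one read.
        self.read = self.entries - 1
        self.write = self.entries - 1

    def hw_write(self, event):
        self.write = (self.write + 1) % self.entries
        self.slots[self.write] = event

    def drain(self):
        """Consume every entry between the read and write pointers."""
        events = []
        while self.read != self.write:
            self.read = (self.read + 1) % self.entries
            events.append((self.read, self.slots[self.read]))
        return events
```

Had the pointers been reset to 0 instead, `drain()` would see `read == write` after the first post-reset write to slot 0 and skip that entry.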
2018-06-28	drm/i915/execlists: Unify CSB access pointers	Chris Wilson
	Following the removal of the last workarounds, the only CSB mmio access is for the old vGPU interface. The mmio registers presented by vGPU do not require forcewake and can be treated as ordinary volatile memory, i.e. they behave just like the HWSP access, only at a different location. We can reduce the CSB access to a set of read/write/buffer pointers and treat the various paths identically and not worry about forcewake. (Forcewake is a nightmare for worst-case latency, and we want to process this all with irqsoff -- no latency allowed!) v2: Comments, comments, comments. Well, 2 bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-5-chris@chris-wilson.co.uk
2018-06-28	drm/i915/execlists: Process one CSB update at a time	Chris Wilson
	In the next patch, we will process the CSB events directly from the submission path, rather than only after a CS interrupt. Hence, we will no longer have the need for a loop until the has-interrupt bit is clear, and in the meantime can remove that small optimisation. v2: Tvrtko pointed out it was safer to unconditionally kick the tasklet after each irq, when assuming that the tasklet is called for each irq. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-4-chris@chris-wilson.co.uk
2018-06-28	drm/i915/execlists: Pull CSB reset under the timeline.lock	Chris Wilson
	In the following patch, we will process the CSB events under the timeline.lock and not serialised by the tasklet. This also means that we will need to protect access to common variables such as execlists->csb_head with the timeline.lock during reset. v2: Move sync_irq to avoid deadlocks between taking timeline.lock from our interrupt handler. v3: Kill off the synchronize_hardirq as it raises more questions than answered; now we use the timeline.lock entirely for CSB serialisation between the irq and elsewhere, we don't need to be so heavy handed with flushing v4: Treat request cancellation (wedging after failed reset) similarly Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-3-chris@chris-wilson.co.uk
2018-06-28	drm/i915/execlists: Pull submit after dequeue under timeline lock	Chris Wilson
	In the next patch, we will begin processing the CSB from inside the submission path (underneath an irqsoff section, and even from inside interrupt handlers). This means that updating the execlists->port[] will no longer be serialised by the tasklet but needs to be locked by the engine->timeline.lock instead. Pull dequeue and submit under the same lock for protection. (An alternate future plan is to keep the in/out arrays separate for concurrent processing and reduced lock coverage.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-2-chris@chris-wilson.co.uk
2018-06-28	drm/i915: Drop posting reads to flush master interrupts	Chris Wilson
	We do not need to do a posting read of our uncached mmio write to re-enable the master interrupt lines after handling an interrupt, so don't. This saves us a slow UC read before we can process the interrupt, most noticeable in execlists where any stall imposes extra latency on GPU command execution. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-1-chris@chris-wilson.co.uk
2018-06-28	drm/i915/uc: Fetch GuC/HuC firmwares from guc/huc specific init	Michal Wajdeczko
	We're fetching GuC/HuC firmwares directly from uc level during init_early stage but this breaks guc/huc struct isolation and also strict SW-only initialization rule for init_early. Move fw fetching to init phase and do it separately per guc/huc struct. v2: don't forget to move wopcm_init - Michele v3: fetch in init_misc phase - Michal Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> #2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180628141522.62788-2-michal.wajdeczko@intel.com
2018-06-28	drm/i915/guc: Use intel_guc_init_misc to hide GuC internals	Michal Wajdeczko
	We will add more init steps to misc phase and there is no need to expose them separately for use in uc_init_misc function. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180628141522.62788-1-michal.wajdeczko@intel.com
2018-06-28	parisc: Reduce debug output in unwind code	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2018-06-29	Merge tag 'drm-misc-fixes-2018-06-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes	Dave Airlie
	drm-misc-fixes for v4.18-rc3: - A single fix in meson for an unhandled error path in meson_drv_bind_master(). Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/fa740f31-5a8d-ed45-5e8a-aecd3f6f11b7@linux.intel.com
2018-06-29	Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux into drm-fixes	Dave Airlie
	A few fixes for 4.18: - fix a read past the end of an array due to vega20 changes - fix driver on systems with non-4K pages - fix locking with pageflipping in DC that could lead to a sleep while atomic - fix VCN firmware version reporting for upcoming firmware Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628032641.2765-1-alexander.deucher@amd.com
2018-06-28	dm: prevent DAX mounts if not supported	Ross Zwisler
	Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX flag is set on the device's request queue to decide whether or not the device supports filesystem DAX. Really we should be using bdev_dax_supported() like filesystems do at mount time. This performs other tests like checking to make sure the dax_direct_access() path works. We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if any of the underlying devices do not support DAX. This makes the handling of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags in dm_table_set_restrictions(). Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this will ensure that filesystems built upon DM devices will only be able to mount with DAX if all underlying devices also support DAX. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support") Cc: stable@vger.kernel.org Acked-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
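The rule this commit enforces (a stacked device advertises DAX only if every device beneath it supports DAX) reduces to an "all of" check. The sketch below is a hypothetical Python model, not device-mapper code: `Device`, `dax_capable`, and `stacked_dax_flag` are invented names, and treating an empty table as not DAX-capable is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    dax_capable: bool  # stands in for the result of a full capability check


def stacked_dax_flag(underlying):
    """Return True only if the table is non-empty and every device supports DAX."""
    return len(underlying) > 0 and all(dev.dax_capable for dev in underlying)
```

A table built from two DAX-capable devices keeps the flag, while adding a single device without DAX support clears it for the whole stacked device, so a filesystem on top cannot be mounted with DAX.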