Age | Commit message (Collapse) | Author |
|
During IPS disabling the current 42ms timeout value leads to occasional
timeouts, increase it to 100ms which seems to get rid of the problem.
References: https://bugs.freedesktop.org/show_bug.cgi?id=107494
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107562
Reported-by: Diego Viola <diego.viola@gmail.com>
Tested-by: Diego Viola <diego.viola@gmail.com>
Cc: Diego Viola <diego.viola@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180905100005.7663-1-imre.deak@intel.com
(cherry picked from commit acb3ef0ee40ea657280a4a11d9f60eb2937c0dca)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
This patch updates license to use SPDX-License-Identifier
instead of verbose license text.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The gasket in-kernel framework, recently introduced under staging,
re-implements what is already long-time provided by the UIO
subsystem, with extra PCI BAR remapping and MSI conveniences.
Before moving it out of staging, make sure we add the new bits to
the UIO framework instead, then transform its signle client, the
Apex driver, to a proper UIO driver (uio_driver.h).
Link: https://lkml.kernel.org/r/20180828103817.GB1397@do-kernel
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 550ddadcc758 ("tty: hvc: hvc_write() may sleep") broke the
termination condition in case the driver stops accepting characters.
This can result in unnecessary polling of the busy driver.
Restore it by testing the hvc_push return code.
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit ec97eaad1383 ("tty: hvc: hvc_poll() break hv read loop")
removes get_chars batching entirely, which slows down large console
operations like paste -- virtio console "feels worse than a 9600 baud
serial line," reports Matteo.
This adds back batching in a more latency friendly way. If the caller
can sleep then we try to fill the entire flip buffer, releasing the
lock and scheduling between each iteration. If it can not sleep, then
batches are limited to 128 bytes. Matteo confirms this fixes the
performance problem.
Latency testing the powerpc OPAL console with OpenBMC UART with a
large paste shows about 0.25ms latency, which seems reasonable. 10ms
latencies were typical for this case before the latency breaking work,
so we still see most of the benefit.
kopald-1204 0d.h. 5us : hvc_poll <-hvc_handle_interrupt
kopald-1204 0d.h. 5us : __hvc_poll <-hvc_handle_interrupt
kopald-1204 0d.h. 5us : _raw_spin_lock_irqsave <-__hvc_poll
kopald-1204 0d.h. 5us : tty_port_tty_get <-__hvc_poll
kopald-1204 0d.h. 6us : _raw_spin_lock_irqsave <-tty_port_tty_get
kopald-1204 0d.h. 6us : _raw_spin_unlock_irqrestore <-tty_port_tty_get
kopald-1204 0d.h. 6us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 7us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 7us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 36us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 36us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 36us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 65us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 65us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 66us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 94us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 95us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 95us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 124us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 124us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 125us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 154us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 154us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 154us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 183us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 184us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 184us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 213us : tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 213us : __tty_buffer_request_room <-__hvc_poll
kopald-1204 0d.h. 213us+: opal_get_chars <-__hvc_poll
kopald-1204 0d.h. 242us : _raw_spin_unlock_irqrestore <-__hvc_poll
kopald-1204 0d.h. 242us : tty_flip_buffer_push <-__hvc_poll
kopald-1204 0d.h. 243us : queue_work_on <-tty_flip_buffer_push
kopald-1204 0d.h. 243us : tty_kref_put <-__hvc_poll
kopald-1204 0d.h. 243us : hvc_kick <-hvc_handle_interrupt
kopald-1204 0d.h. 243us : wake_up_process <-hvc_kick
kopald-1204 0d.h. 244us : try_to_wake_up <-hvc_kick
kopald-1204 0d.h. 244us : _raw_spin_lock_irqsave <-try_to_wake_up
kopald-1204 0d.h. 244us : _raw_spin_unlock_irqrestore <-try_to_wake_up
Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit ec97eaad1383 ("tty: hvc: hvc_poll() break hv read loop") causes
the virtio console to hang at times (e.g., if you paste a bunch of
characters to it.
The reason is that get_chars must return 0 before we can be sure the
driver will kick or poll input again, but this change only scheduled a
poll if get_chars had returned a full count. Change this to poll on
any > 0 count.
Reported-by: Matteo Croce <mcroce@redhat.com>
Reported-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since the addition of WARN_ON() in nand_subop_get_data/addr_len()
helpers, this driver will produce harmless warnings (mostly at probe)
just because it always calls the nand_subop_get_data_len() helper in
the parsing function (even on non-data instructions, where this value
is meaningless and unneeded).
Fix these warnings by deriving the length only when it is relevant.
Fixes: 760c435e0f85 ("mtd: rawnand: make subop helpers return unsigned values")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
|
|
Motivated by the ksummit-discuss discussion.
Cc: Shuah Khan <shuahkhan@gmail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch follows commit 1751e8a6cb93 ("Rename superblock
flags (MS_xyz -> SB_xyz)") and after commit ("vfs: Suppress
MS_* flag defs within the kernel unless explicitly enabled"),
there is no MS_RDONLY and MS_NOATIME at all.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Added memory barriers where they were missing to support multiple
architectures, and removed redundant ones.
As part of removing the redundant memory barriers and improving
performance, we moved to more relaxed versions of memory barriers,
as well as to the more relaxed version of writel - writel_relaxed,
while maintaining correctness.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add READ_ONCE calls where necessary (for example when iterating
over a memory field that gets updated by the hardware).
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
acquire the rtnl_lock during device destruction to avoid
using partially destroyed device.
ena_remove() shares almost the same logic as ena_destroy_device(),
so use ena_destroy_device() and avoid duplications.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ena_destroy_device() can potentially be called twice.
To avoid this, check that the device is running and
only then proceed destroying it.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When ena_destroy_device() is called from ena_suspend(), the device is
still reachable from the driver. Therefore, the driver can send a command
to the device to free all resources.
However, in all other cases of calling ena_destroy_device(), the device is
potentially in an error state and unreachable from the driver. In these
cases the driver must not send commands to the device.
The current implementation does not request resource freeing from the
device even when possible. We add the graceful parameter to
ena_destroy_device() to enable resource freeing when possible, and
use it in ena_suspend().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The buffer length field in the ena rx descriptor is 16 bit, and the
current driver passes a full page in each ena rx descriptor.
When PAGE_SIZE equals 64kB or more, the buffer length field becomes
zero.
To solve this issue, limit the ena Rx descriptor to use 16kB even
when allocating 64kB kernel pages. This change would not impact ena
device functionality, as 16kB is still larger than maximum MTU.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Starting with driver version 1.5.0, in case of a surprise device
unplug, there is a race caused by invoking ena_destroy_device()
from two different places. As a result, the readless register might
be accessed after it was destroyed.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irqchip fix from Thomas Gleixner:
"A single fix to prevent allocating excessive memory in the GIC/ITS
driver.
While the subject of the patch might suggest otherwise this is a real
fix as some SoCs exceed the memory allocation limits and fail to boot"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Cap lpi_id_bits to reduce memory footprint
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random driver fix from Ted Ts'o:
"Fix things so the choice of whether or not to trust RDRAND to
initialize the CRNG is configurable via the boot option
random.trust_cpu={on,off}"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: make CPU trust a boot parameter
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
First set of IIO fixes for the 4.19 cycle.
* ad9523
- sysfs write should return the number of characters used or an error, not
0 which could result in an infinite loop in userspace.
* lsm6dsx
- Fix computation of length when updating the watermark to include
timestamps avoiding the watermark being set earlier than intended.
* maxim_thermocouple
- Revert a patch adding the max31856 as it's not actually compatible with
the register set that this driver supports.
* si1133
- Fix an impossible value check so we don't always error out whether the
passed in value is good or bad.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few more fixes who have trickled in:
- MMC bus width fixup for some Allwinner platforms
- Fix for NULL deref in ti-aemif when no platform data is passed in
- Fix div by 0 in SCMI code
- Add a missing module alias in a new RPi driver"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
memory: ti-aemif: fix a potential NULL-pointer dereference
firmware: arm_scmi: fix divide by zero when sustained_perf_level is zero
hwmon: rpi: add module alias to raspberrypi-hwmon
arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
|
|
Commit 822fb18a82aba ("xen-netfront: wait xenbus state change when load
module manually") added a new wait queue to wait on for a state change
when the module is loaded manually. Unfortunately there is no wakeup
anywhere to stop that waiting.
Instead of introducing a new wait queue rename the existing
module_unload_q to module_wq and use it for both purposes (loading and
unloading).
As any state change of the backend might be intended to stop waiting
do the wake_up_all() in any case when netback_changed() is called.
Fixes: 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually")
Cc: <stable@vger.kernel.org> #4.18
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- bugfixes for uniphier, i801, and xiic drivers
- ID removal (never produced) for imx
- one MAINTAINER addition
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: xiic: Record xilinx i2c with Zynq fragment
i2c: xiic: Make the start and the byte count write atomic
i2c: i801: fix DNV's SMBCTRL register offset
i2c: imx-lpi2c: Remove mx8dv compatible entry
dt-bindings: imx-lpi2c: Remove mx8dv compatible entry
i2c: uniphier-f: issue STOP only for last message or I2C_M_STOP
i2c: uniphier: issue STOP only for last message or I2C_M_STOP
|
|
Commit 3559d81e76bf ("r8169: simplify rtl_hw_start_8169") changed order of
two register writes:
1) Caused RxConfig to be written before TX / RX is enabled,
2) Caused TxConfig to be written before TX / RX is enabled.
At least on XIDs 10000000 ("RTL8169sb/8110sb") and
18000000 ("RTL8169sc/8110sc") such writes are ignored by the chip, leaving
values in these registers intact.
Change 1) was reverted by
commit 05212ba8132b42 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices"),
however change 2) wasn't.
In practice, this caused TxConfig's "InterFrameGap time" and "Max DMA Burst
Size per Tx DMA Burst" bits to be zero dramatically reducing TX performance
(in my tests it dropped from around 500Mbps to around 50Mbps).
This patch fixes the issue by moving TxConfig register write a bit later in
the code so it happens after TX / RX is already enabled.
Fixes: 05212ba8132b42 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull MD fixes from Shaohua Li:
- Fix a locking issue for md-cluster (Guoqing)
- Fix a sync crash for raid10 (Ni)
- Fix a reshape bug with raid5 cache enabled (me)
* tag 'md/4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md-cluster: release RESYNC lock after the last resync message
RAID10 BUG_ON in raise_barrier when force is true and conf->barrier is 0
md/raid5-cache: disable reshape completely
|
|
Pull ceph fixes from Ilya Dryomov:
"Two rbd patches to complete support for images within namespaces that
went into -rc1 and a use-after-free fix.
The rbd changes have been sitting in a branch for quite a while but
couldn't be included into the -rc1 pull request because of a pending
wire protocol backwards compatibility fixup that only got committed
early this week"
* tag 'ceph-for-4.19-rc3' of https://github.com/ceph/ceph-client:
rbd: support cloning across namespaces
rbd: factor out get_parent_info()
ceph: avoid a use-after-free in ceph_destroy_options()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix a regression from the 4.18 cycle in the ACPI driver for
Intel SoCs (LPSS) and prevent dmi_check_system() from being called on
non-x86 systems in the ACPI core.
Specifics:
- Fix a power management regression in the ACPI driver for Intel SoCs
(LPSS) introduced by a system-wide suspend/resume fix during the
4.18 cycle (Zhang Rui).
- Prevent dmi_check_system() from being called on non-x86 systems in
the ACPI core (Jean Delvare)"
* tag 'acpi-4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / LPSS: Force LPSS quirks on boot
ACPI / bus: Only call dmi_check_system() on X86
|
|
Not all fans have a fan pulse register. This can result in reading
beyond the end of REG_FAN_PULSES and FAN_PULSE_SHIFT arrays,
and was reported by smatch as possible error.
1672 for (i = 0; i < ARRAY_SIZE(data->rpm); i++) {
^^^^^^^^^^^^^^^^^^^^^^^^
This is a 7 element array.
...
1685 data->fan_pulses[i] =
1686 (nct6775_read_value(data, data->REG_FAN_PULSES[i])
1687 >> data->FAN_PULSE_SHIFT[i]) & 0x03;
^^^^^^^^^^^^^^^^^^^^^^^^
FAN_PULSE_SHIFT is either 5 or 6
elements.
To fix the problem, we have to ensure that all REG_FAN_PULSES and
FAN_PULSE_SHIFT have the appropriate length, and that REG_FAN_PULSES
is only read if the register actually exists.
Fixes: 6c009501ff200 ("hwmon: (nct6775) Add support for NCT6102D/6106D")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Merge ACPI core fix to avoid calling dmi_check_system() on non-x86.
* acpi-bus:
ACPI / bus: Only call dmi_check_system() on X86
|
|
Pull drm fixes from Dave Airlie:
"Seems to have been overly quiet this week so I expect next week will
be more stuff, just one pull from Rodrigo with i915 fixes in it.
Quoting Rodrigo:
'The critical fix here on display side is the DP MST regression one.
But this pull also include fixes for DP SST, small VDSC register
fix and GVT's bucked with "BXT fixes, two guest warning fixes,
dmabuf format mod fix and one for recent multiple VM timeout
failure'."
* tag 'drm-fixes-2018-09-07' of git://anongit.freedesktop.org/drm/drm:
drm/i915/dp_mst: Fix enabling pipe clock for all streams
drm/i915/dsc: Fix PPS register definition macros for 2nd VDSC engine
drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"
drm/i915/gvt: Give new born vGPU higher scheduling chance
drm/i915/gvt: Fix drm_format_mod value for vGPU plane
drm/i915/gvt: move intel_runtime_pm_get out of spin_lock in stop_schedule
drm/i915/gvt: Handle GEN9_WM_CHICKEN3 with F_CMD_ACCESS.
drm/i915/gvt: Make correct handling to vreg BXT_PHY_CTL_FAMILY
drm/i915/gvt: emulate gen9 dbuf ctl register access
|
|
Bump target version to reflect the documented fixes are available.
Also fix some code comments (typos and clarity).
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
On fast devices such as NVMe, a flaw in rs_get_progress() results in
false target status output when userspace lvm2 requests leg rebuilds
(symptom of the failure is device health chars 'aaaaaaaa' instead of
expected 'aAaAAAAA' causing lvm2 to fail).
The correct sync action state definitions already exist in
decipher_sync_action() so fix rs_get_progress() to use it.
Change decipher_sync_action() to return an enum rather than a string for
the sync states and call it from rs_get_progress(). Introduce
sync_str() to translate from enum to the string that is needed by
raid_status().
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Update superblock when particular devices are requested via rebuild
(e.g. lvconvert --replace ...) to avoid spurious failure with the "New
device injected into existing raid set without 'delta_disks' or
'rebuild' parameter specified" error message.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
When initiating a stripe adding reshape, a deadlock between
md_stop_writes() waiting for the sync thread to stop and the running
sync thread waiting for inactive stripes occurs (this frequently happens
on single-core but rarely on multi-core systems).
Fix this deadlock by setting MD_RECOVERY_WAIT to have the main MD
resynchronization thread worker (md_do_sync()) bail out when initiating
the reshape via constructor arguments.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Pull block fixes from Jens Axboe:
"Small collection of fixes that should go into this release. This
contains:
- Small series that fixes a race between blkcg teardown and writeback
(Dennis Zhou)
- Fix disallowing invalid block size settings from the nbd ioctl (me)
- BFQ fix for a use-after-free on last release of a bfqg (Konstantin
Khlebnikov)
- Fix for the "don't warn for flush" fix (Mikulas)"
* tag 'for-linus-20180906' of git://git.kernel.dk/linux-block:
block: bfq: swap puts in bfqg_and_blkg_put
block: don't warn when doing fsync on read-only devices
nbd: don't allow invalid blocksize settings
blkcg: use tryget logic when associating a blkg with a bio
blkcg: delay blkg destruction until after writeback has finished
Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()"
|
|
panels
Fixes eDP backlight issues on more recent laptops.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
If a HPD pulse signalling the need to retrain the link occurs between
the KMS driver releasing the output and the supervisor interrupt that
finishes the teardown, it was possible get a NULL-ptr deref.
Avoid this by marking the link as inactive earlier.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
We need to do this earlier to prevent aux channel timeouts in resume
paths on certain systems.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This Falcon application doesn't appear to be present on some newer
systems, so let's not fail init if we can't find it.
TBD: is there a way to determine whether it *should* be there?
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Fixes oopses in certain failure paths.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
The NV_ERROR macro requires drm->client to be initialised, which it may not
be at this stage of the init process.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
It looks like that when we moved over to using
drm_connector_for_each_possible_encoder() in nouveau, that one rather
important part of this function got dropped by accident:
/* Right v here */
for (i = 0; nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER; i++) {
int id = connector->encoder_ids[i];
if (id == 0)
break;
Since it's rather difficult to notice: the conditional in this loop is
actually:
nv_encoder = NULL, i < DRM_CONNECTOR_MAX_ENCODER
Meaning that all early breaks result in nv_encoder keeping it's value,
otherwise nv_encoder = NULL. Ugh.
Since this got dropped, nouveau_connector_ddc_detect() now returns an
encoder for every single connector, regardless of whether or not it's
detected:
[ 1780.056185] nouveau 0000:01:00.0: DRM: DDC responded, but no EDID for DP-2
So: fix this to ensure we only return an encoder if we actually found
one, and clean up the rest of the function while we're at it since it's
nearly impossible to read properly.
Changes since v1:
- Don't skip ddc probing for LVDS if we can't switch DDC through
vga-switcheroo, just do the DDC probing without calling
vga_switcheroo_lock_ddc() - skeggsb
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: ddba766dd07e ("drm/nouveau: Use drm_connector_for_each_possible_encoder()")
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Currently, there's nothing in nouveau that actually cancels this work
struct. So, cancel it on suspend/unload. Otherwise, if we're unlucky
enough hpd_work might try to keep running up until the system is
suspended.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
On most systems with ACPI hotplugging support, it seems that we always
receive a hotplug event once we re-enable EC interrupts even if the GPU
hasn't even been resumed yet.
This can cause problems since even though we schedule hpd_work to handle
connector reprobing for us, hpd_work synchronizes on
pm_runtime_get_sync() to wait until the device is ready to perform
reprobing. Since runtime suspend/resume callbacks are disabled before
the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
this period will grab a runtime PM ref and return immediately with
-EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
a connector reprobe immediately even if the GPU isn't actually resumed
just yet. This causes various warnings in dmesg and occasionally, also
prevents some displays connected to the dedicated GPU from coming back
up after suspend. Example:
usb 1-4: USB disconnect, device number 14
usb 1-4.1: USB disconnect, device number 15
WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Workqueue: events nouveau_display_hpd_work [nouveau]
RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
FS: 0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? _cond_resched+0x15/0x30
nouveau_connector_detect+0x2ce/0x520 [nouveau]
? _cond_resched+0x15/0x30
? ww_mutex_lock+0x12/0x40
drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
nouveau_display_hpd_work+0x2a/0x60 [nouveau]
process_one_work+0x187/0x340
worker_thread+0x2e/0x380
? pwq_unbound_release_workfn+0xd0/0xd0
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
---[ end trace 55d811b38fc8e71a ]---
So, to fix this we attempt to grab a runtime PM reference in the ACPI
handler itself asynchronously. If the GPU is already awake (it will have
normal hotplugging at this point) or runtime PM callbacks are currently
disabled on the device, we drop our reference without updating the
autosuspend delay. We only schedule connector reprobes when we
successfully managed to queue up a resume request with our asynchronous
PM ref.
This also has the added benefit of preventing redundant connector
reprobes from ACPI while the GPU is runtime resumed!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <kherbst@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
When probing a new MST device, it's not safe to make any assumptions
about it's current state. While most well mannered MST hubs will just
disable the branching unit on hotplug disconnects, this isn't enough to
save us from various other scenarios that might have resulted in
something writing to the MST branching unit before we got control of it.
This could happen if a previous probe we tried failed, if we're booting
in kexec context and the hub is still in the state the last kernel put
it in, etc.
Luckily; there is no reason we can't just reset the branching unit
every time we enable a new topology. So, fix this by resetting it on
enabling new topologies to ensure that we always start off with a clean,
unmodified topology state on MST sinks.
This fixes occasional hard-lockups on my P50's laptop dock (e.g. AUX
times out all DPCD trasactions) observed after multiple docks, undocks,
and module reloads.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Currently, nouveau will re-write the DP_MSTM_CTRL register for an MST
hub every time it receives a long HPD pulse on DP. This isn't actually
necessary and additionally, has some unintended side effects.
With the P50 I've got here, rewriting DP_MSTM_CTRL constantly seems to
make it rather likely (1 out of 5 times usually) that bringing up MST
with it's ThinkPad dock will fail and result in sideband messages timing
out in the middle. Afterwards, successive probes don't manage to get the
dock to communicate properly over MST sideband properly.
Many times sideband message timeouts from MST hubs are indicative of
either the source or the sink dropping an ESI event, which can cause
DRM's perspective of the topology's current state to go out of sync with
reality. While it's tough to really know for sure what's happening to
the dock, using userspace tools to write to DP_MSTM_CTRL in the middle
of the MST link probing process does appear to make things flaky. It's
possible that when we write to DP_MSTM_CTRL, the function that gets
triggered to respond in the dock's firmware temporarily puts it in a
state where it might end up not reporting an ESI to the source, or ends
up dropping a sideband message we sent it.
So, to fix this we make it so that when probing an MST topology, we
respect it's current state. If the dock's already enabled, we simply
read DP_MSTM_CTRL and disable the topology if it's value is not what we
expected. Otherwise, we perform the normal MST probing dance. We avoid
taking any action except if the state of the MST topology actually
changes.
This fixes MST sideband message timeouts and detection failures on my
P50 with its ThinkPad dock.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Again, this doesn't do anything. drm_kms_helper_poll_enable() will have
already been called in nouveau_display_init()
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This won't do anything but potentially make us miss hotplugs. We already
call drm_kms_helper_poll_disable() in
nouveau_pmops_suspend()->nouveau_display_suspend()->nouveau_display_fini()
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This doesn't do anything, drm_kms_helper_poll_enable() gets called in
nouveau_pmops_resume()->nouveau_display_resume()->nouveau_display_init()
already.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
When we disable hotplugging on the GPU, we need to be able to
synchronize with each connector's hotplug interrupt handler before the
interrupt is finally disabled. This can be a problem however, since
nouveau_connector_detect() currently grabs a runtime power reference
when handling connector probing. This will deadlock the runtime suspend
handler like so:
[ 861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
[ 861.483290] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.486332] kworker/0:2 D 0 61 2 0x80000000
[ 861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
[ 861.487737] Call Trace:
[ 861.488394] __schedule+0x322/0xaf0
[ 861.489070] schedule+0x33/0x90
[ 861.489744] rpm_resume+0x19c/0x850
[ 861.490392] ? finish_wait+0x90/0x90
[ 861.491068] __pm_runtime_resume+0x4e/0x90
[ 861.491753] nouveau_display_hpd_work+0x22/0x60 [nouveau]
[ 861.492416] process_one_work+0x231/0x620
[ 861.493068] worker_thread+0x44/0x3a0
[ 861.493722] kthread+0x12b/0x150
[ 861.494342] ? wq_pool_ids_show+0x140/0x140
[ 861.494991] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.495648] ret_from_fork+0x3a/0x50
[ 861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
[ 861.496968] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.498341] kworker/6:2 D 0 320 2 0x80000080
[ 861.499045] Workqueue: pm pm_runtime_work
[ 861.499739] Call Trace:
[ 861.500428] __schedule+0x322/0xaf0
[ 861.501134] ? wait_for_completion+0x104/0x190
[ 861.501851] schedule+0x33/0x90
[ 861.502564] schedule_timeout+0x3a5/0x590
[ 861.503284] ? mark_held_locks+0x58/0x80
[ 861.503988] ? _raw_spin_unlock_irq+0x2c/0x40
[ 861.504710] ? wait_for_completion+0x104/0x190
[ 861.505417] ? trace_hardirqs_on_caller+0xf4/0x190
[ 861.506136] ? wait_for_completion+0x104/0x190
[ 861.506845] wait_for_completion+0x12c/0x190
[ 861.507555] ? wake_up_q+0x80/0x80
[ 861.508268] flush_work+0x1c9/0x280
[ 861.508990] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 861.509735] nvif_notify_put+0xb1/0xc0 [nouveau]
[ 861.510482] nouveau_display_fini+0xbd/0x170 [nouveau]
[ 861.511241] nouveau_display_suspend+0x67/0x120 [nouveau]
[ 861.511969] nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[ 861.512715] nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
[ 861.513435] pci_pm_runtime_suspend+0x6b/0x180
[ 861.514165] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.514897] __rpm_callback+0x7a/0x1d0
[ 861.515618] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.516313] rpm_callback+0x24/0x80
[ 861.517027] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.517741] rpm_suspend+0x142/0x6b0
[ 861.518449] pm_runtime_work+0x97/0xc0
[ 861.519144] process_one_work+0x231/0x620
[ 861.519831] worker_thread+0x44/0x3a0
[ 861.520522] kthread+0x12b/0x150
[ 861.521220] ? wq_pool_ids_show+0x140/0x140
[ 861.521925] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.522622] ret_from_fork+0x3a/0x50
[ 861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
[ 861.523977] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.525349] kworker/6:0 D 0 1329 2 0x80000000
[ 861.526073] Workqueue: events nvif_notify_work [nouveau]
[ 861.526751] Call Trace:
[ 861.527411] __schedule+0x322/0xaf0
[ 861.528089] schedule+0x33/0x90
[ 861.528758] rpm_resume+0x19c/0x850
[ 861.529399] ? finish_wait+0x90/0x90
[ 861.530073] __pm_runtime_resume+0x4e/0x90
[ 861.530798] nouveau_connector_detect+0x7e/0x510 [nouveau]
[ 861.531459] ? ww_mutex_lock+0x47/0x80
[ 861.532097] ? ww_mutex_lock+0x47/0x80
[ 861.532819] ? drm_modeset_lock+0x88/0x130 [drm]
[ 861.533481] drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
[ 861.534127] drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
[ 861.534940] nouveau_connector_hotplug+0x98/0x120 [nouveau]
[ 861.535556] nvif_notify_work+0x2d/0xb0 [nouveau]
[ 861.536221] process_one_work+0x231/0x620
[ 861.536994] worker_thread+0x44/0x3a0
[ 861.537757] kthread+0x12b/0x150
[ 861.538463] ? wq_pool_ids_show+0x140/0x140
[ 861.539102] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.539815] ret_from_fork+0x3a/0x50
[ 861.540521]
Showing all locks held in the system:
[ 861.541696] 2 locks held by kworker/0:2/61:
[ 861.542406] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.543071] #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.543814] 1 lock held by khungtaskd/64:
[ 861.544535] #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 861.545160] 3 locks held by kworker/6:2/320:
[ 861.545896] #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.546702] #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.547443] #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
[ 861.548146] 1 lock held by dmesg/983:
[ 861.548889] 2 locks held by zsh/1250:
[ 861.549605] #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 861.550393] #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 861.551122] 6 locks held by kworker/6:0/1329:
[ 861.551957] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.552765] #1: 00000000ddb499ad ((work_completion)(¬ify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.553582] #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
[ 861.554357] #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
[ 861.555227] #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
[ 861.556133] #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]
[ 861.557864] =============================================
[ 861.559507] NMI backtrace for cpu 2
[ 861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[ 861.561948] Call Trace:
[ 861.562757] dump_stack+0x8e/0xd3
[ 861.563516] nmi_cpu_backtrace.cold.3+0x14/0x5a
[ 861.564269] ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[ 861.565029] nmi_trigger_cpumask_backtrace+0xa1/0xae
[ 861.565789] arch_trigger_cpumask_backtrace+0x19/0x20
[ 861.566558] watchdog+0x316/0x580
[ 861.567355] kthread+0x12b/0x150
[ 861.568114] ? reset_hung_task_detector+0x20/0x20
[ 861.568863] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.569598] ret_from_fork+0x3a/0x50
[ 861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
[ 861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[ 861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[ 861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[ 861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[ 861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[ 861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[ 861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
[ 861.572428] Kernel panic - not syncing: hung_task: blocked tasks
So: fix this by making it so that normal hotplug handling /only/ happens
so long as the GPU is currently awake without any pending runtime PM
requests. In the event that a hotplug occurs while the device is
suspending or resuming, we can simply defer our response until the GPU
is fully runtime resumed again.
Changes since v4:
- Use a new trick I came up with using pm_runtime_get() instead of the
hackish junk we had before
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|