summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2018-12-18xen: Introduce shared buffer helpers for page directory...Oleksandr Andrushchenko
based frontends. Currently the frontends which implement similar code for sharing big buffers between frontend and backend are para-virtualized DRM and sound drivers. Both define the same way to share grant references of a data buffer with the corresponding backend with little differences. Move shared code into a helper module, so there is a single implementation of the same functionality for all. This patch introduces code which is used by sound and display frontend drivers without functional changes with the intention to remove shared code from the corresponding drivers. Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-12-18nvme-rdma: implement polling queue mapSagi Grimberg
When passed with nr_poll_queues setup additional queues with cq polling context IB_POLL_DIRECT (no interrupts) and make sure to set QUEUE_FLAG_POLL on the connect_q. In addition add the third queue mapping for polling queues. nvmf connect on this queue is polled for like all other requests so make nvmf_connect_io_queue poll for polling queues. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-fabrics: allow user to pass in nr_poll_queuesSagi Grimberg
This argument will specify how many polling I/O queues to connect when creating the controller. These I/O queues will host I/O that is set with REQ_HIPRI. Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-fabrics: allow nvmf_connect_io_queue to pollSagi Grimberg
Preparation for polling support for fabrics. Polling support means that our completion queues are not generating any interrupts which means we need to poll for the nvmf io queue connect as well. Reviewed by Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-core: optionally poll sync commandsSagi Grimberg
Pass poll bool to indicate that we need it to poll. This prepares us for polling support in nvmf since connect is an I/O that will be queued and has to be polled in order to complete. If poll is passed, we call nvme_execute_rq_polled which sends the requests and polls for its completion. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"Colin Ian King
There is a spelling mistake in a dev_info message, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvme-tcp: fix endianess annotationsChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-18nvmet-tcp: fix endianess annotationsChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-18nvme-pci: refactor nvme_poll_irqdisable to make sparse happyChristoph Hellwig
By duplicating the nvme_process_cq in both branches we keep the sparse lock context checking happy, so do it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-18nvme-pci: only set nr_maps to 2 if poll queues are supportedChristoph Hellwig
The block layer now enables polling support on a queue if nr_maps includes the poll map, so we should only set that if we actually support poll queues. Fixes: 6544d229bf ("block: enable polling by default if a poll map is initalized") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-18nvmet: use a macro for default error locationChaitanya Kulkarni
This patch defines a new macro NVMET_NO_ERROR_LOC to represent the default error location value in the nvme-error-log-page. This is a pure cleanup patch and it does not change any functionality. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18nvmet: fix comparison of a u16 with -1Colin Ian King
Currently the u16 req->error_loc is being compared to -1 which will always be false. Fix this by casting -1 to u16 to fix this. Detected by clang: warning: result of comparison of constant -1 with expression of type 'u16' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] Fixes: 76574f37bf4c ("nvmet: add interface to update error-log page") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-18irqchip/stm32: protect configuration registers with hwspinlockBenjamin Gaignard
If a hwspinlock is defined in device tree use it to protect configuration registers. Do not request for hwspinlock during the exti driver init since the hwspinlock driver is not probed yet at that stage and the exti driver does not support deferred probe. Instead of this, postpone the hwspinlock request at the first time the hwspinlock is actually needed. Use the hwspin_trylock_raw() API which is the most appropriated here Indeed: - hwspin_lock_() calls are under spin_lock protection (chip_data->rlock or gc->lock). - the _timeout() API relies on jiffies count which won't work if IRQs are disabled which is the case here (a large part of the IRQ setup is done atomically (see irq/manage.c)) As a consequence implement the retry/timeout lock from here. And since all of this is done atomically, reduce the timeout delay to 1 ms. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18irqchip: Add driver for imx-irqsteer controllerLucas Stach
The irqsteer block is a interrupt multiplexer/remapper found on the i.MX8 line of SoCs. Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18drm/i915/icl: Add fallback detection method for TypeC legacy portsImre Deak
Add a fallback detection method for TypeC legacy ports in case the VBT port information used to detect normally such ports is incorrect. For the fallback method we use the TypeC legacy mode specific HPD interrupt flag which should only be raised for a legacy port. WARN if the VBT port info is incorrect. In a case where we'd detect the port in a contradicting way both as a legacy and also as a USB DP and/or TBT alternate port treat the port as legacy (by also emitting a WARN from icl_update_tc_port_type). v2: - Repurpose the detection as a fallback method instead of using it only for the DP legacy case. By now we should normally use VBT to detect DP legacy ports as well. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v1) Reviewed-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181214182703.18865-5-imre.deak@intel.com
2018-12-18drm/i915/icl: Fix HPD handling for TypeC legacy portsImre Deak
Atm HPD disconnect events on TypeC ports will break things, since we'll switch the TypeC mode (between legacy and disconnected modes as well as among USB DP alternate, Thunderbolt alternate and disconnected modes) on the fly from the HPD disconnect interrupt work while the port may be still active. Even if the port happens to be not active during the disconnect we'd still have a problem during a subsequent modeset or AUX transfer that could happen regardless of the port's connected state. For instance the system resume display mode restore code and userspace could perform a modeset on the port or userspace could start an AUX transfer even if the port is in disconnected state. To fix this keep TypeC legacy ports in legacy mode whenever we're not suspended. This mode is a static configuration as opposed to the Thunderbolt and USB DP alternate modes between which we can switch dynamically. We determine if a TypeC port is legacy (wired to a legacy HDMI or a legacy DP connector) via the VBT DDI port specific USB-TypeC and Thunderbolt flags. If both these flags are cleared then the port is configured for legacy mode. On such legacy ports we'll run the TypeC PHY connect sequence explicitly during driver loading and system resume (vs. running the sequence during HPD processing). The connect will succeed even if the display is not connected to begin with (or disappears during the suspended state) since for legacy ports the PORT_TX_DFLEXDPPMS / DP_PHY_MODE_STATUS_COMPLETED flag is always set (as opposed to the USB DP alternate mode where it gets set only when a display is connected). Correspondingly run the TypeC PHY disconnect sequence during system suspend and driver unloading. For the unloading case I had to split up intel_dp_encoder_destroy() to be able to have the 1. flush any pending encoder work, 2. disconnect TC PHY, 3. call DRM core cleanup and kfree on the encoder object. For now run the PHY disconnect during suspend only for TypeC legacy ports. We will need to disconnect even in USB DP alternate mode in the future, but atm we don't have a way to reconnect the port in this mode during resume if the display disappears while being suspended. So for now punt on this case. Note that we do not disconnect the port during runtime suspend; in legacy mode there are no shared HW resources (PHY lanes) with other HW blocks (USB), so no need to release / reacquire these resources as with USB DP alternate mode. The only reason to disconnect legacy ports during system suspend is that the PORT_TX_DFLEXDPPMS / DP_PHY_MODE_STATUS_COMPLETED flag must be rechecked and the port must be connected again during system resume. We'll also have to turn the check for this flag into a poll, after figuring out what's the proper timeout value for it. v2: - Remove the redundant special casing of legacy mode when doing a disconnect in icl_tc_port_connected(). It's guaranteed already that we won't disconnect legacy ports in that function. - Add a note about the new intel_ddi_encoder_destroy() hook. - Reword the commit message after switching to the VBT based detection. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108070 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108924 Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181214182703.18865-4-imre.deak@intel.com
2018-12-18drm/i915/bios: Parse the VBT TypeC and Thunderbolt port flagsImre Deak
This is needed by the next patch to determine if a DDI TypeC port is physically wired to a legacy DP or legacy HDMI connector or if the port is wired to a USB-C/Thunderbolt connector. Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181214182703.18865-3-imre.deak@intel.com
2018-12-18drm/i915/icl: Add a debug print for TypeC port disconnectionImre Deak
It's useful to see at which point a TypeC port gets disconnected, so add a debug print for it. Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181214182703.18865-2-imre.deak@intel.com
2018-12-18usb: musb: dsps: fix runtime pm for peripheral modeBin Liu
Since the runtime PM support was added in musb, dsps relies on the timer calling otg_timer() to activate the usb subsystem. However the driver doesn't enable the timer for peripheral port, then the peripheral port is unable to be enumerated by a host if the other usb port is disabled or in peripheral mode too. So let's start the timer for peripheral port too. Fixes: ea2f35c01d5e ("usb: musb: Fix sleeping function called from invalid context for hdrc glue") Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-18usb: musb: dsps: fix otg state machineBin Liu
Due to lack of ID pin interrupt event on AM335x devices, the musb dsps driver uses polling to detect usb device attach for dual-role port. But in the case if a micro-A cable adapter is attached without a USB device attached to the cable, the musb state machine gets stuck in a_wait_vrise state waiting for the MUSB_CONNECT interrupt which won't happen due to the usb device is not attached. The state is stuck in a_wait_vrise even after the micro-A cable is detached, which could cause VBUS retention if then the dual-role port is attached to a host port. To fix the problem, make a_wait_vrise as a transient state, then move the state to either a_wait_bcon for host port or a_idle state for dual-role port, if no usb device is attached to the port. Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-18drm/i915: Apply missed interrupt after reset w/a to all ringbuffer genChris Wilson
Having completed a test run of gem_eio across all machines in CI we also observe the phenomenon (of lost interrupts after resetting the GPU) on gen3 machines as well as the previously sighted gen6/gen7. Let's apply the same HWSTAM workaround that was effective for gen6+ for all, as although we haven't seen the same failure on gen4/5 it seems prudent to keep the code the same. As a consequence we can remove the extra setting of HWSTAM and apply the register from a single site. v2: Delazy and move the HWSTAM into its own function v3: Mask off all HWSP writes on driver unload and engine cleanup. v4: And what about the physical hwsp? v5: No, engine->init_hw() is not called from driver_init_hw(), don't be daft. Really scrub HWSTAM as early as we can in driver_init_mmio() v6: Rename set_hwsp as it was setting the mask not the hwsp register. v7: Ville pointed out that although vcs(bsd) was introduced for g4x/ilk, per-engine HWSTAM was not introduced until gen6! References: https://bugs.freedesktop.org/show_bug.cgi?id=108735 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181218102712.11058-1-chris@chris-wilson.co.uk
2018-12-18irqchip: Add driver for Cirrus Logic Madera codecsRichard Fitzgerald
The Cirrus Logic Madera codecs (Cirrus Logic CS47L35/85/90/91 and WM1840) are highly complex devices containing up to 7 programmable DSPs and many other internal sources of interrupts plus a number of GPIOs that can be used as interrupt inputs. The large number (>150) of internal interrupt sources are managed by an on-board interrupt controller. This driver provides the handling for the interrupt controller. As the codec is accessed via regmap, we can make use of the generic IRQ functionality from regmap to do most of the work. Only around half of the possible interrupt source are currently of interest from the driver so only this subset is defined. Others can be added in future if needed. The KConfig options are not user-configurable because this driver is mandatory so is automatically included when the parent MFD driver is selected. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18dm rq: cleanup leftover code from recently removed q->mq_ops branchingMike Snitzer
When commit 6a23e05c2fe3c6 ("dm: remove legacy request-based IO path") removed some q->mq_ops branching from map_request() it left in place a goto that was only needed if that branching (and conditional 'r' assignment) existed. Now that the branching is gone map_request()'s goto can be removed too. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm verity: log the hash algorithm implementationEric Biggers
Log the hash algorithm's driver name when a dm-verity target is created. This will help people determine whether the expected implementation is being used. It can make an enormous difference; e.g., SHA-256 on ARM can be 8x faster with the crypto extensions than without. It can also be useful to know if an implementation using an external crypto accelerator is being used instead of a software implementation. Example message: [ 35.281945] device-mapper: verity: sha256 using implementation "sha256-ce" We've already found the similar message in fs/crypto/keyinfo.c to be very useful. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm crypt: log the encryption algorithm implementationEric Biggers
Log the encryption algorithm's driver name when a dm-crypt target is created. This will help people determine whether the expected implementation is being used. In some cases we've seen people do benchmarks and reject using encryption for performance reasons, when in fact they used a much slower implementation than was possible on the hardware. It can make an enormous difference; e.g., AES-XTS on ARM can be over 10x faster with the crypto extensions than without. It can also be useful to know if an implementation using an external crypto accelerator is being used instead of a software implementation. Example message: [ 29.307629] device-mapper: crypt: xts(aes) using implementation "xts-aes-ce" We've already found the similar message in fs/crypto/keyinfo.c to be very useful. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm integrity: fix spelling mistake in workqueue nameColin Ian King
Rename the workqueue from dm-intergrity-recalc to dm-integrity-recalc. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm flakey: Properly corrupt multi-page bios.Sweet Tea
The flakey target is documented to be able to corrupt the Nth byte in a bio, but does not corrupt byte indices after the first biovec in the bio. Change the corrupting function to actually corrupt the Nth byte no matter in which biovec that index falls. A test device generating two-page bios, atop a flakey device configured to corrupt a byte index on the second page, verified both the failure to corrupt before this patch and the expected corruption after this change. Signed-off-by: John Dorminy <jdorminy@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm: Check for device sector overflow if CONFIG_LBDAF is not setMilan Broz
Reference to a device in device-mapper table contains offset in sectors. If the sector_t is 32bit integer (CONFIG_LBDAF is not set), then several device-mapper targets can overflow this offset and validity check is then performed on a wrong offset and a wrong table is activated. See for example (on 32bit without CONFIG_LBDAF) this overflow: # dmsetup create test --table "0 2048 linear /dev/sdg 4294967297" # dmsetup table test 0 2048 linear 8:96 1 This patch adds explicit check for overflow if the offset is sector_t type. Signed-off-by: Milan Broz <gmazyland@gmail.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm crypt: use u64 instead of sector_t to store iv_offsetAliOS system security
The iv_offset in the mapping table of crypt target is a 64bit number when IV algorithm is plain64, plain64be, essiv or benbi. It will be assigned to iv_offset of struct crypt_config, cc_sector of struct convert_context and iv_sector of struct dm_crypt_request. These structures members are defined as a sector_t. But sector_t is 32bit when CONFIG_LBDAF is not set in 32bit kernel. In this situation sector_t is not big enough to store the 64bit iv_offset. Here is a reproducer. Prepare test image and device (loop is automatically allocated by cryptsetup): # dd if=/dev/zero of=tst.img bs=1M count=1 # echo "tst"|cryptsetup open --type plain -c aes-xts-plain64 \ --skip 500000000000000000 tst.img test On 32bit system (use IV offset value that overflows to 64bit; CONFIG_LBDAF if off) and device checksum is wrong: # dmsetup table test --showkeys 0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 3551657984 7:0 0 # sha256sum /dev/mapper/test 533e25c09176632b3794f35303488c4a8f3f965dffffa6ec2df347c168cb6c19 /dev/mapper/test On 64bit system (and on 32bit system with the patch), table and checksum is now correct: # dmsetup table test --showkeys 0 2048 crypt aes-xts-plain64 dfa7cfe3c481f2239155739c42e539ae8f2d38f304dcc89d20b26f69daaf0933 500000000000000000 7:0 0 # sha256sum /dev/mapper/test 5d16160f9d5f8c33d8051e65fdb4f003cc31cd652b5abb08f03aa6fce0df75fc /dev/mapper/test Signed-off-by: AliOS system security <alios_sys_security@linux.alibaba.com> Tested-and-Reviewed-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm kcopyd: Fix bug causing workqueue stallsNikos Tsironis
When using kcopyd to run callbacks through dm_kcopyd_do_callback() or submitting copy jobs with a source size of 0, the jobs are pushed directly to the complete_jobs list, which could be under processing by the kcopyd thread. As a result, the kcopyd thread can continue running completed jobs indefinitely, without releasing the CPU, as long as someone keeps submitting new completed jobs through the aforementioned paths. Processing of work items, queued for execution on the same CPU as the currently running kcopyd thread, is thus stalled for excessive amounts of time, hurting performance. Running the following test, from the device mapper test suite [1], dmtest run --suite snapshot -n parallel_io_to_many_snaps_N , with 8 active snapshots, we get, in dmesg, messages like the following: [68899.948523] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 95s! [68899.949282] Showing busy workqueues and worker pools: [68899.949288] workqueue events: flags=0x0 [68899.949295] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 [68899.949306] pending: vmstat_shepherd, cache_reap [68899.949331] workqueue mm_percpu_wq: flags=0x8 [68899.949337] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949345] pending: vmstat_update [68899.949387] workqueue dm_bufio_cache: flags=0x8 [68899.949392] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 [68899.949400] pending: work_fn [dm_bufio] [68899.949423] workqueue kcopyd: flags=0x8 [68899.949429] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949437] pending: do_work [dm_mod] [68899.949452] workqueue kcopyd: flags=0x8 [68899.949458] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 [68899.949466] in-flight: 13:do_work [dm_mod] [68899.949474] pending: do_work [dm_mod] [68899.949487] workqueue kcopyd: flags=0x8 [68899.949493] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949501] pending: do_work [dm_mod] [68899.949515] workqueue kcopyd: flags=0x8 [68899.949521] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949529] pending: do_work [dm_mod] [68899.949541] workqueue kcopyd: flags=0x8 [68899.949547] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949555] pending: do_work [dm_mod] [68899.949568] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=95s workers=4 idle: 27130 27223 1084 Fix this by splitting the complete_jobs list into two parts: A user facing part, named callback_jobs, and one used internally by kcopyd, retaining the name complete_jobs. dm_kcopyd_do_callback() and dispatch_job() now push their jobs to the callback_jobs list, which is spliced to the complete_jobs list once, every time the kcopyd thread wakes up. This prevents kcopyd from hogging the CPU indefinitely and causing workqueue stalls. Re-running the aforementioned test: * Workqueue stalls are eliminated * The maximum writing time among all targets is reduced from 09m37.10s to 06m04.85s and the total run time of the test is reduced from 10m43.591s to 7m19.199s [1] https://github.com/jthornber/device-mapper-test-suite Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm snapshot: Fix excessive memory usage and workqueue stallsNikos Tsironis
kcopyd has no upper limit to the number of jobs one can allocate and issue. Under certain workloads this can lead to excessive memory usage and workqueue stalls. For example, when creating multiple dm-snapshot targets with a 4K chunk size and then writing to the origin through the page cache. Syncing the page cache causes a large number of BIOs to be issued to the dm-snapshot origin target, which itself issues an even larger (because of the BIO splitting taking place) number of kcopyd jobs. Running the following test, from the device mapper test suite [1], dmtest run --suite snapshot -n many_snapshots_of_same_volume_N , with 8 active snapshots, results in the kcopyd job slab cache growing to 10G. Depending on the available system RAM this can lead to the OOM killer killing user processes: [463.492878] kthreadd invoked oom-killer: gfp_mask=0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null), order=1, oom_score_adj=0 [463.492894] kthreadd cpuset=/ mems_allowed=0 [463.492948] CPU: 7 PID: 2 Comm: kthreadd Not tainted 4.19.0-rc7 #3 [463.492950] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [463.492952] Call Trace: [463.492964] dump_stack+0x7d/0xbb [463.492973] dump_header+0x6b/0x2fc [463.492987] ? lockdep_hardirqs_on+0xee/0x190 [463.493012] oom_kill_process+0x302/0x370 [463.493021] out_of_memory+0x113/0x560 [463.493030] __alloc_pages_slowpath+0xf40/0x1020 [463.493055] __alloc_pages_nodemask+0x348/0x3c0 [463.493067] cache_grow_begin+0x81/0x8b0 [463.493072] ? cache_grow_begin+0x874/0x8b0 [463.493078] fallback_alloc+0x1e4/0x280 [463.493092] kmem_cache_alloc_node+0xd6/0x370 [463.493098] ? copy_process.part.31+0x1c5/0x20d0 [463.493105] copy_process.part.31+0x1c5/0x20d0 [463.493115] ? __lock_acquire+0x3cc/0x1550 [463.493121] ? __switch_to_asm+0x34/0x70 [463.493129] ? kthread_create_worker_on_cpu+0x70/0x70 [463.493135] ? finish_task_switch+0x90/0x280 [463.493165] _do_fork+0xe0/0x6d0 [463.493191] ? kthreadd+0x19f/0x220 [463.493233] kernel_thread+0x25/0x30 [463.493235] kthreadd+0x1bf/0x220 [463.493242] ? kthread_create_on_cpu+0x90/0x90 [463.493248] ret_from_fork+0x3a/0x50 [463.493279] Mem-Info: [463.493285] active_anon:20631 inactive_anon:4831 isolated_anon:0 [463.493285] active_file:80216 inactive_file:80107 isolated_file:435 [463.493285] unevictable:0 dirty:51266 writeback:109372 unstable:0 [463.493285] slab_reclaimable:31191 slab_unreclaimable:3483521 [463.493285] mapped:526 shmem:4903 pagetables:1759 bounce:0 [463.493285] free:33623 free_pcp:2392 free_cma:0 ... [463.493489] Unreclaimable slab info: [463.493513] Name Used Total [463.493522] bio-6 1028KB 1028KB [463.493525] bio-5 1028KB 1028KB [463.493528] dm_snap_pending_exception 236783KB 243789KB [463.493531] dm_exception 41KB 42KB [463.493534] bio-4 1216KB 1216KB [463.493537] bio-3 439396KB 439396KB [463.493539] kcopyd_job 6973427KB 6973427KB ... [463.494340] Out of memory: Kill process 1298 (ruby2.3) score 1 or sacrifice child [463.494673] Killed process 1298 (ruby2.3) total-vm:435740kB, anon-rss:20180kB, file-rss:4kB, shmem-rss:0kB [463.506437] oom_reaper: reaped process 1298 (ruby2.3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB Moreover, issuing a large number of kcopyd jobs results in kcopyd hogging the CPU, while processing them. As a result, processing of work items, queued for execution on the same CPU as the currently running kcopyd thread, is stalled for long periods of time, hurting performance. Running the aforementioned test we get, in dmesg, messages like the following: [67501.194592] BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 27s! [67501.195586] Showing busy workqueues and worker pools: [67501.195591] workqueue events: flags=0x0 [67501.195597] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195611] pending: cache_reap [67501.195641] workqueue mm_percpu_wq: flags=0x8 [67501.195645] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195656] pending: vmstat_update [67501.195682] workqueue kblockd: flags=0x18 [67501.195687] pwq 5: cpus=2 node=0 flags=0x0 nice=-20 active=1/256 [67501.195698] pending: blk_timeout_work [67501.195753] workqueue kcopyd: flags=0x8 [67501.195757] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195768] pending: do_work [dm_mod] [67501.195802] workqueue kcopyd: flags=0x8 [67501.195806] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195817] pending: do_work [dm_mod] [67501.195834] workqueue kcopyd: flags=0x8 [67501.195838] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195848] pending: do_work [dm_mod] [67501.195881] workqueue kcopyd: flags=0x8 [67501.195885] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195896] pending: do_work [dm_mod] [67501.195920] workqueue kcopyd: flags=0x8 [67501.195924] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=2/256 [67501.195935] in-flight: 67:do_work [dm_mod] [67501.195945] pending: do_work [dm_mod] [67501.195961] pool 8: cpus=4 node=0 flags=0x0 nice=0 hung=27s workers=3 idle: 129 23765 The root cause for these issues is the way dm-snapshot uses kcopyd. In particular, the lack of an explicit or implicit limit to the maximum number of in-flight COW jobs. The merging path is not affected because it implicitly limits the in-flight kcopyd jobs to one. Fix these issues by using a semaphore to limit the maximum number of in-flight kcopyd jobs. We grab the semaphore before allocating a new kcopyd job in start_copy() and start_full_bio() and release it after the job finishes in copy_callback(). The initial semaphore value is configurable through a module parameter, to allow fine tuning the maximum number of in-flight COW jobs. Setting this parameter to zero initializes the semaphore to INT_MAX. A default value of 2048 maximum in-flight kcopyd jobs was chosen. This value was decided experimentally as a trade-off between memory consumption, stalling the kernel's workqueues and maintaining a high enough throughput. Re-running the aforementioned test: * Workqueue stalls are eliminated * kcopyd's job slab cache uses a maximum of 130MB * The time taken by the test to write to the snapshot-origin target is reduced from 05m20.48s to 03m26.38s [1] https://github.com/jthornber/device-mapper-test-suite Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm bufio: update comment in dm-bufio.cShenghui Wang
* Hashtable has been replaced by rbtree to manage buffers. Update the comment. * Fix typo in the comment for dm_bufio_issue_flush Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm writecache: fix typo in error msg for creating writecache_flush_threadShenghui Wang
The error msg should be "flush thread" instead of "endio thread" for writecache_flush_thread. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm: remove indirect calls from __send_changing_extent_only()Mike Snitzer
No need to be so fancy. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm mpath: only flush workqueue when neededwuzhouhui
The workqueues are shared by many multipath devices, only flush whole workqueue when necessary. Otherwise, we just flush works as needed. Signed-off-by: wuzhouhui <wuzhouhui14@mails.ucas.ac.cn> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm rq: remove unused arguments from rq_completed()Mike Snitzer
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18dm: avoid indirect call in __dm_make_requestMikulas Patocka
Indirect calls are inefficient because of retpolines that are used for spectre workaround. This patch replaces an indirect call with a condition (that can be predicted by the branch predictor). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-12-18drm/i915/icl: combo port vswing programming changes per BSPECClint Taylor
In August 2018 the BSPEC changed the ICL port programming sequence to closely resemble earlier gen programming sequence. Restrict combo phy to HBR max rate unless eDP panel is connected to port. v2: remove debug code that Imre found v3: simplify translation table if-else v4: edp translation table now based on link rate and low_swing v5: Misc review comments + r-b BSpec: 21257 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1545084827-5776-1-git-send-email-clinton.a.taylor@intel.com
2018-12-18Merge tag 'asoc-v4.21' of ↵Takashi Iwai
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next ASoC: Updates for v4.21 Not much work on the core this time around but we've seen quite a bit of driver work, including on the generic DT drivers. There's also a large part of the diff from a merge of the DaVinci and OMAP directories, along with some active development there: - Preparatory work from Morimoto-san for merging the audio-graph and audio-graph-scu cards. - A merge of the TI OMAP and DaVinci directories, the OMAP product line has been merged into the DaVinci product line so there is now a lot of IP sharing which meant that the split directories just got in the way. This has pulled in a few architecture changes as well. - A big cleanup of the Maxim MAX9867 driver from Ladislav Michl. - Support for Asahi Kaesi AKM4118, AMD ACP3x, Intel platforms with RT5660, Meson AXG S/PDIF inputs, several Qualcomm IPs and Xilinx I2S controllers.
2018-12-18PCI: mediatek: Remove un-used variant in struct mtk_pcie_portHonghui Zhang
The "lane" variant in struct mtk_pcie_port is not used, remove it. Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-12-18Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into ↵Daniel Vetter
drm-next Lucas writes: "nothing major this time, mostly some cleanups that were found on the way of reworking the code in preparation for new feature additions." Small conflict in drivers/gpu/drm/etnaviv/etnaviv_drv.c because drm-misc-next also has a patch to switch over to _put() functions. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1545130845.5874.23.camel@pengutronix.de
2018-12-18genirq: Fix various typos in commentsIngo Molnar
Go over the IRQ subsystem source code (including irqchip drivers) and fix common typos in comments. No change in functionality intended. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org
2018-12-18irqchip/irq-imx-gpcv2: Add IRQCHIP_DECLARE for i.MX8MQ compatibleLucas Stach
The GPC node on i.MX8MQ can not claim to be compatible with the i.MX7D GPC, as the power gating part has some significant differences. Thus we can not rely on the irqchip being probed with the old compatible. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18irqchip/irq-rda-intc: Fix return value check in rda8810_intc_init()Wei Yongjun
In case of error, the function of_io_request_and_map() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: d852e62ad689 ("irqchip: Add RDA8810PL interrupt driver") Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18mac80211_hwsim: fix overwriting of if_combinationJames Prestwood
Moved setting if_combination.num_different_channels/radar_detect_widths into an else after use_chanctx. In the case of use_chanctx, these two settings were getting overwritten. Signed-off-by: James Prestwood <james.prestwood@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-12-18PCI: dwc: Don't hard-code DBI/ATU offsetStephen Warren
The DWC PCIe core contains various separate register spaces: DBI, DBI2, ATU, DMA, etc. The relationship between the addresses of these register spaces is entirely determined by the implementation of the IP block, not by the IP block design itself. Hence, the DWC driver must not make assumptions that one register space can be accessed at a fixed offset from any other register space. To avoid such assumptions, introduce an explicit/separate register pointer for the ATU register space. In particular, the current assumption is not valid for NVIDIA's T194 SoC. The ATU register space is only used on systems that require unrolled ATU access. This property is detected at run-time for host controllers, and when this is detected, this patch provides a default value for atu_base that matches the previous assumption re: register layout. An alternative would be to update all drivers for HW that requires unrolled access to explicitly set atu_base. However, it's hard to tell which drivers would require atu_base to be set. The unrolled property is not detected for endpoint systems, and so any endpoint driver that requires unrolled access must explicitly set the iatu_unroll_enabled flag (none do at present), and so a check is added to require the driver to also set atu_base while at it. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Acked-by: Vidya Sagar <vidyas@nvidia.com>
2018-12-18PCI: imx: Add imx6sx suspend/resume supportLeonard Crestez
Enable PCI suspend/resume support on imx6sx SOCs. This is similar to imx7d with a few differences: * The PM_Turn_Off bit is exposed through an IOMUX GPR, like all other pcie control bits on 6sx. * The pcie_inbound_axi clk needs to be turned off in suspend. On resume it is restored via resume -> deassert_core_reset -> enable_ref_clk. Most of the resume logic is shared with the initial reset after probe. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com> Acked-by: Lucas Stach <l.stach@pengutronix.de>
2018-12-18PCI: armada8k: Add support for gpio controlled reset signalBaruch Siach
Add support for the gpio reset signal binding as described in the designware-pcie.txt DT binding document. Both the documented 'reset-gpio' property name and the more standard 'reset-gpios' name are supported. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
2018-12-18PCI: dwc: Adjust Kconfig to allow IMX6 PCIe host on IMX7Trent Piepho
The IMX6 PCI-e host driver also supports the IMX7d. However, the Kconfig dependencies of the driver prevented it from being enabled unless the kernel was built with both IMX6 and IMX7 support. It works fine to build with only IMX7 support enabled therefore adjust the Kconfig entry to allow this configuration. Signed-off-by: Trent Piepho <tpiepho@impinj.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
2018-12-18PCI: dwc: layerscape: Constify driver dataStefan Agner
Constify driver data since they do not get changed at runtime. Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>