summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-04-10mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw workqueueIdo Schimmel
The workqueue is used to periodically update the networking stack about activity / statistics of various objects such as neighbours and TC actions. It should not be called as part of memory reclaim path, so remove the WQ_MEM_RECLAIM flag. Fixes: 3d5479e92087 ("mlxsw: core: Remove deprecated create_workqueue") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw ordered workqueueIdo Schimmel
The ordered workqueue is used to offload various objects such as routes and neighbours in the order they are notified. It should not be called as part of memory reclaim path, so remove the WQ_MEM_RECLAIM flag. This can also result in a warning [1], if a worker tries to flush a non-WQ_MEM_RECLAIM workqueue. [1] [97703.542861] workqueue: WQ_MEM_RECLAIM mlxsw_core_ordered:mlxsw_sp_router_fib6_event_work [mlxsw_spectrum] is flushing !WQ_MEM_RECLAIM events:rht_deferred_worker [97703.542884] WARNING: CPU: 1 PID: 32492 at kernel/workqueue.c:2605 check_flush_dependency+0xb5/0x130 ... [97703.542988] Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018 [97703.543049] Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib6_event_work [mlxsw_spectrum] [97703.543061] RIP: 0010:check_flush_dependency+0xb5/0x130 ... [97703.543071] RSP: 0018:ffffb3f08137bc00 EFLAGS: 00010086 [97703.543076] RAX: 0000000000000000 RBX: ffff96e07740ae00 RCX: 0000000000000000 [97703.543080] RDX: 0000000000000094 RSI: ffffffff82dc1934 RDI: 0000000000000046 [97703.543084] RBP: ffffb3f08137bc20 R08: ffffffff82dc18a0 R09: 00000000000225c0 [97703.543087] R10: 0000000000000000 R11: 0000000000007eec R12: ffffffff816e4ee0 [97703.543091] R13: ffff96e06f6a5c00 R14: ffff96e077ba7700 R15: ffffffff812ab0c0 [97703.543097] FS: 0000000000000000(0000) GS:ffff96e077a80000(0000) knlGS:0000000000000000 [97703.543101] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [97703.543104] CR2: 00007f8cd135b280 CR3: 00000001e860e003 CR4: 00000000003606e0 [97703.543109] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [97703.543112] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [97703.543115] Call Trace: [97703.543129] __flush_work+0xbd/0x1e0 [97703.543137] ? __cancel_work_timer+0x136/0x1b0 [97703.543145] ? pwq_dec_nr_in_flight+0x49/0xa0 [97703.543154] __cancel_work_timer+0x136/0x1b0 [97703.543175] ? mlxsw_reg_trans_bulk_wait+0x145/0x400 [mlxsw_core] [97703.543184] cancel_work_sync+0x10/0x20 [97703.543191] rhashtable_free_and_destroy+0x23/0x140 [97703.543198] rhashtable_destroy+0xd/0x10 [97703.543254] mlxsw_sp_fib_destroy+0xb1/0xf0 [mlxsw_spectrum] [97703.543310] mlxsw_sp_vr_put+0xa8/0xc0 [mlxsw_spectrum] [97703.543364] mlxsw_sp_fib_node_put+0xbf/0x140 [mlxsw_spectrum] [97703.543418] ? mlxsw_sp_fib6_entry_destroy+0xe8/0x110 [mlxsw_spectrum] [97703.543475] mlxsw_sp_router_fib6_event_work+0x6cd/0x7f0 [mlxsw_spectrum] [97703.543484] process_one_work+0x1fd/0x400 [97703.543493] worker_thread+0x34/0x410 [97703.543500] kthread+0x121/0x140 [97703.543507] ? process_one_work+0x400/0x400 [97703.543512] ? kthread_park+0x90/0x90 [97703.543523] ret_from_fork+0x35/0x40 Fixes: a3832b31898f ("mlxsw: core: Create an ordered workqueue for FIB offload") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Semion Lisyansky <semionl@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10mlxsw: core: Do not use WQ_MEM_RECLAIM for EMAD workqueueIdo Schimmel
The EMAD workqueue is used to handle retransmission of EMAD packets that contain configuration data for the device's firmware. Given the workers need to allocate these packets and that the code is not called as part of memory reclaim path, remove the WQ_MEM_RECLAIM flag. Fixes: d965465b60ba ("mlxsw: core: Fix possible deadlock") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10mlxsw: spectrum_switchdev: Add MDB entries in prepare phaseIdo Schimmel
The driver cannot guarantee in the prepare phase that it will be able to write an MDB entry to the device. In case the driver returned success during the prepare phase, but then failed to add the entry in the commit phase, a WARNING [1] will be generated by the switchdev core. Fix this by doing the work in the prepare phase instead. [1] [ 358.544486] swp12s0: Commit of object (id=2) failed. [ 358.550061] WARNING: CPU: 0 PID: 30 at net/switchdev/switchdev.c:281 switchdev_port_obj_add_now+0x9b/0xe0 [ 358.560754] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted 5.0.0-custom-13382-gf2449babf221 #1350 [ 358.570472] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ 358.580582] Workqueue: events switchdev_deferred_process_work [ 358.587001] RIP: 0010:switchdev_port_obj_add_now+0x9b/0xe0 ... [ 358.614109] RSP: 0018:ffffa6b900d6fe18 EFLAGS: 00010286 [ 358.619943] RAX: 0000000000000000 RBX: ffff8b00797ff000 RCX: 0000000000000000 [ 358.627912] RDX: ffff8b00b7a1d4c0 RSI: ffff8b00b7a152e8 RDI: ffff8b00b7a152e8 [ 358.635881] RBP: ffff8b005c3f5bc0 R08: 000000000000022b R09: 0000000000000000 [ 358.643850] R10: 0000000000000000 R11: ffffa6b900d6fcc8 R12: 0000000000000000 [ 358.651819] R13: dead000000000100 R14: ffff8b00b65a23c0 R15: 0ffff8b00b7a2200 [ 358.659790] FS: 0000000000000000(0000) GS:ffff8b00b7a00000(0000) knlGS:0000000000000000 [ 358.668820] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 358.675228] CR2: 00007f00aad90de0 CR3: 00000001ca80d000 CR4: 00000000001006f0 [ 358.683188] Call Trace: [ 358.685918] switchdev_port_obj_add_deferred+0x13/0x60 [ 358.691655] switchdev_deferred_process+0x6b/0xf0 [ 358.696907] switchdev_deferred_process_work+0xa/0x10 [ 358.702548] process_one_work+0x1f5/0x3f0 [ 358.707022] worker_thread+0x28/0x3c0 [ 358.711099] ? process_one_work+0x3f0/0x3f0 [ 358.715768] kthread+0x10d/0x130 [ 358.719369] ? __kthread_create_on_node+0x180/0x180 [ 358.724815] ret_from_fork+0x35/0x40 Fixes: 3a49b4fde2a1 ("mlxsw: Adding layer 2 multicast support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alex Kushnarov <alexanderk@mellanox.com> Tested-by: Alex Kushnarov <alexanderk@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-10Documentation: dt: edac: Fix Stratix10 IRQ bindingsThor Thayer
Fix Stratix10 ECC bindings to specify only the single bit error. On Stratix10 double bit errors are handled as SErrors instead of interrupts. Indicate the differences between the ARM64 and ARM32 EDAC architecture in the bindings. Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Rob Herring <robh@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: devicetree@vger.kernel.org Cc: dinguyen@kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Cc: mark.rutland@arm.com Cc: mchehab@kernel.org Link: https://lkml.kernel.org/r/1554388597-28019-2-git-send-email-thor.thayer@linux.intel.com
2019-04-10lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecsHans Holmberg
The introduction of multipage bio vectors broke pblk's partial read logic due to it not being prepared for multipage bio vectors. Use bio vector iterators instead of direct bio vector indexing. Fixes: 07173c3ec276 ("block: enable multipage bvecs") Reported-by: Klaus Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com> Updated description. Signed-off-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10IB/hfi1: Do not flush send queue in the TID RDMA second legKaike Wan
When a QP is put into error state, the send queue will be flushed. This mechanism is implemented in both the first and the second leg of the send engine. Since the second leg is only responsible for data transactions in the KDETH space for the TID RDMA WRITE request, it should not perform the flushing of the send queue. This patch removes the flushing function of the second leg, but still keeps the bailing out of the QP if it is put into error state. Fixes: 70dcb2e3dc6a ("IB/hfi1: Add the TID second leg send packet builder") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10apparmorfs: fix use-after-free on symlink traversalAl Viro
symlink body shouldn't be freed without an RCU delay. Switch apparmorfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-10securityfs: fix use-after-free on symlink traversalAl Viro
symlink body shouldn't be freed without an RCU delay. Switch securityfs to ->destroy_inode() and use of call_rcu(); free both the inode and symlink body in the callback. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-10Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Several fixes, add more reviewers to the list" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: Honour 'may_reduce_num' in vring_create_virtqueue MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi virtio_pci: fix a NULL pointer reference in vp_del_vqs
2019-04-10RISC-V: Fix Maximum Physical Memory 2GiB option for 64bit systemsAnup Patel
The Maximum Physical Memory 2GiB option for 64bit systems is currently broken because kernel hangs at boot-time when this option is enabled and the underlying system has more than 2GiB memory. This issue can be easily reproduced on SiFive Unleashed board where we have 8GiB of memory. This patch fixes above issue by removing unusable memory region in setup_bootmem(). Signed-off-by: Anup Patel <anup.patel@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2019-04-10arm64: compat: Reduce address limitVincenzo Frascino
Currently, compat tasks running on arm64 can allocate memory up to TASK_SIZE_32 (UL(0x100000000)). This means that mmap() allocations, if we treat them as returning an array, are not compliant with the sections 6.5.8 of the C standard (C99) which states that: "If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P". Redefine TASK_SIZE_32 to address the issue. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Jann Horn <jannh@google.com> Cc: <stable@vger.kernel.org> Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> [will: fixed typo in comment] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-10ASoC: wcd9335: Fix missing regmap requirementMarc Gonzalez
wcd9335.c: undefined reference to 'devm_regmap_add_irq_chip' Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-10drm/i915/dp: revert back to max link rate and lane count on eDPJani Nikula
Commit 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast and narrow") started to optize the eDP 1.4+ link config, both per spec and as preparation for display stream compression support. Sadly, we again face panels that flat out fail with parameters they claim to support. Revert, and go back to the drawing board. v2: Actually revert to max params instead of just wide-and-slow. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109959 Fixes: 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast and narrow") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Atwood <matthew.s.atwood@intel.com> Cc: "Lee, Shawn C" <shawn.c.lee@intel.com> Cc: Dave Airlie <airlied@gmail.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.0+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Tested-by: Albert Astals Cid <aacid@kde.org> # v5.0 backport Tested-by: Emanuele Panigati <ilpanich@gmail.com> # v5.0 backport Tested-by: Matteo Iervasi <matteoiervasi@gmail.com> # v5.0 backport Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190405075220.9815-1-jani.nikula@intel.com (cherry picked from commit f11cb1c19ad0563b3c1ea5eb16a6bac0e401f428) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-04-10drm/i915/icl: Fix port disable sequence for mipi-dsiVandita Kulkarni
Re-enable clock gating of DDI clocks. v2: Fix the default ddi clk state for mipi-dsi (Imre) Fixes: 1026bea00381 ("drm/i915/icl: Ungate DSI clocks") Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1553513202-13863-2-git-send-email-vandita.kulkarni@intel.com (cherry picked from commit 942d1cf48eae3fcd7e973cfb708d5c4860f0c713) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-04-10drm/i915/icl: Ungate ddi clocks before IO enableVandita Kulkarni
IO enable sequencing needs ddi clocks enabled. These clocks will be gated at a later point in the enable sequence. v2: Fix the commit header (Uma) v3: Remove the redundant read (Ville) Fixes: 949fc52af19e ("drm/i915/icl: add pll mapping for DSI") Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1553513202-13863-1-git-send-email-vandita.kulkarni@intel.com (cherry picked from commit c5b81a325263a891d5811aabe938c87e03db4c37) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-04-10nvme: cancel request synchronouslyMing Lei
nvme_cancel_request() is used in error handler, and it is always reliable to cancel request synchronously, and avoids possible race in which request may be completed after real hw queue is destroyed. One issue is reported by our customer on NVMe RDMA, in which freed ib queue pair may be used in nvme_rdma_complete_rq(). Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Bart Van Assche <bvanassche@acm.org> Cc: James Smart <james.smart@broadcom.com> Cc: linux-nvme@lists.infradead.org Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10blk-mq: introduce blk_mq_complete_request_sync()Ming Lei
In NVMe's error handler, follows the typical steps of tearing down hardware for recovering controller: 1) stop blk_mq hw queues 2) stop the real hw queues 3) cancel in-flight requests via blk_mq_tagset_busy_iter(tags, cancel_request, ...) cancel_request(): mark the request as abort blk_mq_complete_request(req); 4) destroy real hw queues However, there may be race between #3 and #4, because blk_mq_complete_request() may run q->mq_ops->complete(rq) remotelly and asynchronously, and ->complete(rq) may be run after #4. This patch introduces blk_mq_complete_request_sync() for fixing the above race. Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Bart Van Assche <bvanassche@acm.org> Cc: James Smart <james.smart@broadcom.com> Cc: linux-nvme@lists.infradead.org Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10s390/rseq: use trap4 for RSEQ_SIGMartin Schwidefsky
Use trap4 as the guard instruction for the restartable sequence abort handler. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390: cio: fix cio_irb declarationArnd Bergmann
clang points out that the declaration of cio_irb does not match the definition exactly, it is missing the alignment attribute: ../drivers/s390/cio/cio.c:50:1: warning: section does not match previous declaration [-Wsection] DEFINE_PER_CPU_ALIGNED(struct irb, cio_irb); ^ ../include/linux/percpu-defs.h:150:2: note: expanded from macro 'DEFINE_PER_CPU_ALIGNED' DEFINE_PER_CPU_SECTION(type, name, PER_CPU_ALIGNED_SECTION) \ ^ ../include/linux/percpu-defs.h:93:9: note: expanded from macro 'DEFINE_PER_CPU_SECTION' extern __PCPU_ATTRS(sec) __typeof__(type) name; \ ^ ../include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS' __percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \ ^ ../drivers/s390/cio/cio.h:118:1: note: previous attribute is here DECLARE_PER_CPU(struct irb, cio_irb); ^ ../include/linux/percpu-defs.h:111:2: note: expanded from macro 'DECLARE_PER_CPU' DECLARE_PER_CPU_SECTION(type, name, "") ^ ../include/linux/percpu-defs.h:87:9: note: expanded from macro 'DECLARE_PER_CPU_SECTION' extern __PCPU_ATTRS(sec) __typeof__(type) name ^ ../include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS' __percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \ ^ Use DECLARE_PER_CPU_ALIGNED() here, to make the two match. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/mm: silence compiler warning when compiling without CONFIG_PGSTEThomas Huth
If CONFIG_PGSTE is not set (e.g. when compiling without KVM), GCC complains: CC arch/s390/mm/pgtable.o arch/s390/mm/pgtable.c:413:15: warning: ‘pmd_alloc_map’ defined but not used [-Wunused-function] static pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr) ^~~~~~~~~~~~~ Wrap the function with "#ifdef CONFIG_PGSTE" to silence the warning. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/qdio: eliminate queue's last_move cursorJulian Wiedmann
This cursor is used for debugging only. But since commit "s390/qdio: pass up count of ready-to-process SBALs" it effectively duplicates the first_to_check cursor, diverging for just a short moment when get_*_buffer_frontier() updates q->first_to_check. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/qdio: simplify SBAL range calculationJulian Wiedmann
When passing a range of ready-to-process SBALs to the upper-layer driver, use the available 'count' instead of calculating the distance between the first_to_check and first_to_kick cursors. This simplifies the logic of the queue-scan path, and opens up the possibility of scanning all 128 SBALs in one go (as determining the reported count no longer requires wrap-around safe arithmetic on the queue's cursors). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/qdio: pass up count of ready-to-process SBALsJulian Wiedmann
When qdio_{in,out}bound_q_moved() scans a queue for pending work, it currently only returns a boolean to its caller. The interface to the upper-layer-drivers (qdio_kick_handler() and qdio_get_next_buffers()) then re-calculates the number of pending SBALs from the q->first_to_check and q->first_to_kick cursors. Refactor this so that whenever get_{in,out}bound_buffer_frontier() adjusted the queue's first_to_check cursor, it also returns the corresponding count of ready-to-process SBALs (and 0 else). A subsequent patch will then make use of this additional information. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/qdio: fix output of DSCI value in debug fileJulian Wiedmann
The DSCI is a 1-byte field, placed at the start of an u32. So when printing it to a queue's debug state, limit the output to the part that's actually occupied by the DSCI. When the DSCI is set this gives us the expected output of '1', rather than the current (obscure) value of '16777216'. Suggested-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/protvirt: block kernel command line alterationVasily Gorbik
Disallow kernel command line alteration via ipl parameter block if running in protected virtualization environment. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/protvirt: add memory sharing for diag 308 set/storeVasily Gorbik
Add sharing of ipl parameter block for diag 308 set/store calls to allow kvm access in protected virtualization environment. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/uv: introduce guest side ultravisor codeVasily Gorbik
The Ultravisor Call Facility (stfle bit 158) defines an API to the Ultravisor (UV calls), a mini hypervisor located at machine level. With help of the Ultravisor, KVM will be able to run "protected" VMs, special VMs whose memory and management data are unavailable to KVM. The protected VMs can also request services from the Ultravisor. The guest api consists of UV calls to share and unshare memory with the kvm hypervisor. To enable this feature support PROTECTED_VIRTUALIZATION_GUEST kconfig option has been introduced. Co-developed-by: Janosch Frank <frankja@de.ibm.com> Signed-off-by: Janosch Frank <frankja@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390: introduce .boot.preserved.data section compile time validationVasily Gorbik
Same as for .boot.data section make sure that .boot.preserved.data sections of vmlinux and arch/s390/compressed/vmlinux match before producing the compressed kernel image. Symbols presence, order and sizes are cross-checked. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390: move ipl block to .boot.preserved.data sectionVasily Gorbik
.boot.preserved.data is a better fit for ipl block than .boot.data which is discarded after init. Reusing .boot.preserved.data allows to simplify code a little bit and avoid copying data from .boot.data to persistent variables. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390: introduce .boot.preserved.data sectionGerald Schaefer
Introduce .boot.preserve.data section which is similar to .boot.data and "shared" between the decompressor code and the decompressed kernel. The decompressor will store values in it, and copy over to the decompressed image before starting it. This method allows to avoid using pre-defined addresses and other hacks to pass values between those boot phases. Unlike .boot.data section .boot.preserved.data is NOT a part of init data, and hence will be preserved for the kernel life time. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/zcrypt: fix possible deadlock situation on ap queue removeHarald Freudenberger
With commit 01396a374c3d ("s390/zcrypt: revisit ap device remove procedure") the ap queue remove is now a two stage process. However, a del_timer_sync() call may trigger the timer function which may try to lock the very same spinlock as is held by the function just initiating the del_timer_sync() call. This could end up in a deadlock situation. Very unlikely but possible as you need to remove an ap queue at the exact sime time when a timeout of a request occurs. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reported-by: Pierre Morel <pmorel@linux.ibm.com> Fixes: commit 01396a374c3d ("s390/zcrypt: revisit ap device remove procedure") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10s390/3270: fix lockdep false positive on view->lockMartin Schwidefsky
The spinlock in the raw3270_view structure is used by con3270, tty3270 and fs3270 in different ways. For con3270 the lock can be acquired in irq context, for tty3270 and fs3270 the highest context is bh. Lockdep sees the view->lock as a single class and if the 3270 driver is used for the console the following message is generated: WARNING: inconsistent lock state 5.1.0-rc3-05157-g5c168033979d #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. swapper/0/1 [HC0[0]:SC1[1]:HE1:SE0] takes: (____ptrval____) (&(&view->lock)->rlock){?.-.}, at: tty3270_update+0x7c/0x330 Introduce a lockdep subclass for the view lock to distinguish bh from irq locks. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10scsi: virtio_scsi: limit number of hw queues by nr_cpu_idsDongli Zhang
When tag_set->nr_maps is 1, the block layer limits the number of hw queues by nr_cpu_ids. No matter how many hw queues are used by virtio-scsi, as it has (tag_set->nr_maps == 1), it can use at most nr_cpu_ids hw queues. In addition, specifically for pci scenario, when the 'num_queues' specified by qemu is more than maxcpus, virtio-scsi would not be able to allocate more than maxcpus vectors in order to have a vector for each queue. As a result, it falls back into MSI-X with one vector for config and one shared for queues. Considering above reasons, this patch limits the number of hw queues used by virtio-scsi by nr_cpu_ids. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10virtio-blk: limit number of hw queues by nr_cpu_idsDongli Zhang
When tag_set->nr_maps is 1, the block layer limits the number of hw queues by nr_cpu_ids. No matter how many hw queues are used by virtio-blk, as it has (tag_set->nr_maps == 1), it can use at most nr_cpu_ids hw queues. In addition, specifically for pci scenario, when the 'num-queues' specified by qemu is more than maxcpus, virtio-blk would not be able to allocate more than maxcpus vectors in order to have a vector for each queue. As a result, it falls back into MSI-X with one vector for config and one shared for queues. Considering above reasons, this patch limits the number of hw queues used by virtio-blk by nr_cpu_ids. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10block, bfq: fix use after free in bfq_bfqq_expirePaolo Valente
The function bfq_bfqq_expire() invokes the function __bfq_bfqq_expire(), and the latter may free the in-service bfq-queue. If this happens, then no other instruction of bfq_bfqq_expire() must be executed, or a use-after-free will occur. Basing on the assumption that __bfq_bfqq_expire() invokes bfq_put_queue() on the in-service bfq-queue exactly once, the queue is assumed to be freed if its refcounter is equal to one right before invoking __bfq_bfqq_expire(). But, since commit 9dee8b3b057e ("block, bfq: fix queue removal from weights tree") this assumption is false. __bfq_bfqq_expire() may also invoke bfq_weights_tree_remove() and, since commit 9dee8b3b057e ("block, bfq: fix queue removal from weights tree"), also the latter function may invoke bfq_put_queue(). So __bfq_bfqq_expire() may invoke bfq_put_queue() twice, and this is the actual case where the in-service queue may happen to be freed. To address this issue, this commit moves the check on the refcounter of the queue right around the last bfq_put_queue() that may be invoked on the queue. Fixes: 9dee8b3b057e ("block, bfq: fix queue removal from weights tree") Reported-by: Dmitrii Tcvetkov <demfloro@demfloro.ru> Reported-by: Douglas Anderson <dianders@chromium.org> Tested-by: Dmitrii Tcvetkov <demfloro@demfloro.ru> Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-10ALSA: hda: Fix racy display power accessTakashi Iwai
snd_hdac_display_power() doesn't handle the concurrent calls carefully enough, and it may lead to the doubly get_power or put_power calls, when a runtime PM and an async work get called in racy way. This patch addresses it by reusing the bus->lock mutex that has been used for protecting the link state change in ext bus code, so that it can protect against racy display state changes. The initialization of bus->lock was moved from snd_hdac_ext_bus_init() to snd_hdac_bus_init() as well accordingly. Testcase: igt/i915_pm_rpm/module-reload #glk-dsi Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-10alarmtimer: Return correct remaining timeAndrei Vagin
To calculate a remaining time, it's required to subtract the current time from the expiration time. In alarm_timer_remaining() the arguments of ktime_sub are swapped. Fixes: d653d8457c76 ("alarmtimer: Implement remaining callback") Signed-off-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Stephen Boyd <sboyd@kernel.org> Cc: John Stultz <john.stultz@linaro.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190408041542.26338-1-avagin@gmail.com
2019-04-10mac80211: fix RX STBC override byte orderJohannes Berg
The original patch neglected to take byte order conversions into account, fix that. Fixes: d9bb410888ce ("mac80211: allow overriding HT STBC capabilities") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-04-10locking/lockdep: Zap lock classes even with lock debugging disabledBart Van Assche
The following commit: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use") changed the behavior of lockdep_free_key_range() from unconditionally zapping lock classes into only zapping lock classes if debug_lock == true. Not zapping lock classes if debug_lock == false leaves dangling pointers in several lockdep datastructures, e.g. lock_class::name in the all_lock_classes list. The shell command "cat /proc/lockdep" causes the kernel to iterate the all_lock_classes list. Hence the "unable to handle kernel paging request" cash that Shenghui encountered by running cat /proc/lockdep. Since the new behavior can cause cat /proc/lockdep to crash, restore the pre-v5.1 behavior. This patch avoids that cat /proc/lockdep triggers the following crash with debug_lock == false: BUG: unable to handle kernel paging request at fffffbfff40ca448 RIP: 0010:__asan_load1+0x28/0x50 Call Trace: string+0xac/0x180 vsnprintf+0x23e/0x820 seq_vprintf+0x82/0xc0 seq_printf+0x92/0xb0 print_name+0x34/0xb0 l_show+0x184/0x200 seq_read+0x59e/0x6c0 proc_reg_read+0x11f/0x170 __vfs_read+0x4d/0x90 vfs_read+0xc5/0x1f0 ksys_read+0xab/0x130 __x64_sys_read+0x43/0x50 do_syscall_64+0x71/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: shenghui <shhuiw@foxmail.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use") # v5.1-rc1. Link: https://lkml.kernel.org/r/20190403233552.124673-1-bvanassche@acm.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10ASoC: pcm: fix error handling when try_module_get() fails.Ranjani Sridharan
Handle error before returning when try_module_get() fails to prevent inconsistent mutex lock/unlock. Fixes: 52034add7 (ASoC: pcm: update module refcount if module_get_upon_open is set) Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-10apparmor: Restore Y/N in /sys for apparmor's "enabled"Kees Cook
Before commit c5459b829b71 ("LSM: Plumb visibility into optional "enabled" state"), /sys/module/apparmor/parameters/enabled would show "Y" or "N" since it was using the "bool" handler. After being changed to "int", this switched to "1" or "0", breaking the userspace AppArmor detection of dbus-broker. This restores the Y/N output while keeping the LSM infrastructure happy. Before: $ cat /sys/module/apparmor/parameters/enabled 1 After: $ cat /sys/module/apparmor/parameters/enabled Y Reported-by: David Rheinsberg <david.rheinsberg@gmail.com> Reviewed-by: David Rheinsberg <david.rheinsberg@gmail.com> Link: https://lkml.kernel.org/r/CADyDSO6k8vYb1eryT4g6+EHrLCvb68GAbHVWuULkYjcZcYNhhw@mail.gmail.com Fixes: c5459b829b71 ("LSM: Plumb visibility into optional "enabled" state") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-04-10ASoC: stm32: sai: fix master clock managementOlivier Moysan
When master clock is used, master clock rate is set exclusively. Parent clocks of master clock cannot be changed after a call to clk_set_rate_exclusive(). So the parent clock of SAI kernel clock must be set before. Ensure also that exclusive rate operations are balanced in STM32 SAI driver. Signed-off-by: Olivier Moysan <olivier.moysan@st.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-10ASoC: Intel: kbl: fix wrong number of channelsTzung-Bi Shih
Fix wrong setting on number of channels. The context wants to set constraint to 2 channels instead of 4. Signed-off-by: Tzung-Bi Shih <tzungbi@google.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-10x86/perf/amd: Remove need to check "running" bit in NMI handlerLendacky, Thomas
Spurious interrupt support was added to perf in the following commit, almost a decade ago: 63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters") The two previous patches (resolving the race condition when disabling a PMC and NMI latency mitigation) allow for the removal of this older spurious interrupt support. Currently in x86_pmu_stop(), the bit for the PMC in the active_mask bitmap is cleared before disabling the PMC, which sets up a race condition. This race condition was mitigated by introducing the running bitmap. That race condition can be eliminated by first disabling the PMC, waiting for PMC reset on overflow and then clearing the bit for the PMC in the active_mask bitmap. The NMI handler will not re-enable a disabled counter. If x86_pmu_stop() is called from the perf NMI handler, the NMI latency mitigation support will guard against any unhandled NMI messages. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10MAINTAINERS: Fix the I3C entryBoris Brezillon
There's no include/dt-bindings/i3c/ directory, remove this F: entry from the I3C file patterns. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Reported-by: Joe Perches <joe@perches.com> Fixes: 4f26d0666961 ("MAINTAINERS: Add myself as the I3C subsystem maintainer") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-04-10i3c: dw: Fix dw_i3c_master_disable controller by using correct maskVitor Soares
The controller was being disabled incorrectly. The correct way is to clear the DEV_CTRL_ENABLE bit. Fix this by clearing this bit. Cc: Boris Brezillon <bbrezillon@kernel.org> Cc: <stable@vger.kernel.org> Fixes: 1dd728f5d4d4 ("i3c: master: Add driver for Synopsys DesignWare IP") Signed-off-by: Vitor Soares <vitor.soares@synopsys.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-04-10i3c: Fix the verification of random PIDVitor Soares
The validation of random PID should be done by checking the boardinfo->pid instead of info.pid which is empty. Doing the change the info struture declaration is no longer necessary. Cc: Boris Brezillon <bbrezillon@kernel.org> Cc: <stable@vger.kernel.org> Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure") Signed-off-by: Vitor Soares <vitor.soares@synopsys.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-04-10locking/rwsem: Optimize rwsem structure for uncontended lock acquisitionWaiman Long
For an uncontended rwsem, count and owner are the only fields a task needs to touch when acquiring the rwsem. So they are put next to each other to increase the chance that they will share the same cacheline. On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2 cache, a rwsem locking microbenchmark with one locking thread was run to write-lock and write-unlock an array of rwsems separated 2 cachelines apart in a 1M byte memory block. The locking rates (kops/s) of the microbenchmark when the rwsems are at various "long" (8-byte) offsets from beginning of the cacheline before and after the patch were as follows: Cacheline Offset Pre-patch Post-patch ---------------- --------- ---------- 0 17,449 16,588 1 17,450 16,465 2 17,450 16,460 3 17,453 16,462 4 14,867 16,471 5 14,867 16,470 6 14,853 16,464 7 14,867 13,172 Before the patch, the count and owner are 4 "long"s apart. After the patch, they are only 1 "long" apart. The rwsem data have to be loaded from the L3 cache for each access. It can be seen that the locking rates are more consistent after the patch than before. Note that for this particular system, the performance drop happens whenever the count and owner are at an odd multiples of "long"s apart. No performance drop was observed when only a single rwsem was used (hot cache). So the drop is likely just an idiosyncrasy of the cache architecture of this chip than an inherent problem with the patch. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20190404174320.22416-12-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-10locking/rwsem: Enable lock event countingWaiman Long
Add lock event counting calls so that we can track the number of lock events happening in the rwsem code. With CONFIG_LOCK_EVENT_COUNTS on and booting a 4-socket 112-thread x86-64 system, the rwsem counts after system bootup were as follows: rwsem_opt_fail=261 rwsem_opt_wlock=50636 rwsem_rlock=445 rwsem_rlock_fail=0 rwsem_rlock_fast=22 rwsem_rtrylock=810144 rwsem_sleep_reader=441 rwsem_sleep_writer=310 rwsem_wake_reader=355 rwsem_wake_writer=2335 rwsem_wlock=261 rwsem_wlock_fail=0 rwsem_wtrylock=20583 It can be seen that most of the lock acquisitions in the slowpath were write-locks in the optimistic spinning code path with no sleeping at all. For this system, over 97% of the locks are acquired via optimistic spinning. It illustrates the importance of optimistic spinning in improving the performance of rwsem. Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20190404174320.22416-11-longman@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>