summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-05-01staging: fieldbus: anybus-s: fix wait_for_completion_timeout return handlingNicholas Mc Guire
wait_for_completion_timeout() returns unsigned long (0 on timeout or remaining jiffies) not int. Assigning this return value to int may theoretically overflow (though not in this case where TIMEOUT is only HZ*2). Fix this inconsistency by wrapping the wait_for_completion_timeout into the if(). Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Reviewed-by: Sven Van Asbroeck <TheSven73@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-01iov_iter: fix iov_iter_typeMing Lei
Commit 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag") introduces one extra flag of ITER_BVEC_FLAG_NO_REF, and this flag is stored into iter->type. However, iov_iter_type() doesn't consider the new added flag, fix it by masking this flag in iov_iter_type(). Fixes: 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01block: fix handling for BIO_NO_PAGE_REFMing Lei
Commit 399254aaf489211 ("block: add BIO_NO_PAGE_REF flag") introduces BIO_NO_PAGE_REF, and once this flag is set for one bio, all pages in the bio won't be get/put during IO. However, if one bio is submitted via __blkdev_direct_IO_simple(), even though BIO_NO_PAGE_REF is set, pages still may be put. Fixes this issue by avoiding to put pages if BIO_NO_PAGE_REF is set. Fixes: 399254aaf489211 ("block: add BIO_NO_PAGE_REF flag") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01io_uring: drop req submit reference always in async puntJens Axboe
If we don't end up actually calling submit in io_sq_wq_submit_work(), we still need to drop the submit reference to the request. If we don't, then we can leak the request. This can happen if we race with ring shutdown while flushing the workqueue for requests that require use of the mm_struct. Fixes: e65ef56db494 ("io_uring: use regular request ref counts") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01io_uring: free allocated io_memory onceMark Rutland
If io_allocate_scq_urings() fails to allocate an sq_* region, it will call io_mem_free() for any previously allocated regions, but leave dangling pointers to these regions in the ctx. Any regions which have not yet been allocated are left NULL. Note that when returning -EOVERFLOW, the previously allocated sq_ring is not freed, which appears to be an unintentional leak. When io_allocate_scq_urings() fails, io_uring_create() will call io_ring_ctx_wait_and_kill(), which calls io_mem_free() on all the sq_* regions, assuming the pointers are valid and not NULL. This can result in pages being freed multiple times, which has been observed to corrupt the page state, leading to subsequent fun. This can also result in virt_to_page() on NULL, resulting in the use of bogus page addresses, and yet more subsequent fun. The latter can be detected with CONFIG_DEBUG_VIRTUAL on arm64. Adding a cleanup path to io_allocate_scq_urings() complicates the logic, so let's leave it to io_ring_ctx_free() to consistently free these pointers, and simplify the io_allocate_scq_urings() error paths. Full splats from before this patch below. Note that the pointer logged by the DEBUG_VIRTUAL "non-linear address" warning has been hashed, and is actually NULL. [ 26.098129] page:ffff80000e949a00 count:0 mapcount:-128 mapping:0000000000000000 index:0x0 [ 26.102976] flags: 0x63fffc000000() [ 26.104373] raw: 000063fffc000000 ffff80000e86c188 ffff80000ea3df08 0000000000000000 [ 26.108917] raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000 [ 26.137235] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) [ 26.143960] ------------[ cut here ]------------ [ 26.146020] kernel BUG at include/linux/mm.h:547! [ 26.147586] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 26.149163] Modules linked in: [ 26.150287] Process syz-executor.21 (pid: 20204, stack limit = 0x000000000e9cefeb) [ 26.153307] CPU: 2 PID: 20204 Comm: syz-executor.21 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #18 [ 26.156566] Hardware name: linux,dummy-virt (DT) [ 26.158089] pstate: 40400005 (nZcv daif +PAN -UAO) [ 26.159869] pc : io_mem_free+0x9c/0xa8 [ 26.161436] lr : io_mem_free+0x9c/0xa8 [ 26.162720] sp : ffff000013003d60 [ 26.164048] x29: ffff000013003d60 x28: ffff800025048040 [ 26.165804] x27: 0000000000000000 x26: ffff800025048040 [ 26.167352] x25: 00000000000000c0 x24: ffff0000112c2820 [ 26.169682] x23: 0000000000000000 x22: 0000000020000080 [ 26.171899] x21: ffff80002143b418 x20: ffff80002143b400 [ 26.174236] x19: ffff80002143b280 x18: 0000000000000000 [ 26.176607] x17: 0000000000000000 x16: 0000000000000000 [ 26.178997] x15: 0000000000000000 x14: 0000000000000000 [ 26.181508] x13: 00009178a5e077b2 x12: 0000000000000001 [ 26.183863] x11: 0000000000000000 x10: 0000000000000980 [ 26.186437] x9 : ffff000013003a80 x8 : ffff800025048a20 [ 26.189006] x7 : ffff8000250481c0 x6 : ffff80002ffe9118 [ 26.191359] x5 : ffff80002ffe9118 x4 : 0000000000000000 [ 26.193863] x3 : ffff80002ffefe98 x2 : 44c06ddd107d1f00 [ 26.196642] x1 : 0000000000000000 x0 : 000000000000003e [ 26.198892] Call trace: [ 26.199893] io_mem_free+0x9c/0xa8 [ 26.201155] io_ring_ctx_wait_and_kill+0xec/0x180 [ 26.202688] io_uring_setup+0x6c4/0x6f0 [ 26.204091] __arm64_sys_io_uring_setup+0x18/0x20 [ 26.205576] el0_svc_common.constprop.0+0x7c/0xe8 [ 26.207186] el0_svc_handler+0x28/0x78 [ 26.208389] el0_svc+0x8/0xc [ 26.209408] Code: aa0203e0 d0006861 9133a021 97fcdc3c (d4210000) [ 26.211995] ---[ end trace bdb81cd43a21e50d ]--- [ 81.770626] ------------[ cut here ]------------ [ 81.825015] virt_to_phys used for non-linear address: 000000000d42f2c7 ( (null)) [ 81.827860] WARNING: CPU: 1 PID: 30171 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x48/0x68 [ 81.831202] Modules linked in: [ 81.832212] CPU: 1 PID: 30171 Comm: syz-executor.20 Not tainted 5.1.0-rc7-00004-g7d30b2ea43d6 #19 [ 81.835616] Hardware name: linux,dummy-virt (DT) [ 81.836863] pstate: 60400005 (nZCv daif +PAN -UAO) [ 81.838727] pc : __virt_to_phys+0x48/0x68 [ 81.840572] lr : __virt_to_phys+0x48/0x68 [ 81.842264] sp : ffff80002cf67c70 [ 81.843858] x29: ffff80002cf67c70 x28: ffff800014358e18 [ 81.846463] x27: 0000000000000000 x26: 0000000020000080 [ 81.849148] x25: 0000000000000000 x24: ffff80001bb01f40 [ 81.851986] x23: ffff200011db06c8 x22: ffff2000127e3c60 [ 81.854351] x21: ffff800014358cc0 x20: ffff800014358d98 [ 81.856711] x19: 0000000000000000 x18: 0000000000000000 [ 81.859132] x17: 0000000000000000 x16: 0000000000000000 [ 81.861586] x15: 0000000000000000 x14: 0000000000000000 [ 81.863905] x13: 0000000000000000 x12: ffff1000037603e9 [ 81.866226] x11: 1ffff000037603e8 x10: 0000000000000980 [ 81.868776] x9 : ffff80002cf67840 x8 : ffff80001bb02920 [ 81.873272] x7 : ffff1000037603e9 x6 : ffff80001bb01f47 [ 81.875266] x5 : ffff1000037603e9 x4 : dfff200000000000 [ 81.876875] x3 : ffff200010087528 x2 : ffff1000059ecf58 [ 81.878751] x1 : 44c06ddd107d1f00 x0 : 0000000000000000 [ 81.880453] Call trace: [ 81.881164] __virt_to_phys+0x48/0x68 [ 81.882919] io_mem_free+0x18/0x110 [ 81.886585] io_ring_ctx_wait_and_kill+0x13c/0x1f0 [ 81.891212] io_uring_setup+0xa60/0xad0 [ 81.892881] __arm64_sys_io_uring_setup+0x2c/0x38 [ 81.894398] el0_svc_common.constprop.0+0xac/0x150 [ 81.896306] el0_svc_handler+0x34/0x88 [ 81.897744] el0_svc+0x8/0xc [ 81.898715] ---[ end trace b4a703802243cbba ]--- Fixes: 2b188cc1bb857a9d ("Add io_uring IO interface") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-block@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01io_uring: fix SQPOLL cpu validationMark Rutland
In io_sq_offload_start(), we call cpu_possible() on an unbounded cpu value from userspace. On v5.1-rc7 on arm64 with CONFIG_DEBUG_PER_CPU_MAPS, this results in a splat: WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline] There was an attempt to fix this in commit: 917257daa0fea7a0 ("io_uring: only test SQPOLL cpu after we've verified it") ... by adding a check after the cpu value had been limited to NR_CPU_IDS using array_index_nospec(). However, this left an unbound check at the start of the function, for which the warning still fires. Let's fix this correctly by checking that the cpu value is bound by nr_cpu_ids before passing it to cpu_possible(). Note that only nr_cpu_ids of a cpumask are guaranteed to exist at runtime, and nr_cpu_ids can be significantly smaller than NR_CPUs. For example, an arm64 defconfig has NR_CPUS=256, while my test VM has 4 vCPUs. Following the intent from the commit message for 917257daa0fea7a0, the check is moved under the SQ_AFF branch, which is the only branch where the cpu values is consumed. The check is performed before bounding the value with array_index_nospec() so that we don't silently accept bogus cpu values from userspace, where array_index_nospec() would force these values to 0. I suspect we can remove the array_index_nospec() call entirely, but I've conservatively left that in place, updated to use nr_cpu_ids to match the prior check. Tested on arm64 with the Syzkaller reproducer: https://syzkaller.appspot.com/bug?extid=cd714a07c6de2bc34293 https://syzkaller.appspot.com/x/repro.syz?x=15d8b397200000 Full splat from before this patch: WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline] WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_check include/linux/cpumask.h:128 [inline] WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpumask_test_cpu include/linux/cpumask.h:344 [inline] WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_sq_offload_start fs/io_uring.c:2244 [inline] WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_create fs/io_uring.c:2864 [inline] WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 27601 Comm: syz-executor.0 Not tainted 5.1.0-rc7 #3 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x2f0 include/linux/compiler.h:193 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:158 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x110/0x190 lib/dump_stack.c:113 panic+0x384/0x68c kernel/panic.c:214 __warn+0x2bc/0x2c0 kernel/panic.c:571 report_bug+0x228/0x2d8 lib/bug.c:186 bug_handler+0xa0/0x1a0 arch/arm64/kernel/traps.c:956 call_break_hook arch/arm64/kernel/debug-monitors.c:301 [inline] brk_handler+0x1d4/0x388 arch/arm64/kernel/debug-monitors.c:316 do_debug_exception+0x1a0/0x468 arch/arm64/mm/fault.c:831 el1_dbg+0x18/0x8c cpu_max_bits_warn include/linux/cpumask.h:121 [inline] cpumask_check include/linux/cpumask.h:128 [inline] cpumask_test_cpu include/linux/cpumask.h:344 [inline] io_sq_offload_start fs/io_uring.c:2244 [inline] io_uring_create fs/io_uring.c:2864 [inline] io_uring_setup+0x1108/0x15a0 fs/io_uring.c:2916 __do_sys_io_uring_setup fs/io_uring.c:2929 [inline] __se_sys_io_uring_setup fs/io_uring.c:2926 [inline] __arm64_sys_io_uring_setup+0x50/0x70 fs/io_uring.c:2926 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall arch/arm64/kernel/syscall.c:47 [inline] el0_svc_common.constprop.0+0x148/0x2e0 arch/arm64/kernel/syscall.c:83 el0_svc_handler+0xdc/0x100 arch/arm64/kernel/syscall.c:129 el0_svc+0x8/0xc arch/arm64/kernel/entry.S:948 SMP: stopping secondary CPUs Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled CPU features: 0x002,23000438 Memory Limit: none Rebooting in 1 seconds.. Fixes: 917257daa0fea7a0 ("io_uring: only test SQPOLL cpu after we've verified it") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-block@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Simplied the logic Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01Merge branch 'for-next/mitigations' of ↵Will Deacon
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core
2019-05-01Merge branch 'for-next/futex' of ↵Will Deacon
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-next/core
2019-05-01Documentation: Add ARM64 to kernel-parameters.rstJosh Poimboeuf
Add ARM64 to the legend of architectures. It's already used in several places in kernel-parameters.txt. Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01arm64/speculation: Support 'mitigations=' cmdline optionJosh Poimboeuf
Configure arm64 runtime CPU speculation bug mitigations in accordance with the 'mitigations=' cmdline option. This affects Meltdown, Spectre v2, and Speculative Store Bypass. The default behavior is unchanged. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> [will: reorder checks so KASLR implies KPTI and SSBS is affected by cmdline] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSBWill Deacon
SSBS provides a relatively cheap mitigation for SSB, but it is still a mitigation and its presence does not indicate that the CPU is unaffected by the vulnerability. Tweak the mitigation logic so that we report the correct string in sysfs. Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01arm64: enable generic CPU vulnerabilites supportMian Yousaf Kaukab
Enable CPU vulnerabilty show functions for spectre_v1, spectre_v2, meltdown and store-bypass. Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01arm64: add sysfs vulnerability show for speculative store bypassJeremy Linton
Return status based on ssbd_state and __ssb_safe. If the mitigation is disabled, or the firmware isn't responding then return the expected machine state based on a whitelist of known good cores. Given a heterogeneous machine, the overall machine vulnerability defaults to safe but is reset to unsafe when we miss the whitelist and the firmware doesn't explicitly tell us the core is safe. In order to make that work we delay transitioning to vulnerable until we know the firmware isn't responding to avoid a case where we miss the whitelist, but the firmware goes ahead and reports the core is not vulnerable. If all the cores in the machine have SSBS, then __ssb_safe will remain true. Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01arm64: Fix size of __early_cpu_boot_statusArun KS
__early_cpu_boot_status is of type long. Use quad assembler directive to allocate proper size. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Arun KS <arunks@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-01Merge branch 'aquantia-next'David S. Miller
Igor Russkikh says: ==================== net: atlantic: Aquantia driver updates 2019-04 This patchset contains various improvements: - Work targeting link up speedups: link interrupt introduced, some other logic changes to imrove this. - FW operations securing with mutex - Counters and statistics logic improved by Dmitry - read out of chip temperature via hwmon interface implemented by Yana and Nikita. v4 changes: - remove drvinfo_exit noop - 64bit stats should be readed out sequentially (lsw, then msw) declare 64bit read ops for that v3 changes: - temp ops renamed to phy_temp ops - mutex commits squashed for better structure v2 changes: - use threaded irq for link state handling - rework hwmon via devm_hwmon_device_register_with_info Extra comments on review from Andrew: - direct device name pointer is used in hwmon registration. This causes hwmon device to derive possible interface name changes - Will consider sanity checks for firmware mutex lock separately. Right now there is no single point exsists where such check could be easily added. - There is no way now to fetch and configure min/max/crit temperatures via FW. Will investigate this separately. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: remove outdated device idsNikita Danilov
Some device ids were never released and does not exist. Cleanup these. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: fixups on 64bit dma countersDmitry Bogdanov
DMA counters are 64 bit and we can fetch that to reduce counter overflow, espesially on byte counters. Tested-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: get total counters from DMA blockDmitry Bogdanov
aq_nic_update_ndev_stats pushes statistics to ndev->stats from system interface. This is not always good because it counts packets/bytes before any of rx filters (including mac filter). Its better to report the packet/bytes statistics from DMA counters which gives actual values of data transferred over pci. System level stats is still available via ethtool. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: fetch up to date statistics on ethtool requestDmitry Bogdanov
This improves ethtool -S usage, where stats are now actual on each request. Before that stats only were updated at service timer period. Tested-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: extract timer cb into work jobIgor Russkikh
Service timer callback fetches statistics from FW and that may cause a long delay in error cases. We also now need to use fw mutex to prevent concurrent access to FW, thus - extract that logic from timer callback into the job in the separate work queue. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: introduce fwreq mutexNikita Danilov
Some of FW operations could be invoked simultaneously, from f.e. ethtool context and from service service activity work. Here we introduce a fw mutex to secure and serialize access to FW logic. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: user correct MSI irq typeIgor Russkikh
Typo in msi code. No much impact though. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: use macros for better visibilityIgor Russkikh
Improve for better readability Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: improve ifup link detectionIgor Russkikh
Original code detected link only after 1 sec is passed after up. Here we replace this with direct service callback which updates link status immediately Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: link status irq handlingIgor Russkikh
Here we define and request an extra interrupt line, assign it on link isr handler and restructure abit aq_pci code to better support that. We also remove logic for using different timer intervals depending on link state, since thats now useless. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: create global service workqueueNikita Danilov
We need this to schedule link interrupt handling and various service tasks. Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: link interrupt handling functionIgor Russkikh
Define link interrupt handler Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: add link interrupt fieldsIgor Russkikh
Declare macroes and nic fields to support link interrupt handling Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: implement hwmon api for chip temperatureYana Esina
Added support for hwmon api to fetch out chip temperature Signed-off-by: Yana Esina <yana.esina@aquantia.com> Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01net: aquantia: add infrastructure to readout chip temperatureYana Esina
Ability to read the chip temperature from memory via hwmon interface Signed-off-by: Yana Esina <yana.esina@aquantia.com> Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01sctp: avoid running the sctp state machine recursivelyXin Long
Ying triggered a call trace when doing an asconf testing: BUG: scheduling while atomic: swapper/12/0/0x10000100 Call Trace: <IRQ> [<ffffffffa4375904>] dump_stack+0x19/0x1b [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72 [<ffffffffa437b93a>] __schedule+0x9ba/0xa00 [<ffffffffa3cd5326>] __cond_resched+0x26/0x30 [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50 [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200 [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0 [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp] [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp] [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp] [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp] [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp] [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp] [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp] As it shows, the first sctp_do_sm() running under atomic context (NET_RX softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later, and this flag is supposed to be used in non-atomic context only. Besides, sctp_do_sm() was called recursively, which is not expected. Vlad tried to fix this recursive call in Commit c0786693404c ("sctp: Fix oops when sending queued ASCONF chunks") by introducing a new command SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will be called in this command again. To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st sctp_do_sm() directly. Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01nvmet: protect discovery change log event list iterationSagi Grimberg
When we iterate on the discovery subsystem controllers we need to protect against concurrent mutations to it. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme: mark nvme_core_init and nvme_core_exit staticChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2019-05-01nvme: move command size checks to the coreChristoph Hellwig
Most command aren't PCIe specific, so move the size checking for them to core.c Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2019-05-01nvme-fabrics: check more command sizesMinwoo Im
struct common_command provides a common structure for NVMe-oF command format. It also needs to be checked for unintended size growth. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-pci: check more command sizesMinwoo Im
All the NVMe command has 64bytes fixed size so that it has been assured with BUILD_BUG_ON(). The remaining command structures in linux/nvme.h also need to be checked here. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-pci: remove an unneeded variable initializationMinwoo Im
Variable "n" will be assigned once kstrtoint() succeeds, otherwise it will not be referred because kstrtoint() will return an error which means go out from this function. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-pci: unquiesce admin queue on shutdownKeith Busch
Just like IO queues, the admin queue also will not be restarted after a controller shutdown. Unquiesce this queue so that we do not block request dispatch on a permanently disabled controller. Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-pci: shutdown on timeout during deletionKeith Busch
We do not restart a controller in a deleting state for timeout errors. When in this state, unblock potential request dispatchers with failed completions by shutting down the controller on timeout detection. Reported-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-pci: fix psdt field for single segment sglsKlaus Birkelund Jensen
The shortcut for single segment SGL requests did not set the PSDT field to mark the request as using SGLs. Fixes: 297910571f08 ("nvme-pci: optimize mapping single segment requests using SGLs") Signed-off-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-multipath: don't print ANA group state by defaultHannes Reinecke
Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-multipath: split bios with the ns_head bio_set before submittingHannes Reinecke
If the bio is moved to a different queue via blk_steal_bios() and the original queue is destroyed in nvme_remove_ns() we'll be ending with a crash in bio_endio() as the mempool for the split bio bvecs had already been destroyed. So split the bio using the original queue (which will remain during the lifetime of the bio) before sending it down to the underlying device. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01nvme-tcp: fix possible null deref on a timed out io queue connectSagi Grimberg
If I/O queue connect times out, we might have freed the queue socket already, so check for that on the error path in nvme_tcp_start_queue. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-05-01stmmac: pci: Fix typo in IOT2000 commentJan Kiszka
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01Documentation: fix netdev-FAQ.rst markup warningRandy Dunlap
Fix ReST underline warning: ./Documentation/networking/netdev-FAQ.rst:135: WARNING: Title underline too short. Q: I made changes to only a few patches in a patch series should I resend only those changed? -------------------------------------------------------------------------------------------- Fixes: ffa91253739c ("Documentation: networking: Update netdev-FAQ regarding patches") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-01kobject: fix dereference before null check on kobjColin Ian King
The kobj pointer is being null-checked so potentially it could be null, however, the ktype declaration before the null check is dereferencing kobj hence we have a potential null pointer deference. Fix this by moving the assignment of ktype after kobj has been null checked. Addresses-Coverity: ("Dereference before null check") Fixes: aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-01powerpc/tm: Avoid machine crash on rt_sigreturn()Breno Leitao
There is a kernel crash that happens if rt_sigreturn() is called inside a transactional block. This crash happens if the kernel hits an in-kernel page fault when accessing userspace memory, usually through copy_ckvsx_to_user(). A major page fault calls might_sleep() function, which can cause a task reschedule. A task reschedule (switch_to()) reclaim and recheckpoint the TM states, but, in the signal return path, the checkpointed memory was already reclaimed, thus the exception stack has MSR that points to MSR[TS]=0. When the code returns from might_sleep() and a task reschedule happened, then this task is returned with the memory recheckpointed, and CPU MSR[TS] = suspended. This means that there is a side effect at might_sleep() if it is called with CPU MSR[TS] = 0 and the task has regs->msr[TS] != 0. This side effect can cause a TM bad thing, since at the exception entrance, the stack saves MSR[TS]=0, and this is what will be used at RFID, but, the processor has MSR[TS] = Suspended, and this transition will be invalid and a TM Bad thing will be raised, causing the following crash: Unexpected TM Bad Thing exception at c00000000000e9ec (msr 0x8000000302a03031) tm_scratch=800000010280b033 cpu 0xc: Vector: 700 (Program Check) at [c00000003ff1fd70] pc: c00000000000e9ec: fast_exception_return+0x100/0x1bc lr: c000000000032948: handle_rt_signal64+0xb8/0xaf0 sp: c0000004263ebc40 msr: 8000000302a03031 current = 0xc000000415050300 paca = 0xc00000003ffc4080 irqmask: 0x03 irq_happened: 0x01 pid = 25006, comm = sigfuz Linux version 5.0.0-rc1-00001-g3bd6e94bec12 (breno@debian) (gcc version 8.2.0 (Debian 8.2.0-3)) #899 SMP Mon Jan 7 11:30:07 EST 2019 WARNING: exception is not recoverable, can't continue enter ? for help [c0000004263ebc40] c000000000032948 handle_rt_signal64+0xb8/0xaf0 (unreliable) [c0000004263ebd30] c000000000022780 do_notify_resume+0x2f0/0x430 [c0000004263ebe20] c00000000000e844 ret_from_except_lite+0x70/0x74 --- Exception: c00 (System Call) at 00007fffbaac400c SP (7fffeca90f40) is in userspace The solution for this problem is running the sigreturn code with regs->msr[TS] disabled, thus, avoiding hitting the side effect above. This does not seem to be a problem since regs->msr will be replaced by the ucontext value, so, it is being flushed already. In this case, it is flushed earlier. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-01powerpc/mm/radix: Fix kernel crash when running subpage protect testAneesh Kumar K.V
This patch fixes the below crash by making sure we touch the subpage protection related structures only if we know they are allocated on the platform. With radix translation we don't allocate hash context at all and trying to access subpage_prot_table results in: Faulting instruction address: 0xc00000000008bdb4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV .... NIP [c00000000008bdb4] sys_subpage_prot+0x74/0x590 LR [c00000000000b688] system_call+0x5c/0x70 Call Trace: [c00020002c6b7d30] [c00020002c6b7d90] 0xc00020002c6b7d90 (unreliable) [c00020002c6b7e20] [c00000000000b688] system_call+0x5c/0x70 Instruction dump: fb61ffd8 fb81ffe0 fba1ffe8 fbc1fff0 fbe1fff8 f821ff11 e92d1178 f9210068 39200000 e92d0968 ebe90630 e93f03e8 <eb891038> 60000000 3860fffe e9410068 We also move the subpage_prot_table with mmp_sem held to avoid race between two parallel subpage_prot syscall. Fixes: 701101865f5d ("powerpc/mm: Reduce memory usage for mm_context_t for radix") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-01io_uring: have submission side sqe errors post a cqeJens Axboe
Currently we only post a cqe if we get an error OUTSIDE of submission. For submission, we return the error directly through io_uring_enter(). This is a bit awkward for applications, and it makes more sense to always post a cqe with an error, if the error happens on behalf of an sqe. This changes submission behavior a bit. io_uring_enter() returns -ERROR for an error, and > 0 for number of sqes submitted. Before this change, if you wanted to submit 8 entries and had an error on the 5th entry, io_uring_enter() would return 4 (for number of entries successfully submitted) and rewind the sqring. The application would then have to peek at the sqring and figure out what was wrong with the head sqe, and then skip it itself. With this change, we'll return 5 since we did consume 5 sqes, and the last sqe (with the error) will result in a cqe being posted with the error. This makes the logic easier to handle in the application, and it cleans up the submission part. Suggested-by: Stefan Bühler <source@stbuehler.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-01bcache: make is_discard_enabled() staticJens Axboe
It's not used outside this file. Fixes: 631207314d88 ("bcache: fix failure in journal relplay") Signed-off-by: Jens Axboe <axboe@kernel.dk>