summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-17net: fs_enet: do not call phy_stop() in interruptsChristophe Leroy
In case of TX timeout, fs_timeout() calls phy_stop(), which triggers the following BUG_ON() as we are in interrupt. [92708.199889] kernel BUG at drivers/net/phy/mdio_bus.c:482! [92708.204985] Oops: Exception in kernel mode, sig: 5 [#1] [92708.210119] PREEMPT [92708.212107] CMPC885 [92708.214216] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G W 4.9.61 #39 [92708.223227] task: c60f0a40 task.stack: c6104000 [92708.227697] NIP: c02a84bc LR: c02a947c CTR: c02a93d8 [92708.232614] REGS: c6105c70 TRAP: 0700 Tainted: G W (4.9.61) [92708.241193] MSR: 00021032 <ME,IR,DR,RI>[92708.244818] CR: 24000822 XER: 20000000 [92708.248767] GPR00: c02a947c c6105d20 c60f0a40 c62b4c00 00000005 0000001f c069aad8 0001a688 GPR08: 00000007 00000100 c02a93d8 00000000 000005fc 00000000 c6213240 c06338e4 GPR16: 00000001 c06330d4 c0633094 00000000 c0680000 c6104000 c6104000 00000000 GPR24: 00000200 00000000 ffffffff 00000004 00000078 00009032 00000000 c62b4c00 NIP [c02a84bc] mdiobus_read+0x20/0x74 [92708.281517] LR [c02a947c] kszphy_config_intr+0xa4/0xc4 [92708.286547] Call Trace: [92708.288980] [c6105d20] [c6104000] 0xc6104000 (unreliable) [92708.294339] [c6105d40] [c02a947c] kszphy_config_intr+0xa4/0xc4 [92708.300098] [c6105d50] [c02a5330] phy_stop+0x60/0x9c [92708.305007] [c6105d60] [c02c84d0] fs_timeout+0xdc/0x110 [92708.310197] [c6105d80] [c035cd48] dev_watchdog+0x268/0x2a0 [92708.315593] [c6105db0] [c0060288] call_timer_fn+0x34/0x17c [92708.321014] [c6105dd0] [c00605f0] run_timer_softirq+0x21c/0x2e4 [92708.326887] [c6105e50] [c001e19c] __do_softirq+0xf4/0x2f4 [92708.332207] [c6105eb0] [c001e3c8] run_ksoftirqd+0x2c/0x40 [92708.337560] [c6105ec0] [c003b420] smpboot_thread_fn+0x1f0/0x258 [92708.343405] [c6105ef0] [c003745c] kthread+0xbc/0xd0 [92708.348217] [c6105f40] [c000c400] ret_from_kernel_thread+0x5c/0x64 [92708.354275] Instruction dump: [92708.357207] 7c0803a6 bbc10018 38210020 4e800020 7c0802a6 9421ffe0 54290024 bfc10018 [92708.364865] 90010024 7c7f1b78 81290008 552902ee <0f090000> 3bc3002c 7fc3f378 90810008 [92708.372711] ---[ end trace 42b05441616fafd7 ]--- This patch moves fs_timeout() actions into an async worker. Fixes: commit 48257c4f168e5 ("Add fs_enet ethernet network driver, for several embedded platforms") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17r8152: disable RX aggregation on Dell TB16 dockKai-Heng Feng
r8153 on Dell TB15/16 dock corrupts rx packets. This change is suggested by Realtek. They guess that the XHCI controller doesn't have enough buffer, and their guesswork is correct, once the RX aggregation gets disabled, the issue is gone. ASMedia is currently working on a real sulotion for this issue. Dell and ODM confirm the bcdDevice and iSerialNumber is unique for TB16. Note that TB15 has different bcdDevice and iSerialNumber, which are not unique values. If you still have TB15, please contact Dell to replace it with TB16. BugLink: https://bugs.launchpad.net/bugs/1729674 Cc: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - A rather involved set of memory hardware encryption fixes to support the early loading of microcode files via the initrd. These are larger than what we normally take at such a late -rc stage, but there are two mitigating factors: 1) much of the changes are limited to the SME code itself 2) being able to early load microcode has increased importance in the post-Meltdown/Spectre era. - An IRQ vector allocator fix - An Intel RDT driver use-after-free fix - An APIC driver bug fix/revert to make certain older systems boot again - A pkeys ABI fix - TSC calibration fixes - A kdump fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/vector: Fix off by one in error path x86/intel_rdt/cqm: Prevent use after free x86/mm: Encrypt the initrd earlier for BSP microcode update x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption x86/mm: Centralize PMD flags in sme_encrypt_kernel() x86/mm: Use a struct to reduce parameters for SME PGD mapping x86/mm: Clean up register saving in the __enc_copy() assembly code x86/idt: Mark IDT tables __initconst Revert "x86/apic: Remove init_bsp_APIC()" x86/mm/pkeys: Fix fill_sig_info_pkey x86/tsc: Print tsc_khz, when it differs from cpu_khz x86/tsc: Fix erroneous TSC rate on Skylake Xeon x86/tsc: Future-proof native_calibrate_tsc() kdump: Write the correct address of mem_section into vmcoreinfo
2018-01-17Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "A delayacct statistics correctness fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: delayacct: Account blkio completion on the correct task
2018-01-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Ingo Molnar: "An Intel RAPL events fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/rapl: Fix Haswell and Broadwell server RAPL event
2018-01-17Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two futex fixes: a input parameters robustness fix, and futex race fixes" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Prevent overflow by strengthen input validation futex: Avoid violating the 10th rule of futex
2018-01-17tun: fix a memory leak for tfile->tx_arrayCong Wang
tfile->tun could be detached before we close the tun fd, via tun_detach_all(), so it should not be used to check for tfile->tx_array. As Jason suggested, we probably have to clean it up unconditionally both in __tun_deatch() and tun_detach_all(), but this requires to check if it is initialized or not. Currently skb_array_cleanup() doesn't have such a check, so I check it in the caller and introduce a helper function, it is a bit ugly but we can always improve it in net-next. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: 1576d9860599 ("tun: switch to use skb array for tx") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti bits and fixes from Thomas Gleixner: "This last update contains: - An objtool fix to prevent a segfault with the gold linker by changing the invocation order. That's not just for gold, it's a general robustness improvement. - An improved error message for objtool which spares tearing hairs. - Make KASAN fail loudly if there is not enough memory instead of oopsing at some random place later - RSB fill on context switch to prevent RSB underflow and speculation through other units. - Make the retpoline/RSB functionality work reliably for both Intel and AMD - Add retpoline to the module version magic so mismatch can be detected - A small (non-fix) update for cpufeatures which prevents cpu feature clashing for the upcoming extra mitigation bits to ease backporting" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: module: Add retpoline tag to VERMAGIC x86/cpufeature: Move processor tracing out of scattered features objtool: Improve error message for bad file argument objtool: Fix seg fault with gold linker x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros x86/retpoline: Fill RSB on context switch for affected CPUs x86/kasan: Panic if there is not enough memory to boot
2018-01-17Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A one-liner fix which prevents deferrable timers becoming stale when the system does not switch into NOHZ mode" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Unconditionally check deferrable base
2018-01-17ARM: net: bpf: clarify tail_call indexRussell King
As per 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT"), the index used for array lookup is defined to be 32-bit wide. Update a misleading comment that suggests it is 64-bit wide. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: fix LDX instructionsRussell King
When the source and destination register are identical, our JIT does not generate correct code, which leads to kernel oopses. Fix this by (a) generating more efficient code, and (b) making use of the temporary earlier if we will overwrite the address register. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: fix register savingRussell King
When an eBPF program tail-calls another eBPF program, it enters it after the prologue to avoid having complex stack manipulations. This can lead to kernel oopses, and similar. Resolve this by always using a fixed stack layout, a CPU register frame pointer, and using this when reloading registers before returning. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: correct stack layout documentationRussell King
The stack layout documentation incorrectly suggests that the BPF JIT scratch space starts immediately below BPF_FP. This is not correct, so let's fix the documentation to reflect reality. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: move stack documentationRussell King
Move the stack documentation towards the top of the file, where it's relevant for things like the register layout. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: fix stack alignmentRussell King
As per 2dede2d8e925 ("ARM EABI: stack pointer must be 64-bit aligned after a CPU exception") the stack should be aligned to a 64-bit boundary on EABI systems. Ensure that the eBPF JIT appropraitely aligns the stack. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: fix tail call jumpsRussell King
When a tail call fails, it is documented that the tail call should continue execution at the following instruction. An example tail call sequence is: 12: (85) call bpf_tail_call#12 13: (b7) r0 = 0 14: (95) exit The ARM assembler for the tail call in this case ends up branching to instruction 14 instead of instruction 13, resulting in the BPF filter returning a non-zero value: 178: ldr r8, [sp, #588] ; insn 12 17c: ldr r6, [r8, r6] 180: ldr r8, [sp, #580] 184: cmp r8, r6 188: bcs 0x1e8 18c: ldr r6, [sp, #524] 190: ldr r7, [sp, #528] 194: cmp r7, #0 198: cmpeq r6, #32 19c: bhi 0x1e8 1a0: adds r6, r6, #1 1a4: adc r7, r7, #0 1a8: str r6, [sp, #524] 1ac: str r7, [sp, #528] 1b0: mov r6, #104 1b4: ldr r8, [sp, #588] 1b8: add r6, r8, r6 1bc: ldr r8, [sp, #580] 1c0: lsl r7, r8, #2 1c4: ldr r6, [r6, r7] 1c8: cmp r6, #0 1cc: beq 0x1e8 1d0: mov r8, #32 1d4: ldr r6, [r6, r8] 1d8: add r6, r6, #44 1dc: bx r6 1e0: mov r0, #0 ; insn 13 1e4: mov r1, #0 1e8: add sp, sp, #596 ; insn 14 1ec: pop {r4, r5, r6, r7, r8, sl, pc} For other sequences, the tail call could end up branching midway through the following BPF instructions, or maybe off the end of the function, leading to unknown behaviours. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUsRussell King
Avoid the 'bx' instruction on CPUs that have no support for Thumb and thus do not implement this instruction by moving the generation of this opcode to a separate function that selects between: bx reg and mov pc, reg according to the capabilities of the CPU. Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-01-17mtd: ubi: Use 'max_bad_blocks' to compute bad_peb_limit if availableJeff Westfahl
If the user has not set max_beb_per1024 using either the cmdline or Kconfig options for doing so, use the MTD function 'max_bad_blocks' to compute the UBI bad_peb_limit. Signed-off-by: Jeff Westfahl <jeff.westfahl@ni.com> Signed-off-by: Zach Brown <zach.brown@ni.com> Acked-by: Boris Brezillon <boris.brezillon@free-electron.com> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-01-17ubifs: Fix uninitialized variable in search_dh_cookie()Geert Uytterhoeven
fs/ubifs/tnc.c: In function ‘search_dh_cookie’: fs/ubifs/tnc.c:1893: warning: ‘err’ is used uninitialized in this function Indeed, err is always used uninitialized. According to an original review comment from Hyunchul, acknowledged by Richard, err should be initialized to -ENOENT to avoid the first call to tnc_next(). But we can achieve the same by reordering the code. Fixes: 781f675e2d7e ("ubifs: Fix unlink code wrt. double hash lookups") Reported-by: Hyunchul Lee <hyc.lee@gmail.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-01-17block: Fix __bio_integrity_endio() documentationBart Van Assche
Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio") Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17nvme-pci: clean up SMBSZ bit definitionsChristoph Hellwig
Define the bit positions instead of macros using the magic values, and move the expanded helpers to calculate the size and size unit into the implementation C file. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2018-01-17nvme-pci: clean up CMB initializationChristoph Hellwig
Refactor the call to nvme_map_cmb, and change the conditions for probing for the CMB. First remove the version check as NVMe TPs always apply to earlier versions of the spec as well. Second check for the whole CMBSZ register for support of the CMB feature instead of just the size field inside of it to simplify the code a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
2018-01-17nvme-fc: correct hang in nvme_ns_remove()James Smart
When connectivity is lost to a device, the association is terminated and the blk-mq queues are quiesced/stopped. When connectivity is re-established, they are resumed. If connectivity is lost for a sufficient amount of time that the controller is then deleted, the delete path starts tearing down queues, and eventually calling nvme_ns_remove(). It appears that pending commands may cause blk_cleanup_queue() to never complete and the teardown stalls. Correct by starting the ns queues after transitioning to a DELETING state, allowing pending commands to be flushed with io failures. Thus the delete path is clear when reached. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-17nvme-fc: fix rogue admin cmds stalling teardownJames Smart
When connectivity is lost to a device, the association is terminated and the blk-mq queues are quiesced/stopped. When connectivity is re-established, they are resumed. If an admin command is received while connectivity is list, the ioctl queues the command on the admin_q and the command stalls (the thread issuing the ioctl hangs/waits). if the connectivity is lost long enough such that the controller is then deleted, the delete code makes its calls to initiate the delete, which then expects the core layer to call the transport when all references are removed and the controller can be freed. Unfortunately, nothing in this path dequeued the admin command, so a reference sits outstanding and things stop, hanging the delete indefinitely. Correct by unquiescing the admin queue in the delete association. This means any admin command (which should only be from an ioctl) issued after connectivity is lost will detect the controller is in a reconnecting state and will (fast) fail the command. Thus, a pending reference can no longer be created. Once connectivity is re-established, a new ioctl/admin command would see proper device state and function again. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-17blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_requestMike Snitzer
After commit: 923218f6166a ("blk-mq: don't allocate driver tag upfront for flush rq") we no longer use the 'can_block' argument in blk_mq_sched_insert_request(). Kill it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Added actual commit message as to why it's being removed. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedbackMing Lei
blk_insert_cloned_request() is called in the fast path of a dm-rq driver (e.g. blk-mq request-based DM mpath). blk_insert_cloned_request() uses blk_mq_request_bypass_insert() to directly append the request to the blk-mq hctx->dispatch_list of the underlying queue. 1) This way isn't efficient enough because the hctx spinlock is always used. 2) With blk_insert_cloned_request(), we completely bypass underlying queue's elevator and depend on the upper-level dm-rq driver's elevator to schedule IO. But dm-rq currently can't get the underlying queue's dispatch feedback at all. Without knowing whether a request was issued or not (e.g. due to underlying queue being busy) the dm-rq elevator will not be able to provide effective IO merging (as a side-effect of dm-rq currently blindly destaging a request from its elevator only to requeue it after a delay, which kills any opportunity for merging). This obviously causes very bad sequential IO performance. Fix this by updating blk_insert_cloned_request() to use blk_mq_request_direct_issue(). blk_mq_request_direct_issue() allows a request to be issued directly to the underlying queue and returns the dispatch feedback (blk_status_t). If blk_mq_request_direct_issue() returns BLK_SYS_RESOURCE the dm-rq driver will now use DM_MAPIO_REQUEUE to _not_ destage the request. Whereby preserving the opportunity to merge IO. With this, request-based DM's blk-mq sequential IO performance is vastly improved (as much as 3X in mpath/virtio-scsi testing). Signed-off-by: Ming Lei <ming.lei@redhat.com> [blk-mq.c changes heavily influenced by Ming Lei's initial solution, but they were refactored to make them less fragile and easier to read/review] Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17blk-mq: factor out a few helpers from __blk_mq_try_issue_directlyMike Snitzer
No functional change. Just makes code flow more logically. In following commit, __blk_mq_try_issue_directly() will be used to return the dispatch result (blk_status_t) to DM. DM needs this information to improve IO merging. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17blk-mq: turn WARN_ON in __blk_mq_run_hw_queue into printkMing Lei
We know this WARN_ON is harmless and in reality it may be trigged, so convert it to printk() and dump_stack() to avoid to confusing people. Also add comment about two releated races here. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Stefan Haberland <sth@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "jianchao.wang" <jianchao.w.wang@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17blk-mq: make sure hctx->next_cpu is set correctlyMing Lei
When hctx->next_cpu is set from possible online CPUs, there is one race in which hctx->next_cpu may be set as >= nr_cpu_ids, and finally break workqueue. The race can be triggered in the following two sitations: 1) when one CPU is becoming DEAD, blk_mq_hctx_notify_dead() is called to dispatch requests from the DEAD cpu context, but at that time, this DEAD CPU has been cleared from 'cpu_online_mask', so all CPUs in hctx->cpumask may become offline, and cause hctx->next_cpu set a bad value. 2) blk_mq_delay_run_hw_queue() is called from CPU B, and found the queue should be run on the other CPU A, then CPU A may become offline at the same time and all CPUs in hctx->cpumask become offline. This patch deals with this issue by re-selecting next CPU, and making sure it is set correctly. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Stefan Haberland <sth@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: "jianchao.wang" <jianchao.w.wang@oracle.com> Tested-by: "jianchao.wang" <jianchao.w.wang@oracle.com> Fixes: 20e4d81393 ("blk-mq: simplify queue mapping & schedule with each possisble CPU") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17aoe: use ktime_t instead of timevalTina Ruchandani
'struct frame' uses two variables to store the sent timestamp - 'struct timeval' and jiffies. jiffies is used to avoid discrepancies caused by updates to system time. 'struct timeval' is deprecated because it uses 32-bit representation for seconds which will overflow in year 2038. This patch does the following: - Replace the use of 'struct timeval' and jiffies with ktime_t, which is the recommended type for timestamping - ktime_t provides both long range (like jiffies) and high resolution (like timeval). Using ktime_get (monotonic time) instead of wall-clock time prevents any discprepancies caused by updates to system time. [updates by Arnd below] The original patch from Tina never went anywhere as we discussed how to keep the impact on performance minimal. I've started over now but arrived at basically the same patch that she had originally, except for an slightly improved tsince_hr() function. I'm making it more robust against overflows, and also optimize explicitly for the common case in which a frame is less than 4.2 seconds old, using only a 32-bit division in that case. This should make the new version more efficient than the old code, since we replace the existing two 32-bit division in do_gettimeofday() plus one multiplication with a single single 32-bit division in tsince_hr() and drop the double bookkeeping. It's also more efficient than the ktime_get_us() API we discussed before, since that would also rely on multiple divisions. Link: https://lists.linaro.org/pipermail/y2038/2015-May/000276.html Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Cc: Ed Cashin <ed.cashin@acm.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-17drm/vmwgfx: fix memory corruption with legacy/sou connectorsRob Clark
It looks like in all cases 'struct vmw_connector_state' is used. But only in stdu connectors, was atomic_{duplicate,destroy}_state() properly subclassed. Leading to writes beyond the end of the allocated connector state block and all sorts of fun memory corruption related crashes. Fixes: d7721ca71126 "drm/vmwgfx: Connector atomic state" Cc: <stable@vger.kernel.org> Signed-off-by: Rob Clark <rclark@redhat.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2018-01-17i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATAJeremy Compostella
On a I2C_SMBUS_I2C_BLOCK_DATA read request, if data->block[0] is greater than I2C_SMBUS_BLOCK_MAX + 1, the underlying I2C driver writes data out of the msgbuf1 array boundary. It is possible from a user application to run into that issue by calling the I2C_SMBUS ioctl with data.block[0] greater than I2C_SMBUS_BLOCK_MAX + 1. This patch makes the code compliant with Documentation/i2c/dev-interface by raising an error when the requested size is larger than 32 bytes. Call Trace: [<ffffffff8139f695>] dump_stack+0x67/0x92 [<ffffffff811802a4>] panic+0xc5/0x1eb [<ffffffff810ecb5f>] ? vprintk_default+0x1f/0x30 [<ffffffff817456d3>] ? i2cdev_ioctl_smbus+0x303/0x320 [<ffffffff8109a68b>] __stack_chk_fail+0x1b/0x20 [<ffffffff817456d3>] i2cdev_ioctl_smbus+0x303/0x320 [<ffffffff81745aed>] i2cdev_ioctl+0x4d/0x1e0 [<ffffffff811f761a>] do_vfs_ioctl+0x2ba/0x490 [<ffffffff81336e43>] ? security_file_ioctl+0x43/0x60 [<ffffffff811f7869>] SyS_ioctl+0x79/0x90 [<ffffffff81a22e97>] entry_SYSCALL_64_fastpath+0x12/0x6a Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-01-17i2c: core: decrease reference count of device node in i2c_unregister_deviceLixin Wang
Reference count of device node was increased in of_i2c_register_device, but without decreasing it in i2c_unregister_device. Then the added device node will never be released. Fix this by adding the of_node_put. Signed-off-by: Lixin Wang <alan.1.wang@nokia-sbell.com> Tested-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2018-01-17dm crypt: fix error return code in crypt_ctr()Wei Yongjun
Fix to return error code -ENOMEM from the mempool_create_kmalloc_pool() error handling case instead of 0, as done elsewhere in this function. Fixes: ef43aa38063a6 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm crypt: wipe kernel key copy after IV initializationOndrej Kozina
Loading key via kernel keyring service erases the internal key copy immediately after we pass it in crypto layer. This is wrong because IV is initialized later and we use wrong key for the initialization (instead of real key there's just zeroed block). The bug may cause data corruption if key is loaded via kernel keyring service first and later same crypt device is reactivated using exactly same key in hexbyte representation, or vice versa. The bug (and fix) affects only ciphers using following IVs: essiv, lmk and tcw. Fixes: c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service") Cc: stable@vger.kernel.org # 4.10+ Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm integrity: don't store cipher request on the stackMikulas Patocka
Some asynchronous cipher implementations may use DMA. The stack may be mapped in the vmalloc area that doesn't support DMA. Therefore, the cipher request and initialization vector shouldn't be on the stack. Fix this by allocating the request and iv with kmalloc. Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm crypt: fix crash by adding missing check for auth key sizeMilan Broz
If dm-crypt uses authenticated mode with separate MAC, there are two concatenated part of the key structure - key(s) for encryption and authentication key. Add a missing check for authenticated key length. If this key length is smaller than actually provided key, dm-crypt now properly fails instead of crashing. Fixes: ef43aa3806 ("dm crypt: add cryptographic data integrity protection (authenticated encryption)") Cc: stable@vger.kernel.org # 4.12+ Reported-by: Salah Coronya <salahx@yahoo.com> Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm btree: fix serious bug in btree_split_beneath()Joe Thornber
When inserting a new key/value pair into a btree we walk down the spine of btree nodes performing the following 2 operations: i) space for a new entry ii) adjusting the first key entry if the new key is lower than any in the node. If the _root_ node is full, the function btree_split_beneath() allocates 2 new nodes, and redistibutes the root nodes entries between them. The root node is left with 2 entries corresponding to the 2 new nodes. btree_split_beneath() then adjusts the spine to point to one of the two new children. This means the first key is never adjusted if the new key was lower, ie. operation (ii) gets missed out. This can result in the new key being 'lost' for a period; until another low valued key is inserted that will uncover it. This is a serious bug, and quite hard to make trigger in normal use. A reproducing test case ("thin create devices-in-reverse-order") is available as part of the thin-provision-tools project: https://github.com/jthornber/thin-provisioning-tools/blob/master/functional-tests/device-mapper/dm-tests.scm#L593 Fix the issue by changing btree_split_beneath() so it no longer adjusts the spine. Instead it unlocks both the new nodes, and lets the main loop in btree_insert_raw() relock the appropriate one and make any neccessary adjustments. Cc: stable@vger.kernel.org Reported-by: Monty Pavel <monty_pavel@sina.com> Signed-off-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6Dennis Yang
For btree removal, there is a corner case that a single thread could takes 6 locks which is more than THIN_MAX_CONCURRENT_LOCKS(5) and leads to deadlock. A btree removal might eventually call rebalance_children()->rebalance3() to rebalance entries of three neighbor child nodes when shadow_spine has already acquired two write locks. In rebalance3(), it tries to shadow and acquire the write locks of all three child nodes. However, shadowing a child node requires acquiring a read lock of the original child node and a write lock of the new block. Although the read lock will be released after block shadowing, shadowing the third child node in rebalance3() could still take the sixth lock. (2 write locks for shadow_spine + 2 write locks for the first two child nodes's shadow + 1 write lock for the last child node's shadow + 1 read lock for the last child node) Cc: stable@vger.kernel.org Signed-off-by: Dennis Yang <dennisyang@qnap.com> Acked-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-01-17KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in ↵Tianyu Lan
kvm_valid_sregs() kvm_valid_sregs() should use X86_CR0_PG and X86_CR4_PAE to check bit status rather than X86_CR0_PG_BIT and X86_CR4_PAE_BIT. This patch is to fix it. Fixes: f29810335965a(KVM/x86: Check input paging mode when cs.l is set) Reported-by: Jeremi Piotrowski <jeremi.piotrowski@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-17Merge tag 'kvm-arm-fixes-for-v4.15-3-v2' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm KVM/ARM Fixes for v4.15, Round 3 (v2) Three more fixes for v4.15 fixing incorrect huge page mappings on systems using the contigious hint for hugetlbfs; supporting an alternative GICv4 init sequence; and correctly implementing the ARM SMCC for HVC and SMC handling.
2018-01-17powerpc/pseries: include linux/types.h in asm/hvcall.hMichal Suchanek
Commit 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") uses u64 in asm/hvcall.h without including linux/types.h This breaks hvcall.h users that do not include the header themselves. Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-17powerpc/64s: Allow control of RFI flush via debugfsMichael Ellerman
Expose the state of the RFI flush (enabled/disabled) via debugfs, and allow it to be enabled/disabled at runtime. eg: $ cat /sys/kernel/debug/powerpc/rfi_flush 1 $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush $ cat /sys/kernel/debug/powerpc/rfi_flush 0 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-01-17powerpc/64s: Wire up cpu_show_meltdown()Michael Ellerman
The recent commit 87590ce6e373 ("sysfs/cpu: Add vulnerability folder") added a generic folder and set of files for reporting information on CPU vulnerabilities. One of those was for meltdown: /sys/devices/system/cpu/vulnerabilities/meltdown This commit wires up that file for 64-bit Book3S powerpc. For now we default to "Vulnerable" unless the RFI flush is enabled. That may not actually be true on all hardware, further patches will refine the reporting based on the CPU/platform etc. But for now we default to being pessimists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-17cpufreq: scpi: remove arm_big_little dependencySudeep Holla
The dependency on physical_package_id from the topology to get the cluster identifier is wrong. The concept of cluster used in ARM topology is unfortunately not well defined in the architecture, we should avoid using it. Further the frequency domain need not be mapped to so called "clusters" one to one. SCPI already provides means to obtain the frequency domain id from the device tree. In order to support some new topologies(e.g. DSU which contains 2 frequency domains within the physical cluster), pseudo clusters are created to make this driver work which is wrong again. In order to solve those issues and also remove dependency of topological physical id for frequency domain, this patch removes the arm_big_little dependency from scpi driver. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17drivers: psci: remove cluster terminology and dependency on physical_package_idSudeep Holla
Since the definition of the term "cluster" is not well defined in the architecture, we should avoid using it. Also the physical package id is currently mapped to so called "clusters" in ARM/ARM64 platforms which is already argumentative. Currently PSCI checker uses the physical package id assuming that CPU power domains map to "clusters" and the physical package id in the code as it stands also maps to cluster boundaries. It does that trying to test "cluster" idle states to its best. However the CPU power domain often but not always maps directly to the processor topology. This patch removes the dependency on physical_package_id from the topology in this PSCI checker. Also it replaces all the occurences of clusters to cpu_groups which is derived from core_sibling_mask and may not directly map to physical "cluster". Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17powercap: intel_rapl: Fix trailing semicolonLuis de Bethencourt
The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17ACPI/PCI: pci_link: reduce verbosity when IRQ is enabledSinan Kaya
When ACPI Link object is enabled, the message is printed with a warning prefix. Some test tools are capturing warning and test error types as errors. Let's reduce the verbosity of success case. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17dmaengine: rcar-dmac: Make DMAC reinit during system resume explicitGeert Uytterhoeven
The current (empty) system sleep callbacks rely on the PM core to force a runtime resume to reinitialize the DMAC registers during system resume. Without a reinitialization, e.g. SCIF DMA will hang silently after a system resume on R-Car Gen3. Make this explicit by using pm_runtime_force_{suspend,resume}() as the system sleep callbacks instead. Use SET_LATE_SYSTEM_SLEEP_PM_OPS() as DMA engines must be initialized before all DMA slave devices. Fixes: 17218e0092f8 "PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume()" Suggested-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-17PM / runtime: Allow no callbacks in pm_runtime_force_suspend|resume()Ulf Hansson
The pm_runtime_force_suspend|resume() helpers currently requires the device to at some level (PM domain, bus, etc), have the ->runtime_suspend|resume() callbacks assigned for it, else -ENOSYS is returned as an error. However, there are no reason for this requirement, so let's simply remove it by allowing these callbacks to be NULL. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>