summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-01-12arm64: assembler: make adr_l work in modules under KASLRArd Biesheuvel
When CONFIG_RANDOMIZE_MODULE_REGION_FULL=y, the offset between loaded modules and the core kernel may exceed 4 GB, putting symbols exported by the core kernel out of the reach of the ordinary adrp/add instruction pairs used to generate relative symbol references. So make the adr_l macro emit a movz/movk sequence instead when executing in module context. While at it, remove the pointless special case for the stack pointer. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-12nfs: Don't take a reference on fl->fl_file for LOCK operationBenjamin Coddington
I have reports of a crash that look like __fput() was called twice for a NFSv4.0 file. It seems possible that the state manager could try to reclaim a lock and take a reference on the fl->fl_file at the same time the file is being released if, during the close(), a signal interrupts the wait for outstanding IO while removing locks which then skips the removal of that lock. Since 83bfff23e9ed ("nfs4: have do_vfs_lock take an inode pointer") has removed the need to traverse fl->fl_file->f_inode in nfs4_lock_done(), taking that reference is no longer necessary. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-12spi: davinci: use dma_mapping_error()Kevin Hilman
The correct error checking for dma_map_single() is to use dma_mapping_error(). Signed-off-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-01-12Merge tag 'usb-serial-4.10-rc4' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for v4.10-rc4 These fixes address a number of issues in the ch341 driver and includes a partial revert of a change in how we set the line settings that went into 4.10-rc1 but which turned out to have undesired side effects. This included deasserting the modem-control lines when configuring the device, but also prevented a certain class of CH340 devices from working with the driver. Included are also two fixes for two minor information leaks in kl5kusb105 and ch341 due to failures to detect short control transfers. Signed-off-by: Johan Hovold <johan@kernel.org>
2017-01-12ravb: Remove Rx overflow log messagesKazuya Mizuguchi
Remove Rx overflow log messages as in an environment where logging results in network traffic logging may cause further overflows. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> [simon: reworked changelog] Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12block: Rename blk_queue_zone_size and bdev_zone_sizeDamien Le Moal
All block device data fields and functions returning a number of 512B sectors are by convention named xxx_sectors while names in the form xxx_size are generally used for a number of bytes. The blk_queue_zone_size and bdev_zone_size functions were not following this convention so rename them. No functional change is introduced by this patch. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Collapsed the two patches, they were nonsensically split and broke bisection. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-12Merge branch 'mlxsw-fixes'David S. Miller
Jiri Pirko says: ==================== mlxsw: Couple of fixes Couple of simple fixes from Arkadi and Elad. Please queue these up for stable. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12mlxsw: pci: Fix EQE structure definitionElad Raz
The event_data starts from address 0x00-0x0C and not from 0x08-0x014. This leads to duplication with other fields in the Event Queue Element such as sub-type, cqn and owner. Fixes: eda6500a987a0 ("mlxsw: Add PCI bus implementation") Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12mlxsw: switchx2: Fix memory leak at skb reallocationArkadi Sharshevsky
During transmission the skb is checked for headroom in order to add vendor specific header. In case the skb needs to be re-allocated, skb_realloc_headroom() is called to make a private copy of the original, but doesn't release it. Current code assumes that the original skb is released during reallocation and only releases it at the error path which causes a memory leak. Fix this by adding the original skb release to the main path. Fixes: d003462a50de ("mlxsw: Simplify mlxsw_sx_port_xmit function") Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12mlxsw: spectrum: Fix memory leak at skb reallocationArkadi Sharshevsky
During transmission the skb is checked for headroom in order to add vendor specific header. In case the skb needs to be re-allocated, skb_realloc_headroom() is called to make a private copy of the original, but doesn't release it. Current code assumes that the original skb is released during reallocation and only releases it at the error path which causes a memory leak. Fix this by adding the original skb release to the main path. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-12KVM: x86: fix emulation of "MOV SS, null selector"Paolo Bonzini
This is CVE-2017-2583. On Intel this causes a failed vmentry because SS's type is neither 3 nor 7 (even though the manual says this check is only done for usable SS, and the dmesg splat says that SS is unusable!). On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb. The fix fabricates a data segment descriptor when SS is set to a null selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb. Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3; this in turn ensures CPL < 3 because RPL must be equal to CPL. Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing the bug and deciphering the manuals. Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com> Fixes: 79d5b4c3cd809c770d4bf9812635647016c56011 Cc: stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12capability: export has_capabilityJike Song
has_capability() is sometimes needed by modules to test capability for specified task other than current, so export it. Cc: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Jike Song <jike.song@intel.com> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12KVM: x86: fix NULL deref in vcpu_scan_ioapicWanpeng Li
Reported by syzkaller: BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0 IP: _raw_spin_lock+0xc/0x30 PGD 3e28eb067 PUD 3f0ac6067 PMD 0 Oops: 0002 [#1] SMP CPU: 0 PID: 2431 Comm: test Tainted: G OE 4.10.0-rc1+ #3 Call Trace: ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm] kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm] ? pick_next_task_fair+0xe1/0x4e0 ? kvm_arch_vcpu_load+0xea/0x260 [kvm] kvm_vcpu_ioctl+0x33a/0x600 [kvm] ? hrtimer_try_to_cancel+0x29/0x130 ? do_nanosleep+0x97/0xf0 do_vfs_ioctl+0xa1/0x5d0 ? __hrtimer_init+0x90/0x90 ? do_nanosleep+0x5b/0xf0 SyS_ioctl+0x79/0x90 do_syscall_64+0x6e/0x180 entry_SYSCALL64_slow_path+0x25/0x25 RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0 The syzkaller folks reported a NULL pointer dereference due to ENABLE_CAP succeeding even without an irqchip. The Hyper-V synthetic interrupt controller is activated, resulting in a wrong request to rescan the ioapic and a NULL pointer dereference. #include <sys/ioctl.h> #include <sys/mman.h> #include <sys/types.h> #include <linux/kvm.h> #include <pthread.h> #include <stddef.h> #include <stdint.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #ifndef KVM_CAP_HYPERV_SYNIC #define KVM_CAP_HYPERV_SYNIC 123 #endif void* thr(void* arg) { struct kvm_enable_cap cap; cap.flags = 0; cap.cap = KVM_CAP_HYPERV_SYNIC; ioctl((long)arg, KVM_ENABLE_CAP, &cap); return 0; } int main() { void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); int kvmfd = open("/dev/kvm", 0); int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0); struct kvm_userspace_memory_region memreg; memreg.slot = 0; memreg.flags = 0; memreg.guest_phys_addr = 0; memreg.memory_size = 0x1000; memreg.userspace_addr = (unsigned long)host_mem; host_mem[0] = 0xf4; ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg); int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0); struct kvm_sregs sregs; ioctl(cpufd, KVM_GET_SREGS, &sregs); sregs.cr0 = 0; sregs.cr4 = 0; sregs.efer = 0; sregs.cs.selector = 0; sregs.cs.base = 0; ioctl(cpufd, KVM_SET_SREGS, &sregs); struct kvm_regs regs = { .rflags = 2 }; ioctl(cpufd, KVM_SET_REGS, &regs); ioctl(vmfd, KVM_CREATE_IRQCHIP, 0); pthread_t th; pthread_create(&th, 0, thr, (void*)(long)cpufd); usleep(rand() % 10000); ioctl(cpufd, KVM_RUN, 0); pthread_join(th, 0); return 0; } This patch fixes it by failing ENABLE_CAP if without an irqchip. Reported-by: Dmitry Vyukov <dvyukov@google.com> Fixes: 5c919412fe61 (kvm/x86: Hyper-V synthetic interrupt controller) Cc: stable@vger.kernel.org # 4.5+ Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: eventfd: fix NULL deref irqbypass consumerWanpeng Li
Reported syzkaller: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] PGD 0 Oops: 0002 [#1] SMP CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1 Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm] task: ffff9bbe0dfbb900 task.stack: ffffb61802014000 RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] Call Trace: irqfd_shutdown+0x66/0xa0 [kvm] process_one_work+0x16b/0x480 worker_thread+0x4b/0x500 kthread+0x101/0x140 ? process_one_work+0x480/0x480 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x25/0x30 RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20 CR2: 0000000000000008 The syzkaller folks reported a NULL pointer dereference that due to unregister an consumer which fails registration before. The syzkaller creates two VMs w/ an equal eventfd occasionally. So the second VM fails to register an irqbypass consumer. It will make irqfd as inactive and queue an workqueue work to shutdown irqfd and unregister the irqbypass consumer when eventfd is closed. However, the second consumer has been initialized though it fails registration. So the token(same as the first VM's) is taken to unregister the consumer through the workqueue, the consumer of the first VM is found and unregistered, then NULL deref incurred in the path of deleting consumer from the consumers list. This patch fixes it by making irq_bypass_register/unregister_consumer() looks for the consumer entry based on consumer pointer itself instead of token matching. Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: Introduce segmented_write_stdSteve Rutherford
Introduces segemented_write_std. Switches from emulated reads/writes to standard read/writes in fxsave, fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding kernel memory leak. Since commit 283c95d0e389 ("KVM: x86: emulate FXSAVE and FXRSTOR", 2016-11-09), which is luckily not yet in any final release, this would also be an exploitable kernel memory *write*! Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: 96051572c819194c37a8367624b285be10297eca Fixes: 283c95d0e3891b64087706b344a4b545d04a6e62 Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12KVM: x86: flush pending lapic jump label updates on module unloadDavid Matlack
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled). These are implemented with delayed_work structs which can still be pending when the KVM module is unloaded. We've seen this cause kernel panics when the kvm_intel module is quickly reloaded. Use the new static_key_deferred_flush() API to flush pending updates on module unload. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12jump_labels: API for flushing deferred jump label updatesDavid Matlack
Modules that use static_key_deferred need a way to synchronize with any delayed work that is still pending when the module is unloaded. Introduce static_key_deferred_flush() which flushes any pending jump label updates. Signed-off-by: David Matlack <dmatlack@google.com> Cc: stable@vger.kernel.org Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12ARM: ux500: fix prcmu_is_cpu_in_wfi() calculationArnd Bergmann
This function clearly never worked and always returns true, as pointed out by gcc-7: arch/arm/mach-ux500/pm.c: In function 'prcmu_is_cpu_in_wfi': arch/arm/mach-ux500/pm.c:137:212: error: ?: using integer constants in boolean context, the expression will always evaluate to 'true' [-Werror=int-in-bool-context] With the added braces, the condition actually makes sense. Fixes: 34fe6f107eab ("mfd : Check if the other db8500 core is in WFI") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-01-12mmc: mxs-mmc: Fix additional cycles after transmission stopStefan Wahren
According to the code the intention is to append 8 SCK cycles instead of 4 at end of a MMC_STOP_TRANSMISSION command. But this will never happened because it's an AC command not an ADTC command. So fix this by moving the statement into the right function. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28) Cc: <stable@vger.kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-01-12mmc: sdhci-acpi: Only powered up enabled acpi child devicesHans de Goede
Commit e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing") introduced code to powerup any acpi child nodes listed in the dstd. But some dstd-s list all possible devices used on some board variants, while reporting if the device is actually present and enabled in the status field of the device. So we end up calling the acpi _PS0 (power-on) method for devices which are not actually present. This does not always end well, e.g. on my cube iwork8 air tablet, this results in freezing the entire tablet as soon as the r8723bs module is loaded. This commit fixes this by checking the child device's status.present and status.enabled bits and only call acpi_device_fix_up_power() if both are set. Fixes: e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing") BugLink: https://github.com/hadess/rtl8723bs/issues/80 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-01-12x86/entry: Fix the end of the stack for newly forked tasksJosh Poimboeuf
When unwinding a task, the end of the stack is always at the same offset right below the saved pt_regs, regardless of which syscall was used to enter the kernel. That convention allows the unwinder to verify that a stack is sane. However, newly forked tasks don't always follow that convention, as reported by the following unwinder warning seen by Dave Jones: WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value (null) The warning was due to the following call chain: (ftrace handler) call_usermodehelper_exec_async+0x5/0x140 ret_from_fork+0x22/0x30 The problem is that ret_from_fork() doesn't create a stack frame before calling other functions. Fix that by carefully using the frame pointer macros. In addition to conforming to the end of stack convention, this also makes related stack traces more sensible by making it clear to the user that ret_from_fork() was involved. Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/unwind: Include __schedule() in stack tracesJosh Poimboeuf
In the following commit: 0100301bfdf5 ("sched/x86: Rewrite the switch_to() code") ... the layout of the 'inactive_task_frame' struct was designed to have a frame pointer header embedded in it, so that the unwinder could use the 'bp' and 'ret_addr' fields to report __schedule() on the stack (or ret_from_fork() for newly forked tasks which haven't actually run yet). Finish the job by changing get_frame_pointer() to return a pointer to inactive_task_frame's 'bp' field rather than 'bp' itself. This allows the unwinder to start one frame higher on the stack, so that it properly reports __schedule(). Reported-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/598e9f7505ed0aba86e8b9590aa528c6c7ae8dcd.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/unwind: Disable KASAN checks for non-current tasksJosh Poimboeuf
There are a handful of callers to save_stack_trace_tsk() and show_stack() which try to unwind the stack of a task other than current. In such cases, it's remotely possible that the task is running on one CPU while the unwinder is reading its stack from another CPU, causing the unwinder to see stack corruption. These cases seem to be mostly harmless. The unwinder has checks which prevent it from following bad pointers beyond the bounds of the stack. So it's not really a bug as long as the caller understands that unwinding another task will not always succeed. In such cases, it's possible that the unwinder may read a KASAN-poisoned region of the stack. Account for that by using READ_ONCE_NOCHECK() when reading the stack of another task. Use READ_ONCE() when reading the stack of the current task, since KASAN warnings can still be useful for finding bugs in that case. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12x86/unwind: Silence warnings for non-current tasksJosh Poimboeuf
There are a handful of callers to save_stack_trace_tsk() and show_stack() which try to unwind the stack of a task other than current. In such cases, it's remotely possible that the task is running on one CPU while the unwinder is reading its stack from another CPU, causing the unwinder to see stack corruption. These cases seem to be mostly harmless. The unwinder has checks which prevent it from following bad pointers beyond the bounds of the stack. So it's not really a bug as long as the caller understands that unwinding another task will not always succeed. Since stack "corruption" on another task's stack isn't necessarily a bug, silence the warnings when unwinding tasks other than current. Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12Merge tag 'perf-core-for-mingo-4.11-20170111' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Add more triggers to switch the output file (perf.data.TIMESTAMP). Now, in addition to switching to a different output file when receiving a SIGUSR2, one can also specify file size and time based triggers: perf record -a --switch-output=signal is equivalent to what we had before: perf record -a --switch-output While we can also ask for the file to be "sliced" by size, taking into account that that will happen only when we get woken up by the kernel, i.e. one has to take into account the --mmap-pages (the size of the perf mmap ring buffer): perf record -a --switch-output=2G will break the perf.data output into multiple files limited to 2GB of samples, right when generating the output. For time based samples, alert() will be used, so to have 1 minute limited perf.data output files: perf record -a --switch-output=1m (Jiri Olsa) - Remove the need to use -e only for syscalls and --event only for tracepoints/HW/SW/etc events, i.e. now one can use: perf trace -e nanosleep,futex,sched:sched_switch ./workload or: perf trace --event nanosleep,futex,sched:sched_switch ./workload And have it tracing raw_syscalls:sys_{enter,exit} for the nanosleep and futex syscalls, formatting those as strace does while also tracing sched:sched_switch, ordering it all into one strace like output. Using '!' as the first character in the -e/--event argument remains a way to negate the list of syscalls, i.e. all syscalls except for the ones specified, doesn't affect the other kinds of events. E.g: [root@jouet ~] # perf trace -e sched:sched_switch,nanosleep usleep 1 0.000 ( 0.028 ms): usleep/28150 nanosleep(rqtp: 0x7ffe4201b9f0) ... 0.028 ( ): sched:sched_switch:usleep:28150 [120] S ==> swapper/0:0 [120]) 0.000 ( 0.065 ms): usleep/28150 ... [continued]: nanosleep()) = 0 [root@jouet ~]# (Arnaldo Carvalho de Melo) - 'perf kallsyms' toy tool to look for extended symbol information on the running kernel and demonstrate the machine/thread/symbol APIs for use in other tools, such as 'perf probe' (Arnaldo Carvalho de Melo) Infrastructure improvements: - Add missing linux/kernel.h include to subcmd.h (Arnaldo Carvalho de Melo) tools: Sync x86's vmx.h with the kernel - Create libdir directory before installing libperf-jvmti.so (Laura Abbott) - Fix typo in perf_evlist__start_workload() (Soramichi Akiyama) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12drm/i915: Fix phys pwrite for struct_mutex-less operationChris Wilson
Since commit fe115628d567 ("drm/i915: Implement pwrite without struct-mutex") the lowlevel pwrite calls are now called without the protection of struct_mutex, but pwrite_phys was still asserting that it held the struct_mutex and later tried to drop and relock it. Fixes: fe115628d567 ("drm/i915: Implement pwrite without struct-mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <drm-intel-fixes@lists.freedesktop.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (cherry picked from commit 10466d2a59b23aa6d5ecd5310296c8cdb6458dac) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-01-12drm/i915: Clear ret before unbinding in i915_gem_evict_something()Chris Wilson
Missed when rebasing patches, I failed to set ret to zero before starting the unbind loop (which depends upon ret being zero). Reported-by: Matthew Auld <matthew.william.auld@gmail.com> Fixes: 9332f3b1b99a ("drm/i915: Combine loops within i915_gem_evict_something") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170105155940.10033-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Cc: <stable@vger.kernel.org> # v4.9+ (cherry picked from commit 121dfbb2a2ef1c5f49e15c38ccc47ff0beb59446) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-01-12usb: gadget: udc: atmel: remove memory leakAlexandre Belloni
Commit bbe097f092b0 ("usb: gadget: udc: atmel: fix endpoint name") introduced a memory leak when unbinding the driver. The endpoint names would not be freed. Solve that by including the name as a string in struct usba_ep so it is freed when the endpoint is. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: dwc3: exynos fix axius clock error path to do cleanupShuah Khan
Axius clock error path returns without disabling clock and suspend clock. Fix it to disable them before returning error. Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: dwc2: Avoid suspending if we're in gadget modeJohn Stultz
I've found when booting HiKey with the usb gadget cable attached if I then try to connect via adb, I get an infinite spew of: dwc2 f72c0000.usb: dwc2_hsotg_ep_sethalt(ep ffffffc0790ecb18 ep1out, 0) dwc2 f72c0000.usb: dwc2_hsotg_ep_sethalt(ep ffffffc0790eca18 ep1in, 0) It seems that the usb autosuspend is suspending the bus shortly after bootup when the gadget cable is attached. So when adbd then tries to use the device, it doesn't work and it then tries to restart it over and over via the ep_sethalt calls (via FUNCTIONFS_CLEAR_HALT ioctl). Chen Yu suggested this patch to avoid suspending if we're in device mode, and it avoids the problem. Cc: Wei Xu <xuwei5@hisilicon.com> Cc: Guodong Xu <guodong.xu@linaro.org> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: John Youn <johnyoun@synopsys.com> Cc: Douglas Anderson <dianders@chromium.org> Cc: Chen Yu <chenyu56@huawei.com> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Suggested-by: Chen Yu <chenyu56@huawei.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: dwc2: use u32 for DT binding parametersLeo Yan
Commit 05ee799f2021 ("usb: dwc2: Move gadget settings into core_params") changes to type u16 for DT binding "g-rx-fifo-size" and "g-np-tx-fifo-size" but use type u32 for "g-tx-fifo-size". Finally the the first two parameters cannot be passed successfully with wrong data format. This is found the data transferring broken on 96boards Hikey. This patch is to change all parameters to u32 type, and verified on Hikey board the DT parameters can pass successfully. [johnyoun: minor rebase] Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: John Youn <johnyoun@synopsys.com> Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: gadget: f_fs: Fix iterations on endpoints.Vincent Pelletier
When zero endpoints are declared for a function, there is no endpoint to disable, enable or free, so replace do...while loops with while loops. Change pre-decrement to post-decrement to iterate the same number of times when there are endpoints to process. Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: dwc2: gadget: Fix DMA memory freeingVardan Mikayelyan
Remove DMA memory free from EP disable flow by replacing dma_alloc_coherent with dmam_alloc_coherent. Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Vardan Mikayelyan <mvardan@synopsys.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-12usb: gadget: composite: Fix function used to free memoryChristophe JAILLET
'cdev->os_desc_req' has been allocated with 'usb_ep_alloc_request()' so 'usb_ep_free_request()' should be used to free it. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-11scsi: lpfc: avoid double free of resource identifiersRoberto Sassu
Set variables initialized in lpfc_sli4_alloc_resource_identifiers() to NULL if an error occurred. Otherwise, lpfc_sli4_driver_resource_unset() attempts to free the memory again. Signed-off-by: Roberto Sassu <rsassu@suse.de> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-11scsi: qla2xxx: remove irq_affinity_notifierChristoph Hellwig
Now that qla2xxx uses the IRQ layer affinity assignment, affinity won't change over the life time of a device and the notifiers are useless. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-11scsi: qla2xxx: fix MSI-X vector affinityChristoph Hellwig
The first two or three vectors in qla2xxx adapter are global and not associated with a specific queue. They should not have IRQ affinity assigned. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-01-11netvsc: add rcu_read locking to netvsc callbackstephen hemminger
The receive callback (in tasklet context) is using RCU to get reference to associated VF network device but this is not safe. RCU read lock needs to be held. Found by running with full lockdep debugging enabled. Fixes: f207c10d9823 ("hv_netvsc: use RCU to protect vf_netdev") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11vxlan: Set ports in flow key when doing route lookupsMartynas Pumputis
Otherwise, a xfrm policy with sport/dport being set cannot be matched. Signed-off-by: Martynas Pumputis <martynas@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11r8152: fix the sw rx checksum is unavailablehayeswang
Fix the hw rx checksum is always enabled, and the user couldn't switch it to sw rx checksum. Note that the RTL_VER_01 only support sw rx checksum only. Besides, the hw rx checksum for RTL_VER_02 is disabled after commit b9a321b48af4 ("r8152: Fix broken RX checksums."). Re-enable it. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11HID: i2c-hid: Add sleep between POWER ON and RESETBrendan McGrath
Support for the Asus Touchpad was recently added. It turns out this device can fail initialisation (and become unusable) when the RESET command is sent too soon after the POWER ON command. Unfortunately the i2c-hid specification does not specify the need for a delay between these two commands. But it was discovered the Windows driver has a 1ms delay. As a result, this patch modifies the i2c-hid module to add a sleep inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms. See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further details. Signed-off-by: Brendan McGrath <redmcg@redmandi.dyndns.org> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-11tools: Sync x86's vmx.h with the kernelArnaldo Carvalho de Melo
To pick the changes from: 1b07304c587d ("KVM: nVMX: support descriptor table exits") That adds entries to VMX_EXIT_REASONS, that is used by tools/perf/arch/x86/util/kvm-stat.c. This also picks the changes in: 1dc35dacc16b ("KVM: nVMX: check host CR3 on vmentry and vmexit") But these are not used in 'perf kvm stat', do it just to silence the kernel/tools file cache coherency detector: $ make -C tools/perf make: Entering directory '/home/acme/git/linux/tools/perf' BUILD: Doing 'make -j4' parallel build Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ladi Prosek <lprosek@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-56uowkk8t5zje49a42asffcy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Add switch-output time option argumentJiri Olsa
It's now possible to specify the threshold time for perf.data like: $ perf record --switch-output=30s ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=30s ... [ perf record: dump data: Woken up 44 times ] [ perf record: Dump perf.data.2017010213043746 ] ... The time is expected to be a number with appended unit character - s/m/h/d. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Add switch-output size warningJiri Olsa
Adding switch-output size warning if the requested size of lower than the wakeup ring buffer size. $ perf record --switch-output=1K ls WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Add switch-output size option argumentJiri Olsa
It's now possible to specify the threshold size for perf.data like: $ perf record --switch-output=2G ... Once it's reached, the current data are dumped in to the perf.data.<timestamp> file and session does on. $ perf record --switch-output=2G ... [ perf record: dump data: Woken up 7244 times ] [ perf record: Dump perf.data.2017010214093746 ] ... The size is expected to be a number with appended unit character - B/K/M/G. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Change switch-output option to take optional argumentJiri Olsa
Next patches will add --switch-output option arguments, changing the option to allow that and adding its default value to 'signal'. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Add struct switch_outputJiri Olsa
Next patches will add more --switch-output option arguments, so preparing the data holder. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Wang Nan <wangnan0@huawei.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1483955520-29063-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf tools: Add unit_number__scnprintf functionJiri Olsa
Add unit_number__scnprintf function to display size units and use it in -m option info message. Before: $ perf record -m 10M ls rounding mmap pages size to 16777216 bytes (4096 pages) ... After: $ perf record -m 10M ls rounding mmap pages size to 16M (4096 pages) ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1483955520-29063-2-git-send-email-jolsa@kernel.org [ Rename it to unit_number__scnprintf for consistency ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf evlist: Fix typo in perf_evlist__start_workload()Soramichi Akiyama
This patch fixes a typo: s/enable to/unable to/ Signed-off-by: Soramichi AKIYAMA <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: bcf3145fbeb1 ("perf evlist: Enhance perf_evlist__start_workload()") Link: http://lkml.kernel.org/r/20170110200006.e1f7a766b4faf1f107ae2e1b@m.soramichi.jp [ Wasn't applying, fixed it up by hand, added Fixes: tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf trace: Allow specifying list of syscalls and events in -e/--expr/--eventArnaldo Carvalho de Melo
Makes it easier to specify both events and syscalls (to be formatter strace-like), i.e. previously one would have to do: # perf trace -e nanosleep --event sched:sched_switch usleep 1 Now it is possible to do: # perf trace -e nanosleep,sched:sched_switch usleep 1 0.000 ( 0.021 ms): usleep/17962 nanosleep(rqtp: 0x7ffdedd61ec0) ... 0.021 ( ): sched:sched_switch:usleep:17962 [120] S ==> swapper/1:0 [120]) 0.000 ( 0.066 ms): usleep/17962 ... [continued]: nanosleep()) = 0 # The old style --expr and using both -e and --event continues to work. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ieg6bakub4657l9e6afn85r4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>