summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-04-27libnvdimm: fix phys_addr for nvdimm_clear_poisonToshi Kani
nvdimm_clear_poison() expects a physical address, not an offset. Fix nsio_rw_bytes() to call nvdimm_clear_poison() with a physical address. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-04-27net: vrf: Do not allow looback to be moved to a VRFDavid Ahern
Moving the loopback into a VRF breaks networking for the default VRF. Since the VRF device is the loopback for VRF domains, there is no reason to move the loopback. Given the repercussions, block attempts to set lo into a VRF. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27Merge tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "Thanks to Ari Kauppi and Tuomas Haanpää at Synopsis for spotting bugs in our NFSv2/v3 xdr code that could crash the server or leak memory" * tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux: nfsd: stricter decoding of write-like NFSv2/v3 ops nfsd4: minor NFSv2/v3 write decoding cleanup nfsd: check for oversized NFSv2/v3 arguments
2017-04-27fib_rules: fix error return codeWei Yongjun
Fix to return error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27bridge: add per-port broadcast flood flagMike Manning
Support for l2 multicast flood control was added in commit b6cb5ac8331b ("net: bridge: add per-port multicast flood flag"). It allows broadcast as it was introduced specifically for unknown multicast flood control. But as broadcast is a special case of multicast, this may also need to be disabled. For this purpose, introduce a flag to disable the flooding of received l2 broadcasts. This approach is backwards compatible and provides flexibility in filtering for the desired packet types. Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Mike Manning <mmanning@brocade.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27net: fib: Decrease one unnecessary rt cache flush in fib_disable_ipGao Feng
The func fib_flush already flushes the rt cache if necessary, so it is not necessary to invoke rt_cache_flush again in fib_disable_ip. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27l2tp: remove useless device duplication test in l2tp_eth_create()Guillaume Nault
There's no need to verify that cfg->ifname is unique at this point. register_netdev() will return -EEXIST if asked to create a device with a name that's alrealy in use. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27net: remove unnecessary carrier status checkZhang Shengju
Since netif_carrier_on() will do nothing if device's carrier is already on, so it's unnecessary to do carrier status check. It's the same for netif_carrier_off(). Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27net: update comment for netif_dormant() functionZhang Shengju
This patch updates the comment for netif_dormant() function to reflect the intended usage. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27arm64: sunxi: always enable reset controllerArnd Bergmann
The sunxi clk driver causes a link error when the reset controller subsystem is disabled: drivers/clk/built-in.o: In function `sun4i_ve_clk_setup': :(.init.text+0xd040): undefined reference to `reset_controller_register' drivers/clk/built-in.o: In function `sun4i_a10_display_init': :(.init.text+0xe5e0): undefined reference to `reset_controller_register' drivers/clk/built-in.o: In function `sunxi_usb_clk_setup': :(.init.text+0x10074): undefined reference to `reset_controller_register' We already force it to be enabled on arm32 and some other arm64 platforms, but not on arm64/sunxi. This adds the respective Kconfig statements to also select it here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
2017-04-27Merge branch 'for_4.12/soc-pmdomain-p2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into next/drivers Pull "Keystone mangled URLs fixed" from Santosh Shilimkar * 'for_4.12/soc-pmdomain-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: soc: pm-domain: Fix the mangled urls
2017-04-27Merge tag 'samsung-dt-4.12-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into next/dt Pull "Fix DTC warnings in Exynos ARMv7 Device Tree sources." from Krzysztof Kozłowski: * tag 'samsung-dt-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: dts: exynos: Use - instead of @ for DT OPP entries
2017-04-27Merge tag 'vexpress-dt-4.12' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into next/dt Pull "ARMv7 VExpress DT fix for v4.12" from Sudeep Holla: A single fix to remove device tree warnings introduced with recently added checks in DTC * tag 'vexpress-dt-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: ARM: dts: vexpress: fix few unit address format warnings
2017-04-27Merge tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A fix for a kernel stack overflow bug in ceph setattr code, marked for stable" * tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client: ceph: fix recursion between ceph_set_acl() and __ceph_setattr()
2017-04-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - fix orangefs handling of faults on write() - I'd missed that one back when orangefs was going through review. - readdir counterpart of "9p: cope with bogus responses from server in p9_client_{read,write}" - server might be lying or broken, and we'd better not overrun the kmalloc'ed buffer we are copying the results into. - NFS O_DIRECT read/write can leave iov_iter advanced by too much; that's what had been causing iov_iter_pipe() warnings davej had been seeing. - statx_timestamp.tv_nsec type fix (s32 -> u32). That one really should go in before 4.11. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: uapi: change the type of struct statx_timestamp.tv_nsec to unsigned fix nfs O_DIRECT advancing iov_iter too much p9_client_readdir() fix orangefs_bufmap_copy_from_iovec(): fix EFAULT handling
2017-04-27dm ioctl: prevent stack leak in dm ioctl callAdrian Salido
When calling a dm ioctl that doesn't process any data (IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct dm_ioctl are left initialized. Current code is incorrectly extending the size of data copied back to user, causing the contents of kernel stack to be leaked to user. Fix by only copying contents before data and allow the functions processing the ioctl to override. Cc: stable@vger.kernel.org Signed-off-by: Adrian Salido <salidoa@google.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27xfs: Allow user to kill fstrim processLukas Czerner
fstrim can take really long time on big, slow device or on file system with a lots of allocation groups. Currently there is no way for the user to cancell the operation. This patch makes it possible for the user to kill fstrim pocess by adding the check for fatal_signal_pending() in xfs_trim_extents(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-04-27statx: correct error handling of NULL pathnameMichael Kerrisk (man-pages)
The change in commit 1e2f82d1e9d1 ("statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH") to error on a NULL pathname to statx() is inconsistent. It results in the error EINVAL for a NULL pathname. Other system calls with similar APIs (fchownat(), fstatat(), linkat()), return EFAULT. The solution is simply to remove the EINVAL check. As I already pointed out in [1], user_path_at*() and filename_lookup() will handle the NULL pathname as per the other APIs, to correctly produce the error EFAULT. [1] https://lkml.org/lkml/2017/4/26/561 Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-27Merge branch 'nvme-4.12' of git://git.infradead.org/nvme into ↵Jens Axboe
for-4.12/post-merge Christoph writes: "A couple more updates for 4.12. The biggest pile is fc and lpfc updates from James, but there are various small fixes and cleanups as well." Fixes up a few merge issues, and also a warning in lpfc_nvmet_rcv_unsol_abort() if CONFIG_NVME_TARGET_FC isn't enabled. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-27samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnelDavid Ahern
Add option to xdp1 and xdp_tx_iptunnel to insert xdp program in SKB_MODE: - update set_link_xdp_fd to take a flags argument that is added to the RTM_SETLINK message - Add -S option to xdp1 and xdp_tx_iptunnel user code. When passed in XDP_FLAGS_SKB_MODE is set in the flags arg passed to set_link_xdp_fd Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27ixgbe: Use pcie_flr() instead of duplicating itChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-27dm integrity: use previously calculated log2 of sectors_per_blockMikulas Patocka
The log2 of sectors_per_block was already calculated, so we don't have to use the ilog2 function. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27dm integrity: use hex2bin instead of open-coded variantMikulas Patocka
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27dm crypt: replace custom implementation of hex2bin()Andy Shevchenko
There is no need to have a duplication of the generic library, i.e. hex2bin(). Replace the open coded variant. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-04-27rhashtable: Cap total number of entries to 2^31Herbert Xu
When max_size is not set or if it set to a sufficiently large value, the nelems counter can overflow. This would cause havoc with the automatic shrinking as it would then attempt to fit a huge number of entries into a tiny hash table. This patch fixes this by adding max_elems to struct rhashtable to cap the number of elements. This is set to 2^31 as nelems is not a precise count. This is sufficiently smaller than UINT_MAX that it should be safe. When max_size is set max_elems will be lowered to at most twice max_size as is the status quo. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27tcp: tcp_rack_reo_timeout() must update tp->tcp_mstampEric Dumazet
I wrongly assumed tp->tcp_mstamp was up to date at the time tcp_rack_reo_timeout() was called. It is not true, since we only update tcp->tcp_mstamp when receiving a packet (as initially done in commit 69e996c58a35 ("tcp: add tp->tcp_mstamp field") tcp_rack_reo_timeout() being called by a timer and not an incoming packet, we need to refresh tp->tcp_mstamp Fixes: 7c1c7308592f ("tcp: do not pass timestamp to tcp_rack_detect_loss()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-27Merge tag 'kvm-arm-for-v4.12' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.12. Changes include: - Using the common sysreg definitions between KVM and arm64 - Improved hyp-stub implementation with support for kexec and kdump on the 32-bit side - Proper PMU exception handling - Performance improvements of our GIC handling - Support for irqchip in userspace with in-kernel arch-timers and PMU support - A fix for a race condition in our PSCI code Conflicts: Documentation/virtual/kvm/api.txt include/uapi/linux/kvm.h
2017-04-27kvm: nVMX: Remove superfluous VMX instruction fault checksJim Mattson
According to the Intel SDM, "Certain exceptions have priority over VM exits. These include invalid-opcode exceptions, faults based on privilege level*, and general-protection exceptions that are based on checking I/O permission bits in the task-state segment (TSS)." There is no need to check for faulting conditions that the hardware has already checked. * These include faults generated by attempts to execute, in virtual-8086 mode, privileged instructions that are not recognized in that mode. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27KVM: x86: fix emulation of RSM and IRET instructionsLadi Prosek
On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm on hflags is reverted later on in x86_emulate_instruction where hflags are overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu. Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after an instruction is emulated, this commit deletes emul_flags altogether and makes the emulator access vcpu->arch.hflags using two new accessors. This way all changes, on the emulator side as well as in functions called from the emulator and accessing vcpu state with emul_to_vcpu, are preserved. More details on the bug and its manifestation with Windows and OVMF: It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD. I believe that the SMM part explains why we started seeing this only with OVMF. KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because later on in x86_emulate_instruction we overwrite arch.hflags with ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call. The AMD-specific hflag of interest here is HF_NMI_MASK. When rebooting the system, Windows sends an NMI IPI to all but the current cpu to shut them down. Only after all of them are parked in HLT will the initiating cpu finish the restart. If NMI is masked, other cpus never get the memo and the initiating cpu spins forever, waiting for hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe. Fixes: a584539b24b8 ("KVM: x86: pass the whole hflags field to emulator and back") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27selftests: create cpufreq kconfig fragmentsNaresh Kamboju
For the better test coverage of cpufreq driver code these extra configurations are needed. Enable cpufreq governors and stats. Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-04-27selftests: x86: override clean in lib.mk to fix warningsShuah Khan
Add override with EXTRA_CLEAN for lib.mk clean to fix the following warnings from clean target run. Makefile:44: warning: overriding recipe for target 'clean' ../lib.mk:55: warning: ignoring old recipe for target 'clean' Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-04-27selftests: sync: override clean in lib.mk to fix warningsShuah Khan
Add override with EXTRA_CLEAN for lib.mk clean to fix the following warnings from clean target run. Makefile:24: warning: overriding recipe for target 'clean' ../lib.mk:55: warning: ignoring old recipe for target 'clean' Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-04-27selftests: splice: override clean in lib.mk to fix warningsShuah Khan
Add override with EXTRA_CLEAN for lib.mk clean to fix the following warnings from clean target run. Makefile:8: warning: overriding recipe for target 'clean' ../lib.mk:55: warning: ignoring old recipe for target 'clean' Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-04-27blk-mq-sched: alloate reserved tags out of normal poolJens Axboe
At least one driver, mtip32xx, has a hard coded dependency on the value of the reserved tag used for internal commands. While that should really be fixed up, for now let's ensure that we just bypass the scheduler tags an allocation marked as reserved. They are used for house keeping or error handling, so we can safely ignore them in the scheduler. Tested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-27mtip32xx: use runtime tag to initialize command headerMing Lei
mtip32xx supposes that 'request_idx' passed to .init_request() is tag of the request, and use that as request's tag to initialize command header. After MQ IO scheduler is in, request tag assigned isn't same with the request index anymore, so cause strange hardware failure on mtip32xx, even whole system panic is triggered. This patch fixes the issue by initializing command header via request's real tag. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-27KVM: mark requests that need synchronizationPaolo Bonzini
kvm_make_all_requests() provides a synchronization that waits until all kicked VCPUs have acknowledged the kick. This is important for KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is underway. This patch adds the synchronization property into all requests that are currently being used with kvm_make_all_requests() in order to preserve the current behavior and only introduce a new framework. Removing it from requests where it is not necessary is left for future patches. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27powerpc/ftrace/64: Split further based on -mprofile-kernelNaveen N. Rao
Split ftrace_64.S further retaining the core ftrace 64-bit aspects in ftrace_64.S and moving ftrace_caller() and ftrace_graph_caller() into separate files based on -mprofile-kernel. The livepatch routines are all now contained within the mprofile file. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc: Split ftrace bits into a separate fileNaveen N. Rao
entry_*.S now includes a lot more than just kernel entry/exit code. As a first step at cleaning this up, let's split out the ftrace bits into separate files. Also move all related tracing code into a new trace/ subdirectory. No functional changes. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm: Rename table dump file nameChristophe Leroy
Page table dump debugfs file is named 'kernel_page_tables' on all other architectures implementing it, while is is named 'kernel_pagetables' on powerpc. This patch renames it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm: On PPC32, display 32 bits addresses in page table dumpChristophe Leroy
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm: Fix missing page attributes in page table dumpChristophe Leroy
On some targets, _PAGE_RW is 0 and this is _PAGE_RO which is used. There is also _PAGE_SHARED that is missing. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm: Fix page table dump build on PPC32Christophe Leroy
On PPC32 (eg. mpc885_ads_defconfig), page table dump compilation fails as follows. This is because the memory layout is slightly different on PPC32. This patch adapts it. arch/powerpc/mm/dump_linuxpagetables.c: In function 'walk_pagetables': arch/powerpc/mm/dump_linuxpagetables.c:369:10: error: 'KERN_VIRT_START' undeclared (first use in this function) ... Fixes: 8eb07b187000d ("powerpc/mm: Dump linux pagetables") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm/radix: Optimise tlbiel flush all caseAneesh Kumar K.V
_tlbiel_pid() is called with a ric (Radix Invalidation Control) argument of either RIC_FLUSH_TLB or RIC_FLUSH_ALL. RIC_FLUSH_ALL says to invalidate the entire TLB and the Page Walk Cache (PWC). To flush the whole TLB, we have to iterate over each set (congruence class) of the TLB. Currently we do that and pass RIC_FLUSH_ALL each time. That is not incorrect but it means we flush the PWC 128 times, when once would suffice. Fix it by doing the first flush with the ric value we're passed, and then if it was RIC_FLUSH_ALL, we downgrade it to RIC_FLUSH_TLB, because we know we have just flushed the PWC and don't need to do it again. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Split out of combined patch, tweak logic, rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27powerpc/mm/radix: Optimise Page Walk Cache flushAneesh Kumar K.V
Currently we implement flushing of the page walk cache (PWC) by calling _tlbiel_pid() with a RIC (Radix Invalidation Control) value of 1 which says to only flush the PWC. But _tlbiel_pid() loops over each set (congruence class) of the TLB, which is not necessary when we're just flushing the PWC. In fact the set argument is ignored for a PWC flush, so essentially we're just flushing the PWC 127 extra times for no benefit. Fix it by adding tlbiel_pwc() which just does a single flush of the PWC. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Split out of combined patch, drop _ in name, rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-04-27KVM: return if kvm_vcpu_wake_up() did wake up the VCPURadim Krčmář
No need to kick a VCPU that we have just woken up. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27KVM: add explicit barrier to kvm_vcpu_kickAndrew Jones
kvm_vcpu_kick() must issue a general memory barrier prior to reading vcpu->mode in order to ensure correctness of the mutual-exclusion memory barrier pattern used with vcpu->requests. While the cmpxchg called from kvm_vcpu_kick(): kvm_vcpu_kick kvm_arch_vcpu_should_kick kvm_vcpu_exiting_guest_mode cmpxchg implies general memory barriers before and after the operation, that implication is only valid when cmpxchg succeeds. We need an explicit barrier for when it fails, otherwise a VCPU thread on its entry path that reads zero for vcpu->requests does not exclude the possibility the requesting thread sees !IN_GUEST_MODE when it reads vcpu->mode. kvm_make_all_cpus_request already had a barrier, so we remove it, as now it would be redundant. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27EDAC, ghes: Do not enable it by defaultBorislav Petkov
Leave it to the user to decide whether to enable this or not. Otherwise, platform-specific drivers won't initialize (currently, EDAC supports only a single platform driver loaded). Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-27KVM: perform a wake_up in kvm_make_all_cpus_requestRadim Krčmář
We want to have kvm_make_all_cpus_request() to be an optmized version of kvm_for_each_vcpu(i, vcpu, kvm) { kvm_make_request(vcpu, request); kvm_vcpu_kick(vcpu); } and kvm_vcpu_kick() wakes up the target vcpu. We know which requests do not need the wake up and use it to optimize the loop. Thanks to that, this patch doesn't change the behavior of current users (the all don't need the wake up) and only prepares for future where the wake up is going to be needed. I think that most requests do not need the wake up, so we would flip the bit then. Later on, kvm_make_request() will take care of kicking too, using this bit to make the decision whether to kick or not. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27KVM: mark requests that do not need a wakeupRadim Krčmář
Some operations must ensure that the guest is not running with stale data, but if the guest is halted, then the update can wait until another event happens. kvm_make_all_requests() currently doesn't wake up, so we can mark all requests used with it. First 8 bits were arbitrarily reserved for request numbers. Most uses of requests have the request type as a constant, so a compiler will optimize the '&'. An alternative would be to have an inline function that would return whether the request needs a wake-up or not, but I like this one better even though it might produce worse assembly. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-27KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_upRadim Krčmář
The #ifndef was protecting a missing halt_wakeup stat, but that is no longer necessary. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>