git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-12-08	KVM: nVMX: check host CR3 on vmentry and vmexit	Ladi Prosek
	This commit adds missing host CR3 checks. Before entering guest mode, the value of CR3 is checked for reserved bits. After returning, nested_vmx_load_cr3 is called to set the new CR3 value and check and load PDPTRs. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry	Ladi Prosek
	Loading CR3 as part of emulating vmentry is different from regular CR3 loads, as implemented in kvm_set_cr3, in several ways. * different rules are followed to check CR3 and it is desirable for the caller to distinguish between the possible failures * PDPTRs are not loaded if PAE paging and nested EPT are both enabled * many MMU operations are not necessary This patch introduces nested_vmx_load_cr3 suitable for CR3 loads as part of nested vmentry and vmexit, and makes use of it on the nested vmentry path. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: propagate errors from prepare_vmcs02	Ladi Prosek
	It is possible that prepare_vmcs02 fails to load the guest state. This patch adds the proper error handling for such a case. L1 will receive an INVALID_STATE vmexit with the appropriate exit qualification if it happens. A failure to set guest CR3 is the only error propagated from prepare_vmcs02 at the moment. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT	Ladi Prosek
	KVM does not correctly handle L1 hypervisors that emulate L2 real mode with PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used (see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two related issues: 1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are overwritten in ept_load_pdptrs because the registers are believed to have been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is running with PAE enabled but PDPTRs have been set up by L1. 2) When L2 is about to enable paging and loads its CR3, we, again, attempt to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees that this will succeed (it's just a CR3 load, paging is not enabled yet) and if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is then lost and L2 crashes right after it enables paging. This patch replaces the kvm_set_cr3 call with a simple register write if PAE and EPT are both on. CR3 is not to be interpreted in this case. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry	David Matlack
	vmx_set_cr0() modifies GUEST_EFER and "IA-32e mode guest" in the current VMCS. Call vmx_set_efer() after vmx_set_cr0() so that emulated VM-entry is more faithful to VMCS12. This patch correctly causes VM-entry to fail when "IA-32e mode guest" is 1 and GUEST_CR0.PG is 0. Previously this configuration would succeed and "IA-32e mode guest" would silently be disabled by KVM. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID	David Matlack
	MSR_IA32_CR{0,4}_FIXED1 define which bits in CR0 and CR4 are allowed to be 1 during VMX operation. Since the set of allowed-1 bits is the same in and out of VMX operation, we can generate these MSRs entirely from the guest's CPUID. This lets userspace avoiding having to save/restore these MSRs. This patch also initializes MSR_IA32_CR{0,4}_FIXED1 from the CPU's MSRs by default. This is a saner than the current default of -1ull, which includes bits that the host CPU does not support. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation	David Matlack
	KVM emulates MSR_IA32_VMX_CR{0,4}_FIXED1 with the value -1ULL, meaning all CR0 and CR4 bits are allowed to be 1 during VMX operation. This does not match real hardware, which disallows the high 32 bits of CR0 to be 1, and disallows reserved bits of CR4 to be 1 (including bits which are defined in the SDM but missing according to CPUID). A guest can induce a VM-entry failure by setting these bits in GUEST_CR0 and GUEST_CR4, despite MSR_IA32_VMX_CR{0,4}_FIXED1 indicating they are valid. Since KVM has allowed all bits to be 1 in CR0 and CR4, the existing checks on these registers do not verify must-be-0 bits. Fix these checks to identify must-be-0 bits according to MSR_IA32_VMX_CR{0,4}_FIXED1. This patch should introduce no change in behavior in KVM, since these MSRs are still -1ULL. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: support restore of VMX capability MSRs	David Matlack
	The VMX capability MSRs advertise the set of features the KVM virtual CPU can support. This set of features varies across different host CPUs and KVM versions. This patch aims to addresses both sources of differences, allowing VMs to be migrated across CPUs and KVM versions without guest-visible changes to these MSRs. Note that cross-KVM- version migration is only supported from this point forward. When the VMX capability MSRs are restored, they are audited to check that the set of features advertised are a subset of what KVM and the CPU support. Since the VMX capability MSRs are read-only, they do not need to be on the default MSR save/restore lists. The userspace hypervisor can set the values of these MSRs or read them from KVM at VCPU creation time, and restore the same value after every save/restore. Signed-off-by: David Matlack <dmatlack@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: nVMX: generate non-true VMX MSRs based on true versions	David Matlack
	The "non-true" VMX capability MSRs can be generated from their "true" counterparts, by OR-ing the default1 bits. The default1 bits are fixed and defined in the SDM. Since we can generate the non-true VMX MSRs from the true versions, there's no need to store both in struct nested_vmx. This also lets userspace avoid having to restore the non-true MSRs. Note this does not preclude emulating MSR_IA32_VMX_BASIC[55]=0. To do so, we simply need to set all the default1 bits in the true MSRs (such that the true MSRs and the generated non-true MSRs are equal). Signed-off-by: David Matlack <dmatlack@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.	Kyle Huey
	The trap flag stays set until software clears it. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: x86: Add kvm_skip_emulated_instruction and use it.	Kyle Huey
	kvm_skip_emulated_instruction calls both kvm_x86_ops->skip_emulated_instruction and kvm_vcpu_check_singlestep, skipping the emulated instruction and generating a trap if necessary. Replacing skip_emulated_instruction calls with kvm_skip_emulated_instruction is straightforward, except for: - ICEBP, which is already inside a trap, so avoid triggering another trap. - Instructions that can trigger exits to userspace, such as the IO insns, MOVs to CR8, and HALT. If kvm_skip_emulated_instruction does trigger a KVM_GUESTDBG_SINGLESTEP exit, and the handling code for IN/OUT/MOV CR8/HALT also triggers an exit to userspace, the latter will take precedence. The singlestep will be triggered again on the next instruction, which is the current behavior. - Task switch instructions which would require additional handling (e.g. the task switch bit) and are instead left alone. - Cases where VMLAUNCH/VMRESUME do not proceed to the next instruction, which do not trigger singlestep traps as mentioned previously. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12	Kyle Huey
	We can't return both the pass/fail boolean for the vmcs and the upcoming continue/exit-to-userspace boolean for skip_emulated_instruction out of nested_vmx_check_vmcs, so move skip_emulated_instruction out of it instead. Additionally, VMENTER/VMRESUME only trigger singlestep exceptions when they advance the IP to the following instruction, not when they a) succeed, b) fail MSR validation or c) throw an exception. Add a separate call to skip_emulated_instruction that will later not be converted to the variant that checks the singlestep flag. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: VMX: Reorder some skip_emulated_instruction calls	Kyle Huey
	The functions being moved ahead of skip_emulated_instruction here don't need updated IPs, and skipping the emulated instruction at the end will make it easier to return its value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	KVM: x86: Add a return value to kvm_emulate_cpuid	Kyle Huey
	Once skipping the emulated instruction can potentially trigger an exit to userspace (via KVM_GUESTDBG_SINGLESTEP) kvm_emulate_cpuid will need to propagate a return value. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-12-08	mmc: sdhci-cadence: add Cadence SD4HC support	Masahiro Yamada
	Add a driver for the Cadence SD4HC SD/SDIO/eMMC Controller. For SD, it basically relies on the SDHCI standard code. For eMMC, this driver provides some callbacks to support the hardware part that is specific to this IP design. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-12-08	mmc: sdhci: export sdhci_execute_tuning()	Masahiro Yamada
	Some SDHCI-compat controllers support not only SD, but also eMMC, but they use different commands for tuning: CMD19 for SD, CMD21 for eMMC. Due to the difference of the underlying mechanism, some controllers (at least, the Cadence IP is the case) provide their own registers for the eMMC tuning. This commit will be useful when we want to override .execute_tuning callback (for eMMC HS200 tuning), but still let it fall back to sdhci_execute_tuning() for SD timing. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-12-08	kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker()	Oleg Nesterov
	kthread_create_on_cpu() sets KTHREAD_IS_PER_CPU and kthread->cpu, this only makes sense if this kthread can be parked/unparked by cpuhp code. kthread workers never call kthread_parkme() so this has no effect. Change __kthread_create_worker() to simply call kthread_bind(task, cpu). The very fact that kthread_create_on_cpu() doesn't accept a generic fmt shows that it should not be used outside of smpboot.c. Now, the only reason we can not unexport this helper and move it into smpboot.c is that it sets kthread->cpu and struct kthread is not exported. And the only reason we can not kill kthread->cpu is that kthread_unpark() is used by drivers/gpu/drm/amd/scheduler/gpu_scheduler.c and thus we can not turn _unpark into kthread_unpark(struct smp_hotplug_thread *, cpu). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Petr Mladek <pmladek@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Chunming Zhou <David1.Zhou@amd.com> Cc: Roman Pen <roman.penyaev@profitbricks.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161129175110.GA5342@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	kthread: Don't use to_live_kthread() in kthread_[un]park()	Oleg Nesterov
	Now that to_kthread() is always validm change kthread_park() and kthread_unpark() to use it and kill to_live_kthread(). The conversion of kthread_unpark() is trivial. If KTHREAD_IS_PARKED is set then the task has called complete(&self->parked) and there the function cannot race against a concurrent kthread_stop() and exit. kthread_park() is more tricky, because its semantics are not well defined. It returns -ENOSYS if the thread exited but this can never happen and as Roman pointed out kthread_park() can obviously block forever if it would race with the exiting kthread. The usage of kthread_park() in cpuhp code (cpu.c, smpboot.c, stop_machine.c) is fine. It can never see an exiting/exited kthread, smpboot_destroy_threads() clears *ht->store, smpboot_park_thread() checks it is not NULL under the same smpboot_threads_lock. cpuhp_threads and cpu_stop_threads never exit, so other callers are fine too. But it has two more users: - watchdog_park_threads(): The code is actually correct, get_online_cpus() ensures that kthread_park() can't race with itself (note that kthread_park() can't handle this race correctly), but it should not use kthread_park() directly. - drivers/gpu/drm/amd/scheduler/gpu_scheduler.c should not use kthread_park() either. kthread_park() must not be called after amd_sched_fini() which does kthread_stop(), otherwise even to_live_kthread() is not safe because task_struct can be already freed and sched->thread can point to nowhere. The usage of kthread_park/unpark should either be restricted to core code which is properly protected against the exit race or made more robust so it is safe to use it in drivers. To catch eventual exit issues, add a WARN_ON(PF_EXITING) for now. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chunming Zhou <David1.Zhou@amd.com> Cc: Roman Pen <roman.penyaev@profitbricks.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161129175107.GA5339@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	kthread: Don't use to_live_kthread() in kthread_stop()	Oleg Nesterov
	kthread_stop() had to use to_live_kthread() simply because it was not possible to access kthread->exited after the exiting task clears task_struct->vfork_done. Now that to_kthread() is always valid, wake_up_process() + wait_for_completion() can be done ununconditionally. It's not an issue anymore if the task has already issued complete_vfork_done() or died. The exiting task can get the spurious wakeup after mm_release() but this is possible without this change too and is fine; do_task_dead() ensures that this can't make any harm. As a further enhancement this could be converted to task_work_add() later, so ->vfork_done can be avoided completely. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chunming Zhou <David1.Zhou@amd.com> Cc: Roman Pen <roman.penyaev@profitbricks.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161129175103.GA5336@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in ↵	Oleg Nesterov
	to_live_kthread() function" This reverts commit 23196f2e5f5d810578a772785807dcdc2b9fdce9. Now that struct kthread is kmalloc'ed and not longer on the task stack there is no need anymore to pin the stack. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chunming Zhou <David1.Zhou@amd.com> Cc: Roman Pen <roman.penyaev@profitbricks.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161129175100.GA5333@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	kthread: Make struct kthread kmalloc'ed	Oleg Nesterov
	commit 23196f2e5f5d "kthread: Pin the stack via try_get_task_stack() / put_task_stack() in to_live_kthread() function" is a workaround for the fragile design of struct kthread being allocated on the task stack. struct kthread in its current form should be removed, but this needs cleanups outside of kthread.c. As a first step move struct kthread away from the task stack by making it kmalloc'ed. This allows to access kthread.exited without the magic of trying to pin task stack and the try logic in to_live_kthread(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chunming Zhou <David1.Zhou@amd.com> Cc: Roman Pen <roman.penyaev@profitbricks.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161129175057.GA5330@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	ceph: don't set req->r_locked_dir in ceph_d_revalidate	Jeff Layton
	This function sets req->r_locked_dir which is supposed to indicate to ceph_fill_trace that the parent's i_rwsem is locked for write. Unfortunately, there is no guarantee that the dir will be locked when d_revalidate is called, so we really don't want ceph_fill_trace to do any dcache manipulation from this context. Clear req->r_locked_dir since it's clearly not safe to do that. What we really want to know with d_revalidate is whether the dentry still points to the same inode. ceph_fill_trace installs a pointer to the inode in req->r_target_inode, so we can just compare that to d_inode(dentry) to see if it's the same one after the lookup. Also, since we aren't generally interested in the parent here, we can switch to using a GETATTR to hint that to the MDS, which also means that we only need to reserve one cap. Finally, just remove the d_unhashed check. That's really outside the purview of a filesystem's d_revalidate. If the thing became unhashed while we're checking it, then that's up to the VFS to handle anyway. Fixes: 200fd27c8fa2 ("ceph: use lookup request to revalidate dentry") Link: http://tracker.ceph.com/issues/18041 Reported-by: Donatas Abraitis <donatas.abraitis@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-08	crypto: algif_aead - fix uninitialized variable warning	Stephan Mueller
	In case the user provided insufficient data, the code may return prematurely without any operation. In this case, the processed data indicated with outlen is zero. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-12-08	drm/omap: tpd12s015: fix error handling	Tomi Valkeinen
	tpd12s015 driver is missing error value handling for gpio initialization, causing 0 to be returned as an error if gpiod_get_* fails. This may cause deferred probing to fail. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2016-12-08	drm/omap: fix primary-plane's possible_crtcs	Tomi Valkeinen
	We set the possible_crtc for all planes to "(1 << priv->num_crtcs) - 1", which is fine as the HW planes can be used fro all crtcs. However, when we're doing that, we are still incrementing 'num_crtcs', and we'll end up with bad possible_crtcs, preventing the use of the primary planes. This patch passes a possible_crtcs mask to plane init function so that we get correct possible_crtc. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2016-12-08	drm: fix possible_crtc's type	Tomi Valkeinen
	drm_universal_plane_init() and drm_plane_init() take "unsigned long possible_crtcs" parameter, but then stuff it into uint32_t. Change the parameter to uint32_t. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2016-12-08	cris: No need to append -O2 and $(LINUXINCLUDE)	Paul Bolle
	The make variables asflags-y and ccflags-y are appended with -O2 and $(LINUXINCLUDE). But the build already picks up -O2 from the top Makefile and $(LINUXINCLUDE) from scripts/Makefile.lib. The net effect is that -O2 and the (long) list of include directories are used twice. This is harmless but pointless. So stop appending to these flags. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jesper Nilsson <jespern@axis.com>
2016-12-08	drm: Take ownership of the dmabuf->obj when exporting	Chris Wilson
	Currently the reference for the dmabuf->obj is incremented for the dmabuf in drm_gem_prime_handle_to_fd() (at the high level userspace interface), but is released in drm_gem_dmabuf_release() (the lowlevel handler). Improve the symmetry of the dmabuf->obj ownership by acquiring the reference in drm_gem_dmabuf_export(). This makes it easier to use the prime functions directly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Update kerneldoc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161207214527.22533-1-chris@chris-wilson.co.uk
2016-12-08	hotplug: Make register and unregister notifier API symmetric	Michal Hocko
	Yu Zhao has noticed that __unregister_cpu_notifier only unregisters its notifiers when HOTPLUG_CPU=y while the registration might succeed even when HOTPLUG_CPU=n if MODULE is enabled. This means that e.g. zswap might keep a stale notifier on the list on the manual clean up during the pool tear down and thus corrupt the list. Resulting in the following [ 144.964346] BUG: unable to handle kernel paging request at ffff880658a2be78 [ 144.971337] IP: [<ffffffffa290b00b>] raw_notifier_chain_register+0x1b/0x40 <snipped> [ 145.122628] Call Trace: [ 145.125086] [<ffffffffa28e5cf8>] __register_cpu_notifier+0x18/0x20 [ 145.131350] [<ffffffffa2a5dd73>] zswap_pool_create+0x273/0x400 [ 145.137268] [<ffffffffa2a5e0fc>] __zswap_param_set+0x1fc/0x300 [ 145.143188] [<ffffffffa2944c1d>] ? trace_hardirqs_on+0xd/0x10 [ 145.149018] [<ffffffffa2908798>] ? kernel_param_lock+0x28/0x30 [ 145.154940] [<ffffffffa2a3e8cf>] ? __might_fault+0x4f/0xa0 [ 145.160511] [<ffffffffa2a5e237>] zswap_compressor_param_set+0x17/0x20 [ 145.167035] [<ffffffffa2908d3c>] param_attr_store+0x5c/0xb0 [ 145.172694] [<ffffffffa290848d>] module_attr_store+0x1d/0x30 [ 145.178443] [<ffffffffa2b2b41f>] sysfs_kf_write+0x4f/0x70 [ 145.183925] [<ffffffffa2b2a5b9>] kernfs_fop_write+0x149/0x180 [ 145.189761] [<ffffffffa2a99248>] __vfs_write+0x18/0x40 [ 145.194982] [<ffffffffa2a9a412>] vfs_write+0xb2/0x1a0 [ 145.200122] [<ffffffffa2a9a732>] SyS_write+0x52/0xa0 [ 145.205177] [<ffffffffa2ff4d97>] entry_SYSCALL_64_fastpath+0x12/0x17 This can be even triggered manually by changing /sys/module/zswap/parameters/compressor multiple times. Fix this issue by making unregister APIs symmetric to the register so there are no surprises. Fixes: 47e627bc8c9a ("[PATCH] hotplug: Allow modules to use the cpu hotplug notifiers even if !CONFIG_HOTPLUG_CPU") Reported-and-tested-by: Yu Zhao <yuzhao@google.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Streetman <ddstreet@ieee.org> Link: http://lkml.kernel.org/r/20161207135438.4310-1-mhocko@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-08	drm: Allow CAP_PRIME on !MODESET	Daniel Vetter
	vgem (and our igt tests using vgem) need this. I suspect etnaviv will fare similarly. v2. Make it build. Oops. Fixes: d5264ed3823a ("drm: Return -ENOTSUPP when called for KMS cap with a non-KMS driver") Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161207144939.22756-1-daniel.vetter@ffwll.ch
2016-12-08	fbdev: remove current maintainer	Tomi Valkeinen
	Remove Tomi Valkeinen from fbdev maintainer and mark fbdev as orphan. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2016-12-08	xen/pci: Bubble up error and fix description.	Konrad Rzeszutek Wilk
	The function is never called under PV guests, and only shows up when MSI (or MSI-X) cannot be allocated. Convert the message to include the error value. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-08	xen: xenbus: set error code on failure	Pan Bian
	Variable err is initialized with 0. As a result, the return value may be 0 even if get_zeroed_page() fails to allocate memory. This patch fixes the bug, initializing err with "-ENOMEM". Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-08	xen: set error code on failures	Pan Bian
	Variable rc is reset in the loop, and its value will be non-negative during the second and after repeat of the loop. If it fails to allocate memory then, it may return a non-negative integer, which indicates no error. This patch fixes the bug, assigning "-ENOMEM" to rc when kzalloc() or alloc_page() returns NULL, and removing the initialization of rc outside of the loop. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-08	Bluetooth: SMP: Add support for H7 crypto function and CT2 auth flag	Johan Hedberg
	Bluetooth 5.0 introduces a new H7 key generation function that's used when both sides of the pairing set the CT2 authentication flag to 1. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-08	Bluetooth: btmrvl: drop duplicate header slab.h	Geliang Tang
	Drop duplicate header slab.h from btmrvl_drv.h. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-08	ieee802154: atusb: implement .set_frame_retries ops callback	Stefan Schmidt
	From firmware version 0.3 onwards we use the TX_ARET mode allowing for automatic frame retransmissions. To actually make use of this feature we need to implement the callback for setting the frame retries. If the firmware version is to old print a warning and return with invalid value. Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-08	ieee802154: atusb: try to read permanent extended address from device	Stefan Schmidt
	With version 0.3 the atusb firmware offers an interface to read a permanent EUI64 address from the devices EEPROM. This patch checks if the firmware is new enough and tries to read out and use the address. If this does not work we fall back to the original randomly generated address. Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-08	ieee802154: atusb: store firmware version after retrieval for later use	Stefan Schmidt
	The firmware versions will be used to enable selective features based on the available firmware on the device. Make sure we do not need to fetch it for every check but store it after the initial retrieval. Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-08	ieee802154: atusb: sync header file from firmware for new features	Stefan Schmidt
	This file is shared between the atusb firmware and the kernel driver. In this update it brings the new interfaces from version 0.3 of the firmware. Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-12-07	f2fs: fix to access nullified flush_cmd_control pointer	Jaegeuk Kim
	f2fs_sync_file() remount_ro - f2fs_readonly - destroy_flush_cmd_control - f2fs_issue_flush - no fcc pointer! So, this patch doesn't free fcc in this case, but just stop its kernel thread which sends flush commands. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge more fixes from Andrew Morton: "3 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kcov: add missing #include <linux/sched.h> radix tree test suite: fix compilation zram: restrict add/remove attributes to root only
2016-12-07	kcov: add missing #include <linux/sched.h>	Kefeng Wang
	In __sanitizer_cov_trace_pc we use task_struct and fields within it, but as we haven't included <linux/sched.h>, it is not guaranteed to be defined. While we usually happen to acquire the definition through a transitive include, this is fragile (and hasn't been true in the past, causing issues with backports). Include <linux/sched.h> to avoid any fragility. [mark.rutland@arm.com: rewrote changelog] Link: http://lkml.kernel.org/r/1481007384-27529-1-git-send-email-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07	radix tree test suite: fix compilation	Matthew Wilcox
	Patch "lib/radix-tree: Convert to hotplug state machine" breaks the test suite as it adds a call to cpuhp_setup_state_nocalls() which is not currently emulated in the test suite. Add it, and delete the emulation of the old CPU hotplug mechanism. Link: http://lkml.kernel.org/r/1480369871-5271-36-git-send-email-mawilcox@linuxonhyperv.com Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-07	zram: restrict add/remove attributes to root only	Sergey Senozhatsky
	zram hot_add sysfs attribute is a very 'special' attribute - reading from it creates a new uninitialized zram device. This file, by a mistake, can be read by a 'normal' user at the moment, while only root must be able to create a new zram device, therefore hot_add attribute must have S_IRUSR mode, not S_IRUGO. [akpm@linux-foundation.org: s/sence/sense/, reflow comment to use 80 cols] Fixes: 6566d1a32bf72 ("zram: add dynamic device add/remove functionality") Link: http://lkml.kernel.org/r/20161205155845.20129-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reported-by: Steven Allen <steven@stebalien.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> [4.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-08	devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks	Viresh Kumar
	The OPP structures are abused to the best here, without understanding how the OPP core and RCU locks work. In short, the OPP pointer saved in 'rk3399_dmcfreq' can become invalid under your nose, as the OPP core may free it. Fix various abuses around OPP structures and calls. Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-08	devfreq: rk3399_dmc: Remove dangling rcu_read_unlock()	Viresh Kumar
	This call never had the rcu_read_lock() counterpart. Remove the unlock part as well. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-08	devfreq: exynos: Don't use OPP structures outside of RCU locks	Viresh Kumar
	The OPP structures are abused to the best here, without understanding how the OPP core and RCU locks work. In short, the OPP pointer saved 'struct exynos_bus' can become invalid under your nose, as the OPP core may free it. Fix various abuses around OPP structures and calls. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Tested-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-08	Documentation: intel_pstate: Document HWP energy/performance hints	Srinivas Pandruvada
	Updated documentation for the support of energy performance hint in the HWP mode. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-08	cpufreq: intel_pstate: Support for energy performance hints with HWP	Srinivas Pandruvada
	It is possible to provide hints to the HWP algorithms in the processor to be more performance centric to more energy centric. These hints are provided by using HWP energy performance preference (EPP) or energy performance bias (EPB) settings. The scope of these settings is per logical processor, which means that each of the logical processors in the package can be programmed with a different value. This change provides cpufreq sysfs interface to provide hint. For each policy, two additional attributes will be available to check and provide hint. These attributes will only be present when the intel_pstate driver is using HWP mode. These attributes are: - energy_performance_available_preferences - energy_performance_preference To get list of supported hints: $ cat energy_performance_available_preferences default performance balance_performance balance_power power The current preference can be read or changed via cpufreq sysfs attribute "energy_performance_preference". Reading from this attribute will display current effective setting changed via any method. User can write any of the valid preference string to this attribute. User can always restore to power-on default by writing "default". Implementation Since these hints can be provided by direct MSR write or using some tools like x86_energy_perf_policy, the driver internally doesn't maintain any state. The user operation will result in direct read/write of MSR: 0x774 (HWP_REQUEST_MSR). Also driver use read modify write to update other fields in this MSR. Summary of changes: - struct cpudata field epp_saved is renamed to epp_powersave, as this stores the value to restore once policy is switched from performance to powersave to restore original powersave EPP value. - A new struct cpudata field epp_saved is used to store the raw MSR EPP/EPB value when a CPU goes offline or on suspend and restore on online/resume. This ensures that EPP value is restored to correct value irrespective of the means used to set. - EPP/EPB value ranges are fixed for each preference, which can be set for the cpufreq sysfs, so user request is mapped to/from this range. - New attributes are only added when HWP is present. - Since EPP value of 0 is valid the fields are initialized to -EINVAL when not valid. The field epp_default is read only once after powerup to avoid reading on subsequent CPU online operation - New suspend callback to store epp on suspend operation - Don't invalidate old epp_saved field on resume and online as now we can restore last epp value on suspend and this field can still have old EPP value sampled during switch to performance from powersave. - While here optimized setting of cpu_data->epp_powersave = epp in intel_pstate_hwp_set() as this was done in both true and false paths. - epp/epb set function returns error to caller on failure to pass on to user space for display. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>