linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-05-18	dax: use sb_issue_zerout instead of calling dax_clear_sectors	Matthew Wilcox
	dax_clear_sectors() cannot handle poisoned blocks. These must be zeroed using the BIO interface instead. Convert ext2 and XFS to use only sb_issue_zerout(). Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> [vishal: Also remove the dax_clear_sectors function entirely] Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2016-05-18	dax: enable dax in the presence of known media errors (badblocks)	Dan Williams
	1/ If a mapping overlaps a bad sector fail the request. 2/ Do not opportunistically report more dax-capable capacity than is requested when errors present. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> [vishal: fix a conflict with system RAM collision patches] [vishal: add a 'size' parameter to ->direct_access] [vishal: fix a conflict with DAX alignment check patches] Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2016-05-18	Merge branch 'work.lookups' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull parallel lookup fixups from Al Viro: "Fix for xfs parallel readdir (turns out the cxfs exposure was not enough to catch all problems), and a reversion of btrfs back to ->iterate() until the fs/btrfs/delayed-inode.c gets fixed" * 'work.lookups' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: xfs: concurrent readdir hangs on data buffer locks Revert "btrfs: switch to ->iterate_shared()"
2016-05-18	xfs: concurrent readdir hangs on data buffer locks	Dave Chinner
	There's a three-process deadlock involving shared/exclusive barriers and inverted lock orders in the directory readdir implementation. It's a pre-existing problem with lock ordering, exposed by the VFS parallelisation code. process 1 process 2 process 3 --------- --------- --------- readdir iolock(shared) get_leaf_dents iterate entries ilock(shared) map, lock and read buffer iunlock(shared) process entries in buffer ..... readdir iolock(shared) get_leaf_dents iterate entries ilock(shared) map, lock buffer <blocks> finish ->iterate_shared file_accessed() ->update_time start transaction ilock(excl) <blocks> ..... finishes processing buffer get next buffer ilock(shared) <blocks> And that's the deadlock. Fix this by dropping the current buffer lock in process 1 before trying to map the next buffer. This means we keep the lock order of ilock -> buffer lock intact and hence will allow process 3 to make progress and drop it's ilock(shared) once it is done. Reported-by: Xiong Zhou <xzhou@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-18	Revert "btrfs: switch to ->iterate_shared()"	Al Viro
	This reverts commit 972b241f8441dc37a3f89dcd7e71d7f013873d13. Quoth Chris: didn't take the delayed inode stuff into account it got an rbtree of items and it pulls things out so in shared mode, its hugely racey sorry, lets revert and fix it for real inside of btrfs Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-18	Merge branch 'sendmsg.cifs' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull cifs iovec cleanups from Al Viro. * 'sendmsg.cifs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: cifs: don't bother with kmap on read_pages side cifs_readv_receive: use cifs_read_from_socket() cifs: no need to wank with copying and advancing iovec on recvmsg side either cifs: quit playing games with draining iovecs cifs: merge the hash calculation helpers
2016-05-18	drm: remove unused dev variables	Arnd Bergmann
	After drm_gem_object_lookup() was changed along with all its callers, we have several drivers that have unused variables: drm/armada/armada_crtc.c: In function 'armada_drm_crtc_cursor_set': drm/armada/armada_crtc.c:900:21: error: unused variable 'dev' [-Werror=unused-variable] drm/nouveau/nouveau_gem.c: In function 'validate_init': drm/nouveau/nouveau_gem.c:371:21: error: unused variable 'dev' [-Werror=unused-variable] drm/nouveau/nv50_display.c: In function 'nv50_crtc_cursor_set': drm/nouveau/nv50_display.c:1308:21: error: unused variable 'dev' [-Werror=unused-variable] drm/radeon/radeon_cs.c: In function 'radeon_cs_parser_relocs': drm/radeon/radeon_cs.c:77:21: error: unused variable 'ddev' [-Werror=unused-variable] This fixes all the instances I found with ARM randconfig builds so far. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: a8ad0bd84f98 ("drm: Remove unused drm_device from drm_gem_object_lookup()") Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1463587653-3035181-6-git-send-email-arnd@arndb.de
2016-05-18	drm: mediatek: fixup drm_gem_object_lookup API change	Arnd Bergmann
	The drm_gem_object_lookup() function prototype changed while this driver was added, so it fails to build now: drivers/gpu/drm/mediatek/mtk_drm_gem.c: In function 'mtk_drm_gem_dumb_map_offset': drivers/gpu/drm/mediatek/mtk_drm_gem.c:142:30: error: passing argument 1 of 'drm_gem_object_lookup' from incompatible pointer type [-Werror=incompatible-pointer-types] obj = drm_gem_object_lookup(dev, file_priv, handle); This fixes the new caller as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: a8ad0bd84f98 ("drm: Remove unused drm_device from drm_gem_object_lookup()") Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1463587653-3035181-4-git-send-email-arnd@arndb.de
2016-05-18	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull remaining vfs xattr work from Al Viro: "The rest of work.xattr (non-cifs conversions)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: btrfs: Switch to generic xattr handlers ubifs: Switch to generic xattr handlers jfs: Switch to generic xattr handlers jfs: Clean up xattr name mapping gfs2: Switch to generic xattr handlers ceph: kill __ceph_removexattr() ceph: Switch to generic xattr handlers ceph: Get rid of d_find_alias in ceph_set_acl
2016-05-18	Merge branch 'for-4.7/acpi6.1' into libnvdimm-for-next	Dan Williams

2016-05-18	Merge branch 'for-4.7/dsm' into libnvdimm-for-next	Dan Williams

2016-05-18	Merge branch 'for-4.7/libnvdimm' into libnvdimm-for-next	Dan Williams

2016-05-18	Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds
	Pull cifs updates from Steve French: "Various small CIFS and SMB3 fixes (including some for stable)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: remove directory incorrectly tries to set delete on close on non-empty directories Update cifs.ko version to 2.09 fs/cifs: correctly to anonymous authentication for the NTLM(v2) authentication fs/cifs: correctly to anonymous authentication for the NTLM(v1) authentication fs/cifs: correctly to anonymous authentication for the LANMAN authentication fs/cifs: correctly to anonymous authentication via NTLMSSP cifs: remove any preceding delimiter from prefix_path cifs: Use file_dentry()
2016-05-18	Merge branch 'for-4.7/dax' into libnvdimm-for-next	Dan Williams

2016-05-18	libnvdimm: stop requiring a driver ->remove() method	Dan Williams
	The dax_pmem driver was implementing an empty ->remove() method to satisfy the nvdimm bus driver that unconditionally calls ->remove(). Teach the core bus driver to check if ->remove() is NULL to remove that requirement. Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-05-18	KVM: MTRR: remove MSR 0x2f8	Andy Honig
	MSR 0x2f8 accessed the 124th Variable Range MTRR ever since MTRR support was introduced by 9ba075a664df ("KVM: MTRR support"). 0x2f8 became harmful when 910a6aae4e2e ("KVM: MTRR: exactly define the size of variable MTRRs") shrinked the array of VR MTRRs from 256 to 8, which made access to index 124 out of bounds. The surrounding code only WARNs in this situation, thus the guest gained a limited read/write access to struct kvm_arch_vcpu. 0x2f8 is not a valid VR MTRR MSR, because KVM has/advertises only 16 VR MTRR MSRs, 0x200-0x20f. Every VR MTRR is set up using two MSRs, 0x2f8 was treated as a PHYSBASE and 0x2f9 would be its PHYSMASK, but 0x2f9 was not implemented in KVM, therefore 0x2f8 could never do anything useful and getting rid of it is safe. This fixes CVE-2016-3713. Fixes: 910a6aae4e2e ("KVM: MTRR: exactly define the size of variable MTRRs") Cc: stable@vger.kernel.org Reported-by: David Matlack <dmatlack@google.com> Signed-off-by: Andy Honig <ahonig@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: make hwapic_isr_update and hwapic_irr_update look the same	Paolo Bonzini
	Neither APICv nor AVIC actually need the first argument of hwapic_isr_update, but the vCPU makes more sense than passing the pointer to the whole virtual machine! In fact in the APICv case it's just happening that the vCPU is used implicitly, through the loaded VMCS. The second argument instead is named differently, make it consistent. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Manage vcpu load/unload when enable AVIC	Suravee Suthikulpanit
	When a vcpu is loaded/unloaded to a physical core, we need to update host physical APIC ID information in the Physical APIC-ID table accordingly. Also, when vCPU is blocking/un-blocking (due to halt instruction), we need to make sure that the is-running bit in set accordingly in the physical APIC-ID table. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> [Return void from new functions, add WARN_ON when they returned negative errno; split load and put into separate function as they have almost nothing in common. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Do not intercept CR8 when enable AVIC	Suravee Suthikulpanit
	When enable AVIC: * Do not intercept CR8 since this should be handled by AVIC HW. * Also, we don't need to sync cr8/V_TPR and APIC backing page. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [Rename svm_in_nested_interrupt_shadow to svm_nested_virtualize_tpr. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Do not expose x2APIC when enable AVIC	Suravee Suthikulpanit
	Since AVIC only virtualizes xAPIC hardware for the guest, this patch disable x2APIC support in guest CPUID. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Introducing kvm_x86_ops.apicv_post_state_restore	Suravee Suthikulpanit
	Adding kvm_x86_ops hooks to allow APICv to do post state restore. This is required to support VM save and restore feature. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Add VMEXIT handlers for AVIC	Suravee Suthikulpanit
	This patch introduces VMEXIT handlers, avic_incomplete_ipi_interception() and avic_unaccelerated_access_interception() along with two trace points (trace_kvm_avic_incomplete_ipi and trace_kvm_avic_unaccelerated_access). Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Add interrupt injection via AVIC	Suravee Suthikulpanit
	This patch introduces a new mechanism to inject interrupt using AVIC. Since VINTR is not supported when enable AVIC, we need to inject interrupt via APIC backing page instead. This patch also adds support for AVIC doorbell, which is used by KVM to signal a running vcpu to check IRR for injected interrupts. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Detect and Initialize AVIC support	Suravee Suthikulpanit
	This patch introduces AVIC-related data structure, and AVIC initialization code. There are three main data structures for AVIC: * Virtual APIC (vAPIC) backing page (per-VCPU) * Physical APIC ID table (per-VM) * Logical APIC ID table (per-VM) Currently, AVIC is disabled by default. Users can manually enable AVIC via kernel boot option kvm-amd.avic=1 or during kvm-amd module loading with parameter avic=1. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> [Avoid extra indentation (Boris). - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	svm: Introduce new AVIC VMCB registers	Suravee Suthikulpanit
	Introduce new AVIC VMCB registers. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: split kvm_vcpu_wake_up from kvm_vcpu_kick	Radim Krčmář
	AVIC has a use for kvm_vcpu_wake_up. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Introducing kvm_x86_ops VCPU blocking/unblocking hooks	Suravee Suthikulpanit
	Adding new function pointer in struct kvm_x86_ops, and calling them from the kvm_arch_vcpu[blocking/unblocking]. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Introducing kvm_x86_ops VM init/destroy hooks	Suravee Suthikulpanit
	Adding function pointers in struct kvm_x86_ops for processor-specific layer to provide hooks for when KVM initialize and destroy VM. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Rename kvm_apic_get_reg to kvm_lapic_get_reg	Suravee Suthikulpanit
	Rename kvm_apic_get_reg to kvm_lapic_get_reg to be consistent with the existing kvm_lapic_set_reg counterpart. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: x86: Misc LAPIC changes to expose helper functions	Suravee Suthikulpanit
	Exporting LAPIC utility functions and macros for re-use in SVM code. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	KVM: shrink halt polling even more for invalid wakeups	Christian Borntraeger
	commit 3491caf2755e ("KVM: halt_polling: provide a way to qualify wakeups during poll") added more aggressive shrinking of the polling interval if the wakeup did not match some criteria. This still allows to keep polling enabled if the polling time was smaller that the current max poll time (block_ns <= vcpu->halt_poll_ns). Performance measurement shows that even more aggressive shrinking (shrink polling on any invalid wakeup) reduces absolute and relative (to the workload) CPU usage even further. Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Radim Krčmář <rkrcmar@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-18	drm/tegra: Fix crash caused by reference count imbalance	Jon Hunter
	Commit d2307dea14a4 ("drm/atomic: use connector references (v3)") added reference counting for DRM connectors and this caused a crash when exercising system suspend on Tegra114 Dalmore. The Tegra DSI driver implements a Tegra specific function, tegra_dsi_connector_duplicate_state(), to duplicate the connector state and destroys the state using the generic helper function, drm_atomic_helper_connector_destroy_state(). Following commit d2307dea14a4 ("drm/atomic: use connector references (v3)") there is now an imbalance in the connector reference count because the Tegra function to duplicate state does not take a reference when duplicating the state information. However, the generic helper function to destroy the state information assumes a reference has been taken and during system suspend, when the connector state is destroyed, this leads to a crash because we attempt to put the reference for an object that has already been freed. Fix this by calling __drm_atomic_helper_connector_duplicate_state() from tegra_dsi_connector_duplicate_state() to ensure that we take a reference on a connector if crtc is set. Note that this will also copy the connector state a 2nd time, but this should be harmless. By fixing tegra_dsi_connector_duplicate_state() to take a reference, although a crash was no longer seen, it was then observed that after each system suspend-resume cycle, the reference would be one greater than before the suspend-resume cycle. Following commit d2307dea14a4 ("drm/atomic: use connector references (v3)"), it was found that we also need to put the reference when calling the function tegra_dsi_connector_reset() before freeing the state. Fix this by updating tegra_dsi_connector_reset() to call the function __drm_atomic_helper_connector_destroy_state() in order to put the reference for the connector. Fixes: d2307dea14a4 ("drm/atomic: use connector references (v3)") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1463585856-16606-1-git-send-email-jonathanh@nvidia.com
2016-05-18	IB/mlx5: Fire the CQ completion handler from tasklet	Matan Barak
	Previously, mlx5_ib_cq_comp was executed from interrupt context. Under heavy load, this could cause the CPU core to be in an interrupt context too long. Instead of executing the handler from the interrupt context we execute it from a much friendly tasklet context. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18	net/mlx5_core: Use tasklet for user-space CQ completion events	Matan Barak
	Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx5 Ethernet napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18	ALSA: firewire-lib: change a member of event structure to suppress sparse ↵	Takashi Sakamoto
	wanings to bool type Commit a9c4284bf5a9 ("ALSA: firewire-lib: add context information to tracepoints") adds new members to tracepoint events of this module, to represent context information. One of the members is bool type and this causes sparse warnings. 16:1: warning: expression using sizeof bool 60:1: warning: expression using sizeof bool 16:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) 60:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) This commit suppresses the warnings, by changing type of the member to 'unsigned int'. Additionally, this commit applies '!!' idiom to get 0/1 from 'in_interrupt()'. Fixes: a9c4284bf5a9 ("ALSA: firewire-lib: add context information to tracepoints") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2016-05-18	IB/core: Do not require CAP_NET_ADMIN for packet sniffing	Christoph Lameter
	In the Ethernet/TCP world, CAP_NET_RAW is sufficient to allow a program to listen to all incoming packets on a specific interface, and the higher CAP_NET_ADMIN is required to set the interface into promiscuous mode. We want to emulate that same basic division of privilege in the RDMA stack, so when dealing with Raw Ethernet QPs, allow apps with CAP_NET_RAW to listen to all incoming flows (and direct them as they see fit in their own listen stream). Do not require CAP_NET_ADMIN just to listen to traffic already incoming. Reserve CAP_NET_ADMIN if we attempt to set promiscuous mode. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18	IB/mlx4: Fix unaligned access in send_reply_to_slave	shamir rabinovitch
	The problem is that the function 'send_reply_to_slave' gets the 'req_sa_mad' as a pointer whose address is only aliged to 4 bytes but is 8 bytes in size. This can result in unaligned access faults on certain architectures. Sowmini Varadhan pointed to this reply from Dave Miller that say that memcpy should not be used to solve alignment issues: https://lkml.org/lkml/2015/10/21/352 Optimization of memcpy to 'ldx' instruction can only happen if the compiler knows that the size of the data we are copying is 8 bytes and it assumes it is aligned to 8 bytes. If the compiler know the type is not aligned to 8 it must not optimize the 8 byte copy. Defining the data type as aligned to 4 forces the compiler to treat all accesses as though they aren't aligned and avoids the 'ldx' optimization. Full credit for the idea goes to Jason Gunthorpe <jgunthorpe@obsidianresearch.com>. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18	perf/x86/intel/uncore: Remove WARN_ON_ONCE in uncore_pci_probe	Jiri Olsa
	When booting with nr_cpus=1, uncore_pci_probe tries to init the PCI/uncore also for the other packages and fails with warning when they are not found. The warning is bogus because it's correct to fail here for packages which are not initialized. Remove it and return silently. Fixes: cf6d445f6897 "perf/x86/uncore: Track packages, not per CPU data" Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-05-18	drm: Fix error handling in drm_connector_register	Daniel Vetter
	When debugfs or sysfs registration failed, we failed to clean up the idr registration. Reorder to fix this. Cc: Dave Airlie <airlied@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1462539302-27764-1-git-send-email-daniel.vetter@ffwll.ch
2016-05-18	drm: Avoid connector reference imbalance on error path	Chris Wilson
	Whilst looking at the fallout from using connector references for atomic, I noticed that there is an early return buried in drm_atomic_set_crtc_for_connector() that if hit could cause us to leak a reference on the connector. Fixes: d2307dea14 (drm/atomic: use connector references (v3)) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Stone <daniels@collabora.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1462535265-13058-1-git-send-email-chris@chris-wilson.co.uk
2016-05-18	Merge branches 'thermal-core', 'thermal-intel' and 'thermal-soc' into next	Zhang Rui

2016-05-18	mfd: hi655x: Add MFD driver for hi655x	Chen Feng
	Add PMIC MFD driver to support hisilicon hi665x. Signed-off-by: Chen Feng <puck.chen@hisilicon.com> Signed-off-by: Fei Wang <w.f@huawei.com> Signed-off-by: Xinwei Kong <kong.kongxinwei@hisilicon.com> Reviewed-by: Haojian Zhuang <haojian.zhuang@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-05-18	arc: axs103_smp: Fix CPU frequency to 100MHz for dual-core	Alexey Brodkin
	The most recent release of AXS103 [v1.1] is proven to work at 100 MHz in dual-core mode so this change uses mentioned feature. For that we: * Update axc003_idu.dtsi with mention of really-used CPU clock freq * Remove clock override in AXS platform code for dual-core HW Note we're still leaving a hack for clock "downgrade" on early boot for quad-core hardware. Also note this change will break functionality of AXS103 v1.0 hardware. That means all users of AXS103 __must__ upgrade their boards with the most recent firmware. Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-05-18	xfs: move reclaim tagging functions	Dave Chinner
	Rearrange the inode tagging functions so that they are higher up in xfs_cache.c and so there is no need for forward prototypes to be defined. This is purely code movement, no other change. Signed-off-by: Dave Chinner <dchinner@redhat.com>
2016-05-18	xfs: simplify inode reclaim tagging interfaces	Dave Chinner
	Inode radix tree tagging for reclaim passes a lot of unnecessary variables around. Over time the xfs-perag has grown a xfs_mount backpointer, and an internal agno so we don't need to pass other variables into the tagging functions to supply this information. Rework the functions to pass the minimal variable set required and simplify the internal logic and flow. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18	xfs: rename variables in xfs_iflush_cluster for clarity	Dave Chinner
	The cluster inode variable uses unconventional naming - iq - which makes it hard to distinguish it between the inode passed into the function - ip - and that is a vector for mistakes to be made. Rename all the cluster inode variables to use a more conventional prefixes to reduce potential future confusion (cilist, cilist_size, cip). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18	xfs: xfs_iflush_cluster has range issues	Dave Chinner
	xfs_iflush_cluster() does a gang lookup on the radix tree, meaning it can find inodes beyond the current cluster if there is sparse cache population. gang lookups return results in ascending index order, so stop trying to cluster inodes once the first inode outside the cluster mask is detected. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18	xfs: mark reclaimed inodes invalid earlier	Dave Chinner
	The last thing we do before using call_rcu() on an xfs_inode to be freed is mark it as invalid. This means there is a window between when we know for certain that the inode is going to be freed and when we do actually mark it as "freed". This is important in the context of RCU lookups - we can look up the inode, find that it is valid, and then use it as such not realising that it is in the final stages of being freed. As such, mark the inode as being invalid the moment we know it is going to be reclaimed. This can be done while we still hold the XFS_ILOCK_EXCL and the flush lock in xfs_inode_reclaim, meaning that it occurs well before we remove it from the radix tree, and that the i_flags_lock, the XFS_ILOCK and the inode flush lock all act as synchronisation points for detecting that an inode is about to go away. For defensive purposes, this allows us to add a further check to xfs_iflush_cluster to ensure we skip inodes that are being freed after we grab the XFS_ILOCK_SHARED and the flush lock - we know that if the inode number if valid while we have these locks held we know that it has not progressed through reclaim to the point where it is clean and is about to be freed. [bfoster: fixed __xfs_inode_clear_reclaim() using ip->i_ino after it had already been zeroed.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18	xfs: xfs_inode_free() isn't RCU safe	Dave Chinner
	The xfs_inode freed in xfs_inode_free() has multiple allocated structures attached to it. We free these in xfs_inode_free() before we mark the inode as invalid, and before we run call_rcu() to queue the structure for freeing. Unfortunately, this freeing can race with other accesses that are in the RCU current grace period that have found the inode in the radix tree with a valid state. This includes xfs_iflush_cluster(), which calls xfs_inode_clean(), and that accesses the inode log item on the xfs_inode. The log item structure is freed in xfs_inode_free(), so there is the possibility we can be accessing freed memory in xfs_iflush_cluster() after validating the xfs_inode structure as being valid for this RCU context. Hence we can get spuriously incorrect clean state returned from such checks. This can lead to use thinking the inode is dirty when it is, in fact, clean, and so incorrectly attaching it to the buffer for IO and completion processing. This then leads to use-after-free situations on the xfs_inode itself if the IO completes after the current RCU grace period expires. The buffer callbacks will access the xfs_inode and try to do all sorts of things it shouldn't with freed memory. IOWs, xfs_iflush_cluster() only works correctly when racing with inode reclaim if the inode log item is present and correctly stating the inode is clean. If the inode is being freed, then reclaim has already made sure the inode is clean, and hence xfs_iflush_cluster can skip it. However, we are accessing the inode inode under RCU read lock protection and so also must ensure that all dynamically allocated memory we reference in this context is not freed until the RCU grace period expires. To fix this, move all the potential memory freeing into xfs_inode_free_callback() so that we are guarantee RCU protected lookup code will always have the memory structures it needs available during the RCU grace period that lookup races can occur in. Discovered-by: Brain Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-18	xfs: optimise xfs_iext_destroy	Alex Lyakas
	When unmounting XFS, we call: xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy This goes over the whole indirection array and calls xfs_iext_irec_remove for each one of the erps (from the last one to the first one). As a result, we keep shrinking (reallocating actually) the indirection array until we shrink out all of its elements. When we have files with huge numbers of extents, umount takes 30-80 sec, depending on the amount of files that XFS loaded and the amount of indirection entries of each file. The unmount stack looks like: [<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs] [<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs] [<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs] [<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs] [<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs] [<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs] [<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs] [<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs] [<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs] [<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs] [<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100 [<ffffffff811ea207>] kill_block_super+0x27/0x70 [<ffffffff811ea519>] deactivate_locked_super+0x49/0x60 [<ffffffff811eaaee>] deactivate_super+0x4e/0x70 [<ffffffff81207593>] cleanup_mnt+0x43/0x90 [<ffffffff81207632>] __cleanup_mnt+0x12/0x20 [<ffffffff8108f8e7>] task_work_run+0xa7/0xe0 [<ffffffff81014ff7>] do_notify_resume+0x97/0xb0 [<ffffffff81717c6f>] int_signal+0x12/0x17 Further, this reallocation prevents us from freeing the extent list from a RCU callback as allocation can block. Hence if the extent list is in indirect format, optimise the freeing of the extent list to only use kmem_free calls by freeing entire extent buffer pages at a time, rather than extent by extent. [dchinner: simplified freeing loop based on Christoph's suggestion] Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>