linux.git - Linus' kernel tree

Age	Commit message	Author
2016-12-09	powerpc/fsl/dts: add QMan and BMan nodes on t1024	Madalin Bucur
	Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09	powerpc/fsl/dts: add QMan and BMan nodes on t1023	Madalin Bucur
	Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09	soc/fsl/qman: test: use DEFINE_SPINLOCK()	Fabian Frederick
	Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09	powerpc/fsl-lbc: use DEFINE_SPINLOCK()	Fabian Frederick
	Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09	powerpc/8xx: Implement support of hugepages	Christophe Leroy
	8xx uses a two-level page table and supports two different Linux page sizes (4k and 16k). 8xx also supports two different hugepage sizes, 512k and 8M. In order to support them on Linux we define two different page table layouts. The size of pages is in the PGD entry, using the PS field (bits 28-29): 00 : Small pages (4k or 16k) 01 : 512k pages 10 : reserved 11 : 8M pages For the 512k hugepage size, a pgd entry has the format [<hugepte address>0101]. The hugepte table allocated will contain 8 entries pointing to 512k huge ptes in 4k pages mode and 128 entries in 16k pages mode. For 8M in 16k mode, a pgd entry has the format [<hugepte address>1101]. The hugepte table allocated will contain 8 entries pointing to 8M huge ptes. For 8M in 4k mode, multiple pgd entries point to the same hugepte address and each pgd entry has the format [<hugepte address>1101]. The hugepte table allocated will only have one entry. For the time being, we do not support the CPU15 erratum when HUGETLB is selected. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits) Signed-off-by: Scott Wood <oss@buserror.net>
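The PS encoding described above can be illustrated with a small decoder. This is a minimal sketch, not the kernel's code: the enum and function names are hypothetical, and it assumes IBM bit numbering on a 32-bit word (bit 0 is the most significant bit), which places bits 28-29 two positions above the least significant bit.

```c
#include <stdint.h>

/* Page-size classes of the PS field, as listed in the commit message. */
enum pgd_ps {
    PGD_PS_SMALL    = 0x0,  /* 00: small pages (4k or 16k) */
    PGD_PS_512K     = 0x1,  /* 01: 512k hugepages */
    PGD_PS_RESERVED = 0x2,  /* 10: reserved */
    PGD_PS_8M       = 0x3   /* 11: 8M hugepages */
};

/*
 * Assumption: IBM bit numbering (bit 0 = MSB of a 32-bit word), so
 * bits 28-29 are reached by shifting right by 2 and masking 2 bits.
 * The name pgd_entry_ps() is hypothetical.
 */
static enum pgd_ps pgd_entry_ps(uint32_t pgd_entry)
{
    return (enum pgd_ps)((pgd_entry >> 2) & 0x3u);
}
```

With this reading, an entry ending in the nibble 0101 decodes as a 512k hugepage directory and one ending in 1101 as an 8M one, matching the two formats quoted in the message.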
2016-12-09	powerpc: get hugetlbpage handling more generic	Christophe Leroy
	Today there are two implementations of hugetlbpages which are managed by exclusive #ifdefs: * FSL_BOOKE: several directory entries point to the same single hugepage * BOOK3S: one upper level directory entry points to a table of hugepages In preparation for implementing hugepage support on the 8xx, we need a mix of the two above solutions, because the 8xx needs both cases depending on the size of pages: * In 4k page size mode, each PGD entry covers a 4M bytes area. It means that 2 PGD entries will be necessary to cover an 8M hugepage while a single PGD entry will cover 8x 512k hugepages. * In 16k page size mode, each PGD entry covers a 64M bytes area. It means that 8x 8M hugepages will be covered by one PGD entry and 128x 512k hugepages will be covered by one PGD entry. This patch: * removes #ifdefs in favor of if/else based on the range sizes * merges the two huge_pte_alloc() functions as they are pretty similar * merges the two hugetlbpage_init() functions as they are pretty similar Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3) Signed-off-by: Scott Wood <oss@buserror.net>
2016-12-09	powerpc: port 64 bits pgtable_cache to 32 bits	Christophe Leroy
	Today powerpc64 uses a set of pgtable_caches while powerpc32 uses standard pages when using 4k pages and a single pgtable_cache if using other size pages. In preparation of implementing huge pages on the 8xx, this patch replaces the specific powerpc32 handling by the 64 bits approach. This is done by: * moving 64 bits pgtable_cache_add() and pgtable_cache_init() in a new file called init-common.c * modifying pgtable_cache_init() to also handle the case without PMD * removing the 32 bits version of pgtable_cache_add() and pgtable_cache_init() * copying related header contents from 64 bits into both the book3s/32 and nohash/32 header files On the 8xx, the following cache sizes will be used: * 4k pages mode: - PGT_CACHE(10) for PGD - PGT_CACHE(3) for 512k hugepage tables * 16k pages mode: - PGT_CACHE(6) for PGD - PGT_CACHE(7) for 512k hugepage tables - PGT_CACHE(3) for 8M hugepage tables Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Scott Wood <oss@buserror.net>
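The PGT_CACHE() indices listed above can be cross-checked with simple arithmetic. The sketch below assumes that the index is the base-2 logarithm of the number of entries in a table, that the 8xx has a 4G (32-bit) address space, and that one PGD entry covers 4M in 4k mode and 64M in 16k mode, as stated in the hugetlbpage commit message. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Base-2 logarithm of an exact power of two. */
static unsigned int log2_exact(uint64_t n)
{
    unsigned int shift = 0;

    assert(n != 0 && (n & (n - 1)) == 0);
    while (n > 1) {
        n >>= 1;
        shift++;
    }
    return shift;
}

/*
 * Index bits needed for a table whose entries each map `per_entry`
 * bytes and which together span `span` bytes. Hypothetical helper.
 */
static unsigned int table_index_bits(uint64_t span, uint64_t per_entry)
{
    return log2_exact(span / per_entry);
}
```

For example, `table_index_bits(4ULL << 30, 4ULL << 20)` gives 10 for the PGD in 4k mode, and `table_index_bits(64ULL << 20, 512ULL << 10)` gives 7 for the 512k hugepage tables in 16k mode.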
2016-12-09	nfs: add support for the umask attribute	Andreas Gruenbacher
	Clients can set the umask attribute when creating files to cause the server to apply it always except when inheriting permissions from the parent directory. That way, the new files will end up with the same permissions as files created locally. See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more details. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09	net: mlx5: Fix Kconfig help text	Christopher Covington
	Since the following commit, Infiniband and Ethernet have not been mutually exclusive. Fixes: 4aa17b28 mlx5: Enable mutual support for IB and Ethernet Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	net: skb_condense() can also deal with empty skbs	Eric Dumazet
	It seems attackers can also send UDP packets with no payload at all. skb_condense() can still be a win in this case. It will be possible to replace the custom code in tcp_add_backlog() to get full benefit from skb_condense() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	net: smsc911x: back out silently on probe deferrals	Linus Walleij
	When trying to get a regulator we may get deferred and we see this noise: smsc911x 1b800000.ethernet-ebi2 (unnamed net_device) (uninitialized): couldn't get regulators -517 Then the driver continues anyway, which means that the regulator may not be properly retrieved and reference counted, and may be switched off in case no one else is using it. Fix this by returning silently on deferred probe and let the system work it out. Cc: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
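The "return silently on deferral" pattern can be sketched in plain C as follows. This is an illustration only, not the driver's code: the function names are hypothetical, and EPROBE_DEFER is defined locally because it is a kernel-internal errno (the -517 in the log line above) that userspace headers do not provide.

```c
#include <stdbool.h>

/* Kernel-internal errno meaning "dependency not ready, probe again later". */
#define EPROBE_DEFER 517

/* An error deserves a log line unless it is a probe deferral. */
static bool probe_error_is_noteworthy(int ret)
{
    return ret != 0 && ret != -EPROBE_DEFER;
}

/*
 * Hypothetical probe step: report through *log_it whether the caller
 * should print an error, and hand the result back unchanged so that a
 * deferral propagates instead of being ignored.
 */
static int handle_regulator_request(int ret, bool *log_it)
{
    *log_it = probe_error_is_noteworthy(ret);
    return ret;
}
```

The point of the design is that a deferral is expected and retried by the driver core, so it is propagated without noise, while genuine errors are still reported.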
2016-12-09	Merge tag 'mac80211-next-for-davem-2016-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	David S. Miller
	Johannes Berg says: ==================== Three fixes: * fix a logic bug introduced by a previous cleanup * fix nl80211 attribute confusion (trying to use a single attribute for two purposes) * fix a long-standing BSS leak that happens when an association attempt is abandoned ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	ibmveth: set correct gso_size and gso_type	Thomas Falcon
	This patch is based on an earlier one submitted by Jon Maxwell with the following commit message: "We recently encountered a bug where a few customers using ibmveth on the same LPAR hit an issue where a TCP session hung when large receive was enabled. Closer analysis revealed that the session was stuck because the one side was advertising a zero window repeatedly. We narrowed this down to the fact the ibmveth driver did not set gso_size which is translated by TCP into the MSS later up the stack. The MSS is used to calculate the TCP window size and as that was abnormally large, it was calculating a zero window, even although the sockets receive buffer was completely empty." We rely on the Virtual I/O Server partition in a pseries environment to provide the MSS through the TCP header checksum field. The stipulation is that users should not disable checksum offloading if rx packet aggregation is enabled through VIOS. Some firmware offerings provide the MSS in the RX buffer. This is signalled by a bit in the RX queue descriptor. Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com> Reviewed-by: David Dai <zdai@us.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	nfs_write_end(): fix handling of short copies	Al Viro
	What matters when deciding if we should make a page uptodate is not how much we _wanted_ to copy, but how much we actually have copied. As it is, on architectures that do not zero tail on short copy we can leave uninitialized data in page marked uptodate. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-09	Merge branch 'udp-receive-path-optimizations'	David S. Miller
	Eric Dumazet says: ==================== udp: receive path optimizations This patch series provides about 100 % performance increase under flood. v2: added Paolo feedback on udp_rmem_release() for tiny sk_rcvbuf added the last patch touching sk_rmem_alloc later ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	udp: udp_rmem_release() should touch sk_rmem_alloc later	Eric Dumazet
	In flood situations, keeping sk_rmem_alloc at a high value prevents producers from touching the socket. It makes sense to lower sk_rmem_alloc only at the end of udp_rmem_release() after the thread draining receive queue in udp_recvmsg() finished the writes to sk_forward_alloc. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	udp: add batching to udp_rmem_release()	Eric Dumazet
	If udp_recvmsg() constantly releases sk_rmem_alloc for every read packet, it gives producers the opportunity to immediately grab spinlocks and desperately try adding another packet, causing false sharing. We can add a simple heuristic to give the signal in batches of ~25 % of the queue capacity. This patch considerably increases performance under flood, by about 50 %, since the thread draining the queue is no longer slowed by false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
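The batching heuristic can be sketched as a consumer-side counter that is published only once it reaches a quarter of the queue capacity. This is a simplified model under stated assumptions, not the kernel function: the struct, field, and function names are hypothetical, and the early flush when the queue is empty is an added assumption so that a deficit is never stranded.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a socket receive queue's accounting. */
struct rx_queue_sketch {
    int rcvbuf;      /* queue capacity in bytes */
    int rmem_alloc;  /* bytes currently charged; producers read this */
    int deficit;     /* bytes consumed but not yet released */
};

/*
 * Accumulate `size` released bytes and publish them to rmem_alloc only
 * when the batch reaches ~25 % of capacity, or when the queue is empty.
 * Returns true if the shared counter was touched.
 */
static bool rmem_release_batched(struct rx_queue_sketch *q, int size,
                                 bool queue_empty)
{
    q->deficit += size;
    if (q->deficit < (q->rcvbuf >> 2) && !queue_empty)
        return false;           /* keep batching, no shared write */

    q->rmem_alloc -= q->deficit;
    q->deficit = 0;
    return true;
}
```

Because producers only see rmem_alloc drop in large steps, they are not woken for every single dequeued packet, which is where the reduction in false sharing comes from.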
2016-12-09	udp: copy skb->truesize in the first cache line	Eric Dumazet
	In the UDP RX handler, we currently clear skb->dev before the skb is added to the receive queue, because the device pointer is no longer available once we exit from the RCU section. Since this first cache line is always hot, let's reuse this space to store skb->truesize and thus avoid a cache line miss at udp_recvmsg()/udp_skb_destructor time while the receive queue spinlock is held. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	udp: add busylocks in RX path	Eric Dumazet
	The idea of busylocks is to let producers grab an extra spinlock to relieve pressure on the receive_queue spinlock shared with the consumer. This behavior is requested only once the socket receive queue is above half occupancy. Under flood, this means that only one producer can be in line trying to acquire the receive_queue spinlock. These busylocks can be allocated per CPU instead of per socket (which would consume a cache line per socket). This patch considerably improves UDP behavior under stress, depending on the number of NIC RX queues and/or RPS spread. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	Merge branch 'qcom-emac'	David S. Miller
	Timur Tabi says: ==================== net: qcom/emac: simplify support for different SOCs On SOCs that have the Qualcomm EMAC network controller, the internal PHY block is always different. Sometimes the differences are small, sometimes it might be a completely different IP. Either way, using version numbers to differentiate them and putting all of the init code in one file does not scale. This patchset does two things: The first breaks up the current code into different files, and the second patch adds support for a third SOC, the Qualcomm Technologies QDF2400 ARM Server SOC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	net: qcom/emac: add support for the Qualcomm Technologies QDF2400	Timur Tabi
	The QDF2432 and the QDF2400 have slightly different internal PHYs, so there are some programming differences. Some of the registers in the QDF2400 have moved, and some registers require different values during initialization. Because of the differences, and because HIDs are a scarce resource, the ACPI tables specify the hardware version in an _HRV property. Version 1 is the QDF2432, and version 2 is the QDF2400. Any future SOC that has the same internal PHY but different programming requirements will be assigned the next available version number. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
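The version-based selection described above amounts to a lookup keyed on the _HRV value. A minimal sketch follows, with hypothetical enum and function names; only the mapping of version 1 to the QDF2432 and version 2 to the QDF2400 comes from the commit message.

```c
/* Hypothetical identifiers for the internal PHY variants. */
enum emac_soc_sketch {
    EMAC_SOC_UNKNOWN = 0,
    EMAC_SOC_QDF2432,
    EMAC_SOC_QDF2400
};

/*
 * Map the hardware revision reported by the ACPI _HRV property to a
 * PHY variant. Unknown revisions are rejected rather than guessed, so
 * a future SOC needs an explicit new case.
 */
static enum emac_soc_sketch emac_soc_for_hrv(unsigned int hrv)
{
    switch (hrv) {
    case 1:
        return EMAC_SOC_QDF2432;
    case 2:
        return EMAC_SOC_QDF2400;
    default:
        return EMAC_SOC_UNKNOWN;
    }
}
```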
2016-12-09	net: qcom/emac: move phy init code to separate files	Timur Tabi
	The internal PHY of the EMAC differs on each SOC, and the list will only continue to grow. By separating the code into individual files, we can add support for more SOCs more cleanly. Note: The internal PHY is also sometimes called the SGMII device. We also stop referring to the various PHY variations by version number, so no more "v2", "v3", etc. Instead, the devices are named after the SOC they are in, which is in sync with the device tree property names. Future patches will probably rearrange more code among the files. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09	pNFS/flexfiles: Ensure we have enough buffer for layoutreturn	Trond Myklebust
	The flexfiles client can piggyback both layout errors and layoutstats as part of the layoutreturn. Both these payloads can get large, with 20 layout error entries taking up about 1.2K, and 4 layoutstats entries taking up another 1K. This patch allows a maximum payload of 4k by allocating a full page. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09	pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr()	Trond Myklebust
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09	vfs: refactor clone/dedupe_file_range common functions	Darrick J. Wong
	Hoist both the XFS reflink inode state and preparation code and the XFS file blocks compare functions into the VFS so that ocfs2 can take advantage of it for reflink and dedupe. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-09	fs: try to clone files first in vfs_copy_file_range	Christoph Hellwig
	A clone is a perfectly fine implementation of a file copy, so most file systems just implement the copy that way. Instead of duplicating this logic, move it to the VFS. Currently btrfs and XFS implement copies the same way as clones and there is no behavior change for them; cifs only implements clones and grows support for copy_file_range with this patch. NFS implements both, so this will allow copy_file_range to work on servers that only implement CLONE and be a lot more efficient on servers that implement CLONE and COPY. Signed-off-by: Christoph Hellwig <hch@lst.de>
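The "try clone first, then fall back to copy" ordering can be sketched with a pair of optional callbacks. This is an illustrative model rather than the VFS code: the struct, the callbacks, and the three demo backends are hypothetical, and conditions the real implementation may check before cloning are left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-filesystem operations; either may be NULL. */
struct file_ops_sketch {
    int  (*clone_range)(size_t len);  /* 0 on success, negative errno */
    long (*copy_range)(size_t len);   /* bytes copied, or negative errno */
};

/* Try the cheap clone first; fall back to a real copy if it fails. */
static long copy_file_range_sketch(const struct file_ops_sketch *ops,
                                   size_t len)
{
    if (ops->clone_range && ops->clone_range(len) == 0)
        return (long)len;             /* a clone covers the whole range */
    if (ops->copy_range)
        return ops->copy_range(len);
    return -EOPNOTSUPP;
}

/* Demo backends used to exercise the three paths. */
static int clone_ok(size_t len) { (void)len; return 0; }
static int clone_unsupported(size_t len) { (void)len; return -EOPNOTSUPP; }
static long copy_half(size_t len) { return (long)(len / 2); }
```

A filesystem that only provides `clone_range` (the cifs case in the message) gains copy support for free, while one that provides both gets the cheaper path whenever cloning succeeds.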
2016-12-09	remoteproc: qcom_adsp_pil: select qcom_scm	Arnd Bergmann
	The adsp-pil driver relies on SCM and causes a build error without it: ERROR: "qcom_scm_pas_supported" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! ERROR: "qcom_scm_is_available" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! ERROR: "qcom_scm_pas_auth_and_reset" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! ERROR: "qcom_scm_pas_shutdown" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! ERROR: "qcom_scm_pas_mem_setup" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! ERROR: "qcom_scm_pas_init_image" [drivers/remoteproc/qcom_adsp_pil.ko] undefined! This adds a 'select', as SCM is a silent Kconfig symbol that gets enabled implicitly by all its users. Fixes: b9e718e950c3 ("remoteproc: Introduce Qualcomm ADSP PIL") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-12-09	remoteproc: Drop wait in __rproc_boot()	Bjorn Andersson
	In the event that rproc_boot() is called before the firmware loaded completion has been flagged it will wait with the mutex held, obstructing the request_firmware_nowait() callback from completing the wait. As rproc_fw_config_virtio() has been reduced to only triggering auto-boot there is no longer a reason for waiting in rproc_boot(), so drop this. Cc: Sarangdhar Joshi <spjoshi@codeaurora.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-12-09	remoteproc/ste: Delete unused driver	Jean Delvare
	Back in July 2014 I asked around what was the intended target platform for the STE Modem remoteproc driver, so that I could add the proper hardware dependency to its config option. The answer I got was that there was no known publicly available hardware needing it and it was unlikely that there ever would be. So I think it's time to delete this driver to lower the maintenance burden. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Ohad Ben-Cohen <ohad@wizery.com> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Suman Anna <s-anna@ti.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Loic Pallardy <loic.pallardy@st.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-12-09	remoteproc: Remove "experimental" warning	Bjorn Andersson
	Warning users that remoteproc and its binary format are under development doesn't serve much of a purpose. Different drivers support different image formats and the resource table has a version field that would need to be bumped when incompatible changes are introduced. So let's drop this warning to clean up the kernel log. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2016-12-09	drm/vc4: Don't use drm_put_dev	Daniel Vetter
	vc4 already has a proper load sequence, but the unload one needed some fixups: First unregister, and last drop the final ref. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-09	drm/vc4: Document VEC DT binding	Boris Brezillon
	Document the DT binding for the VEC (Video EnCoder) IP. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Rob Herring <robh@kernel.org>
2016-12-09	drm/vc4: Add support for the VEC (Video Encoder) IP	Boris Brezillon
	The VEC IP is a TV DAC, providing support for PAL and NTSC standards. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-09	drm: Add TV connector states to drm_connector_state	Boris Brezillon
	Some generic TV connector properties are exposed in drm_mode_config, but they are currently handled independently in each DRM encoder driver. Extend the drm_connector_state to store TV related states, and modify the drm_atomic_connector_{set,get}_property() helpers to fill the connector state accordingly. Each driver is then responsible for checking and applying the new config in its ->atomic_mode_{check,set}() operations. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-09	drm: Turn DRM_MODE_SUBCONNECTOR_xx definitions into an enum	Boris Brezillon
	Lists of values like the DRM_MODE_SUBCONNECTOR_xx ones are better represented with enums. Turn the DRM_MODE_SUBCONNECTOR_xx macros into an enum. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-09	drm/vc4: Fix ->clock_select setting for the VEC encoder	Boris Brezillon
	PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and rework the vc4_set_crtc_possible_masks() to cover the full range of the PV_CONTROL_CLK_SELECT field. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>
2016-12-10	x86/ldt: Make all size computations unsigned	Thomas Gleixner
	ldt->size can never be negative. The helper functions take 'unsigned int' arguments which are assigned from ldt->size. The related user space user_desc struct member entry_number is unsigned as well. But ldt->size itself and a few local variables which are related to ldt->size are of type 'int', which makes no sense whatsoever and results in typecasts which make the eyes bleed. Clean it up and convert everything which is related to ldt->size to unsigned int. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dan Carpenter <dan.carpenter@oracle.com>
2016-12-10	x86/ldt: Make a size argument unsigned	Dan Carpenter
	My static checker complains that we put an upper bound on the "size" argument but not a lower bound. The checker is not smart enough to know the possible ranges of "old_mm->context.ldt->size" from init_new_context_ldt() so it thinks maybe it could be negative. Let's make it unsigned to silence the warning and future proof the code a bit. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: kernel-janitors@vger.kernel.org Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20161208105602.GA11382@elgon.mountain Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
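The checker's concern can be reproduced in isolation: an upper-bound test on a signed value lets negative sizes through, while the same test on an unsigned value rejects them because they wrap to very large numbers. The sketch below uses a hypothetical limit and hypothetical function names and is not the x86 LDT code.

```c
#include <stdbool.h>

/* Hypothetical upper bound on the number of entries. */
#define MAX_ENTRIES_SKETCH 8192

/* Signed variant: only the upper bound is checked, so -1 passes. */
static bool size_ok_signed(int size)
{
    return size <= MAX_ENTRIES_SKETCH;
}

/*
 * Unsigned variant: a negative value converted to unsigned becomes a
 * huge number, so the single upper-bound check also covers it.
 */
static bool size_ok_unsigned(unsigned int size)
{
    return size <= MAX_ENTRIES_SKETCH;
}
```

Here `size_ok_signed(-1)` is true, whereas `size_ok_unsigned((unsigned int)-1)` is false, which is why making the argument unsigned removes the need for a separate lower-bound check.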
2016-12-09	dt: pwm: bcm2835: fix typo in clocks property name	Vladimir Zapolskiy
	According to the examples of BCM2835 PWM device nodes there is a typo in 'clocks' property name, which is a common property name on clock consumer side to store a phandle to an input clock. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-12-09	devicetree: add vendor prefix for National Instruments	Nathan Sullivan
	Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Signed-off-by: Rob Herring <robh@kernel.org>
2016-12-09	x86: Remove empty idle.h header	Thomas Gleixner
	One include less is always a good thing(tm). Good riddance. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09	x86/amd: Simplify AMD E400 aware idle routine	Borislav Petkov
	Reorganize the E400 detection now that we have everything in place: switch the CPUs to broadcast mode after the LAPIC has been initialized and remove the facilities that were used previously on the idle path. Unfortunately static_cpu_has_bug() cannot be used in the E400 idle routine because alternatives have been applied when the actual detection happens, so the static switching does not take effect and the test will stay false. Use boot_cpu_has_bug() instead which is definitely an improvement over the RDMSR and the cpumask handling. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-5-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09	x86/amd: Check for the C1E bug post ACPI subsystem init	Thomas Gleixner
	AMD CPUs affected by the E400 erratum suffer from the issue that the local APIC timer stops when the CPU goes into C1E. Unfortunately there is no way to detect the affected CPUs on early boot. It's only possible to determine the range of possibly affected CPUs from the family/model range. The actual decision whether to enter C1E and thus cause the bug is made by the firmware and we need to detect that case late, after ACPI has been initialized. The current solution is to check in the idle routine whether the CPU is affected by reading the MSR_K8_INT_PENDING_MSG MSR and checking for the K8_INTP_C1E_ACTIVE_MASK bits. If one of the bits is set then the CPU is affected and the system is switched into forced broadcast mode. This is inefficient: on non-affected CPUs every entry to idle does the extra RDMSR. After doing some research it turns out that the bits are visible on the boot CPU right after the ACPI subsystem is initialized in the early boot process. So instead of polling for the bits in the idle loop, add a detection function after acpi_subsystem_init() and check for the MSR bits. If set, then the X86_BUG_AMD_APIC_C1E is set on the boot CPU and the TSC is marked unstable when X86_FEATURE_NONSTOP_TSC is not set as it will stop in C1E state as well. The switch to broadcast mode cannot be done at this point because the boot CPU still uses HPET as a clockevent device and the local APIC timer is not yet calibrated and installed. The switch to broadcast mode on the affected CPUs needs to be done when the local APIC timer is actually set up. This allows cleaning up the amd_e400_idle() function in the next step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-4-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
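The one-time detection described above reduces to a mask test on a single MSR value, followed by two decisions. The sketch below models it as a pure function so that it can be run outside the kernel; the struct and function names are hypothetical, the MSR value is passed in instead of being read with RDMSR, and the mask value 0x18000000 is an assumption about K8_INTP_C1E_ACTIVE_MASK that should be checked against the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of K8_INTP_C1E_ACTIVE_MASK; verify against msr-index.h. */
#define C1E_ACTIVE_MASK_SKETCH 0x18000000ull

/* Outcome of the late, one-time check. */
struct c1e_result_sketch {
    bool apic_c1e_bug;  /* would set X86_BUG_AMD_APIC_C1E */
    bool tsc_unstable;  /* would mark the TSC unstable */
};

/*
 * int_pending_msg: value of MSR_K8_INT_PENDING_MSG on the boot CPU.
 * nonstop_tsc:     whether X86_FEATURE_NONSTOP_TSC is set.
 */
static struct c1e_result_sketch detect_c1e_sketch(uint64_t int_pending_msg,
                                                  bool nonstop_tsc)
{
    struct c1e_result_sketch r = { false, false };

    if (int_pending_msg & C1E_ACTIVE_MASK_SKETCH) {
        r.apic_c1e_bug = true;
        /* Without a nonstop TSC, the TSC stops in C1E as well. */
        r.tsc_unstable = !nonstop_tsc;
    }
    return r;
}
```

Running the check once after ACPI initialization replaces the RDMSR that the idle routine previously performed on every idle entry.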
2016-12-09	x86/bugs: Separate AMD E400 erratum and C1E bug	Thomas Gleixner
	The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E state) is a two step process: - Selection of the E400 aware idle routine - Detection whether the platform is affected The idle routine selection happens for possibly affected CPUs depending on family/model/stepping information. This range of CPUs is not necessarily affected as the decision whether to enable the C1E feature is made by the firmware. Unfortunately there is no way to query this at early boot. The current implementation polls an MSR in the E400 aware idle routine to detect whether the CPU is affected. This is inefficient on non-affected CPUs because every idle entry has to do the MSR read. There is a better way to detect this before going idle for the first time which requires separating the bug flags: X86_BUG_AMD_E400 - Selects the E400 aware idle routine and enables the detection X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400 Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400 bug bit to select the idle routine which currently does an unconditional detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches to remove the MSR polling and simplify the handling of this misfeature. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09	x86/cpufeature: Provide helper to set bugs bits	Borislav Petkov
	Will be used in a later patch to set bug bits for bugs which need late detection. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20161209182912.2726-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-09	blk-stat: fix a few cases of missing batch flushing	Jens Axboe
	Every time we need to read ->nr_samples, we should have flushed the batch first. The non-mq read path also needs to flush the batch. Signed-off-by: Jens Axboe <axboe@fb.com>
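The rule "flush the batch before reading ->nr_samples" can be illustrated with a small accumulator. This is a simplified sketch under assumed semantics (samples are gathered in a batch and folded into a running mean on flush), with hypothetical struct and function names; it is not the blk-stat implementation.

```c
#include <stdint.h>

/* Hypothetical statistics bucket with a pending, unflushed batch. */
struct stat_sketch {
    uint64_t mean;        /* mean over all flushed samples */
    uint64_t nr_samples;  /* number of flushed samples */
    uint64_t batch;       /* sum of samples not yet flushed */
    uint64_t nr_batch;    /* number of samples not yet flushed */
};

/* Record a sample in the pending batch only. */
static void stat_add(struct stat_sketch *s, uint64_t value)
{
    s->batch += value;
    s->nr_batch++;
}

/* Fold the pending batch into the flushed mean and sample count. */
static void stat_flush(struct stat_sketch *s)
{
    if (!s->nr_batch)
        return;
    s->mean = (s->mean * s->nr_samples + s->batch) /
              (s->nr_samples + s->nr_batch);
    s->nr_samples += s->nr_batch;
    s->batch = 0;
    s->nr_batch = 0;
}

/* Reader that always flushes first, so the count is never stale. */
static uint64_t stat_nr_samples(struct stat_sketch *s)
{
    stat_flush(s);
    return s->nr_samples;
}
```

Reading `nr_samples` directly after two `stat_add()` calls would still show 0; going through `stat_nr_samples()` shows 2, which is the class of stale read the fix addresses.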
2016-12-09	Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm	Linus Torvalds
	Pull libnvdimm fixes from Dan Williams: "Several fixes to the DSM (ACPI device specific method) marshaling implementation. I consider these urgent enough to send for 4.9 consideration since they fix the kernel's handling of ARS (Address Range Scrub) commands. Especially for platforms without machine-check-recovery capabilities, successful execution of ARS commands enables the platform to potentially break out of an infinite reboot problem if a media error is present in the boot path. There is also a one line fix for a device-dax read-only mapping regression. Commits 9a901f5495e2 ("acpi, nfit: fix extended status translations for ACPI DSMs") and 325896ffdf90 ("device-dax: fix private mapping restriction, permit read-only") are true regression fixes for changes introduced this cycle. Commit efda1b5d87cb ("acpi, nfit, libnvdimm: fix / harden ars_status output length handling") fixes the kernel's handling of zero-length results, this never would have worked in the past, but we only just recently discovered a BIOS implementation that emits this arguably spec non-compliant result. The remaining two commits are additional fall out from thinking through the implications of a zero / truncated length result of the ARS Status command. In order to mitigate the risk that these changes introduce yet more regressions they are backstopped by a new unit test in commit a7de92dac9f0 ("tools/testing/nvdimm: unit test acpi_nfit_ctl()") that mocks up inputs to acpi_nfit_ctl()" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: fix private mapping restriction, permit read-only tools/testing/nvdimm: unit test acpi_nfit_ctl() acpi, nfit: fix bus vs dimm confusion in xlat_status acpi, nfit: validate ars_status output buffer size acpi, nfit, libnvdimm: fix / harden ars_status output length handling acpi, nfit: fix extended status translations for ACPI DSMs
2016-12-09	Merge branch 'for-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata	Linus Torvalds
	Pull libata fixes from Tejun Heo: "This is quite late but SCT Write Same support added during this cycle is broken subtly but seriously and it'd be best to disable it before v4.9 gets released. This contains two commits - one low impact sata_mv fix and the mentioned disabling of SCT Write Same" * 'for-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata-scsi: disable SCT Write Same for the moment ata: sata_mv: check for errors when parsing nr-ports from dt
2016-12-09	Merge tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client	Linus Torvalds
	Pull ceph fix from Ilya Dryomov: "A fix for an issue with ->d_revalidate() in ceph, causing frequent kernel crashes. Marked for stable - it goes back to 4.6, but started popping up only in 4.8" * tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client: ceph: don't set req->r_locked_dir in ceph_d_revalidate
2016-12-09	Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc	Linus Torvalds
	Pull ARM SoC fixes from Olof Johansson: "Final batch of SoC fixes A few fixes that have trickled in over the last week, all fixing minor errors in devicetrees -- UART pin assignment on Allwinner H3, correcting number of SATA ports on a Marvell-based Linkstation platform and a display clock fix for Freescale/NXP i.MX7D that fixes a freeze when starting up X" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: dts: orion5x: fix number of sata port for linkstation ls-gl ARM: dts: imx7d: fix LCDIF clock assignment dts: sun8i-h3: correct UART3 pin definitions