summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-22ixgbevf: fix NACK check in ixgbevf_set_uc_addr_vf()Emil Tantilov
Fix the NACK check in ixgbevf_set_uc_addr_vf() for instances where index != 0. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i2c: bcm2835: Don't complain on -EPROBE_DEFER from getting our clockEric Anholt
Fixes dmesg spam when we just need to wait a moment for the clock driver to probe. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-07-22i40e: Explicitly write platform-specific mac address after PF resetTushar Dave
i40e PF reset clears mac filters. If platform-specific mac address is used, driver has to explicitly write default mac address to mac filters otherwise all incoming traffic destined to default mac address will be dropped after reset. This issue was found on SPARC while toggling i40e ntuple via ethtool. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: add missing link advertise settingStefan Assmann
Adding the missing link advertise for some X710 NICs. This can be observed by simply calling ethtool on the interface. root@rhel7:~ # ethtool eth0 Settings for eth0: Supported ports: [ FIBRE ] Supported link modes: 10000baseT/Full Supported pause frame use: Symmetric Supports auto-negotiation: No Advertised link modes: Not reported [...] With fix applied. root@rhel7:~ # ethtool eth0 Settings for eth0: Supported ports: [ FIBRE ] Supported link modes: 10000baseT/Full Supported pause frame use: Symmetric Supports auto-negotiation: No Advertised link modes: 10000baseT/Full [...] Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: Move the mutex lock in i40e_client_unregisterCatherine Sullivan
We need to lock the client list around the i40e_client_release call to prevent the release from interrupting the client instances while they are being added. Change-Id: I99993f20179aaf8730207833e7d0869d2ccffa1d Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: Remove redundant memsetAmitoj Kaur Chawla
Remove redundant call to memset before a call to memcpy. The Coccinelle semantic patch used to make this change is as follows: @@ expression e1,e2,e3,e4; @@ - memset(e1,e2,e3); memcpy(e1,e4,e3); Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e/i40evf-bump version to 1.6.11Bimmy Pujari
Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: refactor Rx filter handlingMitch Williams
Properly track filter adds and deletes so the driver doesn't lose filters during resets and up/down cycles. Add a tracking mechanism so that the driver knows when to enter and leave promiscuous mode. Implement a simple state machine so the driver can track the status of each filter throughout its lifecycle. Properly manage the overflow promiscuous state for the each VSI, and provide a way for the driver to detect when to exit overflow promiscuous mode. Remove all possible default MAC filters that the firmware may have set up so that the driver can manage these correctly, particularly when VLANs come into play. Remove the LAA flag for filters; instead just send whatever we get through set_mac to the firmware as the LAA for wakeup purposes. Finally, add the state of each filter to debugfs output so we can see what's going on inside the driver's pointy little head. Change-ID: I97c5e366fac2254fa01eaff4f65c0af61dcf2e1f Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40evf: add hyperv dev idsJoshua Hay
This patch adds the Hyper-V specific VF device ids. Change-ID: I9c4fe6d8dfd34f7f68ebc9fdae225c8768439c89 Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: Remove device ID 0x37D4Catherine Sullivan
This device ID is not needed, so take it out. Change-ID: I148d29f68a1f58b03980ecd83047a1b440f4f74d Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e/i40evf: remove useless initializerMitch Williams
This initializer isn't needed because the variable is assigned right away. Change-ID: I6ce3edb3f4e0364db248a7a0bcc62ca95c01d941 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i40e: Fix to show correct Advertised Link Modes when link is downAvinash Dayanand
When link is down, Advertised Link Modes was wrongly displaying full supported link modes instead of Advertised link mode. Added conditional checks in order to make sure correct Advertised link modes are displayed when the link is down. Change-ID: I8a61413f9ee174149c7a33157b5f0b0a8da9842d Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22i2c: i2c-smbus: drop useless stubsJean Delvare
Drivers which use the SMBus extensions select I2C_SMBUS, so the stubs are not needed. Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-07-22i40e: avoid null pointer dereferenceHeinrich Schuchardt
In function i40e_debug_aq parameter desc is assumed to be possibly NULL. Do not dereference it before checking the value. Fixes: f905dd62be88 ("i40e/i40evf: add max buf len to aq debug print helper") Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-07-22packet: propagate sock_cmsg_send() errorSoheil Hassas Yeganeh
sock_cmsg_send() can return different error codes and not only -EINVAL, and we should properly propagate them. Fixes: c14ac9451c34 ("sock: enable timestamping using control messages") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-22Merge branch 'macsec-gro'David S. Miller
Paolo Abeni says: ==================== macsec: enable s/w offloads This patches leverage gro_cells infrastructure to enable both GRO and RPS on macsec devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-22macsec: enable GRO and RPS on macsec devicesPaolo Abeni
Use gro_gells to trigger GRO and allow RPS on macsec traffic after decryption. Also, be sure to avoid clearing software offload features in macsec_fix_features(). Overall this increase TCP tput by 30% on recent h/w. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-22gro_cells: gro_cells_receive now return error codePaolo Abeni
so that the caller can update stats accordingly, if needed Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-22Merge branch 'xfs-4.8-misc-fixes-4' into for-nextDave Chinner
2016-07-22xfs: remove EXPERIMENTAL tag from sparse inode featureDave Chinner
Been around for long enough now, hasn't caused any regression test failures in the past 3 months, so it's time to make it a fully supported feature. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-22xfs: bufferhead chains are invalid after end_page_writebackDave Chinner
In xfs_finish_page_writeback(), we have a loop that looks like this: do { if (off < bvec->bv_offset) goto next_bh; if (off > end) break; bh->b_end_io(bh, !error); next_bh: off += bh->b_size; } while ((bh = bh->b_this_page) != head); The b_end_io function is end_buffer_async_write(), which will call end_page_writeback() once all the buffers have marked as no longer under IO. This issue here is that the only thing currently protecting both the bufferhead chain and the page from being reclaimed is the PageWriteback state held on the page. While we attempt to limit the loop to just the buffers covered by the IO, we still read from the buffer size and follow the next pointer in the bufferhead chain. There is no guarantee that either of these are valid after the PageWriteback flag has been cleared. Hence, loops like this are completely unsafe, and result in use-after-free issues. One such problem was caught by Calvin Owens with KASAN: ..... INFO: Freed in 0x103fc80ec age=18446651500051355200 cpu=2165122683 pid=-1 free_buffer_head+0x41/0x90 __slab_free+0x1ed/0x340 kmem_cache_free+0x270/0x300 free_buffer_head+0x41/0x90 try_to_free_buffers+0x171/0x240 xfs_vm_releasepage+0xcb/0x3b0 try_to_release_page+0x106/0x190 shrink_page_list+0x118e/0x1a10 shrink_inactive_list+0x42c/0xdf0 shrink_zone_memcg+0xa09/0xfa0 shrink_zone+0x2c3/0xbc0 ..... Call Trace: <IRQ> [<ffffffff81e8b8e4>] dump_stack+0x68/0x94 [<ffffffff8153a995>] print_trailer+0x115/0x1a0 [<ffffffff81541174>] object_err+0x34/0x40 [<ffffffff815436e7>] kasan_report_error+0x217/0x530 [<ffffffff81543b33>] __asan_report_load8_noabort+0x43/0x50 [<ffffffff819d651f>] xfs_destroy_ioend+0x3bf/0x4c0 [<ffffffff819d69d4>] xfs_end_bio+0x154/0x220 [<ffffffff81de0c58>] bio_endio+0x158/0x1b0 [<ffffffff81dff61b>] blk_update_request+0x18b/0xb80 [<ffffffff821baf57>] scsi_end_request+0x97/0x5a0 [<ffffffff821c5558>] scsi_io_completion+0x438/0x1690 [<ffffffff821a8d95>] scsi_finish_command+0x375/0x4e0 [<ffffffff821c3940>] scsi_softirq_done+0x280/0x340 Where the access is occuring during IO completion after the buffer had been freed from direct memory reclaim. Prevent use-after-free accidents in this end_io processing loop by pre-calculating the loop conditionals before calling bh->b_end_io(). The loop is already limited to just the bufferheads covered by the IO in progress, so the offset checks are sufficient to prevent accessing buffers in the chain after end_page_writeback() has been called by the the bh->b_end_io() callout. Yet another example of why Bufferheads Must Die. cc: <stable@vger.kernel.org> # 4.7 Signed-off-by: Dave Chinner <dchinner@redhat.com> Reported-and-Tested-by: Calvin Owens <calvinowens@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-22xfs: allocate log vector buffers outside CIL context lockDave Chinner
One of the problems we currently have with delayed logging is that under serious memory pressure we can deadlock memory reclaim. THis occurs when memory reclaim (such as run by kswapd) is reclaiming XFS inodes and issues a log force to unpin inodes that are dirty in the CIL. The CIL is pushed, but this will only occur once it gets the CIL context lock to ensure that all committing transactions are complete and no new transactions start being committed to the CIL while the push switches to a new context. The deadlock occurs when the CIL context lock is held by a committing process that is doing memory allocation for log vector buffers, and that allocation is then blocked on memory reclaim making progress. Memory reclaim, however, is blocked waiting for a log force to make progress, and so we effectively deadlock at this point. To solve this problem, we have to move the CIL log vector buffer allocation outside of the context lock so that memory reclaim can always make progress when it needs to force the log. The problem with doing this is that a CIL push can take place while we are determining if we need to allocate a new log vector buffer for an item and hence the current log vector may go away without warning. That means we canot rely on the existing log vector being present when we finally grab the context lock and so we must have a replacement buffer ready to go at all times. To ensure this, introduce a "shadow log vector" buffer that is always guaranteed to be present when we gain the CIL context lock and format the item. This shadow buffer may or may not be used during the formatting, but if the log item does not have an existing log vector buffer or that buffer is too small for the new modifications, we swap it for the new shadow buffer and format the modifications into that new log vector buffer. The result of this is that for any object we modify more than once in a given CIL checkpoint, we double the memory required to track dirty regions in the log. For single modifications then we consume the shadow log vectorwe allocate on commit, and that gets consumed by the checkpoint. However, if we make multiple modifications, then the second transaction commit will allocate a shadow log vector and hence we will end up with double the memory usage as only one of the log vectors is consumed by the CIL checkpoint. The remaining shadow vector will be freed when th elog item is freed. This can probably be optimised in future - access to the shadow log vector is serialised by the object lock (as opposited to the active log vector, which is controlled by the CIL context lock) and so we can probably free shadow log vector from some objects when the log item is marked clean on removal from the AIL. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-22libxfs: directory node splitting does not have an extra blockDave Chinner
xfsprogs source commit 4280e59dcbc4cd8e01585efe788a68eb378048e8 xfs_da3_split() has to handle all three versions of the directory/attribute btree structure. The attr tree is v1, the dir tre is v2 or v3. The main difference between the v1 and v2/3 trees is the way tree nodes are split - in the v1 tree we can require a double split to occur because the object to be inserted may be larger than the space made by splitting a leaf. In this case we need to do a double split - one to split the full leaf, then another to allocate an empty leaf block in the correct location for the new entry. This does not happen with dir (v2/v3) formats as the objects being inserted are always guaranteed to fit into the new space in the split blocks. Indeed, for directories they *may* be an extra block on this buffer pointer. However, it's guaranteed not to be a leaf block (i.e. a directory data block) - the directory code only ever places hash index or free space blocks in this pointer (as a cursor of sorts), and so to use it as a directory data block will immediately corrupt the directory. The problem is that the code assumes that there may be extra blocks that we need to link into the tree once we've split the root, but this is not true for either dir or attr trees, because the extra attr block is always consumed by the last node split before we split the root. Hence the linking in an extra block is always wrong at the root split level, and this manifests itself in repair as a directory corruption in a repaired directory, leaving the directory rebuild incomplete. This is a dir v2 zero-day bug - it was in the initial dir v2 commit that was made back in February 1998. Fix this by ensuring the linking of the blocks after the root split never tries to make use of the extra blocks that may be held in the cursor. They are held there for other purposes and should never be touched by the root splitting code. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-22xfs: remove dax code from object file when disabledArnd Bergmann
We check IS_DAX(inode) before calling either xfs_file_dax_read or xfs_file_dax_write, and this will lead the call being optimized out at compile time when CONFIG_FS_DAX is disabled. However, the two functions are marked STATIC, so they become global symbols when CONFIG_XFS_DEBUG is set, leaving us with two unused global functions that call into an undefined function and a broken "allmodconfig" build: fs/built-in.o: In function `xfs_file_dax_read': fs/xfs/xfs_file.c:348: undefined reference to `dax_do_io' fs/built-in.o: In function `xfs_file_dax_write': fs/xfs/xfs_file.c:758: undefined reference to `dax_do_io' Marking the two functions 'static noinline' instead of 'STATIC' will let the compiler drop the symbols when there are no callers but avoid the implicit inlining. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 16d4d43595b4 ("xfs: split direct I/O and DAX path") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-22xfs: skip dirty pages in ->releasepage()Brian Foster
XFS has had scattered reports of delalloc blocks present at ->releasepage() time. This results in a warning with a stack trace similar to the following: ... Call Trace: [<ffffffffa23c5b8f>] dump_stack+0x63/0x84 [<ffffffffa20837a7>] warn_slowpath_common+0x97/0xe0 [<ffffffffa208380a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa2326caf>] xfs_vm_releasepage+0x10f/0x140 [<ffffffffa218c680>] ? page_mkclean_one+0xd0/0xd0 [<ffffffffa218d3a0>] ? anon_vma_prepare+0x150/0x150 [<ffffffffa21521c2>] try_to_release_page+0x32/0x50 [<ffffffffa2166b2e>] shrink_active_list+0x3ce/0x3e0 [<ffffffffa21671c7>] shrink_lruvec+0x687/0x7d0 [<ffffffffa21673ec>] shrink_zone+0xdc/0x2c0 [<ffffffffa2168539>] kswapd+0x4f9/0x970 [<ffffffffa2168040>] ? mem_cgroup_shrink_node_zone+0x1a0/0x1a0 [<ffffffffa20a0d99>] kthread+0xc9/0xe0 [<ffffffffa20a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffffa26b404f>] ret_from_fork+0x3f/0x70 [<ffffffffa20a0cd0>] ? kthread_stop+0x100/0x100 This occurs because it is possible for shrink_active_list() to send pages marked dirty to ->releasepage() when certain buffer_head threshold conditions are met. shrink_active_list() doesn't check the page dirty state apparently to handle an old ext3 corner case where in some cases clean pages would not have the dirty bit cleared, thus it is up to the filesystem to determine how to handle the page. XFS currently handles the delalloc case properly, but this behavior makes the warning spurious. Update the XFS ->releasepage() handler to explicitly skip dirty pages. Retain the existing delalloc/unwritten checks so we continue to warn if such buffers exist on clean pages when they shouldn't. Diagnosed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-21Documentation: dtb: xgene: Add hwmon dts binding documentationhotran
This patch adds the APM X-Gene hwmon device tree node documentation. Signed-off-by: Hoan Tran <hotran@apm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-07-21cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()Viresh Kumar
The handlers provided by cpufreq core are sufficient for resolving the frequency for drivers providing ->target_index(), as the core already has the frequency table and so ->resolve_freq() isn't required for such platforms. This patch disallows drivers with ->target_index() callback to use the ->resolve_freq() callback. Also, it fixes a potential kernel crash for drivers providing ->target() but no ->resolve_freq(). Fixes: e3c062360870 "cpufreq: add cpufreq_driver_resolve_freq()" Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21ACPI: enable ACPI_PROCESSOR_IDLE on ARM64Sudeep Holla
Now that ACPI processor idle driver supports LPI(Low Power Idle), lets enable ACPI_PROCESSOR_IDLE for ARM64 too. This patch just removes the IA64 and X86 dependency on ACPI_PROCESSOR_IDLE Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21arm64: add support for ACPI Low Power Idle(LPI)Sudeep Holla
This patch adds appropriate callbacks to support ACPI Low Power Idle (LPI) on ARM64. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21drivers: firmware: psci: initialise idle states using ACPI LPISudeep Holla
This patch adds support for initialisation of PSCI CPUIdle states from Low Power Idle(_LPI) entries in the ACPI tables when acpi is enabled. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21cpuidle: introduce CPU_PM_CPU_IDLE_ENTER macro for ARM{32, 64}Sudeep Holla
The function arm_enter_idle_state is exactly the same in both generic ARM{32,64} CPUIdle driver and will be the same even on ARM64 backend for ACPI processor idle driver. So we can unify it and move it to a common place by introducing CPU_PM_CPU_IDLE_ENTER macro that can be used in all places avoiding duplication. This is in preparation of reuse of the generic cpuidle entry function for ACPI LPI support on ARM64. Suggested-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21arm64: cpuidle: drop __init section marker to arm_cpuidle_initSudeep Holla
Commit ea389daa7fd9 (arm64: cpuidle: add __init section marker to arm_cpuidle_init) added the __init annotation to arm_cpuidle_init as it was not needed after booting which was correct at that time. However with the introduction of ACPI LPI support, this will be used from cpuhotplug path in ACPI processor driver. This patch drops the __init annotation from arm_cpuidle_init to avoid the following warning: WARNING: vmlinux.o(.text+0x113c8): Section mismatch in reference from the function acpi_processor_ffh_lpi_probe() to the function .init.text:arm_cpuidle_init() The function acpi_processor_ffh_lpi_probe() references the function __init arm_cpuidle_init(). This is often because acpi_processor_ffh_lpi_probe() lacks a __init annotation or the annotation of arm_cpuidle_init is wrong. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21ACPI / processor_idle: Add support for Low Power Idle(LPI) statesSudeep Holla
ACPI 6.0 introduced an optional object _LPI that provides an alternate method to describe Low Power Idle states. It defines the local power states for each node in a hierarchical processor topology. The OSPM can use _LPI object to select a local power state for each level of processor hierarchy in the system. They used to produce a composite power state request that is presented to the platform by the OSPM. Since multiple processors affect the idle state for any non-leaf hierarchy node, coordination of idle state requests between the processors is required. ACPI supports two different coordination schemes: Platform coordinated and OS initiated. This patch adds initial support for Platform coordination scheme of LPI. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21ACPI / processor_idle: introduce ACPI_PROCESSOR_CSTATESudeep Holla
ACPI 6.0 adds a new method to specify the CPU idle states(C-states) called Low Power Idle(LPI) states. Since new architectures like ARM64 use only LPIs, introduce ACPI_PROCESSOR_CSTATE to encapsulate all the code supporting the old style C-states(_CST). This patch will help to extend the processor_idle module to support LPI. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21PCI / PM: check all fields in pci_set_platform_pm()Andy Shevchenko
When assign new PCI platform PM operations check for all mandatory fields to prevent NULL pointer dereference. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21cpufreq: acpi-cpufreq: use cached frequency mapping when possibleSteve Muckle
A call to cpufreq_driver_resolve_freq will cache the mapping from the desired target frequency to the frequency table index. If there is a mapping for the desired target frequency then use it instead of looking up the mapping again. Signed-off-by: Steve Muckle <smuckle@linaro.org> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21cpufreq: schedutil: map raw required frequency to driver frequencySteve Muckle
The slow-path frequency transition path is relatively expensive as it requires waking up a thread to do work. Should support be added for remote CPU cpufreq updates that is also expensive since it requires an IPI. These activities should be avoided if they are not necessary. To that end, calculate the actual driver-supported frequency required by the new utilization value in schedutil by using the recently added cpufreq_driver_resolve_freq API. If it is the same as the previously requested driver frequency then there is no need to continue with the update assuming the cpu frequency limits have not changed. This will have additional benefits should the semantics of the rate limit be changed to apply solely to frequency transitions rather than to frequency calculations in schedutil. The last raw required frequency is cached. This allows the driver frequency lookup to be skipped in the event that the new raw required frequency matches the last one, assuming a frequency update has not been forced due to limits changing (indicated by a next_freq value of UINT_MAX, see sugov_should_update_freq). Signed-off-by: Steve Muckle <smuckle@linaro.org> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21GFS2: Fix gfs2_replay_incr_blk for multiple journal sizesBob Peterson
Before this patch, if you used gfs2_jadd to add new journals of a size smaller than the existing journals, replaying those new journals would withdraw. That's because function gfs2_replay_incr_blk was using the number of journal blocks (jd_block) from the superblock's journal pointer. In other words, "My journal's max size" rather than "the journal we're replaying's size." This patch changes the function to use the size of the pertinent journal rather than always using the journal we happen to be using. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2016-07-21Merge branch 'for-next/kprobes' into for-next/coreCatalin Marinas
* kprobes: arm64: kprobes: Add KASAN instrumentation around stack accesses arm64: kprobes: Cleanup jprobe_return arm64: kprobes: Fix overflow when saving stack arm64: kprobes: WARN if attempting to step with PSTATE.D=1 kprobes: Add arm64 case in kprobe example module arm64: Add kernel return probes support (kretprobes) arm64: Add trampoline code for kretprobes arm64: kprobes instruction simulation support arm64: Treat all entry code as non-kprobe-able arm64: Blacklist non-kprobe-able symbol arm64: Kprobes with single stepping support arm64: add conditional instruction simulation support arm64: Add more test functions to insn.c arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
2016-07-21x86/fpu: Do not BUG_ON() in early FPU codeDave Hansen
I don't think it is really possible to have a system where CPUID enumerates support for XSAVE but that it does not have FP/SSE (they are "legacy" features and always present). But, I did manage to hit this case in qemu when I enabled its somewhat shaky XSAVE support. The bummer is that the FPU is set up before we parse the command-line or have *any* console support including earlyprintk. That turned what should have been an easy thing to debug in to a bit more of an odyssey. So a BUG() here is worthless. All it does it guarantee that if/when we hit this case we have an empty console. So, remove the BUG() and try to limp along by disabling XSAVE and trying to continue. Add a comment on why we are doing this, and also add a common "out_disable" path for leaving fpu__init_system_xstate(). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21arm64: Honor nosmp kernel command line optionSuzuki K Poulose
Passing "nosmp" should boot the kernel with a single processor, without provision to enable secondary CPUs even if they are present. "nosmp" is implemented by setting maxcpus=0. At the moment we still mark the secondary CPUs present even with nosmp, which allows the userspace to bring them up. This patch corrects the smp_prepare_cpus() to honor the maxcpus == 0. Commit 44dbcc93ab67145 ("arm64: Fix behavior of maxcpus=N") fixed the behavior for maxcpus >= 1, but broke maxcpus = 0. Fixes: 44dbcc93ab67 ("arm64: Fix behavior of maxcpus=N") Cc: <stable@vger.kernel.org> # 4.7+ Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> [catalin.marinas@arm.com: updated code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-21arm64: Fix incorrect per-cpu usage for boot CPUSuzuki K Poulose
In smp_prepare_boot_cpu(), we invoke cpuinfo_store_boot_cpu to store the cpuinfo in a per-cpu ptr, before initialising the per-cpu offset for the boot CPU. This patch reorders the sequence to make sure we initialise the per-cpu offset before accessing the per-cpu area. Commit 4b998ff1885eec ("arm64: Delay cpuinfo_store_boot_cpu") fixed the issue where we modified the per-cpu area even before the kernel initialises the per-cpu areas, but failed to wait until the boot cpu updated it's offset. Fixes: 4b998ff1885e ("arm64: Delay cpuinfo_store_boot_cpu") Cc: <stable@vger.kernel.org> # 4.4+ Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-21cpufreq: add cpufreq_driver_resolve_freq()Steve Muckle
Cpufreq governors may need to know what a particular target frequency maps to in the driver without necessarily wanting to set the frequency. Support this operation via a new cpufreq API, cpufreq_driver_resolve_freq(). This API returns the lowest driver frequency equal or greater than the target frequency (CPUFREQ_RELATION_L), subject to any policy (min/max) or driver limitations. The mapping is also cached in the policy so that a subsequent fast_switch operation can avoid repeating the same lookup. The API will call a new cpufreq driver callback, resolve_freq(), if it has been registered by the driver. Otherwise the frequency is resolved via cpufreq_frequency_table_target(). Rather than require ->target() style drivers to provide a resolve_freq() callback it is left to the caller to ensure that the driver implements this callback if necessary to use cpufreq_driver_resolve_freq(). Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Steve Muckle <smuckle@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21perf tools: Add AVX-512 instructions to the new instructions testAdrian Hunter
Previous patches added support for Intel's AVX-512 instructions to the kernel and perf tools instruction decoders. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). Add a representative set of instructions to perf's "new instructions" test. e.g. perf test "new instructions" Or to view a particular instruction: perf test -v "new instructions" 2>&1 | grep vbroadcasti64x4 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21perf tools: Add AVX-512 support to the instruction decoder used by Intel PTAdrian Hunter
Add support for Intel's AVX-512 instructions to perf tools instruction decoder used by Intel PT. The kernel's instruction decoder was updated in a previous patch. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). AVX-512 instructions are identified by a EVEX prefix which, for the purpose of instruction decoding, can be treated as though it were a 4-byte VEX prefix. Existing instructions which can now accept an EVEX prefix need not be further annotated in the op code map (x86-opcode-map.txt). In the case of new instructions, the op code map is updated accordingly. Also add associated Mask Instructions that are used to manipulate mask registers used in AVX-512 instructions. A representative set of instructions is added to the perf tools new instructions test in a subsequent patch. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21x86/insn: Add AVX-512 support to the instruction decoderAdrian Hunter
Add support for Intel's AVX-512 instructions to the instruction decoder. AVX-512 instructions are documented in Intel Architecture Instruction Set Extensions Programming Reference (February 2016). AVX-512 instructions are identified by a EVEX prefix which, for the purpose of instruction decoding, can be treated as though it were a 4-byte VEX prefix. Existing instructions which can now accept an EVEX prefix need not be further annotated in the op code map (x86-opcode-map.txt). In the case of new instructions, the op code map is updated accordingly. Also add associated Mask Instructions that are used to manipulate mask registers used in AVX-512 instructions. The 'perf tools' instruction decoder is updated in a subsequent patch. And a representative set of instructions is added to the perf tools new instructions test in a subsequent patch. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: X86 ML <x86@kernel.org> Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPTSrinivas Pandruvada
The MSR MSR_HWP_INTERRUPT is valid only when CPUID.06H:EAX[8] = 1, so check for feature before accessing this MSR. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21intel_pstate: Update cpu_frequency tracepoint every timeRafael J. Wysocki
Currently, intel_pstate only updates the cpu_frequency tracepoint if the new P-state to set is different from the current one, but that causes powertop to report 100% idle on an 100% loaded system sometimes. Prevent that from happening by updating the cpu_frequency tracepoint every time intel_pstate_update_pstate() is called. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>-
2016-07-21cpufreq: intel_pstate: clean remnant struct elementCarsten Emde
When I was working with the Intel P state driver I came across a remnant struct element that is no longer needed after the function intel_pstate_calc_freq() was retired. Signed-off-by: Carsten Emde <C.Emde@osadl.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21ACPI / DPTF: move int340x_thermal.c to the DPTF folderSrinivas Pandruvada
Since DPTF has its own folder under ACPI, move this file also there. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>