summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-06-18Merge tag 'drm-amdkfd-next-fixes-2015-06-16' of ↵Dave Airlie
git://people.freedesktop.org/~gabbayo/linux into drm-next - Dan fixed some range checks in the address watch ioctl impl. - Remove obsolete member from radeon_device structure * tag 'drm-amdkfd-next-fixes-2015-06-16' of git://people.freedesktop.org/~gabbayo/linux: drm/amdkfd: fix some range checks in address watch ioctl drm/radeon: remove obsolete kfd_bo from radeon_device
2015-06-17NFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writesTrond Myklebust
If a write attempt fails, and the write is queued up for resending to the server, as opposed to being dropped, then we need to set the appropriate flag so that nfs_file_fsync() does the right thing. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-17pNFS: Fix a memory leak when attempted pnfs failsTrond Myklebust
pnfs_do_write() expects the call to pnfs_write_through_mds() to free the pgio header and to release the layout segment before exiting. The problem is that nfs_pgio_data_destroy() doesn't actually do this; it only frees the memory allocated by nfs_generic_pgio(). Ditto for pnfs_do_read()... Fix in both cases is to add a call to hdr->release(hdr). Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-18Merge remote-tracking branches 'spi/topic/sirf', 'spi/topic/spidev' and ↵Mark Brown
'spi/topic/zynq' into spi-next
2015-06-18Merge remote-tracking branches 'spi/topic/pxa', 'spi/topic/rb4xx', ↵Mark Brown
'spi/topic/rspi', 'spi/topic/s3c64xx' and 'spi/topic/sh-msiof' into spi-next
2015-06-18Merge remote-tracking branches 'spi/topic/fsl-dspi', 'spi/topic/gpio', ↵Mark Brown
'spi/topic/imx' and 'spi/topic/orion' into spi-next
2015-06-18Merge remote-tracking branches 'spi/topic/ath79', 'spi/topic/atmel' and ↵Mark Brown
'spi/topic/davinci' into spi-next
2015-06-18Merge remote-tracking branch 'spi/topic/omap2-mcspi' into spi-nextMark Brown
2015-06-18Merge remote-tracking branch 'spi/topic/bcm2835' into spi-nextMark Brown
2015-06-18Merge remote-tracking branches 'spi/fix/fsl-dspi', 'spi/fix/fsl-espi', ↵Mark Brown
'spi/fix/orion' and 'spi/fix/pl022' into spi-linus
2015-06-18Merge remote-tracking branch 'spi/fix/core' into spi-linusMark Brown
2015-06-17selftests: add seccomp suiteKees Cook
This imports the existing seccomp test suite into the kernel's selftests tree. It contains extensive testing of seccomp features and corner cases. There remain additional tests to move into the kernel tree, but they have not yet been ported to all the architectures seccomp supports: https://github.com/redpig/seccomp/tree/master/tests Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2015-06-17PCI: pciehp: Clean up debug loggingBjorn Helgaas
The pciehp debug logging is overly verbose and often redundant. Almost all of the information printed by dbg_ctrl() is also printed by the normal PCI core enumeration code and by pcie_init(). Remove the redundant debug info. When claiming a pciehp bridge, we print the slot characteristics, e.g., Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+ Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information, and print it all in the same order as lspci does. No functional change except the message text changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rajat Jain <rajatja@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org>
2015-06-17Merge branch 'pci/resource' into nextBjorn Helgaas
* pci/resource: x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A
2015-06-17x86/PCI: Use host bridge _CRS info on systems with >32 bit addressingBjorn Helgaas
We enable _CRS on all systems from 2008 and later. On older systems, we ignore _CRS and assume the whole physical address space (excluding RAM and other devices) is available for PCI devices, but on systems that support physical address spaces larger than 4GB, it's doubtful that the area above 4GB is really available for PCI. After d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible"), we try to use that space above 4GB *first*, so we're more likely to put a device there. On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394, and card reader devices unassigned (but only after Windows had been booted). Only the sound device had a 64-bit BAR, so it was the only device placed above 4GB, and hence the only device that didn't work. Keep _CRS enabled even on pre-2008 systems if they support physical address space larger than 4GB. Fixes: d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible") Reported-and-tested-by: Juan Dayer <jdayer@outlook.com> Reported-and-tested-by: Alan Horsfield <alan@hazelgarth.co.uk> Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221 Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.14+
2015-06-17Bluetooth: hci_uart: Add new line discipline enhancementsIlya Faenson
Added the ability to flow control the UART, improved the UART baud rate setting, transferred the speeds into line discipline from the protocol and introduced the tty init function. Signed-off-by: Ilya Faenson <ifaenson@broadcom.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-17VFIO: platform: add reset struct and lookup tableEric Auger
This patch introduces the vfio_platform_reset_combo struct that stores all the information useful to handle the reset modality: compat string, name of the reset function, name of the module that implements the reset function. A lookup table of such structures is added, currently void. Signed-off-by: Eric Auger <eric.auger@linaro.org> Acked-by: Baptiste Reynal <b.reynal@virtualopensystems.com> Tested-by: Baptiste Reynal <b.reynal@virtualopensystems.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-06-17fm10k: Fix missing braces after if statementAlexander Duyck
While reviewing the code I noticed that one of the commits added an if statement followed by a for loop, but the if statement was missing the braces around the loop. This change corrects the coding style error. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: fix iov_msg_lport_state_pf issueJacob Keller
When a VF issues an LPORT_STATE request to enable a port that is already enabled, the PF will first disable the VF LPORT. Then it should re-enable the VF again with the new requested settings. This ensures that any switch rules are cleared by deleting the LPORT on the switch. However, the flow is bugged because we actually check if the VF is enabled at the end, and thus don't re-enable it. Fix the flow so that we actually clear the enabled flags as part of our removal of the LPORT. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: remove err_no reference in fm10k_mbx.cJacob Keller
The reference to err_no was left around after a previous code refactor. We never use the value, and it doesn't seem to be used in side a hidden macro reference. Discovered via cppcheck. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: fix incorrect DIR_NEVATIVE bit in 1588 codeJacob Keller
The SYSTIME_CFG.Adjust Direction bit is actually supposed to indicate that the adjustment is positive. Fix the code to align correctly with hardware and documentation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: pack TLV overlay structuresJacob Keller
This patch adds the __attribute__((packed)) indicator to some structures which are overlayed onto a TLV message. These structures must be packed as small as possible in order to correctly align when copied into the mailbox buffer. Without doing so, the receiving mailbox code incorrectly parses the values and we get invalid message responses from the switch manager software. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: re-map all possible VF queues after a VFLRJacob Keller
During initialization, the VF counts its rings by walking the TQDLOC registers. This works only if the TQMAP/RQMAP registers are set to map all of the out-of-bound rings back to the first one. This allows the VF to cleanly detect when it has run out of queues. Update the PF code so that it resets the empty TQMAP/RQMAP registers post-VFLR to prevent innocent VF drivers from triggering malicious driver events. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: force LPORT delete when updating VLAN or MAC addressJacob Keller
Currently, we don't notify the switch at all when the PF administratively sets a new VLAN or MAC address. This causes the old addresses to remain valid on the switch table. Since the PF is overriding any configuration done directly by the VF, we choose to simply re-create the LPORT for the VF. This does mean that all rules for the VF will be dropped when we set something directly via the PF, but it prevents some weird issues where the MAC/VLAN table retains some stale configuration. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: use dma_set_mask_and_coherent in fm10k_probeJacob Keller
This patch cleans up the use of dma_get_required_mask and uses the simpler dma_set_mask_and_coherent function instead of doing these as separate steps. I removed the dma_get_required_mask call because based on some minimal testing it appears that either (a) we're not doing the right thing with the call or (b) we don't need it anyways. If the value returned is <48bits, we'll end up trying with 48 bits anyways. If it's over 48bits, fm10k can't support that anyways, and we should try 48bits. If 48bits fails, we'll fallback to 32bits. This cleans up some very funky code. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: trivial fixup message style to include a colonJacob Keller
Also use %d for error values, since printing in hexadecimal is probably not helpful. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: remove extraneous NULL check on l2_accelJacob Keller
l2_accel was checked for NULL at the top of fm10k_dfwd_del_station, and we return if it is not defined. Due to this, we already know it can't be null here so a separate check is meaningless. Discovered via cppcheck. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: use an unsigned int for i in ethtool_get_stringsJacob Keller
The value will never be negative, and we use the %u print format. Thus, use unsigned int for the loop counter. Issue found using cppcheck. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: add call to fm10k_clean_all_rx_rings in fm10k_downJacob Keller
This prevents a memory leak in fm10k_set_ringparams. The leak occurs because we go down, change ring parameters, and then come up. However, fm10k_down on its own is not clearing the Rx rings. Since fm10k_up assumes the rings are clean we basically drop the buffers and leak a bunch of memory. Eventually we hit dirty page faults and reboot the system. This issue does not occur elsewhere because other flows that involve fm10k_down go through fm10k_close which immediately called fm10k_free_all_rx_resources which properly cleans the rings. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: fix incorrect free on skb in ts_tx_enqueueJacob Keller
This patch resolves a bug in the ts_tx_enqueue code responsible for a NULL pointer dereference and invalid access of the skb list. We incorrectly freed the actual skb we found instead of our copy. Thus the skb queue is essentially invalidated. Resolve this by freeing our clone in the cases where we did not add it to the queue. This also avoids the skb memory leak caused by failure to free the clone. [ 589.719320] BUG: unable to handle kernel NULL pointer dereference at (null) [ 589.722344] IP: [<ffffffffa0310e60>] fm10k_ts_tx_subtask+0xb0/0x160 [fm10k] [ 589.723796] PGD 0 [ 589.725228] Oops: 0000 [#1] SMP Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: move setting shinfo inside ts_tx_enqueueJacob Keller
This patch simplifies the code flow for setting the IN_PROGRESS bit of the shinfo for an skb we will be timestamping. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: use correct ethernet driver Tx timestamp functionJacob Keller
skb_complete_tx_timestamp is intended for use by PHY drivers which implement a different method of returning timestamps. This method is intended to be used after a PHY driver accepts a cloned packet via its phy_driver.txtstamp function. It is not correct to use in the standard ethernet driver such as fm10k. This patch fixes the following possible kernel panic. [ 2744.552896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W OE 3.19.3-200.fc21.x86_64 #1 [ 2744.552899] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014 [ 2744.552901] 0000000000000000 2f4c8b10ea3f9848 ffff88081ee03a38 ffffffff8176e215 [ 2744.552906] 0000000000000000 0000000000000000 ffff88081ee03a78 ffffffff8109bc1a [ 2744.552910] ffff88081ee03c50 ffff88080e55fc00 ffff88080e55fc00 ffffffff81647c50 [ 2744.552914] Call Trace: [ 2744.552917] <IRQ> [<ffffffff8176e215>] dump_stack+0x45/0x57 [ 2744.552931] [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0 [ 2744.552936] [<ffffffff81647c50>] ? skb_queue_purge+0x20/0x40 [ 2744.552941] [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20 [ 2744.552946] [<ffffffff81646911>] skb_release_head_state+0xe1/0xf0 [ 2744.552950] [<ffffffff81647b26>] skb_release_all+0x16/0x30 [ 2744.552954] [<ffffffff81647ba6>] kfree_skb+0x36/0x90 [ 2744.552958] [<ffffffff81647c50>] skb_queue_purge+0x20/0x40 [ 2744.552964] [<ffffffff81751f8d>] packet_sock_destruct+0x1d/0x90 [ 2744.552968] [<ffffffff81642053>] __sk_free+0x23/0x140 [ 2744.552973] [<ffffffff81642189>] sk_free+0x19/0x20 [ 2744.552977] [<ffffffff81647d60>] skb_complete_tx_timestamp+0x50/0x60 [ 2744.552988] [<ffffffffa02eee40>] fm10k_ts_tx_hwtstamp+0xd0/0x100 [fm10k] [ 2744.552994] [<ffffffffa02e054e>] fm10k_1588_msg_pf+0x12e/0x140 [fm10k] [ 2744.553002] [<ffffffffa02edf1d>] fm10k_tlv_msg_parse+0x8d/0xc0 [fm10k] [ 2744.553010] [<ffffffffa02eb2d0>] fm10k_mbx_dequeue_rx+0x60/0xb0 [fm10k] [ 2744.553016] [<ffffffffa02ebf98>] fm10k_sm_mbx_process+0x178/0x3c0 [fm10k] [ 2744.553022] [<ffffffffa02e09ca>] fm10k_msix_mbx_pf+0xfa/0x360 [fm10k] [ 2744.553030] [<ffffffff811030a7>] ? get_next_timer_interrupt+0x1f7/0x270 [ 2744.553036] [<ffffffff810f2a47>] handle_irq_event_percpu+0x77/0x1a0 [ 2744.553041] [<ffffffff810f2bab>] handle_irq_event+0x3b/0x60 [ 2744.553045] [<ffffffff810f5d6e>] handle_edge_irq+0x6e/0x120 [ 2744.553054] [<ffffffff81017414>] handle_irq+0x74/0x140 [ 2744.553061] [<ffffffff810bb54a>] ? atomic_notifier_call_chain+0x1a/0x20 [ 2744.553066] [<ffffffff8177777f>] do_IRQ+0x4f/0xf0 [ 2744.553072] [<ffffffff8177556d>] common_interrupt+0x6d/0x6d [ 2744.553074] <EOI> [<ffffffff81609b16>] ? cpuidle_enter_state+0x66/0x160 [ 2744.553084] [<ffffffff81609b01>] ? cpuidle_enter_state+0x51/0x160 [ 2744.553087] [<ffffffff81609cf7>] cpuidle_enter+0x17/0x20 [ 2744.553092] [<ffffffff810de101>] cpu_startup_entry+0x321/0x3c0 [ 2744.553098] [<ffffffff81764497>] rest_init+0x77/0x80 [ 2744.553103] [<ffffffff81d4f02c>] start_kernel+0x4a4/0x4c5 [ 2744.553107] [<ffffffff81d4e120>] ? early_idt_handlers+0x120/0x120 [ 2744.553110] [<ffffffff81d4e4d7>] x86_64_start_reservations+0x2a/0x2c [ 2744.553114] [<ffffffff81d4e62b>] x86_64_start_kernel+0x152/0x175 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: ignore invalid multicast address entriesJacob Keller
This change fixes an issue with adding an invalid multicast address using the iproute2 tool (ip maddr add <MADDR> dev <dev>). The iproute2 tool and the kernel do not validate or filter the multicast addresses when adding them to the multicast list. Thus, when synchronizing this list with an invalid entry, the action will be aborted with an error since the fm10k driver currently validates the list. Consequently, multicast entries beyond the invalid one will not be processed and communicated with the switch via the mailbox. This change makes it so that invalid addresses will simply be skipped and allows synchronizing the full list to proceed. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-17fm10k: fold fm10k_pull_tail into fm10k_add_rx_fragAlexander Duyck
This change folds the fm10k_pull_tail call into fm10k_add_rx_frag. The advantage to doing this is that the fragment doesn't have to be modified after it is added to the skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-06-18powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=offAlexey Kardashevskiy
The pnv_pci_ioda2_unset_window() function is used to do the final cleanup of a DMA window being released: - via VFIO ioctl by the guest request; - via unplugging a virtual PCI function. However the function was under #ifdef CONFIG_IOMMU_API and was missing. This moves the helper outside of IOMMU_API block and enables it for either or both IOMMU_API and PCI_IOV. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-18powerpc/include: Add opal-prd to installed uapi headersJeremy Kerr
We'll want to build the opal-prd daemon with the prd headers, so include this in the uapi headers list. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-18powerpc/powernv: fix construction of opal PRD messagesJeremy Kerr
We currently have a bug in the PRD code, where the contents of an incoming message (beyond the header) will be overwritten by the list item manipulations when adding to to the prd_msg_queue. This change reorders struct opal_prd_msg_queue_item, so that the message body doesn't overlap the list_head. We also clarify the memcpy of the message, as we're copying unnecessary bytes at the end of the message data. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-18powerpc/powernv: Increase opal-irqchip initcall priorityAlistair Popple
The eeh subsystem for powernv requires the opal event irqchip to be initialised prior to initialisation or the following errors are produced (and eeh doesn't work as expected): irq: XICS didn't like hwirq-0x9 to VIRQ17 mapping (rc=-22) pnv_eeh_post_init: Can't request OPAL event interrupt (0) On powernv eeh is initialised from a subsys_initcall due to a check for machine_is(powernv) in eeh_init(). This patch increases the initcall priority of opal_event_init() to an arch_initcall to ensure the opal event interface is initialised prior to any users of it. Signed-off-by: Alistair Popple <alistair@popple.id.au> Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-17MAINTAINERS: update email for Michael TurquetteMichael Turquette
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
2015-06-17Merge branch 'clk-shmobile-for-4.2' of ↵Michael Turquette
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into clk-next
2015-06-17Merge remote-tracking branch 'clk/clk-next' into clk-nextMichael Turquette
2015-06-17perf top: Allow disabling/enabling events dynamiclyArnaldo Carvalho de Melo
Now it is possible to press CTRL+z at anytime and that will disable the events being monitored, essentially turning 'top' into 'report', with pressing CTRL+z again making it enable the events again, returning to the 'top' behaviour, i.e. dynamic + decaying of older samples. One may want, for instance, play with: -d, --delay <n> number of seconds to delay between refreshes and: -z, --zero zero history across updates Plus CTRL+z to see only the events since last zeroing, etc. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-zq7tnh5462blt2yda0bcxh5b@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf evlist: Add toggle_enable() methodArnaldo Carvalho de Melo
For an upcoming feature in 'perf top' we will have a hotkey to enable/disable events, so remember if the events in the list are enabled or disabled and allows toggling this state using a new method. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-64c4jvdl5feg2zhimxvokqka@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf trace: Fix race condition at the end of started workloadsSukadev Bhattiprolu
I get following crash on multiple systems and across several releases (at least since v3.18). Core was generated by `/tmp/perf trace sleep 0.2 '. Program terminated with signal SIGSEGV, Segmentation fault. #0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195 195 u64 head = ACCESS_ONCE(pc->data_head); (gdb) bt #0 perf_mmap__read_head (mm=0x3fff9bf30070) at util/evlist.h:195 #1 perf_evlist__mmap_read (evlist=0x10027f11910, idx=<optimized out>) at util/evlist.c:637 #2 0x000000001003ce4c in trace__run (argv=<optimized out>, argc=<optimized out>, trace=0x3fffd7b28288) at builtin-trace.c:2259 #3 cmd_trace (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin-trace.c:2799 #4 0x00000000100657b8 in run_builtin (p=0x10176798 <commands+480>, argc=3, argv=0x3fffd7b2b550) at perf.c:370 #5 0x00000000100063e8 in handle_internal_command (argv=0x3fffd7b2b550, argc=3) at perf.c:429 #6 run_argv (argv=0x3fffd7b2af70, argcp=0x3fffd7b2af7c) at perf.c:473 #7 main (argc=3, argv=0x3fffd7b2b550) at perf.c:588 The problem seems to be a race condition, when the application has just exited. Some/all fds associated with the perf-events (tracepoints) go into a POLLHUP/ POLLERR state and the mmap region associated with those events are unmapped (in perf_evlist__filter_pollfd()). But we go back and do a perf_evlist__mmap_read() which assumes that the mmaps are still valid and we hit the crash. If the mapping for an event is released, its refcnt is 0 (and ->base is NULL), so ensure we have non-zero refcount before accessing the map. Note that perf-record has a similar logic but unlike perf-trace, the record__mmap_read_all() checks the evlist->mmap[i].base before accessing the map. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20150612060003.GA19913@us.ibm.com [ Fixed it up to use atomic_read() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf probe: Speed up perf probe --list by caching debuginfoMasami Hiramatsu
Speed up the "perf probe --list" by caching the last used debuginfo. perf probe --list always open and load debuginfo for each entry of probe list. This takes very a long time. E.g. with vfs_* events (total 96 probes) [root@localhost perf]# time ./perf probe -l &> /dev/null real 0m25.376s user 0m24.381s sys 0m1.012s To solve this issue, this adds debuginfo_cache to cache the last used debuginfo on memory. With this fix, the perf-probe --list significantly improves its speed. [root@localhost perf]# time ./perf probe -l &> /dev/null real 0m0.161s user 0m0.136s sys 0m0.025s Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naohiro Aota <naota@elisp.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150617145854.19715.15314.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf probe: Show usage even if the last event is skippedMasami Hiramatsu
When the last part of converted events are blacklisted or out-of-text, those are skipped and perf probe doesn't show usage examples. This fixes it to show the example even if the last part of event list is skipped. E.g. without this patch, events are added, but suddenly end: # perf probe vfs_* vfs_caches_init_early is out of .text, skip it. vfs_caches_init is out of .text, skip it. Added new events: probe:vfs_fallocate (on vfs_*) probe:vfs_open (on vfs_*) ... probe:vfs_dentry_acceptable (on vfs_*) probe:vfs_load_quota_inode (on vfs_*) # With this fix: # perf probe vfs_* vfs_caches_init_early is out of .text, skip it. vfs_caches_init is out of .text, skip it. Added new events: probe:vfs_fallocate (on vfs_*) ... probe:vfs_load_quota_inode (on vfs_*) You can now use it in all perf tools, such as: perf record -e probe:vfs_load_quota_inode -aR sleep 1 Note that this can be reproduced ONLY IF the vfs_caches_init* is the last part of matched symbol list. I've checked this happens on "3.19.0-generic #18-Ubuntu" kernel binary. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naohiro Aota <naota@elisp.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150616115057.19906.5502.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf tools: Move libtraceevent dynamic list to separated LDFLAGS variableWang Nan
Commit e3d09ec8126fe2c9a3ade661e2126e215ca27a80 ("tools lib traceevent: Export dynamic symbols used by traceevent plugins") adds libtraceevent dynamic list directly into LDFLAGS, which makes all targets depend on that list through LDFLAGS. This is not good since some of targets like libgtk.so doesn't use plugin at all, but require the existance of that list because of linker options. This patch isolates the -Xlink option into LIBTRACEEVENT_DYNAMIC_LIST_LDFLAGS, makes only perf and perf.so use it. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: He Kuang <hekuang@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1434552389-89144-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17perf tools: Fix a problem when opening old perf.data with different byte orderWang Nan
Following error occurs when trying to use 'perf report' on x86_64 to cross analysis a perf.data generated by an old perf on a big-endian machine: # perf report *** Error in `/home/w00229757/perf': free(): invalid next size (fast): 0x00000000032c99f0 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x6eeef)[0x7ff6ff7e2eef] /lib64/libc.so.6(+0x78cae)[0x7ff6ff7eccae] /lib64/libc.so.6(+0x79987)[0x7ff6ff7ed987] /path/to/perf[0x4ac734] /path/to/perf[0x4ac829] /path/to/perf(perf_header__process_sections+0x129)[0x4ad2c9] /path/to/perf(perf_session__read_header+0x2e1)[0x4ad9e1] /path/to/perf(perf_session__new+0x168)[0x4bd458] /path/to/perf(cmd_report+0xfa0)[0x43eb70] /path/to/perf[0x47adc3] /path/to/perf(main+0x5f6)[0x42fd06] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7ff6ff795bd5] /path/to/perf[0x42fe35] ======= Memory map: ======== [SNIP] The bug is in perf_event__attr_swap(). It swaps all fields in 'struct perf_event_attr' without checking whether the swapped field exist or not. In addition, in read_event_desc() allocs memory for attr according to size read from perf.data. Therefore, if the perf.data is collected by an old perf (without aux_watermark, for example), when perf_event__attr_swap() swaping attr->aux_watermark it destroy malloc's metadata. This patch introduces boundary checking in perf_event__attr_swap(). It adds macros bswap_field_64 and bswap_field_32 into perf_event__attr_swap() to make it only swap exist fields. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1434534999-85347-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-06-17writeback, blkio: add documentation for cgroup writeback supportTejun Heo
Update Documentation/cgroups/blkio-controller.txt to reflect the recently added cgroup writeback support. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: cgroups@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-17vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWBTejun Heo
FS_CGROUP_WRITEBACK indicates whether a file_system_type supports cgroup writeback; however, different super_blocks of the same file_system_type may or may not support cgroup writeback depending on filesystem options. This patch replaces FS_CGROUP_WRITEBACK with a per-super_block flag. super_block->s_flags carries some internal flags in the high bits but it's exposd to userland through uapi header and running out of space anyway. This patch adds a new field super_block->s_iflags to carry kernel-internal flags. It is currently only used by the new SB_I_CGROUPWB flag whose concatenated and abbreviated name is for consistency with other super_block flags. ext2_fill_super() is updated to set SB_I_CGROUPWB. v2: Added super_block->s_iflags instead of stealing another high bit from sb->s_flags as suggested by Christoph and Jan. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: linux-ext4@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>