summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2016-02-29Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵Ingo Molnar
applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29perf: Allow storage of PMU private data in eventThomas Gleixner
For PMUs which are not per CPU, but e.g. per package/socket, we want to be able to store a reference to the underlying per package/socket facility in the event at init time so we can avoid magic storage constructs in the PMU driver. This allows us to get rid of the per CPU dance in the intel uncore and RAPL drivers and avoids a lookup of the per package data in the perf hotpath. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Harish Chegondi <harish.chegondi@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160222221011.364140369@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29objtool: Add STACK_FRAME_NON_STANDARD() macroJosh Poimboeuf
Add a new macro, STACK_FRAME_NON_STANDARD(), which is used to denote a function which does something unusual related to its stack frame. Use of the macro prevents objtool from emitting a false positive warning. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/34487a17b23dba43c50941599d47054a9584b219.1456719558.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-28Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A rather largish series of 12 patches addressing a maze of race conditions in the perf core code from Peter Zijlstra" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Robustify task_function_call() perf: Fix scaling vs. perf_install_in_context() perf: Fix scaling vs. perf_event_enable() perf: Fix scaling vs. perf_event_enable_on_exec() perf: Fix ctx time tracking by introducing EVENT_TIME perf: Cure event->pending_disable race perf: Fix race between event install and jump_labels perf: Fix cloning perf: Only update context time when active perf: Allow perf_release() with !event->ctx perf: Do not double free perf: Close install vs. exit race
2016-02-27Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: dax: move writeback calls into the filesystems dax: give DAX clearing code correct bdev ext4: online defrag not supported with DAX ext2, ext4: only set S_DAX for regular inodes block: disable block device DAX by default ocfs2: unlock inode if deleting inode from orphan fails mm: ASLR: use get_random_long() drivers: char: random: add get_random_long() mm: numa: quickly fail allocations for NUMA balancing on full nodes mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
2016-02-27Merge tag 'pci-v4.5-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "Enumeration: Revert x86 pcibios_alloc_irq() to fix regression (Bjorn Helgaas) Marvell MVEBU host bridge driver: Restrict build to 32-bit ARM (Thierry Reding)" * tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: mvebu: Restrict build to 32-bit ARM Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()" Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed" Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
2016-02-27dax: move writeback calls into the filesystemsRoss Zwisler
Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27dax: give DAX clearing code correct bdevRoss Zwisler
dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27drivers: char: random: add get_random_long()Daniel Cashman
Commit d07e22597d1d ("mm: mmap: add new /proc tunable for mmap_base ASLR") added the ability to choose from a range of values to use for entropy count in generating the random offset to the mmap_base address. The maximum value on this range was set to 32 bits for 64-bit x86 systems, but this value could be increased further, requiring more than the 32 bits of randomness provided by get_random_int(), as is already possible for arm64. Add a new function: get_random_long() which more naturally fits with the mmap usage of get_random_int() but operates exactly the same as get_random_int(). Also, fix the shifting constant in mmap_rnd() to be an unsigned long so that values greater than 31 bits generate an appropriate mask without overflow. This is especially important on x86, as its shift instruction uses a 5-bit mask for the shift operand, which meant that any value for mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base randomization. Finally, replace calls to get_random_int() with get_random_long() where appropriate. This patch (of 2): Add get_random_long(). Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27clocksource: Introduce clocksource_freq2mult()Alexander Kuleshov
The clocksource_khz2mult() and clocksource_hz2mult() share similar code wihch calculates a mult from the given frequency. Both implementations in differ only in value of a frequency. This patch introduces the clocksource_freq2mult() helper with generic implementation of mult calculation to prevent code duplication. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Link: http://lkml.kernel.org/r/1456542854-22104-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-26clk: Make of_clk_get_parent_count() return unsigned intsStephen Boyd
Russell King recently pointed out a bug in the clk-gpio code where it fails to register the clk if of_clk_get_parent_count() returns an error because the "clocks" property isn't present in the DT node. If we're trying to count parents from DT we'd like to know the count, not if there is a "clocks" property or not. Furthermore, some drivers are assigning the return value to their clk_init_data::num_parents member which is unsigned, leading to potentially large numbers of parents when the property isn't present. Let's change the API to return an unsigned int instead of an int. All the callers just want to know the count anyway, and this avoids the bug that was in the clk-gpio driver. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-02-26dmaengine: mmp-pdma: add number of requestorsRobert Jarzmik
The DMA chip has a fixed number of requestor lines used for flow control. This number is platform dependent. The pxa_dma dma driver will use this value to activate or not the flow control. There won't be any impact on mmp_pdma driver. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
2016-02-26cpufreq: Simplify the cpufreq_for_each_valid_entry()Rafael J. Wysocki
That macro uses an internal static inline function that is first totally unnecessary and second hard to read, so simplify it and get rid of that monster. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-02-26drivers/perf: arm_pmu: implement CPU_PM notifierLorenzo Pieralisi
When a CPU is suspended (either through suspend-to-RAM or CPUidle), its PMU registers content can be lost, which means that counters registers values that were initialized on power down entry have to be reprogrammed on power-up to make sure the counters set-up is preserved (ie on power-up registers take the reset values on Cold or Warm reset, which can be architecturally UNKNOWN). To guarantee seamless profiling conditions across a core power down this patch adds a CPU PM notifier to ARM pmus, that upon CPU PM entry/exit from low-power states saves/restores the pmu registers set-up (by using the ARM perf API), so that the power-down/up cycle does not affect the perf behaviour (apart from a black-out period between power-up/down CPU PM notifications that is unavoidable). Cc: Will Deacon <will.deacon@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-26fbdev: kill fb_rotateRasmus Villemoes
The fb_rotate method in struct fb_ops is never actually invoked, and it's been that way in the entire history of git (in fact, the last occurrence of the string '->fb_rotate' vanished over 10 years ago, with b4d8aea6d6, and that merely tested whether the callback existed). So remove some dead code and make struct fb_obs a little smaller. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2016-02-25staging/lustre: proper support of NFS anonymous dentriesDmitry Eremin
NFS can ask to encode dentries that are not connected to the root. The fix check for parent is NULL and encode a file handle accordingly. Reviewed-on: http://review.whamcloud.com/8347 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4231 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Jian Yu <jian.yu@intel.com> Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Acked-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25net: ethtool: remove unused __ethtool_get_settingsDavid Decotigny
replaced by __ethtool_get_link_ksettings. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25net: ethtool: add new ETHTOOL_xLINKSETTINGS APIDavid Decotigny
This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: - Two fixes for compatibility with the ACPI 6.1 specification. Without these fixes multi-interface DIMMs will fail to be probed, and address range scrub commands to find memory errors will give results that the kernel will mis-interpret. For multi-interface DIMMs Linux will accept either the original 6.0 implementation or 6.1. For address range scrub we'll only support 6.1 since ACPI formalized this DSM differently than the original example [1] implemented in v4.2. The expectation is that production systems will only ever ship the ACPI 6.1 address range scrub command definition. - The wider async address range scrub work targeting 4.6 discovered that the original synchronous implementation in 4.5 is not sizing its return buffer correctly. - Arnd caught that my recent fix to the size of the pfn_t flags missed updating the flags variable used in the pmem driver. - Toshi found that we mishandle the memremap() return value in devm_memremap(). * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: use 'u64' for pfn flags devm_memremap: Fix error value when memremap failed nfit: update address range scrub commands to the acpi 6.1 format libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing nfit: fix multi-interface dimm handling, acpi6.1 compatibility
2016-02-25net: ipv6: Make address flushing on ifdown optionalDavid Ahern
Currently, all ipv6 addresses are flushed when the interface is configured down, including global, static addresses: $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 << nothing; all addresses have been flushed>> Add a new sysctl to make this behavior optional. The new setting defaults to flush all addresses to maintain backwards compatibility. When the set global addresses with no expire times are not flushed on an admin down. The sysctl is per-interface or system-wide for all interfaces $ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1 or $ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1 Will keep addresses on eth1 on an admin down. $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000 inet6 2100:1::2/120 scope global tentative valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative valid_lft forever preferred_lft forever Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-26regmap: add regmap_fields_force_xxx() macrosKuninori Morimoto
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-26regmap: add regmap_field_force_xxx() macrosKuninori Morimoto
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-25Merge tag 'for-v4.5-rc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: "Add a regression fix for changed sysfs path of bq27xxx_battery and update MAINTAINERS file" * tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: bq27xxx_battery: Restore device name MAINTAINERS: update bq27xxx driver
2016-02-25libata: Align ata_device's id on a cachelineHarvey Hunt
The id buffer in ata_device is a DMA target, but it isn't explicitly cacheline aligned. Due to this, adjacent fields can be overwritten with stale data from memory on non coherent architectures. As a result, the kernel is sometimes unable to communicate with an ATA device. Fix this by ensuring that the id buffer is cacheline aligned. This issue is similar to that fixed by Commit 84bda12af31f ("libata: align ap->sector_buf"). Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 2.6.18 Signed-off-by: Tejun Heo <tj@kernel.org>
2016-02-25PCI: Disable IO/MEM decoding for devices with non-compliant BARsBjorn Helgaas
The PCI config header (first 64 bytes of each device's config space) is defined by the PCI spec so generic software can identify the device and manage its usage of I/O, memory, and IRQ resources. Some non-spec-compliant devices put registers other than BARs where the BARs should be. When the PCI core sizes these "BARs", the reads and writes it does may have unwanted side effects, and the "BAR" may appear to describe non-sensical address space. Add a flag bit to mark non-compliant devices so we don't touch their BARs. Turn off IO/MEM decoding to prevent the devices from consuming address space, since we can't read the BARs to find out what that address space would be. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> CC: stable@vger.kernel.org
2016-02-25KVM: Use simple waitqueue for vcpu->wqMarcelo Tosatti
The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25wait.[ch]: Introduce the simple waitqueue (swait) implementationPeter Zijlstra (Intel)
The existing wait queue support has support for custom wake up call backs, wake flags, wake key (passed to call back) and exclusive flags that allow wakers to be tagged as exclusive, for limiting the number of wakers. In a lot of cases, none of these features are used, and hence we can benefit from a slimmed down version that lowers memory overhead and reduces runtime overhead. The concept originated from -rt, where waitqueues are a constant source of trouble, as we can't convert the head lock to a raw spinlock due to fancy and long lasting callbacks. With the removal of custom callbacks, we can use a raw lock for queue list manipulations, hence allowing the simple wait support to be used in -rt. [Patch is from PeterZ which is based on Thomas version. Commit message is written by Paul G. Daniel: - Fixed some compile issues - Added non-lazy implementation of swake_up_locked as suggested by Boqun Feng.] Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-2-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25MIPS: Make smp CMP, CPS and MT use the new generic IPI functionsQais Yousef
This commit does several things to avoid breaking bisectability. 1- Remove IPI init code from irqchip/mips-gic 2- Implement the new irqchip->send_ipi() in irqchip/mips-gic 3- Select GENERIC_IRQ_IPI Kconfig symbol for MIPS_GIC 4- Change MIPS SMP to use the generic IPI implementation Only the SMP variants that use GIC were converted as it's the only irqchip that will have the support for generic IPI for now. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-18-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Implement ipi_send_mask/single()Qais Yousef
Add APIs to send IPIs from driver and arch code. We have different functions because we allow architecture code to cache the irq descriptor to avoid lookups. Driver code has to use the irq number and is subject to more restrictive checks. [ tglx: Polish the implementation ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-12-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add send_ipi callbacks to irq_chipQais Yousef
Introduce the new callbacks which can be used by the core code to implement a generic IPI send mechanism. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-11-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add a new function to get IPI reverse mappingQais Yousef
When dealing with coprocessors we need to find out the actual hwirqs values to pass on to the firmware so that it knows what it needs to use to receive IPIs from and send IPIs to Linux cpus. [ tglx: Fixed the single hwirq IPI case. The hardware irq number does not change due to the cpu number ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-10-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add a new generic IPI reservation code to irq coreQais Yousef
Add a generic mechanism to dynamically allocate an IPI. Depending on the underlying implementation this creates either a single Linux irq or a consective range of Linux irqs. The Linux irq is used later to send IPIs to other CPUs. [ tglx: Massaged the code and removed the 'consecutive mask' restriction for the single IRQ case ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-9-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Make irq_domain_alloc_descs() non staticQais Yousef
We will need to use this function to implement irq_reserve_ipi() later. So make it non static and move the prototype to irqdomain.h to allow using it outside irqdomain.c Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-8-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add ipi_offset to irq_common_dataQais Yousef
IPIs are always assumed to be consecutively allocated, hence virqs and hwirqs can be inferred by using CPU id as an offset. But the first cpu doesn't always have to start at offset 0. ipi_offset stores the position of the first cpu so that we can easily calculate the virq or hwirq of an IPI associated with a specific cpu. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-6-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add an extra comment about the use of affinity in irq_common_dataQais Yousef
Affinity will have dual meaning depends on the type of the irq. If it is a normal irq, it'll have the standard affinity meaning. If it is an IPI, it will hold the mask of the cpus to which an IPI can be sent. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-7-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add DOMAIN_BUS_IPIQais Yousef
We need a way to search and match IPI domains. Using the new enum we can use irq_find_matching_host() to do that. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-3-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25genirq: Add new IPI irqdomain flagsQais Yousef
These flags will be used to identify an IPI domain. We have two flavours of IPI implementations: IRQ_DOMAIN_FLAG_IPI_PER_CPU: Each CPU has its own virq and hwirq IRQ_DOMAIN_FLAG_IPI_SINGLE : A single virq and hwirq for all CPUs Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-2-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25Merge branch 'x86/debug' into core/objtool, to pick up frame pointer fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Fix race between event install and jump_labelsPeter Zijlstra
perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running) only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Fix cloningPeter Zijlstra
Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization, any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains the this event has passed through perf_event_exit_task_context(). And thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24Merge tag 'reset-for-4.6' of git://git.pengutronix.de/git/pza/linux into ↵Olof Johansson
next/drivers Reset controller changes for v4.6 - add support for the imgtec Pistachio SoC reset controller - make struct reset_control_ops const - move DT cell size check into the core to avoid code duplication in the drivers * tag 'reset-for-4.6' of git://git.pengutronix.de/git/pza/linux: reset: sti: Make reset_control_ops const reset: zynq: Make reset_control_ops const reset: socfpga: Make reset_control_ops const reset: hi6220: Make reset_control_ops const reset: ath79: Make reset_control_ops const reset: lpc18xx: Make reset_control_ops const reset: sunxi: Make reset_control_ops const reset: img: Make reset_control_ops const reset: berlin: Make reset_control_ops const reset: berlin: drop DT cell size check reset: img: Add Pistachio reset controller driver reset: img: Add pistachio reset controller binding document reset: hisilicon: check return value of reset_controller_register() reset: Move DT cell size check to the core reset: Make reset_control_ops const reset: remove unnecessary local variable initialization from of_reset_control_get_by_index Signed-off-by: Olof Johansson <olof@lixom.net>
2016-02-24Merge tag 'at91-ab-4.6-drivers' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into next/drivers From Alexandre Belloni: "This is a rework of the PMC driver. It touches multiple subsystems so the easiest path is through arm-soc." drivers update for 4.6: - Big PMC rework that touches clk, PM, usb * tag 'at91-ab-4.6-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: clk: at91: remove useless includes clk: at91: pmc: remove useless capacities handling clk: at91: pmc: drop at91_pmc_base usb: gadget: atmel: access the PMC using regmap ARM: at91: remove useless includes and function prototypes ARM: at91: pm: move idle functions to pm.c ARM: at91: pm: find and remap the pmc ARM: at91: pm: simply call at91_pm_init clk: at91: pmc: move pmc structures to C file clk: at91: pmc: merge at91_pmc_init in atmel_pmc_probe clk: at91: remove IRQ handling and use polling clk: at91: make use of syscon/regmap internally clk: at91: make use of syscon to share PMC registers in several drivers Signed-off-by: Olof Johansson <olof@lixom.net>
2016-02-25ARM: EXYNOS: Move pmu specific headers under "linux/soc/samsung"Pankaj Dubey
Moving Exynos PMU specific header file into "include/linux/soc/samsung" thus updated affected files under "mach-exynos" to use new location of these header files. Signed-off-by: Amit Daniel Kachhap <amitdanielk@gmail.com> [tested on Peach-Pi (Exynos5880)] Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> [for testing on Trats2 (Exynos4412) and Odroid XU3 (Exynos5422)] Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2016-02-24Merge tag 'scpi-for-v4.6/updates' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into next/drivers SCPI updates and fixes for v4.6 1. Minor fix to restore functionality in big-endian mode 2. Fix race by decreasing Tx timeout to 20ms 3. Adds support for 64-bit sensor values and energy meter * tag 'scpi-for-v4.6/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: hwmon: (scpi) add energy meter support firmware: arm_scpi: add support for 64-bit sensor values firmware: arm_scpi: decrease Tx timeout to 20ms firmware: arm_scpi: fix send_message and sensor_get_value for big-endian Signed-off-by: Olof Johansson <olof@lixom.net>
2016-02-24net/mlx5e: Wake On LAN supportTariq Toukan
Implement set/get WOL by ethtool and added the needed device commands and structures to mlx5_ifc. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24net/mlx5e: Implement DCBNL IEEE max rateTariq Toukan
Add support for DCBNL IEEE get/set max rate. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24net/mlx5: Introduce physical port TC/prio access functionsSaeed Mahameed
Add access functions to set and query a physical port TC groups and prio parameters. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24net/mlx5: Introduce physical port PFC access functionsAchiad Shochat
Add access functions to set and query a physical port PFC (Priority Flow Control) parameters. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24net/mlx5: Introduce a new header file for physical port functionsAchiad Shochat
All the device physical port access functions are implemented in the port.c file. We just extract the exposure of these functions from driver.h into a dedicated header file called port.h. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24efi: stub: implement efi_get_random_bytes() based on EFI_RNG_PROTOCOLArd Biesheuvel
This exposes the firmware's implementation of EFI_RNG_PROTOCOL via a new function efi_get_random_bytes(). Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>