linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-03-01	watchdog: Add 'action' and 'data' parameters to restart handler callback	Guenter Roeck
	The 'action' (or restart mode) and data parameters may be used by restart handlers, so they should be passed to the restart callback functions. Cc: Sylvain Lemieux <slemieux@tycoint.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2016-03-01	Merge tag 'pxa-for-4.6' of https://github.com/rjarzmik/linux into next/soc	Arnd Bergmann
	Merge "pxa changes for v4.6 cycle" from Robert Jarzmik: This is a minor cycle with : - cleanup fixes from Arnd, mainly build oriented and sparse type ones - dma fixes for requestors above 32 (impacting mainly camera driver) - some minor cleanup on pxa3xx device-tree side * tag 'pxa-for-4.6' of https://github.com/rjarzmik/linux: dmaengine: pxa_dma: fix the maximum requestor line ARM: pxa: add the number of DMA requestor lines dmaengine: mmp-pdma: add number of requestors dma: mmp_pdma: Add the #dma-requests DT property documentation ARM: pxa: pxa3xx device-tree support cleanup ARM: pxa: don't select RFKILL if CONFIG_NET is disabled ARM: pxa: fix building without IWMMXT ARM: pxa: move extern declarations to pm.h ARM: pxa: always select one of the two CPU types ARM: pxa: don't select GPIO_SYSFS for MIOA701 ARM: pxa: mark unused eseries code as __maybe_unused ARM: pxa: mark spitz_card_pwr_ctrl as __maybe_unused ARM: pxa: define clock registers as __iomem
2016-02-29	IB/mlx4: Add support for the don't trap rule	Marina Varshaver
	Add support for receiving multicast/unicast traffic with the don't trap rule. Sniffing these packets requires a flow steering rule of type NORMAL at priority 0 with flag IB_FLOW_ATTR_FLAGS_DONT_TRAP set. Choosing between multicast or unicast is done via ethernet L2 dest_mac mask and value: - If mask is all zeros - unicast and multicast are set. - If mask non zero - only mask with multicast bit 1 and rest 0 is supported, the mac value will choose if it is multicast or unicast rule. If the mask multicast bit is on and some other bits are on too, it means a request for specific multicast or unicast, this is not supported, either receive all multicast or all unicast. Only when limitations are met registered QP will receive requested type but other QPs can receive same traffic if registered for it. Otherwise, if limitations are not met, an error will be returned. Limitations: - Rule must be with priority 0. - A0 mode is not supported. - Sniffer QP cannot appear in any other flow steering rule. Signed-off-by: Marina Varshaver <marinav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-29	use ->d_seq to get coherency between ->d_inode and ->d_flags	Al Viro
	Games with ordering and barriers are way too brittle. Just bump ->d_seq before and after updating ->d_inode and ->d_flags type bits, so that verifying ->d_seq would guarantee they are coherent. Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-02-29	regulator: act8865: Rename platform_data field to init_data	Maarten ter Huurne
	Make the field name match its type. Signed-off-by: Maarten ter Huurne <maarten@treewalker.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-29	mmc: dw_mmc: remove DW_MCI_QUIRK_BROKEN_CARD_DETECTION quirk	Shawn Lin
	dw_mmc already use mmc_of_parse to get "broken-cd" property, but it considered "broken-cd" to be a quirk in its driver. We don't need this quirk here, and just take what we need from mmc->caps. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Tested-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-02-29	mmc: dw_mmc: remove struct block_settings	Shawn Lin
	This patch removes struct block_settings since it's never used anywhere. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-02-29	mmc: tmio: refactor set_clock a little	Wolfram Sang
	Some of the indentation made the code awful to read. Fix that. Also, introduce defines instead of magic hex values. Note that this includes one change: We mask out know 0xff instead of 0x1ff. But 0x100 has always been the clock enable bit. It doesn't make any sense to set it depending on the clock calculation. Update copyright notices, too. I'll be working on those files some more in the future. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-02-29	mmc: tmio: add flag to reduce delay after changing clock status	Wolfram Sang
	The docs for RCar Gen2 & 3 I have access to, mention delays of 5ms after stop and 1ms after start. Make it possible to apply these values. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-02-29	mmc: core: remove the MMC_DATA_STREAM flag	Jaehoon Chung
	It's not set to MMC_DATA_STREAM anywhere. It seems that it had been used with CMD11/CMD20. But according to Spec, CMD11/CMD20 are obsolete command. Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-02-29	Merge tag 'v4.5-rc6' into locking/core, to pick up fixes	Ingo Molnar
	Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29	sched/debug: Fix preempt_disable_ip recording for preempt_disable()	Sebastian Andrzej Siewior
	The preempt_disable() invokes preempt_count_add() which saves the caller in ->preempt_disable_ip. It uses CALLER_ADDR1 which does not look for its caller but for the parent of the caller. Which means we get the correct caller for something like spin_lock() unless the architectures inlines those invocations. It is always wrong for preempt_disable() or local_bh_disable(). This patch makes the function get_lock_parent_ip() which tries CALLER_ADDR0,1,2 if the former is a locking function. This seems to record the preempt_disable() caller properly for preempt_disable() itself as well as for get_cpu_var() or local_bh_disable(). Steven asked for the get_parent_ip() -> get_lock_parent_ip() rename. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160226135456.GB18244@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29	sched/rt: Fix PI handling vs. sched_setscheduler()	Peter Zijlstra
	Andrea Parri reported: > I found that the following scenario (with CONFIG_RT_GROUP_SCHED=y) is not > handled correctly: > > T1 (prio = 20) > lock(rtmutex); > > T2 (prio = 20) > blocks on rtmutex (rt_nr_boosted = 0 on T1's rq) > > T1 (prio = 20) > sys_set_scheduler(prio = 0) > [new_effective_prio == oldprio] > T1 prio = 20 (rt_nr_boosted = 0 on T1's rq) > > The last step is incorrect as T1 is now boosted (c.f., rt_se_boosted()); > in particular, if we continue with > > T1 (prio = 20) > unlock(rtmutex) > wakeup(T2) > adjust_prio(T1) > [prio != rt_mutex_getprio(T1)] > dequeue(T1) > rt_nr_boosted = (unsigned long)(-1) > ... > T1 prio = 0 > > then we end up leaving rt_nr_boosted in an "inconsistent" state. > > The simple program attached could reproduce the previous scenario; note > that, as a consequence of the presence of this state, the "assertion" > > WARN_ON(!rt_nr_running && rt_nr_boosted) > > from dec_rt_group() may trigger. So normally we dequeue/enqueue tasks in sched_setscheduler(), which would ensure the accounting stays correct. However in the early PI path we fail to do so. So this was introduced at around v3.14, by: c365c292d059 ("sched: Consider pi boosting in setscheduler()") which fixed another problem exactly because that dequeue/enqueue, joy. Fix this by teaching rt about DEQUEUE_SAVE/ENQUEUE_RESTORE and have it preserve runqueue location with that option. This requires decoupling the on_rt_rq() state from being on the list. In order to allow for explicit movement during the SAVE/RESTORE, introduce {DE,EN}QUEUE_MOVE. We still must use SAVE/RESTORE in these cases to preserve other invariants. Respecting the SAVE/RESTORE flags also has the (nice) side-effect that things like sys_nice()/sys_sched_setaffinity() also do not reorder FIFO tasks (whereas they used to before this patch). Reported-by: Andrea Parri <parri.andrea@gmail.com> Tested-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29	Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵	Ingo Molnar
	applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29	perf: Allow storage of PMU private data in event	Thomas Gleixner
	For PMUs which are not per CPU, but e.g. per package/socket, we want to be able to store a reference to the underlying per package/socket facility in the event at init time so we can avoid magic storage constructs in the PMU driver. This allows us to get rid of the per CPU dance in the intel uncore and RAPL drivers and avoids a lookup of the per package data in the perf hotpath. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Harish Chegondi <harish.chegondi@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20160222221011.364140369@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-29	objtool: Add STACK_FRAME_NON_STANDARD() macro	Josh Poimboeuf
	Add a new macro, STACK_FRAME_NON_STANDARD(), which is used to denote a function which does something unusual related to its stack frame. Use of the macro prevents objtool from emitting a false positive warning. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris J Arges <chris.j.arges@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Pedro Alves <palves@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/34487a17b23dba43c50941599d47054a9584b219.1456719558.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-28	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A rather largish series of 12 patches addressing a maze of race conditions in the perf core code from Peter Zijlstra" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Robustify task_function_call() perf: Fix scaling vs. perf_install_in_context() perf: Fix scaling vs. perf_event_enable() perf: Fix scaling vs. perf_event_enable_on_exec() perf: Fix ctx time tracking by introducing EVENT_TIME perf: Cure event->pending_disable race perf: Fix race between event install and jump_labels perf: Fix cloning perf: Only update context time when active perf: Allow perf_release() with !event->ctx perf: Do not double free perf: Close install vs. exit race
2016-02-27	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: dax: move writeback calls into the filesystems dax: give DAX clearing code correct bdev ext4: online defrag not supported with DAX ext2, ext4: only set S_DAX for regular inodes block: disable block device DAX by default ocfs2: unlock inode if deleting inode from orphan fails mm: ASLR: use get_random_long() drivers: char: random: add get_random_long() mm: numa: quickly fail allocations for NUMA balancing on full nodes mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
2016-02-27	Merge tag 'pci-v4.5-fixes-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "Enumeration: Revert x86 pcibios_alloc_irq() to fix regression (Bjorn Helgaas) Marvell MVEBU host bridge driver: Restrict build to 32-bit ARM (Thierry Reding)" * tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: mvebu: Restrict build to 32-bit ARM Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()" Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed" Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
2016-02-27	dax: move writeback calls into the filesystems	Ross Zwisler
	Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27	dax: give DAX clearing code correct bdev	Ross Zwisler
	dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27	drivers: char: random: add get_random_long()	Daniel Cashman
	Commit d07e22597d1d ("mm: mmap: add new /proc tunable for mmap_base ASLR") added the ability to choose from a range of values to use for entropy count in generating the random offset to the mmap_base address. The maximum value on this range was set to 32 bits for 64-bit x86 systems, but this value could be increased further, requiring more than the 32 bits of randomness provided by get_random_int(), as is already possible for arm64. Add a new function: get_random_long() which more naturally fits with the mmap usage of get_random_int() but operates exactly the same as get_random_int(). Also, fix the shifting constant in mmap_rnd() to be an unsigned long so that values greater than 31 bits generate an appropriate mask without overflow. This is especially important on x86, as its shift instruction uses a 5-bit mask for the shift operand, which meant that any value for mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base randomization. Finally, replace calls to get_random_int() with get_random_long() where appropriate. This patch (of 2): Add get_random_long(). Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27	clocksource: Introduce clocksource_freq2mult()	Alexander Kuleshov
	The clocksource_khz2mult() and clocksource_hz2mult() share similar code wihch calculates a mult from the given frequency. Both implementations in differ only in value of a frequency. This patch introduces the clocksource_freq2mult() helper with generic implementation of mult calculation to prevent code duplication. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Link: http://lkml.kernel.org/r/1456542854-22104-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-26	clk: Make of_clk_get_parent_count() return unsigned ints	Stephen Boyd
	Russell King recently pointed out a bug in the clk-gpio code where it fails to register the clk if of_clk_get_parent_count() returns an error because the "clocks" property isn't present in the DT node. If we're trying to count parents from DT we'd like to know the count, not if there is a "clocks" property or not. Furthermore, some drivers are assigning the return value to their clk_init_data::num_parents member which is unsigned, leading to potentially large numbers of parents when the property isn't present. Let's change the API to return an unsigned int instead of an int. All the callers just want to know the count anyway, and this avoids the bug that was in the clk-gpio driver. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-02-26	dmaengine: mmp-pdma: add number of requestors	Robert Jarzmik
	The DMA chip has a fixed number of requestor lines used for flow control. This number is platform dependent. The pxa_dma dma driver will use this value to activate or not the flow control. There won't be any impact on mmp_pdma driver. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
2016-02-26	cpufreq: Simplify the cpufreq_for_each_valid_entry()	Rafael J. Wysocki
	That macro uses an internal static inline function that is first totally unnecessary and second hard to read, so simplify it and get rid of that monster. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-02-26	drivers/perf: arm_pmu: implement CPU_PM notifier	Lorenzo Pieralisi
	When a CPU is suspended (either through suspend-to-RAM or CPUidle), its PMU registers content can be lost, which means that counters registers values that were initialized on power down entry have to be reprogrammed on power-up to make sure the counters set-up is preserved (ie on power-up registers take the reset values on Cold or Warm reset, which can be architecturally UNKNOWN). To guarantee seamless profiling conditions across a core power down this patch adds a CPU PM notifier to ARM pmus, that upon CPU PM entry/exit from low-power states saves/restores the pmu registers set-up (by using the ARM perf API), so that the power-down/up cycle does not affect the perf behaviour (apart from a black-out period between power-up/down CPU PM notifications that is unavoidable). Cc: Will Deacon <will.deacon@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-26	fbdev: kill fb_rotate	Rasmus Villemoes
	The fb_rotate method in struct fb_ops is never actually invoked, and it's been that way in the entire history of git (in fact, the last occurrence of the string '->fb_rotate' vanished over 10 years ago, with b4d8aea6d6, and that merely tested whether the callback existed). So remove some dead code and make struct fb_obs a little smaller. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2016-02-25	staging/lustre: proper support of NFS anonymous dentries	Dmitry Eremin
	NFS can ask to encode dentries that are not connected to the root. The fix check for parent is NULL and encode a file handle accordingly. Reviewed-on: http://review.whamcloud.com/8347 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4231 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Jian Yu <jian.yu@intel.com> Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Acked-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-25	net: ethtool: remove unused __ethtool_get_settings	David Decotigny
	replaced by __ethtool_get_link_ksettings. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25	net: ethtool: add new ETHTOOL_xLINKSETTINGS API	David Decotigny
	This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25	Merge branch 'libnvdimm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: - Two fixes for compatibility with the ACPI 6.1 specification. Without these fixes multi-interface DIMMs will fail to be probed, and address range scrub commands to find memory errors will give results that the kernel will mis-interpret. For multi-interface DIMMs Linux will accept either the original 6.0 implementation or 6.1. For address range scrub we'll only support 6.1 since ACPI formalized this DSM differently than the original example [1] implemented in v4.2. The expectation is that production systems will only ever ship the ACPI 6.1 address range scrub command definition. - The wider async address range scrub work targeting 4.6 discovered that the original synchronous implementation in 4.5 is not sizing its return buffer correctly. - Arnd caught that my recent fix to the size of the pfn_t flags missed updating the flags variable used in the pmem driver. - Toshi found that we mishandle the memremap() return value in devm_memremap(). * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: use 'u64' for pfn flags devm_memremap: Fix error value when memremap failed nfit: update address range scrub commands to the acpi 6.1 format libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing nfit: fix multi-interface dimm handling, acpi6.1 compatibility
2016-02-25	net: ipv6: Make address flushing on ifdown optional	David Ahern
	Currently, all ipv6 addresses are flushed when the interface is configured down, including global, static addresses: $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 << nothing; all addresses have been flushed>> Add a new sysctl to make this behavior optional. The new setting defaults to flush all addresses to maintain backwards compatibility. When the set global addresses with no expire times are not flushed on an admin down. The sysctl is per-interface or system-wide for all interfaces $ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1 or $ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1 Will keep addresses on eth1 on an admin down. $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000 inet6 2100:1::2/120 scope global tentative valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative valid_lft forever preferred_lft forever Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-26	regmap: add regmap_fields_force_xxx() macros	Kuninori Morimoto
	Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-26	regmap: add regmap_field_force_xxx() macros	Kuninori Morimoto
	Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-25	Merge tag 'for-v4.5-rc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: "Add a regression fix for changed sysfs path of bq27xxx_battery and update MAINTAINERS file" * tag 'for-v4.5-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: bq27xxx_battery: Restore device name MAINTAINERS: update bq27xxx driver
2016-02-25	libata: Align ata_device's id on a cacheline	Harvey Hunt
	The id buffer in ata_device is a DMA target, but it isn't explicitly cacheline aligned. Due to this, adjacent fields can be overwritten with stale data from memory on non coherent architectures. As a result, the kernel is sometimes unable to communicate with an ATA device. Fix this by ensuring that the id buffer is cacheline aligned. This issue is similar to that fixed by Commit 84bda12af31f ("libata: align ap->sector_buf"). Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com> Cc: linux-kernel@vger.kernel.org Cc: <stable@vger.kernel.org> # 2.6.18 Signed-off-by: Tejun Heo <tj@kernel.org>
2016-02-25	PCI: Disable IO/MEM decoding for devices with non-compliant BARs	Bjorn Helgaas
	The PCI config header (first 64 bytes of each device's config space) is defined by the PCI spec so generic software can identify the device and manage its usage of I/O, memory, and IRQ resources. Some non-spec-compliant devices put registers other than BARs where the BARs should be. When the PCI core sizes these "BARs", the reads and writes it does may have unwanted side effects, and the "BAR" may appear to describe non-sensical address space. Add a flag bit to mark non-compliant devices so we don't touch their BARs. Turn off IO/MEM decoding to prevent the devices from consuming address space, since we can't read the BARs to find out what that address space would be. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Andi Kleen <ak@linux.intel.com> CC: stable@vger.kernel.org
2016-02-25	KVM: Use simple waitqueue for vcpu->wq	Marcelo Tosatti
	The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	wait.[ch]: Introduce the simple waitqueue (swait) implementation	Peter Zijlstra (Intel)
	The existing wait queue support has support for custom wake up call backs, wake flags, wake key (passed to call back) and exclusive flags that allow wakers to be tagged as exclusive, for limiting the number of wakers. In a lot of cases, none of these features are used, and hence we can benefit from a slimmed down version that lowers memory overhead and reduces runtime overhead. The concept originated from -rt, where waitqueues are a constant source of trouble, as we can't convert the head lock to a raw spinlock due to fancy and long lasting callbacks. With the removal of custom callbacks, we can use a raw lock for queue list manipulations, hence allowing the simple wait support to be used in -rt. [Patch is from PeterZ which is based on Thomas version. Commit message is written by Paul G. Daniel: - Fixed some compile issues - Added non-lazy implementation of swake_up_locked as suggested by Boqun Feng.] Originally-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-2-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	MIPS: Make smp CMP, CPS and MT use the new generic IPI functions	Qais Yousef
	This commit does several things to avoid breaking bisectability. 1- Remove IPI init code from irqchip/mips-gic 2- Implement the new irqchip->send_ipi() in irqchip/mips-gic 3- Select GENERIC_IRQ_IPI Kconfig symbol for MIPS_GIC 4- Change MIPS SMP to use the generic IPI implementation Only the SMP variants that use GIC were converted as it's the only irqchip that will have the support for generic IPI for now. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-18-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Implement ipi_send_mask/single()	Qais Yousef
	Add APIs to send IPIs from driver and arch code. We have different functions because we allow architecture code to cache the irq descriptor to avoid lookups. Driver code has to use the irq number and is subject to more restrictive checks. [ tglx: Polish the implementation ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-12-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add send_ipi callbacks to irq_chip	Qais Yousef
	Introduce the new callbacks which can be used by the core code to implement a generic IPI send mechanism. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-11-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add a new function to get IPI reverse mapping	Qais Yousef
	When dealing with coprocessors we need to find out the actual hwirqs values to pass on to the firmware so that it knows what it needs to use to receive IPIs from and send IPIs to Linux cpus. [ tglx: Fixed the single hwirq IPI case. The hardware irq number does not change due to the cpu number ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-10-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add a new generic IPI reservation code to irq core	Qais Yousef
	Add a generic mechanism to dynamically allocate an IPI. Depending on the underlying implementation this creates either a single Linux irq or a consective range of Linux irqs. The Linux irq is used later to send IPIs to other CPUs. [ tglx: Massaged the code and removed the 'consecutive mask' restriction for the single IRQ case ] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-9-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Make irq_domain_alloc_descs() non static	Qais Yousef
	We will need to use this function to implement irq_reserve_ipi() later. So make it non static and move the prototype to irqdomain.h to allow using it outside irqdomain.c Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-8-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add ipi_offset to irq_common_data	Qais Yousef
	IPIs are always assumed to be consecutively allocated, hence virqs and hwirqs can be inferred by using CPU id as an offset. But the first cpu doesn't always have to start at offset 0. ipi_offset stores the position of the first cpu so that we can easily calculate the virq or hwirq of an IPI associated with a specific cpu. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-6-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add an extra comment about the use of affinity in irq_common_data	Qais Yousef
	Affinity will have dual meaning depends on the type of the irq. If it is a normal irq, it'll have the standard affinity meaning. If it is an IPI, it will hold the mask of the cpus to which an IPI can be sent. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-7-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add DOMAIN_BUS_IPI	Qais Yousef
	We need a way to search and match IPI domains. Using the new enum we can use irq_find_matching_host() to do that. Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-3-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-25	genirq: Add new IPI irqdomain flags	Qais Yousef
	These flags will be used to identify an IPI domain. We have two flavours of IPI implementations: IRQ_DOMAIN_FLAG_IPI_PER_CPU: Each CPU has its own virq and hwirq IRQ_DOMAIN_FLAG_IPI_SINGLE : A single virq and hwirq for all CPUs Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: <jason@lakedaemon.net> Cc: <marc.zyngier@arm.com> Cc: <jiang.liu@linux.intel.com> Cc: <ralf@linux-mips.org> Cc: <linux-mips@linux-mips.org> Cc: <lisa.parratt@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Link: http://lkml.kernel.org/r/1449580830-23652-2-git-send-email-qais.yousef@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>