linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2014-06-07	arch: tile: kernel: unaligned.c: Cleaning up uninitialized variables	Rickard Strandqvist
	There is a risk that the variable will be used without being initialized. This was largely found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [minor cleanups]
2014-06-07	Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd into next	Linus Torvalds
	Pull exofs raid6 support from Boaz Harrosh: "These simple patches will enable raid6 using the kernel's raid6_pq engine for support under exofs and pnfs-objects. There is nothing needed to do at exofs and pnfs-obj. Just fire your mkfs.exofs with --raid=6 (that was already supported before) and off you go as usual. The ORE will pick up the new map and will start writing two devices of redundancy bits. The patches are so simple because most of the ORE was already for the general raid case, only a few bug fixes were needed and the actual wiring into the raid6_pq engine" * 'for-linus' of git://git.open-osd.org/linux-open-osd: ore: Support for raid 6 ore: Remove redundant dev_order(), more cleanups ore: (trivial) reformat some code
2014-06-08	f2fs: support f2fs_fiemap	Jaegeuk Kim
	This patch links f2fs_fiemap with generic function with get_block. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-06-07	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fix from Chris Mason: "I had this in my 3.16 merge window queue, but it is small and obvious enough for 3.15. I cherry-picked and retested against current rc8" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: send, fix corrupted path strings for long paths
2014-06-07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending	Linus Torvalds
	Pull SCSI target fixes from Nicholas Bellinger: "Here are the remaining fixes for v3.15. This series includes: - iser-target fix for ImmediateData exception reference count bug (Sagi + nab) - iscsi-target fix for MC/S login + potential iser-target MRDSL buffer overrun (Santosh + Roland) - iser-target fix for v3.15-rc multi network portal shutdown regression (nab) - target fix for allowing READ_CAPCITY during ALUA Standby access state (Chris + nab) - target fix for NULL pointer dereference of alua_access_state for un-configured devices (Chris + nab)" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: Fix alua_access_state attribute OOPs for un-configured devices target: Allow READ_CAPACITY opcode in ALUA Standby access state iser-target: Fix multi network portal shutdown regression iscsi-target: Fix wrong buffer / buffer overrun in iscsi_change_param_value() iser-target: Add missing target_put_sess_cmd for ImmedateData failure
2014-06-07	Merge branch 'x86/urgent' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "A significantly larger than I'd like set of patches for just below the wire. All of these, however, fix real problems. The one thing that is genuinely scary in here is the change of SMP initialization, but that does fix a confirmed hang when booting virtual machines. There is also a patch to actually do the right thing about not offlining a CPU when there are not enough interrupt vectors available in the system; the accounting was done incorrectly. The worst case for that patch is that we fail to offline CPUs when we should (the new code is strictly more conservative than the old), so is not particularly risky. Most of the rest is minor stuff; the EFI patches are all about exporting correct information to boot loaders and kexec" * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: EFI_MIXED should not prohibit loading above 4G x86/smpboot: Initialize secondary CPU only if master CPU will wait for it x86/smpboot: Log error on secondary CPU wakeup failure at ERR level x86: Fix list/memory corruption on CPU hotplug x86: irq: Get correct available vectors for cpu disable x86/efi: Do not export efi runtime map in case old map x86/efi: earlyprintk=efi,keep fix
2014-06-07	tools lib traceevent: Added support for __get_bitmask() macro	Steven Rostedt (Red Hat)
	Coming in v3.16, trace events will be able to save bitmasks in raw format in the ring buffer and output it with the __get_bitmask() macro. In order for userspace tools to parse this, it must be able to handle the __get_bitmask() call and be able to convert the data that's in the ring buffer into a nice bitmask format. The output is similar to what the kernel uses to print bitmasks, with a comma separator every 4 bytes (8 characters). This allows for cpumasks to also be saved efficiently. The first user is the thermal:thermal_power_limit event which has the following output: thermal_power_limit: cpus=0000000f freq=1900000 cdev_state=0 power=5252 Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.home Suggested-by: Javi Merino <javi.merino@arm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Javi Merino <javi.merino@arm.com> Link: http://lkml.kernel.org/r/20140603032224.229186537@goodmis.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-07	tools lib traceevent: Add options to function plugin	Steven Rostedt (Red Hat)
	Add the options "parent" and "indent" to the function plugin. When parent is set, the output looks like this: function: fsnotify_modify <-- vfs_write function: zone_statistics <-- get_page_from_freelist function: __inc_zone_state <-- zone_statistics function: inotify_inode_queue_event <-- fsnotify_modify function: fsnotify_parent <-- fsnotify_modify function: __inc_zone_state <-- zone_statistics function: __fsnotify_parent <-- fsnotify_parent function: inotify_dentry_parent_queue_event <-- fsnotify_parent function: add_to_page_cache_lru <-- do_read_cache_page When it's not set, it looks like: function: fsnotify_modify function: zone_statistics function: __inc_zone_state function: inotify_inode_queue_event function: fsnotify_parent function: __inc_zone_state function: __fsnotify_parent function: inotify_dentry_parent_queue_event function: add_to_page_cache_lru When the otpion "indent" is not set, it looks like this: function: fsnotify_modify <-- vfs_write function: zone_statistics <-- get_page_from_freelist function: __inc_zone_state <-- zone_statistics function: inotify_inode_queue_event <-- fsnotify_modify function: fsnotify_parent <-- fsnotify_modify function: __inc_zone_state <-- zone_statistics function: __fsnotify_parent <-- fsnotify_parent function: inotify_dentry_parent_queue_event <-- fsnotify_parent function: add_to_page_cache_lru <-- do_read_cache_page Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20140603032224.056940410@goodmis.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-07	tools lib traceevent: Add options to plugins	Steven Rostedt
	The traceevent plugins allows developers to have their events print out information that is more advanced than what can be achieved by the trace event format files. As these plugins are used on the userspace side of the tracing tools, it is only logical that the tools should be able to produce different types of output for the events. The types of events still need to be defined by the plugins thus we need a way to pass information from the tool to the plugin to specify what type of information to be shown. Not only does the information need to be passed by the tool to plugin, but the plugin also requires a way to notify the tool of what options it can provide. This builds the plugin option infrastructure that is taken from trace-cmd that is used to allow plugins to produce different output based on the options specified by the tool. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20140603184154.0a4c031c@gandalf.local.home Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-07	tools lib traceevent: Add flag to not load event plugins	Steven Rostedt (Red Hat)
	Add a flag to pevent that will let the callers be able to set it and keep the system, and perhaps even normal plugins from being loaded. This is useful when plugins might hide certain information and seeing the raw events shows what may be going on. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20140603032223.678098063@goodmis.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-08	ceph: use truncate_pagecache() instead of truncate_inode_pages()	Yan, Zheng
	Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-07	cpufreq: governor: Be friendly towards latency-sensitive bursty workloads	Srivatsa S. Bhat
	Cpufreq governors like the ondemand governor calculate the load on the CPU periodically by employing deferrable timers. A deferrable timer won't fire if the CPU is completely idle (and there are no other timers to be run), in order to avoid unnecessary wakeups and thus save CPU power. However, the load calculation logic is agnostic to all this, and this can lead to the problem described below. Time (ms) CPU 1 100 Task-A running 110 Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency. 110.5 Task-A running 120 Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency. 125 Task-A went to sleep. With nothing else to do, CPU 1 went completely idle. 200 Task-A woke up and started running again. 200.5 Governor's deferred timer (which was originally programmed to fire at time 130) fires now. It calculates load for the time period 120 to 200.5, and finds the load is almost zero. Hence it decreases the CPU frequency to the minimum. 210 Governor's timer fires, finds load as 100% in the last 10ms interval and increases the CPU frequency. So, after the workload woke up and started running, the frequency was suddenly dropped to absolute minimum, and after that, there was an unnecessary delay of 10ms (sampling period) to increase the CPU frequency back to a reasonable value. And this pattern repeats for every wake-up-from-cpu-idle for that workload. This can be quite undesirable for latency- or response-time sensitive bursty workloads. So we need to fix the governor's logic to detect such wake-up-from- cpu-idle scenarios and start the workload at a reasonably high CPU frequency. One extreme solution would be to fake a load of 100% in such scenarios. But that might lead to undesirable side-effects such as frequency spikes (which might also need voltage changes) especially if the previous frequency happened to be very low. We just want to avoid the stupidity of dropping down the frequency to a minimum and then enduring a needless (and long) delay before ramping it up back again. So, let us simply carry forward the previous load - that is, let us just pretend that the 'load' for the current time-window is the same as the load for the previous window. That way, the frequency and voltage will continue to be set to whatever values they were set at previously. This means that bursty workloads will get a chance to influence the CPU frequency at which they wake up from cpu-idle, based on their past execution history. Thus, they might be able to avoid suffering from slow wakeups and long response-times. However, we should take care not to over-do this. For example, such a "copy previous load" logic will benefit cases like this: (where # represents busy and . represents idle) ##########.........#########.........###########...........##########........ but it will be detrimental in cases like the one shown below, because it will retain the high frequency (copied from the previous interval) even in a mostly idle system: ##########.........#.................#.....................#............... (i.e., the workload finished and the remaining tasks are such that their busy periods are smaller than the sampling interval, which causes the timer to always get deferred. So, this will make the copy-previous-load logic copy the initial high load to subsequent idle periods over and over again, thus keeping the frequency high unnecessarily). So, we modify this copy-previous-load logic such that it is used only once upon every wakeup-from-idle. Thus if we have 2 consecutive idle periods, the previous load won't get blindly copied over; cpufreq will freshly evaluate the load in the second idle interval, thus ensuring that the system comes back to its normal state. [ The right way to solve this whole problem is to teach the CPU frequency governors to also track load on a per-task basis, not just a per-CPU basis, and then use both the data sources intelligently to set the appropriate frequency on the CPUs. But that involves redesigning the cpufreq subsystem, so this patch should make the situation bearable until then. ] Experimental results: +-------------------+ I ran a modified version of ebizzy (called 'sleeping-ebizzy') that sleeps in between its execution such that its total utilization can be a user-defined value, say 10% or 20% (higher the utilization specified, lesser the amount of sleeps injected). This ebizzy was run with a single-thread, tied to CPU 8. Behavior observed with tracing (sample taken from 40% utilization runs): ------------------------------------------------------------------------ Without patch: ~~~~~~~~~~~~~~ kworker/8:2-12137 416.335742: cpu_frequency: state=2061000 cpu_id=8 kworker/8:2-12137 416.335744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40753 416.345741: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-12137 416.345744: cpu_frequency: state=4123000 cpu_id=8 kworker/8:2-12137 416.345746: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40753 416.355738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 <snip> --------------------------------------------------------------------- <snip> <...>-40753 416.402202: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8 <idle>-0 416.502130: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy <...>-40753 416.505738: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-12137 416.505739: cpu_frequency: state=2061000 cpu_id=8 kworker/8:2-12137 416.505741: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40753 416.515739: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-12137 416.515742: cpu_frequency: state=4123000 cpu_id=8 kworker/8:2-12137 416.515744: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy Observation: Ebizzy went idle at 416.402202, and started running again at 416.502130. But cpufreq noticed the long idle period, and dropped the frequency at 416.505739, only to increase it back again at 416.515742, realizing that the workload is in-fact CPU bound. Thus ebizzy needlessly ran at the lowest frequency for almost 13 milliseconds (almost 1 full sample period), and this pattern repeats on every sleep-wakeup. This could hurt latency-sensitive workloads quite a lot. With patch: ~~~~~~~~~~~ kworker/8:2-29802 464.832535: cpu_frequency: state=2061000 cpu_id=8 <snip> --------------------------------------------------------------------- <snip> kworker/8:2-29802 464.962538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40738 464.972533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-29802 464.972536: cpu_frequency: state=4123000 cpu_id=8 kworker/8:2-29802 464.972538: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40738 464.982531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 <snip> --------------------------------------------------------------------- <snip> kworker/8:2-29802 465.022533: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40738 465.032531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-29802 465.032532: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40738 465.035797: sched_switch: prev_comm=ebizzy ==> next_comm=swapper/8 <idle>-0 465.240178: sched_switch: prev_comm=swapper/8 ==> next_comm=ebizzy <...>-40738 465.242533: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 kworker/8:2-29802 465.242535: sched_switch: prev_comm=kworker/8:2 ==> next_comm=ebizzy <...>-40738 465.252531: sched_switch: prev_comm=ebizzy ==> next_comm=kworker/8:2 Observation: Ebizzy went idle at 465.035797, and started running again at 465.240178. Since ebizzy was the only real workload running on this CPU, cpufreq retained the frequency at 4.1Ghz throughout the run of ebizzy, no matter how many times ebizzy slept and woke-up in-between. Thus, ebizzy got the 10ms worth of 4.1 Ghz benefit during every sleep-wakeup (as compared to the run without the patch) and this boost gave a modest improvement in total throughput, as shown below. Sleeping-ebizzy records-per-second: ----------------------------------- Utilization Without patch With patch Difference (Absolute and % values) 10% 274767 277046 + 2279 (+0.829%) 20% 543429 553484 + 10055 (+1.850%) 40% 1090744 1107959 + 17215 (+1.578%) 60% 1634908 1662018 + 27110 (+1.658%) A rudimentary and somewhat approximately latency-sensitive workload such as sleeping-ebizzy itself showed a consistent, noticeable performance improvement with this patch. Hence, workloads that are truly latency-sensitive will benefit quite a bit from this change. Moreover, this is an overall win-win since this patch does not hurt power-savings at all (because, this patch does not reduce the idle time or idle residency; and the high frequency of the CPU when it goes to cpu-idle does not affect/hurt the power-savings of deep idle states). Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-06-07	arm, unwind, LLVMLinux: Enable clang to be used for unwinding the stack	Mark Charlebois
	Patch to prevent warning of a buggy compiler when using clang and the ARM_UNWIND option. Clang defines (at least on the current trunk) GNUC, GNUC_MINOR, and GNUC_PATCHLEVEL to 4, 2, and 1 respectively. This version of GCC gets flagged as buggy, but it isn't actually an issue with clang so the patch will do what it did before unless clang is defined and then it will not report the GCC version as an issue. Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com>
2014-06-07	ARM: LLVMLinux: Change "extern inline" to "static inline" in glue-cache.h	Behan Webster
	With compilers which follow the C99 standard (like modern versions of gcc and clang), "extern inline" does the wrong thing (emits code for an externally linkable version of the inline function). "static inline" is the correct choice instead. Author: Behan Webster <behanw@converseincode.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Reviewed-by: Mark Charlebois <charlebm@gmail.com>
2014-06-07	all: LLVMLinux: Change DWARF flag to support gcc and clang	Behan Webster
	Both gcc (well, actually gnu as) and clang support the "-Wa,-gdwarf-2" option (though clang does not support "-Wa,--gdwarf-2"). Since these flags are equivalent in meaning, this patch uses the one which is better supported across compilers. Signed-off-by: Behan Webster <behanw@converseincode.com>
2014-06-07	net: netfilter: LLVMLinux: vlais-netfilter	Mark Charlebois
	Replaced non-standard C use of Variable Length Arrays In Structs (VLAIS) in xt_repldata.h with a C99 compliant flexible array member and then calculated offsets to the other struct members. These other members aren't referenced by name in this code, however this patch maintains the same memory layout and padding as was previously accomplished using VLAIS. Had the original structure been ordered differently, with the entries VLA at the end, then it could have been a flexible member, and this patch would have been a lot simpler. However since the data stored in this structure is ultimately exported to userspace, the order of this structure can't be changed. This patch makes no attempt to change the existing behavior, merely the way in which the current layout is accomplished using standard C99 constructs. As such the code can now be compiled with either gcc or clang. This version of the patch removes the trailing alignment that the VLAIS structure would allocate in order to simplify the patch. Author: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Vinícius Tinti <viniciustinti@gmail.com>
2014-06-07	crypto: LLVMLinux: aligned-attribute.patch	Mark Charlebois
	__attribute__((aligned)) applies the default alignment for the largest scalar type for the target ABI. gcc allows it to be applied inline to a defined type. Clang only allows it to be applied to a type definition (PR11071). Making it into 2 lines makes it more readable and works with both compilers. Author: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com>
2014-06-07	x86/boot: EFI_MIXED should not prohibit loading above 4G	Matt Fleming
	commit 7d453eee36ae ("x86/efi: Wire up CONFIG_EFI_MIXED") introduced a regression for the functionality to load kernels above 4G. The relevant (incorrect) reasoning behind this change can be seen in the commit message, "The xloadflags field in the bzImage header is also updated to reflect that the kernel supports both entry points by setting both of XLF_EFI_HANDOVER_32 and XLF_EFI_HANDOVER_64 when CONFIG_EFI_MIXED=y. XLF_CAN_BE_LOADED_ABOVE_4G is disabled so that the kernel text is guaranteed to be addressable with 32-bits." This is obviously bogus since 32-bit EFI loaders will never place the kernel above the 4G mark. So this restriction is entirely unnecessary. But things are worse than that - since we want to encourage people to always compile with CONFIG_EFI_MIXED=y so that their kernels work out of the box for both 32-bit and 64-bit firmware, commit 7d453eee36ae effectively disables XLF_CAN_BE_LOADED_ABOVE_4G completely. Remove the overzealous and superfluous restriction and restore the XLF_CAN_BE_LOADED_ABOVE_4G functionality. Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1402140380-15377-1-git-send-email-matt@console-pimps.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-07	rtmutex: Detect changes in the pi lock chain	Thomas Gleixner
	When we walk the lock chain, we drop all locks after each step. So the lock chain can change under us before we reacquire the locks. That's harmless in principle as we just follow the wrong lock path. But it can lead to a false positive in the dead lock detection logic: T0 holds L0 T0 blocks on L1 held by T1 T1 blocks on L2 held by T2 T2 blocks on L3 held by T3 T4 blocks on L4 held by T4 Now we walk the chain lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> drop locks T2 times out and blocks on L0 Now we continue: lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all. Brad tried to work around that in the deadlock detection logic itself, but the more I looked at it the less I liked it, because it's crystal ball magic after the fact. We actually can detect a chain change very simple: lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> next_lock = T2->pi_blocked_on->lock; drop locks T2 times out and blocks on L0 Now we continue: lock T2 -> if (next_lock != T2->pi_blocked_on->lock) return; So if we detect that T2 is now blocked on a different lock we stop the chain walk. That's also correct in the following scenario: lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> next_lock = T2->pi_blocked_on->lock; drop locks T3 times out and drops L3 T2 acquires L3 and blocks on L4 now Now we continue: lock T2 -> if (next_lock != T2->pi_blocked_on->lock) return; We don't have to follow up the chain at that point, because T2 propagated our priority up to T4 already. [ Folded a cleanup patch from peterz ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Brad Mouring <bmouring@ni.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de Cc: stable@vger.kernel.org
2014-06-07	rtmutex: Handle deadlock detection smarter	Thomas Gleixner
	Even in the case when deadlock detection is not requested by the caller, we can detect deadlocks. Right now the code stops the lock chain walk and keeps the waiter enqueued, even on itself. Silly not to yell when such a scenario is detected and to keep the waiter enqueued. Return -EDEADLK unconditionally and handle it at the call sites. The futex calls return -EDEADLK. The non futex ones dequeue the waiter, throw a warning and put the task into a schedule loop. Tagged for stable as it makes the code more robust. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brad Mouring <bmouring@ni.com> Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-06-07	iio: Fix endianness issue in ak8975_read_axis()	Peter Meerwald
	i2c_smbus_read_word_data() does host endian conversion already, no need for le16_to_cpu() Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Cc: Stable@vger.kernel.org
2014-06-07	staging/iio: IIO_SIMPLE_DUMMY_BUFFER neds IIO_BUFFER	Arnd Bergmann
	An obviously missing 'select' statement, without this we get a link error Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2014-06-06	tracing: Fix memory leak on instance deletion	Steven Rostedt (Red Hat)
	When an instance is created, it also gets a snapshot ring buffer allocated (with minimum of pages). But when it is deleted the snapshot buffer is not. There was a helper function added to match the allocation of these ring buffers to a way to free them, but it wasn't used by the deletion of an instance. Using that helper function solves this memory leak. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06	net: qmi_wwan: add Olivetti Olicard modems	Bjørn Mork
	Lars writes: "I'm only 99% sure that the net interfaces are qmi interfaces, nothing to lose by adding them in my opinion." And I tend to agree based on the similarity with the two Olicard modems we already have here. Reported-by: Lars Melin <larsm17@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-06	Merge branch 'akpm' (patches from Andrew Morton) into next	Linus Torvalds
	Merge more updates from Andrew Morton: - Most of the rest of MM. This includes "mark remap_file_pages syscall as deprecated" but the actual "replace remap_file_pages syscall with emulation" is held back. I guess we'll need to work out when to pull the trigger on that one. - various minor cleanups to obscure filesystems - the drivers/rtc queue - hfsplus updates - ufs, hpfs, fatfs, affs, reiserfs - Documentation/ - signals - procfs - cpu hotplug - lib/idr.c - rapidio - sysctl - ipc updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (171 commits) ufs: sb mutex merge + mutex_destroy powerpc: update comments for generic idle conversion cris: update comments for generic idle conversion idle: remove cpu_idle() forward declarations nbd: zero from and len fields in NBD_CMD_DISCONNECT. mm: convert some level-less printks to pr_* MAINTAINERS: adi-buildroot-devel is moderated MAINTAINERS: add linux-api for review of API/ABI changes mm/kmemleak-test.c: use pr_fmt for logging fs/dlm/debug_fs.c: replace seq_printf by seq_puts fs/dlm/lockspace.c: convert simple_str to kstr fs/dlm/config.c: convert simple_str to kstr mm: mark remap_file_pages() syscall as deprecated mm: memcontrol: remove unnecessary memcg argument from soft limit functions mm: memcontrol: clean up memcg zoneinfo lookup mm/memblock.c: call kmemleak directly from memblock_(alloc\|free) mm/mempool.c: update the kmemleak stack trace for mempool allocations lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations mm: introduce kmemleak_update_trace() mm/kmemleak.c: use %u to print ->checksum ...
2014-06-06	mac802154: llsec: add forgotten list_del_rcu in key removal	Phoebe Buckheister
	During key removal, the key object is freed, but not taken out of the llsec key list properly. Fix that. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-06	net: use ethtool_cmd_speed_set helper to set ethtool speed value	Jiri Pirko
	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-06	net: use SPEED_UNKNOWN and DUPLEX_UNKNOWN when appropriate	Jiri Pirko
	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-06	svcrdma: Fence LOCAL_INV work requests	Steve Wise
	Fencing forces the invalidate to only happen after all prior send work requests have been completed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reported by : Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	svcrdma: refactor marshalling logic	Steve Wise
	This patch refactors the NFSRDMA server marshalling logic to remove the intermediary map structures. It also fixes an existing bug where the NFSRDMA server was not minding the device fast register page list length limitations. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2014-06-06	nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry	Jeff Layton
	Currently, the DRC cache pruner will stop scanning the list when it hits an entry that is RC_INPROG. It's possible however for a call to take a very long time. In that case, we don't want it to block other entries from being pruned if they are expired or we need to trim the cache to get back under the limit. Fix the DRC cache pruner to just ignore RC_INPROG entries. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	nfs4: remove unused CHANGE_SECURITY_LABEL	J. Bruce Fields
	This constant has the wrong value. And we don't use it. And it's been removed from the 4.2 spec anyway. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	nfsd4: kill READ64	J. Bruce Fields
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	nfsd4: kill READ32	J. Bruce Fields
	While we're here, let's kill off a couple of the read-side macros. Leaving the more complicated ones alone for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	nfsd4: simplify server xdr->next_page use	J. Bruce Fields
	The rpc code makes available to the NFS server an array of pages to encod into. The server represents its reply as an xdr buf, with the head pointing into the first page in that array, the pages ** array starting just after that, and the tail (if any) sharing any leftover space in the page used by the head. While encoding, we use xdr_stream->page_ptr to keep track of which page we're currently using. Currently we set xdr_stream->page_ptr to buf->pages, which makes the head a weird exception to the rule that page_ptr always points to the page we're currently encoding into. So, instead set it to buf->pages - 1 (the page actually containing the head), and remove the need for a little unintuitive logic in xdr_get_next_encode_buffer() and xdr_truncate_encode. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06	ufs: sb mutex merge + mutex_destroy	Fabian Frederick
	Commit 788257d6101d ("ufs: remove the BKL") replaced BKL with mutex protection using functions lock_ufs, unlock_ufs and struct mutex 'mutex' in sb_info. Commit b6963327e052 ("ufs: drop lock/unlock super") removed lock/unlock super and added struct mutex 's_lock' in sb_info. Those 2 mutexes are generally locked/unlocked at the same time except in allocation (balloc, ialloc). This patch merges the 2 mutexes and propagates first commit solution. It also adds mutex destruction before kfree during ufs_fill_super failure and ufs_put_super. [akpm@linux-foundation.org: avoid ifdefs, return -EROFS not -EINVAL] Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: "Chen, Jet" <jet.chen@intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	powerpc: update comments for generic idle conversion	Geert Uytterhoeven
	As of commit 799fef06123f ("powerpc: Use generic idle loop"), this applies to arch_cpu_idle() instead of cpu_idle(). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	cris: update comments for generic idle conversion	Geert Uytterhoeven
	As of commit 8dc7c5ecd8d0 ("cris: Use generic idle loop"), cris no longer provides cpu_idle(). - On cris-v10, etrax_gpio_wake_up_check() is called from default_idle() instead of cpu_idle(), - On cris-v32, etrax_gpio_wake_up_check() is not called from default_idle(), so remove this (copy-and-paste?) part. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	idle: remove cpu_idle() forward declarations	Geert Uytterhoeven
	After all architectures were converted to the generic idle framework, commit d190e8195b90 ("idle: Remove GENERIC_IDLE_LOOP config switch") removed the last caller of cpu_idle(). The forward declarations in header files were forgotten. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	nbd: zero from and len fields in NBD_CMD_DISCONNECT.	Hani Benhabiles
	Len field is already set to zero, but not the from field which is sent as 0xfffffffffffffe00. This makes no sense, and may cause confuse server implementations doing sanity checks (qemu-nbd is an example.) Signed-off-by: Hani Benhabiles <hani@linux.com> Cc: Paul Clements <paul.clements@us.sios.com> Cc: Paul Clements <Paul.Clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	mm: convert some level-less printks to pr_*	Mitchel Humpherys
	printk is meant to be used with an associated log level. There are some instances of printk scattered around the mm code where the log level is missing. Add a log level and adhere to suggestions by scripts/checkpatch.pl by moving to the pr_* macros. Also add the typical pr_fmt definition so that print statements can be easily traced back to the modules where they occur, correlated one with another, etc. This will require the removal of some (now redundant) prefixes on a few print statements. Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	MAINTAINERS: adi-buildroot-devel is moderated	Richard Weinberger
	Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	MAINTAINERS: add linux-api for review of API/ABI changes	Josh Triplett
	This makes it more likely that patch submitters will CC API/ABI changes to the linux-api list, and tools like get_maintainer.pl will do so automatically. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Michael Kerrisk <mtk.man-pages@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	mm/kmemleak-test.c: use pr_fmt for logging	Fabian Frederick
	Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	fs/dlm/debug_fs.c: replace seq_printf by seq_puts	Fabian Frederick
	Replace seq_printf where possible. This patch also fixes the following checkpatch warning "unnecessary whitespace before a quoted newline" Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	fs/dlm/lockspace.c: convert simple_str to kstr	Fabian Frederick
	Replace obsolete functions. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	fs/dlm/config.c: convert simple_str to kstr	Fabian Frederick
	Replace obsolete functions simple_strtoul/kstrtouint simple_strtol/kstrtoint (kstr __must_check requires the right function to be applied) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	mm: mark remap_file_pages() syscall as deprecated	Kirill A. Shutemov
	The remap_file_pages() system call is used to create a nonlinear mapping, that is, a mapping in which the pages of the file are mapped into a nonsequential order in memory. The advantage of using remap_file_pages() over using repeated calls to mmap(2) is that the former approach does not require the kernel to create additional VMA (Virtual Memory Area) data structures. Supporting of nonlinear mapping requires significant amount of non-trivial code in kernel virtual memory subsystem including hot paths. Also to get nonlinear mapping work kernel need a way to distinguish normal page table entries from entries with file offset (pte_file). Kernel reserves flag in PTE for this purpose. PTE flags are scarce resource especially on some CPU architectures. It would be nice to free up the flag for other usage. Fortunately, there are not many users of remap_file_pages() in the wild. It's only known that one enterprise RDBMS implementation uses the syscall on 32-bit systems to map files bigger than can linearly fit into 32-bit virtual address space. This use-case is not critical anymore since 64-bit systems are widely available. The plan is to deprecate the syscall and replace it with an emulation. The emulation will create new VMAs instead of nonlinear mappings. It's going to work slower for rare users of remap_file_pages() but ABI is preserved. One side effect of emulation (apart from performance) is that user can hit vm.max_map_count limit more easily due to additional VMAs. See comment for DEFAULT_MAX_MAP_COUNT for more details on the limit. [akpm@linux-foundation.org: fix spello] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Armin Rigo <arigo@tunes.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	mm: memcontrol: remove unnecessary memcg argument from soft limit functions	Johannes Weiner
	Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06	mm: memcontrol: clean up memcg zoneinfo lookup	Jianyu Zhan
	Memcg zoneinfo lookup sites have either the page, the zone, or the node id and zone index, but sites that only have the zone have to look up the node id and zone index themselves, whereas sites that already have those two integers use a function for a simple pointer chase. Provide mem_cgroup_zone_zoneinfo() that takes a zone pointer and let sites that already have node id and zone index - all for each node, for each zone iterators - use &memcg->nodeinfo[nid]->zoneinfo[zid]. Rename page_cgroup_zoneinfo() to mem_cgroup_page_zoneinfo() to match. Signed-off-by: Jianyu Zhan <nasa4836@gmail.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>