linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-07-03	Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm	Linus Torvalds
	Pull ARM updates from Russell King: "This contains the usual updates from other people (listed below) and the usual random muddle of miscellaneous ARM updates which cover some low priority bug fixes and performance improvements. I've started to put the pull request wording into the merge commits, which are: - NoMMU stuff: This includes the following series sent earlier to the list: - nommu-fixes - R7 Support - MPU support I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I were discussing today until we've reached a conclusion/that's had some more review. This is rebased (and re-tested) on your devel-stable branch because otherwise there were going to be conflicts with Uwe's V7M work now that you've merged that. I've included the fix for limiting MPU to CPU_V7. - Huge page support These changes bring both HugeTLB support and Transparent HugePage (THP) support to ARM. Only long descriptors (LPAE) are supported in this series. The code has been tested on an Arndale board (Exynos 5250). - LPAE updates Please pull these miscellaneous LPAE fixes I've been collecting for a while now for 3.11. They've been tested and reviewed by quite a few people, and most of the patches are pretty trivial. -- Will Deacon. - arch_timer cleanups Please pull these arch_timer cleanups I've been holding onto for a while. They're the same as my last posting, but have been rebased to v3.10-rc3. - mpidr linearisation (multiprocessor id register - identifies which CPU number we are in the system) This patch series that implements MPIDR linearization through a simple hashing algorithm and updates current cpu_{suspend}/{resume} code to use the newly created hash structures to retrieve context pointers. It represents a stepping stone for the implementation of power management code on forthcoming multi-cluster ARM systems. It has been tested on TC2 (dual cluster A15xA7 system), iMX6q, OMAP4 and Tegra, with processors hitting low-power states requiring warm-boot resume through the cpu_resume code path" * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits) ARM: 7775/1: mm: Remove do_sect_fault from LPAE code ARM: 7777/1: Avoid extra calls to the C compiler ARM: 7774/1: Fix dtb dependency to use order-only prerequisites ARM: 7770/1: remove residual ARMv2 support from decompressor ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator ARM: 7767/1: let the ASID allocator handle suspended animation ARM: 7766/1: versatile: don't mark pen as __INIT ARM: 7765/1: perf: Record the user-mode PC in the call chain. ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork ARM: kernel: implement stack pointer save array through MPIDR hashing ARM: kernel: build MPIDR hash function data structure ARM: mpu: Ensure that MPU depends on CPU_V7 ARM: mpu: protect the vectors page with an MPU region ARM: mpu: Allow enabling of the MPU via kconfig ARM: 7758/1: introduce config HAS_BANDGAP ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting ARM: 7751/1: zImage: don't overwrite ourself with a page table ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace ...
2013-07-03	HID: kye: Add report fixup for Genius Gila Gaming mouse	Benjamin Tissoires
	Genius Gila Gaming Mouse presents an obviously wrong report descriptor. the Consumer control (report ID 3) is the following: 0x05, 0x0c, // Usage Page (Consumer Devices) 105 0x09, 0x01, // Usage (Consumer Control) 107 0xa1, 0x01, // Collection (Application) 109 0x85, 0x03, // Report ID (3) 111 0x19, 0x00, // Usage Minimum (0) 113 0x2a, 0xff, 0x7f, // Usage Maximum (32767) 115 0x15, 0x00, // Logical Minimum (0) 118 0x26, 0xff, 0x7f, // Logical Maximum (32767) 120 0x75, 0x10, // Report Size (16) 123 0x95, 0x03, // Report Count (3) 125 0x81, 0x00, // Input (Data,Arr,Abs) 127 0x75, 0x08, // Report Size (8) 129 0x95, 0x01, // Report Count (1) 131 0x81, 0x01, // Input (Cnst,Arr,Abs) 133 0xc0, // End Collection 135 So the first input whithin this report has a count of 3 but a usage range of 32768. So this value is obviously wrong as it should not be greater than the report count. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=959721 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-07-03	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second set of VFS changes from Al Viro: "Assorted f_pos race fixes, making do_splice_direct() safe to call with i_mutex on parent, O_TMPFILE support, Jeff's locks.c series, ->d_hash/->d_compare calling conventions changes from Linus, misc stuff all over the place." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) Document ->tmpfile() ext4: ->tmpfile() support vfs: export lseek_execute() to modules lseek_execute() doesn't need an inode passed to it block_dev: switch to fixed_size_llseek() cpqphp_sysfs: switch to fixed_size_llseek() tile-srom: switch to fixed_size_llseek() proc_powerpc: switch to fixed_size_llseek() ubi/cdev: switch to fixed_size_llseek() pci/proc: switch to fixed_size_llseek() isapnp: switch to fixed_size_llseek() lpfc: switch to fixed_size_llseek() locks: give the blocked_hash its own spinlock locks: add a new "lm_owner_key" lock operation locks: turn the blocked_list into a hashtable locks: convert fl_link to a hlist_node locks: avoid taking global lock if possible when waking up blocked waiters locks: protect most of the file_lock handling with i_lock locks: encapsulate the fl_link list handling locks: make "added" in __posix_lock_file a bool ...
2013-07-03	posix_timers: fix racy timer delta caching on task exit	Frederic Weisbecker
	When a task exits, we perform a caching of the remaining cputime delta before expiring of its timers. This is done from the following places: * When the task is reaped. We iterate through its list of posix cpu timers and store the remaining timer delta to the timer struct instead of the absolute value. (See posix_cpu_timers_exit() / posix_cpu_timers_exit_group() ) * When we call posix_cpu_timer_get() or posix_cpu_timer_schedule(). If the timer's task is considered dying when watched from these places, the same conversion from absolute to relative expiry time is performed. Then the given task's reference is released. (See clear_dead_task() ). The relevance of this caching is questionable but this is another and deeper debate. The big issue here is that these two sources of caching don't mix up very well together. More specifically, the caching can easily be done twice, resulting in a wrong delta as it gets spuriously substracted a second time by the elapsed clock. This can happen in the following scenario: 1) The task exits and gets reaped: we call posix_cpu_timers_exit() and the absolute timer expiry values are converted to a relative delta. 2) timer_gettime() -> posix_cpu_timer_get() is called and relies on clear_dead_task() because tsk->exit_state == EXIT_DEAD. The delta gets substracted again by the elapsed clock and we return a wrong result. To fix this, just remove the caching done on task reaping time. It doesn't bring much value on its own. The caching done from posix_cpu_timer_get/schedule is enough. And it would also be hard to get it really right: we could make it put and clear the target task in the timer struct so that readers know if they are dealing with a relative cached of absolute value. But it would be racy. The only safe way to do it would be to lock the itimer->it_lock so that we know nobody reads the cputime expiry value while we modify it and its target task reference. Doing so would involve some funny workarounds to avoid circular lock against the sighand lock. There is just no reason to maintain this. The user visible effect of this patch can be observed by running the following code: it creates a subthread that launches a posix cputimer which expires after 10 seconds. But then the subthread only busy loops for 2 seconds and exits. The parent reaps the subthread and read the timer value. Its expected value should the be the initial timer's expiration value minus the cputime elapsed in the subthread. Roughly 10 - 2 = 8 seconds: #include <sys/time.h> #include <stdio.h> #include <unistd.h> #include <time.h> #include <pthread.h> static timer_t id; static struct itimerspec val = { .it_value.tv_sec = 10, }, new; static void thread(void unused) { int err; struct timeval start, end, diff; timer_create(CLOCK_THREAD_CPUTIME_ID, NULL, &id); if (err < 0) { perror("Can't create timer\n"); return NULL; } /* Arm 10 sec timer / err = timer_settime(id, 0, &val, NULL); if (err < 0) { perror("Can't set timer\n"); return NULL; } / Exit after 2 seconds of execution / gettimeofday(&start, NULL); do { gettimeofday(&end, NULL); timersub(&end, &start, &diff); } while (diff.tv_sec < 2); return NULL; } int main(int argc, char argv) { pthread_t pthread; int err; err = pthread_create(&pthread, NULL, thread, NULL); if (err) { perror("Can't create thread\n"); return -1; } pthread_join(pthread, NULL); / Just wait a little bit to make sure the child got reaped */ sleep(1); err = timer_gettime(id, &new); if (err) perror("Can't get timer value\n"); printf("%d %ld\n", new.it_value.tv_sec, new.it_value.tv_nsec); return 0; } Before the patch: $ ./posix_cpu_timers 6 2278074 After the patch: $ ./posix_cpu_timers 8 1158766 Before the patch, the elapsed time got two more seconds spuriously accounted. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	posix-timers: correctly get dying task time sample in posix_cpu_timer_schedule()	Frederic Weisbecker
	In order to re-arm a timer after it fired, we take a sample of the current process or thread cputime. If the task is dying though, we don't arm anything but we cache the remaining timer expiration delta for further reads. Something similar is performed in posix_cpu_timer_get() but here we forget to take the process wide cputime sample before caching it. As a result we are storing random stack content, leading every further reads of that timer to return junk values. Fix this by taking the appropriate sample in the case of process wide timers. This probably doesn't matter much in practice because, at this stage, the thread is the last one in the group and we reached exit_notify(). This implies that we called exit_itimers() and there should be no more timers to handle for that task. So this is likely dead code anyway but let's fix the current logic and the warning that came along: kernel/posix-cpu-timers.c: In function 'posix_cpu_timer_schedule': kernel/posix-cpu-timers.c:1127: warning: 'now' may be used uninitialized in this function Then we can start to think further about cleaning up that code. Reported-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Chen Gang <gang.chen@asianux.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	selftests: add basic posix timers selftests	Frederic Weisbecker
	Add some initial basic tests on a few posix timers interface such as setitimer() and timer_settime(). These simply check that expiration happens in a reasonable timeframe after expected elapsed clock time (user time, user + system time, real time, ...). This is helpful for finding basic breakages while hacking on this subsystem. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	posix_cpu_timers: consolidate expired timers check	Frederic Weisbecker
	Consolidate the common code amongst per thread and per process timers list on tick time. List traversal, expiry check and subsequent updates can be shared in a common helper. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	posix_cpu_timers: consolidate timer list cleanups	Frederic Weisbecker
	Cleaning up the posix cpu timers on task exit shares some common code among timer list types, most notably the list traversal and expiry time update. Unify this in a common helper. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	posix_cpu_timer: consolidate expiry time type	Frederic Weisbecker
	The posix cpu timer expiry time is stored in a union of two types: a 64 bits field if we rely on scheduler precise accounting, or a cputime_t if we rely on jiffies. This results in quite some duplicate code and special cases to handle the two types. Just unify this into a single 64 bits field. cputime_t can always fit into it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Olivier Langlois <olivier@trillion01.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-07-03	tools/include: use stdint types for user-space byteshift headers	Yaakov Selkowitz
	Commit a07f7672d7cf0ff0d6e548a9feb6e0bd016d9c6c added user-space copies of the byteshift headers to be used by hostprogs, changing e.g. u8 to __u8. However, in order to cross-compile the kernel from a non-Linux system, stdint.h types need to be used instead of linux/types.h types. Signed-off-by: Yaakov Selkowitz <yselkowitz@users.sourceforge.net> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	scripts: Coccinelle script for pci_free_consistent()	strnape1@fel.cvut.cz
	Created coccinelle script for reporting missing pci_free_consistent() calls. Signed-off-by: Petr Strnad <strnape1@fel.cvut.cz> Signed-off-by: Nicolas Palix <nicolas.palix@imag.fr> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	Coccinelle: Update the documentation	Nicolas Palix
	- The new default mode is 'report'. - The available modes are detailed a bit more. - Some information about the use of spatch options are also given concerning the use of indexing tools. Signed-off-by: Nicolas Palix <nicolas.palix@imag.fr> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	Coccinelle: Update section of MAINTAINERS	Nicolas Palix
	Add Documentation/coccinelle.txt in the COCCINELLE section. Signed-off-by: Nicolas Palix <nicolas.palix@imag.fr> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	coccicheck: span checks across CPUs	Kees Cook
	This adds parallelism by default to the "coccicheck" target using spatch's "-max" and "-index" arguments. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nicolas Palix <nicolas.palix@imag.fr> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	Makefile: Fix install error with make -j option	Robert Richter
	Make modules_install fails with -j option: DEPMOD Usage: .../.source/linux/scripts/depmod.sh /sbin/depmod <kernelrelease> make[1]: *** [_modinst_post] Error 1 Adding kernelrelease dependency to fix this. Signed-off-by: Robert Richter <robert.richter@calxeda.com> Cc: <stable@vger.kernel.org> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	Fix a build warning in scripts/mod/file2alias.c	Daniel Tang
	On some systems, __used is already defined in sys/cdefs.h and causes a build warning: scripts/mod/file2alias.c:85:1: warning: "__used" redefined In file included from /usr/include/stdio.h:64, from scripts/mod/modpost.h:1, from scripts/mod/file2alias.c:13: /usr/include/sys/cdefs.h:146:1: warning: this is the location of the previous definition This adds an extra check before defining the __used macro to see if the macro was already defined elsewhere. Signed-off-by: Daniel Tang <dt.tangr@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2013-07-03	Document ->tmpfile()	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-03	ext4: ->tmpfile() support	Al Viro
	very similar to ext3 counterpart... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-03	vfs: export lseek_execute() to modules	Jie Liu
	For those file systems(btrfs/ext4/ocfs2/tmpfs) that support SEEK_DATA/SEEK_HOLE functions, we end up handling the similar matter in lseek_execute() to update the current file offset to the desired offset if it is valid, ceph also does the simliar things at ceph_llseek(). To reduce the duplications, this patch make lseek_execute() public accessible so that we can call it directly from the underlying file systems. Thanks Dave Chinner for this suggestion. [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back] v2->v1: - Add kernel-doc comments for lseek_execute() - Call lseek_execute() in ceph->llseek() Signed-off-by: Jie Liu <jeff.liu@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Mason <chris.mason@fusionio.com> Cc: Josef Bacik <jbacik@fusionio.com> Cc: Ben Myers <bpm@sgi.com> Cc: Ted Tso <tytso@mit.edu> Cc: Hugh Dickins <hughd@google.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Sage Weil <sage@inktank.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-03	ALSA: vmaster: Fix the regression of missing vmaster hook call	Takashi Iwai
	The commit [1ca2f2ec: ALSA: vmaster: Add snd_ctl_sync_vmaster() helper function] changed master_put() function and the check for the required vmaster hook call is wrongly performed now, which results in the missing hook call upon "Master Playback Switch" value changes. This patch corrects the check logic. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-07-03	regulator: s5m8767: Update s5m8767-regulator bindings document	Sachin Kamat
	s5m8767 regulator is used on Exynos platforms which use pin controller to configure GPIOs. Update the example accordingly. [This had previously been broken by code changes which failed to update the documentation -- broonie] Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Mark Brown <broonie@linaro.org>
2013-07-02	net: lls fix build with allnoconfig	Eliezer Tamir
	correct placeholder declarations to prevent build breakage when !CONFIG_NET_LL_RX_POLL Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02	Input: tps6507x-ts - select INPUT_POLLDEV	Dmitry Torokhov
	Since the driver was converted to polled device infrastructure we need to make sure it is enabled. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	Input: bcm5974 - add support for the 2013 MacBook Air	Dmitry Torokhov
	The June 2013 Macbook Air (13'') has a new trackpad protocol; four new values are inserted in the header, and the mode switch is no longer needed. This patch adds support for the new devices. Cc: stable@vger.kernel.org Reported-and-tested-by: Brad Ford <plymouthffl@gmail.com> Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	HID: apple: Add support for the 2013 Macbook Air	Dmitry Torokhov
	This patch adds keyboard support for MacbookAir6,2 as WELLSPRING8 (0x0291, 0x0292, 0x0293). The touchpad is handled in a separate bcm5974 patch, as usual. Cc: stable@vger.kernel.org Reported-and-tested-by: Brad Ford <plymouthffl@gmail.com> Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	Input: cyttsp4 - leak on error path in probe()	Dan Carpenter
	We leak "cd" if the cd->xfer_buf allocation fails. It was weird to "goto error_gpio_irq" so I changed the label name. (Label names should reflect the label location not the goto location otherwise you get an "all roads lead to Rome problem"). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	Input: cyttsp4 - silence NULL dereference warning	Dan Carpenter
	If "cd" were NULL then we would dereference it when we print the error message. Fortunately enough, it can't ever be NULL so we can remove those lines. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	Input: cyttsp4 - silence shift wrap warning	Dan Carpenter
	"*max" is a size_t (long) type but "1" is an int so static checkers complain that the shift could wrap. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-07-02	Merge branch 'for-3.11-cpuset' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cpuset changes from Tejun Heo: "cpuset has always been rather odd about its configurations - a cgroup right after creation didn't allow any task executions before configuration, changing configuration in the parent modifies the descendants irreversibly and so on. These behaviors are inherently nasty and almost hostile against sharing the hierarchy with other controllers making it very difficult to use in unified hierarchy. Li is currently in the process of updating the behaviors for __DEVEL__sane_behavior which is the bulk of changes in this pull request. It isn't complete yet and the behaviors will change further but all changes are gated behind sane_behavior. In the process, the rather hairy work-item punting which was used to work around the limitations of cgroup descendant iterator was simplified." * 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: rename @cont to @cgrp cpuset: fix to migrate mm correctly in a corner case cpuset: allow to move tasks to empty cpusets cpuset: allow to keep tasks in empty cpusets cpuset: introduce effective_{cpumask\|nodemask}_cpuset() cpuset: record old_mems_allowed in struct cpuset cpuset: remove async hotplug propagation work cpuset: let hotplug propagation work wait for task attaching cpuset: re-structure update_cpumask() a bit cpuset: remove cpuset_test_cpumask() cpuset: remove unnecessary variable in cpuset_attach() cpuset: cleanup guarantee_online_{cpus\|mems}() cpuset: remove redundant check in cpuset_cpus_allowed_fallback()
2013-07-02	Merge branch 'for-3.11' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "This pull request contains the following changes. - cgroup_subsys_state (css) reference counting has been converted to percpu-ref. css is what each resource controller embeds into its own control structure and perform reference count against. It may be used in hot paths of various subsystems and is similar to module refcnt in that aspect. For example, block-cgroup's css refcnting was showing up a lot in Mikulaus's device-mapper scalability work and this should alleviate it. - cgroup subtree iterator has been updated so that RCU read lock can be released after grabbing reference. This allows simplifying its users which requires blocking which used to build iteration list under RCU read lock and then traverse it outside. This pull request contains simplification of cgroup core and device-cgroup. A separate pull request will update cpuset. - Fixes for various bugs including corner race conditions and RCU usage bugs. - A lot of cleanups and some prepartory work for the planned unified hierarchy support." * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (48 commits) cgroup: CGRP_ROOT_SUBSYS_BOUND should also be ignored when mounting an existing hierarchy cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options cgroup: fix deadlock on cgroup_mutex via drop_parsed_module_refcounts() cgroup: always use RCU accessors for protected accesses cgroup: fix RCU accesses around task->cgroups cgroup: fix RCU accesses to task->cgroups cgroup: grab cgroup_mutex in drop_parsed_module_refcounts() cgroup: fix cgroupfs_root early destruction path cgroup: reserve ID 0 for dummy_root and 1 for unified hierarchy cgroup: implement for_each_[builtin_]subsys() cgroup: move init_css_set initialization inside cgroup_mutex cgroup: s/for_each_subsys()/for_each_root_subsys()/ cgroup: clean up find_css_set() and friends cgroup: remove cgroup->actual_subsys_mask cgroup: prefix global variables with "cgroup_" cgroup: convert CFTYPE_* flags to enums cgroup: rename cont to cgrp cgroup: clean up cgroup_serial_nr_cursor cgroup: convert cgroup_cft_commit() to use cgroup_for_each_descendant_pre() cgroup: make serial_nr_cursor available throughout cgroup.c ...
2013-07-02	Merge branch 'libata/for-3.10-fixes' into libata/for-3.11	Tejun Heo
	libata/for-3.10-fixes never got submitted during v3.10 cycle. Merge it into for-3.11 so that it can be routed together with other changes scheduled for v3.11. Three trivial conflicts in drivers/ata/sata_rcar.c. All are caused by 1b20f6a9ad ("sata_rcar: add 'base' local variable to some functions") conflicting with logic updates in for-3.10-fixes. The offending commit simply adds local variable @base on functions which dereferences sata_rcar_priv->base multiple times. The resolutions are trivial - applying s/priv->base/base/ in the conflicting logic updates. Signed-off-by: Tejun Heo <tj@kernel.org>
2013-07-02	Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	Linus Torvalds
	Pull workqueue changes from Tejun Heo: "Surprisingly, Lai and I didn't break too many things implementing custom pools and stuff last time around and there aren't any follow-up changes necessary at this point. The only change in this pull request is Viresh's patches to make some per-cpu workqueues to behave as unbound workqueues dependent on a boot param whose default can be configured via a config option. This leads to higher processing overhead / lower bandwidth as more work items are bounced across CPUs; however, it can lead to noticeable powersave in certain configurations - ~10% w/ idlish constant workload on a big.LITTLE configuration according to Viresh. This is because per-cpu workqueues interfere with how the scheduler perceives whether or not each CPU is idle by forcing pinned tasks on them, which makes the scheduler's power-aware scheduling decisions less effective. Its effectiveness is likely less pronounced on homogenous configurations and this type of optimization can probably be made automatic; however, the changes are pretty minimal and the affected workqueues are clearly marked, so it's an easy gain for some configurations for the time being with pretty unintrusive changes." * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: fbcon: queue work on power efficient wq block: queue work on power efficient wq PHYLIB: queue work on system_power_efficient_wq workqueue: Add system wide power_efficient workqueues workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented workqueues
2013-07-02	Merge branch 'for-3.11' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull per-cpu changes from Tejun Heo: "This pull request contains Kent's per-cpu reference counter. It has gone through several iterations since the last time and the dynamic allocation is gone. The usual usage is relatively straight-forward although async kill confirm interface, which is not used int most cases, is somewhat icky. There also are some interface concerns - e.g. I'm not sure about passing in @relesae callback during init as that becomes funny when we later implement synchronous kill_and_drain - but nothing too serious and it's quite useable now. cgroup_subsys_state refcnting has already been converted and we should convert module refcnt (Kent?)" * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu-refcount: use RCU-sched insted of normal RCU percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm() percpu-refcount: implement percpu_ref_cancel_init() percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu() percpu-refcount: cosmetic updates percpu-refcount: consistently use plain (non-sched) RCU percpu-refcount: Don't use silly cmpxchg() percpu: implement generic percpu refcounting
2013-07-03	module: cleanup call chain.	Rusty Russell
	Fold alloc_module_percpu into percpu_modalloc(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-07-03	module: do percpu allocation after uniqueness check. No, really!	Rusty Russell
	v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting percpu memory on large machines: Now we have a new state MODULE_STATE_UNFORMED, we can insert the module into the list (and thus guarantee its uniqueness) before we allocate the per-cpu region. In my defence, it didn't actually say the patch did this. Just that we "can". This patch actually does it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Jim Hull <jim.hull@hp.com> Cc: stable@kernel.org # 3.8
2013-07-02	tracing: Make tracing_open_generic_{tr,tc}() static	Steven Rostedt (Red Hat)
	I have patches that will use tracing_open_generic_tr/tc() in other files, but as they are not ready to be merged yet, and Fengguang Wu's sparse scripts pointed out that these functions were not declared anywhere, I'll make them static for now. When these functions are required to be used elsewhere, I'll remove the static then. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	tracing: Remove ftrace() function	zhangwei(Jovi)
	The only caller of function ftrace(...) was removed a long time ago, so remove the function body as well. Link: http://lkml.kernel.org/r/1365564393-10972-10-git-send-email-jovi.zhangwei@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	tracing: Remove TRACE_EVENT_TYPE enum definition	zhangwei(Jovi)
	TRACE_EVENT_TYPE enum is not used at present, remove it. Link: http://lkml.kernel.org/r/1365564393-10972-8-git-send-email-jovi.zhangwei@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	tracing: Make tracer_tracing_{off,on,is_on}() static	Steven Rostedt (Red Hat)
	I have patches that will use tracer_tracing_on/off/is_on() in other files, but as they are not ready to be merged yet, and Fengguang Wu's sparse scripts pointed out that these functions were not declared anywhere, I'll make them static for now. When these functions are required to be used elsewhere, I'll remove the static then. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	tracing: Fix irqs-off tag display in syscall tracing	zhangwei(Jovi)
	All syscall tracing irqs-off tags are wrong, the syscall enter entry doesn't disable irqs. [root@jovi tracing]#echo "syscalls:sys_enter_open" > set_event [root@jovi tracing]# cat trace # tracer: nop # # entries-in-buffer/entries-written: 13/13 #P:2 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| irqbalance-513 [000] d... 56115.496766: sys_open(filename: 804e1a6, flags: 0, mode: 1b6) irqbalance-513 [000] d... 56115.497008: sys_open(filename: 804e1bb, flags: 0, mode: 1b6) sendmail-771 [000] d... 56115.827982: sys_open(filename: b770e6d1, flags: 0, mode: 1b6) The reason is syscall tracing doesn't record irq_flags into buffer. The proper display is: [root@jovi tracing]#echo "syscalls:sys_enter_open" > set_event [root@jovi tracing]# cat trace # tracer: nop # # entries-in-buffer/entries-written: 14/14 #P:2 # # _-----=> irqs-off # / _----=> need-resched # \| / _---=> hardirq/softirq # \|\| / _--=> preempt-depth # \|\|\| / delay # TASK-PID CPU# \|\|\|\| TIMESTAMP FUNCTION # \| \| \| \|\|\|\| \| \| irqbalance-514 [001] .... 46.213921: sys_open(filename: 804e1a6, flags: 0, mode: 1b6) irqbalance-514 [001] .... 46.214160: sys_open(filename: 804e1bb, flags: 0, mode: 1b6) <...>-920 [001] .... 47.307260: sys_open(filename: 4e82a0c5, flags: 80000, mode: 0) Link: http://lkml.kernel.org/r/1365564393-10972-3-git-send-email-jovi.zhangwei@huawei.com Cc: stable@vger.kernel.org # 2.6.35 Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	uprobes: Fix return value in error handling path	zhangwei(Jovi)
	When wrong argument is passed into uprobe_events it does not return an error: [root@jovi tracing]# echo 'p:myprobe /bin/bash' > uprobe_events [root@jovi tracing]# The proper response is: [root@jovi tracing]# echo 'p:myprobe /bin/bash' > uprobe_events -bash: echo: write error: Invalid argument Link: http://lkml.kernel.org/r/51B964FF.5000106@huawei.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: <srikar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # 3.5+ Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	tracing: Fix race between deleting buffer and setting events	Steven Rostedt (Red Hat)
	While analyzing the code, I discovered that there's a potential race between deleting a trace instance and setting events. There are a few races that can occur if events are being traced as the buffer is being deleted. Mostly the problem comes with freeing the descriptor used by the trace event callback. To prevent problems like this, the events are disabled before the buffer is deleted. The problem with the current solution is that the event_mutex is let go between disabling the events and freeing the files, which means that the events could be enabled again while the freeing takes place. Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02	ip_tunnels: Use skb-len to PMTU check.	Pravin B Shelar
	In path mtu check, ip header total length works for gre device but not for gre-tap device. Use skb len which is consistent for all tunneling types. This is old bug in gre. This also fixes mtu calculation bug introduced by commit c54419321455631079c7d (GRE: Refactor GRE tunneling code). Reported-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-03	md/raid10: fix two bugs affecting RAID10 reshape.	NeilBrown
	1/ If a RAID10 is being reshaped to a fewer number of devices and is stopped while this is ongoing, then when the array is reassembled the 'mirrors' array will be allocated too small. This will lead to an access error or memory corruption. 2/ A sanity test for a reshaping RAID10 array is restarted is slightly incorrect. Due to the first bug, this is suitable for any -stable kernel since 3.5 where this code was introduced. Cc: stable@vger.kernel.org (v3.5+) Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-03	md: remove doubled description for sync_max, merging it within sync_min/sync_max	CoolCold
	After last md.txt edits for sync_min/max, sync_max description became doubled. Removing 1st copy, merging details into common sync_min/sync_max section. Signed-off-by: Roman Ovchinnikov <coolthecold@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
2013-07-02	Merge branch 'l2tp_seq'	David S. Miller
	James Chapman says: ==================== L2TP data sequence numbers, if enabled, ensure in-order delivery. A receiver may reorder data packets, or simply drop out-of-sequence packets. If reordering is not enabled, the current implementation does not handle data packet loss correctly, which can result in a stalled L2TP session datapath as soon as the first packet is lost. Most L2TP users either disable sequence numbers or enable data packet reordering when sequence numbers are used to circumvent the issue. This patch series fixes the problem, and makes the L2TP sequence number handling RFC-compliant. v2 incorporates string format changes requested by sergei.shtylyov@cogentembedded.com. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02	l2tp: make datapath resilient to packet loss when sequence numbers enabled	James Chapman
	If L2TP data sequence numbers are enabled and reordering is not enabled, data reception stops if a packet is lost since the kernel waits for a sequence number that is never resent. (When reordering is enabled, data reception restarts when the reorder timeout expires.) If no reorder timeout is set, we should count the number of in-sequence packets after the out-of-sequence (OOS) condition is detected, and reset sequence number state after a number of such packets are received. For now, the number of in-sequence packets while in OOS state which cause the sequence number state to be reset is hard-coded to 5. This could be configurable later. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02	l2tp: make datapath sequence number support RFC-compliant	James Chapman
	The L2TP datapath is not currently RFC-compliant when sequence numbers are used in L2TP data packets. According to the L2TP RFC, any received sequence number NR greater than or equal to the next expected NR is acceptable, where the "greater than or equal to" test is determined by the NR wrap point. This differs for L2TPv2 and L2TPv3, so add state in the session context to hold the max NR value and the NR window size in order to do the acceptable sequence number value check. These might be configurable later, but for now we derive it from the tunnel L2TP version, which determines the sequence number field size. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02	l2tp: do data sequence number handling in a separate func	James Chapman
	This change moves some code handling data sequence numbers into a separate function to avoid too much indentation. This is to prepare for some changes to data sequence number handling in subsequent patches. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-02	Merge branch 'x86-uv-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 UV update from Ingo Molnar: "There's a single commit in this tree, which adds support for a new SGI UV GRU (Global Reference Unit - fast NUMA messaging ASIC) hardware feature to scale up and beyond: an optional distributed mode that will allow per-node address mapping of local GRU space, as opposed to mapping all GRU hardware to the same contiguous high space" * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/UV: Add GRU distributed mode mappings