summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-03-23iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS conditionBaptiste Reynal
This patch is a fix to "iommu/arm-smmu: add support for iova_to_phys through ATS1PR". According to ARM documentation, translation registers are optional even in SMMUv1, so ID0_S1TS needs to be checked to verify their presence. Also, we check that the domain is a stage-1 domain. Signed-off-by: Baptiste Reynal <b.reynal@virtualopensystems.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-03-23x86/asm/entry/64: Fix incorrect commentDenys Vlasenko
The recent old_rsp -> rsp_scratch rename also changed this comment, but in this case "old_rsp" was not referring to PER_CPU(old_rsp). Fix this. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1427115839-6397-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23metag: Fix ioremap_wc/ioremap_cached build errorsJames Hogan
When ioremap_wc() or ioremap_cached() are used without first including asm/pgtable.h, the _PAGE_CACHEABLE or _PAGE_WR_COMBINE definitions aren't found, resulting in build errors like the following (in next-20150323 due to "lib: devres: add a helper function for ioremap_wc"): lib/devres.c: In function ‘devm_ioremap_wc’: lib/devres.c:91: error: ‘_PAGE_WR_COMBINE’ undeclared We can't easily include asm/pgtable.h in asm/io.h due to dependency problems, so split out the _PAGE_* definitions from asm/pgtable.h into a separate asm/pgtable-bits.h header (as a couple of other architectures already do), and include that in io.h instead. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Cc: Abhilash Kesavan <a.kesavan@samsung.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-23parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BITHelge Deller
Make the code which sets up the pmd depend on PT_NLEVELS == 3, not on CONFIG_64BIT. The reason is, that a 64bit kernel with a page size greater than 4k doesn't need the pmd and thus has PT_NLEVELS = 2. Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23parisc: mm: don't count preallocated pmdsMikulas Patocka
The patch dc6c9a35b66b520cf67e05d8ca60ebecad3b0479 that counts pmds allocated for a process introduced a bug on 64-bit PA-RISC kernels. The PA-RISC architecture preallocates one pmd with each pgd. This preallocated pmd can never be freed - pmd_free does nothing when it is called with this pmd. When the kernel attempts to free this preallocated pmd, it decreases the count of allocated pmds. The result is that the counter underflows and this error is reported. This patch fixes the bug by artifically incrementing the counter in pmd_free when the kernel tries to free the preallocated pmd. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23x86/asm/entry: Replace some open-coded VM86 checks with v8086_mode() checksAndy Lutomirski
This allows us to remove some unnecessary ifdefs. There should be no change to the generated code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/f7e00f0d668e253abf0bd8bf36491ac47bd761ff.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Remove user_mode_vm()Andy Lutomirski
It has no callers anymore. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a594afd6a0bddb1311bd7c92a15201c87fbb8681.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()'Andy Lutomirski
user_mode_vm() and user_mode() are now the same. Change all callers of user_mode_vm() to user_mode(). The next patch will remove the definition of user_mode_vm. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/43b1f57f3df70df5a08b0925897c660725015554.1426728647.git.luto@kernel.org [ Merged to a more recent kernel. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Make user_mode() work correctly if regs came from VM86 modeAndy Lutomirski
user_mode() is now identical to user_mode_vm(). Subsequent patches will change all callers of user_mode_vm() to user_mode() and then delete user_mode_vm(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0dd03eacb5f0a2b5ba0240de25347a31b493c289.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Use user_mode_ignore_vm86() where appropriateAndy Lutomirski
A few of the user_mode() checks in traps.c are immediately after explicit checks for vm86 mode. Change them to user_mode_ignore_vm86(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/0b324d5b75c3402be07f8d3c6245ed7f4995029e.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry, perf: Explicitly optimize vm86 handling in code_segment_base()Andy Lutomirski
There's no point in checking the VM bit on 64-bit, and, since we're explicitly checking it, we can use user_mode_ignore_vm86() after the check. While we're at it, rearrange the #ifdef slightly to make the code flow a bit clearer. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/dc1457a734feccd03a19bb3538a7648582f57cdd.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Add user_mode_ignore_vm86()Andy Lutomirski
user_mode() is dangerous and user_mode_vm() has a confusing name. Add user_mode_ignore_vm86() (equivalent to current user_mode()). We'll change the small number of legitimate users of user_mode() to user_mode_ignore_vm86(). Inspired by grsec, although this works rather differently. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/202c56ca63823c338af8e2e54948dbe222da6343.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23Merge tag 'v4.0-rc5' into x86/asm, to resolve conflictsIngo Molnar
Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23parisc: Add compile-time check when adding new syscallsHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2015-03-23sched/rt: Use IPI to trigger RT task push migration instead of pullingSteven Rostedt
When debugging the latencies on a 40 core box, where we hit 300 to 500 microsecond latencies, I found there was a huge contention on the runqueue locks. Investigating it further, running ftrace, I found that it was due to the pulling of RT tasks. The test that was run was the following: cyclictest --numa -p95 -m -d0 -i100 This created a thread on each CPU, that would set its wakeup in iterations of 100 microseconds. The -d0 means that all the threads had the same interval (100us). Each thread sleeps for 100us and wakes up and measures its latencies. cyclictest is maintained at: git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rt-tests.git What happened was another RT task would be scheduled on one of the CPUs that was running our test, when the other CPU tests went to sleep and scheduled idle. This caused the "pull" operation to execute on all these CPUs. Each one of these saw the RT task that was overloaded on the CPU of the test that was still running, and each one tried to grab that task in a thundering herd way. To grab the task, each thread would do a double rq lock grab, grabbing its own lock as well as the rq of the overloaded CPU. As the sched domains on this box was rather flat for its size, I saw up to 12 CPUs block on this lock at once. This caused a ripple affect with the rq locks especially since the taking was done via a double rq lock, which means that several of the CPUs had their own rq locks held while trying to take this rq lock. As these locks were blocked, any wakeups or load balanceing on these CPUs would also block on these locks, and the wait time escalated. I've tried various methods to lessen the load, but things like an atomic counter to only let one CPU grab the task wont work, because the task may have a limited affinity, and we may pick the wrong CPU to take that lock and do the pull, to only find out that the CPU we picked isn't in the task's affinity. Instead of doing the PULL, I now have the CPUs that want the pull to send over an IPI to the overloaded CPU, and let that CPU pick what CPU to push the task to. No more need to grab the rq lock, and the push/pull algorithm still works fine. With this patch, the latency dropped to just 150us over a 20 hour run. Without the patch, the huge latencies would trigger in seconds. I've created a new sched feature called RT_PUSH_IPI, which is enabled by default. When RT_PUSH_IPI is not enabled, the old method of grabbing the rq locks and having the pulling CPU do the work is implemented. When RT_PUSH_IPI is enabled, the IPI is sent to the overloaded CPU to do a push. To enabled or disable this at run time: # mount -t debugfs nodev /sys/kernel/debug # echo RT_PUSH_IPI > /sys/kernel/debug/sched_features or # echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched_features Update: This original patch would send an IPI to all CPUs in the RT overload list. But that could theoretically cause the reverse issue. That is, there could be lots of overloaded RT queues and one CPU lowers its priority. It would then send an IPI to all the overloaded RT queues and they could then all try to grab the rq lock of the CPU lowering its priority, and then we have the same problem. The latest design sends out only one IPI to the first overloaded CPU. It tries to push any tasks that it can, and then looks for the next overloaded CPU that can push to the source CPU. The IPIs stop when all overloaded CPUs that have pushable tasks that have priorities greater than the source CPU are covered. In case the source CPU lowers its priority again, a flag is set to tell the IPI traversal to restart with the first RT overloaded CPU after the source CPU. Parts-suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Joern Engel <joern@purestorage.com> Cc: Clark Williams <williams@redhat.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150318144946.2f3cc982@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23irq_work: Fix build failure when CONFIG_IRQ_WORK is not definedSteven Rostedt
When CONFIG_IRQ_WORK is not defined (difficult to do, as it also requires CONFIG_PRINTK not to be defined), we get a build failure: kernel/built-in.o: In function `flush_smp_call_function_queue': kernel/smp.c:263: undefined reference to `irq_work_run' kernel/smp.c:263: undefined reference to `irq_work_run' Makefile:933: recipe for target 'vmlinux' failed Simplest thing to do is to make irq_work_run() a nop when not set. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150319101851.4d224d9b@gandalf.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23Merge branch 'sched/urgent' into sched/core, to pick up fixes before ↵Ingo Molnar
applying new patches Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23timers/tick/broadcast-hrtimer: Fix suspicious RCU usage in idle loopPreeti U Murthy
The hrtimer mode of broadcast queues hrtimers in the idle entry path so as to wakeup cpus in deep idle states. The associated call graph is : cpuidle_idle_call() |____ clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, ....)) |_____tick_broadcast_set_event() |____clockevents_program_event() |____bc_set_next() The hrtimer_{start/cancel} functions call into tracing which uses RCU. But it is not legal to call into RCU in cpuidle because it is one of the quiescent states. Hence protect this region with RCU_NONIDLE which informs RCU that the cpu is momentarily non-idle. As an aside it is helpful to point out that the clock event device that is programmed here is not a per-cpu clock device; it is a pseudo clock device, used by the broadcast framework alone. The per-cpu clock device programming never goes through bc_set_next(). Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: linuxppc-dev@ozlabs.org Cc: mpe@ellerman.id.au Cc: tglx@linutronix.de Link: http://lkml.kernel.org/r/20150318104705.17763.56668.stgit@preeti.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23lockdep: Fix the module unload key range freeing logicPeter Zijlstra
Module unload calls lockdep_free_key_range(), which removes entries from the data structures. Most of the lockdep code OTOH assumes the data structures are append only; in specific see the comments in add_lock_to_list() and look_up_lock_class(). Clearly this has only worked by accident; make it work proper. The actual scenario to make it go boom would involve the memory freed by the module unlock being re-allocated and re-used for a lock inside of a rcu-sched grace period. This is a very unlikely scenario, still better plug the hole. Use RCU list iteration in all places and ammend the comments. Change lockdep_free_key_range() to issue a sync_sched() between removal from the lists and returning -- which results in the memory being freed. Further ensure the callers are placed correctly and comment the requirements. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Tsyvarev <tsyvarev@ispras.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23sched: Fix RLIMIT_RTTIME when PI-boosting to RTBrian Silverman
When non-realtime tasks get priority-inheritance boosted to a realtime scheduling class, RLIMIT_RTTIME starts to apply to them. However, the counter used for checking this (the same one used for SCHED_RR timeslices) was not getting reset. This meant that tasks running with a non-realtime scheduling class which are repeatedly boosted to a realtime one, but never block while they are running realtime, eventually hit the timeout without ever running for a time over the limit. This patch resets the realtime timeslice counter when un-PI-boosting from an RT to a non-RT scheduling class. I have some test code with two threads and a shared PTHREAD_PRIO_INHERIT mutex which induces priority boosting and spins while boosted that gets killed by a SIGXCPU on non-fixed kernels but doesn't with this patch applied. It happens much faster with a CONFIG_PREEMPT_RT kernel, and does happen eventually with PREEMPT_VOLUNTARY kernels. Signed-off-by: Brian Silverman <brian@peloton-tech.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: austin@peloton-tech.com Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1424305436-6716-1-git-send-email-brian@peloton-tech.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23perf: Fix irq_work 'tail' recursionPeter Zijlstra
Vince reported a watchdog lockup like: [<ffffffff8115e114>] perf_tp_event+0xc4/0x210 [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160 [<ffffffff810b7f10>] lock_release+0x130/0x260 [<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40 [<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80 [<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0 [<ffffffff811f71ce>] send_sigio+0xae/0x100 [<ffffffff811f72b7>] kill_fasync+0x97/0xf0 [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0 [<ffffffff8115d103>] perf_pending_event+0x33/0x60 [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80 [<ffffffff8114e448>] irq_work_run+0x18/0x40 [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0 [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80 Which is caused by an irq_work generating new irq_work and therefore not allowing forward progress. This happens because processing the perf irq_work triggers another perf event (tracepoint stuff) which in turn generates an irq_work ad infinitum. Avoid this by raising the recursion counter in the irq_work -- which effectively disables all software events (including tracepoints) from actually triggering again. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/boot: Standardize strcmp()Arjun Sreedharan
strcmp() is always expected to return 0 when arguments are equal, negative when its first argument @str1 is less than its second argument @str2 and a positive value otherwise. Previously strcmp("a", "b") returned 1. Now it gives -1, as it is supposed to. Until now this bug never triggered, because all uses for strcmp() in the boot code tested for nonzero: triton:~/tip> git grep strcmp arch/x86/boot/ arch/x86/boot/boot.h:int strcmp(const char *str1, const char *str2); arch/x86/boot/edd.c: if (!strcmp(eddarg, "skipmbr") || !strcmp(eddarg, "skip")) { arch/x86/boot/edd.c: else if (!strcmp(eddarg, "off")) arch/x86/boot/edd.c: else if (!strcmp(eddarg, "on")) should in the future strcmp() be used in a comparative way in the boot code, it might have led to (not so subtle) bugs. Signed-off-by: Arjun Sreedharan <arjun024@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1426520267-1803-1-git-send-email-arjun024@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/cpu/cacheinfo: Fix cache_get_priv_group() for Intel processorsSudeep Holla
The private pointer provided by the cacheinfo code is used to implement the AMD L3 cache-specific attributes using a pointer to the northbridge descriptor. It is needed for performing L3-specific operations and for that we need a couple of PCI devices and other service information, all contained in the northbridge descriptor. This results in failure of cacheinfo setup as shown below as cache_get_priv_group() returns the uninitialised private attributes which are not valid for Intel processors. ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1 at fs/sysfs/group.c:102 internal_create_group+0x151/0x280() sysfs: (bin_)attrs not set by subsystem for group: index3/ Modules linked in: CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc3+ #1 Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014 ... Call Trace: dump_stack warn_slowpath_common warn_slowpath_fmt internal_create_group sysfs_create_groups device_add cpu_device_create ? __kmalloc cache_add_dev cacheinfo_sysfs_init ? container_dev_init do_one_initcall kernel_init_freeable ? rest_init kernel_init ret_from_fork ? rest_init This patch fixes the issue by checking if the L3 cache indices are populated correctly (AMD-specific) before initializing the private attributes. Reported-by: Borislav Petkov <bp@suse.de> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/mce: Reindent __mcheck_cpu_apply_quirks() properlyBorislav Petkov
Had some strange 3 tabs + 2 chars indentation, probably from me. Fix it. No code changed: # arch/x86/kernel/cpu/mcheck/mce.o: text data bss dec hex filename 21371 5923 264 27558 6ba6 mce.o.before 21371 5923 264 27558 6ba6 mce.o.after md5: eb3996c84d15e08ed836f043df2cbb01 mce.o.before.asm eb3996c84d15e08ed836f043df2cbb01 mce.o.after.asm Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/mce: Use safe MSR accesses for AMD quirkJesse Larrew
Certain MSRs are only relevant to a kernel in host mode, and kvm had chosen not to implement these MSRs at all for guests. If a guest kernel ever tried to access these MSRs, the result was a general protection fault. KVM will be separately patched to return 0 when these MSRs are read, and this patch ensures that MSR accesses are tolerant of exceptions. Signed-off-by: Jesse Larrew <jesse.larrew@amd.com> [ Drop {} braces around loop ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Joel Schopp <joel.schopp@amd.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-edac@vger.kernel.org Link: http://lkml.kernel.org/r/1426262619-5016-1-git-send-email-jesse.larrew@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Kill eager_fpu_init_bp()Oleg Nesterov
Now that eager_fpu_init_bp() does setup_init_fpu_buf() only and nothing else, we can remove it and move this code into its "caller", eager_fpu_init(). This avoids the confusing games with "static __refdata void (*boot_func)": init_xstate_buf can be NULL only during boot, so it is safe to call the __init-annotated setup_init_fpu_buf() function in eager_fpu_init(), we just need to mark it as __init_refok. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150314151334.GC13029@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Don't allocate fpu->state for swapper/0Oleg Nesterov
Now that kthreads do not use FPU until they get executed, swapper/0 doesn't need to allocate fpu->state. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150313182716.GB8249@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Rename drop_init_fpu() to fpu_reset_state()Borislav Petkov
Call it what it does and in accordance with the context where it is used: we reset the FPU state either because we were unable to restore it from the one saved in the task or because we simply want to reset it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Fold __drop_fpu() into its sole userBorislav Petkov
Fold it into drop_fpu(). Phew, one less FPU function to pay attention to. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Don't abuse drop_init_fpu() in flush_thread()Oleg Nesterov
flush_thread() -> drop_init_fpu() is suboptimal and confusing. It does drop_fpu() or restore_init_xstate() depending on !use_eager_fpu(). But flush_thread() too checks eagerfpu right after that, and if it is true then restore_init_xstate() just burns CPU for no reason. We are going to load init_xstate_buf again after we set used_math()/user_has_fpu(), until then the FPU state can't survive after switch_to(). Remove it, and change the "if (!use_eager_fpu())" to call drop_fpu(). While at it, clean up the tsk/current usage. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150313173030.GA31217@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Use restore_init_xstate() instead of math_state_restore() on ↵Oleg Nesterov
kthread exec Change flush_thread() to do user_fpu_begin() and restore_init_xstate() instead of math_state_restore(). Note: "TODO: cleanup this horror" is still valid. We do not need init_fpu() at all, we only need fpu_alloc() and memset(0). But this needs other changes, in particular user_fpu_begin() should set used_math(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150311173449.GE5032@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Introduce restore_init_xstate()Oleg Nesterov
Extract the "use_eager_fpu()" code from drop_init_fpu() into a new, simple helper restore_init_xstate(). The next patch adds another user. - It is not clear why we do not check use_fxsr() like fpu_restore_checking() does. eager_fpu_init_bp() calls setup_init_fpu_buf() too, and we have the "eagerfpu=on" kernel option. - Ignoring the fact that init_xstate_buf is "struct xsave_struct *", not "union thread_xstate *", it is not clear why we can not simply use fpu_restore_checking() and avoid the code duplication. - It is not clear why we can't call setup_init_fpu_buf() unconditionally to always create init_xstate_buf(). Then do_device_not_available() path (at least) could use restore_init_xstate() too. It doesn't need to init fpu->state, its content doesn't matter until unlazy_fpu()/__switch_to()/etc which overwrites this memory anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150311173429.GD5032@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/fpu: Document user_fpu_begin()Oleg Nesterov
Currently, user_fpu_begin() has a single caller and it is not clear why do we actually need it and why we should not worry about preemption right after preempt_enable(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150311173409.GC5032@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23Merge tag 'v4.0-rc5' into x86/fpu, to prevent conflictsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry, perf: Fix incorrect TIF_IA32 check in code_segment_base()Andy Lutomirski
We want to check whether user code is in 32-bit mode, not whether the task is nominally 32-bit. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/33e5107085ce347a8303560302b15c2cadd62c4c.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/mm/fault: Use TASK_SIZE_MAX in is_prefetch()Andy Lutomirski
This is slightly shorter and slightly faster. It's also more correct: the split between user and kernel addresses is TASK_SIZE_MAX, regardless of ti->flags. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brad Spengler <spender@grsecurity.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/09156b63bad90a327827003c9e53faa82ef4c56e.1426728647.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23x86/asm/entry: Fix execve() and sigreturn() syscalls to always return via IRETBrian Gerst
Both the execve() and sigreturn() family of syscalls have the ability to change registers in ways that may not be compatabile with the syscall path they were called from. In particular, SYSRET and SYSEXIT can't handle non-default %cs and %ss, and some bits in eflags. These syscalls have stubs that are hardcoded to jump to the IRET path, and not return to the original syscall path. The following commit: 76f5df43cab5e76 ("Always allocate a complete "struct pt_regs" on the kernel stack") recently changed this for some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit program to a 64-bit program would fail because it still returned via SYSRETL. This caused Wine to fail when built for both 32-bit and 64-bit. This patch sets TIF_NOTIFY_RESUME for execve() and sigreturn() so that the IRET path is always taken on exit to userspace. Signed-off-by: Brian Gerst <brgerst@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1426978461-32089-1-git-send-email-brgerst@gmail.com [ Improved the changelog and comments. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-23powerpc/book3s: Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLERMahesh Salgaonkar
commit id 2ba9f0d has changed CONFIG_KVM_BOOK3S_64_HV to tristate to allow HV/PR bits to be built as modules. But the MCE code still depends on CONFIG_KVM_BOOK3S_64_HV which is wrong. When user selects CONFIG_KVM_BOOK3S_64_HV=m to build HV/PR bits as a separate module the relevant MCE code gets excluded. This patch fixes the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER. This makes sure that the relevant MCE code is included when HV/PR bits are built as a separate modules. Fixes: 2ba9f0d88750 ("kvm: powerpc: book3s: Support building HV and PR KVM as module") Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-23Merge tag 'iwlwifi-for-kalle-2014-03-22' of ↵Kalle Valo
https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes * avoid panic with lots of IBSS stations * Fix dvm's behavior after suspend resume * Allow to keep connection after CSA failure * Remove a noisy by harmless WARN_ON * New device IDs
2015-03-22Linux 4.0-rc5v4.0-rc5Linus Torvalds
2015-03-22Merge tag 'md/4.0-rc4-fix' of git://neil.brown.name/mdLinus Torvalds
Pull bugfix for md from Neil Brown: "One fix for md in 4.0-rc4 Regression in recent patch causes crash on error path" * tag 'md/4.0-rc4-fix' of git://neil.brown.name/md: md: fix problems with freeing private data after ->run failure.
2015-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix missing initialization of tuple structure in nfnetlink_cthelper to avoid mismatches when looking up to attach userspace helpers to flows, from Ian Wilson. 2) Fix potential crash in nft_hash when we hit -EAGAIN in nft_hash_walk(), from Herbert Xu. 3) We don't need to indicate the hook information to update the basechain default policy in nf_tables. 4) Restore tracing over nfnetlink_log due to recent rework to accomodate logging infrastructure into nf_tables. 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY. 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and REJECT6 from xt over nftables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-22Merge tag 'driver-core-4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two bugfixes for things reported. One regression in kernfs, and another issue fixed in the LZ4 code that was fixed in the "upstream" codebase that solves a reported kernel crash Both have been in linux-next for a while" * tag 'driver-core-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: LZ4 : fix the data abort issue kernfs: handle poll correctly on 'direct_read' files.
2015-03-22Merge tag 'char-misc-4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are three fixes for 4.0-rc5 that revert 3 PCMCIA patches that were merged in 4.0-rc1 that cause regressions. So let's revert them for now and they will be reworked and resent sometime in the future. All have been tested in linux-next for a while" * tag 'char-misc-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Revert "pcmcia: add a new resource manager for non ISA systems" Revert "pcmcia: fix incorrect bracketing on a test" Revert "pcmcia: add missing include for new pci resource handler"
2015-03-22Merge tag 'staging-4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are four small staging driver fixes, all for the vt6656 and vt6655 drivers, that resolve some reported issues with them. All of these patches have been in linux next for a while" * tag 'staging-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: vt6655: Fix late setting of byRFType. vt6655: RFbSetPower fix missing rate RATE_12M staging: vt6656: vnt_rf_setpower: fix missing rate RATE_12M staging: vt6655: vnt_tx_packet fix dma_idx selection.
2015-03-22Merge tag 'tty-4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fix from Greg KH: "Here's a single 8250 serial driver that fixes a reported deadlock with the serial console and the tty driver. It's been in linux-next for a while now" * tag 'tty-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250_dw: Fix deadlock in LCR workaround
2015-03-22Merge tag 'usb-4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / PHY driver fixes from Greg KH: "Here's a number of USB and PHY driver fixes for 4.0-rc5. The largest thing here is a revert of a gadget function driver patch that removes 500 lines of code. Other than that, it's a number of reported bugs fixes and new quirk/id entries. All have been in linux-next for a while" * tag 'usb-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits) usb: common: otg-fsm: only signal connect after switching to peripheral uas: Add US_FL_NO_ATA_1X for Initio Corporation controllers / devices USB: ehci-atmel: rework clk handling MAINTAINERS: add entry for USB OTG FSM usb: chipidea: otg: add a_alt_hnp_support response for B device phy: omap-usb2: Fix missing clk_prepare call when using old dt name phy: ti/omap: Fix modalias phy: core: Fixup return value of phy_exit when !pm_runtime_enabled phy: miphy28lp: Convert to devm_kcalloc and fix wrong sizof phy: miphy365x: Convert to devm_kcalloc and fix wrong sizeof phy: twl4030-usb: Remove redundant assignment for twl->linkstat phy: exynos5-usbdrd: Fix off-by-one valid value checking for args->args[0] phy: Find the right match in devm_phy_destroy() phy: rockchip-usb: Fixup rockchip_usb_phy_power_on failure path phy: ti-pipe3: Simplify ti_pipe3_dpll_wait_lock implementation phy: samsung-usb2: Remove NULL terminating entry from phys array phy: hix5hd2-sata: Check return value of platform_get_resource phy: exynos-dp-video: Kill exynos_dp_video_phy_pwr_isol function Revert "usb: gadget: zero: Add support for interrupt EP" Revert "xhci: Clear the host side toggle manually when endpoint is 'soft reset'" ...
2015-03-22netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is setPablo Neira Ayuso
ip6tables extensions check for this flag to restrict match/target to a given protocol. Without this flag set, SYNPROXY6 returns an error. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net>
2015-03-22can: flexcan: Deferred on Regulator return EPROBE_DEFERAndreas Werner
Return EPROBE_DEFER if Regulator returns EPROBE_DEFER If the Flexcan driver is built into kernel and a regulator is used to enable the CAN transceiver, the Flexcan driver may not use the regulator. When initializing the Flexcan device with a regulator defined in the device tree, but not initialized, the regulator subsystem returns EPROBE_DEFER, hence the Flexcan init fails. The solution for this is to return EPROBE_DEFER if regulator is not initialized and wait until the regulator is initialized. Signed-off-by: Andreas Werner <kernel@andy89.org> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-03-22can: flexcan: fix bus-off error state handling.Andri Yngvason
Making sure that the bus-off state gets passed to can_change_state(). Signed-off-by: Andri Yngvason <andri.yngvason@marel.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>