summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-08-29x86/idt: Remove the tracing IDT completelyThomas Gleixner
No more users of the tracing IDT. All exception tracepoints have been moved into the regular handlers. Get rid of the mess which shouldn't have been created in the first place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064957.378851687@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/smp: Use static key for reschedule interrupt tracingThomas Gleixner
It's worth to avoid the extra irq_enter()/irq_exit() pair in the case that the reschedule interrupt tracepoints are disabled. Use the static key which indicates that exception tracing is enabled. For now this key is global. It will be optimized in a later step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170828064957.299808677@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/smp: Remove pointless duplicated interrupt codeThomas Gleixner
Two NOP5s are really a good tradeoff vs. the unholy IDT switching mess, which duplicates code all over the place. The rescheduling interrupt gets optimized in a later step. Make the ordering of function call and statistics increment the same as in other places. Calculate stats first, then do the function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064957.222101344@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/mce: Remove duplicated tracing interrupt codeThomas Gleixner
Machine checks are not really high frequency events. The extra two NOP5s for the disabled tracepoints are noise vs. the heavy lifting which needs to be done in the MCE handler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064957.144301907@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irqwork: Get rid of duplicated tracing interrupt codeThomas Gleixner
Two NOP5s are a reasonable tradeoff to avoid duplicated code and the requirement to switch the IDT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064957.064746737@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/apic: Remove the duplicated tracing versions of interruptsThomas Gleixner
The error and the spurious interrupt are really rare events and not at all performance sensitive: two NOP5s can be tolerated when tracing is disabled. Remove the complication. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170828064956.986009402@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irq: Get rid of duplicated trace_x86_platform_ipi() codeThomas Gleixner
Two NOP5s are really a good tradeoff vs. the unholy IDT switching mess, which duplicates code all over the place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170828064956.907209383@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/apic: Use this_cpu_ptr() in local_timer_interrupt()Thomas Gleixner
Accessing the per cpu data via per_cpu(, smp_processor_id()) is pointless. Use this_cpu_ptr() instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.829552757@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/apic: Remove the duplicated tracing version of local_timer_interrupt()Thomas Gleixner
The two NOP5s are noise in the rest of the work which is done by the timer interrupt and modern CPUs are pretty good in optimizing NOPs anyway. Get rid of the interrupt handler duplication and move the tracepoints into the regular handler. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170828064956.751247330@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/traps: Simplify pagefault tracing logicThomas Gleixner
Make use of the new irqvector tracing static key and remove the duplicated trace_do_pagefault() implementation. If irq vector tracing is disabled, then the overhead of this is a single NOP5, which is a reasonable tradeoff to avoid duplicated code and the unholy macro mess. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.672965407@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/tracing: Introduce a static key for exception tracingThomas Gleixner
Switching the IDT just for avoiding tracepoints creates a completely impenetrable macro/inline/ifdef mess. There is no point in avoiding tracepoints for most of the traps/exceptions. For the more expensive tracepoints, like pagefaults, this can be handled with an explicit static key. Preparatory patch to remove the tracing IDT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.593094539@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/boot: Move EISA setup to a separate fileThomas Gleixner
EISA has absolutely nothing to do with traps, so move it out of traps.c into its own eisa.c file. Furthermore, the EISA bus detection does not need to run during very early boot, it's good enough to run it before the EISA bus and drivers are initialized. I.e. instead of calling it from the very early trap_init() code, make it a subsys_initcall(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.515322409@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irq: Remove duplicated used_vectors definitionThomas Gleixner
Also remove the unparseable comment in the other place while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.436711634@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irq: Get rid of the 'first_system_vector' indirection bogosityThomas Gleixner
This variable is beyond pointless. Nothing allocates a vector via alloc_gate() below FIRST_SYSTEM_VECTOR. So nothing can change first_system_vector. If there is a need for a gate below FIRST_SYSTEM_VECTOR then it can be added to the vector defines and FIRST_SYSTEM_VECTOR can be adjusted accordingly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.357109735@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irq: Unexport used_vectors[]Thomas Gleixner
No modular users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.278375986@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29x86/irq: Remove vector_used_by_percpu_irq()Thomas Gleixner
Last user (lguest) is gone. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20170828064956.201432430@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29Merge branch 'linus' into x86/apic, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-28page waitqueue: always add new entries at the endLinus Torvalds
Commit 3510ca20ece0 ("Minor page waitqueue cleanups") made the page queue code always add new waiters to the back of the queue, which helps upcoming patches to batch the wakeups for some horrid loads where the wait queues grow to thousands of entries. However, I forgot about the nasrt add_page_wait_queue() special case code that is only used by the cachefiles code. That one still continued to add the new wait queue entries at the beginning of the list. Fix it, because any sane batched wakeup will require that we don't suddenly start getting new entries at the beginning of the list that we already handled in a previous batch. [ The current code always does the whole list while holding the lock, so wait queue ordering doesn't matter for correctness, but even then it's better to add later entries at the end from a fairness standpoint ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-28cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configsTejun Heo
When !NUMA, cpumask_of_node(@node) equals cpu_online_mask regardless of @node. The assumption seems that if !NUMA, there shouldn't be more than one node and thus reporting cpu_online_mask regardless of @node is correct. However, that assumption was broken years ago to support DISCONTIGMEM and whether a system has multiple nodes or not is separately controlled by NEED_MULTIPLE_NODES. This means that, on a system with !NUMA && NEED_MULTIPLE_NODES, cpumask_of_node() will report cpu_online_mask for all possible nodes, indicating that the CPUs are associated with multiple nodes which is an impossible configuration. This bug has been around forever but doesn't look like it has caused any noticeable symptoms. However, it triggers a WARN recently added to workqueue to verify NUMA affinity configuration. Fix it by reporting empty cpumask on non-zero nodes if !NUMA. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-28ARCv2: SMP: Mask only private-per-core IRQ lines on boot at core intcAlexey Brodkin
Recent commit a8ec3ee861b6 "arc: Mask individual IRQ lines during core INTC init" breaks interrupt handling on ARCv2 SMP systems. That commit masked all interrupts at onset, as some controllers on some boards (customer as well as internal), would assert interrutps early before any handlers were installed. For SMP systems, the masking was done at each cpu's core-intc. Later, when the IRQ was actually requested, it was unmasked, but only on the requesting cpu. For "common" interrupts, which were wired up from the 2nd level IDU intc, this was as issue as they needed to be enabled on ALL the cpus (given that IDU IRQs are by default served Round Robin across cpus) So fix that by NOT masking "common" interrupts at core-intc, but instead at the 2nd level IDU intc (latter already being done in idu_of_init()) Fixes: a8ec3ee861b6 ("arc: Mask individual IRQ lines during core INTC init") Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> [vgupta: reworked changelog, removed the extraneous idu_irq_mask_raw()] Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-28fs/select: Fix memory corruption in compat_get_fd_set()Helge Deller
Commit 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()") changed the calculation on how many bytes need to be zeroed when userspace handed over a NULL pointer for a fdset array in the select syscall. The calculation was changed in compat_get_fd_set() wrongly from memset(fdset, 0, ((nr + 1) & ~1)*sizeof(compat_ulong_t)); to memset(fdset, 0, ALIGN(nr, BITS_PER_LONG)); The ALIGN(nr, BITS_PER_LONG) calculates the number of _bits_ which need to be zeroed in the target fdset array (rounded up to the next full bits for an unsigned long). But the memset() call expects the number of _bytes_ to be zeroed. This leads to clearing more memory than wanted (on the stack area or even at kmalloc()ed memory areas) and to random kernel crashes as we have seen them on the parisc platform. The correct change should have been memset(fdset, 0, (ALIGN(nr, BITS_PER_LONG) / BITS_PER_LONG) * BYTES_PER_LONG); which is the same as can be archieved with a call to zero_fd_set(nr, fdset). Fixes: 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()" Acked-by:: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-28Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreamingLinus Torvalds
Pull c6x tweaks from Mark Salter. * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: c6x: Convert to using %pOF instead of full_name c6x: defconfig: Cleanup from old Kconfig options
2017-08-27Linux 4.13-rc7Linus Torvalds
2017-08-27Merge tag 'iommu-fixes-v4.13-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Another fix, this time in common IOMMU sysfs code. In the conversion from the old iommu sysfs-code to the iommu_device_register interface, I missed to update the release path for the struct device associated with an IOMMU. It freed the 'struct device', which was a pointer before, but is now embedded in another struct. Freeing from the middle of allocated memory had all kinds of nasty side effects when an IOMMU was unplugged. Unfortunatly nobody unplugged and IOMMU until now, so this was not discovered earlier. The fix is to make the 'struct device' a pointer again" * tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Fix wrong freeing of iommu_device->dev
2017-08-27Merge tag 'char-misc-4.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fix from Greg KH: "Here is a single misc driver fix for 4.13-rc7. It resolves a reported problem in the Android binder driver due to previous patches in 4.13-rc. It's been in linux-next with no reported issues" * tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: ANDROID: binder: fix proc->tsk check.
2017-08-27Merge tag 'staging-4.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/iio fixes from Greg KH: "Here are few small staging driver fixes, and some more IIO driver fixes for 4.13-rc7. Nothing major, just resolutions for some reported problems. All of these have been in linux-next with no reported problems" * tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: magnetometer: st_magn: remove ihl property for LSM303AGR iio: magnetometer: st_magn: fix status register address for LSM303AGR iio: hid-sensor-trigger: Fix the race with user space powering up sensors iio: trigger: stm32-timer: fix get trigger mode iio: imu: adis16480: Fix acceleration scale factor for adis16480 PATCH] iio: Fix some documentation warnings staging: rtl8188eu: add RNX-N150NUB support Revert "staging: fsl-mc: be consistent when checking strcmp() return" iio: adc: stm32: fix common clock rate iio: adc: ina219: Avoid underflow for sleeping time iio: trigger: stm32-timer: add enable attribute iio: trigger: stm32-timer: fix get/set down count direction iio: trigger: stm32-timer: fix write_raw return value iio: trigger: stm32-timer: fix quadrature mode get routine iio: bmp280: properly initialize device for humidity reading
2017-08-27Merge tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntbLinus Torvalds
Pull NTB fixes from Jon Mason: "NTB bug fixes to address an incorrect ntb_mw_count reference in the NTB transport, improperly bringing down the link if SPADs are corrupted, and an out-of-order issue regarding link negotiation and data passing" * tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb: ntb: ntb_test: ensure the link is up before trying to configure the mws ntb: transport shouldn't disable link due to bogus values in SPADs ntb: use correct mw_count function in ntb_tool and ntb_transport
2017-08-27Avoid page waitqueue race leaving possible page locker waitingLinus Torvalds
The "lock_page_killable()" function waits for exclusive access to the page lock bit using the WQ_FLAG_EXCLUSIVE bit in the waitqueue entry set. That means that if it gets woken up, other waiters may have been skipped. That, in turn, means that if it sees the page being unlocked, it *must* take that lock and return success, even if a lethal signal is also pending. So instead of checking for lethal signals first, we need to check for them after we've checked the actual bit that we were waiting for. Even if that might then delay the killing of the process. This matches the order of the old "wait_on_bit_lock()" infrastructure that the page locking used to use (and is still used in a few other areas). Note that if we still return an error after having unsuccessfully tried to acquire the page lock, that is ok: that means that some other thread was able to get ahead of us and lock the page, and when that other thread then unlocks the page, the wakeup event will be repeated. So any other pending waiters will now get properly woken up. Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Jan Kara <jack@suse.cz> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-27Minor page waitqueue cleanupsLinus Torvalds
Tim Chen and Kan Liang have been battling a customer load that shows extremely long page wakeup lists. The cause seems to be constant NUMA migration of a hot page that is shared across a lot of threads, but the actual root cause for the exact behavior has not been found. Tim has a patch that batches the wait list traversal at wakeup time, so that we at least don't get long uninterruptible cases where we traverse and wake up thousands of processes and get nasty latency spikes. That is likely 4.14 material, but we're still discussing the page waitqueue specific parts of it. In the meantime, I've tried to look at making the page wait queues less expensive, and failing miserably. If you have thousands of threads waiting for the same page, it will be painful. We'll need to try to figure out the NUMA balancing issue some day, in addition to avoiding the excessive spinlock hold times. That said, having tried to rewrite the page wait queues, I can at least fix up some of the braindamage in the current situation. In particular: (a) we don't want to continue walking the page wait list if the bit we're waiting for already got set again (which seems to be one of the patterns of the bad load). That makes no progress and just causes pointless cache pollution chasing the pointers. (b) we don't want to put the non-locking waiters always on the front of the queue, and the locking waiters always on the back. Not only is that unfair, it means that we wake up thousands of reading threads that will just end up being blocked by the writer later anyway. Also add a comment about the layout of 'struct wait_page_key' - there is an external user of it in the cachefiles code that means that it has to match the layout of 'struct wait_bit_key' in the two first members. It so happens to match, because 'struct page *' and 'unsigned long *' end up having the same values simply because the page flags are the first member in struct page. Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Christopher Lameter <cl@linux.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-27Clarify (and fix) MAX_LFS_FILESIZE macrosLinus Torvalds
We have a MAX_LFS_FILESIZE macro that is meant to be filled in by filesystems (and other IO targets) that know they are 64-bit clean and don't have any 32-bit limits in their IO path. It turns out that our 32-bit value for that limit was bogus. On 32-bit, the VM layer is limited by the page cache to only 32-bit index values, but our logic for that was confusing and actually wrong. We used to define that value to (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1) which is actually odd in several ways: it limits the index to 31 bits, and then it limits files so that they can't have data in that last byte of a page that has the highest 31-bit index (ie page index 0x7fffffff). Neither of those limitations make sense. The index is actually the full 32 bit unsigned value, and we can use that whole full page. So the maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG". However, we do wan tto avoid the maximum index, because we have code that iterates over the page indexes, and we don't want that code to overflow. So the maximum size of a file on a 32-bit host should actually be one page less than the full 32-bit index. So the actual limit is ULONG_MAX << PAGE_SHIFT. That means that we will not actually be using the page of that last index (ULONG_MAX), but we can grow a file up to that limit. The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5 volume. It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one byte less), but the actual true VM limit is one page less than 16TiB. This was invisible until commit c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"), which started applying that MAX_LFS_FILESIZE limit to block devices too. NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is actually just the offset type itself (loff_t), which is signed. But for clarity, on 64-bit, just use the maximum signed value, and don't make people have to count the number of 'f' characters in the hex constant. So just use LLONG_MAX for the 64-bit case. That was what the value had been before too, just written out as a hex constant. Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-and-tested-by: Doug Nazar <nazard@nazar.ca> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a tweak to the IBM Trackpoint driver that helps recognizing trackpoints on never Lenovo Carbons - a fix to the ALPS driver solving scroll issues on some Dells - yet another ACPI ID has been added to Elan I2C toucpad driver - quieted diagnostic message in soc_button_array driver * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365 Input: trackpoint - add new trackpoint firmware ID Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
2017-08-26Merge tag 'pci-v4.13-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Remove needlessly alarming MSI affinity warning (this is not actually a bug fix, but the warning prompts unnecessary bug reports)" * tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
2017-08-26Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: one for an ldt_struct handling bug and a cherry-picked objtool fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix use-after-free of ldt_struct objtool: Fix '-mtune=atom' decoding support in objtool 2.0
2017-08-26Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a timer granularity handling race+bug, which would manifest itself by spuriously increasing timeouts of some timers (from 1 jiffy to ~500 jiffies in the worst case measured) in certain nohz states" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Fix excessive granularity of new timers after a nohz idle
2017-08-26Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "A single fix to not allow nonsensical event groups that result in kernel warnings" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix group {cpu,task} validation
2017-08-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "6 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/memblock.c: reversed logic in memblock_discard() fork: fix incorrect fput of ->exe_file causing use-after-free mm/madvise.c: fix freeing of locked page with MADV_FREE dax: fix deadlock due to misaligned PMD faults mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled PM/hibernate: touch NMI watchdog when creating snapshot
2017-08-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull Paolo Bonzini: "Bugfixes for x86, PPC and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state KVM: x86: simplify handling of PKRU KVM: x86: block guest protection keys unless the host has them enabled KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9 KVM: s390: sthyi: fix specification exception detection KVM: s390: sthyi: fix sthyi inline assembly
2017-08-25Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Fixes two obvious bugs in virtio pci" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_pci: fix cpu affinity support virtio_blk: fix incorrect message when disk is resized
2017-08-25Merge tag 'powerpc-4.13-8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix, to add a barrier in the switch_mm() code to make sure the mm cpumask update is ordered vs the MMU starting to load translations. As far as we know no one's actually hit the bug, but that's just luck. Thanks to Benjamin Herrenschmidt, Nicholas Piggin" * tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Ensure cpumask update is ordered
2017-08-25Merge tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "Two nfsd bugfixes, neither 4.13 regressions, but both potentially serious" * tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux: net: sunrpc: svcsock: fix NULL-pointer exception nfsd: Limit end of page list when decoding NFSv4 WRITE
2017-08-25Merge tag 'cifs-fixes-for-4.13-rc6-and-stable' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Some bug fixes for stable for cifs" * tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6: cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup() cifs: Fix df output for users with quota limits
2017-08-25Merge tag 'for-linus-20170825' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull MTD fixes from Brian Norris: "Two fixes - one for a 4.13 regression, and the other for an older one: - Atmel NAND: since we started utilizing ONFI timings, we found that we were being too restrict at rejecting them, partly due to discrepancies in ONFI 4.0 and earlier versions. Relax the restriction to keep these platforms booting. This is a 4.13-rc1 regression. - nandsim: repeated probe/removal may not work after a failed init, because we didn't free up our debugfs files properly on the failure path. This has been around since 3.8, but it's nice to get this fixed now in a nice easy patch that can target -stable, since there's already refactoring work (that also fixes the issue) targeted for the next merge window" * tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd: mtd: nand: atmel: Relax tADL_min constraint mtd: nandsim: remove debugfs entries in error path
2017-08-25Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A small batch of fixes that should be included for the 4.13 release. This contains: - Revert of the 4k loop blocksize support. Even with a recent batch of 4 fixes, we're still not really happy with it. Rather than be stuck with an API issue, let's revert it and get it right for 4.14. - Trivial patch from Bart, adding a few flags to the blk-mq debugfs exports that were added in this release, but not to the debugfs parts. - Regression fix for bsg, fixing a potential kernel panic. From Benjamin. - Tweak for the blk throttling, improving how we account discards. From Shaohua" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq-debugfs: Add names for recently added flags bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer Revert "loop: support 4k physical blocksize" blk-throttle: cap discard request size
2017-08-25Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has some bugfixes for you: mainly Jarkko fixed up a few things in the designware driver regarding the new slave mode. But Ulf also fixed a long-standing and now agreed suspend problem. Plus, some simple stuff which nonetheless needs fixing" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: designware: Fix runtime PM for I2C slave mode i2c: designware: Remove needless pm_runtime_put_noidle() call i2c: aspeed: fixed potential null pointer dereference i2c: simtec: use release_mem_region instead of release_resource i2c: core: Make comment about I2C table requirement to reflect the code i2c: designware: Fix standard mode speed when configuring the slave mode i2c: designware: Fix oops from i2c_dw_irq_handler_slave i2c: designware: Fix system suspend
2017-08-25PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULLChristoph Hellwig
irq_create_affinity_masks() can return NULL on non-SMP systems, when there are not enough "free" vectors available to spread, or if memory allocation for the CPU masks fails. Only the allocation failure is of interest, and even then the system will work just fine except for non-optimally spread vectors. Thus remove the warnings. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net>
2017-08-25Merge tag 'mmc-v4.13-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "MMC core: don't return error code R1_OUT_OF_RANGE for open-ending mode" * tag 'mmc-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: block: prevent propagating R1_OUT_OF_RANGE for open-ending mode
2017-08-25Merge tag 'sound-4.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "We're keeping in a good shape, this batch contains just a few small fixes (a regression fix for ASoC rt5677 codec, NULL dereference and error-path fixes in firewire, and a corner-case ioctl error fix for user TLV), as well as usual quirks for USB-audio and HD-audio" * tag 'sound-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: rt5677: Reintroduce I2C device IDs ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978) ALSA: core: Fix unexpected error at replacing user TLV ALSA: usb-audio: Add delay quirk for H650e/Jabra 550a USB headsets ALSA: firewire-motu: destroy stream data surely at failure of card initialization ALSA: firewire: fix NULL pointer dereference when releasing uninitialized data of iso-resource
2017-08-25Merge tag 'dmaengine-fix-4.13-rc7' of ↵Linus Torvalds
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fix from Vinod Koul: "A single fix for tegra210-adma driver to check of_irq_get() error" * tag 'dmaengine-fix-4.13-rc7' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: tegra210-adma: fix of_irq_get() error check
2017-08-25Merge tag 'drm-fixes-for-v4.13-rc7' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Fixes for rc7, nothing too crazy, some core, i915, and sunxi fixes, Intel CI has been responsible for some of these fixes being required" * tag 'drm-fixes-for-v4.13-rc7' of git://people.freedesktop.org/~airlied/linux: drm/i915/gvt: Fix the kernel null pointer error drm: Release driver tracking before making the object available again drm/i915: Clear lost context-switch interrupts across reset drm/i915/bxt: use NULL for GPIO connection ID drm/i915/cnl: Fix LSPCON support. drm/i915/vbt: ignore extraneous child devices for a port drm/i915: Initialize 'data' in intel_dsi_dcs_backlight.c drm/atomic: If the atomic check fails, return its value first drm/atomic: Handle -EDEADLK with out-fences correctly drm: Fix framebuffer leak drm/imx: ipuv3-plane: fix YUV framebuffer scanout on the base plane gpu: ipu-v3: add DRM dependency drm/rockchip: Fix suspend crash when drm is not bound drm/sun4i: Implement drm_driver lastclose to restore fbdev console
2017-08-25mm/memblock.c: reversed logic in memblock_discard()Pavel Tatashin
In recently introduced memblock_discard() there is a reversed logic bug. Memory is freed of static array instead of dynamically allocated one. Link: http://lkml.kernel.org/r/1503511441-95478-2-git-send-email-pasha.tatashin@oracle.com Fixes: 3010f876500f ("mm: discard memblock data later") Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reported-by: Woody Suwalski <terraluna977@gmail.com> Tested-by: Woody Suwalski <terraluna977@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>