Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One notable fix for kexec on Power9, where we were not clearing MMU
PID properly which sometimes leads to hangs. Finally debugged to a
root cause by Nick.
A revert of a patch which tried to rework our panic handling to get
more output on the console, but inadvertently broke reporting the
panic to the hypervisor, which apparently people care about.
Then a fix for an oops in the PMU code, and finally some s/%p/%px/ in
xmon.
Thanks to: David Gibson, Nicholas Piggin, Ravi Bangoria"
* tag 'powerpc-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/xmon: Don't print hashed pointers in xmon
powerpc/64s: Initialize ISAv3 MMU registers before setting partition table
Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"
powerpc/perf: Fix oops when grouping different pmu events
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- three more patches in regard to the SPDX license tags. The missing
tags for the files in arch/s390/kvm will be merged via the KVM tree.
With that all s390 related files should have their SPDX tags.
- a patch to get rid of 'struct timespec' in the DASD driver.
- bug fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix compat system call table
s390/mm: fix off-by-one bug in 5-level page table handling
s390: Remove redudant license text
s390: add a few more SPDX identifiers
s390/dasd: prevent prefix I/O error
s390: always save and restore all registers on context switch
s390/dasd: remove 'struct timespec' usage
s390/qdio: restrict target-full handling to IQDIO
s390/qdio: consider ERROR buffers for inbound-full condition
s390/virtio: add BSD license to virtio-ccw
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Fix some more FP register fallout from the SVE patches and also some
problems with the PGD tracking in our software PAN emulation code,
after we received a crash report from a 3.18 kernel running a
backport.
Summary:
- fix SW PAN pgd shadowing for kernel threads, EFI and exiting user
tasks
- fix FP register leak when a task_struct is re-allocated
- fix potential use-after-free in FP state tracking used by KVM"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/sve: Avoid dereference of dead task_struct in KVM guest entry
arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlb
arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mm
arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu.
arm64: fpsimd: Prevent registers leaking from dead tasks
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This fixes an out of bounds warning from KASAN in the ACPI CPPC
driver"
* tag 'acpi-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / CPPC: Fix KASAN global out of bounds warning
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"This fixes an issue in the device runtime PM framework that prevents
customer devices from resuming if runtime PM is disabled for one or
more of their supplier devices (as reflected by device links between
those devices)"
* tag 'pm-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / runtime: Fix handling of suppliers with disabled runtime PM
|
|
When wiring up the socket system calls the compat entries were
incorrectly set. Not all of them point to the corresponding compat
wrapper functions, which clear the upper 33 bits of user space
pointers, like it is required.
Fixes: 977108f89c989 ("s390: wire up separate socketcalls system calls")
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb
Pull kgdb fixes from Jason Wessel:
- Fix long standing problem with kdb kallsyms_symbol_next() return
value
- Add new co-maintainer Daniel Thompson
* tag 'for_linus-4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
kgdb/kdb/debug_core: Add co-maintainer Daniel Thompson
kdb: Fix handling of kallsyms_symbol_next() return value
|
|
It's a user pointer, and while the permissions of the file are pretty
questionable (should it really be readable to everybody), hashing the
pointer isn't going to be the solution.
We should take a closer look at more of the /proc/<pid> file permissions
in general. Sure, we do want many of them to often be readable (for
'ps' and friends), but I think we should probably do a few conversions
from S_IRUGO to S_IRUSR.
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu fixes from Greg Ungerer:
"There are two fixes here. One to add a missing linker section to the
m68k architecture linker scripts, the other to fix a defconfig build
problem"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k/defconfig: fix stmark2 broken local compilation
m68k: add missing SOFTIRQENTRY_TEXT linker section
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- make CR4 handling irq-safe, which bug vmware guests ran into
- don't crash on early IRQs in Xen guests
- don't crash secondary CPU bringup if #UD assisted WARN()ings are
triggered
- make X86_BUG_FXSAVE_LEAK optional on newer AMD CPUs that have the fix
- fix AMD Fam17h microcode loading
- fix broadcom_postcore_init() if ACPI is disabled
- fix resume regression in __restore_processor_context()
- fix Sparse warnings
- fix a GCC-8 warning
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Change time() prototype to match __vdso_time()
x86: Fix Sparse warnings about non-static functions
x86/power: Fix some ordering bugs in __restore_processor_context()
x86/PCI: Make broadcom_postcore_init() check acpi_disabled
x86/microcode/AMD: Add support for fam17h microcode loading
x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
x86/idt: Load idt early in start_secondary
x86/xen: Support early interrupts in xen pv guests
x86/tlb: Disable interrupts when changing CR4
x86/tlb: Refactor CR4 setting and shadow write
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
"A single fix moving the smp-call queue flush step to the intended
point in the state machine"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp/hotplug: Move step CPUHP_AP_SMPCFD_DYING to the correct place
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"This includes a fix for the add_wait_queue() queue ordering brown
paperbag bug, plus PELT accounting fixes for cgroups scheduling
artifacts"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Update and fix the runnable propagation rule
sched/wait: Fix add_wait_queue() behavioral change
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"This includes perf namespace support kernel side fixes, plus an
accumulated set of perf tooling fixes - including UAPI header
synchronization that should make the perf build less noisy"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
tooling/headers: Synchronize updated s390 and x86 UAPI headers
tools headers: Syncronize mman.h ABI header
tools headers: Synchronize prctl.h ABI header
tools headers: Synchronize KVM arch ABI headers
tools headers: Synchronize drm/i915_drm.h
tools headers uapi: Synchronize drm/drm.h
tools headers: Synchronize perf_event.h header
tools headers: Synchronize kernel ABI headers wrt SPDX tags
tools/headers: Synchronize kernel x86 UAPI headers
perf intel-pt: Bring instruction decoder files into line with the kernel
perf test: Fix test 21 for s390x
perf bench numa: Fixup discontiguous/sparse numa nodes
perf top: Use signal interface for SIGWINCH handler
perf top: Fix window dimensions change handling
perf: Fix header.size for namespace events
perf top: Ignore kptr_restrict when not sampling the kernel
perf record: Ignore kptr_restrict when not sampling the kernel
perf report: Ignore kptr_restrict when not sampling the kernel
perf evlist: Add helper to check if attr.exclude_kernel is set in all evsels
perf test shell: Fix test case probe libc's inet_pton on s390x
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull lockdep fix from Ingo Molnar:
"Fix a possible NULL dereference for the (rare) case when a task
doesn't have ->xhlocks space allocated due to kmalloc() OOM-ing"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Fix possible NULL deref
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Two fixes: use bool type consistently, plus a irq_matrix_available()
bugfix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqdesc: Use bool return type instead of int
genirq/matrix: Fix the precedence fix for real
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Misc fixes: world-readable pointer removal from sysfs, a ESRT kfree()
bug fix and a comment update"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Add comment to avoid future expanding of sysfs systab
efi/esrt: Use memunmap() instead of kfree() to free the remapping
efi: Move some sysfs files to be read-only by root
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Ingo Molnar:
"Two fixes:
- objtool cross-build fixes
- removal of an obsolete CPU-hotplug state name from comments"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix 64-bit build on 32-bit host
cpu/hotplug: Fix state name in takedown_cpu() comment
|
|
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
|
|
kallsyms_symbol_next() returns a boolean (true on success). Currently
kdb_read() tests the return value with an inequality that
unconditionally evaluates to true.
This is fixed in the obvious way and, since the conditional branch is
supposed to be unreachable, we also add a WARN_ON().
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
|
|
There were two trivial updates to these upstream UAPI headers:
arch/s390/include/uapi/asm/kvm.h
arch/s390/include/uapi/asm/kvm_perf.h
arch/x86/lib/x86-opcode-map.txt
Synchronize them with their tooling copies.
(The x86 opcode map includes a new instruction pattern now.)
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The new ORC unwinder breaks the build of a 64-bit kernel on a 32-bit
host. Building the kernel on a i386 or x32 host fails with:
orc_dump.c: In function 'orc_dump':
orc_dump.c:105:26: error: passing argument 2 of 'elf_getshdrnum' from incompatible pointer type [-Werror=incompatible-pointer-types]
if (elf_getshdrnum(elf, &nr_sections)) {
^
In file included from /usr/local/include/gelf.h:32:0,
from elf.h:22,
from warn.h:26,
from orc_dump.c:20:
/usr/local/include/libelf.h:304:12: note: expected 'size_t * {aka unsigned int *}' but argument is of type 'long unsigned int *'
extern int elf_getshdrnum (Elf *__elf, size_t *__dst);
^~~~~~~~~~~~~~
orc_dump.c:190:17: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Sxword {aka long long int}' [-Werror=format=]
printf("%s+%lx:", name, rela.r_addend);
~~^ ~~~~~~~~~~~~~
%llx
Fix the build failure.
Another problem is that if the user specifies HOSTCC or HOSTLD
variables, they are ignored in the objtool makefile. Change the
Makefile to respect these variables.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Joachim <svenjoac@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 627fce14809b ("objtool: Add ORC unwind table generation")
Link: http://lkml.kernel.org/r/19f0e64d8e07e30a7b307cd010eb780c404fe08d.1512252895.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
gcc-8 warns that time() is an alias for __vdso_time() but the two
have different prototypes:
arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias]
int time(time_t *t)
^~~~
arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here
I could not figure out whether this is intentional, but I see that
changing it to return time_t avoids the warning.
Returning 'int' from time() is also a bit questionable, as it causes an
overflow in y2038 even on 64-bit architectures that use a 64-bit time_t
type. On 32-bit architecture with 64-bit time_t, time() should always
be implement by the C library by calling a (to be added) clock_gettime()
variant that takes a sufficiently wide argument.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When deciding whether to invalidate FPSIMD state cached in the cpu,
the backend function sve_flush_cpu_state() attempts to dereference
__this_cpu_read(fpsimd_last_state). However, this is not safe:
there is no guarantee that this task_struct pointer is still valid,
because the task could have exited in the meantime.
This means that we need another means to get the appropriate value
of TIF_SVE for the associated task.
This patch solves this issue by adding a cached copy of the TIF_SVE
flag in fpsimd_last_state, which we can check without dereferencing
the task pointer.
In particular, although this patch is not a KVM fix per se, this
means that this check is now done safely in the KVM world switch
path (which is currently the only user of this code).
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Pull IOMMU fix from Alex Williamson:
"Fix VT-d handling of scatterlists where sg->offset exceeds PAGE_SIZE"
* tag 'iommu-v4.15-rc3' of git://github.com/awilliam/linux-vfio:
iommu/vt-d: Fix scatterlist offset handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All fixes are small and for stable:
- a PCM ioctl race fix
- yet another USB-audio hardening for malicious descriptors
- Realtek ALC257 codec support"
* tag 'sound-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: prevent UAF in snd_pcm_info
ALSA: hda/realtek - New codec support for ALC257
ALSA: usb-audio: Add check return value for usb_string()
ALSA: usb-audio: Fix out-of-bound error
ALSA: seq: Remove spurious WARN_ON() at timer check
|
|
Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common()
are local to the source and do not need to be in global scope, so make them
static.
Fixes up various sparse warnings.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Cc: travis@sgi.com
Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
/sys/firmware/efi/systab shows several different values, it breaks sysfs
one file one value design. But since there are already userspace tools
depend on it eg. kexec-tools so add code comment to alert future expanding
of this file.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20171206095010.24170-4-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The remapping result of memremap() should be freed with memunmap(), not kfree().
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20171206095010.24170-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Thanks to the scripts/leaking_addresses.pl script, it was found that
some EFI values should not be readable by non-root users.
So make them root-only, and to do that, add a __ATTR_RO_MODE() macro to
make this easier, and use it in other places at the same time.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Cc: stable <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20171206095010.24170-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Unlike running, the runnable part can't be directly propagated through
the hierarchy when we migrate a task. The main reason is that runnable
time can be shared with other sched_entities that stay on the rq and
this runnable time will also remain on prev cfs_rq and must not be
removed.
Instead, we can estimate what should be the new runnable of the prev
cfs_rq and check that this estimation stay in a possible range. The
prop_runnable_sum is a good estimation when adding runnable_sum but
fails most often when we remove it. Instead, we could use the formula
below instead:
gcfs_rq's runnable_sum = gcfs_rq->avg.load_sum / gcfs_rq->load.weight
which assumes that tasks are equally runnable which is not true but
easy to compute.
Beside these estimates, we have several simple rules that help us to filter
out wrong ones:
- ge->avg.runnable_sum <= than LOAD_AVG_MAX
- ge->avg.runnable_sum >= ge->avg.running_sum (ge->avg.util_sum << LOAD_AVG_MAX)
- ge->avg.runnable_sum can't increase when we detach a task
The effect of these fixes is better cgroups balancing.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yuyang Du <yuyang.du@intel.com>
Link: http://lkml.kernel.org/r/1510842112-21028-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The following cleanup commit:
50816c48997a ("sched/wait: Standardize internal naming of wait-queue entries")
... unintentionally changed the behavior of add_wait_queue() from
inserting the wait entry at the head of the wait queue to the tail
of the wait queue.
Beyond a negative performance impact this change in behavior
theoretically also breaks wait queues which mix exclusive and
non-exclusive waiters, as non-exclusive waiters will not be
woken up if they are queued behind enough exclusive waiters.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Fixes: ("sched/wait: Standardize internal naming of wait-queue entries")
Link: http://lkml.kernel.org/r/a16c8ccffd39bd08fdaa45a5192294c784b803a7.1512544324.git.osandov@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
We can't invalidate xhlocks when we've not yet allocated any.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: f52be5708076 ("locking/lockdep: Untangle xhlock history save/restore from task independence")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
CPUHP_AP_SCHED_MIGRATE_DYING doesn't exist, it looks like this was
supposed to refer to CPUHP_AP_SCHED_STARTING's teardown callback,
i.e. sched_cpu_dying().
Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171206105911.28093-1-brendan.jackman@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
enter_lazy_tlb is called when a kernel thread rides on the back of
another mm, due to a context switch or an explicit call to unuse_mm
where a call to switch_mm is elided.
In these cases, it's important to keep the saved ttbr value up to date
with the active mm, otherwise we can end up with a stale value which
points to a potentially freed page table.
This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0
is kept up-to-date with the active mm for kernel threads.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper
contains kernel mappings and should never be installed into ttbr0. However,
this means that callers must avoid passing the init_mm to update_saved_ttbr0
which in turn can cause the saved ttbr0 value to be out-of-date in the context
of the idle thread. For example, EFI runtime services may leave the saved ttbr0
pointing at the EFI page table, and kernel threads may end up with stale
references to freed page tables.
This patch changes update_saved_ttbr0 so that the init_mm points the saved
ttbr0 value to the empty zero page, which always exists and never contains
valid translations. EFI and switch can then call into update_saved_ttbr0
unconditionally.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: 39bc88e5e38e9b21 ("arm64: Disable TTBR0_EL1 during normal kernel execution")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
There is currently some duplicate logic to associate current's
FPSIMD context with the cpu when loading FPSIMD state into the cpu
regs.
Subsequent patches will update that logic, so in order to ensure it
only needs to be done in one place, this patch factors the relevant
code out into a new function fpsimd_bind_to_cpu().
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Currently, loading of a task's fpsimd state into the CPU registers
is skipped if that task's state is already present in the registers
of that CPU.
However, the code relies on the struct fpsimd_state * (and by
extension struct task_struct *) to unambiguously identify a task.
There is a particular case in which this doesn't work reliably:
when a task exits, its task_struct may be recycled to describe a
new task.
Consider the following scenario:
1) Task P loads its fpsimd state onto cpu C.
per_cpu(fpsimd_last_state, C) := P;
P->thread.fpsimd_state.cpu := C;
2) Task X is scheduled onto C and loads its fpsimd state on C.
per_cpu(fpsimd_last_state, C) := X;
X->thread.fpsimd_state.cpu := C;
3) X exits, causing X's task_struct to be freed.
4) P forks a new child T, which obtains X's recycled task_struct.
T == X.
T->thread.fpsimd_state.cpu == C (inherited from P).
5) T is scheduled on C.
T's fpsimd state is not loaded, because
per_cpu(fpsimd_last_state, C) == T (== X) &&
T->thread.fpsimd_state.cpu == C.
(This is the check performed by fpsimd_thread_switch().)
So, T gets X's registers because the last registers loaded onto C
were those of X, in (2).
This patch fixes the problem by ensuring that the sched-in check
fails in (5): fpsimd_flush_task_state(T) is called when T is
forked, so that T->thread.fpsimd_state.cpu == C cannot be true.
This relies on the fact that T is not schedulable until after
copy_thread() completes.
Once T's fpsimd state has been loaded on some CPU C there may still
be other cpus D for which per_cpu(fpsimd_last_state, D) ==
&X->thread.fpsimd_state. But D is necessarily != C in this case,
and the check in (5) must fail.
An alternative fix would be to do refcounting on task_struct. This
would result in each CPU holding a reference to the last task whose
fpsimd state was loaded there. It's not clear whether this is
preferable, and it involves higher overhead than the fix proposed
in this patch. It would also move all the task_struct freeing
work into the context switch critical section, or otherwise some
deferred cleanup mechanism would need to be introduced, neither of
which seems obviously justified.
Cc: <stable@vger.kernel.org>
Fixes: 005f78cd8849 ("arm64: defer reloading a task's FPSIMD state to userland resume")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: word-smithed the comment so it makes more sense]
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers printed with %p are hashed, ie. you don't see the actual
pointer value but rather a cryptographic hash of its value.
In xmon we want to see the actual pointer values, because xmon is a
debugger, so replace %p with %px which prints the actual pointer
value.
We justify doing this in xmon because 1) xmon is a kernel crash
debugger, it's only accessible via the console 2) xmon doesn't print
to dmesg, so the pointers it prints are not able to be leaked that
way.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
kexec can leave MMU registers set when booting into a new kernel,
the PIDR (Process Identification Register) in particular. The boot
sequence does not zero PIDR, so it only gets set when CPUs first
switch to a userspace processes (until then it's running a kernel
thread with effective PID = 0).
This leaves a window where a process table entry and page tables are
set up due to user processes running on other CPUs, that happen to
match with a stale PID. The CPU with that PID may cause speculative
accesses that address quadrant 0 (aka userspace addresses), which will
result in cached translations and PWC (Page Walk Cache) for that
process, on a CPU which is not in the mm_cpumask and so they will not
be invalidated properly.
The most common result is the kernel hanging in infinite page fault
loops soon after kexec (usually in schedule_tail, which is usually the
first non-speculative quadrant 0 access to a new PID) due to a stale
PWC. However being a stale translation error, it could result in
anything up to security and data corruption problems.
Fix this by zeroing out PIDR at boot and kexec.
Fixes: 7e381c0ff618 ("powerpc/mm/radix: Add mmu context handling callback for radix")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
__restore_processor_context() had a couple of ordering bugs. It
restored GSBASE after calling load_gs_index(), and the latter can
call into tracing code. It also tried to restore segment registers
before restoring the LDT, which is straight-up wrong.
Reorder the code so that we restore GSBASE, then the descriptor
tables, then the segments.
This fixes two bugs. First, it fixes a regression that broke resume
under certain configurations due to irqflag tracing in
native_load_gs_index(). Second, it fixes resume when the userspace
process that initiated suspect had funny segments. The latter can be
reproduced by compiling this:
// SPDX-License-Identifier: GPL-2.0
/*
* ldt_echo.c - Echo argv[1] while using an LDT segment
*/
int main(int argc, char **argv)
{
int ret;
size_t len;
char *buf;
const struct user_desc desc = {
.entry_number = 0,
.base_addr = 0,
.limit = 0xfffff,
.seg_32bit = 1,
.contents = 0, /* Data, grow-up */
.read_exec_only = 0,
.limit_in_pages = 1,
.seg_not_present = 0,
.useable = 0
};
if (argc != 2)
errx(1, "Usage: %s STRING", argv[0]);
len = asprintf(&buf, "%s\n", argv[1]);
if (len < 0)
errx(1, "Out of memory");
ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc));
if (ret < -1)
errno = -ret;
if (ret)
err(1, "modify_ldt");
asm volatile ("movw %0, %%es" :: "rm" ((unsigned short)7));
write(1, buf, len);
return 0;
}
and running ldt_echo >/sys/power/mem
Without the fix, the latter causes a triple fault on resume.
Fixes: ca37e57bbe0c ("x86/entry/64: Add missing irqflags tracing to native_load_gs_index()")
Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/6b31721ea92f51ea839e79bd97ade4a75b1eeea2.1512057304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
acpi_os_get_root_pointer() may return a valid address even if acpi_disabled
is set, but the host bridge information from the ACPI tables is not going
to be used in that case and the Broadcom host bridge initialization should
not be skipped then, So make broadcom_postcore_init() check acpi_disabled
too to avoid this issue.
Fixes: 6361d72b04d1 (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan)
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The size for the Microcode Patch Block (MPB) for an AMD family 17h
processor is 3200 bytes. Add a #define for fam17h so that it does
not default to 2048 bytes and fail a microcode load/update.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The latest AMD AMD64 Architecture Programmer's Manual
adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]).
If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES
/ FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers,
thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs.
Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Things like this will probably keep showing up for other architectures
and other special cases.
I actually thought we already used %lx for this, and that is indeed
_historically_ the case, but we moved to %p when merging the 32-bit and
64-bit cases as a convenient way to get the formatting right (ie
automatically picking "%08lx" vs "%016lx" based on register size).
So just turn this %p into %px.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The refcount_t protection on x86 was not intended to use the stricter
GPL export. This adjusts the linkage again to avoid a regression in
the availability of the refcount API.
Reported-by: Dave Airlie <airlied@gmail.com>
Fixes: 7a46ec0e2f48 ("locking/refcounts, x86/asm: Implement fast refcount overflow protection")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When the device descriptor is closed, the `substream->runtime` pointer
is freed. But another thread may be in the ioctl handler, case
SNDRV_CTL_IOCTL_PCM_INFO. This case calls snd_pcm_info_user() which
calls snd_pcm_info() which accesses the now freed `substream->runtime`.
Note: this fixes CVE-2017-0861
Signed-off-by: Robb Glasser <rglasser@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Default value of pcc_subspace_idx is -1.
Make sure to check pcc_subspace_idx before using the same as array index.
This will avoid following KASAN warnings too.
[ 15.113449] ==================================================================
[ 15.116983] BUG: KASAN: global-out-of-bounds in cppc_get_perf_caps+0xf3/0x3b0
[ 15.116983] Read of size 8 at addr ffffffffb9a5c0d8 by task swapper/0/1
[ 15.116983] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2+ #2
[ 15.116983] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.2.8 01/26/2016
[ 15.116983] Call Trace:
[ 15.116983] dump_stack+0x7c/0xbb
[ 15.116983] print_address_description+0x1df/0x290
[ 15.116983] kasan_report+0x28a/0x370
[ 15.116983] ? cppc_get_perf_caps+0xf3/0x3b0
[ 15.116983] cppc_get_perf_caps+0xf3/0x3b0
[ 15.116983] ? cpc_read+0x210/0x210
[ 15.116983] ? __rdmsr_on_cpu+0x90/0x90
[ 15.116983] ? rdmsrl_on_cpu+0xa9/0xe0
[ 15.116983] ? rdmsr_on_cpu+0x100/0x100
[ 15.116983] ? wrmsrl_on_cpu+0x9c/0xd0
[ 15.116983] ? wrmsrl_on_cpu+0x9c/0xd0
[ 15.116983] ? wrmsr_on_cpu+0xe0/0xe0
[ 15.116983] __intel_pstate_cpu_init.part.16+0x3a2/0x530
[ 15.116983] ? intel_pstate_init_cpu+0x197/0x390
[ 15.116983] ? show_no_turbo+0xe0/0xe0
[ 15.116983] ? __lockdep_init_map+0xa0/0x290
[ 15.116983] intel_pstate_cpu_init+0x30/0x60
[ 15.116983] cpufreq_online+0x155/0xac0
[ 15.116983] cpufreq_add_dev+0x9b/0xb0
[ 15.116983] subsys_interface_register+0x1ae/0x290
[ 15.116983] ? bus_unregister_notifier+0x40/0x40
[ 15.116983] ? mark_held_locks+0x83/0xb0
[ 15.116983] ? _raw_write_unlock_irqrestore+0x32/0x60
[ 15.116983] ? intel_pstate_setup+0xc/0x104
[ 15.116983] ? intel_pstate_setup+0xc/0x104
[ 15.116983] ? cpufreq_register_driver+0x1ce/0x2b0
[ 15.116983] cpufreq_register_driver+0x1ce/0x2b0
[ 15.116983] ? intel_pstate_setup+0x104/0x104
[ 15.116983] intel_pstate_register_driver+0x3a/0xa0
[ 15.116983] intel_pstate_init+0x3c4/0x434
[ 15.116983] ? intel_pstate_setup+0x104/0x104
[ 15.116983] ? intel_pstate_setup+0x104/0x104
[ 15.116983] do_one_initcall+0x9c/0x206
[ 15.116983] ? parameq+0xa0/0xa0
[ 15.116983] ? initcall_blacklisted+0x150/0x150
[ 15.116983] ? lock_downgrade+0x2c0/0x2c0
[ 15.116983] kernel_init_freeable+0x327/0x3f0
[ 15.116983] ? start_kernel+0x612/0x612
[ 15.116983] ? _raw_spin_unlock_irq+0x29/0x40
[ 15.116983] ? finish_task_switch+0xdd/0x320
[ 15.116983] ? finish_task_switch+0x8e/0x320
[ 15.116983] ? rest_init+0xd0/0xd0
[ 15.116983] kernel_init+0xf/0x11a
[ 15.116983] ? rest_init+0xd0/0xd0
[ 15.116983] ret_from_fork+0x24/0x30
[ 15.116983] The buggy address belongs to the variable:
[ 15.116983] __key.36299+0x38/0x40
[ 15.116983] Memory state around the buggy address:
[ 15.116983] ffffffffb9a5bf80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 fa fa fa
[ 15.116983] ffffffffb9a5c000: fa fa fa fa 00 fa fa fa fa fa fa fa 00 fa fa fa
[ 15.116983] >ffffffffb9a5c080: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
[ 15.116983] ^
[ 15.116983] ffffffffb9a5c100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 15.116983] ffffffffb9a5c180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 15.116983] ==================================================================
Fixes: 85b1407bf6d2 (ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs)
Reported-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: George Cherian <george.cherian@cavium.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"A bunch of fixes for aacraid, a set of coherency fixes that only
affect non-coherent platforms and one coccinelle detected null check
after use"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: libsas: align sata_device's rps_resp on a cacheline
scsi: use dma_get_cache_alignment() as minimum DMA alignment
scsi: dma-mapping: always provide dma_get_cache_alignment
scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg
scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path
scsi: aacraid: Perform initialization reset only once
scsi: aacraid: Check for PCI state of device in a generic way
|
|
Pull rdma fixes from Jason Gunthorpe:
"Here is the first rc pull request for RDMA. This includes an important
core fix for a regression in iWarp if SELinux is enabled, a fix for a
compilation regression introduced in this merge window, and one
obscure kconfig combination that oops's the kernel.
For drivers, we have hns fixes needed to make their devices work on
certain ARM IOMMU configurations, a stack data leak for hfi1, and
various testing discovered -rc bug fixes for i40iw.
This cycle we pushed back on the driver maintainers to have better
commit messages for -rc material"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/core: Only enforce security for InfiniBand
RDMA/hns: Get rid of page operation after dma_alloc_coherent
RDMA/hns: Get rid of virt_to_page and vmap calls after dma_alloc_coherent
RDMA/hns: Fix the issue of IOVA not page continuous in hip08
IB/core: Init subsys if compiled to vmlinuz-core
RDMA/cma: Make sure that PSN is not over max allowed
i40iw: Notify user of established connection after QP in RTS
i40iw: Move MPA request event for loopback after connect
i40iw: Correct ARP index mask
i40iw: Do not free sqbuf when event is I40IW_TIMER_TYPE_CLOSE
i40iw: Allocate a sdbuf per CQP WQE
IB: INFINIBAND should depend on HAS_DMA
IB/hfi1: Initialize bth1 in 16B rc ack builder
|