Age | Commit message (Collapse) | Author |
|
Since we would be introducing a new user of the DTL buffer in a
subsequent patch, we need a way to gatekeep use of the DTL buffer.
The current debugfs interface for DTL allows registering and opening
cpu-specific DTL buffers. Cpu specific files are exposed under
debugfs 'powerpc/dtl/' node, and changing 'dtl_event_mask' in the same
directory enables controlling the event mask used when registering DTL
buffer for a particular cpu.
Subsequently, we will be introducing a user of the DTL buffers that
registers access to the DTL buffers across all cpus with the same event
mask. To ensure these two users do not step on each other, we introduce
a rwlock to gatekeep DTL buffer access. This fits the requirement of the
current debugfs interface wanting to allow multiple independent
cpu-specific users (read lock), and the subsequent user wanting
exclusive access (write lock).
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Introduce new helpers for DTL buffer allocation and registration and
have the existing code use those.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Don't split error messages across lines, for grepability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is enabled, we always initialize
DTL enable mask to DTL_LOG_PREEMPT (0x2). There are no other places
where the mask is changed. As such, when reading the DTL log buffer
through debugfs, there is no need to save and restore the previous mask
value.
We don't need to save and restore the earlier mask value if
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not enabled. So, remove the field
from the structure as well.
Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Introduce macros to encode the DTL enable mask fields and use those
instead of hardcoding numbers.
Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Do not issue CLP_SET_ENABLE_MIO after opting out of MIO instruction
usage. This should not fix a bug but reduce overhead within firmware.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Unfortunately we have to handle a class of devices that don't support the
new MIO instructions. Adjust resource assignment and mapping accordingly.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
|
|
Enable CONFIG_IPV6 in ppc64_defconfig to enable
certain network functionalities required for tests.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The RISC-V free_initrd_mem is identical to the default one, except
that it doesn't poison the freed memory. Remove it so that the
default implementations gets used instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
Reading the count register clears the interrupt signal. Currently, the
count registers are read into 'regval' variable but the variable is
never used. Therefore remove it. V2 of this patch add comments to
justify the readl calls without checking the return value.
Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
perf/core has an earlier version of the x86/cpu tree merged, to avoid
conflicts, and due to this we want to pick up this ABI impacting
revert as well:
049331f277fe: ("x86/fsgsbase: Revert FSGSBASE support")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In spufs_cntl_fops, since we use nonseekable_open() to open, we
should use no_llseek() to seek, not generic_file_llseek().
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When CONFIG_UML_TIME_TRAVEL_SUPPORT isn't set, the build was broken.
Fix this.
Fixes: 065038706f77 ("um: Support time travel mode")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
latencytop adds almost 4kB to each and every task struct and as such
it doesn't deserve to be in our defconfigs.
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Formatting of Kconfig files doesn't look so pretty, so let the
Great White Handkerchief come around and clean it up.
Also convert "---help---" as requested.
Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Remove the CONFIG_UEVENT_HELPER_PATH because:
1. It is disabled since commit 1be01d4a5714 ("driver: base: Disable
CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
made default to 'n',
2. It is not recommended (help message: "This should not be used today
[...] creates a high system load") and was kept only for ancient
userland,
3. Certain userland specifically requests it to be disabled (systemd
README: "Legacy hotplug slows down the system and confuses udev").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
|
|
Strangely enough, NIOS2 defines TRACE_IRQFLAGS_SUPPORT twice
with different values, which is pointless and confusing.
[1] arch/nios2/Kconfig
config TRACE_IRQFLAGS_SUPPORT
def_bool n
[2] arch/nios2/Kconfig.debug
config TRACE_IRQFLAGS_SUPPORT
def_bool y
[1] is included before [2]. In the Kconfig syntax, the first one
is effective. So, TRACE_IRQFLAGS_SUPPORT is always 'n'.
The second define in arch/nios2/Kconfig.debug is dead code.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This includes three fixes:
- Fix a deadlock from a previous fix to keep module loading and
function tracing text modifications from stepping on each other
(this has a few patches to help document the issue in comments)
- Fix a crash when the snapshot buffer gets out of sync with the main
ring buffer
- Fix a memory leak when reading the memory logs"
* tag 'trace-v5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/x86: Anotate text_mutex split between ftrace_arch_code_modify_post_process() and ftrace_arch_code_modify_prepare()
tracing/snapshot: Resize spare buffer if size changed
tracing: Fix memory leak in tracing_err_log_open()
ftrace/x86: Add a comment to why we take text_mutex in ftrace_arch_code_modify_prepare()
ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()
|
|
While mips might architecturally have the uncached segment all the time,
the infrastructure to use it is only need on platforms where DMA is
at least partially incoherent. Only select it for those configuration
to fix a build failure as the arch_dma_prep_coherent symbol is also only
provided for non-coherent platforms.
Fixes: 2e96e04d25ca ("MIPS: use the generic uncached segment support in dma-direct")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
|
|
This patch implements both 4MB huge page support for 32bit kernel
and 2MB/1GB huge pages support for 64bit kernel.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
ARCH_WANT_HUGE_PMD_SHARE config was declared in both architectures:
move this declaration in arch/Kconfig and make those architectures
select it.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
If an OpenCAPI context is to be used directly by a kernel driver, there
may not be a suitable mm to use.
The patch makes the mm parameter to ocxl_context_attach optional.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/r/20190620041203.12274-1-alastair@au1.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2019-07-03
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix the interpreter to properly handle BPF_ALU32 | BPF_ARSH
on BE architectures, from Jiong.
2) Fix several bugs in the x32 BPF JIT for handling shifts by 0,
from Luke and Xi.
3) Fix NULL pointer deref in btf_type_is_resolve_source_only(),
from Stanislav.
4) Properly handle the check that forwarding is enabled on the device
in bpf_ipv6_fib_lookup() helper code, from Anton.
5) Fix UAPI bpf_prog_info fields alignment for archs that have 16 bit
alignment such as m68k, from Baruch.
6) Fix kernel hanging in unregister_netdevice loop while unregistering
device bound to XDP socket, from Ilya.
7) Properly terminate tail update in xskq_produce_flush_desc(), from Nathan.
8) Fix broken always_inline handling in test_lwt_seg6local, from Jiri.
9) Fix bpftool to use correct argument in cgroup errors, from Jakub.
10) Fix detaching dummy prog in XDP redirect sample code, from Prashant.
11) Add Jonathan to AF_XDP reviewers, from Björn.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The FSGSBASE series turned out to have serious bugs and there is still an
open issue which is not fully understood yet.
The confidence in those changes has become close to zero especially as the
test cases which have been shipped with that series were obviously never
run before sending the final series out to LKML.
./fsgsbase_64 >/dev/null
Segmentation fault
As the merge window is close, the only sane decision is to revert FSGSBASE
support. The revert is necessary as this branch has been merged into
perf/core already and rebasing all of that a few days before the merge
window is not the most brilliant idea.
I could definitely slap myself for not noticing the test case fail when
merging that series, but TBH my expectations weren't that low back
then. Won't happen again.
Revert the following commits:
539bca535dec ("x86/entry/64: Fix and clean up paranoid_exit")
2c7b5ac5d5a9 ("Documentation/x86/64: Add documentation for GS/FS addressing mode")
f987c955c745 ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2")
2032f1f96ee0 ("x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit")
5bf0cab60ee2 ("x86/entry/64: Document GSBASE handling in the paranoid path")
708078f65721 ("x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit")
79e1932fa3ce ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
1d07316b1363 ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
f60a83df4593 ("x86/process/64: Use FSGSBASE instructions on thread copy and ptrace")
1ab5f3f7fe3d ("x86/process/64: Use FSBSBASE in switch_to() if available")
a86b4625138d ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions")
8b71340d702e ("x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions")
b64ed19b93c3 ("x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
|
|
The trailing newlines will lead to extra newlines in the trace file
which looks like the following output, so remove it.
qemu-system-x86-15695 [002] ...1 15774.839240: kvm_hv_timer_state: vcpu_id 0 hv_timer 1
qemu-system-x86-15695 [002] ...1 15774.839309: kvm_hv_timer_state: vcpu_id 0 hv_timer 1
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Allow testing code for old processors that lack the next RIP save
feature, by disabling usage of the next_rip field.
Nested hypervisors however get the feature unconditionally.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This implements 5-way interleaving for ECB, CBC decryption and CTR,
resulting in a speedup of ~11% on Marvell ThunderX2, which has a
very deep pipeline and therefore a high issue latency for NEON
instructions operating on the same registers.
Note that XTS is left alone: implementing 5-way interleave there
would either involve spilling of the calculated tweaks to the
stack, or recalculating them after the encryption operation, and
doing either of those would most likely penalize low end cores.
For ECB, this is not a concern at all, given that we have plenty
of spare registers. For CTR and CBC decryption, we take advantage
of the fact that v16 is not used by the CE version of the code
(which is the only one targeted by the optimization), and so we
can reshuffle the code a bit and avoid having to spill to memory
(with the exception of one extra reload in the CBC routine)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
In preparation of tweaking the accelerated AES chaining mode routines
to be able to use a 5-way stride, implement the core routines to
support processing 5 blocks of input at a time. While at it, drop
the 2 way versions, which have been unused for a while now.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Generic kprobe_page_fault() calls into kprobe_fault_handler() which must be
available with and without CONFIG_KPROBES. There is one stub implementation
for !CONFIG_KPROBES. For CONFIG_KPROBES all subscribing archs must provide
a kprobe_fault_handler() definition. Currently mips has an implementation
which is defined as 'static inline'. Make it available for generic kprobes
to comply with the above new requirement.
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-mm@kvack.org
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 773734b44557 ("mm, kprobes: generalize and rename notify_page_fault() as kprobe_page_fault()")
Cc: linux-kernel@vger.kernel.org
|
|
With CONFIG_OPTIMIZE_INLINING enabled, Laura Abbott reported error
with gcc 9.1.1:
arch/powerpc/mm/book3s64/radix_tlb.c: In function '_tlbiel_pid':
arch/powerpc/mm/book3s64/radix_tlb.c:104:2: warning: asm operand 3 probably doesn't match constraints
104 | asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
| ^~~
arch/powerpc/mm/book3s64/radix_tlb.c:104:2: error: impossible constraint in 'asm'
Fixing _tlbiel_pid() is enough to address the warning above, but I
inlined more functions to fix all potential issues.
To meet the "i" (immediate) constraint for the asm operands, functions
propagating "ric" must be always inlined.
Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
scatterlist support}' into for-arm-soc
|
|
Convert sa1100 to use the common clock framework.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
The current x32 BPF JIT does not correctly compile shift operations when
the immediate shift amount is 0. The expected behavior is for this to
be a no-op.
The following program demonstrates the bug. The expexceted result is 1,
but the current JITed code returns 2.
r0 = 1
r1 = 1
r1 <<= 0
if r1 == 1 goto end
r0 = 2
end:
exit
This patch simplifies the code and fixes the bug.
Fixes: 03f5781be2c7 ("bpf, x86_32: add eBPF JIT compiler for ia32")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
The current x32 BPF JIT for shift operations is not correct when the
shift amount in a register is 0. The expected behavior is a no-op, whereas
the current implementation changes bits in the destination register.
The following example demonstrates the bug. The expected result of this
program is 1, but the current JITed code returns 2.
r0 = 1
r1 = 1
r2 = 0
r1 <<= r2
if r1 == 1 goto end
r0 = 2
end:
exit
The bug is caused by an incorrect assumption by the JIT that a shift by
32 clear the register. On x32 however, shifts use the lower 5 bits of
the source, making a shift by 32 equivalent to a shift by 0.
This patch fixes the bug using double-precision shifts, which also
simplifies the code.
Fixes: 03f5781be2c7 ("bpf, x86_32: add eBPF JIT compiler for ia32")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Continue consolidating Hyper-V clock and timer code into an ISA
independent Hyper-V clocksource driver.
Move the existing clocksource code under drivers/hv and arch/x86 to the new
clocksource driver while separating out the ISA dependencies. Update
Hyper-V initialization to call initialization and cleanup routines since
the Hyper-V synthetic clock is not independently enumerated in ACPI.
Update Hyper-V clocksource users in KVM and VDSO to get definitions from
the new include file.
No behavior is changed and no new functionality is added.
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "bp@alien8.de" <bp@alien8.de>
Cc: "will.deacon@arm.com" <will.deacon@arm.com>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>
Cc: "mark.rutland@arm.com" <mark.rutland@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Cc: "olaf@aepfle.de" <olaf@aepfle.de>
Cc: "apw@canonical.com" <apw@canonical.com>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>
Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>
Cc: Sunil Muthuswamy <sunilmut@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>
Cc: "sashal@kernel.org" <sashal@kernel.org>
Cc: "vincenzo.frascino@arm.com" <vincenzo.frascino@arm.com>
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>
Cc: "linux-kselftest@vger.kernel.org" <linux-kselftest@vger.kernel.org>
Cc: "arnd@arndb.de" <arnd@arndb.de>
Cc: "linux@armlinux.org.uk" <linux@armlinux.org.uk>
Cc: "ralf@linux-mips.org" <ralf@linux-mips.org>
Cc: "paul.burton@mips.com" <paul.burton@mips.com>
Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Cc: "salyzyn@android.com" <salyzyn@android.com>
Cc: "pcc@google.com" <pcc@google.com>
Cc: "shuah@kernel.org" <shuah@kernel.org>
Cc: "0x7f454c46@gmail.com" <0x7f454c46@gmail.com>
Cc: "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>
Cc: "huw@codeweavers.com" <huw@codeweavers.com>
Cc: "sfr@canb.auug.org.au" <sfr@canb.auug.org.au>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>
Cc: "rkrcmar@redhat.com" <rkrcmar@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Link: https://lkml.kernel.org/r/1561955054-1838-3-git-send-email-mikelley@microsoft.com
|
|
Hyper-V clock/timer code and data structures are currently mixed
in with other code in the ISA independent drivers/hv directory as
well as the ISA dependent Hyper-V code under arch/x86.
Consolidate this code and data structures into a Hyper-V clocksource driver
to better follow the Linux model. In doing so, separate out the ISA
dependent portions so the new clocksource driver works for x86 and for the
in-process Hyper-V on ARM64 code.
To start, move the existing clockevents code to create the new clocksource
driver. Update the VMbus driver to call initialization and cleanup routines
since the Hyper-V synthetic timers are not independently enumerated in
ACPI.
No behavior is changed and no new functionality is added.
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "bp@alien8.de" <bp@alien8.de>
Cc: "will.deacon@arm.com" <will.deacon@arm.com>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>
Cc: "mark.rutland@arm.com" <mark.rutland@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Cc: "olaf@aepfle.de" <olaf@aepfle.de>
Cc: "apw@canonical.com" <apw@canonical.com>
Cc: "jasowang@redhat.com" <jasowang@redhat.com>
Cc: "marcelo.cerri@canonical.com" <marcelo.cerri@canonical.com>
Cc: Sunil Muthuswamy <sunilmut@microsoft.com>
Cc: KY Srinivasan <kys@microsoft.com>
Cc: "sashal@kernel.org" <sashal@kernel.org>
Cc: "vincenzo.frascino@arm.com" <vincenzo.frascino@arm.com>
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "linux-mips@vger.kernel.org" <linux-mips@vger.kernel.org>
Cc: "linux-kselftest@vger.kernel.org" <linux-kselftest@vger.kernel.org>
Cc: "arnd@arndb.de" <arnd@arndb.de>
Cc: "linux@armlinux.org.uk" <linux@armlinux.org.uk>
Cc: "ralf@linux-mips.org" <ralf@linux-mips.org>
Cc: "paul.burton@mips.com" <paul.burton@mips.com>
Cc: "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Cc: "salyzyn@android.com" <salyzyn@android.com>
Cc: "pcc@google.com" <pcc@google.com>
Cc: "shuah@kernel.org" <shuah@kernel.org>
Cc: "0x7f454c46@gmail.com" <0x7f454c46@gmail.com>
Cc: "linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>
Cc: "huw@codeweavers.com" <huw@codeweavers.com>
Cc: "sfr@canb.auug.org.au" <sfr@canb.auug.org.au>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>
Cc: "rkrcmar@redhat.com" <rkrcmar@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>
Link: https://lkml.kernel.org/r/1561955054-1838-2-git-send-email-mikelley@microsoft.com
|
|
so the hyper-v clocksource update can be applied.
|
|
Quite some time ago the interrupt entry stubs for unused vectors in the
system vector range got removed and directly mapped to the spurious
interrupt vector entry point.
Sounds reasonable, but it's subtly broken. The spurious interrupt vector
entry point pushes vector number 0xFF on the stack which makes the whole
logic in __smp_spurious_interrupt() pointless.
As a consequence any spurious interrupt which comes from a vector != 0xFF
is treated as a real spurious interrupt (vector 0xFF) and not
acknowledged. That subsequently stalls all interrupt vectors of equal and
lower priority, which brings the system to a grinding halt.
This can happen because even on 64-bit the system vector space is not
guaranteed to be fully populated. A full compile time handling of the
unused vectors is not possible because quite some of them are conditonally
populated at runtime.
Bring the entry stubs back, which wastes 160 bytes if all stubs are unused,
but gains the proper handling back. There is no point to selectively spare
some of the stubs which are known at compile time as the required code in
the IDT management would be way larger and convoluted.
Do not route the spurious entries through common_interrupt and do_IRQ() as
the original code did. Route it to smp_spurious_interrupt() which evaluates
the vector number and acts accordingly now that the real vector numbers are
handed in.
Fixup the pr_warn so the actual spurious vector (0xff) is clearly
distiguished from the other vectors and also note for the vectored case
whether it was pending in the ISR or not.
"Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen."
"Spurious interrupt vector 0xed on CPU#1. Acked."
"Spurious interrupt vector 0xee on CPU#1. Not pending!."
Fixes: 2414e021ac8d ("x86: Avoid building unused IRQ entry stubs")
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jan Beulich <jbeulich@suse.com>
Link: https://lkml.kernel.org/r/20190628111440.550568228@linutronix.de
|
|
Since the rework of the vector management, warnings about spurious
interrupts have been reported. Robert provided some more information and
did an initial analysis. The following situation leads to these warnings:
CPU 0 CPU 1 IO_APIC
interrupt is raised
sent to CPU1
Unable to handle
immediately
(interrupts off,
deep idle delay)
mask()
...
free()
shutdown()
synchronize_irq()
clear_vector()
do_IRQ()
-> vector is clear
Before the rework the vector entries of legacy interrupts were statically
assigned and occupied precious vector space while most of them were
unused. Due to that the above situation was handled silently because the
vector was handled and the core handler of the assigned interrupt
descriptor noticed that it is shut down and returned.
While this has been usually observed with legacy interrupts, this situation
is not limited to them. Any other interrupt source, e.g. MSI, can cause the
same issue.
After adding proper synchronization for level triggered interrupts, this
can only happen for edge triggered interrupts where the IO-APIC obviously
cannot provide information about interrupts in flight.
While the spurious warning is actually harmless in this case it worries
users and driver developers.
Handle it gracefully by marking the vector entry as VECTOR_SHUTDOWN instead
of VECTOR_UNUSED when the vector is freed up.
If that above late handling happens the spurious detector will not complain
and switch the entry to VECTOR_UNUSED. Any subsequent spurious interrupt on
that line will trigger the spurious warning as before.
Fixes: 464d12309e1b ("x86/vector: Switch IOAPIC to global reservation mode")
Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>-
Tested-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.459647741@linutronix.de
|
|
When an interrupt is shut down in free_irq() there might be an inflight
interrupt pending in the IO-APIC remote IRR which is not yet serviced. That
means the interrupt has been sent to the target CPUs local APIC, but the
target CPU is in a state which delays the servicing.
So free_irq() would proceed to free resources and to clear the vector
because synchronize_hardirq() does not see an interrupt handler in
progress.
That can trigger a spurious interrupt warning, which is harmless and just
confuses users, but it also can leave the remote IRR in a stale state
because once the handler is invoked the interrupt resources might be freed
already and therefore acknowledgement is not possible anymore.
Implement the irq_get_irqchip_state() callback for the IO-APIC irq chip. The
callback is invoked from free_irq() via __synchronize_hardirq(). Check the
remote IRR bit of the interrupt and return 'in flight' if it is set and the
interrupt is configured in level mode. For edge mode the remote IRR has no
meaning.
As this is only meaningful for level triggered interrupts this won't cure
the potential spurious interrupt warning for edge triggered interrupts, but
the edge trigger case does not result in stale hardware state. This has to
be addressed at the vector/interrupt entry level seperately.
Fixes: 464d12309e1b ("x86/vector: Switch IOAPIC to global reservation mode")
Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.370295517@linutronix.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Fix a build failure with the LLVM linker and a module allocation
failure when KASLR is active:
- Fix module allocation when running with KASLR enabled
- Fix broken build due to bug in LLVM linker (ld.lld)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/efi: Mark __efistub_stext_offset as an absolute symbol explicitly
arm64: kaslr: keep modules inside module region when KASAN is enabled
|
|
This patch corrects the SPDX License Identifier style
in the powerpc Hardware Architecture related files.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
If the previous comment made sense, continue debugging or call your
doctor immediately.
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Remove the CONFIG_UEVENT_HELPER_PATH because:
1. It is disabled since commit 1be01d4a5714 ("driver: base: Disable
CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
made default to 'n',
2. It is not recommended (help message: "This should not be used today
[...] creates a high system load") and was kept only for ancient
userland,
3. Certain userland specifically requests it to be disabled (systemd
README: "Legacy hotplug slows down the system and confuses udev").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When testing out gpio-keys with a button, a spurious
interrupt (and therefore a key press or release event)
gets triggered as soon as the driver enables the irq
line for the first time.
This patch clears any potential bogus generated interrupt
that was caused by the switching of the associated irq's
type and polarity.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Commit ddf35cf3764b ("powerpc: Use barrier_nospec in copy_from_user()")
Added barrier_nospec before loading from user-controlled pointers. The
intention was to order the load from the potentially user-controlled
pointer vs a previous branch based on an access_ok() check or similar.
In order to achieve the same result, add a barrier_nospec to the
raw_copy_in_user() function before loading from such a user-controlled
pointer.
Fixes: ddf35cf3764b ("powerpc: Use barrier_nospec in copy_from_user()")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The
code currently sets:
CR0 <- 00 || MSR[TS]
but according to the ISA it should be:
CR0 <- 0 || MSR[TS] || 0
This fixes the bit shift to put the bits in the correct location.
This is a data integrity issue as CR0 is corrupted.
Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9")
Cc: stable@vger.kernel.org # v4.17+
Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The powernv platform uses @dma_iommu_ops for non-bypass DMA. These ops
need an iommu_table pointer which is stored in
dev->archdata.iommu_table_base. It is initialized during
pcibios_setup_device() which handles boot time devices. However when a
device is taken from the system in order to pass it through, the
default IOMMU table is destroyed but the pointer in a device is not
updated; also when a device is returned back to the system, a new
table pointer is not stored in dev->archdata.iommu_table_base either.
So when a just returned device tries using IOMMU, it crashes on
accessing stale iommu_table or its members.
This calls set_iommu_table_base() when the default window is created.
Note it used to be there before but was wrongly removed (see "fixes").
It did not appear before as these days most devices simply use bypass.
This adds set_iommu_table_base(NULL) when a device is taken from the
system to make it clear that IOMMU DMA cannot be used past that point.
Fixes: c4e9d3c1e65a ("powerpc/powernv/pseries: Rework device adding to IOMMU groups")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The pseries platform uses the PCI_PROBE_DEVTREE method of PCI probing
which reads "assigned-addresses" of every PCI device and initializes
the device resources. However if the property is missing or zero sized,
then there is no fallback of any kind and the PCI resources remain
undiscovered, i.e. pdev->resource[] array remains empty.
This adds a fallback which parses the "reg" property in pretty much same
way except it marks resources as "unset" which later make Linux assign
those resources proper addresses.
This has an effect when:
1. a hypervisor failed to assign any resource for a device;
2. /chosen/linux,pci-probe-only=0 is in the DT so the system may try
assigning a resource.
Neither is likely to happen under PowerVM.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
So far the pseries platforms has always been using IOMMU making
SWIOTLB unnecessary. Now we want secure guests which means devices can
only access certain areas of guest physical memory; we are going to
use SWIOTLB for this purpose.
This allows SWIOTLB for pseries. By default there is no change in
behavior.
This enables SWIOTLB when the "swiotlb" kernel parameter is set to
"force".
With the SWIOTLB enabled, the kernel creates a directly mapped DMA
window (using the usual DDW mechanism) and implements SWIOTLB on top
of that.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|