Age | Commit message (Collapse) | Author |
|
* for-next/sve:
arm64/sve: Fix warnings when SVE is disabled
arm64/sve: Add stub for sve_max_virtualisable_vl()
arm64/sve: Track vector lengths for tasks in an array
arm64/sve: Explicitly load vector length when restoring SVE state
arm64/sve: Put system wide vector length information into structs
arm64/sve: Use accessor functions for vector lengths in thread_struct
arm64/sve: Rename find_supported_vector_length()
arm64/sve: Make access to FFR optional
arm64/sve: Make sve_state_size() static
arm64/sve: Remove sve_load_from_fpsimd_state()
arm64/fp: Reindent fpsimd_save()
|
|
* for-next/mte:
kasan: Extend KASAN mode kernel parameter
arm64: mte: Add asymmetric mode support
arm64: mte: CPU feature detection for Asymm MTE
arm64: mte: Bitfield definitions for Asymm MTE
kasan: Remove duplicate of kasan_flag_async
arm64: kasan: mte: move GCR_EL1 switch to task switch when KASAN disabled
|
|
* for-next/misc:
arm64: Select POSIX_CPU_TIMERS_TASK_WORK
arm64: Document boot requirements for FEAT_SME_FA64
arm64: ftrace: use function_nocfi for _mcount as well
arm64: asm: setup.h: export common variables
arm64/traps: Avoid unnecessary kernel/user pointer conversion
|
|
* for-next/kexec:
arm64: trans_pgd: remove trans_pgd_map_page()
arm64: kexec: remove cpu-reset.h
arm64: kexec: remove the pre-kexec PoC maintenance
arm64: kexec: keep MMU enabled during kexec relocation
arm64: kexec: install a copy of the linear-map
arm64: kexec: use ld script for relocation function
arm64: kexec: relocate in EL1 mode
arm64: kexec: configure EL2 vectors for kexec
arm64: kexec: pass kimage as the only argument to relocation function
arm64: kexec: Use dcache ops macros instead of open-coding
arm64: kexec: skip relocation code for inplace kexec
arm64: kexec: flush image and lists during kexec load time
arm64: hibernate: abstract ttrb0 setup function
arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors
arm64: kernel: add helper for booted at EL2 and not VHE
|
|
* for-next/extable:
arm64: vmlinux.lds.S: remove `.fixup` section
arm64: extable: add load_unaligned_zeropad() handler
arm64: extable: add a dedicated uaccess handler
arm64: extable: add `type` and `data` fields
arm64: extable: use `ex` for `exception_table_entry`
arm64: extable: make fixup_exception() return bool
arm64: extable: consolidate definitions
arm64: gpr-num: support W registers
arm64: factor out GPR numbering helpers
arm64: kvm: use kvm_exception_table_entry
arm64: lib: __arch_copy_to_user(): fold fixups into body
arm64: lib: __arch_copy_from_user(): fold fixups into body
arm64: lib: __arch_clear_user(): fold fixups into body
|
|
In configurations where SVE is disabled we define but never reference the
functions for retrieving the default vector length, causing warnings. Fix
this by move the ifdef up, marking get_default_vl() inline since it is
referenced from code guarded by an IS_ENABLED() check, and do the same for
the other accessors for consistency.
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211022141635.2360415-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We no longer place anything into a `.fixup` section, so we no longer
need to place those sections into the `.text` section in the main kernel
Image.
Remove the use of `.fixup`.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups as
offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address is
branched to at the end of the fixup. This is potentially wasteful of
I-cache resources, and makes it hard to add additional logic to fixups
without significant bloat.
This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.
Other than backtracing, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Subsequent patches will add specialized handlers for fixups, in addition
to the simple PC fixup and BPF handlers we have today. In preparation,
this patch adds a new `type` field to struct exception_table_entry, and
uses this to distinguish the fixup and BPF cases. A `data` field is also
added so that subsequent patches can associate data specific to each
exception site (e.g. register numbers).
Handlers are named ex_handler_*() for consistency, following the exmaple
of x86. At the same time, get_ex_fixup() is split out into a helper so
that it can be used by other ex_handler_*() functions ins subsequent
patches.
This patch will increase the size of the exception tables, which will be
remedied by subsequent patches removing redundant fixup code. There
should be no functional change as a result of this patch.
Since each entry is now 12 bytes in size, we must reduce the alignment
of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4
bytes), which is the natrual alignment of the `insn` and `fixup` fields.
The current 8-byte alignment is a holdover from when the `insn` and
`fixup` fields was 8 bytes, and while not harmful has not been necessary
since commit:
6c94f27ac847ff8e ("arm64: switch to relative exception tables")
Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes.
Concurrently with this patch, x86's exception table entry format is
being updated (similarly to a 12-byte format, with 32-bytes of absolute
data). Once both have been merged it should be possible to unify the
sorttable logic for the two.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As for SVE we will track a per task SME vector length for tasks. Convert
the existing storage for the vector length into an array and update
fpsimd_flush_task() to initialise this in a function.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-10-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently when restoring the SVE state we supply the SVE vector length
as an argument to sve_load_state() and the underlying macros. This becomes
inconvenient with the addition of SME since we may need to restore any
combination of SVE and SME vector lengths, and we already separately
restore the vector length in the KVM code. We don't need to know the vector
length during the actual register load since the SME load instructions can
index into the data array for us.
Refactor the interface so we explicitly set the vector length separately
to restoring the SVE registers in preparation for adding SME support, no
functional change should be involved.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With the introduction of SME we will have a second vector length in the
system, enumerated and configured in a very similar fashion to the
existing SVE vector length. While there are a few differences in how
things are handled this is a relatively small portion of the overall
code so in order to avoid code duplication we factor out
We create two structs, one vl_info for the static hardware properties
and one vl_config for the runtime configuration, with an array
instantiated for each and update all the users to reference these. Some
accessor functions are provided where helpful for readability, and the
write to set the vector length is put into a function since the system
register being updated needs to be chosen at compile time.
This is a mostly mechanical replacement, further work will be required
to actually make things generic, ensuring that we handle those places
where there are differences properly.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In a system with SME there are parallel vector length controls for SVE and
SME vectors which function in much the same way so it is desirable to
share the code for handling them as much as possible. In order to prepare
for doing this add a layer of accessor functions for the various VL related
operations on tasks.
Since almost all current interactions are actually via task->thread rather
than directly with the thread_info the accessors use that. Accessors are
provided for both generic and SVE specific usage, the generic accessors
should be used for cases where register state is being manipulated since
the registers are shared between streaming and regular SVE so we know that
when SME support is implemented we will always have to be in the appropriate
mode already and hence can generalise now.
Since we are using task_struct and we don't want to cause widespread
inclusion of sched.h the acessors are all out of line, it is hoped that
none of the uses are in a sufficiently critical path for this to be an
issue. Those that are most likely to present an issue are in the same
translation unit so hopefully the compiler may be able to inline anyway.
This is purely adding the layer of abstraction, additional work will be
needed to support tasks using SME.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The function has SVE specific checks in it and it will be more trouble
to add conditional code for SME than it is to simply rename it to be SVE
specific.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
SME introduces streaming SVE mode in which FFR is not present and the
instructions for accessing it UNDEF. In preparation for handling this
update the low level SVE state access functions to take a flag specifying
if FFR should be handled. When saving the register state we store a zero
for FFR to guard against uninitialized data being read. No behaviour change
should be introduced by this patch.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There are no users outside fpsimd.c so make sve_state_size() static.
KVM open codes an equivalent.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Following optimisations of the SVE register handling we no longer load the
SVE state from a saved copy of the FPSIMD registers, we convert directly
in registers or from one saved state to another. Remove the function so we
don't need to update it during further refactoring.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently all the active code in fpsimd_save() is inside a check for
TIF_FOREIGN_FPSTATE. Reduce the indentation level by changing to return
from the function if TIF_FOREIGN_FPSTATE is set.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since userspace can make use of the CNTVSS_EL0 instruction, expose
it via a HWCAP.
Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-18-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since CNTVCTSS obey the same control bits as CNTVCT, add the necessary
decoding to the hook table. Note that there is no known user of
this at the moment.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-17-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a new capability to detect the Enhanced Counter Virtualization
feature (FEAT_ECV).
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-15-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
MTE provides an asymmetric mode for detecting tag exceptions. In
particular, when such a mode is present, the CPU triggers a fault
on a tag mismatch during a load operation and asynchronously updates
a register when a tag mismatch is detected during a store operation.
Add support for MTE asymmetric mode.
Note: If the CPU does not support MTE asymmetric mode the kernel falls
back on synchronous mode which is the default for kasan=on.
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20211006154751.4463-5-vincenzo.frascino@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the cpufeature entries to detect the presence of Asymmetric MTE.
Note: The tag checking mode is initialized via cpu_enable_mte() ->
kasan_init_hw_tags() hence to enable it we require asymmetric mode
to be at least on the boot CPU. If the boot CPU does not have it, it is
fine for late CPUs to have it as long as the feature is not enabled
(ARM64_CPUCAP_BOOT_CPU_FEATURE).
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20211006154751.4463-4-vincenzo.frascino@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This header contains only cpu_soft_restart() which is never used directly
anymore. So, remove this header, and rename the helper to be
cpu_soft_restart().
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-15-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now that kexec does its relocations with the MMU enabled, we no longer
need to clean the relocation data to the PoC.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-14-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now, that we have linear map page tables configured, keep MMU enabled
to allow faster relocation of segments to final destination.
Cavium ThunderX2:
Kernel Image size: 38M Iniramfs size: 46M Total relocation size: 84M
MMU-disabled:
relocation 7.489539915s
MMU-enabled:
relocation 0.03946095s
Broadcom Stingray:
The performance data: for a moderate size kernel + initramfs: 25M the
relocation was taking 0.382s, with enabled MMU it now takes
0.019s only or x20 improvement.
The time is proportional to the size of relocation, therefore if initramfs
is larger, 100M it could take over a second.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-13-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
To perform the kexec relocation with the MMU enabled, we need a copy
of the linear map.
Create one, and install it from the relocation code. This has to be done
from the assembly code as it will be idmapped with TTBR0. The kernel
runs in TTRB1, so can't use the break-before-make sequence on the mapping
it is executing from.
The makes no difference yet as the relocation code runs with the MMU
disabled.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-12-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently, relocation code declares start and end variables
which are used to compute its size.
The better way to do this is to use ld script, and put relocation
function in its own section.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-11-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since we are going to keep MMU enabled during relocation, we need to
keep EL1 mode throughout the relocation.
Keep EL1 enabled, and switch EL2 only before entering the new world.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-10-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If we have a EL2 mode without VHE, the EL2 vectors are needed in order
to switch to EL2 and jump to new world with hypervisor privileges.
In preparation to MMU enabled relocation, configure our EL2 table now.
Kexec uses #HVC_SOFT_RESTART to branch to the new world, so extend
el1_sync vector that is provided by trans_pgd_copy_el2_vectors() to
support this case.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-9-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
the following arguments:
head: start of array that contains relocation information.
entry: entry point for new kernel or purgatory.
dtb_mem: first and only argument to entry.
The number of arguments cannot be easily expended, because this
function is also called from HVC_SOFT_RESTART, which preserves only
three arguments. And, also arm64_relocate_new_kernel is written in
assembly but called without stack, thus no place to move extra arguments
to free registers.
Soon, we will need to pass more arguments: once we enable MMU we
will need to pass information about page tables.
Pass kimage to arm64_relocate_new_kernel, and teach it to get the
required fields from kimage.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-8-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
kexec does dcache maintenance when it re-writes all memory. Our
dcache_by_line_op macro depends on reading the sanitized DminLine
from memory. Kexec may have overwritten this, so open-codes the
sequence.
dcache_by_line_op is a whole set of macros, it uses dcache_line_size
which uses read_ctr for the sanitsed DminLine. Reading the DminLine
is the first thing the dcache_by_line_op does.
Rename dcache_by_line_op dcache_by_myline_op and take DminLine as
an argument. Kexec can now use the slightly smaller macro.
This makes up-coming changes to the dcache maintenance easier on
the eye.
Code generated by the existing callers is unchanged.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-7-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In case of kdump or when segments are already in place the relocation
is not needed, therefore the setup of relocation function and call to
it can be skipped.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Suggested-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-6-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently, during kexec load we are copying relocation function and
flushing it. However, we can also flush kexec relocation buffers and
if new kernel image is already in place (i.e. crash kernel), we can
also flush the new kernel image itself.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-5-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently, only hibernate sets custom ttbr0 with safe idmaped function.
Kexec, is also going to be using this functionality when relocation code
is going to be idmapped.
Move the setup sequence to a dedicated cpu_install_ttbr0() for custom
ttbr0.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-4-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Users of trans_pgd may also need a copy of vector table because it is
also may be overwritten if a linear map can be overwritten.
Move setup of EL2 vectors from hibernate to trans_pgd, so it can be
later shared with kexec as well.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-3-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Replace places that contain logic like this:
is_hyp_mode_available() && !is_kernel_in_hyp_mode()
With a dedicated boolean function is_hyp_nvhe(). This will be needed
later in kexec in order to sooner switch back to EL2.
Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-2-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
It is not necessary to write to GCR_EL1 on every kernel entry and
exit when HW tag-based KASAN is disabled because the kernel will not
execute any IRG instructions in that mode. Since accessing GCR_EL1
can be expensive on some microarchitectures, avoid doing so by moving
the access to task switch when HW tag-based KASAN is disabled.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://linux-review.googlesource.com/id/I78e90d60612a94c24344526f476ac4ff216e10d2
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210924010655.2886918-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Annotating a pointer from kernel to __user and then back again requires
an extra __force annotation to silent sparse warning. In call_undef_hook()
this unnecessary complexity can be avoided by modifying the intermediate
user pointer to unsigned long.
This way there is no inter-changeable use of user and kernel pointers
and the code is consistent.
Note: This patch adds no functional changes to code.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210917055811.22341-1-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a recent commit related to memory management that turned out to
be problematic (Jia He)"
* tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- It turns out that the optimised string routines merged in 5.14 are
not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
the end of a string (strcmp, strncmp). Such reading may go across a
16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
is enabled, use the generic strcmp/strncmp C implementation.
- An errata workaround for ThunderX relied on the CPU capabilities
being enabled in a specific order. This disappeared with the
automatic generation of the cpucaps.h file (sorted alphabetically).
Fix it by checking the current CPU only rather than the system-wide
capability.
- Add system_supports_mte() checks on the kernel entry/exit path and
thread switching to avoid unnecessary barriers and function calls on
systems where MTE is not supported.
- kselftests: skip arm64 tests if the required features are missing.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Restore forced disabling of KPTI on ThunderX
kselftest/arm64: signal: Skip tests if required features are missing
arm64: Mitigate MTE issues with str{n}cmp()
arm64: add MTE supported check to thread switching and syscall entry/exit
|
|
This reverts commit 437b38c51162f8b87beb28a833c4d5dc85fa864e.
The memory semantics added in commit 437b38c51162 causes SystemMemory
Operation region, whose address range is not described in the EFI memory
map to be mapped as NormalNC memory on arm64 platforms (through
acpi_os_map_memory() in acpi_ex_system_memory_space_handler()).
This triggers the following abort on an ARM64 Ampere eMAG machine,
because presumably the physical address range area backing the Opregion
does not support NormalNC memory attributes driven on the bus.
Internal error: synchronous external abort: 96000410 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462
Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[...snip...]
Call trace:
acpi_ex_system_memory_space_handler+0x26c/0x2c8
acpi_ev_address_space_dispatch+0x228/0x2c4
acpi_ex_access_region+0x114/0x268
acpi_ex_field_datum_io+0x128/0x1b8
acpi_ex_extract_from_field+0x14c/0x2ac
acpi_ex_read_data_from_field+0x190/0x1b8
acpi_ex_resolve_node_to_value+0x1ec/0x288
acpi_ex_resolve_to_value+0x250/0x274
acpi_ds_evaluate_name_path+0xac/0x124
acpi_ds_exec_end_op+0x90/0x410
acpi_ps_parse_loop+0x4ac/0x5d8
acpi_ps_parse_aml+0xe0/0x2c8
acpi_ps_execute_method+0x19c/0x1ac
acpi_ns_evaluate+0x1f8/0x26c
acpi_ns_init_one_device+0x104/0x140
acpi_ns_walk_namespace+0x158/0x1d0
acpi_ns_initialize_devices+0x194/0x218
acpi_initialize_objects+0x48/0x50
acpi_init+0xe0/0x498
If the Opregion address range is not present in the EFI memory map there
is no way for us to determine the memory attributes to use to map it -
defaulting to NormalNC does not work (and it is not correct on a memory
region that may have read side-effects) and therefore commit
437b38c51162 should be reverted, which means reverting back to the
original behavior whereby address ranges that are mapped using
acpi_os_map_memory() default to the safe devicenGnRnE attributes on
ARM64 if the mapped address range is not defined in the EFI memory map.
Fixes: 437b38c51162 ("ACPI: Add memory semantics to acpi_os_map_memory()")
Signed-off-by: Jia He <justin.he@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
is that cpucaps are now sorted, changing the enumeration order. This
assumed no dependencies between cpucaps, which turned out not to be true
in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
when unmap_kernel_at_el0() checks for it, so the kernel tries to
run w/ KPTI - and quickly falls over.
Because all ThunderX implementations have homogeneous CPUs, we can remove
this dependency by just checking the current CPU for the erratum.
Fixes: 0c6c2d3615ef ("arm64: Generate cpucaps.h")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210923145002.3394558-1-dann.frazier@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Invoke rseq_handle_notify_resume() from tracehook_notify_resume() now
that the two function are always called back-to-back by architectures
that have rseq. The rseq helper is stubbed out for architectures that
don't support rseq, i.e. this is a nop across the board.
Note, tracehook_notify_resume() is horribly named and arguably does not
belong in tracehook.h as literally every line of code in it has nothing
to do with tracing. But, that's been true since commit a42c6ded827d
("move key_repace_session_keyring() into tracehook_notify_resume()")
first usurped tracehook_notify_resume() back in 2012. Punt cleaning that
mess up to future patches.
No functional change intended.
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This lets us avoid doing unnecessary work on hardware that does not
support MTE, and will allow us to freely use MTE instructions in the
code called by mte_thread_switch().
Since this would mean that we do a redundant check in
mte_check_tfsr_el1(), remove it and add two checks now required in its
callers. This also avoids an unnecessary DSB+ISB sequence on the syscall
exit path for hardware not supporting MTE.
Fixes: 65812c6921cc ("arm64: mte: Enable async tag check fault")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I02fd000d1ef2c86c7d2952a7f099b254ec227a5d
Link: https://lore.kernel.org/r/20210915190336.398390-1-pcc@google.com
[catalin.marinas@arm.com: adjust the commit log slightly]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
__stack_chk_guard is setup once while init stage and never changed
after that.
Although the modification of this variable at runtime will usually
cause the kernel to crash (so does the attacker), it should be marked
as __ro_after_init, and it should not affect performance if it is
placed in the ro_after_init section.
Signed-off-by: Dan Li <ashimida@linux.alibaba.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1631612642-102881-1-git-send-email-ashimida@linux.alibaba.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Remove all but the first include of linux/sched.h from process.c
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210902011126.29828-1-lv.ruyi@zte.com.cn
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
When we need a buffer for SVE register state we call sve_alloc() to make
sure that one is there. In order to avoid repeated allocations and frees
we keep the buffer around unless we change vector length and just memset()
it to ensure a clean register state. The function that deals with this
takes the task to operate on as an argument, however in the case where we
do a memset() we initialise using the SVE state size for the current task
rather than the task passed as an argument.
This is only an issue in the case where we are setting the register state
for a task via ptrace and the task being configured has a different vector
length to the task tracing it. In the case where the buffer is larger in
the traced process we will leak old state from the traced process to
itself, in the case where the buffer is smaller in the traced process we
will overflow the buffer and corrupt memory.
Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Cc: <stable@vger.kernel.org> # 4.15.x
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210909165356.10675-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Ensure that all usage sites of get/put_online_cpus() except for the
struggler in drivers/thermal are gone. So the last user and the deprecated
inlines can be removed.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Convert controller drivers to generic_handle_domain_irq() (Marc
Zyngier)
- Simplify VPD (Vital Product Data) access and search (Heiner
Kallweit)
- Update bnx2, bnx2x, bnxt, cxgb4, cxlflash, sfc, tg3 drivers to use
simplified VPD interfaces (Heiner Kallweit)
- Run Max Payload Size quirks before configuring MPS; work around
ASMedia ASM1062 SATA MPS issue (Marek Behún)
Resource management:
- Refactor pci_ioremap_bar() and pci_ioremap_wc_bar() (Krzysztof
Wilczyński)
- Optimize pci_resource_len() to reduce kernel size (Zhen Lei)
PCI device hotplug:
- Fix a double unmap in ibmphp (Vishal Aslot)
PCIe port driver:
- Enable Bandwidth Notification only if port supports it (Stuart
Hayes)
Sysfs/proc/syscalls:
- Add schedule point in proc_bus_pci_read() (Krzysztof Wilczyński)
- Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure (Krzysztof
Wilczyński)
- Return "int" from pciconfig_read() syscall (Krzysztof Wilczyński)
Virtualization:
- Extend "pci=noats" to also turn on Translation Blocking to protect
against some DMA attacks (Alex Williamson)
- Add sysfs mechanism to control the type of reset used between
device assignments to VMs (Amey Narkhede)
- Add support for ACPI _RST reset method (Shanker Donthineni)
- Add ACS quirks for Cavium multi-function devices (George Cherian)
- Add ACS quirks for NXP LX2xx0 and LX2xx2 platforms (Wasim Khan)
- Allow HiSilicon AMBA devices that appear as fake PCI devices to use
PASID and SVA (Zhangfei Gao)
Endpoint framework:
- Add support for SR-IOV Endpoint devices (Kishon Vijay Abraham I)
- Zero-initialize endpoint test tool parameters so we don't use
random parameters (Shunyong Yang)
APM X-Gene PCIe controller driver:
- Remove redundant dev_err() call in xgene_msi_probe() (ErKun Yang)
Broadcom iProc PCIe controller driver:
- Don't fail devm_pci_alloc_host_bridge() on missing 'ranges' because
it's optional on BCMA devices (Rob Herring)
- Fix BCMA probe resource handling (Rob Herring)
Cadence PCIe driver:
- Work around J7200 Link training electrical issue by increasing
delays in LTSSM (Nadeem Athani)
Intel IXP4xx PCI controller driver:
- Depend on ARCH_IXP4XX to avoid useless config questions (Geert
Uytterhoeven)
Intel Keembay PCIe controller driver:
- Add Intel Keem Bay PCIe controller (Srikanth Thokala)
Marvell Aardvark PCIe controller driver:
- Work around config space completion handling issues (Evan Wang)
- Increase timeout for config access completions (Pali Rohár)
- Emulate CRS Software Visibility bit (Pali Rohár)
- Configure resources from DT 'ranges' property to fix I/O space
access (Pali Rohár)
- Serialize INTx mask/unmask (Pali Rohár)
MediaTek PCIe controller driver:
- Add MT7629 support in DT (Chuanjia Liu)
- Fix an MSI issue (Chuanjia Liu)
- Get syscon regmap ("mediatek,generic-pciecfg"), IRQ number
("pci_irq"), PCI domain ("linux,pci-domain") from DT properties if
present (Chuanjia Liu)
Microsoft Hyper-V host bridge driver:
- Add ARM64 support (Boqun Feng)
- Support "Create Interrupt v3" message (Sunil Muthuswamy)
NVIDIA Tegra PCIe controller driver:
- Use seq_puts(), move err_msg from stack to static, fix OF node leak
(Christophe JAILLET)
NVIDIA Tegra194 PCIe driver:
- Disable suspend when in Endpoint mode (Om Prakash Singh)
- Fix MSI-X address programming error (Om Prakash Singh)
- Disable interrupts during suspend to avoid spurious AER link down
(Om Prakash Singh)
Renesas R-Car PCIe controller driver:
- Work around hardware issue that prevents Link L1->L0 transition
(Marek Vasut)
- Fix runtime PM refcount leak (Dinghao Liu)
Rockchip DesignWare PCIe controller driver:
- Add Rockchip RK356X host controller driver (Simon Xue)
TI J721E PCIe driver:
- Add support for J7200 and AM64 (Kishon Vijay Abraham I)
Toshiba Visconti PCIe controller driver:
- Add Toshiba Visconti PCIe host controller driver (Nobuhiro
Iwamatsu)
Xilinx NWL PCIe controller driver:
- Enable PCIe reference clock via CCF (Hyun Kwon)
Miscellaneous:
- Convert sta2x11 from 'pci_' to 'dma_' API (Christophe JAILLET)
- Fix pci_dev_str_match_path() alloc while atomic bug (used for
kernel parameters that specify devices) (Dan Carpenter)
- Remove pointless Precision Time Management warning when PTM is
present but not enabled (Jakub Kicinski)
- Remove surplus "break" statements (Krzysztof Wilczyński)"
* tag 'pci-v5.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (132 commits)
PCI: ibmphp: Fix double unmap of io_mem
x86/PCI: sta2x11: switch from 'pci_' to 'dma_' API
PCI/VPD: Use unaligned access helpers
PCI/VPD: Clean up public VPD defines and inline functions
cxgb4: Use pci_vpd_find_id_string() to find VPD ID string
PCI/VPD: Add pci_vpd_find_id_string()
PCI/VPD: Include post-processing in pci_vpd_find_tag()
PCI/VPD: Stop exporting pci_vpd_find_info_keyword()
PCI/VPD: Stop exporting pci_vpd_find_tag()
PCI: Set dma-can-stall for HiSilicon chips
PCI: rockchip-dwc: Add Rockchip RK356X host controller driver
PCI: dwc: Remove surplus break statement after return
PCI: artpec6: Remove local code block from switch statement
PCI: artpec6: Remove surplus break statement after return
MAINTAINERS: Add entries for Toshiba Visconti PCIe controller
PCI: visconti: Add Toshiba Visconti PCIe host controller driver
PCI/portdrv: Enable Bandwidth Notification only if port supports it
PCI: Allow PASID on fake PCIe devices without TLP prefixes
PCI: mediatek: Use PCI domain to handle ports detection
PCI: mediatek: Add new method to get irq number
...
|