|
Look for fw-features properties to determine the appropriate settings
for the count cache flush, and then call the generic powerpc code to
set it up based on the security feature flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Use the existing hypercall to determine the appropriate settings for
the count cache flush, and then call the generic powerpc code to set
it up based on the security feature flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Some CPU revisions support a mode where the count cache needs to be
flushed by software on context switch. Additionally, some revisions
may have a hardware-accelerated flush, in which case the software
flush sequence can be shortened.
If we detect the appropriate flag from firmware, we patch a branch
into _switch() which takes us to a count cache flush sequence.
That sequence in turn may be patched to return early if we detect
that the CPU supports accelerating the flush sequence in hardware.
Add debugfs support for reporting the state of the flush, as well as
runtime disabling it.
And modify the spectre_v2 sysfs file to report the state of the
software flush.
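As a hedged sketch of how the runtime toggle could look (the symbol
names below are assumptions based on the description above, not the
exact kernel code):

	extern s32 patch__call_flush_count_cache; /* site in _switch() */
	void flush_count_cache(void);		  /* asm flush sequence */

	static void toggle_count_cache_flush(bool enable)
	{
		if (!enable) {
			/* Restore the nop: no flush on context switch. */
			patch_instruction_site(&patch__call_flush_count_cache,
					       PPC_INST_NOP);
			pr_info("count-cache-flush: software flush disabled.\n");
			return;
		}

		/* Patch a branch-and-link to the flush sequence over the nop. */
		patch_branch_site(&patch__call_flush_count_cache,
				  (unsigned long)&flush_count_cache,
				  BRANCH_SET_LINK);
		pr_info("count-cache-flush: software flush enabled.\n");
	}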
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add security feature flags to indicate the need for software to flush
the count cache on context switch, and for the presence of a hardware
assisted count cache flush.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add a macro and some helper C functions for patching single asm
instructions.
The gas macro means we can do something like:
1: nop
patch_site 1b, patch__foo
Which is less visually distracting than defining a GLOBAL symbol at 1,
and also doesn't pollute the symbol table which can confuse eg. perf.
These are obviously similar to our existing feature sections, but are
not automatically patched based on CPU/MMU features, rather they are
designed to be manually patched by C code at some arbitrary point.
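A minimal sketch of the C side (patch__foo matches the asm example
above; some_target is a hypothetical branch destination):

	extern s32 patch__foo;	/* emitted by the patch_site gas macro */
	void some_target(void);	/* illustrative patch destination */

	static void patch_foo(void)
	{
		/* Rewrite the single instruction at the site, e.g. turn
		 * the nop into a branch to some_target(). */
		patch_branch_site(&patch__foo,
				  (unsigned long)&some_target, 0);
	}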
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Use barrier_nospec to sanitize the syscall table.
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Implement barrier_nospec as an isync;sync instruction sequence.
The implementation uses the infrastructure built for BOOK3S 64.
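Illustrative sketch only; the real version sits behind the nospec
patching infrastructure so it can be enabled only on affected CPUs:

	/* Speculation barrier for NXP Book3E: isync then sync. */
	#define barrier_nospec()	asm volatile("isync; sync" : : : "memory")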
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In a subsequent patch we will enable building security.c for Book3E.
However the NXP platforms are not vulnerable to Meltdown, so make the
Meltdown vulnerability reporting PPC_BOOK3S_64 specific.
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Currently we require platform code to call setup_barrier_nospec(). But
if we add an empty definition for the !CONFIG_PPC_BARRIER_NOSPEC case
then we can call it in setup_arch().
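A sketch of the pattern: the no-op stub lets generic code call it
unconditionally (the header placement is an assumption):

	#ifdef CONFIG_PPC_BARRIER_NOSPEC
	void setup_barrier_nospec(void);
	#else
	static inline void setup_barrier_nospec(void) { }
	#endif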
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add a config symbol to encode which platforms support the
barrier_nospec speculation barrier. Currently this is just Book3S 64
but we will add Book3E in a future patch.
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
NXP Book3E platforms are not vulnerable to speculative store
bypass, so make the mitigations PPC_BOOK3S_64 specific.
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Now that '%asm-generic' has been added to no-dot-config-targets,
'make asm-generic' does not include the kernel configuration.
We can simply run 'make asm-generic' from the recursed top Makefile
without invoking syncconfig.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Richard Weinberger <richard@nod.at>
|
|
Randy Dunlap reports UML occasionally fails to build with -j<N> and
O=<builddir> options.
make[1]: Entering directory '/home/rdunlap/mmotm-2018-0802-1529/UM64'
UPD include/generated/uapi/linux/version.h
WRAP arch/x86/include/generated/asm/dma-contiguous.h
WRAP arch/x86/include/generated/asm/export.h
WRAP arch/x86/include/generated/asm/early_ioremap.h
WRAP arch/x86/include/generated/asm/mcs_spinlock.h
WRAP arch/x86/include/generated/asm/mm-arch-hooks.h
WRAP arch/x86/include/generated/uapi/asm/bpf_perf_event.h
WRAP arch/x86/include/generated/uapi/asm/poll.h
GEN ./Makefile
make[2]: *** No rule to make target 'archheaders'. Stop.
arch/um/Makefile:119: recipe for target 'archheaders' failed
make[1]: *** [archheaders] Error 2
make[1]: *** Waiting for unfinished jobs....
UPD include/config/kernel.release
make[1]: *** wait: No child processes. Stop.
Makefile:146: recipe for target 'sub-make' failed
make: *** [sub-make] Error 2
The cause of the problem is the use of '$(MAKE) KBUILD_SRC=',
which recurses to the top Makefile via the $(objtree)/Makefile
generated by scripts/mkmakefile.
When you run "make -j<N> O=<builddir> ARCH=um", Make can execute
'archheaders' and 'outputmakefile' targets simultaneously because
there is no dependency between them.
When this happens,
$(Q)$(MAKE) KBUILD_SRC= ARCH=$(HEADER_ARCH) archheaders
... tries to run the $(objtree)/Makefile while it is still being updated.
The correct way for the recursion is
$(Q)$(MAKE) -f $(srctree)/Makefile ARCH=$(HEADER_ARCH) archheaders
..., which does not rely on the generated Makefile.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Richard Weinberger <richard@nod.at>
|
|
The speculation barrier can be disabled from the command line
with the parameter "nospectre_v1".
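A hedged sketch of the command line handling (the flag name is from
the description above; the variable it sets is illustrative):

	static bool no_nospec;

	static int __init handle_nospectre_v1(char *p)
	{
		no_nospec = true;
		return 0;
	}
	early_param("nospectre_v1", handle_nospectre_v1);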
Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We only need to use __MASKABLE_EXCEPTION in one of the four hardware
interrupt cases, so use the helper macros in the other cases.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We pass the "loc" (location) parameter to MASKABLE_EXCEPTION and
friends, but it's not used, so drop it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
_MASKABLE_RELON_EXCEPTION_PSERIES() does nothing useful, update all
callers to use __MASKABLE_RELON_EXCEPTION_PSERIES() directly.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
_MASKABLE_EXCEPTION_PSERIES() does nothing useful, update all callers
to use __MASKABLE_EXCEPTION_PSERIES() directly.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
To just EXCEPTION_RELON_PROLOG().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The EXCEPTION_RELON_PROLOG_PSERIES_1() macro does the same job as
EXCEPTION_PROLOG_2 (which we just recently created), except for
"RELON" (relocation on) exceptions.
So rename it as such.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
As with the other patches in this series, we are removing the
"PSERIES" from the name as it's no longer meaningful.
Here we can't simply drop the "PSERIES", as that would clash with the
existing EXCEPTION_PROLOG_1.
Instead we name this one EXCEPTION_PROLOG_2, as it's usually used in
sequence after 0 and 1.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The "PSERIES" in STD_EXCEPTION_PSERIES is to differentiate the macros
from the legacy iSeries versions, which are called
STD_EXCEPTION_ISERIES. It is not anything to do with pseries vs
powernv or powermac etc.
We removed the legacy iSeries code in 2012, in commit 8ee3e0d69623x
("powerpc: Remove the main legacy iSerie platform code").
So remove "PSERIES" from the macros.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
EXCEPTION_RELON_PROLOG_PSERIES() only has two users,
STD_RELON_EXCEPTION_PSERIES() and STD_RELON_EXCEPTION_HV() both of
which "call" SET_SCRATCH0(), so just move SET_SCRATCH0() into
EXCEPTION_RELON_PROLOG_PSERIES().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
EXCEPTION_PROLOG_PSERIES() only has two users, STD_EXCEPTION_PSERIES()
and STD_EXCEPTION_HV() both of which "call" SET_SCRATCH0(), so just
move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The pasemi arch code finds the root of the PCI-e bus by searching the
device-tree for a node called 'pxp'. But the root bus has a compatible
property of 'pasemi,rootbus', so search for that instead.
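A sketch of the lookup change (illustrative, not the exact call site):

	#include <linux/of.h>

	static struct device_node *pas_find_pci_root(void)
	{
		/* Match on the compatible property rather than the node
		 * name 'pxp', which the device tree does not guarantee. */
		return of_find_compatible_node(NULL, NULL, "pasemi,rootbus");
	}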
Signed-off-by: Darren Stevens <darren@stevens-zone.net>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The generic implementation of strlen() reads strings byte by byte.
This patch implements strlen() in assembly, based on reading entire
words, in the same spirit as what some other arches and glibc do.
On an 8xx the time spent in strlen() is reduced by 3/4 for long strings.
strlen() selftest on an 8xx provides the following values:
Before the patch (ie with the generic strlen() in lib/string.c):
len 256 : time = 1.195055
len 016 : time = 0.083745
len 008 : time = 0.046828
len 004 : time = 0.028390
After the patch:
len 256 : time = 0.272185 ==> 78% improvement
len 016 : time = 0.040632 ==> 51% improvement
len 008 : time = 0.033060 ==> 29% improvement
len 004 : time = 0.029149 ==> 2% degradation
On a 832x:
Before the patch:
len 256 : time = 0.236125
len 016 : time = 0.018136
len 008 : time = 0.011000
len 004 : time = 0.007229
After the patch:
len 256 : time = 0.094950 ==> 60% improvement
len 016 : time = 0.013357 ==> 26% improvement
len 008 : time = 0.010586 ==> 4% improvement
len 004 : time = 0.008784
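The word-at-a-time idea the assembly implements can be sketched in C
(illustrative only; the real implementation is 32-bit powerpc asm):

	#include <stddef.h>
	#include <stdint.h>

	size_t strlen_words(const char *s)
	{
		const char *p = s;

		/* Go byte by byte until the pointer is word aligned. */
		while ((uintptr_t)p & (sizeof(uint32_t) - 1)) {
			if (!*p)
				return p - s;
			p++;
		}

		for (;;) {
			uint32_t w = *(const uint32_t *)p;

			/* Non-zero iff some byte of w is zero: the
			 * classic (w - 0x01..01) & ~w & 0x80..80 test. */
			if ((w - 0x01010101u) & ~w & 0x80808080u)
				break;
			p += sizeof(uint32_t);
		}

		/* Locate the exact zero byte within the last word. */
		while (*p)
			p++;
		return p - s;
	}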
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
rtas_log_buf is a buffer that holds RTAS event data communicated to
the kernel by the hypervisor. This buffer is then used to pass RTAS
event data to user space through procfs. The buffer is allocated from
the vmalloc (non-linear mapping) area.
On a machine check interrupt, register r3 points to the RTAS extended
event log passed by the hypervisor that contains the MCE event. The
pseries machine check handler then logs this error into rtas_log_buf.
Since rtas_log_buf is a vmalloc-ed (non-linear) buffer, we end up
taking a page fault (vector 0x300) while accessing it. Because the
machine check interrupt handler runs in NMI context, we cannot afford
to take any page fault: page faults are not honored in NMI context
and cause a kernel panic. Apart from that, as Nick pointed out,
pSeries_log_error() also takes a spinlock while logging the error,
which is not safe in NMI context; it may end up in a deadlock if we
get another MCE before releasing the lock. Fix this by deferring the
logging of the RTAS error to an irq work queue.
The current implementation uses two different buffers to hold the
RTAS error log, depending on whether an extended log is provided or
not. This makes it difficult to identify which buffer has valid data
that needs to be logged later in irq work. Simplify this by using a
single buffer, one per paca, and copying the RTAS log to it
irrespective of whether an extended log is provided or not. Allocate
this buffer below the RMA region so that it can be accessed in the
real-mode MCE handler.
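A hedged sketch of the deferral pattern; buffer and function names
here are illustrative, not the exact kernel symbols:

	#include <linux/irq_work.h>
	#include <linux/string.h>

	static char *mce_data_buf;	/* allocated below the RMA at boot */
	static struct irq_work mce_errlog_work;

	/* Runs later, in a context where taking the RTAS log lock and
	 * touching vmalloc-ed memory are both safe. */
	static void mce_process_errlog(struct irq_work *work)
	{
		log_error(mce_data_buf, ERR_TYPE_RTAS_LOG, 0);
	}

	/* Called from the machine check handler (NMI context): only
	 * copy and queue; take no locks, touch no vmalloc memory. */
	static void mce_save_errlog(const char *rtas_log)
	{
		memcpy(mce_data_buf, rtas_log, RTAS_ERROR_LOG_MAX);
		irq_work_queue(&mce_errlog_work);
	}

	static int __init mce_errlog_init(void)
	{
		init_irq_work(&mce_errlog_work, mce_process_errlog);
		return 0;
	}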
Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt")
Cc: stable@vger.kernel.org # v4.14+
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The global MCE data buffer that is used to copy the RTAS error log is
2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read
extended_log_length from the RTAS error log header, then use the max
of extended_log_length and RTAS_ERROR_LOG_MAX as the size of the data
to be copied.
Ideally the platform (phyp) will never send an extended error log
with size > 2048. But if that happens, we risk a buffer overrun and
corruption. Fix this by using min_t instead.
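In spirit the fix is a one-liner; a sketch with illustrative names:

	static void save_mce_log(const char *errlog, u32 extended_log_length)
	{
		/* min_t, not max: never copy more than the destination
		 * buffer (RTAS_ERROR_LOG_MAX bytes) can hold. */
		u32 len = min_t(u32, extended_log_length,
				RTAS_ERROR_LOG_MAX);

		memcpy(global_mce_data_buf, errlog, len);
	}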
Fixes: d368514c3097 ("powerpc: Fix corruption when grabbing FWNMI data")
Reported-by: Michal Suchanek <msuchanek@suse.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
It's identical to xive_teardown_cpu(), so just use the latter.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Those overly verbose statements in the setup of the pool VP
aren't particularly useful (especially considering we don't actually
use the pool; we only configure it because the hardware requires it).
So remove them, which improves the code readability.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The kernel page table caches are tied to init_mm, so there is no
more need for them after userspace is finished.
destroy_context() gets called when we drop the last reference for an
mm, which can be much later than the task exit due to other lazy mm
references to it. We can free the page table cache pages on task exit
because they only cache the userspace page tables and kernel threads
should not access user space addresses.
The mappings for kernel threads themselves are maintained in init_mm,
and the page table cache for those is attached to init_mm.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Merge change log additions from Aneesh]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
When the mm is being torn down there will be a full PID flush so
there is no need to flush the TLB on page size changes.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
This makes it easy to run checkpatch with settings that I like.
Usage is eg:
$ ./arch/powerpc/tools/checkpatch.sh -g origin/master..
To check all commits since origin/master.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Russell Currey <ruscur@russell.cc>
|
|
We recently added a warning in arch_local_irq_restore() to check that
the soft masking state matches reality.
Unfortunately it trips in a few places which are not entirely trivial
to fix. The key problem is that if we're doing function_graph tracing
of restore_math(), the warning fires and then seems to recurse. It's
not entirely clear, because the system continuously oopses on all
CPUs, with the output interleaved and unreadable.
It's also been observed on a G5 coming out of idle.
Until we can fix those cases, disable the warning for now.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
For machines without the exrl instruction the BPF JIT generates
code that uses a "br %r1" instruction located in the lowcore page.
Unfortunately there is a cut & paste error that puts an additional
"larl %r1,.+14" instruction in the code, which clobbers the branch
target address in %r1. Remove the larl instruction.
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: de5cb6eb51 ("s390: use expoline thunks in the BPF JIT")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
The memmove, memset, memcpy, __memset16, __memset32 and __memset64
functions have an additional indirect return branch in the form of a
"bzr" instruction. These need to use expolines as well.
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 97489e0663 ("s390/lib: use expoline for indirect branches")
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Josh reported that the late SMT evaluation in cpu_smt_state_init()
sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED when 'nosmt' is supplied
on the kernel command line, as it cannot differentiate between SMT
disabled by the BIOS and SMT soft-disabled via 'nosmt'. That wrecks
the state and makes the sysfs interface unusable.
Rework this so that during bringup of the non-boot CPUs the
availability of SMT is determined in cpu_smt_allowed(). If a newly
booted CPU is not a 'primary' thread then set the local
cpu_smt_available marker and evaluate this explicitly right after the
initial SMP bringup has finished.
SMT evaluation on x86 is a trainwreck, as the firmware has all the
information _before_ booting the kernel, yet there is no interface to
query it.
Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Enhance the GHASH implementation that uses 64-bit polynomial
multiplication by adding support for 4-way aggregation. This
more than doubles the performance, from 2.4 cycles per byte
to 1.1 cpb on Cortex-A53.
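The aggregation trick, sketched as math: with H^2..H^4 precomputed,
the four carry-less products are summed before a single reduction
modulo the field polynomial, instead of reducing after every block
(\oplus is addition in GF(2^128), i.e. XOR):

	Y_i = (Y_{i-4} \oplus X_{i-3}) \cdot H^4
	      \oplus X_{i-2} \cdot H^3
	      \oplus X_{i-1} \cdot H^2
	      \oplus X_i \cdot H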
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Checking the TIF_NEED_RESCHED flag is disproportionately costly on cores
with fast crypto instructions and comparatively slow memory accesses.
On algorithms such as GHASH, which execute at ~1 cycle per byte on
cores that implement support for 64-bit polynomial multiplication,
there is really no need to check the TIF_NEED_RESCHED flag
particularly often, and so we can remove the NEON yield check from
the assembler routines.
However, unlike the AEAD or skcipher APIs, the shash/ahash APIs take
arbitrary input lengths, and so there needs to be some sanity check
to ensure that we don't hog the CPU for excessive amounts of time.
So let's simply cap the maximum input size that is processed in one go
to 64 KB.
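A hedged sketch of the capping loop (the context type and the update
hook named here are assumptions, not the exact driver symbols):

	#include <linux/sizes.h>

	#define MAX_BYTES_PER_CALL	SZ_64K	/* bound NEON hold time */

	static void ghash_update_capped(struct ghash_ctx *ctx,
					const u8 *src, unsigned int len)
	{
		do {
			unsigned int chunk = min_t(unsigned int, len,
						   MAX_BYTES_PER_CALL);

			kernel_neon_begin();
			pmull_ghash_update(ctx, src, chunk); /* hypothetical */
			kernel_neon_end();

			src += chunk;
			len -= chunk;
		} while (len);
	}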
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
It turns out I had misunderstood how the x86_match_cpu() function works.
It evaluates a logical OR of the matching conditions, not logical AND.
This caused the CPU feature checks for AEGIS to pass even if only SSE2
(but not AES-NI) was supported (or vice versa), leading to potential
crashes if something tried to use the registered algs.
This patch switches the checks to a simpler method that is used e.g. in
the Camellia x86 code.
The patch also removes the MODULE_DEVICE_TABLE declarations, which
actually seem to cause the modules to be auto-loaded at boot; that is
not desired, since the crypto API's on-demand module loading is
sufficient.
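A sketch of the simpler check, modeled on the description above (the
alg table name is illustrative):

	static int __init aegis_aesni_init(void)
	{
		/* Require *both* features explicitly; x86_match_cpu()
		 * would have matched if either table entry matched. */
		if (!boot_cpu_has(X86_FEATURE_XMM2) ||
		    !boot_cpu_has(X86_FEATURE_AES))
			return -ENODEV;

		return crypto_register_aeads(crypto_algs,
					     ARRAY_SIZE(crypto_algs));
	}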
Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations")
Fixes: 6ecc9d9ff91f ("crypto: x86 - Add optimized MORUS implementations")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Squeeze out another 5% of performance by minimizing the number
of invocations of kernel_neon_begin()/kernel_neon_end() on the
common path, which also allows some reloads of the key schedule
to be optimized away.
The resulting code runs at 2.3 cycles per byte on a Cortex-A53.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Implement a faster version of the GHASH transform which amortizes
the reduction modulo the characteristic polynomial across two
input blocks at a time.
On a Cortex-A53, gcm(aes) performance increases by 24%, from
3.0 cycles per byte to 2.4 cpb for large input sizes.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Update the core AES/GCM transform and the associated plumbing to operate
on 2 AES/GHASH blocks at a time. By itself, this is not expected to
result in a noticeable speedup, but it paves the way for reimplementing
the GHASH component using 2-way aggregation.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Merge crypto-2.6 to pick up the NEON yield revert.
|