linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2018-08-07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-07 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add cgroup local storage for BPF programs, which provides a fast accessible memory for storing various per-cgroup data like number of transmitted packets, etc, from Roman. 2) Support bpf_get_socket_cookie() BPF helper in several more program types that have a full socket available, from Andrey. 3) Significantly improve the performance of perf events which are reported from BPF offload. Also convert a couple of BPF AF_XDP samples overto use libbpf, both from Jakub. 4) seg6local LWT provides the End.DT6 action, which allows to decapsulate an outer IPv6 header containing a Segment Routing Header. Adds this action now to the seg6local BPF interface, from Mathieu. 5) Do not mark dst register as unbounded in MOV64 instruction when both src and dst register are the same, from Arthur. 6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier instructions on arm64 for the AF_XDP sample code, from Brian. 7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts over from Python 2 to Python 3, from Jeremy. 8) Enable BTF build flags to the BPF sample code Makefile, from Taeung. 9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee. 10) Several improvements to the README.rst from the BPF documentation to make it more consistent with RST format, from Tobin. 11) Replace all occurrences of strerror() by calls to strerror_r() in libbpf and fix a FORTIFY_SOURCE build error along with it, from Thomas. 12) Fix a bug in bpftool's get_btf() function to correctly propagate an error via PTR_ERR(), from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07	xfs: remove dead error handling code in xfs_dquot_disk_alloc()	Brian Foster
	Colin Ian King reports that commit 82ff27bc52 ("xfs: automatic dfops buffer relogging") leaves around some dead error handling code in xfs_dquot_disk_alloc(). This was discovered via Coverity scan. Since the associated commit eliminates the act of joining a buffer to a dfops, this intermediate error state is no longer possible and the error handling code can be removed. Since the caller cancels the transaction on error, which cancels the dfops, eliminate the unnecessary xfs_defer_cancel() call and error handling labels. Fixes: 82ff27bc52 ("xfs: automatic dfops buffer relogging") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-08-07	xfs: use WRITE_ONCE to update if_seq	Christoph Hellwig
	This adds ordering of the updates and makes sure we always see the if_seq update before the extent tree is modified. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-08-07	MIPS: Use dins to simplify __write_64bit_c0_split()	Paul Burton
	The code in __write_64bit_c0_split() is used by MIPS32 kernels running on MIPS64 CPUs to write a 64-bit value to a 64-bit coprocessor 0 register using a single 64-bit dmtc0 instruction. It does this by combining the 2x 32-bit registers used to hold the 64-bit value into a single register, which in the existing code involves three steps: 1) Zero extend register A which holds bits 31:0 of our data, since it may have previously held a sign-extended value. 2) Shift register B which holds bits 63:32 of our data in bits 31:0 left by 32 bits, such that the bits forming our data are in the position they'll be in the final 64-bit value & bits 31:0 of the register are zero. 3) Or the two registers together to form the 64-bit value in one 64-bit register. From MIPS r2 onwards we have a dins instruction which can effectively perform all 3 of those steps using a single instruction. Add a path for MIPS r2 & beyond which uses dins to take bits 31:0 from register B & insert them into bits 63:32 of register A, giving us our full 64-bit value in register A with one instruction. Since we know that MIPS r2 & above support the sel field for the dmtc0 instruction, we don't bother special casing sel==0. Omiting the sel field would assemble to exactly the same instruction as when we explicitly specify that it equals zero. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	MIPS: Use read-write output operand in __write_64bit_c0_split()	Paul Burton
	Commit c22c80431055 ("MIPS: Fix input modify in __write_64bit_c0_split()") modified __write_64bit_c0_split() constraints such that we have both an input & an output which we hope to assign to the same registers, and modify the output rather than incorrectly clobbering an input. The way in which we use both an output & an input parameter with the input constrained to share the output registers is a little convoluted & also problematic for clang, which complains if the input & output values have different widths. For example: In file included from kernel/fork.c:98: ./arch/mips/include/asm/mmu_context.h:149:19: error: unsupported inline asm: input with type 'unsigned long' matching output with type 'unsigned long long' write_c0_entryhi(cpu_asid(cpu, next)); ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ ./arch/mips/include/asm/mmu_context.h:93:2: note: expanded from macro 'cpu_asid' (cpu_context((cpu), (mm)) & cpu_asid_mask(&cpu_data[cpu])) ^ ./arch/mips/include/asm/mipsregs.h:1617:65: note: expanded from macro 'write_c0_entryhi' #define write_c0_entryhi(val) __write_ulong_c0_register($10, 0, val) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./arch/mips/include/asm/mipsregs.h:1430:39: note: expanded from macro '__write_ulong_c0_register' __write_64bit_c0_register(reg, sel, val); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ./arch/mips/include/asm/mipsregs.h:1400:41: note: expanded from macro '__write_64bit_c0_register' __write_64bit_c0_split(register, sel, value); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ ./arch/mips/include/asm/mipsregs.h:1498:13: note: expanded from macro '__write_64bit_c0_split' : "r,0" (val)); \ ^~~ We can both fix this build failure & simplify the code somewhat by assigning the __tmp variable with the input value in C prior to our inline assembly, and then using a single read-write output operand (ie. a constraint beginning with +) to provide this value to our assembly. Signed-off-by: Paul Burton <paul.burton@mips.com>
2018-08-07	x86/mm/pti: Fix 32 bit PCID check	Joerg Roedel
	The check uses the wrong operator and causes false positive warnings in the kernel log on some systems. Fixes: 5e8105950a8b3 ('x86/mm/pti: Add Warning when booting on a PCID capable CPU') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Waiman Long <llong@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "David H . Gutteridge" <dhgutteridge@sympatico.ca> Cc: joro@8bytes.org Link: https://lkml.kernel.org/r/1533637471-30953-2-git-send-email-joro@8bytes.org
2018-08-07	hwmon: (k10temp) 27C Offset needed for Threadripper2	Michael Larabel
	For at least the Threadripper 2950X and Threadripper 2990WX, it's confirmed a 27 degree offset is needed. Signed-off-by: Michael Larabel <michael@phoronix.com> Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-08-07	ieee802154: hwsim: fix rcu address annotation	Alexander Aring
	This patch fixes the following sparse warning about mismatch rcu attribute for address space annotation: ... error: incompatible types in comparison expression (different modifiers) error: incompatible types in comparison expression (different address spaces) ... Some __rcu annotation was at non-pointers list head structures and one was missing in edge information which is used by rcu_assign_pointer() to update edge setting information. Cc: Stefan Schmidt <stefan@datenfreihafen.org> Fixes: f25da51fdc38 ("ieee802154: hwsim: add replacement for fakelb") Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07	xen: don't use privcmd_call() from xen_mc_flush()	Juergen Gross
	Using privcmd_call() for a singleton multicall seems to be wrong, as privcmd_call() is using stac()/clac() to enable hypervisor access to Linux user space. Even if currently not a problem (pv domains can't use SMAP while HVM and PVH domains can't use multicalls) things might change when PVH dom0 support is added to the kernel. Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-07	i40e: convert priority flow control stats to use helpers	Jacob Keller
	The priority flow control statistics are laid out in the stats structure using arrays. This made it unwieldy to use as part of an i40e_stats array. Add a new structure type, i40e_pfc_stats, and a helper function i40e_get_pfc_stats which can return the stats for a given priority value as an i40e_pfc_stats structure. Use this to create an i40e_stats array, which we'll use to format and copy the strings and stats into the supplied buffers. This reduces even more boiler plate code in i40e_get_ethtool_stats and i40e_get_stat_strings. An alternative would be to modify the structure definition for the pfc stats, but this is more invasive to the rest of the code base. Note that a macro was used to setup the copy of stats from the pf->stats, as this reduces the chance of typos in the code names. It will produce a checkpatch.pl warning due to re-use of a macro argument. In this case, it should be safe, as the macro will fail to compile in cases where the argument is not a simple structure member name, and thus arguments with side effects should not be an issue. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e: convert VEB TC stats to use an i40e_stats array	Jacob Keller
	The VEB TC stats are currently implemented with separate parsing, instead of using the i40e_stats array and associated helper functions. This is likely because the stats rely on embedding the TC number into the stat name. Update i40e_add_stat_strings to take variadic arguments, and use these to vsnprintf the i40e_stats string as a string containing format specifiers. Create a stats array for the VEB TC related stats, i40e_gstrings_veb_tc_stats, and use this along with the helper functions to remove the specialized boiler plate code. Always call i40e_add_ethtool_stats for both this array and the general VEB stats array. This ensures that we zero out any memory in case it was not zero-allocated for us. This ultimately results in less boiler plate code for the i40e_get_stat_strings and i40e_get_ethtool_stats. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e: Set fec_config when forcing link state	Mariusz Stachura
	This patch configures FEC setting in i40e_force_link_state(). For some reason setting this field was overlooked thus causing 25G link to be configured incorrectly. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e: add helper to copy statistic values into ethtool buffer	Jacob Keller
	Similar to the helper function to copy the ethtool stats strings, add and use a helper function for copying the ethtool stats into the supplied buffer. Just like before, we use a macro to avoid having to pass ARRAY_SIZE manually, so as to reduce chance of bugs. Some of the stats, especially queue stats, are a bit trickier, and will be handled in future patches. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e: add helper function for copying strings from stat arrays	Jacob Keller
	Many of the ethtool statistics use the same basic logic for copying strings into the supplied buffer. A set of stats are stored in a const array of i40e_stats structures, and we apply these all together. Simplify the stats code by introducing a helper function which can take a stats array and copy the strings into the buffer, updating the buffer pointer as we go. We use a macro to implement i40e_add_stat_strings so that ARRAY_SIZE can be used on the array passed in. This ensures that we always use the matching size in __i40e_add_stat_strings. More complex stats currently do not use i40e_stats arrays, usually due to custom formatted strings, or because the stats are not laid out in the expected way. These stats will be updated to use the helper function in separate future patches. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e/i40evf: remove redundant functions i40evf_aq_{set/get}_phy_register	YueHaibing
	There are no in-tree callers. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	i40e: Remove duplicated prepare call in i40e_shutdown	Sergey Nemov
	Function call to i40e_prep_for_reset() is duplicated in i40e_shutdown routine and gets called before i40e_enable_mc_magic_wake() which blocks it from being executed correctly on system reboot or shutdown because adminq is already disabled by first i40e_prep_for_reset() call. Two register write calls are also duplicated. Signed-off-by: Sergey Nemov <sergey.nemov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07	netfilter: nft_ct: enable conntrack for helpers	Pablo Neira Ayuso
	Enable conntrack if the user defines a helper to be used from the ruleset policy. Fixes: 1a64edf54f55 ("netfilter: nft_ct: add helper set support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	netfilter: nft_ct: add ct timeout support	Harsha Sharma
	This patch allows to add, list and delete connection tracking timeout policies via nft objref infrastructure and assigning these timeout via nft rule. %./libnftnl/examples/nft-ct-timeout-add ip raw cttime tcp Ruleset: table ip raw { ct timeout cttime { protocol tcp; policy = {established: 111, close: 13 } } chain output { type filter hook output priority -300; policy accept; ct timeout set "cttime" } } %./libnftnl/examples/nft-rule-ct-timeout-add ip raw output cttime %conntrack -E [NEW] tcp 6 111 ESTABLISHED src=172.16.19.128 dst=172.16.19.1 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128 sport=41360 dport=22 %nft delete rule ip raw output handle <handle> %./libnftnl/examples/nft-ct-timeout-del ip raw cttime Joint work with Pablo Neira. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	netfilter: remove ifdef around cttimeout in struct nf_conntrack_l4proto	Pablo Neira Ayuso
	Simplify this, include it inconditionally in this structure layout as we do with ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout object	Pablo Neira Ayuso
	The timeout policy is currently embedded into the nfnetlink_cttimeout object, move the policy into an independent object. This allows us to reuse part of the existing conntrack timeout extension from nf_tables without adding dependencies with the nfnetlink_cttimeout object layout. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	netfilter: cttimeout: move ctnl_untimeout to nf_conntrack	Harsha Sharma
	As, ctnl_untimeout is required by nft_ct, so move ctnl_timeout from nfnetlink_cttimeout to nf_conntrack_timeout and rename as nf_ct_timeout. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	netfilter: nft_osf: use NFT_OSF_MAXGENRELEN instead of IFNAMSIZ	Fernando Fernandez Mancera
	As no "genre" on pf.os exceed 16 bytes of length, we reduce NFT_OSF_MAXGENRELEN parameter to 16 bytes and use it instead of IFNAMSIZ. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07	gfs2: Fix gfs2_testbit to use clone bitmaps	Bob Peterson
	Function gfs2_testbit is called in three places. Two of those places, gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used by the block reservations scheme to determine the length of an extent of free blocks. Before this patch, it wasn't using the clone bitmap, which means recently-freed blocks were treated as free blocks for the purposes of an allocation. This patch adds a new parameter to gfs2_testbit to indicate whether or not the clone bitmaps should be used (if available). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-08-08	powerpc/powernv/opal: Use standard interrupts property when available	Benjamin Herrenschmidt
	For (bad) historical reasons, OPAL used to create a non-standard pair of properties "opal-interrupts" and "opal-interrupts-names" for representing the list of interrupts it wants Linux to request on its behalf. Among other issues, the opal-interrupts doesn't have a way to carry the type of interrupts, and they were assumed to be all level sensitive. This is wrong on some recent systems where some of them are edge sensitive causing warnings in the XIVE code and possible misbehaviours if they need to be retriggered (typically the NPU2 TCE error interrupts). This makes Linux switch to using the standard "interrupts" and "interrupt-names" properties instead when they are available, using standard of_irq helpers, which can carry all the desired type information. Newer versions of OPAL will generate those properties in addition to the legacy ones. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Fixup prefix logic to check strlen(r->name). Reinstate setting of start = 0 in opal_event_shutdown() to avoid double free warnings] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc: Allow CPU selection of e300core variants	Christophe Leroy
	GCC supports -mcpu=e300c2 and -mcpu=e300c3 This patch gives the opportunity to tune kernel to one of those two types. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc: Allow CPU selection also on PPC32	Christophe Leroy
	This patch extends to PPC32 the capability to select the exact CPU type. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc: Make CPU selection logic generic in Makefile	Christophe Leroy
	At the time being, when adding a new CPU for selection, both Kconfig.cputype and Makefile have to be modified. This patch moves into Kconfig.cputype the name of the CPU to me passed to the -mcpu= argument. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Rename the option to TARGET_CPU to echo the gcc documentation] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/Makefiles: Convert ifeq to ifdef where possible	Rodrigo R. Galvao
	In Makefiles if we're testing a CONFIG_FOO symbol for equality with 'y' we can instead just use ifdef. The latter reads easily, so convert to it where possible. Signed-off-by: Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com> Reviewed-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/64: Copy as much as possible in __copy_tofrom_user	Paul Mackerras
	In __copy_tofrom_user, if we encounter an exception on a store, we stop copying and return the number of bytes not copied. However, if the store is wider than one byte and is to an unaligned address, it is possible that the store operand overlaps a page boundary and the exception occurred on the latter part of the store operand, meaning that it would be possible to copy a few more bytes. Since copy_to_user is generally expected to copy as much as possible, it would be better to copy those extra few bytes. This adds code to do that. Since this edge case is not performance-critical, the code has been written to be compact rather than as fast as possible. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	selftests/powerpc/64: Test exception cases in copy_tofrom_user	Michael Ellerman
	This adds a set of test cases to test the behaviour of copy_tofrom_user when exceptions are encountered accessing the source or destination. Currently, copy_tofrom_user does not always copy as many bytes as possible when an exception occurs on a store to the destination, and that is reflected in failures in these tests. Based on a test program from Anton Blanchard. [paulus@ozlabs.org - test all three paths, wrote commit description, made EX_TABLE create an exception table.] Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	selftests/powerpc/64: Test all paths through copy routines	Paul Mackerras
	The hand-coded assembler 64-bit copy routines include feature sections that select one code path or another depending on which CPU we are executing on. The self-tests for these copy routines end up testing just one path. This adds a mechanism for selecting any desired code path at compile time, and makes 2 or 3 versions of each test, each using a different code path, so as to cover all the possible paths. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Add -mcpu=power4 to CFLAGS for older compilers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/64: Make exception table clearer in __copy_tofrom_user_base	Paul Mackerras
	This aims to make the generation of exception table entries for the loads and stores in __copy_tofrom_user_base clearer and easier to verify. Instead of having a series of local labels on the loads and stores, with a series of corresponding labels later for the exception handlers, we now use macros to generate exception table entries at the point of each load and store that could potentially trap. We do this with the macros lex (load exception) and stex (store exception). These macros are used right before the load or store to which they apply. Some complexity is introduced by the fact that we have some more work to do after hitting an exception, because we need to calculate and return the number of bytes not copied. The code uses r3 as the current pointer into the destination buffer, that is, the address of the first byte of the destination that has not been modified. However, at various points in the copy loops, r3 can be 4, 8, 16 or 24 bytes behind that point. To express this offset in an understandable way, we define a symbol r3_offset which is updated at various points so that it equal to the difference between the address of the first unmodified byte of the destination and the value in r3. (In fact it only needs to be accurate at the point of each lex or stex macro invocation.) The rules for updating r3_offset are as follows: * It starts out at 0 * An addi r3,r3,N instruction decreases r3_offset by N * A store instruction (stb, sth, stw, std) to N(r3) increases r3_offset by the width of the store (1, 2, 4, 8) * A store with update instruction (stbu, sthu, stwu, stdu) to N(r3) sets r3_offset to the width of the store. There is some trickiness to the way that the lex and stex macros and the associated exception handlers work. I would have liked to use the current value of r3_offset in the name of the symbol used as the exception handler, as in ".Lld_exc_$(r3_offset)" and then have symbols .Lld_exc_0, .Lld_exc_8, .Lld_exc_16 etc. corresponding to the offsets that needed to be added to r3. However, I couldn't see a way to do that with gas. Instead, the exception handler address is .Lld_exc - r3_offset or .Lst_exc - r3_offset, that is, the distance ahead of .Lld_exc/.Lst_exc that we start executing is equal to the amount that we need to add to r3. This works because r3_offset is always a small multiple of 4, and our instructions are 4 bytes long. This means that before .Lld_exc and .Lst_exc, we have a sequence of instructions that increments r3 by 4, 8, 16 or 24 depending on where we start. The sequence increments r3 by 4 per instruction (on average). We also replace the exception table for the 4k copy loop by a macro per load or store. These loads and stores all use exactly the same exception handler, which simply resets the argument registers r3, r4 and r5 to there original values and re-does the whole copy using the slower loop. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/powermac: of_node_put() is not needed after iterator	zhong jiang
	for_each_node_by_name() iterators only exit normally when the loop cursor is NULL, So there is no need to call of_node_put(). Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	crypto/nx: Initialize 842 high and normal RxFIFO control registers	Haren Myneni
	NX increments readOffset by FIFO size in receive FIFO control register when CRB is read. But the index in RxFIFO has to match with the corresponding entry in FIFO maintained by VAS in kernel. Otherwise NX may be processing incorrect CRBs and can cause CRB timeout. VAS FIFO offset is 0 when the receive window is opened during initialization. When the module is reloaded or in kexec boot, readOffset in FIFO control register may not match with VAS entry. This patch adds nx_coproc_init OPAL call to reset readOffset and queued entries in FIFO control register for both high and normal FIFOs. Signed-off-by: Haren Myneni <haren@us.ibm.com> [mpe: Fixup uninitialized variable warning] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/powernv: Export opal_check_token symbol	Haren Myneni
	Export opal_check_token symbol for modules to check the availability of OPAL calls before using them. Signed-off-by: Haren Myneni <haren@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/platforms/85xx: fix t1042rdb_diu.c build errors & warning	Randy Dunlap
	Fix build errors and warnings in t1042rdb_diu.c by adding header files and MODULE_LICENSE(). ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: data definition has no type or storage class early_initcall(t1042rdb_diu_init); ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: error: type defaults to 'int' in declaration of 'early_initcall' [-Werror=implicit-int] ../arch/powerpc/platforms/85xx/t1042rdb_diu.c:152:1: warning: parameter names (without types) in function declaration and WARNING: modpost: missing MODULE_LICENSE() in arch/powerpc/platforms/85xx/t1042rdb_diu.o Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Scott Wood <oss@buserror.net> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/perf: Remove sched_task function defined for thread-imc	Anju T Sudhakar
	Call trace observed while running perf-fuzzer: CPU: 43 PID: 9088 Comm: perf_fuzzer Not tainted 4.13.0-32-generic #35~lp1746225 task: c000003f776ac900 task.stack: c000003f77728000 NIP: c000000000299b70 LR: c0000000002a4534 CTR: c00000000029bb80 REGS: c000003f7772b760 TRAP: 0700 Not tainted (4.13.0-32-generic) MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24008822 XER: 00000000 CFAR: c000000000299a70 SOFTE: 0 GPR00: c0000000002a4534 c000003f7772b9e0 c000000001606200 c000003fef858908 GPR04: c000003f776ac900 0000000000000001 ffffffffffffffff 0000003fee730000 GPR08: 0000000000000000 0000000000000000 c0000000011220d8 0000000000000002 GPR12: c00000000029bb80 c000000007a3d900 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 c000003f776ad090 c000000000c71354 GPR24: c000003fef716780 0000003fee730000 c000003fe69d4200 c000003f776ad330 GPR28: c0000000011220d8 0000000000000001 c0000000014c6108 c000003fef858900 NIP [c000000000299b70] perf_pmu_sched_task+0x170/0x180 LR [c0000000002a4534] __perf_event_task_sched_in+0xc4/0x230 Call Trace: perf_iterate_sb+0x158/0x2a0 (unreliable) __perf_event_task_sched_in+0xc4/0x230 finish_task_switch+0x21c/0x310 __schedule+0x304/0xb80 schedule+0x40/0xc0 do_wait+0x254/0x2e0 kernel_wait4+0xa0/0x1a0 SyS_wait4+0x64/0xc0 system_call+0x58/0x6c Instruction dump: 3beafea0 7faa4800 409eff18 e8010060 eb610028 ebc10040 7c0803a6 38210050 eb81ffe0 eba1ffe8 ebe1fff8 4e800020 <0fe00000> 4bffffbc 60000000 60420000 ---[ end trace 8c46856d314c1811 ]--- The context switch call-backs for thread-imc are defined in sched_task function. So when thread-imc events are grouped with software pmu events, perf_pmu_sched_task hits the WARN_ON_ONCE condition, since software PMUs are assumed not to have a sched_task defined. Patch to move the thread_imc enable/disable opal call back from sched_task to event_[add/del] function Fixes: f74c89bd80fb ("powerpc/perf: Add thread IMC PMU support") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/64s: Fix page table fragment refcount race vs speculative references	Nicholas Piggin
	The page table fragment allocator uses the main page refcount racily with respect to speculative references. A customer observed a BUG due to page table page refcount underflow in the fragment allocator. This can be caused by the fragment allocator set_page_count stomping on a speculative reference, and then the speculative failure handler decrements the new reference, and the underflow eventually pops when the page tables are freed. Fix this by using a dedicated field in the struct page for the page table fragment allocator. Fixes: 5c1f6ee9a31c ("powerpc: Reduce PTE table memory wastage") Cc: stable@vger.kernel.org # v3.10+ Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	misc: cxl: changed asterisk position	Parth Y Shah
	Resolved <"foo* bar" should be "foo *bar"> error Signed-off-by: Parth Y Shah <sparth1292@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/pasemi: Use pr_err/pr_warn... for kernel messages	Darren Stevens
	Pasemi code still uses printk(KERN_ERR/KERN_WARN ... change these to pr_err(, pr_warn(... to match other powerpc arch code. No functional changes. Signed-off-by: Darren Stevens <darren@stevens-zone.net> [mpe: Unsplit some strings while we're at it] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/traps: Show instructions on exceptions	Murilo Opsfelder Araujo
	Call show_user_instructions() in arch/powerpc/kernel/traps.c to dump instructions at faulty location, useful to debugging. Before this patch, an unhandled signal message looked like: pandafault[10524]: segfault (11) at 100007d0 nip 1000061c lr 7fffbd295100 code 2 in pandafault[10000000+10000] After this patch, it looks like: pandafault[10524]: segfault (11) at 100007d0 nip 1000061c lr 7fffbd295100 code 2 in pandafault[10000000+10000] pandafault[10524]: code: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe pandafault[10524]: code: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc: Add show_user_instructions()	Murilo Opsfelder Araujo
	show_user_instructions() is a slightly modified version of show_instructions() that allows userspace instruction dump. This will be useful within show_signal_msg() to dump userspace instructions of the faulty location. Here is a sample of what show_user_instructions() outputs: pandafault[10850]: code: 4bfffeec 4bfffee8 3c401002 38427f00 fbe1fff8 f821ffc1 7c3f0b78 3d22fffe pandafault[10850]: code: 392988d0 f93f0020 e93f0020 39400048 <99490000> 39200000 7d234b78 383f0040 The current->comm and current->pid printed can serve as a glue that links the instructions dump to its originator, allowing messages to be interleaved in the logs. Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/traps: Print VMA for unhandled signals	Murilo Opsfelder Araujo
	This adds VMA address in the message printed for unhandled signals, similarly to what other architectures, like x86, print. Before this patch, a page fault looked like: pandafault[61470]: unhandled signal 11 at 100007d0 nip 1000061c lr 7fff8d185100 code 2 After this patch, a page fault looks like: pandafault[6303]: segfault 11 at 100007d0 nip 1000061c lr 7fff93c55100 code 2 in pandafault[10000000+10000] Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/traps: Use %lx format in show_signal_msg()	Murilo Opsfelder Araujo
	Use %lx format to print registers. This avoids having two different formats and avoids checking for MSR_64BIT, improving readability of the function. Even though we could have used %px, which is functionally equivalent to %lx as per Documentation/core-api/printk-formats.rst, it is not semantically correct because the data printed are not pointers. And using %px requires casting data to (void *). Besides that, %lx matches the format used in show_regs(). Before this patch: pandafault[4808]: unhandled signal 11 at 0000000010000718 nip 0000000010000574 lr 00007fff935e7a6c code 2 After this patch: pandafault[4732]: unhandled signal 11 at 10000718 nip 10000574 lr 7fff86697a6c code 2 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/traps: Use an explicit ratelimit state for show_signal_msg()	Murilo Opsfelder Araujo
	Replace printk_ratelimited() by printk() and a default rate limit burst to limit displaying unhandled signals messages. This will allow us to call print_vma_addr() in a future patch, which does not work with printk_ratelimited(). Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	powerpc/traps: Print unhandled signals in a separate function	Murilo Opsfelder Araujo
	Isolate the logic of printing unhandled signals out of _exception_pkey(). No functional change, only code rearrangement. Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-08-08	selftests/powerpc: Add more version checks to alignment_handler test	Michael Ellerman
	The alignment_handler is documented to only work on Power8/Power9, but we can make it run on older CPUs by guarding more of the tests with feature checks. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
2018-08-08	selftests/powerpc: Skip earlier in alignment_handler test	Michael Ellerman
	Currently the alignment_handler test prints "Can't open /dev/fb0" about 80 times per run, which is a little annoying. Refactor it to check earlier if it can open /dev/fb0 and skip if not, this results in each test printing something like: test: test_alignment_handler_vsx_206 tags: git_version:v4.18-rc3-134-gfb21a48904aa [SKIP] Test skipped on line 291 skip: test_alignment_handler_vsx_206 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
2018-08-08	powerpc/64s: Make rfi_flush_fallback a little more robust	Michael Ellerman
	Because rfi_flush_fallback runs immediately before the return to userspace it currently runs with the user r1 (stack pointer). This means if we oops in there we will report a bad kernel stack pointer in the exception entry path, eg: Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4 Oops: Bad kernel stack pointer, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7 NIP: c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000fffe7d40 TRAP: 4100 Not tainted (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80 GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020 ... NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80 LR [0000000010053e00] 0x10053e00 Although the NIP tells us where we were, and the TRAP number tells us what happened, it would still be nicer if we could report the actual exception rather than barfing about the stack pointer. We an do that fairly simply by loading the kernel stack pointer on entry and restoring the user value before returning. That way we see a regular oops such as: Unrecoverable exception 4100 at c00000000000239c Oops: Unrecoverable exception, sig: 6 [#1] LE SMP NR_CPUS=32 NUMA PowerNV Modules linked in: CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40 NIP: c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040 REGS: c0000000f1e17bb0 TRAP: 4100 Not tainted (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty) MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000 CFAR: c00000000000bac8 IRQMASK: 0 ... NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80 LR [0000000010053e00] 0x10053e00 Call Trace: [c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable) Note this shouldn't make the kernel stack pointer vulnerable to a meltdown attack, because it should be flushed from the cache before we return to userspace. The user r1 value will be in the cache, because we load it in the return path, but that is harmless. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-08-08	powerpc/powernv: Query firmware for count cache flush settings	Michael Ellerman
	Look for fw-features properties to determine the appropriate settings for the count cache flush, and then call the generic powerpc code to set it up based on the security feature flags. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>