linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-01-22	genirq, sched/isolation: Isolate from handling managed interrupts	Ming Lei
	The affinity of managed interrupts is completely handled in the kernel and cannot be changed via the /proc/irq/* interfaces from user space. As the kernel tries to spread out interrupts evenly accross CPUs on x86 to prevent vector exhaustion, it can happen that a managed interrupt whose affinity mask contains both isolated and housekeeping CPUs is routed to an isolated CPU. As a consequence IO submitted on a housekeeping CPU causes interrupts on the isolated CPU. Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding logic in the interrupt affinity selection code. The subparameter indicates to the interrupt affinity selection logic that it should try to avoid the above scenario. This isolation is best effort and only effective if the automatically assigned interrupt mask of a device queue contains isolated and housekeeping CPUs. If housekeeping CPUs are online then such interrupts are directed to the housekeeping CPU so that IO submitted on the housekeeping CPU cannot disturb the isolated CPU. If a queue's affinity mask contains only isolated CPUs then this parameter has no effect on the interrupt routing decision, though interrupts are only happening when tasks running on those isolated CPUs submit IO. IO submitted on housekeeping CPUs has no influence on those queues. If the affinity mask contains both housekeeping and isolated CPUs, but none of the contained housekeeping CPUs is online, then the interrupt is also routed to an isolated CPU. Interrupts are only delivered when one of the isolated CPUs in the affinity mask submits IO. If one of the contained housekeeping CPUs comes online, the CPU hotplug logic migrates the interrupt automatically back to the upcoming housekeeping CPU. Depending on the type of interrupt controller, this can require that at least one interrupt is delivered to the isolated CPU in order to complete the migration. [ tglx: Removed unused parameter, added and edited comments/documentation and rephrased the changelog so it contains more details. ] Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
2020-01-22	hrtimer: Add missing sparse annotation for __run_timer()	Jules Irenge
	Sparse reports a warning at __run_hrtimer() \|warning: context imbalance in __run_hrtimer() - unexpected unlock Add the missing must_hold() annotation. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200120224347.51843-1-jbi.octave@gmail.com
2020-01-22	arm64: acpi: fix DAIF manipulation with pNMI	Mark Rutland
	Since commit: d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface") ... the top-level APEI SEA handler has the shape: 1. current_flags = arch_local_save_flags() 2. local_daif_restore(DAIF_ERRCTX) 3. <GHES handler> 4. local_daif_restore(current_flags) However, since commit: 4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") ... when pseudo-NMIs (pNMIs) are in use, arch_local_save_flags() will save the PMR value rather than the DAIF flags. The combination of these two commits means that the APEI SEA handler will erroneously attempt to restore the PMR value into DAIF. Fix this by factoring local_daif_save_flags() out of local_daif_save(), so that we can consistently save DAIF in step #1, regardless of whether pNMIs are in use. Both commits were introduced concurrently in v5.0. Cc: <stable@vger.kernel.org> Fixes: 4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Fixes: d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22	devtmpfs: factor out common tail of devtmpfs_{create,delete}_node	Rasmus Villemoes
	There's some common boilerplate in devtmpfs_{create,delete}_node, put that in a little helper. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20200115184154.3492-6-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	devtmpfs: initify a bit	Rasmus Villemoes
	devtmpfs_mount() is only called from prepare_namespace() in init/do_mounts.c, which is an __init function, so devtmpfs_mount() can also be moved to .init.text. Then the mount_dev static variable is only referenced from __init functions (devtmpfs_mount and its initializer function mount_param). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20200115184154.3492-5-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	devtmpfs: simplify initialization of mount_dev	Rasmus Villemoes
	Avoid a bit of ifdeffery by using the IS_ENABLED() helper. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20200115184154.3492-4-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	devtmpfs: factor out setup part of devtmpfsd()	Rasmus Villemoes
	Factor out the setup part of devtmpfsd() to make it a bit easier to see that we always call setup_done() exactly once (provided of course the kthread is succesfully created). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20200115184154.3492-3-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	devtmpfs: fix theoretical stale pointer deref in devtmpfsd()	Rasmus Villemoes
	After complete(&setup_done), devtmpfs_init proceeds and may actually return, invalidating the err pointer, before devtmpfsd() proceeds to reading back err. This is of course completely theoretical since the error conditions never trigger in practice, and even if they did, nobody cares about the exit value from a kernel thread, so it doesn't matter if we happen to read back some garbage from some other stack frame. Still, this isn't a pattern that should be copy-pasted, so fix it. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20200115184154.3492-2-linux@rasmusvillemoes.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	driver core: platform: fix u32 greater or equal to zero comparison	Colin Ian King
	Currently the check that a u32 variable i is >= 0 is always true because the unsigned variable will never be negative, causing the loop to run forever. Fix this by changing the pre-decrement check to a zero check on i followed by a decrement of i. Addresses-Coverity: ("Unsigned compared against 0") Fixes: 39cc539f90d0 ("driver core: platform: Prevent resouce overflow from causing infinite loops") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200116175758.88396-1-colin.king@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	irqchip/gic-v4.1: Allow direct invalidation of VLPIs	Marc Zyngier
	Just like for INVALL, GICv4.1 has grown a VPE-aware INVLPI register. Let's plumb it in and make use of the DirectLPI code in that case. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-16-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Suppress per-VLPI doorbell	Marc Zyngier
	Since GICv4.1 gives us a per-VPE doorbell, avoid programming anything else on VMOVI/VMAPI/VMAPTI and on any other action that would have otherwise resulted in a per-VLPI doorbell to be programmed. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-15-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Add VPE INVALL callback	Marc Zyngier
	GICv4.1 redistributors have a VPE-aware INVALL register. Progress! We can now emulate a guest-requested INVALL without emiting a VINVALL command. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-14-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Add VPE eviction callback	Marc Zyngier
	When descheduling a VPE, special care must be taken to tell the GIC about whether we want to receive a doorbell or not. This is a major improvement on GICv4.0, where the doorbell had to be separately enabled/disabled. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-13-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Add VPE residency callback	Marc Zyngier
	Making a VPE resident on GICv4.1 is pretty simple, as it is just a single write to the local redistributor. We just need extra information about which groups to enable, which the KVM code will have to provide. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-12-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Add mask/unmask doorbell callbacks	Marc Zyngier
	masking/unmasking doorbells on GICv4.1 relies on a new INVDB command, which broadcasts the invalidation to all RDs. Implement the new command as well as the masking callbacks, and plug the whole thing into the v4.1 VPE irqchip. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-11-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Plumb skeletal VPE irqchip	Marc Zyngier
	Just like for GICv4.0, each VPE has its own doorbell interrupt, and thus an irqchip that manages them. Since the doorbell management is quite different on GICv4.1, let's introduce an almost empty irqchip the will get populated over the next new patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-10-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP	Marc Zyngier
	With GICv4.1, VMOVP is extended to allow a default doorbell to be specified, as well as a validity bit for this doorbell. As an added bonus, VMOVP isn't required anymore of moving a VPE between redistributors that share the same affinity. Let's add this support to the VMOVP builder, and make sure we don't issue the command if we don't really need to. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-9-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Don't use the VPE proxy if RVPEID is set	Marc Zyngier
	The infamous VPE proxy device isn't used with GICv4.1 because: - we can invalidate any LPI from the DirectLPI MMIO interface - the ITS and redistributors understand the life cycle of the doorbell, so we don't need to enable/disable it all the time So let's escape early from the proxy related functions. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-8-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP	Marc Zyngier
	The ITS VMAPP command gains some new fields with GICv4.1: - a default doorbell, which allows a single doorbell to be used for all the VLPIs routed to a given VPE - a pointer to the configuration table (instead of having it in a register that gets context switched) - a flag indicating whether this is the first map or the last unmap for this particular VPE - a flag indicating whether the pending table is known to be zeroed, or not Plumb in the new fields in the VMAPP builder, and add the map/unmap refcounting so that the ITS can do the right thing. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-7-maz@kernel.org
2020-01-22	irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation	Marc Zyngier
	GICv4.1 defines a new VPE table that is potentially shared between both the ITSs and the redistributors, following complicated affinity rules. To make things more confusing, the programming of this table at the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register for something completely different. The code flow is somewhat complexified by the need to respect the affinities required by the HW, meaning that tables can either be inherited from a previously discovered ITS or redistributor. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-6-maz@kernel.org
2020-01-22	irqchip/gic-v3: Add GICv4.1 VPEID size discovery	Marc Zyngier
	While GICv4.0 mandates 16 bit worth of VPEIDs, GICv4.1 allows smaller implementations to be built. Add the required glue to dynamically compute the limit. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-3-maz@kernel.org
2020-01-22	irqchip/gic-v3: Detect GICv4.1 supporting RVPEID	Marc Zyngier
	GICv4.1 supports the RVPEID ("Residency per vPE ID"), which allows for a much efficient way of making virtual CPUs resident (to allow direct injection of interrupts). The functionnality needs to be discovered on each and every redistributor in the system, and disabled if the settings are inconsistent. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20191224111055.11836-2-maz@kernel.org
2020-01-22	irqchip/gic-v3-its: Fix get_vlpi_map() breakage with doorbells	Marc Zyngier
	When updating an LPI configuration, get_vlpi_map() may be passed a irq_data structure relative to an ITS domain (the normal case) or one that is relative to the core GICv3 domain in the case of a GICv4 doorbell. In the latter case, special care must be take not to dereference the irq_chip data as an its_dev structure, as that isn't what is stored there. Instead, check first whether the IRQ is forwarded to a vcpu, and only then try to obtain the vlpi mapping. Fixes: c1d4d5cd203c ("irqchip/gic-v3-its: Add its_vlpi_map helpers") Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200122085609.658-1-yuzenghui@huawei.com
2020-01-22	binder: fix log spam for existing debugfs file creation.	Martin Fuzzey
	Since commit 43e23b6c0b01 ("debugfs: log errors when something goes wrong") debugfs logs attempts to create existing files. However binder attempts to create multiple debugfs files with the same name when a single PID has multiple contexts, this leads to log spamming during an Android boot (17 such messages during boot on my system). Fix this by checking if we already know the PID and only create the debugfs entry for the first context per PID. Do the same thing for binderfs for symmetry. Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group> Acked-by: Todd Kjos <tkjos@google.com> Fixes: 43e23b6c0b01 ("debugfs: log errors when something goes wrong") Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1578671054-5982-1-git-send-email-martin.fuzzey@flowbird.group Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	x86/tsc: Remove redundant assignment	Mateusz Nosek
	Previously, the assignment to the local variable 'now' took place before the for loop. The loop is unconditional so it will be entered at least once. The variable 'now' is reassigned in the loop and is not used before reassigning. Therefore, the assignment before the loop is unnecessary and can be removed. No code changed: # arch/x86/kernel/tsc_sync.o: text data bss dec hex filename 3569 198 44 3811 ee3 tsc_sync.o.before 3569 198 44 3811 ee3 tsc_sync.o.after md5: 36216de29b208edbcd34fed9fe7f7b69 tsc_sync.o.before.asm 36216de29b208edbcd34fed9fe7f7b69 tsc_sync.o.after.asm [ bp: Massage commit message. ] Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200118171143.25178-1-mateusznosek0@gmail.com
2020-01-22	tracing/uprobe: Fix to make trace_uprobe_filter alignment safe	Masami Hiramatsu
	Commit 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe") moved trace_uprobe_filter on trace_probe_event. However, since it introduced a flexible data structure with char array and type casting, the alignment of trace_uprobe_filter can be broken. This changes the type of the array to trace_uprobe_filter data strucure to fix it. Link: http://lore.kernel.org/r/20200120124022.GA14897@hirez.programming.kicks-ass.net Link: http://lkml.kernel.org/r/157966340499.5107.10978352478952144902.stgit@devnote2 Fixes: 99c9a923e97a ("tracing/uprobe: Fix double perf_event linking on multiprobe uprobe") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-22	s390: fix __EMIT_BUG() macro	Sven Schnelle
	Setting a kprobe on getname_flags() failed: $ echo 'p:tmr1 getname_flags +0(%r2):ustring' > kprobe_events -bash: echo: write error: Invalid argument Debugging the kprobes code showed that the address of getname_flags() is contained in the __bug_table. Kprobes doesn't allow to set probes at BUG() locations. $ objdump -j __bug_table -x build/fs/namei.o [..] 0000000000000108 R_390_PC32 .text+0x00000000000075a8 000000000000010c R_390_PC32 .L223+0x0000000000000004 I was expecting getname_flags() to start with a BUG(), but: 7598: e3 20 10 00 00 04 lg %r2,0(%r1) 759e: c0 f4 00 00 00 00 jg 759e <putname+0x7e> 75a0: R_390_PLT32DBL kmem_cache_free+0x2 75a4: a7 f4 00 01 j 75a6 <putname+0x86> 00000000000075a8 <getname_flags>: 75a8: c0 04 00 00 00 00 brcl 0,75a8 <getname_flags> 75ae: eb 6f f0 48 00 24 stmg %r6,%r15,72(%r15) 75b4: b9 04 00 ef lgr %r14,%r15 75b8: e3 f0 ff a8 ff 71 lay %r15,-88(%r15) So the BUG() is actually the last opcode of the previous function. Fix this by switching to using the MONITOR CALL (MC) instruction, and set the entry in __bug_table to the beginning of that MC. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/ftrace: generate traced function stack frame	Vasily Gorbik
	Currently backtrace from ftraced function does not contain ftraced function itself. e.g. for "path_openat": arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Ftraced function is expected in the backtrace by ftrace kselftests, which are now failing. It would also be nice to have it for clarity reasons. "ftrace_caller" itself is called without stack frame allocated for it and does not store its caller (ftraced function). Instead it simply allocates a stack frame for "ftrace_trace_function" and sets backchain to point to ftraced function stack frame (which contains ftraced function caller in saved r14). To fix this issue make "ftrace_caller" allocate a stack frame for itself just to store ftraced function for the stack unwinder. As a result backtrace looks like the following: arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code path_openat+0x6/0xd60 <-- ftraced function do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Reported-by: Sven Schnelle <sven.schnelle@ibm.com> Tested-by: Sven Schnelle <sven.schnelle@ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390: adjust -mpacked-stack support check for clang 10	Vasily Gorbik
	clang 10 introduces -mpacked-stack compiler option implementation. At the same time currently it does not support a combination of -mpacked-stack and -mbackchain. This leads to the following build error: clang: error: unsupported option '-mpacked-stack with -mbackchain' for target 's390x-ibm-linux' If/when clang adds support for a combination of -mpacked-stack and -mbackchain it would also require -msoft-float (like gcc does). According to Ulrich Weigand "stack slot assigned to the kernel backchain overlaps the stack slot assigned to the FPR varargs (both are required to be placed immediately after the saved r15 slot if present)." Extend -mpacked-stack compiler option support check to include all 3 options -mpacked-stack -mbackchain -msoft-float which must present to support -mpacked-stack with -mbackchain. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/jump_label: use "i" constraint for clang	Vasily Gorbik
	Currently kernel build fails under clang if jump labels are enabled. The problem is "X" constraint usage "Any operand whatsoever is allowed", for which clang produces the following: .pushsection __jump_table,"aw" .balign 8 .long 0b-.,.Ltmp577-. .quad %r0+0-. # %r0 is not allowed here .popsection Under gcc constraints "X" or "jdd" (gcc > 9) are used for static keys. Ideally, we'd have used "i" for gcc, but it doesn't work in all cases with -fPIC code. This is gcc-specific problem that doesn't exist in llvm. Since clang does not have "jdd" simply always use "i" constraint for it. Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/cpum_sf: Use DIV_ROUND_UP	Thomas Richter
	Use macro DIV_ROUND_UP() for calculation of number of SDBT SDBT pages required for index pages. This macro is already used throughout the file. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/cpum_sf: Use kzalloc and minor changes	Thomas Richter
	Use kzalloc() to allocate auxiliary buffer structure initialized with all zeroes to avoid random value in trace output. Avoid double access to SBD hardware flags. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/cpum_sf: Convert debug trace to common layout	Thomas Richter
	Convert debug traces to print the head/alert/empty marks consistently as decimal numbers. Add some trace statements to enable easier debugging during auxiliary tracing. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/pci: Fix possible deadlock in recover_store()	Niklas Schnelle
	With zpci_disable() working, lockdep detected a potential deadlock (lockdep output at the end). The deadlock is between recovering a PCI function via the /sys/bus/pci/devices/<dev>/recover attribute vs powering it off via /sys/bus/pci/slots/<slot>/power. The fix is analogous to the changes in commit 0ee223b2e1f6 ("scsi: core: Avoid that SCSI device removal through sysfs triggers a deadlock") that fixed a potential deadlock on removing a SCSI device via sysfs. [ 204.830107] ====================================================== [ 204.830109] WARNING: possible circular locking dependency detected [ 204.830111] 5.5.0-rc2-06072-gbc03ecc9a672 #6 Tainted: G W [ 204.830112] ------------------------------------------------------ [ 204.830113] bash/1034 is trying to acquire lock: [ 204.830115] 0000000192a1a610 (kn->count#200){++++}, at: kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830122] but task is already holding lock: [ 204.830123] 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48 [ 204.830128] which lock already depends on the new lock. [ 204.830129] the existing dependency chain (in reverse order) is: [ 204.830130] -> #1 (pci_rescan_remove_lock){+.+.}: [ 204.830134] validate_chain+0x93a/0xd08 [ 204.830136] __lock_acquire+0x4ae/0x9d0 [ 204.830137] lock_acquire+0x114/0x280 [ 204.830140] __mutex_lock+0xa2/0x960 [ 204.830142] mutex_lock_nested+0x32/0x40 [ 204.830145] recover_store+0x4c/0xa8 [ 204.830147] kernfs_fop_write+0xe6/0x218 [ 204.830151] vfs_write+0xb0/0x1b8 [ 204.830152] ksys_write+0x6c/0xf8 [ 204.830154] system_call+0xd8/0x2d8 [ 204.830155] -> #0 (kn->count#200){++++}: [ 204.830187] check_noncircular+0x1e6/0x240 [ 204.830189] check_prev_add+0xfc/0xdb0 [ 204.830190] validate_chain+0x93a/0xd08 [ 204.830192] __lock_acquire+0x4ae/0x9d0 [ 204.830193] lock_acquire+0x114/0x280 [ 204.830194] __kernfs_remove.part.0+0x2e4/0x360 [ 204.830196] kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830198] remove_files.isra.0+0x4c/0x98 [ 204.830199] sysfs_remove_group+0x66/0xc8 [ 204.830201] sysfs_remove_groups+0x46/0x68 [ 204.830204] device_remove_attrs+0x52/0x90 [ 204.830207] device_del+0x182/0x418 [ 204.830208] pci_remove_bus_device+0x8a/0x130 [ 204.830210] pci_stop_and_remove_bus_device_locked+0x3a/0x48 [ 204.830212] disable_slot+0x68/0x100 [ 204.830213] power_write_file+0x7c/0x130 [ 204.830215] kernfs_fop_write+0xe6/0x218 [ 204.830217] vfs_write+0xb0/0x1b8 [ 204.830218] ksys_write+0x6c/0xf8 [ 204.830220] system_call+0xd8/0x2d8 [ 204.830221] other info that might help us debug this: [ 204.830223] Possible unsafe locking scenario: [ 204.830224] CPU0 CPU1 [ 204.830225] ---- ---- [ 204.830226] lock(pci_rescan_remove_lock); [ 204.830227] lock(kn->count#200); [ 204.830229] lock(pci_rescan_remove_lock); [ 204.830231] lock(kn->count#200); [ 204.830233] * DEADLOCK * [ 204.830234] 4 locks held by bash/1034: [ 204.830235] #0: 00000001b6fbc498 (sb_writers#4){.+.+}, at: vfs_write+0x158/0x1b8 [ 204.830239] #1: 000000018c9f5090 (&of->mutex){+.+.}, at: kernfs_fop_write+0xaa/0x218 [ 204.830242] #2: 00000001f7da0810 (kn->count#235){.+.+}, at: kernfs_fop_write+0xb6/0x218 [ 204.830245] #3: 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48 [ 204.830248] stack backtrace: [ 204.830250] CPU: 2 PID: 1034 Comm: bash Tainted: G W 5.5.0-rc2-06072-gbc03ecc9a672 #6 [ 204.830252] Hardware name: IBM 8561 T01 703 (LPAR) [ 204.830253] Call Trace: [ 204.830257] [<00000000c05e10c0>] show_stack+0x88/0xf0 [ 204.830260] [<00000000c112dca4>] dump_stack+0xa4/0xe0 [ 204.830261] [<00000000c0694c06>] check_noncircular+0x1e6/0x240 [ 204.830263] [<00000000c0695bec>] check_prev_add+0xfc/0xdb0 [ 204.830264] [<00000000c06971da>] validate_chain+0x93a/0xd08 [ 204.830266] [<00000000c06994c6>] __lock_acquire+0x4ae/0x9d0 [ 204.830267] [<00000000c069867c>] lock_acquire+0x114/0x280 [ 204.830269] [<00000000c09ca15c>] __kernfs_remove.part.0+0x2e4/0x360 [ 204.830270] [<00000000c09cb5c4>] kernfs_remove_by_name_ns+0x5c/0xa8 [ 204.830272] [<00000000c09cee14>] remove_files.isra.0+0x4c/0x98 [ 204.830274] [<00000000c09cf2ae>] sysfs_remove_group+0x66/0xc8 [ 204.830276] [<00000000c09cf356>] sysfs_remove_groups+0x46/0x68 [ 204.830278] [<00000000c0e3dfe2>] device_remove_attrs+0x52/0x90 [ 204.830280] [<00000000c0e40382>] device_del+0x182/0x418 [ 204.830281] [<00000000c0dcfd7a>] pci_remove_bus_device+0x8a/0x130 [ 204.830283] [<00000000c0dcfe92>] pci_stop_and_remove_bus_device_locked+0x3a/0x48 [ 204.830285] [<00000000c0de7190>] disable_slot+0x68/0x100 [ 204.830286] [<00000000c0de6514>] power_write_file+0x7c/0x130 [ 204.830288] [<00000000c09cc846>] kernfs_fop_write+0xe6/0x218 [ 204.830290] [<00000000c08f3480>] vfs_write+0xb0/0x1b8 [ 204.830291] [<00000000c08f378c>] ksys_write+0x6c/0xf8 [ 204.830293] [<00000000c1154374>] system_call+0xd8/0x2d8 [ 204.830294] INFO: lockdep is turned off. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	s390/pci: Recover handle in clp_set_pci_fn()	Niklas Schnelle
	When we try to recover a PCI function using echo 1 > /sys/bus/pci/devices/<id>/recover or manually with echo 1 > /sys/bus/pci/devices/<id>/remove echo 0 > /sys/bus/pci/slots/<slot>/power echo 1 > /sys/bus/pci/slots/<slot>/power clp_disable_fn() / clp_enable_fn() call clp_set_pci_fn() to first disable and then reenable the function. When the function is already in the requested state we may be left with an invalid function handle. To get a new valid handle we do a clp_list_pci() call. For this we need both the function ID and function handle in clp_set_pci_fn() so pass the zdev and get both. To simplify things also pull setting the refreshed function handle into clp_set_pci_fn() Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-22	Merge branch 'for-next/rng' into for-next/core	Will Deacon
	* for-next/rng: (2 commits) arm64: Use v8.5-RNG entropy for KASLR seed ...
2020-01-22	Merge branch 'for-next/errata' into for-next/core	Will Deacon
	* for-next/errata: (3 commits) arm64: Workaround for Cortex-A55 erratum 1530923 ...
2020-01-22	Merge branch 'for-next/asm-annotations' into for-next/core	Will Deacon
	* for-next/asm-annotations: (6 commits) arm64: kernel: Correct annotation of end of el0_sync ...
2020-01-22	Merge branches 'for-next/acpi', 'for-next/cpufeatures', 'for-next/csum', ↵	Will Deacon
	'for-next/e0pd', 'for-next/entry', 'for-next/kbuild', 'for-next/kexec/cleanup', 'for-next/kexec/file-kdump', 'for-next/misc', 'for-next/nofpsimd', 'for-next/perf' and 'for-next/scs' into for-next/core * for-next/acpi: ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map() * for-next/cpufeatures: (2 commits) arm64: Introduce ID_ISAR6 CPU register ... * for-next/csum: (2 commits) arm64: csum: Fix pathological zero-length calls ... * for-next/e0pd: (7 commits) arm64: kconfig: Fix alignment of E0PD help text ... * for-next/entry: (5 commits) arm64: entry: cleanup sp_el0 manipulation ... * for-next/kbuild: (4 commits) arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' ... * for-next/kexec/cleanup: (11 commits) Revert "arm64: kexec: make dtb_mem always enabled" ... * for-next/kexec/file-kdump: (2 commits) arm64: kexec_file: add crash dump support ... * for-next/misc: (12 commits) arm64: entry: Avoid empty alternatives entries ... * for-next/nofpsimd: (7 commits) arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly ... * for-next/perf: (2 commits) perf/imx_ddr: Fix cpu hotplug state cleanup ... * for-next/scs: (6 commits) arm64: kernel: avoid x18 in __cpu_soft_restart ...
2020-01-22	arm64: kconfig: Fix alignment of E0PD help text	Will Deacon
	Remove the additional space. Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22	bpf: Fix error path under memory pressure	Alexei Starovoitov
	Restore the 'if (env->cur_state)' check that was incorrectly removed during code move. Under memory pressure env->cur_state can be freed and zeroed inside do_check(). Hence the check is necessary. Fixes: 51c39bb1d5d1 ("bpf: Introduce function-by-function verification") Reported-by: syzbot+b296579ba5015704d9fa@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200122024138.3385590-1-ast@kernel.org
2020-01-22	bpf: Fix trampoline usage in preempt	Alexei Starovoitov
	Though the second half of trampoline page is unused a task could be preempted in the middle of the first half of trampoline and two updates to trampoline would change the code from underneath the preempted task. Hence wait for tasks to voluntarily schedule or go to userspace. Add similar wait before freeing the trampoline. Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/bpf/20200121032231.3292185-1-ast@kernel.org
2020-01-22	arm64: Use v8.5-RNG entropy for KASLR seed	Mark Brown
	When seeding KALSR on a system where we have architecture level random number generation make use of that entropy, mixing it in with the seed passed by the bootloader. Since this is run very early in init before feature detection is complete we open code rather than use archrandom.h. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22	arm64: Implement archrandom.h for ARMv8.5-RNG	Richard Henderson
	Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system registers are always available at EL0. Implement arch_get_random_seed_long using RNDR. Given that the TRNG is likely to be a shared resource between cores, and VMs, do not explicitly force re-seeding with RNDRRS. In order to avoid code complexity and potential issues with hetrogenous systems only provide values after cpufeature has finalized the system capabilities. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [Modified to only function after cpufeature has finalized the system capabilities and move all the code into the header -- broonie] Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> [will: Advertise HWCAP via /proc/cpuinfo] Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22	powerpc/xive: Discard ESB load value when interrupt is invalid	Frederic Barrat
	A load on an ESB page returning all 1's means that the underlying device has invalidated the access to the PQ state of the interrupt through mmio. It may happen, for example when querying a PHB interrupt while the PHB is in an error state. In that case, we should consider the interrupt to be invalid when checking its state in the irq_get_irqchip_state() handler. Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> [clg: wrote a commit log, introduced XIVE_ESB_INVALID ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
2020-01-22	powerpc: Ultravisor: Fix the dependencies for CONFIG_PPC_UV	Bharata B Rao
	Let PPC_UV depend only on DEVICE_PRIVATE which in turn will satisfy all the other required dependencies Fixes: 013a53f2d25a ("powerpc: Ultravisor: Add PPC_UV config option") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200109092047.24043-1-bharata@linux.ibm.com
2020-01-22	tty: baudrate: SPARC supports few more baud rates	Andy Shevchenko
	According to termbits.h SPARC supports few more baud rates than currently defined in tty_baudrate.c. Append supported ones to baud_table[] and baud_bits[]. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200115224124.74684-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	tty: baudrate: Synchronise baud_table[] and baud_bits[]	Andy Shevchenko
	Synchronize baud rate tables baud_table and baud_bits with each other for better readability. This makes clear what is being used for SPARC. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200115224124.74684-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	tty: serial: meson_uart: Add support for kernel debugger	Julien Masson
	The kgdb invokes the poll_put_char and poll_get_char when communicating with the host. This patch implement the serial polling hooks for the meson_uart to be used for KGDB debugging over serial line. Signed-off-by: Julien Masson <jmasson@baylibre.com> Link: https://lore.kernel.org/r/867e1klo48.fsf@julienm-fedora-R90NQGV9.i-did-not-set--mail-host-address--so-tickle-me Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-22	serial: imx: fix a race condition in receive path	Uwe Kleine-König
	The main irq handler function starts by first masking disabled interrupts in the status register values to ensure to only handle enabled interrupts. This is important as when the RX path in the hardware is disabled reading the RX fifo results in an external abort. This checking must be done under the port lock, otherwise the following can happen: CPU1 \| CPU2 \| irq triggers as there are chars \| in the RX fifo \| \| grab port lock imx_uart_int finds RRDY enabled \| and calls imx_uart_rxint which \| has to wait for port lock \| \| disable RX (e.g. because we're \| using RS485 with !RX_DURING_TX) \| \| release port lock read from RX fifo with RX \| disabled => exception \| So take the port lock only once in imx_uart_int() instead of in the functions called from there. Reported-by: Andre Renaud <arenaud@designa-electronics.com> Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20200121071702.20150-1-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>