2022-04-05Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard
Let's start the 5.18 fixes cycle. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-04-05objtool: Fix SLS validation for kcov tail-call replacementPeter Zijlstra
Since not all compilers have a function attribute to disable KCOV instrumentation, objtool can rewrite KCOV instrumentation in noinstr functions as per commit: f56dae88a81f ("objtool: Handle __sanitize_cov*() tail calls") However, this has a subtle interaction with the SLS validation from commit: 1cc1e4c8aab4 ("objtool: Add straight-line-speculation validation") In that when a tail-call instruction is replaced with a RET, an additional INT3 instruction is also written, but is not represented in the decoded instruction stream. This then leads to false positive missing INT3 objtool warnings in noinstr code. Instead of adding additional struct instruction objects, mark the RET instruction with retpoline_safe to suppress the warning (since we know there really is an INT3). Fixes: 1cc1e4c8aab4 ("objtool: Add straight-line-speculation validation") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220323230712.GA8939@worktop.programming.kicks-ass.net
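A hedged sketch of the fix in objtool's kcov rewrite path (field names follow tools/objtool; surrounding code elided):

    /* rewrite the __sanitize_cov*() tail call into a bare return */
    insn->type = INSN_RETURN;
    /*
     * An INT3 really is emitted right after this RET, it just isn't
     * present in the decoded instruction stream; flag the RET so the
     * SLS validation doesn't warn about a missing trailing INT3.
     */
    insn->retpoline_safe = true;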
2022-04-05objtool: Fix IBT tail-call detectionPeter Zijlstra
Objtool reports: arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx() falls through to next function poly1305_blocks_x86_64() arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_emit_avx() falls through to next function poly1305_emit_x86_64() arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx2() falls through to next function poly1305_blocks_x86_64() Which reads like: 0000000000000040 <poly1305_blocks_x86_64>: 40: f3 0f 1e fa endbr64 ... 0000000000000400 <poly1305_blocks_avx>: 400: f3 0f 1e fa endbr64 404: 44 8b 47 14 mov 0x14(%rdi),%r8d 408: 48 81 fa 80 00 00 00 cmp $0x80,%rdx 40f: 73 09 jae 41a <poly1305_blocks_avx+0x1a> 411: 45 85 c0 test %r8d,%r8d 414: 0f 84 2a fc ff ff je 44 <poly1305_blocks_x86_64+0x4> ... These are simple conditional tail-calls and *should* be recognised as such by objtool, however due to a mistake in commit 08f87a93c8ec ("objtool: Validate IBT assumptions") this is failing. Specifically, the jump_dest is +4, this means the instruction pointed at will not be ENDBR and as such it will fail the second clause of is_first_func_insn() that was supposed to capture this exact case. Instead, have is_first_func_insn() look at the previous instruction. Fixes: 08f87a93c8ec ("objtool: Validate IBT assumptions") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220322115125.811582125@infradead.org
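A sketch of the corrected helper; prev_insn_same_sym() and INSN_ENDBR follow objtool's naming, but treat the details as approximate:

    static bool is_first_func_insn(struct objtool_file *file,
                                   struct instruction *insn)
    {
        /* the literal first instruction of the function */
        if (insn->offset == insn->func->offset)
            return true;

        /*
         * With IBT, a conditional tail-call may target the byte right
         * past the ENDBR (jump_dest is +4), so insn itself is never
         * an ENDBR; look at the previous instruction instead.
         */
        if (ibt) {
            struct instruction *prev = prev_insn_same_sym(file, insn);

            if (prev && prev->type == INSN_ENDBR &&
                insn->offset == insn->func->offset + prev->len)
                return true;
        }

        return false;
    }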
2022-04-05x86/bug: Prevent shadowing in __WARN_FLAGSVincent Mailhol
The macro __WARN_FLAGS() uses a local variable named "f". This being a common name, there is a risk of shadowing other variables. For example, GCC would yield: | In file included from ./include/linux/bug.h:5, | from ./include/linux/cpumask.h:14, | from ./arch/x86/include/asm/cpumask.h:5, | from ./arch/x86/include/asm/msr.h:11, | from ./arch/x86/include/asm/processor.h:22, | from ./arch/x86/include/asm/timex.h:5, | from ./include/linux/timex.h:65, | from ./include/linux/time32.h:13, | from ./include/linux/time.h:60, | from ./include/linux/stat.h:19, | from ./include/linux/module.h:13, | from virt/lib/irqbypass.mod.c:1: | ./include/linux/rcupdate.h: In function 'rcu_head_after_call_rcu': | ./arch/x86/include/asm/bug.h:80:21: warning: declaration of 'f' shadows a parameter [-Wshadow] | 80 | __auto_type f = BUGFLAG_WARNING|(flags); \ | | ^ | ./include/asm-generic/bug.h:106:17: note: in expansion of macro '__WARN_FLAGS' | 106 | __WARN_FLAGS(BUGFLAG_ONCE | \ | | ^~~~~~~~~~~~ | ./include/linux/rcupdate.h:1007:9: note: in expansion of macro 'WARN_ON_ONCE' | 1007 | WARN_ON_ONCE(func != (rcu_callback_t)~0L); | | ^~~~~~~~~~~~ | In file included from ./include/linux/rbtree.h:24, | from ./include/linux/mm_types.h:11, | from ./include/linux/buildid.h:5, | from ./include/linux/module.h:14, | from virt/lib/irqbypass.mod.c:1: | ./include/linux/rcupdate.h:1001:62: note: shadowed declaration is here | 1001 | rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f) | | ~~~~~~~~~~~~~~~^ For reference, sparse also warns about it, c.f. [1]. This patch renames the variable from f to __flags (with two underscore prefixes as suggested in the Linux kernel coding style [2]) in order to prevent collisions. [1] https://lore.kernel.org/all/CAFGhKbyifH1a+nAMCvWM88TK6fpNPdzFtUXPmRGnnQeePV+1sw@mail.gmail.com/ [2] Linux kernel coding style, section 12) Macros, Enums and RTL, paragraph 5) namespace collisions when defining local variables in macros resembling functions https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl Fixes: bfb1a7c91fb7 ("x86/bug: Merge annotate_reachable() into_BUG_FLAGS() asm") Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220324023742.106546-1-mailhol.vincent@wanadoo.fr
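The resulting macro, sketched from the warning output above (the body besides the renamed variable is abbreviated from arch/x86/include/asm/bug.h):

    #define __WARN_FLAGS(flags)                                   \
    do {                                                          \
        /* double underscore avoids shadowing callers' locals */  \
        __auto_type __flags = BUGFLAG_WARNING|(flags);            \
        instrumentation_begin();                                  \
        _BUG_FLAGS(ASM_UD2, __flags, ASM_REACHABLE);              \
        instrumentation_end();                                    \
    } while (0)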
2022-04-05perf/core: Always set cpuctx cgrp when enable cgroup eventChengming Zhou
When enabling a cgroup event, the cpuctx->cgrp setting is conditional on the current task's cgrp matching the event's cgroup, so it has to be done for every new event. This brings complexity but no advantage. To keep it simple, this patch always sets cpuctx->cgrp when enabling the first cgroup event, and resets it to NULL when disabling the last cgroup event. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220329154523.86438-5-zhouchengming@bytedance.com
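A sketch of the simplified rule (helper boundaries and field names approximate kernel/events/core.c):

    /* perf_cgroup_event_enable(): only the first cgroup event
     * in the context sets cpuctx->cgrp */
    if (ctx->nr_cgroups++)
        return;
    cpuctx->cgrp = perf_cgroup_from_task(current, ctx);

    /* perf_cgroup_event_disable(): only the last cgroup event
     * going away clears it */
    if (--ctx->nr_cgroups)
        return;
    cpuctx->cgrp = NULL;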
2022-04-05perf/core: Fix perf_cgroup_switch()Chengming Zhou
There is a race problem that can trigger WARN_ON_ONCE(cpuctx->cgrp) in perf_cgroup_switch(). CPU1 CPU2 perf_cgroup_sched_out(prev, next) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) perf_cgroup_switch(prev, PERF_CGROUP_SWOUT) cgroup_migrate_execute() task->cgroups = ? perf_cgroup_attach() task_function_call(task, __perf_cgroup_move) perf_cgroup_sched_in(prev, next) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) perf_cgroup_switch(next, PERF_CGROUP_SWIN) __perf_cgroup_move() perf_cgroup_switch(task, PERF_CGROUP_SWOUT | PERF_CGROUP_SWIN) Commit a8d757ef076f ("perf events: Fix slow and broken cgroup context switch code") wanted to skip perf_cgroup_switch() when the perf_cgroup of "prev" and "next" are the same. But task->cgroups can change concurrently with context_switch(), in cgroup_migrate_execute(). If cgrp1 == cgrp2 in sched_out(), cpuctx won't do sched_out. A concurrent change of task->cgroups then causes cgrp1 != cgrp2 in sched_in(), so cpuctx will do sched_in and trigger WARN_ON_ONCE(cpuctx->cgrp). Even though __perf_cgroup_move() will be synchronized because the context switch disables interrupts, context_switch() can still see task->cgroups changing in the middle, since task->cgroups changes before the IPI is sent. So we have to combine perf_cgroup_sched_in() into perf_cgroup_sched_out(), unified into perf_cgroup_switch(), to fix the inconsistency between perf_cgroup_sched_out() and perf_cgroup_sched_in(). But we can't just compare prev->cgroups with next->cgroups to decide whether to skip cpuctx sched_out/in, since prev->cgroups is changing too. For example: CPU1 CPU2 cgroup_migrate_execute() prev->cgroups = ? perf_cgroup_attach() task_function_call(task, __perf_cgroup_move) perf_cgroup_switch(task) cgrp1 = perf_cgroup_from_task(prev) cgrp2 = perf_cgroup_from_task(next) if (cgrp1 != cgrp2) cpuctx sched_out/in ... task_function_call() will return -ESRCH In the above example, the change of prev->cgroups causes (cgrp1 == cgrp2) to be true, so the cpuctx sched_out/in is skipped. And the later task_function_call() returns -ESRCH since the prev task isn't running on the CPU anymore. So we would leave the perf_events of the old prev->cgroups still scheduled on the CPU, which is wrong. The solution is to compare cpuctx->cgrp with the next task's perf_cgroup. Since cpuctx->cgrp can only be changed on the local CPU, and we have irqs disabled, we can read cpuctx->cgrp for the comparison without holding the ctx lock. Fixes: a8d757ef076f ("perf events: Fix slow and broken cgroup context switch code") Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220329154523.86438-4-zhouchengming@bytedance.com
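A sketch of the comparison this lands on; the helper below is hypothetical, but the invariant it relies on is the one stated above:

    /* cpuctx->cgrp only changes on the local CPU and IRQs are off
     * here, so it can be read without holding the ctx lock */
    static bool perf_cgroup_needs_switch(struct perf_cpu_context *cpuctx,
                                         struct task_struct *next)
    {
        struct perf_cgroup *cgrp = perf_cgroup_from_task(next, NULL);

        /* compare against cpuctx->cgrp, not prev's cgroup, which
         * may be mutating concurrently */
        return READ_ONCE(cpuctx->cgrp) != cgrp;
    }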
2022-04-05perf/core: Use perf_cgroup_info->active to check if cgroup is activeChengming Zhou
Since we use perf_cgroup_set_timestamp() to start cgroup time and set active to 1, and use update_cgrp_time_from_cpuctx() to stop cgroup time and set active to 0, we can use info->active directly to check whether the cgroup is active. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220329154523.86438-3-zhouchengming@bytedance.com
2022-04-05perf/core: Don't pass task around when ctx sched inChengming Zhou
The current code passes the task around for ctx_sched_in(), only to get the perf_cgroup of the task, then update the timestamps of it and its ancestors and set them to active. But we can use cpuctx->cgrp to get the active perf_cgroup and its ancestors, since cpuctx->cgrp has been set before ctx_sched_in(). This patch removes the task argument from ctx_sched_in() and cleans up the related code. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220329154523.86438-2-zhouchengming@bytedance.com
2022-04-05perf/x86/intel: Update the FRONTEND MSR mask on Sapphire RapidsKan Liang
On Sapphire Rapids, the FRONTEND_RETIRED.MS_FLOWS event requires the FRONTEND MSR value 0x8. However, the current FRONTEND MSR mask doesn't support it. Update intel_spr_extra_regs[] to support it. Fixes: 61b985e3e775 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1648482543-14923-2-git-send-email-kan.liang@linux.intel.com
2022-04-05perf/x86/intel: Don't extend the pseudo-encoding to GP countersKan Liang
The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR. perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0 Performance counter stats for 'CPU(s) 0': 607,246 cpu/event=0xc0,umask=0x0/ 0 cpu/event=0x0,umask=0x1/ The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which doesn't work on the generic counters. However, perf currently extends its mask to the generic counters. The pseudo event-code for a fixed counter must be 0x00. Check for this and avoid extending the mask for fixed counter events which use the pseudo-encoding, e.g., the ref-cycles and PREC_DIST events. With the patch, perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0 Performance counter stats for 'CPU(s) 0': 583,184 cpu/event=0xc0,umask=0x0/ 583,048 cpu/event=0x0,umask=0x1/ Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1648482543-14923-1-git-send-email-kan.liang@linux.intel.com
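A hedged illustration of the check; ARCH_PERFMON_EVENTSEL_EVENT masks the event-code byte, but the helper itself is illustrative rather than the exact function touched by the patch:

    /* a fixed-counter pseudo-encoding always carries event-code 0x00;
     * such a config must not have its mask extended to GP counters */
    static bool uses_fixed_pseudo_encoding(u64 config)
    {
        return !(config & ARCH_PERFMON_EVENTSEL_EVENT);
    }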
2022-04-05perf/core: Inherit event_capsNamhyung Kim
It was reported that some perf event setups can make fork() fail on ARM64. It was the case of a group of mixed hw and sw events, and it failed in perf_event_init_task() due to armpmu_event_init(). The ARM PMU code checks if all the events in a group belong to the same PMU, except for software events. But it didn't set the event_caps of inherited events, so it no longer identified them as software events. Therefore the test failed in a child process. A simple reproducer is: $ perf stat -e '{cycles,cs,instructions}' perf bench sched messaging # Running 'sched/messaging' benchmark: perf: fork(): Invalid argument The perf stat was fine but the perf bench failed in fork(). Let's inherit the event caps from the parent. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20220328200112.457740-1-namhyung@kernel.org
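The essence of the fix is a one-line copy when the child event is created; a sketch, with the surrounding inherit_event() logic elided:

    /* carry over capabilities such as PERF_EV_CAP_SOFTWARE so the
     * ARM PMU group validation still sees sw events as sw events */
    child_event->event_caps = parent_event->event_caps;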
2022-04-05perf/x86/uncore: Add Raptor Lake uncore supportKan Liang
The uncore PMU of Raptor Lake is the same as that of Alder Lake. Add new PCI IDs of IMC for Raptor Lake. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/1647366360-82824-4-git-send-email-kan.liang@linux.intel.com
2022-04-05perf/x86/msr: Add Raptor Lake CPU supportKan Liang
Raptor Lake is Intel's successor to Alder Lake. The PPERF and SMI_COUNT MSRs are also supported. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/1647366360-82824-3-git-send-email-kan.liang@linux.intel.com
2022-04-05perf/x86/cstate: Add Raptor Lake supportKan Liang
Raptor Lake is Intel's successor to Alder Lake. From the perspective of the Intel cstate residency counters, nothing has changed compared with Alder Lake. Share adl_cstates with Alder Lake. Update the comments for Raptor Lake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/1647366360-82824-2-git-send-email-kan.liang@linux.intel.com
2022-04-05perf/x86: Add Intel Raptor Lake supportKan Liang
From the PMU's perspective, Raptor Lake is the same as Alder Lake. The only difference is the event list, which will be supported in the perf tool later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/1647366360-82824-1-git-send-email-kan.liang@linux.intel.com
2022-04-05Revert "mm/page_alloc: mark pagesets as __maybe_unused"Sebastian Andrzej Siewior
The local_lock() is now using a proper static inline function which is enough for llvm to accept that the variable is used. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220328145810.86783-4-bigeasy@linutronix.de
2022-04-05Revert "locking/local_lock: Make the empty local_lock_*() function a macro."Sebastian Andrzej Siewior
With volatile removed from arch_raw_cpu_ptr() the compiler no longer creates the per-CPU reference. The usage of the macro can be reverted now. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220328145810.86783-3-bigeasy@linutronix.de
2022-04-05x86/percpu: Remove volatile from arch_raw_cpu_ptr().Sebastian Andrzej Siewior
The volatile attribute in the inline assembly of arch_raw_cpu_ptr() forces the compiler to always generate the code, even if the compiler can decide upfront that its result is not needed. For instance invoking __intel_pmu_disable_all(false) (like intel_pmu_snapshot_arch_branch_stack() does) leads to loading the address of &cpu_hw_events into the register while the compiler knows that it has no need for it. This ends up with code like: | movq $cpu_hw_events, %rax #, tcp_ptr__ | add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ | xorl %eax, %eax # tmp93 It also creates additional code within local_lock() with !RT && !LOCKDEP which is not desired. By removing the volatile attribute the compiler can place the function freely and avoid it if it is not needed in the end. By using the function twice, the compiler properly caches only the variable offset and always loads the CPU-offset. this_cpu_ptr() also remains properly placed within a preempt_disable() section because - arch_raw_cpu_ptr() assembly has a memory input ("m" (this_cpu_off)) - preempt_{dis,en}able() fundamentally has a 'barrier()' in it Therefore this_cpu_ptr() is already properly serialized and does not rely on the 'volatile' attribute. Remove volatile from arch_raw_cpu_ptr(). [ bigeasy: Added Linus' explanation why this_cpu_ptr() is not moved out of a preempt_disable() section without the 'volatile' attribute. ] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220328145810.86783-2-bigeasy@linutronix.de
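For reference, a sketch of the macro with volatile dropped (names follow arch/x86/include/asm/percpu.h; treat the details as approximate):

    #define arch_raw_cpu_ptr(ptr)                                 \
    ({                                                            \
        unsigned long tcp_ptr__;                                  \
        asm ("add " __percpu_arg(1) ", %0"      /* no volatile */ \
             : "=r" (tcp_ptr__)                                   \
             : "m" (this_cpu_off), "0" (ptr));                    \
        (typeof(*(ptr)) __kernel __force *)tcp_ptr__;             \
    })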
2022-04-05static_call: Remove __DEFINE_STATIC_CALL macroChristophe Leroy
Only DEFINE_STATIC_CALL uses the __DEFINE_STATIC_CALL macro now, when CONFIG_HAVE_STATIC_CALL is selected. Only keep __DEFINE_STATIC_CALL() for the generic fallback, and also use it to implement DEFINE_STATIC_CALL_NULL() in that case. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/329074f92d96e3220ebe15da7bbe2779beee31eb.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Properly initialise DEFINE_STATIC_CALL_RET0()Christophe Leroy
When a static call is updated with __static_call_return0() as target, arch_static_call_transform() sets it to use an optimised set of instructions which are meant to lie in the same cacheline. But when initialising a static call with DEFINE_STATIC_CALL_RET0(), we get a branch to the real __static_call_return0() function instead of getting the optimised setup: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 4b ff ff f4 b c00d8114 <__static_call_return0> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Add ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(), defined by each architecture, to set up the optimised configuration, and rework DEFINE_STATIC_CALL_RET0() to call it: c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 48 00 00 14 b c00d8134 <__SCT__perf_snapshot_branch_stack+0x14> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/1e0a61a88f52a460f62a58ffc2a5f847d1f7d9d8.1647253456.git.christophe.leroy@csgroup.eu
2022-04-05static_call: Don't make __static_call_return0 staticChristophe Leroy
System.map shows that vmlinux contains several instances of __static_call_return0(): c0004fc0 t __static_call_return0 c0011518 t __static_call_return0 c00d8160 t __static_call_return0 arch_static_call_transform() uses the middle one to check whether we are setting a call to __static_call_return0 or not: c0011520 <arch_static_call_transform>: c0011520: 3d 20 c0 01 lis r9,-16383 <== r9 = 0xc001 << 16 c0011524: 39 29 15 18 addi r9,r9,5400 <== r9 += 0x1518 c0011528: 7c 05 48 00 cmpw r5,r9 <== r9 has value 0xc0011518 here So if static_call_update() is called with one of the other instances of __static_call_return0(), arch_static_call_transform() won't recognise it. In order to work properly, a single global instance of __static_call_return0() is required. Fixes: 3f2a8fc4b15d ("static_call/x86: Add __static_call_return0()") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu
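A sketch of the resulting arrangement, one global definition instead of per-object static copies (the exact file placement is an assumption):

    /* kernel/static_call.c: the single shared instance */
    long __static_call_return0(void)
    {
        return 0;
    }

    /* include/linux/static_call.h */
    extern long __static_call_return0(void);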
2022-04-05x86,static_call: Fix __static_call_return0 for i386Peter Zijlstra
Paolo reported that the instruction sequence that is used to replace: call __static_call_return0 namely: 66 66 48 31 c0 data16 data16 xor %rax,%rax decodes to something else on i386, namely: 66 66 48 data16 dec %ax 31 c0 xor %eax,%eax Which is a nonsensical sequence that happens to have the same outcome. *However* an important distinction is that it consists of two instructions, which is a problem when the thing needs to be overwritten with a regular call instruction again. As such, replace the instruction with something that decodes the same on both i386 and x86_64. Fixes: 3f2a8fc4b15d ("static_call/x86: Add __static_call_return0()") Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220318204419.GT8939@worktop.programming.kicks-ass.net
2022-04-05entry: Fix compile error in dynamic_irqentry_exit_cond_resched()Sven Schnelle
kernel/entry/common.c: In function ‘dynamic_irqentry_exit_cond_resched’: kernel/entry/common.c:409:14: error: implicit declaration of function ‘static_key_unlikely’; did you mean ‘static_key_enable’? [-Werror=implicit-function-declaration] 409 | if (!static_key_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) | ^~~~~~~~~~~~~~~~~~~ | static_key_enable static_key_unlikely() should be static_branch_unlikely(). Fixes: 99cf983cc8bca ("sched/preempt: Add PREEMPT_DYNAMIC using static keys") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220330084328.1805665-1-svens@linux.ibm.com
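The corrected call site, as implied by the compiler's suggestion (the raw_ call is sketched from the PREEMPT_DYNAMIC pattern):

    void dynamic_irqentry_exit_cond_resched(void)
    {
        /* static_branch_unlikely(), not static_key_unlikely() */
        if (!static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
            return;
        raw_irqentry_exit_cond_resched();
    }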
2022-04-05sched: Teach the forced-newidle balancer about CPU affinity limitation.Sebastian Andrzej Siewior
try_steal_cookie() looks at task_struct::cpus_mask to decide if the task could be moved to `this' CPU. It ignores that the task might be in a migration-disabled section while not on the CPU. In this case the task must not be moved, otherwise per-CPU assumptions are broken. Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if a task can be moved. Fixes: d2dfa17bc7de6 ("sched: Trivial forced-newidle balancer") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/YjNK9El+3fzGmswf@linutronix.de
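A sketch of the change inside try_steal_cookie()'s scan loop (surrounding code elided):

    /* cpus_mask alone misses migrate-disabled tasks;
     * is_cpu_allowed() also rejects a task that is currently in a
     * migration-disabled section and must stay where it is */
    if (!is_cpu_allowed(p, this))
        goto next;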
2022-04-05sched/core: Fix forceidle balancingPeter Zijlstra
Steve reported that ChromeOS encounters the forceidle balancer being run from rt_mutex_setprio()'s balance_callback() invocation and explodes. Now, the forceidle balancer gets queued every time the idle task gets selected, set_next_task(), which is strictly too often. rt_mutex_setprio() also uses set_next_task() in the 'change' pattern: queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */ running = task_current(rq, p); /* rq->curr == p */ if (queued) dequeue_task(...); if (running) put_prev_task(...); /* change task properties */ if (queued) enqueue_task(...); if (running) set_next_task(...); However, rt_mutex_setprio() will explicitly not run this pattern on the idle task (since priority boosting the idle task is quite insane). Most other 'change' pattern users are pidhash based and would also not apply to idle. Also, the change pattern doesn't contain a __balance_callback() invocation and hence we could have an out-of-band balance-callback, which *should* trigger the WARN in rq_pin_lock() (which guards against this exact anti-pattern). So while none of that explains how this happens, it does indicate that having it in set_next_task() might not be the most robust option. Instead, explicitly queue the forceidle balancer from pick_next_task() when it does indeed result in forceidle selection. Having it here ensures it can only be triggered under the __schedule() rq->lock instance, and hence must be run from that context. This also happens to clean up the code a little, so win-win. Fixes: d2dfa17bc7de ("sched: Trivial forced-newidle balancer") Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: T.J. Alumbaugh <talumbau@chromium.org> Link: https://lkml.kernel.org/r/20220330160535.GN8939@worktop.programming.kicks-ass.net
2022-04-05dt-bindings: display: bridge: Drop requirement on input port for DSI devicesMaxime Ripard
MIPI-DSI devices, if they are controlled through the bus itself, have to be described as a child node of the controller they are attached to. Thus, there's no requirement on the controller having an OF-Graph output port to model the data stream: it's assumed that it would go from the parent to the child. However, some bridges controlled through the DSI bus still require an input OF-Graph port, thus requiring a controller with an OF-Graph output port. This prevents those bridges from being used with the controllers that do not have one without any particular reason to. Let's drop that requirement. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220323154823.839469-1-maxime@cerno.tech
2022-04-05sctp: count singleton chunks in assoc user statsJamie Bainbridge
Singleton chunks (INIT, HEARTBEAT PMTU probes, and SHUTDOWN-COMPLETE) are not counted in the SCTP_GET_ASOC_STATS "sas_octrlchunks" counter available to the assoc owner. These are all control chunks so they should be counted as such. Add counting of singleton chunks so they are properly accounted for. Fixes: 196d67593439 ("sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/c9ba8785789880cf07923b8a5051e174442ea9ee.1649029663.git.jamie.bainbridge@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
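A hedged sketch of the accounting on the transmit path (the branch placement is illustrative; octrlchunks is the internal field behind sas_octrlchunks):

    /* a singleton chunk is by definition a control chunk,
     * so account it to the association's control-chunk counter */
    if (asoc)
        asoc->stats.octrlchunks++;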
2022-04-04cifs: update internal module numberSteve French
To 2.36 Signed-off-by: Steve French <stfrench@microsoft.com>
2022-04-04cifs: force new session setup and tcon for dfsPaulo Alcantara
Do not reuse existing sessions and tcons in DFS failover as it might connect to different servers and shares. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: stable@vger.kernel.org Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-04-04io_uring: move read/write file prep state into actual opcode handlerJens Axboe
In preparation for not necessarily having a file assigned at prep time, defer any initialization associated with the file to when the opcode handler is run. Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-04io_uring: defer splice/tee file validity check until command issueJens Axboe
In preparation for not using the file at prep time, defer checking if this file refers to a valid io_uring instance until issue time. This also means we can get rid of the cleanup flag for splice and tee. Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-05drm/nouveau/pmu: Add missing callbacks for Tegra devicesKarol Herbst
Fixes a crash booting on those platforms with nouveau. Fixes: 4cdd2450bf73 ("drm/nouveau/pmu/gm200-: use alternate falcon reset sequence") Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.17+ Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220322124800.2605463-1-kherbst@redhat.com
2022-04-04firmware: arm_scmi: Fix sparse warnings in OPTEE transport driverSudeep Holla
The sparse checker complains about converting pointers between address spaces. We correctly stored an __iomem pointer in struct scmi_optee_channel, but discarded the __iomem when returning it from get_channel_shm, causing one warning. Then we passed the non-__iomem pointer returned from get_channel_shm at two other places where an __iomem pointer is expected, causing a couple of other warnings. Add the appropriate __iomem annotations at all places where they are missing. optee.c:414:20: warning: incorrect type in return expression (different address spaces) optee.c:414:20: expected struct scmi_shared_mem * optee.c:414:20: got struct scmi_shared_mem [noderef] __iomem *shmem optee.c:426:26: warning: incorrect type in argument 1 (different address spaces) optee.c:426:26: expected struct scmi_shared_mem [noderef] __iomem *shmem optee.c:426:26: got struct scmi_shared_mem *shmem optee.c:441:30: warning: incorrect type in argument 1 (different address spaces) optee.c:441:30: expected struct scmi_shared_mem [noderef] __iomem *shmem optee.c:441:30: got struct scmi_shared_mem *shmem Link: https://lore.kernel.org/r/20220404102419.1159705-1-sudeep.holla@arm.com Cc: Etienne Carriere <etienne.carriere@linaro.org> Cc: Cristian Marussi <cristian.marussi@arm.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-04-04firmware: arm_scmi: Replace zero-length array with flexible-array memberLv Ruyi
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://lore.kernel.org/r/20220401075537.2407376-1-lv.ruyi@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-04-04firmware: arm_scmi: Fix sorting of retrieved clock ratesCristian Marussi
During SCMI Clock protocol initialization, after having retrieved from the SCMI platform all the available discrete rates for a specific clock, the clock rates array is sorted, unfortunately using a pointer to its end as a base instead of its start, so that sorting does not work. Fix invocation of sort() passing as base a pointer to the start of the retrieved clock rates array. Link: https://lore.kernel.org/r/20220318092813.49283-1-cristian.marussi@arm.com Fixes: dccec73de91d ("firmware: arm_scmi: Keep the discrete clock rates sorted") Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
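A sketch of the broken versus fixed invocation (identifier names borrowed from drivers/firmware/arm_scmi/clock.c and approximate):

    /* broken: 'rate' had been advanced past the last filled entry,
     * so sort() received the end of the array as its base */
    sort(rate, tot_rate_cnt, sizeof(*rate), rate_cmp_func, NULL);

    /* fixed: the base is the start of the retrieved rates array */
    sort(list->rates, tot_rate_cnt, sizeof(*list->rates),
         rate_cmp_func, NULL);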
2022-04-04selftests/harness: Pass variant to teardownWillem de Bruijn
FIXTURE_VARIANT data is passed to FIXTURE_SETUP and TEST_F as "variant". In some cases, the variant will change the setup, such that expectations also change on teardown. Also pass variant to FIXTURE_TEARDOWN. The new FIXTURE_TEARDOWN logic is identical to that in FIXTURE_SETUP, right above. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201210231010.420298-1-willemdebruijn.kernel@gmail.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
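A sketch of a test using the variant in teardown (fixture, field, and path names are hypothetical; assumes kselftest_harness.h plus fcntl.h/unistd.h):

    FIXTURE(tmpfile) {
        int fd;
    };

    FIXTURE_VARIANT(tmpfile) {
        bool unlink_on_teardown;
    };

    FIXTURE_SETUP(tmpfile) {
        self->fd = open("/tmp/kselftest-demo", O_CREAT | O_RDWR, 0600);
        ASSERT_GE(self->fd, 0);
    }

    FIXTURE_TEARDOWN(tmpfile) {
        close(self->fd);
        /* 'variant' is now in scope here, just as in SETUP/TEST_F */
        if (variant->unlink_on_teardown)
            unlink("/tmp/kselftest-demo");
    }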
2022-04-04selftests/harness: Run TEARDOWN for ASSERT failuresKees Cook
The kselftest test harness has traditionally not run the registered TEARDOWN handler when a test encountered an ASSERT. This creates unexpected situations and tests need to be very careful about using ASSERT, which seems a needless hurdle for test writers. Because of the harness's design for optional failure handlers, the original implementation of ASSERT used an abort() to immediately stop execution, but that meant the context for running teardown was lost. Instead, use setjmp/longjmp so that teardown can be done. Failed SETUP routines continue to not be followed by TEARDOWN, though. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: linux-kselftest@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
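The underlying C pattern, reduced to a self-contained sketch outside the harness:

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf test_jmp;

    #define ASSERT_TRUE(cond) do {                                \
        if (!(cond)) {                                            \
            fprintf(stderr, "ASSERT failed: %s\n", #cond);        \
            longjmp(test_jmp, 1); /* bail out of the test body */ \
        }                                                         \
    } while (0)

    int main(void)
    {
        /* setup would run here */
        if (setjmp(test_jmp) == 0)
            ASSERT_TRUE(1 == 2);    /* fails and longjmps back */
        /* unlike abort(), control returns here, so teardown runs */
        puts("teardown still runs");
        return 0;
    }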
2022-04-04selftests: fix an unused variable warning in pidfd selftestAxel Rasmussen
I fixed a few warnings like this in commit e2aa5e650b07 ("selftests: fixup build warnings in pidfd / clone3 tests"), but I missed this one by mistake. Since this variable is unused, remove it. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04selftests: fix header dependency for pid_namespace selftestsAxel Rasmussen
The way the test target was defined before, when building with clang we get a command line like this: clang -Wall -Werror -g -I../../../../usr/include/ \ regression_enomem.c ../pidfd/pidfd.h -o regression_enomem This yields an error, because clang thinks we want to produce both a *.o file and a precompiled header: clang: error: cannot specify -o when generating multiple output files gcc, for whatever reason, doesn't exhibit the same behavior, which I suspect is why the problem wasn't noticed before. This can be fixed simply by using the LOCAL_HDRS infrastructure the selftests lib.mk provides. It does the right thing and marks the target as depending on the header (so if the header changes, we rebuild), but it filters the header out of the compiler command line, so we don't get the error described above. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04selftests: x86: add 32bit build warnings for SUSEGeliang Tang
In order to successfully build all these 32-bit tests, the 32-bit gcc and glibc packages, named gcc-32bit and glibc-devel-static-32bit on SUSE, need to be installed. This patch adds this information to warn_32bit_failure. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04selftests/proc: fix array_size.cocci warningGuo Zhengkui
Fix the following coccicheck warning: tools/testing/selftests/proc/proc-pid-vm.c:371:26-27: WARNING: Use ARRAY_SIZE tools/testing/selftests/proc/proc-pid-vm.c:420:26-27: WARNING: Use ARRAY_SIZE It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
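The idiom coccinelle suggests, as a runnable sketch:

    #include <stdio.h>

    #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

    int main(void)
    {
        int vals[] = { 3, 1, 4, 1, 5 };
        size_t i;

        /* replaces open-coded sizeof(vals)/sizeof(vals[0]) */
        for (i = 0; i < ARRAY_SIZE(vals); i++)
            printf("%d\n", vals[i]);
        return 0;
    }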
2022-04-04selftests/vDSO: fix array_size.cocci warningGuo Zhengkui
Fix the following coccicheck warning: tools/testing/selftests/vDSO/vdso_test_correctness.c:309:46-47: WARNING: Use ARRAY_SIZE tools/testing/selftests/vDSO/vdso_test_correctness.c:373:46-47: WARNING: Use ARRAY_SIZE It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04dt-bindings: Fix 'enum' lists with duplicate entriesRob Herring
There's no reason to list the same value twice in an 'enum'. Fix all the occurrences in the tree. A meta-schema change will catch future ones. Cc: Krzysztof Kozlowski <krzk+dt@kernel.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Charles Keepax <ckeepax@opensource.cirrus.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Sebastian Reichel <sre@kernel.org> Cc: Tony Lindgren <tony@atomide.com> Cc: Yunfei Dong <yunfei.dong@mediatek.com> Cc: - <patches@opensource.cirrus.com> Cc: linux-media@vger.kernel.org Cc: alsa-devel@alsa-project.org Cc: linux-gpio@vger.kernel.org Cc: linux-pm@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20220401141247.2993925-1-robh@kernel.org
2022-04-04dt-bindings: irqchip: mrvl,intc: refresh maintainersKrzysztof Kozlowski
Jason's email bounces and his address was dropped from maintainers in commit 509920aee72a ("MAINTAINERS: Move Jason Cooper to CREDITS"), so drop him here too. Switch the other maintainers from IRQCHIP subsystem maintainers to Marvell Orion platform maintainers, because it is more likely that they know the hardware. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220317142952.479413-1-krzysztof.kozlowski@canonical.com
2022-04-04dt-bindings: Fix incomplete if/then/else schemasRob Herring
A recent review highlighted that the json-schema meta-schema allows any combination of if/then/else schema keywords even though if, then or else by themselves makes little sense. With an added meta-schema to only allow valid combinations, there's a handful of schemas found which need fixing in a variety of ways. Incorrect indentation is the most common issue. Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Michael Hennerich <Michael.Hennerich@analog.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Krzysztof Kozlowski <krzk+dt@kernel.org> Cc: Olivier Moysan <olivier.moysan@foss.st.com> Cc: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Georgi Djakov <djakov@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Dmitry Osipenko <digetx@gmail.com> Cc: linux-iio@vger.kernel.org Cc: alsa-devel@alsa-project.org Cc: linux-mmc@vger.kernel.org Cc: linux-tegra@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-phy@lists.infradead.org Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220330145741.3044896-1-robh@kernel.org
2022-04-04Revert "ACPI: processor: idle: Only flush cache on entering C3"Akihiko Odaki
Revert commit 87ebbb8c612b ("ACPI: processor: idle: Only flush cache on entering C3") that broke the assumptions of the acpi_idle_play_dead() callers. Namely, the CPU cache must always be flushed in acpi_idle_play_dead(), regardless of the target C-state that is going to be requested, because this is likely to be part of a CPU offline procedure or preparation for entering a system-wide sleep state and the lack of synchronization between the CPU cache and RAM may lead to problems going forward, for example when the CPU is brought back online. In particular, it breaks resume from suspend-to-RAM on Lenovo ThinkPad C13 which fails occasionally until the problematic commit is reverted. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-04dt-bindings: power: renesas,apmu: Fix cpus property limitsGeert Uytterhoeven
"make dtbs_check": arch/arm/boot/dts/r8a7791-koelsch.dtb: apmu@e6152000: cpus:0: [6, 7] is too long From schema: Documentation/devicetree/bindings/power/renesas,apmu.yaml Correct the minimum and maximum number of CPUs controlled by a single APMU instance. Fixes: 39bd2b6a3783b899 ("dt-bindings: Improve phandle-array schemas") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/9ece1a07bbcb95abc9d80e6a6ecc95806a294a11.1648645279.git.geert+renesas@glider.be
2022-04-04dt-bindings: extcon: maxim,max77843: fix ports typeKrzysztof Kozlowski
The "ports" property can contain multiple ports as name suggests, so it should be using "ports" type from device graphs. Reported-by: Rob Herring <robh@kernel.org> Fixes: 9729cad0278b ("dt-bindings: extcon: maxim,max77843: Add MAX77843 bindings") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220310073258.24060-1-krzysztof.kozlowski@canonical.com
2022-04-04Documentation: kunit: fix path to .kunitconfig in start.rstDaniel Latypov
Commit ddbd60c779b4 ("kunit: use --build_dir=.kunit as default") changed the default --build_dir, which had the side effect of making `.kunitconfig` move to `.kunit/.kunitconfig`. However, the first few lines of kunit/start.rst never got updated, oops. Fix this by telling people to run kunit.py first, which will automatically generate the .kunit directory and .kunitconfig file, and then edit the file manually as desired. Reported-by: Yifan Yuan <alpc_metic@live.com> Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04IB/rdmavt: add lock to call to rvt_error_qp to prevent a race conditionNiels Dossche
The documentation of the function rvt_error_qp says both r_lock and s_lock need to be held when calling that function. It also asserts using lockdep that both of those locks are held. However, the commit I referenced in Fixes accidentally makes the call to rvt_error_qp in rvt_ruc_loopback no longer covered by r_lock. This results in the lockdep assertion failing and also possibly in a race condition. Fixes: d757c60eca9b ("IB/rdmavt: Fix concurrency panics in QP post_send and modify to error") Link: https://lore.kernel.org/r/20220228165330.41546-1-dossche.niels@gmail.com Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
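A hedged sketch of the locking the fix restores around the error path in rvt_ruc_loopback() (variable names approximate):

    /* rvt_error_qp() lockdep-asserts both locks, so take r_lock
     * around the existing s_lock critical section */
    spin_lock_irqsave(&sqp->r_lock, flags);
    spin_lock(&sqp->s_lock);
    rvt_error_qp(sqp, IB_WC_WR_FLUSH_ERR);
    spin_unlock(&sqp->s_lock);
    spin_unlock_irqrestore(&sqp->r_lock, flags);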