summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2022-02-11sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCsMel Gorman
Commit 7d2b5dd0bcc4 ("sched/numa: Allow a floating imbalance between NUMA nodes") allowed an imbalance between NUMA nodes such that communicating tasks would not be pulled apart by the load balancer. This works fine when there is a 1:1 relationship between LLC and node but can be suboptimal for multiple LLCs if independent tasks prematurely use CPUs sharing cache. Zen* has multiple LLCs per node with local memory channels and due to the allowed imbalance, it's far harder to tune some workloads to run optimally than it is on hardware that has 1 LLC per node. This patch allows an imbalance to exist up to the point where LLCs should be balanced between nodes. On a Zen3 machine running STREAM parallelised with OMP to have on instance per LLC the results and without binding, the results are 5.17.0-rc0 5.17.0-rc0 vanilla sched-numaimb-v6 MB/sec copy-16 162596.94 ( 0.00%) 580559.74 ( 257.05%) MB/sec scale-16 136901.28 ( 0.00%) 374450.52 ( 173.52%) MB/sec add-16 157300.70 ( 0.00%) 564113.76 ( 258.62%) MB/sec triad-16 151446.88 ( 0.00%) 564304.24 ( 272.61%) STREAM can use directives to force the spread if the OpenMP is new enough but that doesn't help if an application uses threads and it's not known in advance how many threads will be created. Coremark is a CPU and cache intensive benchmark parallelised with threads. When running with 1 thread per core, the vanilla kernel allows threads to contend on cache. With the patch; 5.17.0-rc0 5.17.0-rc0 vanilla sched-numaimb-v5 Min Score-16 368239.36 ( 0.00%) 389816.06 ( 5.86%) Hmean Score-16 388607.33 ( 0.00%) 427877.08 * 10.11%* Max Score-16 408945.69 ( 0.00%) 481022.17 ( 17.62%) Stddev Score-16 15247.04 ( 0.00%) 24966.82 ( -63.75%) CoeffVar Score-16 3.92 ( 0.00%) 5.82 ( -48.48%) It can also make a big difference for semi-realistic workloads like specjbb which can execute arbitrary numbers of threads without advance knowledge of how they should be placed. Even in cases where the average performance is neutral, the results are more stable. 5.17.0-rc0 5.17.0-rc0 vanilla sched-numaimb-v6 Hmean tput-1 71631.55 ( 0.00%) 73065.57 ( 2.00%) Hmean tput-8 582758.78 ( 0.00%) 556777.23 ( -4.46%) Hmean tput-16 1020372.75 ( 0.00%) 1009995.26 ( -1.02%) Hmean tput-24 1416430.67 ( 0.00%) 1398700.11 ( -1.25%) Hmean tput-32 1687702.72 ( 0.00%) 1671357.04 ( -0.97%) Hmean tput-40 1798094.90 ( 0.00%) 2015616.46 * 12.10%* Hmean tput-48 1972731.77 ( 0.00%) 2333233.72 ( 18.27%) Hmean tput-56 2386872.38 ( 0.00%) 2759483.38 ( 15.61%) Hmean tput-64 2909475.33 ( 0.00%) 2925074.69 ( 0.54%) Hmean tput-72 2585071.36 ( 0.00%) 2962443.97 ( 14.60%) Hmean tput-80 2994387.24 ( 0.00%) 3015980.59 ( 0.72%) Hmean tput-88 3061408.57 ( 0.00%) 3010296.16 ( -1.67%) Hmean tput-96 3052394.82 ( 0.00%) 2784743.41 ( -8.77%) Hmean tput-104 2997814.76 ( 0.00%) 2758184.50 ( -7.99%) Hmean tput-112 2955353.29 ( 0.00%) 2859705.09 ( -3.24%) Hmean tput-120 2889770.71 ( 0.00%) 2764478.46 ( -4.34%) Hmean tput-128 2871713.84 ( 0.00%) 2750136.73 ( -4.23%) Stddev tput-1 5325.93 ( 0.00%) 2002.53 ( 62.40%) Stddev tput-8 6630.54 ( 0.00%) 10905.00 ( -64.47%) Stddev tput-16 25608.58 ( 0.00%) 6851.16 ( 73.25%) Stddev tput-24 12117.69 ( 0.00%) 4227.79 ( 65.11%) Stddev tput-32 27577.16 ( 0.00%) 8761.05 ( 68.23%) Stddev tput-40 59505.86 ( 0.00%) 2048.49 ( 96.56%) Stddev tput-48 168330.30 ( 0.00%) 93058.08 ( 44.72%) Stddev tput-56 219540.39 ( 0.00%) 30687.02 ( 86.02%) Stddev tput-64 121750.35 ( 0.00%) 9617.36 ( 92.10%) Stddev tput-72 223387.05 ( 0.00%) 34081.13 ( 84.74%) Stddev tput-80 128198.46 ( 0.00%) 22565.19 ( 82.40%) Stddev tput-88 136665.36 ( 0.00%) 27905.97 ( 79.58%) Stddev tput-96 111925.81 ( 0.00%) 99615.79 ( 11.00%) Stddev tput-104 146455.96 ( 0.00%) 28861.98 ( 80.29%) Stddev tput-112 88740.49 ( 0.00%) 58288.23 ( 34.32%) Stddev tput-120 186384.86 ( 0.00%) 45812.03 ( 75.42%) Stddev tput-128 78761.09 ( 0.00%) 57418.48 ( 27.10%) Similarly, for embarassingly parallel problems like NPB-ep, there are improvements due to better spreading across LLC when the machine is not fully utilised. vanilla sched-numaimb-v6 Min ep.D 31.79 ( 0.00%) 26.11 ( 17.87%) Amean ep.D 31.86 ( 0.00%) 26.17 * 17.86%* Stddev ep.D 0.07 ( 0.00%) 0.05 ( 24.41%) CoeffVar ep.D 0.22 ( 0.00%) 0.20 ( 7.97%) Max ep.D 31.93 ( 0.00%) 26.21 ( 17.91%) Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20220208094334.16379-3-mgorman@techsingularity.net
2022-02-11bpf: Fix a bpf_timer initialization issueYonghong Song
The patch in [1] intends to fix a bpf_timer related issue, but the fix caused existing 'timer' selftest to fail with hang or some random errors. After some debug, I found an issue with check_and_init_map_value() in the hashtab.c. More specifically, in hashtab.c, we have code l_new = bpf_map_kmalloc_node(&htab->map, ...) check_and_init_map_value(&htab->map, l_new...) Note that bpf_map_kmalloc_node() does not do initialization so l_new contains random value. The function check_and_init_map_value() intends to zero the bpf_spin_lock and bpf_timer if they exist in the map. But I found bpf_spin_lock is zero'ed but bpf_timer is not zero'ed. With [1], later copy_map_value() skips copying of bpf_spin_lock and bpf_timer. The non-zero bpf_timer caused random failures for 'timer' selftest. Without [1], for both bpf_spin_lock and bpf_timer case, bpf_timer will be zero'ed, so 'timer' self test is okay. For check_and_init_map_value(), why bpf_spin_lock is zero'ed properly while bpf_timer not. In bpf uapi header, we have struct bpf_spin_lock { __u32 val; }; struct bpf_timer { __u64 :64; __u64 :64; } __attribute__((aligned(8))); The initialization code: *(struct bpf_spin_lock *)(dst + map->spin_lock_off) = (struct bpf_spin_lock){}; *(struct bpf_timer *)(dst + map->timer_off) = (struct bpf_timer){}; It appears the compiler has no obligation to initialize anonymous fields. For example, let us use clang with bpf target as below: $ cat t.c struct bpf_timer { unsigned long long :64; }; struct bpf_timer2 { unsigned long long a; }; void test(struct bpf_timer *t) { *t = (struct bpf_timer){}; } void test2(struct bpf_timer2 *t) { *t = (struct bpf_timer2){}; } $ clang -target bpf -O2 -c -g t.c $ llvm-objdump -d t.o ... 0000000000000000 <test>: 0: 95 00 00 00 00 00 00 00 exit 0000000000000008 <test2>: 1: b7 02 00 00 00 00 00 00 r2 = 0 2: 7b 21 00 00 00 00 00 00 *(u64 *)(r1 + 0) = r2 3: 95 00 00 00 00 00 00 00 exit gcc11.2 does not have the above issue. But from INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x Programming languages — C http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n1547.pdf page 157: Except where explicitly stated otherwise, for the purposes of this subclause unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization. To fix the problem, let use memset for bpf_timer case in check_and_init_map_value(). For consistency, memset is also used for bpf_spin_lock case. [1] https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com/ Fixes: 68134668c17f3 ("bpf: Add map side support for bpf timers.") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220211194953.3142152-1-yhs@fb.com
2022-02-11bpf: Fix crash due to incorrect copy_map_valueKumar Kartikeya Dwivedi
When both bpf_spin_lock and bpf_timer are present in a BPF map value, copy_map_value needs to skirt both objects when copying a value into and out of the map. However, the current code does not set both s_off and t_off in copy_map_value, which leads to a crash when e.g. bpf_spin_lock is placed in map value with bpf_timer, as bpf_map_update_elem call will be able to overwrite the other timer object. When the issue is not fixed, an overwriting can produce the following splat: [root@(none) bpf]# ./test_progs -t timer_crash [ 15.930339] bpf_testmod: loading out-of-tree module taints kernel. [ 16.037849] ================================================================== [ 16.038458] BUG: KASAN: user-memory-access in __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.038944] Write of size 8 at addr 0000000000043ec0 by task test_progs/325 [ 16.039399] [ 16.039514] CPU: 0 PID: 325 Comm: test_progs Tainted: G OE 5.16.0+ #278 [ 16.039983] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.15.0-1 04/01/2014 [ 16.040485] Call Trace: [ 16.040645] <TASK> [ 16.040805] dump_stack_lvl+0x59/0x73 [ 16.041069] ? __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.041427] kasan_report.cold+0x116/0x11b [ 16.041673] ? __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.042040] __pv_queued_spin_lock_slowpath+0x32b/0x520 [ 16.042328] ? memcpy+0x39/0x60 [ 16.042552] ? pv_hash+0xd0/0xd0 [ 16.042785] ? lockdep_hardirqs_off+0x95/0xd0 [ 16.043079] __bpf_spin_lock_irqsave+0xdf/0xf0 [ 16.043366] ? bpf_get_current_comm+0x50/0x50 [ 16.043608] ? jhash+0x11a/0x270 [ 16.043848] bpf_timer_cancel+0x34/0xe0 [ 16.044119] bpf_prog_c4ea1c0f7449940d_sys_enter+0x7c/0x81 [ 16.044500] bpf_trampoline_6442477838_0+0x36/0x1000 [ 16.044836] __x64_sys_nanosleep+0x5/0x140 [ 16.045119] do_syscall_64+0x59/0x80 [ 16.045377] ? lock_is_held_type+0xe4/0x140 [ 16.045670] ? irqentry_exit_to_user_mode+0xa/0x40 [ 16.046001] ? mark_held_locks+0x24/0x90 [ 16.046287] ? asm_exc_page_fault+0x1e/0x30 [ 16.046569] ? asm_exc_page_fault+0x8/0x30 [ 16.046851] ? lockdep_hardirqs_on+0x7e/0x100 [ 16.047137] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 16.047405] RIP: 0033:0x7f9e4831718d [ 16.047602] Code: b4 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b3 6c 0c 00 f7 d8 64 89 01 48 [ 16.048764] RSP: 002b:00007fff488086b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000023 [ 16.049275] RAX: ffffffffffffffda RBX: 00007f9e48683740 RCX: 00007f9e4831718d [ 16.049747] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007fff488086d0 [ 16.050225] RBP: 00007fff488086f0 R08: 00007fff488085d7 R09: 00007f9e4cb594a0 [ 16.050648] R10: 0000000000000000 R11: 0000000000000206 R12: 00007f9e484cde30 [ 16.051124] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 16.051608] </TASK> [ 16.051762] ================================================================== Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220209070324.1093182-2-memxor@gmail.com
2022-02-11Merge tag 'acpi-5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These revert two commits that turned out to be problematic and fix two issues related to wakeup from suspend-to-idle on x86. Specifics: - Revert a recent change that attempted to avoid issues with conflicting address ranges during PCI initialization, because it turned out to introduce a regression (Hans de Goede). - Revert a change that limited EC GPE wakeups from suspend-to-idle to systems based on Intel hardware, because it turned out that systems based on hardware from other vendors depended on that functionality too (Mario Limonciello). - Fix two issues related to the handling of wakeup interrupts and wakeup events signaled through the EC GPE during suspend-to-idle on x86 (Rafael Wysocki)" * tag 'acpi-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/PCI: revert "Ignore E820 reservations for bridge windows on newer systems" PM: s2idle: ACPI: Fix wakeup interrupts handling ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
2022-02-11block: partition include/linux/blk-cgroup.hMing Lei
Partition include/linux/blk-cgroup.h into two parts: one is public part, the other is block layer private part. Suggested by Christoph Hellwig. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220211101149.2368042-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-11block: remove THROTL_IOPS_MAXMing Lei
No one uses THROTL_IOPS_MAX any more, so remove it. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220211101149.2368042-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-11Merge tag 'wireless-next-2022-02-11' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next wireless-next patches for v5.18 First set of patches for v5.18, with both wireless and stack patches. rtw89 now has AP mode support and wcn36xx has survey support. But otherwise pretty normal. Major changes: ath11k * add LDPC FEC type in 802.11 radiotap header * enable RX PPDU stats in monitor co-exist mode wcn36xx * implement survey reporting brcmfmac * add CYW43570 PCIE device rtw88 * rtw8821c: enable RFE 6 devices rtw89 * AP mode support mt76 * mt7916 support * background radar detection support
2022-02-11net/smc: Dynamic control handshake limitation by socket optionsD. Wythe
This patch aims to add dynamic control for SMC handshake limitation for every smc sockets, in production environment, it is possible for the same applications to handle different service types, and may have different opinion on SMC handshake limitation. This patch try socket options to complete it, since we don't have socket option level for SMC yet, which requires us to implement it at the same time. This patch does the following: - add new socket option level: SOL_SMC. - add new SMC socket option: SMC_LIMIT_HS. - provide getter/setter for SMC socket options. Link: https://lore.kernel.org/all/20f504f961e1a803f85d64229ad84260434203bd.1644323503.git.alibuda@linux.alibaba.com/ Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11net/smc: Limit SMC visits when handshake workqueue congestedD. Wythe
This patch intends to provide a mechanism to put constraint on SMC connections visit according to the pressure of SMC handshake process. At present, frequent visits will cause the incoming connections to be backlogged in SMC handshake queue, raise the connections established time. Which is quite unacceptable for those applications who base on short lived connections. There are two ways to implement this mechanism: 1. Put limitation after TCP established. 2. Put limitation before TCP established. In the first way, we need to wait and receive CLC messages that the client will potentially send, and then actively reply with a decline message, in a sense, which is also a sort of SMC handshake, affect the connections established time on its way. In the second way, the only problem is that we need to inject SMC logic into TCP when it is about to reply the incoming SYN, since we already do that, it's seems not a problem anymore. And advantage is obvious, few additional processes are required to complete the constraint. This patch use the second way. After this patch, connections who beyond constraint will not informed any SMC indication, and SMC will not be involved in any of its subsequent processes. Link: https://lore.kernel.org/all/1641301961-59331-1-git-send-email-alibuda@linux.alibaba.com/ Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11locking/local_lock: Make the empty local_lock_*() function a macro.Sebastian Andrzej Siewior
It has been said that local_lock() does not add any overhead compared to preempt_disable() in a !LOCKDEP configuration. A micro benchmark showed an unexpected result which can be reduced to the fact that local_lock() was not entirely optimized away. In the !LOCKDEP configuration local_lock_acquire() is an empty static inline function. On x86 the this_cpu_ptr() argument of that function is fully evaluated leading to an additional mov+add instructions which are not needed and not used. Replace the static inline function with a macro. The typecheck() macro ensures that the argument is of proper type while the resulting disassembly shows no traces of this_cpu_ptr(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/YgKjciR60fZft2l4@linutronix.de
2022-02-11atomics: Fix atomic64_{read_acquire,set_release} fallbacksMark Rutland
Arnd reports that on 32-bit architectures, the fallbacks for atomic64_read_acquire() and atomic64_set_release() are broken as they use smp_load_acquire() and smp_store_release() respectively, which do not work on types larger than the native word size. Since those contain compiletime_assert_atomic_type(), any attempt to use those fallbacks will result in a build-time error. e.g. with the following added to arch/arm/kernel/setup.c: | void test_atomic64(atomic64_t *v) | { | atomic64_set_release(v, 5); | atomic64_read_acquire(v); | } The compiler will complain as follows: | In file included from <command-line>: | In function 'arch_atomic64_set_release', | inlined from 'test_atomic64' at ./include/linux/atomic/atomic-instrumented.h:669:2: | ././include/linux/compiler_types.h:346:38: error: call to '__compiletime_assert_9' declared with attribute error: Need native word sized stores/loads for atomicity. | 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | | ^ | ././include/linux/compiler_types.h:327:4: note: in definition of macro '__compiletime_assert' | 327 | prefix ## suffix(); \ | | ^~~~~~ | ././include/linux/compiler_types.h:346:2: note: in expansion of macro '_compiletime_assert' | 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | | ^~~~~~~~~~~~~~~~~~~ | ././include/linux/compiler_types.h:349:2: note: in expansion of macro 'compiletime_assert' | 349 | compiletime_assert(__native_word(t), \ | | ^~~~~~~~~~~~~~~~~~ | ./include/asm-generic/barrier.h:133:2: note: in expansion of macro 'compiletime_assert_atomic_type' | 133 | compiletime_assert_atomic_type(*p); \ | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ./include/asm-generic/barrier.h:164:55: note: in expansion of macro '__smp_store_release' | 164 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0) | | ^~~~~~~~~~~~~~~~~~~ | ./include/linux/atomic/atomic-arch-fallback.h:1270:2: note: in expansion of macro 'smp_store_release' | 1270 | smp_store_release(&(v)->counter, i); | | ^~~~~~~~~~~~~~~~~ | make[2]: *** [scripts/Makefile.build:288: arch/arm/kernel/setup.o] Error 1 | make[1]: *** [scripts/Makefile.build:550: arch/arm/kernel] Error 2 | make: *** [Makefile:1831: arch/arm] Error 2 Fix this by only using smp_load_acquire() and smp_store_release() for native atomic types, and otherwise falling back to the regular barriers necessary for acquire/release semantics, as we do in the more generic acquire and release fallbacks. Since the fallback templates are used to generate the atomic64_*() and atomic_*() operations, the __native_word() check is added to both. For the atomic_*() operations, which are always 32-bit, the __native_word() check is redundant but not harmful, as it is always true. For the example above this works as expected on 32-bit, e.g. for arm multi_v7_defconfig: | <test_atomic64>: | push {r4, r5} | dmb ish | pldw [r0] | mov r2, #5 | mov r3, #0 | ldrexd r4, [r0] | strexd r4, r2, [r0] | teq r4, #0 | bne 484 <test_atomic64+0x14> | ldrexd r2, [r0] | dmb ish | pop {r4, r5} | bx lr ... and also on 64-bit, e.g. for arm64 defconfig: | <test_atomic64>: | bti c | paciasp | mov x1, #0x5 | stlr x1, [x0] | ldar x0, [x0] | autiasp | ret Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20220207101943.439825-1-mark.rutland@arm.com
2022-02-11lib/xor: make xor prototypes more friendly to compiler vectorizationArd Biesheuvel
Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-02-11Merge tag 'drm-intel-next-2022-02-08' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Cross-subsystem Changes: ------------------------ dma-buf: - dma-buf-map: Rename to iosys-map (Lucas) Core Changes: ------------- drm: - Always include the debugfs_entry in drm_crtc (Ville) - Add orientation quirk for GPD Win Max (Anisse) Driver Changes: --------------- gvt: - Constify some pointers. (Rikard Falkeborn) - Use list_entry to access list members. (Guenter Roeck) - Fix cmd parser error for Passmark9. (Zhenyu Wang) i915: - Various clean-ups including headers and removing unused and unnecessary stuff\ (Jani, Hans, Andy, Ville) - Cleaning up on our registers definitions i915_reg.h (Matt) - More multi-FBC refactoring (Ville) - Baytrail backlight fix (Hans) - DG1 OPROM read through SPI controller (Clint) - ADL-N platform enabling (Tejas) - Fix slab-out-of-bounds access (Jani) - Add opregion mailbox #5 support for possible EDID override (Anisse) - Fix possible NULL dereferences (Harish) - Updates and fixes around display voltage swing values (Clint, Jose) - Fix RPM wekeref on PXP code (Juston) - Many register definitions clean-up, including planes registers (Ville) - More conversion towards display version over the old gen (Madhumitha, Ville) - DP MST ESI handling improvements (Jani) - drm device based logging conversions (Jani) - Prevent divide by zero (Dan) - Introduce ilk_pch_pre_enable for complete modeset abstraction (Ville) - Async flip optimization for DG2 (Stanislav) - Multiple DSC and bigjoiner fixes and improvements (Ville) - Fix ADL-P TypeC Phy ready status readout (Imre) - Fix up DP DFP 4:2:0 handling more display related fixes (Ville) - Display M/N cleanup (Ville) - Switch to use VGA definitions from video/vga.h (Jani) - Fixes and improvements to abstract CPU architecture (Lucas) - Disable unsused power wells left enabled by BIOS (Imre) - Allow !join_mbus cases for adlp+ dbuf configuration (Ville) - Populate pipe dbuf slices more accurately during readout (Ville) - Workaround broken BIOS DBUF configuration on TGL/RKL (Ville) - Fix trailing semicolon (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YgKFLmCgpv4vQEa1@intel.com
2022-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10soc: qcom: llcc: Add configuration data for SM8450 SoCSai Prakash Ranjan
Add LLCC configuration data for SM8450 SoC. Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Tested-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/fec944cb8f2a4a70785903c6bfec629c6f31b6a4.1643355594.git.quic_saipraka@quicinc.com
2022-02-10soc: qcom: llcc: Update the logic for version info extractionSai Prakash Ranjan
LLCC HW version info is made up of major, branch, minor and echo version bits each of which are 8bits. Several features in newer LLCC HW are based on the full version rather than just major or minor versions such as write-subcache enable which is applicable for versions v2.0.0.0 and later, also upcoming write-subcache cacheable for SM8450 SoC which is only present in versions v2.1.0.0 and later, so it makes it easier and cleaner to just directly compare with the full version than adding additional major/branch/ minor/echo version checks. So remove the earlier major version check and add full version check for those features. Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Tested-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/a82d7c32348c51fcc2b63e220d91b318bf706c83.1643355594.git.quic_saipraka@quicinc.com
2022-02-10Merge branch 'i2c/add-request_mem_region_muxed' into i2c/for-mergewindowWolfram Sang
2022-02-10kernel/resource: Introduce request_mem_region_muxed()Terry Bowman
Support for requesting muxed memory region is implemented but not currently callable as a macro. Add the request muxed memory region macro. MMIO memory accesses can be synchronized using request_mem_region() which is already available. This call will return failure if the resource is busy. The 'muxed' version of this macro will handle a busy resource by using a wait queue to retry until the resource is available. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-02-10genirq: Kill irq_chip::parent_deviceMarc Zyngier
Now that noone is using irq_chip::parent_device in the tree, get rid of it. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Bartosz Golaszewski <brgl@bgdev.pl> Link: https://lore.kernel.org/r/20220201120310.878267-13-maz@kernel.org
2022-02-10spi: mxic: Add support for pipelined ECC operationsMiquel Raynal
Some SPI-NAND chips do not have a proper on-die ECC engine providing error correction/detection. This is particularly an issue on embedded devices with limited resources because all the computations must happen in software, unless an external hardware engine is provided. These external engines are new and can be of two categories: external or pipelined. Macronix is providing both, the former being already supported. The second, however, is very SoC implementation dependent and must be instantiated by the SPI host controller directly. An entire subsystem has been contributed to support these engines which makes the insertion into another subsystem such as SPI quite straightforward without the need for a lot of specific functions. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-mtd/20220202144536.393792-1-miquel.raynal@bootlin.com
2022-02-10mtd: spinand: Create direct mapping descriptors for ECC operationsMiquel Raynal
In order for pipelined ECC engines to be able to enable/disable the ECC engine only when needed and avoid races when future parallel-operations will be supported, we need to provide the information about the use of the ECC engine in the direct mapping hooks. As direct mapping configurations are meant to be static, it is best to create two new mappings: one for regular 'raw' accesses and one for accesses involving correction. It is up to the driver to use or not the new ECC enable boolean contained in the spi-mem operation. As dirmaps are not free (they consume a few pages of MMIO address space) and because these extra entries are only meant to be used by pipelined engines, let's limit their use to this specific type of engine and save a bit of memory with all the other setups. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-mtd/20220127091808.1043392-9-miquel.raynal@bootlin.com
2022-02-10spi: spi-mem: Add an ecc parameter to the spi_mem_op structureMiquel Raynal
Soon the SPI-NAND core will need a way to request a SPI controller to enable ECC support for a given operation. This is because of the pipelined integration of certain ECC engines, which are directly managed by the SPI controller itself. Introduce a spi_mem_op additional field for this purpose: ecc. So far this field is left unset and checked to be false by all the SPI controller drivers in their ->supports_op() hook, as they all call spi_mem_default_supports_op(). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Pratyush Yadav <p.yadav@ti.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/linux-mtd/20220127091808.1043392-7-miquel.raynal@bootlin.com
2022-02-10spi: spi-mem: Kill the spi_mem_dtr_supports_op() helperMiquel Raynal
Now that spi_mem_default_supports_op() has access to the static controller capabilities (relating to memory operations), and now that these capabilities have been filled by the relevant controllers, there is no need for a specific helper checking only DTR operations, so let's just kill spi_mem_dtr_supports_op() and simply use spi_mem_default_supports_op() instead. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Pratyush Yadav <p.yadav@ti.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/linux-mtd/20220127091808.1043392-6-miquel.raynal@bootlin.com
2022-02-10spi: spi-mem: Introduce a capability structureMiquel Raynal
Create a spi_controller_mem_caps structure and put it within the spi_controller structure close to the spi_controller_mem_ops strucure. So far the only field in this structure is the support for dtr operations, but soon we will add another parameter. Also create a helper to parse the capabilities and check if the requested capability has been set or not. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Pratyush Yadav <p.yadav@ti.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-mtd/20220127091808.1043392-2-miquel.raynal@bootlin.com
2022-02-10mtd: nand: mxic-ecc: Support SPI pipelined modeMiquel Raynal
Introduce the support for another possible configuration: the ECC engine may work as DMA master (pipelined) and move itself the data to/from the NAND chip into the buffer, applying the necessary corrections/computations on the fly. This driver offers an ECC engine implementation that must be instatiated from a SPI controller driver. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20211216111654.238086-17-miquel.raynal@bootlin.com
2022-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next 1) Conntrack sets on CHECKSUM_UNNECESSARY for UDP packet with no checksum, from Kevin Mitchell. 2) skb->priority support for nfqueue, from Nicolas Dichtel. 3) Remove conntrack extension register API, from Florian Westphal. 4) Move nat destroy hook to nf_nat_hook instead, to remove nf_ct_ext_destroy(), also from Florian. 5) Wrap pptp conntrack NAT hooks into single structure, from Florian Westphal. 6) Support for tcp option set to noop for nf_tables, also from Florian. 7) Do not run x_tables comment match from packet path in nf_tables, from Florian Westphal. 8) Replace spinlock by cmpxchg() loop to update missed ct event, from Florian Westphal. 9) Wrap cttimeout hooks into single structure, from Florian. 10) Add fast nft_cmp expression for up to 16-bytes. 11) Use cb->ctx to store context in ctnetlink dump, instead of using cb->args[], from Florian Westphal. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: ctnetlink: use dump structure instead of raw args nfqueue: enable to set skb->priority netfilter: nft_cmp: optimize comparison for 16-bytes netfilter: cttimeout: use option structure netfilter: ecache: don't use nf_conn spinlock netfilter: nft_compat: suppress comment match netfilter: exthdr: add support for tcp option removal netfilter: conntrack: pptp: use single option structure netfilter: conntrack: remove extension register api netfilter: conntrack: handle ->destroy hook via nat_ops instead netfilter: conntrack: move extension sizes into core netfilter: conntrack: make all extensions 8-byte alignned netfilter: nfqueue: enable to get skb->priority netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY ==================== Link: https://lore.kernel.org/r/20220209133616.165104-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-02-09 We've added 126 non-merge commits during the last 16 day(s) which contain a total of 201 files changed, 4049 insertions(+), 2215 deletions(-). The main changes are: 1) Add custom BPF allocator for JITs that pack multiple programs into a huge page to reduce iTLB pressure, from Song Liu. 2) Add __user tagging support in vmlinux BTF and utilize it from BPF verifier when generating loads, from Yonghong Song. 3) Add per-socket fast path check guarding from cgroup/BPF overhead when used by only some sockets, from Pavel Begunkov. 4) Continued libbpf deprecation work of APIs/features and removal of their usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko and various others. 5) Improve BPF instruction set documentation by adding byte swap instructions and cleaning up load/store section, from Christoph Hellwig. 6) Switch BPF preload infra to light skeleton and remove libbpf dependency from it, from Alexei Starovoitov. 7) Fix architecture-agnostic macros in libbpf for accessing syscall arguments from BPF progs for non-x86 architectures, from Ilya Leoshkevich. 8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be of 16-bit field with anonymous zero padding, from Jakub Sitnicki. 9) Add new bpf_copy_from_user_task() helper to read memory from a different task than current. Add ability to create sleepable BPF iterator progs, from Kenny Yu. 10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski. 11) Generate temporary netns names for BPF selftests to avoid naming collisions, from Hangbin Liu. 12) Implement bpf_core_types_are_compat() with limited recursion for in-kernel usage, from Matteo Croce. 13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5 to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor. 14) Misc minor fixes to libbpf and selftests from various folks. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits) selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide libbpf: Fix compilation warning due to mismatched printf format selftests/bpf: Test BPF_KPROBE_SYSCALL macro libbpf: Add BPF_KPROBE_SYSCALL macro libbpf: Fix accessing the first syscall argument on s390 libbpf: Fix accessing the first syscall argument on arm64 libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390 libbpf: Fix accessing syscall arguments on riscv libbpf: Fix riscv register names libbpf: Fix accessing syscall arguments on powerpc selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro libbpf: Add PT_REGS_SYSCALL_REGS macro selftests/bpf: Fix an endianness issue in bpf_syscall_macro test bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE bpf: Fix leftover header->pages in sparc and powerpc code. libbpf: Fix signedness bug in btf_dump_array_data() selftests/bpf: Do not export subtest as standalone test bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures ... ==================== Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09Merge tag 'nfsd-5.17-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull more nfsd fixes from Chuck Lever: "Ensure that NFS clients cannot send file size or offset values that can cause the NFS server to crash or to return incorrect or surprising results. In particular, fix how the NFS server handles values larger than OFFSET_MAX" * tag 'nfsd-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Deprecate NFS_OFFSET_MAX NFSD: Fix offset type in I/O trace points NFSD: COMMIT operations must not return NFS?ERR_INVAL NFSD: Clamp WRITE offsets NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes NFSD: Fix ia_size underflow NFSD: Fix the behavior of READ near OFFSET_MAX
2022-02-09spi: make remove callback a void functionMark Brown
Merge series from Uwe Kleine-König <u.kleine-koenig@pengutronix.de>: this series goal is to change the spi remove callback's return value to void. After numerous patches nearly all drivers already return 0 unconditionally. The four first patches in this series convert the remaining three drivers to return 0, the final patch changes the remove prototype and converts all implementers. base-commit: 26291c54e111ff6ba87a164d85d4a4e134b7315c
2022-02-09NFSD: Deprecate NFS_OFFSET_MAXChuck Lever
NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there was a kernel-wide OFFSET_MAX value. As a clean up, replace the last few uses of it with its generic equivalent, and get rid of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-02-09genirq: Allow the PM device to originate from irq domainMarc Zyngier
As a preparation to moving the reference to the device used for runtime power management, add a new 'dev' field to the irqdomain structure for that exact purpose. The irq_chip_pm_{get,put}() helpers are made aware of the dual location via a new private helper. No functional change intended. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Tony Lindgren <tony@atomide.com> Acked-by: Bartosz Golaszewski <brgl@bgdev.pl> Link: https://lore.kernel.org/r/20220201120310.878267-2-maz@kernel.org
2022-02-09spi: make remove callback a void functionUwe Kleine-König
The value returned by an spi driver's remove function is mostly ignored. (Only an error message is printed if the value is non-zero that the error is ignored.) So change the prototype of the remove function to return no value. This way driver authors are not tempted to assume that passing an error to the upper layer is a good idea. All drivers are adapted accordingly. There is no intended change of behaviour, all callbacks were prepared to return 0 before. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Claudius Heine <ch@denx.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Acked-by: Marcus Folkesson <marcus.folkesson@gmail.com> Acked-by: Łukasz Stelmach <l.stelmach@samsung.com> Acked-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220123175201.34839-6-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2022-02-09gpiolib: make struct comments into real kernel docsBartosz Golaszewski
We have several comments that start with '/**' but don't conform to the kernel doc standard. Add proper detailed descriptions for the affected definitions and move the docs from the forward declarations to the struct definitions where applicable. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org>
2022-02-09mtd: nand: ecc: Provide a helper to retrieve a pilelined engine deviceMiquel Raynal
In a pipelined engine situation, we might either have the host which internally has support for error correction, or have it using an external hardware block for this purpose. In the former case, the host is also the ECC engine. In the latter case, it is not. In order to get the right pointers on the right devices (for example: in order to devm_* allocate variables), let's introduce this helper which can safely be called by pipelined ECC engines in order to retrieve the right device structure. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20211216111654.238086-16-miquel.raynal@bootlin.com
2022-02-09mtd: rawnand: protect access to rawnand devices while in suspendSean Nyekjaer
Prevent rawnand access while in a suspended state. Commit 013e6292aaf5 ("mtd: rawnand: Simplify the locking") allows the rawnand layer to return errors rather than waiting in a blocking wait. Tested on a iMX6ULL. Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking") Signed-off-by: Sean Nyekjaer <sean@geanix.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220208085213.1838273-1-sean@geanix.com
2022-02-09cpufreq: Reintroduce ready() callbackBjorn Andersson
This effectively revert '4bf8e582119e ("cpufreq: Remove ready() callback")', in order to reintroduce the ready callback. This is needed in order to be able to leave the thermal pressure interrupts in the Qualcomm CPUfreq driver disabled during initialization, so that it doesn't fire while related_cpus are still 0. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> [ Viresh: Added the Chinese translation as well and updated commit msg ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-02-09peci: Add peci-cpu driverIwona Winiarska
PECI is an interface that may be used by different types of devices. Add a peci-cpu driver compatible with Intel processors. The driver is responsible for handling auxiliary devices that can subsequently be used by other drivers (e.g. hwmons). Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com> Link: https://lore.kernel.org/r/20220208153639.255278-10-iwona.winiarska@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-09peci: Add support for PECI device driversIwona Winiarska
Add support for PECI device drivers, which unlike PECI controller drivers are actually able to provide functionalities to userspace. Also, extend peci_request API to allow querying more details about PECI device (e.g. model/family), that's going to be used to find a compatible peci_driver. Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com> Link: https://lore.kernel.org/r/20220208153639.255278-9-iwona.winiarska@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-09peci: Add device detectionIwona Winiarska
Since PECI devices are discoverable, we can dynamically detect devices that are actually available in the system. This change complements the earlier implementation by rescanning PECI bus to detect available devices. For this purpose, it also introduces the minimal API for PECI requests. Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com> Link: https://lore.kernel.org/r/20220208153639.255278-7-iwona.winiarska@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-09peci: Add core infrastructureIwona Winiarska
Intel processors provide access for various services designed to support processor and DRAM thermal management, platform manageability and processor interface tuning and diagnostics. Those services are available via the Platform Environment Control Interface (PECI) that provides a communication channel between the processor and the Baseboard Management Controller (BMC) or other platform management device. This change introduces PECI subsystem by adding the initial core module and API for controller drivers. Co-developed-by: Jason M Bills <jason.m.bills@linux.intel.com> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Jason M Bills <jason.m.bills@linux.intel.com> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com> Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com> Link: https://lore.kernel.org/r/20220208153639.255278-5-iwona.winiarska@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08net: add dev->dev_registered_trackerEric Dumazet
Convert one dev_hold()/dev_put() pair in register_netdevice() and unregister_netdevice_many() to dev_hold_track() and dev_put_track(). This would allow to detect a rogue dev_put() a bit earlier. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220207184107.1401096-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-08Merge branch 'iwl-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux Nguyen, Anthony L says: ==================== iwl-next Intel Wired LAN Driver Updates 2022-02-07 Dave adds support for ice driver to provide DSCP QoS mappings to irdma driver. [1] https://lore.kernel.org/netdev/20220202191921.1638-1-shiraz.saleem@intel.com/ * 'iwl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux: ice: add support for DSCP QoS for IDC ==================== Link: https://lore.kernel.org/r/20220207235921.1303522-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-08Merge tag 'nfs-for-5.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client fixes from Anna Schumaker: "Stable Fixes: - Fix initialization of nfs_client cl_flags Other Fixes: - Fix performance issues with uncached readdir calls - Fix potential pointer dereferences in rpcrdma_ep_create - Fix nfs4_proc_get_locations() kernel-doc comment - Fix locking during sunrpc sysfs reads - Update my email address in the MAINTAINERS file to my new kernel.org email" * tag 'nfs-for-5.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC: lock against ->sock changing during sysfs read MAINTAINERS: Update my email address NFS: Fix nfs4_proc_get_locations() kernel-doc comment xprtrdma: fix pointer derefs in error cases of rpcrdma_ep_create NFS: Fix initialisation of nfs_client cl_flags field NFS: Avoid duplicate uncached readdir calls on eof NFS: Don't skip directory entries when doing uncached readdir NFS: Don't overfill uncached readdir pages
2022-02-08fscrypt: add functions for direct I/O supportEric Biggers
Encrypted files traditionally haven't supported DIO, due to the need to encrypt/decrypt the data. However, when the encryption is implemented using inline encryption (blk-crypto) instead of the traditional filesystem-layer encryption, it is straightforward to support DIO. In preparation for supporting this, add the following functions: - fscrypt_dio_supported() checks whether a DIO request is supported as far as encryption is concerned. Encrypted files will only support DIO when inline encryption is used and the I/O request is properly aligned; this function checks these preconditions. - fscrypt_limit_io_blocks() limits the length of a bio to avoid crossing a place in the file that a bio with an encryption context cannot cross due to a DUN discontiguity. This function is needed by filesystems that use the iomap DIO implementation (which operates directly on logical ranges, so it won't use fscrypt_mergeable_bio()) and that support FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32. Co-developed-by: Satya Tangirala <satyat@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220128233940.79464-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2022-02-08perf: Fix wrong name in comment for struct perf_cpu_contextAlexandru Elisei
Commit 0793a61d4df8 ("performance counters: core code") added the perf subsystem (then called Performance Counters) to Linux, creating the struct perf_cpu_context. The comment for the struct referred to it as a "struct perf_counter_cpu_context". Commit cdd6c482c9ff ("perf: Do the big rename: Performance Counters -> Performance Events") changed the comment to refer to a "struct perf_event_cpu_context", which was still the wrong name for the struct. Change the comment to say "struct perf_cpu_context". CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-3-alexandru.elisei@arm.com
2022-02-08sbitmap: Delete old sbitmap_queue_get_shallow()John Garry
Since __sbitmap_queue_get_shallow() was introduced in commit c05e66733788 ("sbitmap: add sbitmap_get_shallow() operation"), it has not been used. Delete __sbitmap_queue_get_shallow() and rename public __sbitmap_queue_get_shallow() -> sbitmap_queue_get_shallow() as it is odd to have public __foo but no foo at all. Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1644322024-105340-1-git-send-email-john.garry@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08lib/sbitmap: kill 'depth' from sbitmap_wordMing Lei
Only the last sbitmap_word can have different depth, and all the others must have same depth of 1U << sb->shift, so not necessary to store it in sbitmap_word, and it can be retrieved easily and efficiently by adding one internal helper of __map_depth(sb, index). Remove 'depth' field from sbitmap_word, then the annotation of ____cacheline_aligned_in_smp for 'word' isn't needed any more. Not see performance effect when running high parallel IOPS test on null_blk. This way saves us one cacheline(usually 64 words) per each sbitmap_word. Cc: Martin Wilck <martin.wilck@suse.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20220110072945.347535-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08VMCI: dma dg: add support for DMA datagrams sendsJorgen Hansen
Use DMA based send operation from the transmit buffer instead of the iowrite8_rep based datagram send when DMA datagrams are supported. The outgoing datagram is sent as inline data in the VMCI transmit buffer. Once the header has been configured, the send is initiated by writing the lower 32 bit of the buffer base address to the VMCI_DATA_OUT_LOW_ADDR register. Only then will the device process the header and the datagram itself. Following that, the driver busy waits (it isn't possible to sleep on the send path) for the header busy flag to change - indicating that the send is complete. Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Link: https://lore.kernel.org/r/20220207102725.2742-8-jhansen@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08VMCI: dma dg: allocate send and receive buffers for DMA datagramsJorgen Hansen
If DMA datagrams are used, allocate send and receive buffers in coherent DMA memory. This is done in preparation for the send and receive datagram operations, where the buffers are used for the exchange of data between driver and device. Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Link: https://lore.kernel.org/r/20220207102725.2742-7-jhansen@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08VMCI: dma dg: register dummy IRQ handlers for DMA datagramsJorgen Hansen
Register dummy interrupt handlers for DMA datagrams in preparation for DMA datagram receive operations. Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Link: https://lore.kernel.org/r/20220207102725.2742-6-jhansen@vmware.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>