2021-10-28  libbpf: Add typeless ksym support to gen_loader  (Kumar Kartikeya Dwivedi)
This uses the bpf_kallsyms_lookup_name helper added in previous patches to relocate typeless ksyms. The return value ENOENT can be ignored, and the value written to 'res' can be directly stored to the insn, as it is overwritten to 0 on lookup failure. For repeating symbols, we can simply copy the previously populated bpf_insn. Also, we need to take care to not close fds for typeless ksym_desc, so reuse the 'off' member's space to add a marker for typeless ksym and use that to skip them in cleanup_relos. We add an emit_ksym_relo_log helper that avoids duplicating common logging instructions between typeless and weak ksym (for a future commit). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com
2021-10-28  bpf: Add bpf_kallsyms_lookup_name helper  (Kumar Kartikeya Dwivedi)
This helper allows us to get the address of a kernel symbol from inside a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can relocate typeless ksym vars. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com
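For illustration, a minimal sketch of a BPF_PROG_TYPE_SYSCALL program using the new helper (the symbol name here is just an example; per the helper's contract, 'res' is set to 0 when the lookup fails):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    const char sym[] = "bpf_prog_fops";   /* example symbol */
    __u64 addr;                           /* left 0 if the lookup fails */

    SEC("syscall")
    int lookup_ksym(void *ctx)
    {
            /* name_sz includes the NUL terminator; flags must be 0 */
            return bpf_kallsyms_lookup_name(sym, sizeof(sym), 0, &addr);
    }

    char _license[] SEC("license") = "GPL";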
2021-10-28  Merge branch 'Implement bloom filter map'  (Alexei Starovoitov)
Joanne Koong says:

====================

This patchset adds a new kind of bpf map: the bloom filter map. Bloom filters are a space-efficient probabilistic data structure used to quickly test whether an element exists in a set. For a brief overview about how bloom filters work, https://en.wikipedia.org/wiki/Bloom_filter may be helpful.

One example use-case is an application leveraging a bloom filter map to determine whether a computationally expensive hashmap lookup can be avoided. If the element was not found in the bloom filter map, the hashmap lookup can be skipped.

This patchset includes benchmarks for testing the performance of the bloom filter for different entry sizes and different numbers of hash functions used, as well as comparisons for hashmap lookups with vs. without the bloom filter.

A high-level overview of this patchset is as follows:
1/5 - kernel changes for adding the bloom filter map
2/5 - libbpf changes for adding map_extra flags
3/5 - tests for the bloom filter map
4/5 - benchmarks for bloom filter lookup/update throughput and false positive rate
5/5 - benchmarks for how hashmap lookups perform with vs. without the bloom filter

v5 -> v6:
* in 1/5: remove "inline" from the hash function, add check in syscall to fail out in cases where map_extra is not 0 for non-bloom-filter maps, fix alignment matching issues, move "map_extra flags" comments to inside the bpf_attr struct, add bpf_map_info map_extra changes here, add map_extra assignment in bpf_map_get_info_by_fd, change hash value_size to u32 instead of a u64
* in 2/5: remove bpf_map_info map_extra changes, remove TODO comment about extending BTF arrays to cover u64s, cast to unsigned long long for %llx when printing out map_extra flags
* in 3/5: use __type(value, ...) instead of __uint(value_size, ...) for values and keys
* in 4/5: fix wrong bounds for the index when iterating through random values, update commit message to include update+lookup benchmark results for 8-byte and 64-byte value sizes, remove explicit global bool initialization to false for hashmap_use_bloom and count_false_hits variables

v4 -> v5:
* Change the "bitset map with bloom filter capabilities" to a bloom filter map with max_entries signifying the number of unique entries expected in the bloom filter, remove bitset tests
* Reduce verbiage by changing "bloom_filter" to "bloom", and renaming progs to more concise names
* in 2/5: remove "map_extra" from struct definitions that are frozen, create a "bpf_create_map_params" struct to propagate map_extra to the kernel at map creation time, change map_extra to __u64
* in 4/5: check pthread condition variable in a loop when generating initial map data, remove "err" checks where not pragmatic, generate random values for the hashmap in the setup() instead of in the bpf program, add check_args() for checking that there aren't more requested entries than possible unique entries for the specified value size
* in 5/5: update commit message with updated benchmark data

v3 -> v4:
* Generalize the bloom filter map to be a bitset map with bloom filter capabilities
* Add map_extra flags; pass in nr_hash_funcs through the lower 4 bits of map_extra for the bitset map
* Add tests for the bitset map (non-bloom-filter) functionality
* In the benchmarks, compute stats only as monotonic increases, and place stats in a struct instead of in a percpu_array bpf map

v2 -> v3:
* Add libbpf changes for supporting nr_hash_funcs, instead of passing the number of hash functions through map_flags.
* Separate the hashing logic in kernel/bpf/bloom_filter.c into a helper function

v1 -> v2:
* Remove libbpf changes, and pass the number of hash functions through map_flags instead.
* Default to using 5 hash functions if no number of hash functions is specified.
* Use set_bit instead of spinlocks in the bloom filter bitmap. This improved the speed significantly. For example, using 5 hash functions with 100k entries, there was roughly a 35% speed increase.
* Use jhash2 (instead of jhash) for u32-aligned value sizes. This increased the speed by roughly 5 to 15%. When using jhash2 on value sizes that are not u32-aligned (truncating any remainder bits), there was no noticeable difference.
* Add a test for using the bloom filter as an inner map.
* Reran the benchmarks; updated the commit messages to correspond to the new results.

====================

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
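Putting the pieces of the patchset together, a BPF program would define and use a bloom filter map roughly like this (map name, sizes, and section are illustrative, not taken from the patches):

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
            __uint(type, BPF_MAP_TYPE_BLOOM_FILTER);
            __uint(max_entries, 10000);   /* expected unique entries */
            __type(value, __u32);
            __uint(map_extra, 3);         /* lower 4 bits: number of hash functions */
    } bloom SEC(".maps");

    SEC("syscall")
    int bloom_demo(void *ctx)
    {
            __u32 val = 42;

            /* 0 -> possibly in the set; -ENOENT -> definitely not */
            if (bpf_map_peek_elem(&bloom, &val) == 0)
                    return 1;

            /* flags must be 0 (BPF_ANY) for the bloom filter map */
            return bpf_map_push_elem(&bloom, &val, 0);
    }

    char _license[] SEC("license") = "GPL";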
2021-10-28  evm: mark evm_fixmode as __ro_after_init  (Austin Kim)
The evm_fixmode is only configurable by command-line option and it is never modified outside initcalls, so declaring it with __ro_after_init is better. Signed-off-by: Austin Kim <austin.kim@lge.com> Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
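The pattern, slightly simplified from the evm code (the actual variable is an int; treat this as a sketch):

    static bool evm_fixmode __ro_after_init;

    static int __init evm_set_fixmode(char *str)
    {
            if (!strcmp(str, "fix"))
                    evm_fixmode = true;   /* last write happens in an initcall */
            return 1;
    }
    __setup("evm=", evm_set_fixmode);

After init completes, the page holding evm_fixmode is remapped read-only, so a stray or malicious late write faults instead of silently flipping the mode.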
2021-10-28  bpf,x86: Respect X86_FEATURE_RETPOLINE*  (Peter Zijlstra)
Current BPF codegen doesn't respect X86_FEATURE_RETPOLINE* flags and unconditionally emits a thunk call; this is sub-optimal and doesn't match the regular, compiler-generated, code. Update the i386 JIT to emit code equal to what the compiler emits for the regular kernel text (IOW. a plain THUNK call). Update the x86_64 JIT to emit code similar to the result of compiler and kernel rewrites as according to X86_FEATURE_RETPOLINE* flags: inlining RETPOLINE_AMD (lfence; jmp *%reg) and !RETPOLINE (jmp *%reg), while doing a THUNK call for RETPOLINE. This removes the hard-coded retpoline thunks and shrinks the generated code, leaving a single retpoline thunk definition in the kernel. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.614772675@infradead.org
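The resulting emit logic looks roughly like this (helper and macro names approximate the x86 JIT; a sketch, not the exact patch):

    static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
    {
            u8 *prog = *pprog;

            if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_AMD)) {
                    EMIT_LFENCE();                    /* lfence */
                    EMIT2(0xFF, 0xE0 + reg);          /* jmp *%reg */
            } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
                    /* plain thunk call, matching compiler-generated text */
                    emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
            } else {
                    EMIT2(0xFF, 0xE0 + reg);          /* jmp *%reg */
            }

            *pprog = prog;
    }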
2021-10-28  bpf,x86: Simplify computing label offsets  (Peter Zijlstra)
Take an idea from the 32bit JIT, which uses the multi-pass nature of the JIT to compute the instruction offsets on a prior pass in order to compute the relative jump offsets on a later pass. Application to the x86_64 JIT is slightly more involved because the offsets depend on program variables (such as callee_regs_used and stack_depth) and hence the computed offsets need to be kept in the context of the JIT. This removes, IMO quite fragile, code that hard-codes the offsets and tries to compute the length of variable parts of it. Convert both emit_bpf_tail_call_*() functions which have an out: label at the end. Additionally emit_bpf_tail_call_direct() also has a poke table entry, for which it computes the offset from the end (and thus already relies on the previous pass to have computed addrs[i]); also convert this to be a forward-based offset. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.552304864@infradead.org
2021-10-28  x86,bugs: Unconditionally allow spectre_v2=retpoline,amd  (Peter Zijlstra)
Currently Linux prevents usage of retpoline,amd on !AMD hardware; this is unfriendly and gets in the way of testing. Remove this restriction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.487348118@infradead.org
2021-10-28  x86/alternative: Add debug prints to apply_retpolines()  (Peter Zijlstra)
Make sure we can see the text changes when booting with 'debug-alternative'. Example output:

  [ ] SMP alternatives: retpoline at: __traceiter_initcall_level+0x1f/0x30 (ffffffff8100066f) len: 5 to: __x86_indirect_thunk_rax+0x0/0x20
  [ ] SMP alternatives: ffffffff82603e58: [2:5) optimized NOPs: ff d0 0f 1f 00
  [ ] SMP alternatives: ffffffff8100066f: orig: e8 cc 30 00 01
  [ ] SMP alternatives: ffffffff8100066f: repl: ff d0 0f 1f 00

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.422273830@infradead.org
2021-10-28  x86/alternative: Try inline spectre_v2=retpoline,amd  (Peter Zijlstra)
Try and replace retpoline thunk calls with:

  LFENCE
  CALL *%\reg

for spectre_v2=retpoline,amd. Specifically, the sequence above is 5 bytes for the low 8 registers, but 6 bytes for the high 8 registers. This means that unless the compilers prefix stuff the call with higher registers this replacement will fail. Luckily GCC strongly favours RAX for the indirect calls and most (95%+ for defconfig-x86_64) will be converted. OTOH clang strongly favours R11 and almost nothing gets converted. Note: it will also generate a correct replacement for the Jcc.d32 case, except unless the compilers start to prefix stuff that, it'll never fit. Specifically:

  Jncc.d8 1f
  LFENCE
  JMP *%\reg
1:

is 7-8 bytes long, where the original instruction in unpadded form is only 6 bytes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.359986601@infradead.org
2021-10-28  x86/alternative: Handle Jcc __x86_indirect_thunk_\reg  (Peter Zijlstra)
Handle the rare cases where the compiler (clang) does an indirect conditional tail-call using:

  Jcc __x86_indirect_thunk_\reg

For the !RETPOLINE case this can be rewritten to fit the original (6 byte) instruction like:

  Jncc.d8 1f
  JMP *%\reg
  NOP
1:

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.296470217@infradead.org
2021-10-28  x86/alternative: Implement .retpoline_sites support  (Peter Zijlstra)
Rewrite retpoline thunk call sites to be indirect calls for spectre_v2=off. This ensures spectre_v2=off is as near to a RETPOLINE=n build as possible. This is the replacement for objtool writing alternative entries to ensure the same and achieves feature-parity with the previous approach. One noteworthy feature is that it relies on the thunks to be in machine order to compute the register index. Specifically, this does not yet address the Jcc __x86_indirect_thunk_* calls generated by clang; a future patch will add this. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.232495794@infradead.org
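In shape, .retpoline_sites is a table of 32-bit relative offsets to the call sites, walked at boot like the other alternative tables (a sketch; patch_retpoline_site is a hypothetical name standing in for the real decode-and-rewrite logic):

    void __init_or_module apply_retpolines(s32 *start, s32 *end)
    {
            s32 *s;

            for (s = start; s < end; s++) {
                    void *addr = (void *)s + *s;

                    /* decode the insn at addr, recover the register index
                     * from the thunk's position in the machine-ordered
                     * thunk array, then rewrite for the chosen mitigation */
                    patch_retpoline_site(addr);
            }
    }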
2021-10-28  x86/retpoline: Create a retpoline thunk array  (Peter Zijlstra)
Stick all the retpolines in a single symbol and have the individual thunks as inner labels, this should guarantee thunk order and layout. Previously there were 16 (or rather 15 without rsp) separate symbols and a toolchain might reasonably expect it could displace them however it liked, with disregard for their relative position. However, now they're part of a larger symbol. Any change to their relative position would disrupt this larger _array symbol and thus not be sound. This is the same reasoning used for data symbols. On their own there is no guarantee about their relative position wrt one another, but we're still able to do arrays because an array as a whole is a single larger symbol. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.169659320@infradead.org
2021-10-28  x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h  (Peter Zijlstra)
Because it makes no sense to split the retpoline gunk over multiple headers. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.106290934@infradead.org
2021-10-28  x86/asm: Fixup odd GEN-for-each-reg.h usage  (Peter Zijlstra)
Currently GEN-for-each-reg.h usage leaves GEN defined, relying on any subsequent usage to start with #undef, which is rude. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120310.041792350@infradead.org
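For context, the header implements the usual x-macro pattern; with this fixup the cleanup is expected to live in the header itself (sketch):

    /* use site: define GEN, include the header once per expansion */
    #define GEN(reg) extern asmlinkage void __x86_indirect_thunk_ ## reg(void);
    #include <asm/GEN-for-each-reg.h>
    /* no trailing '#undef GEN' needed: the header now #undefs it itself */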
2021-10-28  x86/asm: Fix register order  (Peter Zijlstra)
Ensure the register order is correct; this allows for easy translation between register number and trampoline and vice-versa. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.978573921@infradead.org
2021-10-28  x86/retpoline: Remove unused replacement symbols  (Peter Zijlstra)
Now that objtool no longer creates alternatives, these replacement symbols are no longer needed, remove them. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.915051744@infradead.org
2021-10-28  objtool,x86: Replace alternatives with .retpoline_sites  (Peter Zijlstra)
Instead of writing complete alternatives, simply provide a list of all the retpoline thunk calls. Then the kernel is free to do with them as it pleases. Simpler code all-round. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.850007165@infradead.org
2021-10-28  objtool: Shrink struct instruction  (Peter Zijlstra)
Any one instruction can only ever call a single function, therefore insn->mcount_loc_node is superfluous and can use insn->call_node. This shrinks struct instruction, which is by far the most numerous structure objtool creates. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.785456706@infradead.org
2021-10-28  objtool: Explicitly avoid self modifying code in .altinstr_replacement  (Peter Zijlstra)
Assume ALTERNATIVE()s know what they're doing and do not change, or cause to change, instructions in .altinstr_replacement sections. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.722511775@infradead.org
2021-10-28  objtool: Classify symbols  (Peter Zijlstra)
In order to avoid calling str*cmp() on symbol names, over and over, do them all once upfront and store the result. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211026120309.658539311@infradead.org
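The idea in miniature (flag and prefix names follow the kernel's conventions but are a sketch): compare once when the symbol is created, then test cheap bitfields everywhere else.

    struct symbol {
            /* ... */
            u8 static_call_tramp : 1;
            u8 retpoline_thunk   : 1;
    };

    static void classify_symbol(struct symbol *sym)
    {
            if (!strncmp(sym->name, "__x86_indirect_thunk_", 21))
                    sym->retpoline_thunk = true;

            if (!strncmp(sym->name, STATIC_CALL_TRAMP_PREFIX_STR,
                         strlen(STATIC_CALL_TRAMP_PREFIX_STR)))
                    sym->static_call_tramp = true;
    }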
2021-10-28  Merge branch 'fixes' into next  (Ulf Hansson)
2021-10-28  mmc: tmio: reenable card irqs after the reset callback  (Wolfram Sang)
The reset callback may clear the internal card detect interrupts, so make sure to reenable them if needed. Fixes: b4d86f37eacb ("mmc: renesas_sdhi: do hard reset if possible") Reported-by: Biju Das <biju.das.jz@bp.renesas.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211028195149.8003-1-wsa+renesas@sang-engineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-28  bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out bloom filter  (Joanne Koong)
This patch adds benchmark tests for comparing the performance of hashmap lookups without the bloom filter vs. hashmap lookups with the bloom filter. Checking the bloom filter first for whether the element exists should overall enable a higher throughput for hashmap lookups, since if the element does not exist in the bloom filter, we can avoid a costly lookup in the hashmap. On average, using 5 hash functions in the bloom filter tended to perform the best across the widest range of different entry sizes. The benchmark results using 5 hash functions (running on 8 threads on a machine with one numa node, and taking the average of 3 runs) were roughly as follows:

value_size = 4 bytes -
  10k entries: 30% faster
  50k entries: 40% faster
  100k entries: 40% faster
  500k entries: 70% faster
  1 million entries: 90% faster
  5 million entries: 140% faster

value_size = 8 bytes -
  10k entries: 30% faster
  50k entries: 40% faster
  100k entries: 50% faster
  500k entries: 80% faster
  1 million entries: 100% faster
  5 million entries: 150% faster

value_size = 16 bytes -
  10k entries: 20% faster
  50k entries: 30% faster
  100k entries: 35% faster
  500k entries: 65% faster
  1 million entries: 85% faster
  5 million entries: 110% faster

value_size = 40 bytes -
  10k entries: 5% faster
  50k entries: 15% faster
  100k entries: 20% faster
  500k entries: 65% faster
  1 million entries: 75% faster
  5 million entries: 120% faster

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-6-joannekoong@fb.com
2021-10-28  bpf/benchs: Add benchmark tests for bloom filter throughput + false positive  (Joanne Koong)
This patch adds benchmark tests for the throughput (for lookups + updates) and the false positive rate of bloom filter lookups, as well as some minor refactoring of the bash script for running the benchmarks. These benchmarks show that as the number of hash functions increases, the throughput and the false positive rate of the bloom filter decrease. From the benchmark data, the approximate average false positive rates are roughly as follows:

  1 hash function   = ~30%
  2 hash functions  = ~15%
  3 hash functions  = ~5%
  4 hash functions  = ~2.5%
  5 hash functions  = ~1%
  6 hash functions  = ~0.5%
  7 hash functions  = ~0.35%
  8 hash functions  = ~0.15%
  9 hash functions  = ~0.1%
  10 hash functions = ~0%

For reference data, the benchmarks run on one thread on a machine with one numa node for 1 to 5 hash functions for 8-byte and 64-byte values are as follows (lookups and updates in M ops/s; the same columns apply to every table):

1 hash function:
  entries    8B lookups  8B updates  8B false+   64B lookups  64B updates  64B false+
  50k        51.1        33.6        24.15%      15.7         15.1         24.2%
  100k       51.0        33.4        24.04%      15.6         14.6         24.06%
  500k       50.5        33.1        27.45%      15.6         14.2         27.42%
  1 mil      49.7        32.9        27.45%      15.4         13.7         27.58%
  2.5 mil    47.2        31.8        30.94%      15.3         13.2         30.95%
  5 mil      41.1        28.1        31.01%      13.3         11.4         30.98%

2 hash functions:
  50k        34.1        20.1        9.13%       8.4          7.9          9.21%
  100k       33.7        18.9        9.13%       8.4          7.7          9.19%
  500k       32.7        18.1        12.61%      8.4          7.5          12.61%
  1 mil      30.6        18.9        12.54%      8.0          7.0          12.52%
  2.5 mil    25.3        16.7        16.77%      7.9          6.5          16.88%
  5 mil      20.8        14.7        16.78%      7.0          6.0          16.78%

3 hash functions:
  50k        25.1        14.6        7.65%       5.8          5.5          7.58%
  100k       24.7        14.1        7.71%       5.8          5.3          7.62%
  500k       22.9        13.9        2.62%       5.6          4.8          2.7%
  1 mil      19.8        12.6        2.60%       5.3          4.4          2.69%
  2.5 mil    16.2        10.7        4.49%       4.9          4.1          4.41%
  5 mil      18.8        9.2         4.45%       5.2          3.9          4.54%

4 hash functions:
  50k        19.7        11.1        1.01%       4.4          4.0          1.00%
  100k       19.5        10.9        1.00%       4.3          3.9          0.97%
  500k       18.2        10.6        2.05%       4.3          3.7          2.05%
  1 mil      15.5        9.6         1.99%       4.0          3.4          1.99%
  2.5 mil    13.8        7.7         3.91%       3.7          3.6          3.78%
  5 mil      13.0        6.9         3.93%       3.5          3.7          3.39%

5 hash functions:
  50k        16.4        9.1         0.78%       3.5          3.2          0.77%
  100k       16.3        9.0         0.79%       3.5          3.2          0.78%
  500k       15.1        8.8         1.82%       3.4          3.0          1.78%
  1 mil      13.2        7.8         1.81%       3.2          2.8          1.80%
  2.5 mil    10.5        5.9         0.29%       3.2          2.4          0.28%
  5 mil      9.6         5.7         0.30%       3.2          2.7          0.30%

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-5-joannekoong@fb.com
2021-10-28  selftests/bpf: Add bloom filter map test cases  (Joanne Koong)
This patch adds test cases for bpf bloom filter maps. They include tests checking against invalid operations by userspace, tests for using the bloom filter map as an inner map, and a bpf program that queries the bloom filter map for values added by a userspace program. Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211027234504.30744-4-joannekoong@fb.com
2021-10-28libbpf: Add "map_extra" as a per-map-type extra flagJoanne Koong
This patch adds the libbpf infrastructure for supporting a per-map-type "map_extra" field, whose definition will be idiosyncratic depending on map type. For example, for the bloom filter map, the lower 4 bits of map_extra is used to denote the number of hash functions. Please note that until libbpf 1.0 is here, the "bpf_create_map_params" struct is used as a temporary means for propagating the map_extra field to the kernel. Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com
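From an application, the accessors added here would presumably be used along these lines (file and map names are hypothetical):

    #include <bpf/libbpf.h>

    int setup_bloom(void)
    {
            struct bpf_object *obj = bpf_object__open_file("bloom.bpf.o", NULL);
            struct bpf_map *map;

            if (libbpf_get_error(obj))
                    return -1;

            map = bpf_object__find_map_by_name(obj, "bloom");
            if (!map)
                    return -1;

            /* for the bloom filter map: low 4 bits = number of hash functions */
            bpf_map__set_map_extra(map, 3);

            return bpf_object__load(obj);
    }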
2021-10-28  bpf: Add bloom filter map implementation  (Joanne Koong)
This patch adds the kernel-side changes for the implementation of a bpf bloom filter map. The bloom filter map supports peek (determining whether an element is present in the map) and push (adding an element to the map) operations. These operations are exposed to userspace applications through the already existing syscalls in the following way:

  BPF_MAP_LOOKUP_ELEM -> peek
  BPF_MAP_UPDATE_ELEM -> push

The bloom filter map does not have keys, only values. In light of this, the bloom filter map's API matches that of queue/stack maps: user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM, which correspond internally to bpf_map_peek_elem/bpf_map_push_elem, and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem APIs to query or add an element to the bloom filter map. When the bloom filter map is created, it must be created with a key_size of 0. For updates, the user will pass in the element to add to the map as the value, with a NULL key. For lookups, the user will pass in the element to query in the map as the value, with a NULL key. In the verifier layer, this requires us to modify the argument type of a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in the syscall layer, we need to copy over the user value so that in bpf_map_peek_elem, we know which specific value to query. A few things to please take note of:

* If there are any concurrent lookups + updates, the user is responsible for synchronizing this to ensure no false negative lookups occur.
* The number of hashes to use for the bloom filter is configurable from userspace. If no number is specified, the default used will be 5 hash functions. The benchmarks later in this patchset can help compare the performance of using different numbers of hashes on different entry sizes. In general, using more hashes decreases both the false positive rate and the speed of a lookup.
* Deleting an element in the bloom filter map is not supported.
* The bloom filter map may be used as an inner map.
* The "max_entries" size that is specified at map creation time is used to approximate a reasonable bitmap size for the bloom filter, and is not otherwise strictly enforced. If the user wishes to insert more entries into the bloom filter than "max_entries", they may do so, but they should be aware that this may lead to a higher false positive rate.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
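From userspace the mapping onto the existing syscalls then looks roughly like this (illustrative; note the NULL key on both operations):

    #include <bpf/bpf.h>

    int bloom_demo(int map_fd)
    {
            __u32 val = 42;

            /* push == BPF_MAP_UPDATE_ELEM with a NULL key */
            if (bpf_map_update_elem(map_fd, NULL, &val, 0))
                    return -1;

            /* peek == BPF_MAP_LOOKUP_ELEM with a NULL key:
             * returns 0 if possibly present, -1/ENOENT if definitely absent */
            return bpf_map_lookup_elem(map_fd, NULL, &val);
    }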
2021-10-28  Merge branch irq/misc-5.16 into irq/irqchip-next  (Marc Zyngier)
* irq/misc-5.16:
  Misc irqchip fixes for 5.16:
  - MAINTAINERS update for the ARM VIC DT binding
  - Allow drivers using the IRQCHIP_PLATFORM_DRIVER_BEGIN/END infrastructure to use COMPILE_TEST without CONFIG_OF
  - DT updates
  - Detangle h8300 linux/irqchip.h inclusion

  h8300: Fix linux/irqchip.h include mess
  dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings
  irqchip: Fix compile-testing without CONFIG_OF
  MAINTAINERS: update arm,vic.yaml reference

Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-28  h8300: Fix linux/irqchip.h include mess  (Marc Zyngier)
h8300 drags linux/irqchip.h from asm/irq.h, which is in general a bad idea (asm/*.h should avoid dragging linux/*.h, as it is usually supposed to work the other way around). Move the inclusion of linux/irqchip.h to the single location where it actually matters in the arch code. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211028172849.GA701812@roeck-us.net
2021-10-28  docs: submitting-patches: make section about the Link: tag more explicit  (Thorsten Leemhuis)
Mention the 'Link' tag in the section about adding URLs to the commit msg, to make it clearer they "_primarily_ [...] should be about background", as Linus recently stated (see the link below). That makes the explanation also easier to find with a text search. For the same reason and to improve comprehensibility provide an example, too. Slightly improve the text at the same time to make it more obvious developers are meant to add links to issue reports in mailing list archives, as those allow regression tracking efforts to automatically check which bugs got resolved. Move the section also downwards slightly, to reduce jumping back and forth between aspects relevant for the top and the bottom part of the commit msg. Link: https://lore.kernel.org/lkml/CAHk-=wgBhyLhQLPem1vybKNt7BKP+=qF=veBgc7VirZaXn4FUw@mail.gmail.com/ CC: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info> Reviewed-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org> Link: https://lore.kernel.org/r/27105768dc19b395e7c8e7a80d056d1ff9c570d0.1635152553.git.linux@leemhuis.info [jc: tweaked wording following Konstantin's recommendation] Signed-off-by: Jonathan Corbet <corbet@lwn.net>
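A footer following the updated guidance would look something like this (hypothetical message-id and names):

    Reported-by: Some Reporter <reporter@example.com>
    Link: https://lore.kernel.org/r/some-bug-report@example.com/
    Signed-off-by: Some Developer <developer@example.com>

Because the Link: points at the original report in the list archives, regression-tracking tooling can automatically tell which bugs a commit resolved.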
2021-10-28  Merge tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm  (Linus Torvalds)
Pull drm fixes from Dave Airlie:
"Quiet but not too quiet, I blame Halloween. The first set of amdgpu fixes missed last week, hence why this has a few more of them, it's mostly display fixes for new GPUs and some debugfs OOB stuff. The i915 patches have one to remove a tracepoint possible issue before it's a real problem, the others around cflush and display are cc'ed to stable as well. Otherwise it's just a few misc fixes.

Summary:

MAINTAINERS:
 - Fix the path pattern

ttm:
 - Fix fence leak in ttm_transfered_destroy

core:
 - Add GPD Win3 rotation quirk

i915:
 - Remove unconditional clflushes
 - Fix oops on boot due to sync state on disabled DP encoders
 - Revert backend specific data added to tracepoints
 - Remove useless and incorrect memory frequency calculation

panel:
 - Add quirk for Aya Neo 2021

selftest:
 - Reset property count for each drm damage selftest so a full run will work correctly

amdgpu:
 - Fix two potential out of bounds writes in debugfs
 - Fix revision handling for Yellow Carp
 - Display fixes for Yellow Carp
 - Display fixes for DCN 3.1"

* tag 'drm-fixes-2021-10-29' of git://anongit.freedesktop.org/drm/drm: (21 commits)
  MAINTAINERS: dri-devel is for all of drivers/gpu
  drm/i915: Revert 'guc_id' from i915_request tracepoint
  drm/amd/display: Fix deadlock when falling back to v2 from v3
  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31
  drm/amdgpu: Fix even more out of bound writes from debugfs
  drm: panel-orientation-quirks: Add quirk for GPD Win3
  drm/i915/dp: Skip the HW readout of DPCD on disabled encoders
  drm/i915: Catch yet another unconditioal clflush
  drm/i915: Convert unconditional clflush to drm_clflush_virt_range()
  drm/i915/selftests: Properly reset mock object propers for each test
  drm: panel-orientation-quirks: Add quirk for Aya Neo 2021
  drm/ttm: fix memleak in ttm_transfered_destroy
  drm/amdgpu: support B0&B1 external revision id for yellow carp
  drm/amd/display: Moved dccg init to after bios golden init
  drm/amd/display: Increase watermark latencies for DCN3.1
  drm/amd/display: increase Z9 latency to workaround underflow in Z9
  drm/amd/display: Require immediate flip support for DCN3.1 planes
  drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1
  drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1
  drm/amdgpu: fix out of bounds write
  ...
2021-10-29  MAINTAINERS: dri-devel is for all of drivers/gpu  (Daniel Vetter)
Somehow we only have a list of subdirectories, which apparently made it harder for folks to find the gpu maintainers. Fix that. References: https://lore.kernel.org/dri-devel/YXrAAZlxxStNFG%2FK@phenom.ffwll.local/ Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211028170857.4029606-1-daniel.vetter@ffwll.ch
2021-10-29  Merge tag 'drm-intel-fixes-2021-10-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes  (Dave Airlie)
drm/i915 fixes for v5.15 final:
- Remove unconditional clflushes
- Fix oops on boot due to sync state on disabled DP encoders
- Revert backend specific data added to tracepoints
- Remove useless and incorrect memory frequency calculation

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8735olh27y.fsf@intel.com
2021-10-28  drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bits  (Alex Deucher)
The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
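The fix itself is essentially a one-liner of this shape (sketch):

    /* SI parts can only address 40 bits, not 44 */
    r = dma_set_mask_and_coherent(adev->dev, DMA_BIT_MASK(40));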
2021-10-28  drm/amd/display: MST support for DPIA  (Meenakshikumar Somasundaram)
[Why]
- DPIA MST slot registers are not programmed during payload allocation and hence MST does not work with DPIA.
- HPD RX interrupts are not handled for DPIA.
[How]
- Added inbox command to program the MST slots whenever payload allocation happens for DPIA links.
- Added support for handling HPD RX interrupts.

Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amdgpu: Fix even more out of bound writes from debugfs  (Patrik Jakobsson)
CVE-2021-42327 was fixed by:

  commit f23750b5b3d98653b31d4469592935ef6364ad67
  Author: Thelford Williams <tdwilliamsiv@gmail.com>
  Date:   Wed Oct 13 16:04:13 2021 -0400

      drm/amdgpu: fix out of bounds write

but amdgpu_dm_debugfs.c contains more of the same issue, so fix the remaining ones.

v2:
* Add missing fix in dp_max_bpc_write (Harry Wentland)

Fixes: 918698d5c2b5 ("drm/amd/display: Return the number of bytes parsed than allocated")
Signed-off-by: Patrik Jakobsson <pjakobsson@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amdgpu/discovery: add SDMA IP instance info for soc15 parts  (Alex Deucher)
Add secondary instance version info for soc15 parts. Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amdgpu/discovery: add UVD/VCN IP instance info for soc15 parts  (Alex Deucher)
Add secondary instance version info for vega20, arcturus, and aldebaran. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amdgpu/UAPI: rearrange header to better align related items  (Alex Deucher)
Move the RAS query parameters to align with the INFO query where they are used. No functional change. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: Enable dpia in dmub only for DCN31 B0  (Jude Shih)
[Why] DMUB binary is common for both A0 and B0. Hence, driver should notify FW about the support for DPIA in B0. [How] Added dpia_supported bit in dmub_fw_boot_options and will be set only for B0. Assign dpia_supported to true before dm_dmub_hw_init in B0 case. v2: fix build without CONFIG_DRM_AMD_DC_DCN (Alex) Signed-off-by: Jude Shih <shenshih@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: Fix USB4 hot plug crash issue  (Jude Shih)
[Why]
The notify data from the outbox is corrupt: the notify type should be 2 (HPD) instead of 0 (No data). We copied the address instead of the value, and the memory might be freed at the end of the outbox IRQ.
[How]
Allocate memory for the notify and copy the whole content from the outbox to the hpd handler function.

Fixes: 88f52b1fff891e ("drm/amd/display: Support for SET_CONFIG processing with DMUB")
Signed-off-by: Jude Shih <shenshih@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: Fix deadlock when falling back to v2 from v3  (Nicholas Kazlauskas)
[Why] A deadlock in the kernel occurs when we fallback from the V3 to V2 add_topology_to_display or remove_topology_to_display because they both try to acquire the dtm_mutex but recursive locking isn't supported on mutex_lock(). [How] Make the mutex_lock/unlock more fine grained and move them up such that they're only required for the psp invocation itself. Fixes: bf62221e9d0e ("drm/amd/display: Add DCN3.1 HDCP support") Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
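Conceptually, instead of holding dtm_mutex across the whole v3 call (whose v2 fallback locks it again), the lock is scoped to the psp invocation itself (hedged sketch; names approximate the amdgpu code):

    static int psp_dtm_invoke_locked(struct psp_context *psp, u32 cmd_id)
    {
            int ret;

            /* hold the lock only around the psp call, so a caller that
             * falls back from v3 to v2 never re-enters with it held */
            mutex_lock(&psp->dtm_context.mutex);
            ret = psp_dtm_invoke(psp, cmd_id);
            mutex_unlock(&psp->dtm_context.mutex);

            return ret;
    }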
2021-10-28  drm/amd/display: Fallback to clocks which meet requested voltage on DCN31  (Michael Strauss)
[WHY]
On certain configs, SMU clock table voltages don't match, which causes the parser to behave incorrectly by leaving dcfclk and socclk table entries unpopulated.
[HOW]
Currently the function that finds the corresponding clock for a given voltage only checks for exact voltage level matches. In the case that no match gets found, the parser now falls back to searching for the max clock which meets the requested voltage (i.e. its corresponding voltage is below requested).

Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
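The fallback amounts to a max-below-threshold scan over the clock table (an illustrative sketch with an assumed table layout):

    /* return the clock for an exact voltage match, else the max clock
     * whose voltage is below the requested one; 0 if nothing qualifies */
    static unsigned int find_clk_for_voltage(const unsigned int clks[],
                                             const unsigned int voltages[],
                                             int num_entries,
                                             unsigned int req_voltage)
    {
            unsigned int best = 0;
            int i;

            for (i = 0; i < num_entries; i++) {
                    if (voltages[i] == req_voltage)
                            return clks[i];
                    if (voltages[i] < req_voltage && clks[i] > best)
                            best = clks[i];
            }

            return best;
    }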
2021-10-28  drm/amd/display: move FPU associated DCN301 code to DML folder  (Qingqing Zhuo)
[Why & How] As part of the FPU isolation work documented in https://patchwork.freedesktop.org/series/93042/, isolate code that uses FPU in DCN301 to DML, where all FPU code should locate. Cc: Christian König <christian.koenig@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Zhan Liu <Zhan.Liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: fix link training regression for 1 or 2 lane  (Wenjing Liu)
[why]
We have a regression that causes maximize lane settings to use uninitialized data from unused lanes. This will cause link training to fail for 1 or 2 lanes because the lane adjust is populated incorrectly sometimes.

v2: fix build without CONFIG_DRM_AMD_DC_DCN (Alex)

Reviewed-by: Eric Yang <eric.yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: add two lane settings training options  (Wenjing Liu)
[why]
option 1: disallow different lanes to have different lane settings
option 2: dpcd lane settings will always use the same hw lane settings even if it doesn't match requested lane adjust

Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: decouple hw_lane_settings from dpcd_lane_settings  (Wenjing Liu)
[why] As DP features expands, we have encountered many situations where we must configure a different DPCD lane setting from hw lane settings we output. The change is to decouple hw lane settings from dpcd lane settings to provide flexibility to configure dpcd and hw individually. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: implement decide lane settings  (Wenjing Liu)
[why] Decouple lane settings decision logic all to its own function. The function takes in lane adjust array and link training settings and decide what hw lane setting and dpcd lane setting should be used. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: adopt DP2.0 LT SCR revision 8  (Wenjing Liu)
[how] revision 8 SCR requires DP Source to write TPS2 and FFE lane adjustment in one 5 byte write aux transaction. It specifies to read aux rd interval value as soon as we turn on TPS1 pattern. Cc: Wayne Lin <wayne.lin@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-28  drm/amd/display: FEC configuration for dpia links in MST mode  (Meenakshikumar Somasundaram)
[Why] To fix the check condition for fec enable for dpia links in MST mode. [How] dc_link_should_enable_fec() to be used to check whether fec should be enabled in MST mode. Cc: Wayne Lin <wayne.lin@amd.com> Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>