summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-07Merge branch kvm-arm64/vfio-normal-nc into kvmarm/nextOliver Upton
* kvm-arm64/vfio-normal-nc: : Normal-NC support for vfio-pci @ stage-2, courtesy of Ankit Agrawal : : KVM's policy to date has been that any and all MMIO mapping at stage-2 : is treated as Device-nGnRE. This is primarily done due to concerns of : the guest triggering uncontainable failures in the system if they manage : to tickle the device / memory system the wrong way, though this is : unnecessarily restrictive for devices that can be reasoned as 'safe'. : : Unsurprisingly, the Device-* mapping can really hurt the performance of : assigned devices that can handle Gathering, and can be an outright : correctness issue if the guest driver does unaligned accesses. : : Rather than opening the floodgates to the full ecosystem of devices that : can be exposed to VMs, take the conservative approach and allow PCI : devices to be mapped as Normal-NC since it has been determined to be : 'safe'. vfio: Convey kvm that the vfio-pci device is wc safe KVM: arm64: Set io memory s2 pte as normalnc for vfio pci device mm: Introduce new flag to indicate wc safe KVM: arm64: Introduce new flag for non-cacheable IO memory Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-07Merge branch kvm-arm64/lpi-xarray into kvmarm/nextOliver Upton
* kvm-arm64/lpi-xarray: : xarray-based representation of vgic LPIs : : KVM's linked-list of LPI state has proven to be a bottleneck in LPI : injection paths, due to lock serialization when acquiring / releasing a : reference on an IRQ. : : Start the tedious process of reworking KVM's LPI injection by replacing : the LPI linked-list with an xarray, leveraging this to allow RCU readers : to walk it outside of the spinlock. KVM: arm64: vgic: Don't acquire the lpi_list_lock in vgic_put_irq() KVM: arm64: vgic: Ensure the irq refcount is nonzero when taking a ref KVM: arm64: vgic: Rely on RCU protection in vgic_get_lpi() KVM: arm64: vgic: Free LPI vgic_irq structs in an RCU-safe manner KVM: arm64: vgic: Use atomics to count LPIs KVM: arm64: vgic: Get rid of the LPI linked-list KVM: arm64: vgic-its: Walk the LPI xarray in vgic_copy_lpi_list() KVM: arm64: vgic-v3: Iterate the xarray to find pending LPIs KVM: arm64: vgic: Use xarray to find LPI in vgic_get_lpi() KVM: arm64: vgic: Store LPIs in an xarray Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-07Merge branch kvm-arm64/vm-configuration into kvmarm/nextOliver Upton
* kvm-arm64/vm-configuration: (29 commits) : VM configuration enforcement, courtesy of Marc Zyngier : : Userspace has gained the ability to control the features visible : through the ID registers, yet KVM didn't take this into account as the : effective feature set when determing trap / emulation behavior. This : series adds: : : - Mechanism for testing the presence of a particular CPU feature in the : guest's ID registers : : - Infrastructure for computing the effective value of VNCR-backed : registers, taking into account the RES0 / RES1 bits for a particular : VM configuration : : - Implementation of 'fine-grained UNDEF' controls that shadow the FGT : register definitions. KVM: arm64: Don't initialize idreg debugfs w/ preemption disabled KVM: arm64: Fail the idreg iterator if idregs aren't initialized KVM: arm64: Make build-time check of RES0/RES1 bits optional KVM: arm64: Add debugfs file for guest's ID registers KVM: arm64: Snapshot all non-zero RES0/RES1 sysreg fields for later checking KVM: arm64: Make FEAT_MOPS UNDEF if not advertised to the guest KVM: arm64: Make AMU sysreg UNDEF if FEAT_AMU is not advertised to the guest KVM: arm64: Make PIR{,E0}_EL1 UNDEF if S1PIE is not advertised to the guest KVM: arm64: Make TLBI OS/Range UNDEF if not advertised to the guest KVM: arm64: Streamline save/restore of HFG[RW]TR_EL2 KVM: arm64: Move existing feature disabling over to FGU infrastructure KVM: arm64: Propagate and handle Fine-Grained UNDEF bits KVM: arm64: Add Fine-Grained UNDEF tracking information KVM: arm64: Rename __check_nv_sr_forward() to triage_sysreg_trap() KVM: arm64: Use the xarray as the primary sysreg/sysinsn walker KVM: arm64: Register AArch64 system register entries with the sysreg xarray KVM: arm64: Always populate the trap configuration xarray KVM: arm64: nv: Move system instructions to their own sys_reg_desc array KVM: arm64: Drop the requirement for XARRAY_MULTI KVM: arm64: nv: Turn encoding ranges into discrete XArray stores ... Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-07Merge branch kvm-arm64/misc into kvmarm/nextOliver Upton
* kvm-arm64/misc: : Miscellaneous updates : : - Fix handling of features w/ nonzero safe values in set_id_regs : selftest : : - Cleanup the unused kern_hyp_va() asm macro : : - Differentiate nVHE and hVHE in boot-time message : : - Several selftests cleanups : : - Drop bogus return value from kvm_arch_create_vm_debugfs() : : - Make save/restore of SPE and TRBE control registers affect EL1 state : in hVHE mode : : - Typos KVM: arm64: Fix TRFCR_EL1/PMSCR_EL1 access in hVHE mode KVM: selftests: aarch64: Remove unused functions from vpmu test KVM: arm64: Fix typos KVM: Get rid of return value from kvm_arch_create_vm_debugfs() KVM: selftests: Print timer ctl register in ISTATUS assertion KVM: selftests: Fix GUEST_PRINTF() format warnings in ARM code KVM: arm64: removed unused kern_hyp_va asm macro KVM: arm64: add comments to __kern_hyp_va KVM: arm64: print Hyp mode KVM: arm64: selftests: Handle feature fields with nonzero minimum value correctly Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-07Merge branch kvm-arm64/feat_e2h0 into kvmarm/nextOliver Upton
* kvm-arm64/feat_e2h0: : Support for FEAT_E2H0, courtesy of Marc Zyngier : : As described in the cover letter: : : Since ARMv8.1, the architecture has grown the VHE feature, which makes : EL2 a superset of EL1. With ARMv9.5 (and retroactively allowed from : ARMv8.1), the architecture allows implementations to have VHE as the : *only* implemented behaviour, meaning that HCR_EL2.E2H can be : implemented as RES1. As a follow-up, HCR_EL2.NV1 can also be : implemented as RES0, making the VHE-ness of the architecture : recursive. : : This series adds support for detecting the architectural feature of E2H : being RES1, leveraging the existing infrastructure for handling : out-of-spec CPUs that are VHE-only. Additionally, the (incomplete) NV : infrastructure in KVM is updated to enforce E2H=1 for guest hypervisors : on implementations that do not support NV1. arm64: cpufeatures: Fix FEAT_NV check when checking for FEAT_NV1 arm64: cpufeatures: Only check for NV1 if NV is present arm64: cpufeatures: Add missing ID_AA64MMFR4_EL1 to __read_sysreg_by_encoding() KVM: arm64: Handle Apple M2 as not having HCR_EL2.NV1 implemented KVM: arm64: Force guest's HCR_EL2.E2H RES1 when NV1 is not implemented KVM: arm64: Expose ID_AA64MMFR4_EL1 to guests arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative arm64: cpufeature: Detect HCR_EL2.NV1 being RES0 arm64: cpufeature: Add ID_AA64MMFR4_EL1 handling arm64: sysreg: Add layout for ID_AA64MMFR4_EL1 arm64: cpufeature: Correctly display signed override values arm64: cpufeatures: Correctly handle signed values arm64: Add macro to compose a sysreg field value Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-06dt-bindings: interrupt-controller: fsl,intmux: Include power-domains supportFrank Li
Enable the power-domains property for the fsl,intmux node. This addition accommodates i.MX8QXP, i.MX8QM, and i.MX8DXL, which utilize the power-domains property. Incorporating this eliminates DTB_CHECK errors in relevant device tree source files. Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240229200911.712572-1-Frank.Li@nxp.com Signed-off-by: Rob Herring <robh@kernel.org>
2024-03-06soc: fsl: qbman: Remove RESERVEDMEM_OF_DECLARE usageRob Herring
There is no reason to use RESERVEDMEM_OF_DECLARE() as the initialization hook just saves off the base address and size. Use of RESERVEDMEM_OF_DECLARE() is reserved for non-driver code and initialization which must be done early. For qbman, retrieving the address and size can be done in probe just as easily. Link: https://lore.kernel.org/r/20240201192931.1324130-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2024-03-06bpf, riscv64/cfi: Support kCFI + BPF on riscv64Puranjay Mohan
The riscv BPF JIT doesn't emit proper kCFI prologues for BPF programs and struct_ops trampolines when CONFIG_CFI_CLANG is enabled. This causes CFI failures when calling BPF programs and can even crash the kernel due to invalid memory accesses. Example crash: root@rv-selftester:~/bpf# ./test_progs -a dummy_st_ops Unable to handle kernel paging request at virtual address ffffffff78204ffc Oops [#1] Modules linked in: bpf_testmod(OE) [....] CPU: 3 PID: 356 Comm: test_progs Tainted: P OE 6.8.0-rc1 #1 Hardware name: riscv-virtio,qemu (DT) epc : bpf_struct_ops_test_run+0x28c/0x5fc ra : bpf_struct_ops_test_run+0x26c/0x5fc epc : ffffffff82958010 ra : ffffffff82957ff0 sp : ff200000007abc80 gp : ffffffff868d6218 tp : ff6000008d87b840 t0 : 000000000000000f t1 : 0000000000000000 t2 : 000000002005793e s0 : ff200000007abcf0 s1 : ff6000008a90fee0 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : ffffffff868dba26 a6 : 0000000000000001 a7 : 0000000052464e43 s2 : 00007ffffc0a95f0 s3 : ff6000008a90fe80 s4 : ff60000084c24c00 s5 : ffffffff78205000 s6 : ff60000088750648 s7 : ff20000000035008 s8 : fffffffffffffff4 s9 : ffffffff86200610 s10: 0000000000000000 s11: 0000000000000000 t3 : ffffffff8483dc30 t4 : ffffffff8483dc10 t5 : ffffffff8483dbf0 t6 : ffffffff8483dbd0 status: 0000000200000120 badaddr: ffffffff78204ffc cause: 000000000000000d [<ffffffff82958010>] bpf_struct_ops_test_run+0x28c/0x5fc [<ffffffff805083ee>] bpf_prog_test_run+0x170/0x548 [<ffffffff805029c8>] __sys_bpf+0x2d2/0x378 [<ffffffff804ff570>] __riscv_sys_bpf+0x5c/0x120 [<ffffffff8000e8fe>] syscall_handler+0x62/0xe4 [<ffffffff83362df6>] do_trap_ecall_u+0xc6/0x27c [<ffffffff833822c4>] ret_from_exception+0x0/0x64 Code: b603 0109 b683 0189 b703 0209 8493 0609 157d 8d65 (a303) ffca ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs Implement proper kCFI prologues for the BPF programs and callbacks and drop __nocfi for riscv64. Fix the trampoline generation code to emit kCFI prologue when a struct_ops trampoline is being prepared. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240303170207.82201-2-puranjay12@gmail.com
2024-03-06Merge branch 'libbpf-type-suffixes-and-autocreate-flag-for-struct_ops-maps'Andrii Nakryiko
Eduard Zingerman says: ==================== libbpf: type suffixes and autocreate flag for struct_ops maps Tweak struct_ops related APIs to allow the following features: - specify version suffixes for stuct_ops map types; - share same BPF program between several map definitions with different local BTF types, assuming only maps with same kernel BTF type would be selected for load; - toggle autocreate flag for struct_ops maps; - automatically toggle autoload for struct_ops programs referenced from struct_ops maps, depending on autocreate status of the corresponding map; - use SEC("?.struct_ops") and SEC("?.struct_ops.link") to define struct_ops maps with autocreate == false after object open. This would allow loading programs like below: SEC("struct_ops/foo") int BPF_PROG(foo) { ... } SEC("struct_ops/bar") int BPF_PROG(bar) { ... } struct bpf_testmod_ops___v1 { int (*foo)(void); }; struct bpf_testmod_ops___v2 { int (*foo)(void); int (*bar)(void); }; /* Assume kernel type name to be 'test_ops' */ SEC(".struct_ops.link") struct test_ops___v1 map_v1 = { /* Program 'foo' shared by maps with * different local BTF type */ .foo = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 map_v2 = { .foo = (void *)foo, .bar = (void *)bar }; Assuming the following tweaks are done before loading: /* to load v1 */ bpf_map__set_autocreate(skel->maps.map_v1, true); bpf_map__set_autocreate(skel->maps.map_v2, false); /* to load v2 */ bpf_map__set_autocreate(skel->maps.map_v1, false); bpf_map__set_autocreate(skel->maps.map_v2, true); Patch #8 ties autocreate and autoload flags for struct_ops maps and programs. Changelog: - v3 [3] -> v4: - changes for multiple styling suggestions from Andrii; - patch #5: libbpf log capture now happens for LIBBPF_INFO and LIBBPF_WARN messages and does not depend on verbosity flags (Andrii); - patch #6: fixed runtime crash caused by conflict with newly added test case struct_ops_multi_pages; - patch #7: fixed free of possibly uninitialized pointer (Daniel) - patch #8: simpler algorithm to detect which programs to autoload (Andrii); - patch #9: added assertions for autoload flag after object load (Andrii); - patch #12: DATASEC name rewrite in libbpf is now done inplace, no new strings added to BTF (Andrii); - patch #14: allow any printable characters in DATASEC names when kernel validates BTF (Andrii) - v2 [2] -> v3: - moved patch #8 logic to be fully done on load (requested by Andrii in offlist discussion); - in patch #9 added test case for shadow vars and autocreate/autoload interaction. - v1 [1] -> v2: - fixed memory leak in patch #1 (Kui-Feng); - improved error messages in patch #2 (Martin, Andrii); - in bad_struct_ops selftest from patch #6 added .test_2 map member setup (David); - added utility functions to capture libbpf log from selftests (David) - in selftests replaced usage of ...__open_and_load by separate calls to ..._open() and ..._load() (Andrii); - removed serial_... in selftest definitions (Andrii); - improved comments in selftest struct_ops_autocreate from patch #7 (David); - removed autoload toggling logic incompatible with shadow variables from bpf_map__set_autocreate(), instead struct_ops programs autoload property is computed at struct_ops maps load phase, see patch #8 (Kui-Feng, Martin, Andrii); - added support for SEC("?.struct_ops") and SEC("?.struct_ops.link") (Andrii). [1] https://lore.kernel.org/bpf/20240227204556.17524-1-eddyz87@gmail.com/ [2] https://lore.kernel.org/bpf/20240302011920.15302-1-eddyz87@gmail.com/ [3] https://lore.kernel.org/bpf/20240304225156.24765-1-eddyz87@gmail.com/ ==================== Link: https://lore.kernel.org/r/20240306104529.6453-1-eddyz87@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-06selftests/bpf: Test cases for '?' in BTF namesEduard Zingerman
Two test cases to verify that '?' and other printable characters are allowed in BTF DATASEC names: - DATASEC with name "?.foo bar:buz" should be accepted; - type with name "?foo" should be rejected. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-16-eddyz87@gmail.com
2024-03-06bpf: Allow all printable characters in BTF DATASEC namesEduard Zingerman
The intent is to allow libbpf to use SEC("?.struct_ops") to identify struct_ops maps that are optional, e.g. like in the following BPF code: SEC("?.struct_ops") struct test_ops optional_map = { ... }; Which yields the following BTF: ... [13] DATASEC '?.struct_ops' size=0 vlen=... ... To load such BTF libbpf rewrites DATASEC name before load. After this patch the rewrite won't be necessary. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-15-eddyz87@gmail.com
2024-03-06selftests/bpf: Test case for SEC("?.struct_ops")Eduard Zingerman
Check that "?.struct_ops" and "?.struct_ops.link" section names define struct_ops maps with autocreate == false after open. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-14-eddyz87@gmail.com
2024-03-06libbpf: Rewrite btf datasec names starting from '?'Eduard Zingerman
Optional struct_ops maps are defined using question mark at the start of the section name, e.g.: SEC("?.struct_ops") struct test_ops optional_map = { ... }; This commit teaches libbpf to detect if kernel allows '?' prefix in datasec names, and if it doesn't then to rewrite such names by replacing '?' with '_', e.g.: DATASEC ?.struct_ops -> DATASEC _.struct_ops Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-13-eddyz87@gmail.com
2024-03-06libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link")Eduard Zingerman
Allow using two new section names for struct_ops maps: - SEC("?.struct_ops") - SEC("?.struct_ops.link") To specify maps that have bpf_map->autocreate == false after open. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-12-eddyz87@gmail.com
2024-03-06libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_typeEduard Zingerman
The next patch would add two new section names for struct_ops maps. To make working with multiple struct_ops sections more convenient: - remove fields like elf_state->st_ops_{shndx,link_shndx}; - mark section descriptions hosting struct_ops as elf_sec_desc->sec_type == SEC_ST_OPS; After these changes struct_ops sections could be processed uniformly by iterating bpf_object->efile.secs entries. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-11-eddyz87@gmail.com
2024-03-06selftests/bpf: Verify struct_ops autoload/autocreate syncEduard Zingerman
Check that autocreate flags of struct_ops map cause autoload of struct_ops corresponding programs: - when struct_ops program is referenced only from a map for which autocreate is set to false, that program should not be loaded; - when struct_ops program with autoload == false is set to be used from a map with autocreate == true using shadow var, that program should be loaded; - when struct_ops program is not referenced from any map object load should fail. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-10-eddyz87@gmail.com
2024-03-06libbpf: Sync progs autoload with maps autocreate for struct_ops mapsEduard Zingerman
Automatically select which struct_ops programs to load depending on which struct_ops maps are selected for automatic creation. E.g. for the BPF code below: SEC("struct_ops/test_1") int BPF_PROG(foo) { ... } SEC("struct_ops/test_2") int BPF_PROG(bar) { ... } SEC(".struct_ops.link") struct test_ops___v1 A = { .foo = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 B = { .foo = (void *)foo, .bar = (void *)bar, }; And the following libbpf API calls: bpf_map__set_autocreate(skel->maps.A, true); bpf_map__set_autocreate(skel->maps.B, false); The autoload would be enabled for program 'foo' and disabled for program 'bar'. During load, for each struct_ops program P, referenced from some struct_ops map M: - set P.autoload = true if M.autocreate is true for some M; - set P.autoload = false if M.autocreate is false for all M; - don't change P.autoload, if P is not referenced from any map. Do this after bpf_object__init_kern_struct_ops_maps() to make sure that shadow vars assignment is done. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-9-eddyz87@gmail.com
2024-03-06selftests/bpf: Test autocreate behavior for struct_ops mapsEduard Zingerman
Check that bpf_map__set_autocreate() can be used to disable automatic creation for struct_ops maps. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-8-eddyz87@gmail.com
2024-03-06selftests/bpf: Bad_struct_ops testEduard Zingerman
When loading struct_ops programs kernel requires BTF id of the struct_ops type and member index for attachment point inside that type. This makes impossible to use same BPF program in several struct_ops maps that have different struct_ops type. Check if libbpf rejects such BPF objects files. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-7-eddyz87@gmail.com
2024-03-06selftests/bpf: Utility functions to capture libbpf log in test_progsEduard Zingerman
Several test_progs tests already capture libbpf log in order to check for some expected output, e.g bpf_tcp_ca.c, kfunc_dynptr_param.c, log_buf.c and a few others. This commit provides a, hopefully, simple API to capture libbpf log w/o necessity to define new print callback in each test: /* Creates a global memstream capturing INFO and WARN level output * passed to libbpf_print_fn. * Returns 0 on success, negative value on failure. * On failure the description is printed using PRINT_FAIL and * current test case is marked as fail. */ int start_libbpf_log_capture(void) /* Destroys global memstream created by start_libbpf_log_capture(). * Returns a pointer to captured data which has to be freed. * Returned buffer is null terminated. */ char *stop_libbpf_log_capture(void) The intended usage is as follows: if (start_libbpf_log_capture()) return; use_libbpf(); char *log = stop_libbpf_log_capture(); ASSERT_HAS_SUBSTR(log, "... expected ...", "expected some message"); free(log); As a safety measure, free(start_libbpf_log_capture()) is invoked in the epilogue of the test_progs.c:run_one_test(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-6-eddyz87@gmail.com
2024-03-06selftests/bpf: Test struct_ops map definition with type suffixEduard Zingerman
Extend struct_ops_module test case to check if it is possible to use '___' suffixes for struct_ops type specification. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240306104529.6453-5-eddyz87@gmail.com
2024-03-06libbpf: Honor autocreate flag for struct_ops mapsEduard Zingerman
Skip load steps for struct_ops maps not marked for automatic creation. This should allow to load bpf object in situations like below: SEC("struct_ops/foo") int BPF_PROG(foo) { ... } SEC("struct_ops/bar") int BPF_PROG(bar) { ... } struct test_ops___v1 { int (*foo)(void); }; struct test_ops___v2 { int (*foo)(void); int (*does_not_exist)(void); }; SEC(".struct_ops.link") struct test_ops___v1 map_for_old = { .test_1 = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 map_for_new = { .test_1 = (void *)foo, .does_not_exist = (void *)bar }; Suppose program is loaded on old kernel that does not have definition for 'does_not_exist' struct_ops member. After this commit it would be possible to load such object file after the following tweaks: bpf_program__set_autoload(skel->progs.bar, false); bpf_map__set_autocreate(skel->maps.map_for_new, false); Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240306104529.6453-4-eddyz87@gmail.com
2024-03-06libbpf: Tie struct_ops programs to kernel BTF ids, not to local idsEduard Zingerman
Enforce the following existing limitation on struct_ops programs based on kernel BTF id instead of program-local BTF id: struct_ops BPF prog can be re-used between multiple .struct_ops & .struct_ops.link as long as it's the same struct_ops struct definition and the same function pointer field This allows reusing same BPF program for versioned struct_ops map definitions, e.g.: SEC("struct_ops/test") int BPF_PROG(foo) { ... } struct some_ops___v1 { int (*test)(void); }; struct some_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct some_ops___v1 a = { .test = foo } SEC(".struct_ops.link") struct some_ops___v2 b = { .test = foo } Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-3-eddyz87@gmail.com
2024-03-06libbpf: Allow version suffixes (___smth) for struct_ops typesEduard Zingerman
E.g. allow the following struct_ops definitions: struct bpf_testmod_ops___v1 { int (*test)(void); }; struct bpf_testmod_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct bpf_testmod_ops___v1 a = { .test = ... } SEC(".struct_ops.link") struct bpf_testmod_ops___v2 b = { .test = ... } Where both bpf_testmod_ops__v1 and bpf_testmod_ops__v2 would be resolved as 'struct bpf_testmod_ops' from kernel BTF. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-2-eddyz87@gmail.com
2024-03-06Merge branch 'bpf-introduce-may_goto-and-cond_break'Andrii Nakryiko
Alexei Starovoitov says: ==================== bpf: Introduce may_goto and cond_break From: Alexei Starovoitov <ast@kernel.org> v5 -> v6: - Rename BPF_JMA to BPF_JCOND - Addressed Andrii's review comments v4 -> v5: - rewrote patch 1 to avoid fake may_goto_reg and use 'u32 may_goto_cnt' instead. This way may_goto handling is similar to bpf_loop() processing. - fixed bug in patch 2 that RANGE_WITHIN should not use rold->type == NOT_INIT as a safe signal. - patch 3 fixed negative offset computation in cond_break macro - using bpf_arena and cond_break recompiled lib/glob.c as bpf prog and it works! It will be added as a selftest to arena series. v3 -> v4: - fix drained issue reported by John. may_goto insn could be implemented with sticky state (once reaches 0 it stays 0), but the verifier shouldn't assume that. It has to explore both branches. Arguably drained iterator state shouldn't be there at all. bpf_iter_css_next() is not sticky. Can be fixed, but auditing all iterators for stickiness. That's an orthogonal discussion. - explained JMA name reasons in patch 1 - fixed test_progs-no_alu32 issue and added another test v2 -> v3: Major change - drop bpf_can_loop() kfunc and introduce may_goto instruction instead kfunc is a function call while may_goto doesn't consume any registers and LLVM can produce much better code due to less register pressure. - instead of counting from zero to BPF_MAX_LOOPS start from it instead and break out of the loop when count reaches zero - use may_goto instruction in cond_break macro - recognize that 'exact' state comparison doesn't need to be truly exact. regsafe() should ignore precision and liveness marks, but range_within logic is safe to use while evaluating open coded iterators. ==================== Link: https://lore.kernel.org/r/20240306031929.42666-1-alexei.starovoitov@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-06selftests/bpf: Test may_gotoAlexei Starovoitov
Add tests for may_goto instruction via cond_break macro. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240306031929.42666-5-alexei.starovoitov@gmail.com
2024-03-06bpf: Add cond_break macroAlexei Starovoitov
Use may_goto instruction to implement cond_break macro. Ideally the macro should be written as: asm volatile goto(".byte 0xe5; .byte 0; .short %l[l_break] ... .long 0; but LLVM doesn't recognize fixup of 2 byte PC relative yet. Hence use asm volatile goto(".byte 0xe5; .byte 0; .long %l[l_break] ... .short 0; that produces correct asm on little endian. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240306031929.42666-4-alexei.starovoitov@gmail.com
2024-03-06bpf: Recognize that two registers are safe when their ranges matchAlexei Starovoitov
When open code iterators, bpf_loop or may_goto are used the following two states are equivalent and safe to prune the search: cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf)) old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf)) In other words "exact" state match should ignore liveness and precision marks, since open coded iterator logic didn't complete their propagation, reg_old->type == NOT_INIT && reg_cur->type != NOT_INIT is also not safe to prune while looping, but range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr is safe to rely on. Avoid doing such comparison when regular infinite loop detection logic is used, otherwise bounded loop logic will declare such "infinite loop" as false positive. Such example is in progs/verifier_loops1.c not_an_inifinite_loop(). Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240306031929.42666-3-alexei.starovoitov@gmail.com
2024-03-06Merge branch 'mm-enforce-ioremap-address-space-and-introduce-sparse-vm_area'Andrii Nakryiko
Alexei Starovoitov says: ==================== mm: Enforce ioremap address space and introduce sparse vm_area From: Alexei Starovoitov <ast@kernel.org> v3 -> v4 - dropped VM_XEN patch for now. It will be in the follow up. - fixed constant as pointed out by Mike v2 -> v3 - added Christoph's reviewed-by to patch 1 - cap commit log lines to 75 chars - factored out common checks in patch 3 into helper - made vm_area_unmap_pages() return void There are various users of kernel virtual address space: vmalloc, vmap, ioremap, xen. - vmalloc use case dominates the usage. Such vm areas have VM_ALLOC flag and these areas are treated differently by KASAN. - the areas created by vmap() function should be tagged with VM_MAP (as majority of the users do). - ioremap areas are tagged with VM_IOREMAP and vm area start is aligned to size of the area unlike vmalloc/vmap. - there is also xen usage that is marked as VM_IOREMAP, but it doesn't call ioremap_page_range() unlike all other VM_IOREMAP users. To clean this up a bit, enforce that ioremap_page_range() checks the range and VM_IOREMAP flag. In addition BPF would like to reserve regions of kernel virtual address space and populate it lazily, similar to xen use cases. For that reason, introduce VM_SPARSE flag and vm_area_[un]map_pages() helpers to populate this sparse area. In the end the /proc/vmallocinfo will show "vmalloc" "vmap" "ioremap" "sparse" categories for different kinds of address regions. ioremap, sparse will return zero when dumped through /proc/kcore ==================== Link: https://lore.kernel.org/r/20240305030516.41519-1-alexei.starovoitov@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-06bpf: Introduce may_goto instructionAlexei Starovoitov
Introduce may_goto instruction that from the verifier pov is similar to open coded iterators bpf_for()/bpf_repeat() and bpf_loop() helper, but it doesn't iterate any objects. In assembly 'may_goto' is a nop most of the time until bpf runtime has to terminate the program for whatever reason. In the current implementation may_goto has a hidden counter, but other mechanisms can be used. For programs written in C the later patch introduces 'cond_break' macro that combines 'may_goto' with 'break' statement and has similar semantics: cond_break is a nop until bpf runtime has to break out of this loop. It can be used in any normal "for" or "while" loop, like for (i = zero; i < cnt; cond_break, i++) { The verifier recognizes that may_goto is used in the program, reserves additional 8 bytes of stack, initializes them in subprog prologue, and replaces may_goto instruction with: aux_reg = *(u64 *)(fp - 40) if aux_reg == 0 goto pc+off aux_reg -= 1 *(u64 *)(fp - 40) = aux_reg may_goto instruction can be used by LLVM to implement __builtin_memcpy, __builtin_strcmp. may_goto is not a full substitute for bpf_for() macro. bpf_for() doesn't have induction variable that verifiers sees, so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded. But when the code is written as: for (i = 0; i < 100; cond_break, i++) the verifier see 'i' as precise constant zero, hence cond_break (aka may_goto) doesn't help to converge the loop. A static or global variable can be used as a workaround: static int zero = 0; for (i = zero; i < 100; cond_break, i++) // works! may_goto works well with arena pointers that don't need to be bounds checked on access. Load/store from arena returns imprecise unbounded scalar and loops with may_goto pass the verifier. Reserve new opcode BPF_JMP | BPF_JCOND for may_goto insn. JCOND stands for conditional pseudo jump. Since goto_or_nop insn was proposed, it may use the same opcode. may_goto vs goto_or_nop can be distinguished by src_reg: code = BPF_JMP | BPF_JCOND src_reg = 0 - may_goto src_reg = 1 - goto_or_nop Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Tested-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com
2024-03-06mm: Introduce VM_SPARSE kind and vm_area_[un]map_pages().Alexei Starovoitov
vmap/vmalloc APIs are used to map a set of pages into contiguous kernel virtual space. get_vm_area() with appropriate flag is used to request an area of kernel address range. It's used for vmalloc, vmap, ioremap, xen use cases. - vmalloc use case dominates the usage. Such vm areas have VM_ALLOC flag. - the areas created by vmap() function should be tagged with VM_MAP. - ioremap areas are tagged with VM_IOREMAP. BPF would like to extend the vmap API to implement a lazily-populated sparse, yet contiguous kernel virtual space. Introduce VM_SPARSE flag and vm_area_map_pages(area, start_addr, count, pages) API to map a set of pages within a given area. It has the same sanity checks as vmap() does. It also checks that get_vm_area() was created with VM_SPARSE flag which identifies such areas in /proc/vmallocinfo and returns zero pages on read through /proc/kcore. The next commits will introduce bpf_arena which is a sparsely populated shared memory region between bpf program and user space process. It will map privately-managed pages into a sparse vm area with the following steps: // request virtual memory region during bpf prog verification area = get_vm_area(area_size, VM_SPARSE); // on demand vm_area_map_pages(area, kaddr, kend, pages); vm_area_unmap_pages(area, kaddr, kend); // after bpf program is detached and unloaded free_vm_area(area); Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/bpf/20240305030516.41519-3-alexei.starovoitov@gmail.com
2024-03-07netfilter: nft_ct: fix l3num expectations with inet pseudo familyFlorian Westphal
Following is rejected but should be allowed: table inet t { ct expectation exp1 { [..] l3proto ip Valid combos are: table ip t, l3proto ip table ip6 t, l3proto ip6 table inet t, l3proto ip OR l3proto ip6 Disallow inet pseudeo family, the l3num must be a on-wire protocol known to conntrack. Retain NFPROTO_INET case to make it clear its rejected intentionally rather as oversight. Fixes: 8059918a1377 ("netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-03-07netfilter: nf_tables: reject constant set with timeoutPablo Neira Ayuso
This set combination is weird: it allows for elements to be added/deleted, but once bound to the rule it cannot be updated anymore. Eventually, all elements expire, leading to an empty set which cannot be updated anymore. Reject this flags combination. Cc: stable@vger.kernel.org Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-03-07netfilter: nf_tables: disallow anonymous set with timeout flagPablo Neira Ayuso
Anonymous sets are never used with timeout from userspace, reject this. Exception to this rule is NFT_SET_EVAL to ensure legacy meters still work. Cc: stable@vger.kernel.org Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support") Reported-by: lonial con <kongln9170@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-03-06serial: 8250_dw: Replace ACPI device check by a quirkAndy Shevchenko
Instead of checking for APMC0D08 ACPI device presence, use a quirk based on driver data. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240306143322.3291123-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-06mmc: core: make mmc_host_class constantRicardo B. Marliere
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the mmc_host_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Link: https://lore.kernel.org/r/20240305-class_cleanup-mmc-v1-1-4a66e7122ff3@marliere.net Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-03-06mmc: Merge branch fixes into nextUlf Hansson
Merge the mmc fixes for v6.8-rc[n] into the next branch, to allow them to get tested together with the new mmc changes that are targeted for v6.9. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-03-06Bluetooth: Add new quirk for broken read key length on ATS2851Vinicius Peixoto
The ATS2851 controller erroneously reports support for the "Read Encryption Key Length" HCI command. This makes it unable to connect to any devices, since this command is issued by the kernel during the connection process in response to an "Encryption Change" HCI event. Add a new quirk (HCI_QUIRK_BROKEN_ENC_KEY_SIZE) to hint that the command is unsupported, preventing it from interrupting the connection process. This is the error log from btmon before this patch: > HCI Event: Encryption Change (0x08) plen 4 Status: Success (0x00) Handle: 2048 Address: ... Encryption: Enabled with E0 (0x01) < HCI Command: Read Encryption Key Size (0x05|0x0008) plen 2 Handle: 2048 Address: ... > HCI Event: Command Status (0x0f) plen 4 Read Encryption Key Size (0x05|0x0008) ncmd 1 Status: Unknown HCI Command (0x01) Signed-off-by: Vinicius Peixoto <nukelet64@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: mgmt: remove NULL check in add_ext_adv_params_complete()Roman Smirnov
Remove the cmd pointer NULL check in add_ext_adv_params_complete() because it occurs earlier in add_ext_adv_params(). This check is also unnecessary because the pointer is dereferenced just before it. Found by Linux Verification Center (linuxtesting.org) with Svace. Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: mgmt: remove NULL check in mgmt_set_connectable_complete()Roman Smirnov
Remove the cmd pointer NULL check in mgmt_set_connectable_complete() because it occurs earlier in set_connectable(). This check is also unnecessary because the pointer is dereferenced just before it. Found by Linux Verification Center (linuxtesting.org) with Svace. Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: btusb: Add support Mediatek MT7920Peter Tsao
This patch is added support Mediatek MT7920 The firmware location of MT7920 will set to /lib/firmware/mediatek/ The information in /sys/kernel/debug/usb/devices about MT7920U Bluetooth device is listed as the below T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 12 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0e8d ProdID=7920 Rev= 1.00 S: Manufacturer=MediaTek Inc. S: Product=Wireless_Device S: SerialNumber=000000000 C:* #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA A: FirstIf#= 0 IfCount= 3 Cls=e0(wlcon) Sub=01 Prot=01 I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=125us E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I: If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=83(I) Atr=01(Isoc) MxPS= 63 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 63 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 64 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 64 Ivl=125us I: If#= 2 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none) E: Ad=8a(I) Atr=03(Int.) MxPS= 512 Ivl=125us E: Ad=0a(O) Atr=03(Int.) MxPS= 512 Ivl=125us Signed-off-by: Peter Tsao <peter.tsao@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: btmtk: Add MODULE_FIRMWARE() for MT7922Takashi Iwai
Since dracut refers to the module info for defining the required firmware files and btmtk driver doesn't provide the firmware info for MT7922, the generate initrd misses the firmware, resulting in the broken Bluetooth. This patch simply adds the MODULE_FIRMWARE() for the missing entry for covering that. Link: https://bugzilla.suse.com/show_bug.cgi?id=1214133 Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: btnxpuart: Fix btnxpuart_closeMarcel Ziswiler
Fix scheduling while atomic BUG in btnxpuart_close(), properly purge the transmit queue and free the receive skb. [ 10.973809] BUG: scheduling while atomic: kworker/u9:0/80/0x00000002 ... [ 10.980740] CPU: 3 PID: 80 Comm: kworker/u9:0 Not tainted 6.8.0-rc7-0.0.0-devel-00005-g61fdfceacf09 #1 [ 10.980751] Hardware name: Toradex Verdin AM62 WB on Dahlia Board (DT) [ 10.980760] Workqueue: hci0 hci_power_off [bluetooth] [ 10.981169] Call trace: ... [ 10.981363] uart_update_mctrl+0x58/0x78 [ 10.981373] uart_dtr_rts+0x104/0x114 [ 10.981381] tty_port_shutdown+0xd4/0xdc [ 10.981396] tty_port_close+0x40/0xbc [ 10.981407] uart_close+0x34/0x9c [ 10.981414] ttyport_close+0x50/0x94 [ 10.981430] serdev_device_close+0x40/0x50 [ 10.981442] btnxpuart_close+0x24/0x98 [btnxpuart] [ 10.981469] hci_dev_close_sync+0x2d8/0x718 [bluetooth] [ 10.981728] hci_dev_do_close+0x2c/0x70 [bluetooth] [ 10.981862] hci_power_off+0x20/0x64 [bluetooth] Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets") Cc: stable@vger.kernel.org Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Reviewed-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: ISO: Clean up returns values in iso_connect_ind()Dan Carpenter
This function either returns 0 or HCI_LM_ACCEPT. Make it clearer which returns are which and delete the "lm" variable because it is no longer required. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: fix use-after-free in accessing skb after sending itPauli Virtanen
hci_send_cmd_sync first sends skb and then tries to clone it. However, the driver may have already freed the skb at that point. Fix by cloning the sent_cmd cloned just above, instead of the original. Log: ================================================================ BUG: KASAN: slab-use-after-free in __copy_skb_header+0x1a/0x240 ... Call Trace: .. __skb_clone+0x59/0x2c0 hci_cmd_work+0x3b3/0x3d0 [bluetooth] process_one_work+0x459/0x900 ... Allocated by task 129: ... __alloc_skb+0x1ae/0x220 __hci_cmd_sync_sk+0x44c/0x7a0 [bluetooth] __hci_cmd_sync_status+0x24/0xb0 [bluetooth] set_cig_params_sync+0x778/0x7d0 [bluetooth] ... Freed by task 0: ... kmem_cache_free+0x157/0x3c0 __usb_hcd_giveback_urb+0x11e/0x1e0 usb_giveback_urb_bh+0x1ad/0x2a0 tasklet_action_common.isra.0+0x259/0x4a0 __do_softirq+0x15b/0x5a7 ================================================================ Fixes: 2615fd9a7c25 ("Bluetooth: hci_sync: Fix overwriting request callback") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: af_bluetooth: Fix deadlockLuiz Augusto von Dentz
Attemting to do sock_lock on .recvmsg may cause a deadlock as shown bellow, so instead of using sock_sock this uses sk_receive_queue.lock on bt_sock_ioctl to avoid the UAF: INFO: task kworker/u9:1:121 blocked for more than 30 seconds. Not tainted 6.7.6-lemon #183 Workqueue: hci0 hci_rx_work Call Trace: <TASK> __schedule+0x37d/0xa00 schedule+0x32/0xe0 __lock_sock+0x68/0xa0 ? __pfx_autoremove_wake_function+0x10/0x10 lock_sock_nested+0x43/0x50 l2cap_sock_recv_cb+0x21/0xa0 l2cap_recv_frame+0x55b/0x30a0 ? psi_task_switch+0xeb/0x270 ? finish_task_switch.isra.0+0x93/0x2a0 hci_rx_work+0x33a/0x3f0 process_one_work+0x13a/0x2f0 worker_thread+0x2f0/0x410 ? __pfx_worker_thread+0x10/0x10 kthread+0xe0/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> Fixes: 2e07e8348ea4 ("Bluetooth: af_bluetooth: Fix Use-After-Free in bt_sock_recvmsg") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: bnep: Fix out-of-bound accessLuiz Augusto von Dentz
This fixes attempting to access past ethhdr.h_source, although it seems intentional to copy also the contents of h_proto this triggers out-of-bound access problems with the likes of static analyzer, so this instead just copy ETH_ALEN and then proceed to use put_unaligned to copy h_proto separetely. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: btusb: Fix memory leakLuiz Augusto von Dentz
This checks if CONFIG_DEV_COREDUMP is enabled before attempting to clone the skb and also make sure btmtk_process_coredump frees the skb passed following the same logic. Fixes: 0b7015132878 ("Bluetooth: btusb: mediatek: add MediaTek devcoredump support") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: msft: Fix memory leakLuiz Augusto von Dentz
Fix leaking buffer allocated to send MSFT_OP_LE_MONITOR_ADVERTISEMENT. Fixes: 9e14606d8f38 ("Bluetooth: msft: Extended monitor tracking by address filter") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-03-06Bluetooth: hci_core: Fix possible buffer overflowLuiz Augusto von Dentz
struct hci_dev_info has a fixed size name[8] field so in the event that hdev->name is bigger than that strcpy would attempt to write past its size, so this fixes this problem by switching to use strscpy. Fixes: dcda165706b9 ("Bluetooth: hci_core: Fix build warnings") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>