summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-01-25virtchnl: remove unused structure declarationJesse Brandeburg
Nothing uses virtchnl_msg, just remove it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-25net/smc: De-tangle ism and smc device initializationStefan Raspl
The struct device for ISM devices was part of struct smcd_dev. Move to struct ism_dev, provide a new API call in struct smcd_ops, and convert existing SMCD code accordingly. Furthermore, remove struct smcd_dev from struct ism_dev. This is the final part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25s390/ism: Consolidate SMC-D-related codeStefan Raspl
The ism module had SMC-D-specific code sprinkled across the entire module. We are now consolidating the SMC-D-specific parts into the latter parts of the module, so it becomes more clear what code is intended for use with ISM, and which parts are glue code for usage in the context of SMC-D. This is the fourth part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/smc: Separate SMC-D and ISM APIsStefan Raspl
We separate the code implementing the struct smcd_ops API in the ISM device driver from the functions that may be used by other exploiters of ISM devices. Note: We start out small, and don't offer the whole breadth of the ISM device for public use, as many functions are specific to or likely only ever used in the context of SMC-D. This is the third part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/smc: Register SMC-D as ISM clientStefan Raspl
Register the smc module with the new ism device driver API. This is the second part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25net/ism: Add new API for client registrationStefan Raspl
Add a new API that allows other drivers to concurrently access ISM devices. To do so, we introduce a new API that allows other modules to register for ISM device usage. Furthermore, we move the GID to struct ism, where it belongs conceptually, and rename and relocate struct smcd_event to struct ism_event. This is the first part of a bigger overhaul of the interfaces between SMC and ISM. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-25s390/ism: Introduce struct ism_dmbStefan Raspl
Conceptually, a DMB is a structure that belongs to ISM devices. However, SMC currently 'owns' this structure. So future exploiters of ISM devices would be forced to include SMC headers to work - which is just weird. Therefore, we switch ISM to struct ism_dmb, introduce a new public header with the definition (will be populated with further API calls later on), and, add a thin wrapper to please SMC. Since structs smcd_dmb and ism_dmb are identical, we can simply convert between the two for now. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-24bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listenerJakub Sitnicki
A listening socket linked to a sockmap has its sk_prot overridden. It points to one of the struct proto variants in tcp_bpf_prots. The variant depends on the socket's family and which sockmap programs are attached. A child socket cloned from a TCP listener initially inherits their sk_prot. But before cloning is finished, we restore the child's proto to the listener's original non-tcp_bpf_prots one. This happens in tcp_create_openreq_child -> tcp_bpf_clone. Today, in tcp_bpf_clone we detect if the child's proto should be restored by checking only for the TCP_BPF_BASE proto variant. This is not correct. The sk_prot of listening socket linked to a sockmap can point to to any variant in tcp_bpf_prots. If the listeners sk_prot happens to be not the TCP_BPF_BASE variant, then the child socket unintentionally is left if the inherited sk_prot by tcp_bpf_clone. This leads to issues like infinite recursion on close [1], because the child state is otherwise not set up for use with tcp_bpf_prot operations. Adjust the check in tcp_bpf_clone to detect all of tcp_bpf_prots variants. Note that it wouldn't be sufficient to check the socket state when overriding the sk_prot in tcp_bpf_update_proto in order to always use the TCP_BPF_BASE variant for listening sockets. Since commit b8b8315e39ff ("bpf, sockmap: Remove unhash handler for BPF sockmap usage") it is possible for a socket to transition to TCP_LISTEN state while already linked to a sockmap, e.g. connect() -> insert into map -> connect(AF_UNSPEC) -> listen(). [1]: https://lore.kernel.org/all/00000000000073b14905ef2e7401@google.com/ Fixes: e80251555f0b ("tcp_bpf: Don't let child socket inherit parent protocol ops on copy") Reported-by: syzbot+04c21ed96d861dccc5cd@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230113-sockmap-fix-v2-2-1e0ee7ac2f90@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-24bpf: Allow trusted args to walk struct when checking BTF IDsDavid Vernet
When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier currently enforces that the top-level type must match when calling the kfunc. In other words, the verifier does not allow the BPF program to pass a bitwise equivalent struct, despite it being allowed according to the C standard. For example, if you have the following type: struct nf_conn___init { struct nf_conn ct; }; The C standard stipulates that it would be safe to pass a struct nf_conn___init to a kfunc expecting a struct nf_conn. The verifier currently disallows this, however, as semantically kfuncs may want to enforce that structs that have equivalent types according to the C standard, but have different BTF IDs, are not able to be passed to kfuncs expecting one or the other. For example, struct nf_conn___init may not be queried / looked up, as it is allocated but may not yet be fully initialized. On the other hand, being able to pass types that are equivalent according to the C standard will be useful for other types of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a series of kfuncs will be added which allow programs to do bitwise queries on cpumasks that are either allocated by the program (in which case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as its first element), or a cpumask that was allocated by the main kernel (in which case it will just be a straight cpumask_t, as in task->cpus_ptr). Having the two types of cpumasks allows us to distinguish between the two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t cannot be. On the other hand, a struct bpf_cpumask can of course be queried in the exact same manner as a cpumask_t, with e.g. bpf_cpumask_test_cpu(). If we were to enforce that top level types match, then a user that's passing a struct bpf_cpumask to a read-only cpumask_t argument would have to cast with something like bpf_cast_to_kern_ctx() (which itself would need to be updated to expect the alias, and currently it only accommodates a single alias per prog type). Additionally, not specifying KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a struct bpf_cpumask *, and another as a struct cpumask * (i.e. cpumask_t). In order to enable this, this patch relaxes the constraint that a KF_TRUSTED_ARGS kfunc must have strict type matching, and instead only enforces strict type matching if a type is observed to be a "no-cast alias" (i.e., that the type names are equivalent, but one is suffixed with ___init). Additionally, in order to try and be conservative and match existing behavior / expectations, this patch also enforces strict type checking for acquire kfuncs. We were already enforcing it for release kfuncs, so this should also improve the consistency of the semantics for kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230120192523.3650503-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-24bpf: Enable annotating trusted nested pointersDavid Vernet
In kfuncs, a "trusted" pointer is a pointer that the kfunc can assume is safe, and which the verifier will allow to be passed to a KF_TRUSTED_ARGS kfunc. Currently, a KF_TRUSTED_ARGS kfunc disallows any pointer to be passed at a nonzero offset, but sometimes this is in fact safe if the "nested" pointer's lifetime is inherited from its parent. For example, the const cpumask_t *cpus_ptr field in a struct task_struct will remain valid until the task itself is destroyed, and thus would also be safe to pass to a KF_TRUSTED_ARGS kfunc. While it would be conceptually simple to enable this by using BTF tags, gcc unfortunately does not yet support this. In the interim, this patch enables support for this by using a type-naming convention. A new BTF_TYPE_SAFE_NESTED macro is defined in verifier.c which allows a developer to specify the nested fields of a type which are considered trusted if its parent is also trusted. The verifier is also updated to account for this. A patch with selftests will be added in a follow-on change, along with documentation for this feature. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230120192523.3650503-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-24ipv6: Make ip6_route_output_flags_noref() static.Guillaume Nault
This function is only used in net/ipv6/route.c and has no reason to be visible outside of it. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/50706db7f675e40b3594d62011d9363dce32b92e.1674495822.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-24Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six fixes, all in drivers. The biggest are the UFS devfreq fixes which address a lock inversion and the two iscsi_tcp fixes which try to prevent a use after free from userspace still accessing an area which the kernel has released (seen by KASAN)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: device_handler: alua: Remove a might_sleep() annotation scsi: iscsi_tcp: Fix UAF during login when accessing the shost ipaddress scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress scsi: ufs: core: Fix devfreq deadlocks scsi: hpsa: Fix allocation size for scsi_host_alloc() scsi: target: core: Fix warning on RT kernels
2023-01-24netlink: fix spelling mistake in dump size assertJakub Kicinski
Commit 2c7bc10d0f7b ("netlink: add macro for checking dump ctx size") misspelled the name of the assert as asset, missing an R. Reported-by: Ido Schimmel <idosch@idosch.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20230123222224.732338-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-24thermal: ACPI: Add ACPI trip point routinesRafael J. Wysocki
Add library routines to populate a generic thermal trip point structure with data obtained by evaluating a specific object in the ACPI Namespace. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-01-24Merge back other thermal control material for 6.3.Rafael J. Wysocki
* thermal: (734 commits) thermal: core: call put_device() only after device_register() fails Linux 6.2-rc4 kbuild: Fix CFI hash randomization with KASAN firmware: coreboot: Check size of table entry and use flex-array kallsyms: Fix scheduling with interrupts disabled in self-test ata: pata_cs5535: Don't build on UML lockref: stop doing cpu_relax in the cmpxchg loop x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space efi: tpm: Avoid READ_ONCE() for accessing the event log io_uring: lock overflowing for IOPOLL ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAF iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() iommu/iova: Fix alloc iova overflows issue iommu: Fix refcount leak in iommu_device_claim_dma_owner iommu/arm-smmu-v3: Don't unregister on shutdown iommu/arm-smmu: Don't unregister on shutdown iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even betterer platform/x86: thinkpad_acpi: Fix profile mode display in AMT mode ALSA: usb-audio: Fix possible NULL pointer dereference in snd_usb_pcm_has_fixed_rate() platform/x86: int3472/discrete: Ensure the clk/power enable pins are in output mode ...
2023-01-24platform/x86: apple-gmux: Add apple_gmux_detect() helperHans de Goede
Add a new (static inline) apple_gmux_detect() helper to apple-gmux.h which can be used for gmux detection instead of apple_gmux_present(). The latter is not really reliable since an ACPI device with a HID of APP000B is present on some devices without a gmux at all, as well as on devices with a newer (unsupported) MMIO based gmux model. This causes apple_gmux_present() to return false-positives on a number of different Apple laptop models. This new helper uses the same probing as the actual apple-gmux driver, so that it does not return false positives. To avoid code duplication the gmux_probe() function of the actual driver is also moved over to using the new apple_gmux_detect() helper. This avoids false positives (vs _HID + IO region detection) on: MacBookPro5,4 https://pastebin.com/8Xjq7RhS MacBookPro8,1 https://linux-hardware.org/?probe=e513cfbadb&log=dmesg MacBookPro9,2 https://bugzilla.kernel.org/attachment.cgi?id=278961 MacBookPro10,2 https://lkml.org/lkml/2014/9/22/657 MacBookPro11,2 https://forums.fedora-fr.org/viewtopic.php?id=70142 MacBookPro11,4 https://raw.githubusercontent.com/im-0/investigate-card-reader-suspend-problem-on-mbp11.4/master/test-16/dmesg Fixes: 21245df307cb ("ACPI: video: Add Apple GMUX brightness control detection") Link: https://lore.kernel.org/platform-driver-x86/20230123113750.462144-1-hdegoede@redhat.com/ Reported-by: Emmanouil Kouroupakis <kartebi@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230124105754.62167-3-hdegoede@redhat.com
2023-01-24platform/x86: apple-gmux: Move port defines to apple-gmux.hHans de Goede
This is a preparation patch for adding a new static inline apple_gmux_detect() helper which actually checks a supported gmux is present, rather then only checking an ACPI device with the HID is there as apple_gmux_present() does. Fixes: 21245df307cb ("ACPI: video: Add Apple GMUX brightness control detection") Link: https://lore.kernel.org/platform-driver-x86/20230123113750.462144-1-hdegoede@redhat.com/ Reported-by: Emmanouil Kouroupakis <kartebi@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230124105754.62167-2-hdegoede@redhat.com
2023-01-24Compiler attributes: GCC cold function alignment workaroundsMark Rutland
Contemporary versions of GCC (e.g. GCC 12.2.0) drop the alignment specified by '-falign-functions=N' for functions marked with the __cold__ attribute, and potentially for callees of __cold__ functions as these may be implicitly marked as __cold__ by the compiler. LLVM appears to respect '-falign-functions=N' in such cases. This has been reported to GCC in bug 88345: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 ... which also covers alignment being dropped when '-Os' is used, which will be addressed in a separate patch. Currently, use of '-falign-functions=N' is limited to CONFIG_FUNCTION_ALIGNMENT, which is largely used for performance and/or analysis reasons (e.g. with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B), but isn't necessary for correct functionality. However, this dropped alignment isn't great for the performance and/or analysis cases. Subsequent patches will use CONFIG_FUNCTION_ALIGNMENT as part of arm64's ftrace implementation, which will require all instrumented functions to be aligned to at least 8-bytes. This patch works around the dropped alignment by avoiding the use of the __cold__ attribute when CONFIG_FUNCTION_ALIGNMENT is non-zero, and by specifically aligning abort(), which GCC implicitly marks as __cold__. As the __cold macro is now dependent upon config options (which is against the policy described at the top of compiler_attributes.h), it is moved into compiler_types.h. I've tested this by building and booting a kernel configured with defconfig + CONFIG_EXPERT=y + CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y, and looking for misaligned text symbols in /proc/kallsyms: * arm64: Before: # uname -rm 6.2.0-rc3 aarch64 # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 5009 After: # uname -rm 6.2.0-rc3-00001-g2a2bedf8bfa9 aarch64 # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 919 * x86_64: Before: # uname -rm 6.2.0-rc3 x86_64 # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 11537 After: # uname -rm 6.2.0-rc3-00001-g2a2bedf8bfa9 x86_64 # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l 2805 There's clearly a substantial reduction in the number of misaligned symbols. From manual inspection, the remaining unaligned text labels are a combination of ACPICA functions (due to the use of '-Os'), static call trampolines, and non-function labels in assembly, which will be dealt with in subsequent patches. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Florent Revest <revest@chromium.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Deacon <will@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230123134603.1064407-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPSMark Rutland
Architectures without dynamic ftrace trampolines incur an overhead when multiple ftrace_ops are enabled with distinct filters. in these cases, each call site calls a common trampoline which uses ftrace_ops_list_func() to iterate over all enabled ftrace functions, and so incurs an overhead relative to the size of this list (including RCU protection overhead). Architectures with dynamic ftrace trampolines avoid this overhead for call sites which have a single associated ftrace_ops. In these cases, the dynamic trampoline is customized to branch directly to the relevant ftrace function, avoiding the list overhead. On some architectures it's impractical and/or undesirable to implement dynamic ftrace trampolines. For example, arm64 has limited branch ranges and cannot always directly branch from a call site to an arbitrary address (e.g. from a kernel text address to an arbitrary module address). Calls from modules to core kernel text can be indirected via PLTs (allocated at module load time) to address this, but the same is not possible from calls from core kernel text. Using an indirect branch from a call site to an arbitrary trampoline is possible, but requires several more instructions in the function prologue (or immediately before it), and/or comes with far more complex requirements for patching. Instead, this patch adds a new option, where an architecture can associate each call site with a pointer to an ftrace_ops, placed at a fixed offset from the call site. A shared trampoline can recover this pointer and call ftrace_ops::func() without needing to go via ftrace_ops_list_func(), avoiding the associated overhead. This avoids issues with branch range limitations, and avoids the need to allocate and manipulate dynamic trampolines, making it far simpler to implement and maintain, while having similar performance characteristics. Note that this allows for dynamic ftrace_ops to be invoked directly from an architecture's ftrace_caller trampoline, whereas existing code forces the use of ftrace_ops_get_list_func(), which is in part necessary to permit the ftrace_ops to be freed once unregistered *and* to avoid branch/address-generation range limitation on some architectures (e.g. where ops->func is a module address, and may be outside of the direct branch range for callsites within the main kernel image). The CALL_OPS approach avoids this problems and is safe as: * The existing synchronization in ftrace_shutdown() using ftrace_shutdown() using synchronize_rcu_tasks_rude() (and synchronize_rcu_tasks()) ensures that no tasks hold a stale reference to an ftrace_ops (e.g. in the middle of the ftrace_caller trampoline, or while invoking ftrace_ops::func), when that ftrace_ops is unregistered. Arguably this could also be relied upon for the existing scheme, permitting dynamic ftrace_ops to be invoked directly when ops->func is in range, but this will require additional logic to handle branch range limitations, and is not handled by this patch. * Each callsite's ftrace_ops pointer literal can hold any valid kernel address, and is updated atomically. As an architecture's ftrace_caller trampoline will atomically load the ops pointer then dereference ops->func, there is no risk of invoking ops->func with a mismatches ops pointer, and updates to the ops pointer do not require special care. A subsequent patch will implement architectures support for arm64. There should be no functional change as a result of this patch alone. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Florent Revest <revest@chromium.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230123134603.1064407-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24drm/fb-helper: Use a per-driver FB deferred I/O handlerJavier Martinez Canillas
The DRM fbdev emulation layer sets the struct fb_info .fbdefio field to a struct fb_deferred_io pointer, that is shared across all drivers that use the generic drm_fbdev_generic_setup() helper function. It is a problem because the fbdev core deferred I/O logic assumes that the struct fb_deferred_io data is not shared between devices, and it's stored there state such as the list of pages touched and a mutex that is use to synchronize between the fb_deferred_io_track_page() function that track the dirty pages and fb_deferred_io_work() workqueue handler doing the actual deferred I/O. The latter can lead to the following error, since it may happen that two drivers are probed and then one is removed, which causes the mutex bo be destroyed and not existing anymore by the time the other driver tries to grab it for the fbdev deferred I/O logic: [ 369.756553] ------------[ cut here ]------------ [ 369.756604] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 369.756631] WARNING: CPU: 2 PID: 1023 at kernel/locking/mutex.c:582 __mutex_lock+0x348/0x424 [ 369.756744] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ip v6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr btsdio bluetooth sunrpc brcmfmac snd_soc_hdmi_codec cpufreq_dt cfg80211 vfat fat vc4 rfkill brcmutil raspberrypi_cpufreq i2c_bcm2835 iproc_rng200 bcm2711_thermal snd_soc_core snd_pcm_dmaen gine leds_gpio nvmem_rmem joydev hid_cherry uas usb_storage gpio_raspberrypi_exp v3d snd_pcm raspberrypi_hwmon gpu_sched bcm2835_wdt broadcom bcm_phy_lib snd_timer genet snd mdio_bcm_unimac clk_bcm2711_dvp soundcore drm_display_helper pci e_brcmstb cec ip6_tables ip_tables fuse [ 369.757400] CPU: 2 PID: 1023 Comm: fbtest Not tainted 5.19.0-rc6+ #94 [ 369.757455] Hardware name: raspberrypi,4-model-b Raspberry Pi 4 Model B Rev 1.4/Raspberry Pi 4 Model B Rev 1.4, BIOS 2022.10 10/01/2022 [ 369.757538] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 369.757596] pc : __mutex_lock+0x348/0x424 [ 369.757635] lr : __mutex_lock+0x348/0x424 [ 369.757672] sp : ffff80000953bb00 [ 369.757703] x29: ffff80000953bb00 x28: ffff17fdc087c000 x27: 0000000000000002 [ 369.757771] x26: ffff17fdc349f9b0 x25: fffffc5ff72e0100 x24: 0000000000000000 [ 369.757838] x23: 0000000000000000 x22: 0000000000000002 x21: ffffa618df636f10 [ 369.757903] x20: ffff80000953bb68 x19: ffffa618e0f18138 x18: 0000000000000001 [ 369.757968] x17: 0000000020000000 x16: 0000000000000002 x15: 0000000000000000 [ 369.758032] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47 [ 369.758097] x11: 00000000ffffdfff x10: ffffa618e0c79f88 x9 : ffffa618de472484 [ 369.758162] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8 [ 369.758227] x5 : 0000000000001fff x4 : 0000000000000000 x3 : 0000000000000027 [ 369.758292] x2 : 0000000000000001 x1 : ffff17fdc087c000 x0 : 0000000000000028 [ 369.758357] Call trace: [ 369.758383] __mutex_lock+0x348/0x424 [ 369.758420] mutex_lock_nested+0x4c/0x5c [ 369.758459] fb_deferred_io_mkwrite+0x78/0x1d8 [ 369.758507] do_page_mkwrite+0x5c/0x19c [ 369.758550] wp_page_shared+0x70/0x1a0 [ 369.758590] do_wp_page+0x3d0/0x510 [ 369.758628] handle_pte_fault+0x1c0/0x1e0 [ 369.758670] __handle_mm_fault+0x250/0x380 [ 369.758712] handle_mm_fault+0x17c/0x3a4 [ 369.758753] do_page_fault+0x158/0x530 [ 369.758792] do_mem_abort+0x50/0xa0 [ 369.758831] el0_da+0x78/0x19c [ 369.758864] el0t_64_sync_handler+0xbc/0x150 [ 369.758904] el0t_64_sync+0x190/0x194 [ 369.758942] irq event stamp: 11395 [ 369.758973] hardirqs last enabled at (11395): [<ffffa618de472554>] __up_console_sem+0x74/0x80 [ 369.759042] hardirqs last disabled at (11394): [<ffffa618de47254c>] __up_console_sem+0x6c/0x80 [ 369.760554] softirqs last enabled at (11392): [<ffffa618de330a74>] __do_softirq+0x4c4/0x6b8 [ 369.762060] softirqs last disabled at (11383): [<ffffa618de3c9124>] __irq_exit_rcu+0x104/0x214 [ 369.763564] ---[ end trace 0000000000000000 ]--- Fixes: d536540f304c ("drm/fb-helper: Add generic fbdev emulation .fb_probe function") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230121192418.2814955-4-javierm@redhat.com
2023-01-24net: fou: regenerate the uAPI from the specJakub Kicinski
Regenerate the FOU uAPI header from the YAML spec. The flags now come before attributes which use them, and the comments for type disappear (coders should look at the spec instead). Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-24netfilter: conntrack: unify established states for SCTP pathsSriram Yagnaraman
An SCTP endpoint can start an association through a path and tear it down over another one. That means the initial path will not see the shutdown sequence, and the conntrack entry will remain in ESTABLISHED state for 5 days. By merging the HEARTBEAT_ACKED and ESTABLISHED states into one ESTABLISHED state, there remains no difference between a primary or secondary path. The timeout for the merged ESTABLISHED state is set to 210 seconds (hb_interval * max_path_retrans + rto_max). So, even if a path doesn't see the shutdown sequence, it will expire in a reasonable amount of time. With this change in place, there is now more than one state from which we can transition to ESTABLISHED, COOKIE_ECHOED and HEARTBEAT_SENT, so handle the setting of ASSURED bit whenever a state change has happened and the new state is ESTABLISHED. Removed the check for dir==REPLY since the transition to ESTABLISHED can happen only in the reply direction. Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-01-24Revert "netfilter: conntrack: add sctp DATA_SENT state"Sriram Yagnaraman
This reverts commit (bff3d0534804: "netfilter: conntrack: add sctp DATA_SENT state") Using DATA/SACK to detect a new connection on secondary/alternate paths works only on new connections, while a HEARTBEAT is required on connection re-use. It is probably consistent to wait for HEARTBEAT to create a secondary connection in conntrack. Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-01-23Merge tag 'wireless-next-2023-01-23' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.3 First set of patches for v6.3. The most important change here is that the old Wireless Extension user space interface is not supported on Wi-Fi 7 devices at all. We also added a warning if anyone with modern drivers (ie. cfg80211 and mac80211 drivers) tries to use Wireless Extensions, everyone should switch to using nl80211 interface instead. Static WEP support is removed, there wasn't any driver using that anyway so there's no user impact. Otherwise it's smaller features and fixes as usual. Note: As mt76 had tricky conflicts due to the fixes in wireless tree, we decided to merge wireless into wireless-next to solve them easily. There should not be any merge problems anymore. Major changes: cfg80211 - remove never used static WEP support - warn if Wireless Extention interface is used with cfg80211/mac80211 drivers - stop supporting Wireless Extensions with Wi-Fi 7 devices - support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate reporting rfkill - add GPIO DT support bitfield - add FIELD_PREP_CONST() mt76 - per-PHY LED support rtw89 - support new Bluetooth co-existance version rtl8xxxu - support RTL8188EU * tag 'wireless-next-2023-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (123 commits) wifi: wireless: deny wireless extensions on MLO-capable devices wifi: wireless: warn on most wireless extension usage wifi: mac80211: drop extra 'e' from ieeee80211... name wifi: cfg80211: Deduplicate certificate loading bitfield: add FIELD_PREP_CONST() wifi: mac80211: add kernel-doc for EHT structure mac80211: support minimal EHT rate reporting on RX wifi: mac80211: Add HE MU-MIMO related flags in ieee80211_bss_conf wifi: mac80211: Add VHT MU-MIMO related flags in ieee80211_bss_conf wifi: cfg80211: Use MLD address to indicate MLD STA disconnection wifi: cfg80211: Support 32 bytes KCK key in GTK rekey offload wifi: cfg80211: Fix extended KCK key length check in nl80211_set_rekey_data() wifi: cfg80211: remove support for static WEP wifi: rtl8xxxu: Dump the efuse only for untested devices wifi: rtl8xxxu: Print the ROM version too wifi: rtw88: Use non-atomic sta iterator in rtw_ra_mask_info_update() wifi: rtw88: Use rtw_iterate_vifs() for rtw_vif_watch_dog_iter() wifi: rtw88: Move register access from rtw_bf_assoc() outside the RCU wifi: rtl8xxxu: Use a longer retry limit of 48 wifi: rtl8xxxu: Report the RSSI to the firmware ... ==================== Link: https://lore.kernel.org/r/20230123103338.330CBC433EF@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-23xsk: Add cb area to struct xdp_buff_xskToke Høiland-Jørgensen
Add an area after the xdp_buff in struct xdp_buff_xsk that drivers can use to stash extra information to use in metadata kfuncs. The maximum size of 24 bytes means the full xdp_buff_xsk structure will take up exactly two cache lines (with the cb field spanning both). Also add a macro drivers can use to check their own wrapping structs against the available size. Cc: John Fastabend <john.fastabend@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Anatoly Burakov <anatoly.burakov@intel.com> Cc: Alexander Lobakin <alexandr.lobakin@intel.com> Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Cc: Maryam Tahhan <mtahhan@redhat.com> Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-15-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-23Merge back thermal control material for 6.3.Rafael J. Wysocki
2023-01-23bpf: Support consuming XDP HW metadata from fext programsToke Høiland-Jørgensen
Instead of rejecting the attaching of PROG_TYPE_EXT programs to XDP programs that consume HW metadata, implement support for propagating the offload information. The extension program doesn't need to set a flag or ifindex, these will just be propagated from the target by the verifier. We need to create a separate offload object for the extension program, though, since it can be reattached to a different program later (which means we can't just inherit the offload information from the target). An additional check is added on attach that the new target is compatible with the offload information in the extension prog. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-9-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-23bpf: XDP metadata RX kfuncsStanislav Fomichev
Define a new kfunc set (xdp_metadata_kfunc_ids) which implements all possible XDP metatada kfuncs. Not all devices have to implement them. If kfunc is not supported by the target device, the default implementation is called instead. The verifier, at load time, replaces a call to the generic kfunc with a call to the per-device one. Per-device kfunc pointers are stored in separate struct xdp_metadata_ops. Cc: John Fastabend <john.fastabend@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Anatoly Burakov <anatoly.burakov@intel.com> Cc: Alexander Lobakin <alexandr.lobakin@intel.com> Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Cc: Maryam Tahhan <mtahhan@redhat.com> Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-8-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-23bpf: Introduce device-bound XDP programsStanislav Fomichev
New flag BPF_F_XDP_DEV_BOUND_ONLY plus all the infra to have a way to associate a netdev with a BPF program at load time. netdevsim checks are dropped in favor of generic check in dev_xdp_attach. Cc: John Fastabend <john.fastabend@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Anatoly Burakov <anatoly.burakov@intel.com> Cc: Alexander Lobakin <alexandr.lobakin@intel.com> Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Cc: Maryam Tahhan <mtahhan@redhat.com> Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-6-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-23bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloadedStanislav Fomichev
BPF offloading infra will be reused to implement bound-but-not-offloaded bpf programs. Rename existing helpers for clarity. No functional changes. Cc: John Fastabend <john.fastabend@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Willem de Bruijn <willemb@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Anatoly Burakov <anatoly.burakov@intel.com> Cc: Alexander Lobakin <alexandr.lobakin@intel.com> Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Cc: Maryam Tahhan <mtahhan@redhat.com> Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-3-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-23x86/resctrl: Add interface to write mbm_total_bytes_configBabu Moger
The event configuration for mbm_total_bytes can be changed by the user by writing to the file /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config. The event configuration settings are domain specific and affect all the CPUs in the domain. Following are the types of events supported: ==== =========================================================== Bits Description ==== =========================================================== 6 Dirty Victims from the QOS domain to all types of memory 5 Reads to slow memory in the non-local NUMA domain 4 Reads to slow memory in the local NUMA domain 3 Non-temporal writes to non-local NUMA domain 2 Non-temporal writes to local NUMA domain 1 Reads to memory in the non-local NUMA domain 0 Reads to memory in the local NUMA domain ==== =========================================================== For example: To change the mbm_total_bytes to count only reads on domain 0, the bits 0, 1, 4 and 5 needs to be set, which is 110011b (in hex 0x33). Run the command: $echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config To change the mbm_total_bytes to count all the slow memory reads on domain 1, the bits 4 and 5 needs to be set which is 110000b (in hex 0x30). Run the command: $echo 1=0x30 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20230113152039.770054-12-babu.moger@amd.com
2023-01-23net: mscc: ocelot: add MAC Merge layer support for VSC9959Vladimir Oltean
Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers (ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and pMAC statistics are already in the Ocelot switch lib even if just Felix supports them, I'm adding support for the whole MAC Merge layer in the common Ocelot library too. There is an interrupt (shared with the PTP interrupt) which signals changes to the MM verification state. This is done because the preemptible traffic classes should be committed to hardware only once the verification procedure has declared the link partner of being capable of receiving preemptible frames. We implement ethtool getters and setters for the MAC Merge layer state. The "TX enabled" and "verify status" are taken from the IRQ handler, using a mutex to ensure serialized access. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959Vladimir Oltean
The Felix VSC9959 switch supports frame preemption and has a MAC Merge layer. In addition to the structured stats that exist for the eMAC, export the counters associated with its pMAC (pause, RMON, MAC, PHY, control) plus the high-level MAC Merge layer stats. The unstructured ethtool counters, as well as the rtnl_link_stats64 were left to report only the eMAC counters. Because statistics processing is quite self-contained in ocelot_stats.c now, I've opted for introducing an ocelot->mm_supported bool, based on which the common switch lib does everything, rather than pushing the TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: dsa: add plumbing for changing and getting MAC merge layer stateVladimir Oltean
The DSA core is in charge of the ethtool_ops of the net devices associated with switch ports, so in case a hardware driver supports the MAC merge layer, DSA must pass the callbacks through to the driver. Add support for precisely that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: ethtool: add helpers for MM fragment size translationVladimir Oltean
We deliberately make the Linux UAPI pass the minimum fragment size in octets, even though IEEE 802.3 defines it as discrete values, and addFragSize is just the multiplier. This is because there is nothing impossible in operating with an in-between value for the fragment size of non-final preempted fragments, and there may even appear hardware which supports the in-between sizes. For the hardware which just understands the addFragSize multiplier, create two helpers which translate back and forth the values passed in octets. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: ethtool: add helpers for aggregate statisticsVladimir Oltean
When a pMAC exists but the driver is unable to atomically query the aggregate eMAC+pMAC statistics, the user should be given back at least the sum of eMAC and pMAC counters queried separately. This is a generic problem, so add helpers in ethtool to do this operation, if the driver doesn't have a better way to report aggregate stats. Do this in a way that does not require changes to these functions when new stats are added (basically treat the structures as an array of u64 values, except for the first element which is the stats source). In include/linux/ethtool.h, there is already a section where helper function prototypes should be placed. The trouble is, this section is too early, before the definitions of struct ethtool_eth_mac_stats et.al. Move that section at the end and append these new helpers to it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC)Vladimir Oltean
IEEE 802.3-2018 clause 99 defines a MAC Merge sublayer which contains an Express MAC and a Preemptible MAC. Both MACs are hidden to higher and lower layers and visible as a single MAC (packet classification to eMAC or pMAC on TX is done based on priority; classification on RX is done based on SFD). For devices which support a MAC Merge sublayer, it is desirable to retrieve individual packet counters from the eMAC and the pMAC, as well as aggregate statistics (their sum). Introduce a new ETHTOOL_A_STATS_SRC attribute which is part of the policy of ETHTOOL_MSG_STATS_GET and, and an ETHTOOL_A_PAUSE_STATS_SRC which is part of the policy of ETHTOOL_MSG_PAUSE_GET (accepted when ETHTOOL_FLAG_STATS is set in the common ethtool header). Both of these take values from enum ethtool_mac_stats_src, defaulting to "aggregate" in the absence of the attribute. Existing drivers do not need to pay attention to this enum which was added to all driver-facing structures, just the ones which report the MAC merge layer as supported. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net: ethtool: add support for MAC Merge layerVladimir Oltean
The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX. The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7. In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer. To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined. On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express). The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF. The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon. I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are: - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed. - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing. - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly. - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size. - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23net/sock: Introduce trace_sk_data_ready()Peilin Ye
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready() callback implementations. For example: <...> iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable <...> Suggested-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-21Merge tag 'usb-6.2-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are a number of small USB and Thunderbolt driver fixes and new device id changes for 6.2-rc5. Included in here are: - thunderbolt bugfixes for reported problems - new usb-serial driver ids added - onboard_hub usb driver fixes for much-reported problems - xhci bugfixes - typec bugfixes - ehci-fsl driver module alias fix - iowarrior header size fix - usb gadget driver fixes All of these, except for the iowarrior fix, have been in linux-next with no reported issues. The iowarrior fix passed the 0-day testing and is a one digit change based on a reported problem in the driver (which was written to a spec, not the real device that is now available)" * tag 'usb-6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (40 commits) USB: misc: iowarrior: fix up header size for USB_DEVICE_ID_CODEMERCS_IOW100 usb: host: ehci-fsl: Fix module alias usb: dwc3: fix extcon dependency usb: core: hub: disable autosuspend for TI TUSB8041 USB: fix misleading usb_set_intfdata() kernel doc usb: gadget: f_ncm: fix potential NULL ptr deref in ncm_bitrate() USB: gadget: Add ID numbers to configfs-gadget driver names usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail usb: gadget: g_webcam: Send color matching descriptor per frame usb: typec: altmodes/displayport: Use proper macro for pin assignment check usb: typec: altmodes/displayport: Fix pin assignment calculation usb: typec: altmodes/displayport: Add pin assignment helper usb: gadget: f_fs: Ensure ep0req is dequeued before free_request usb: gadget: f_fs: Prevent race during ffs_ep0_queue_wait usb: misc: onboard_hub: Move 'attach' work to the driver usb: misc: onboard_hub: Invert driver registration order usb: ucsi: Ensure connector delayed work items are flushed usb: musb: fix error return code in omap2430_probe() usb: chipidea: core: fix possible constant 0 if use IS_ERR(ci->role_switch) xhci: Detect lpm incapable xHC USB3 roothub ports from ACPI tables ...
2023-01-21batman-adv: tvlv: prepare for tvlv enabled multicast packet typeLinus Lüssing
Prepare TVLV infrastructure for more packet types, in particular the upcoming batman-adv multicast packet type. For that swap the OGM vs. unicast-tvlv packet boolean indicator to an explicit unsigned integer packet type variable. And provide the skb to a call to batadv_tvlv_containers_process(), as later the multicast packet's TVLV handler will need to have access not only to the TVLV but the full skb for forwarding. Forwarding will be invoked from the multicast packet's TVLVs' contents later. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2023-01-20ptp_qoriq: fix latency in ptp_qoriq_adjtime() operationNikhil Gupta
1588 driver loses about 1us in adjtime operation at PTP slave This is because adjtime operation uses a slow non-atomic tmr_cnt_read() followed by tmr_cnt_write() operation. In the above sequence, since the timer counter operation keeps incrementing, it leads to latency. The tmr_offset register (which is added to TMR_CNT_H/L register giving the current time) must be programmed with the delta nanoseconds. Signed-off-by: Nikhil Gupta <nikhil.gupta@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230119204034.7969-1-nikhil.gupta@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-20net: mana: Fix IRQ name - add PCI and queue numberHaiyang Zhang
The PCI and queue number info is missing in IRQ names. Add PCI and queue number to IRQ names, to allow CPU affinity tuning scripts to work. Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/1674161950-19708-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-20net: mdio: Remove support for building C45 muxed addressesAndrew Lunn
The old way of performing a C45 bus transfer created a special register value and passed it to the MDIO bus driver, in the hope it would see the MII_ADDR_C45 bit set, and perform a C45 transfer. Now that there is a clear separation of C22 and C45, this scheme is no longer used. Remove all the #defines and helpers, to prevent any code being added which tries to use it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-20bpf: Invalidate slices on destruction of dynptrs on stackKumar Kartikeya Dwivedi
The previous commit implemented destroy_if_dynptr_stack_slot. It destroys the dynptr which given spi belongs to, but still doesn't invalidate the slices that belong to such a dynptr. While for the case of referenced dynptr, we don't allow their overwrite and return an error early, we still allow it and destroy the dynptr for unreferenced dynptr. To be able to enable precise and scoped invalidation of dynptr slices in this case, we must be able to associate the source dynptr of slices that have been obtained using bpf_dynptr_data. When doing destruction, only slices belonging to the dynptr being destructed should be invalidated, and nothing else. Currently, dynptr slices belonging to different dynptrs are indistinguishible. Hence, allocate a unique id to each dynptr (CONST_PTR_TO_DYNPTR and those on stack). This will be stored as part of reg->id. Whenever using bpf_dynptr_data, transfer this unique dynptr id to the returned PTR_TO_MEM_OR_NULL slice pointer, and store it in a new per-PTR_TO_MEM dynptr_id register state member. Finally, after establishing such a relationship between dynptrs and their slices, implement precise invalidation logic that only invalidates slices belong to the destroyed dynptr in destroy_if_dynptr_stack_slot. Acked-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230121002241.2113993-5-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ipa/ipa_interrupt.c drivers/net/ipa/ipa_interrupt.h 9ec9b2a30853 ("net: ipa: disable ipa interrupt during suspend") 8e461e1f092b ("net: ipa: introduce ipa_interrupt_enable()") d50ed3558719 ("net: ipa: enable IPA interrupt handlers separate from registration") https://lore.kernel.org/all/20230119114125.5182c7ab@canb.auug.org.au/ https://lore.kernel.org/all/79e46152-8043-a512-79d9-c3b905462774@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-20Merge tag 'soc-fixes-6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC DT and driver fixes from Arnd Bergmann: "Lots of dts fixes for Qualcomm Snapdragon and NXP i.MX platforms, including: - A regression fix for SDHCI controllers on Inforce 6540, and another SDHCI fix on SM8350 - Reenable cluster idle on sm8250 after the the code fix is upstream - multiple fixes for the QMP PHY binding, needing an incompatible dt change - The reserved memory map is updated on Xiaomi Mi 4C and Huawei Nexus 6P, to avoid instabilities caused by use of protected memory regions - Fix i.MX8MP DT for missing GPC Interrupt, power-domain typo and USB clock error - A couple of verdin-imx8mm DT fixes for audio playback support - Fix pca9547 i2c-mux node name for i.MX and Vybrid device trees - Fix an imx93-11x11-evk uSDHC pad setting problem that causes Micron eMMC CMD8 CRC error in HS400ES/HS400 mode The remaining ARM and RISC-V platforms only have very few smaller dts bugfixes this time: - A fix for the SiFive unmatched board's PCI memory space - A revert to fix a regression with GPIO on Marvell Armada - A fix for the UART address on Marvell AC5 - Missing chip-select phandles for stm32 boards - Selecting the correct clock for the sam9x60 memory controller - Amlogic based Odroid-HC4 needs a revert to restore USB functionality. And finally, there are some minor code fixes: - Build fixes for OMAP1, pxa, riscpc, raspberry pi firmware, and zynq firmware - memory controller driver fixes for an OMAP regression and older bugs on tegra, atmel and mvebu - reset controller fixes for ti-sci and uniphier platforms - ARM SCMI firmware fixes for a couple of rare corner cases - Qualcomm platform driver fixes for incorrect error handling and a backwards compatibility fix for the apr driver using older dtb - NXP i.MX SoC driver fixes for HDMI output, error handling in the imx8 soc-id and missing reference counting on older cpuid code" * tag 'soc-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (60 commits) firmware: zynqmp: fix declarations for gcc-13 ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp151a-prtt1l ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp157c-emstamp-argon ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp15xx-dhcom-som ARM: dts: stm32: Fix qspi pinctrl phandle for stm32mp15xx-dhcor-som ARM: dts: at91: sam9x60: fix the ddr clock for sam9x60 ARM: omap1: fix building gpio15xx ARM: omap1: fix !ARCH_OMAP1_ANY link failures firmware: raspberrypi: Fix type assignment arm64: dts: qcom: msm8992-libra: Fix the memory map arm64: dts: qcom: msm8992: Don't use sfpb mutex PM: AVS: qcom-cpr: Fix an error handling path in cpr_probe() arm64: dts: msm8994-angler: fix the memory map arm64: dts: marvell: AC5/AC5X: Fix address for UART1 ARM: footbridge: drop unnecessary inclusion Revert "ARM: dts: armada-39x: Fix compatible string for gpios" Revert "ARM: dts: armada-38x: Fix compatible string for gpios" ARM: pxa: enable PXA310/PXA320 for DT-only build riscv: dts: sifive: fu740: fix size of pcie 32bit memory soc: qcom: apr: Make qcom,protection-domain optional again ...
2023-01-20Merge tag 'net-6.2-rc5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless, bluetooth, bpf and netfilter. Current release - regressions: - Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf", fix nsna_ping mode of team - wifi: mt76: fix bugs in Rx queue handling and DMA mapping - eth: mlx5: - add missing mutex_unlock in error reporter - protect global IPsec ASO with a lock Current release - new code bugs: - rxrpc: fix wrong error return in rxrpc_connect_call() Previous releases - regressions: - bluetooth: hci_sync: fix use of HCI_OP_LE_READ_BUFFER_SIZE_V2 - wifi: - mac80211: fix crashes on Rx due to incorrect initialization of rx->link and rx->link_sta - mac80211: fix bugs in iTXQ conversion - Tx stalls, incorrect aggregation handling, crashes - brcmfmac: fix regression for Broadcom PCIe wifi devices - rndis_wlan: prevent buffer overflow in rndis_query_oid - netfilter: conntrack: handle tcp challenge acks during connection reuse - sched: avoid grafting on htb_destroy_class_offload when destroying - virtio-net: correctly enable callback during start_xmit, fix stalls - tcp: avoid the lookup process failing to get sk in ehash table - ipa: disable ipa interrupt during suspend - eth: stmmac: enable all safety features by default Previous releases - always broken: - bpf: - fix pointer-leak due to insufficient speculative store bypass mitigation (Spectre v4) - skip task with pid=1 in send_signal_common() to avoid a splat - fix BPF program ID information in BPF_AUDIT_UNLOAD as well as PERF_BPF_EVENT_PROG_UNLOAD events - fix potential deadlock in htab_lock_bucket from same bucket index but different map_locked index - bluetooth: - fix a buffer overflow in mgmt_mesh_add() - hci_qca: fix driver shutdown on closed serdev - ISO: fix possible circular locking dependency - CIS: hci_event: fix invalid wait context - wifi: brcmfmac: fixes for survey dump handling - mptcp: explicitly specify sock family at subflow creation time - netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits - tcp: fix rate_app_limited to default to 1 - l2tp: close all race conditions in l2tp_tunnel_register() - eth: mlx5: fixes for QoS config and eswitch configuration - eth: enetc: avoid deadlock in enetc_tx_onestep_tstamp() - eth: stmmac: fix invalid call to mdiobus_get_phy() Misc: - ethtool: add netlink attr in rss get reply only if the value is not empty" * tag 'net-6.2-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits) Revert "Merge branch 'octeontx2-af-CPT'" tcp: fix rate_app_limited to default to 1 bnxt: Do not read past the end of test names net: stmmac: enable all safety features by default octeontx2-af: add mbox to return CPT_AF_FLT_INT info octeontx2-af: update cpt lf alloc mailbox octeontx2-af: restore rxc conf after teardown sequence octeontx2-af: optimize cpt pf identification octeontx2-af: modify FLR sequence for CPT octeontx2-af: add mbox for CPT LF reset octeontx2-af: recover CPT engine when it gets fault net: dsa: microchip: ksz9477: port map correction in ALU table entry register selftests/net: toeplitz: fix race on tpacket_v3 block close net/ulp: use consistent error code when blocking ULP octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt tcp: avoid the lookup process failing to get sk in ehash table Revert "net: team: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf" MAINTAINERS: add networking entries for Willem net: sched: gred: prevent races when adding offloads to stats l2tp: prevent lockdep issue in l2tp_tunnel_register() ...
2023-01-20ACPI: utils: Add acpi_evaluate_dsm_typed() and acpi_check_dsm() stubsAndy Shevchenko
When the ACPI part of a driver is optional the methods used in it are expected to be available even if CONFIG_ACPI=n. This is not the case for _DSM related methods. Add stubs for acpi_evaluate_dsm_typed() and acpi_check_dsm() methods. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-20arm64/sme: Implement ZT0 ptrace supportMark Brown
Implement support for a new note type NT_ARM64_ZT providing access to ZT0 when implemented. Since ZT0 is a register with constant size this is much simpler than for other SME state. As ZT0 is only accessible when PSTATE.ZA is set writes to ZT0 cause PSTATE.ZA to be set, the main alternative would be to return -EBUSY in this case but this seemed more constructive. Practical users are also going to be working with ZA anyway and have some understanding of the state. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-12-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>