summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-10-08net: dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged portsVladimir Oltean
Similar to commit 6087175b7991 ("net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges"), software forwarding between an unoffloaded LAG port (a bonding interface with an unsupported policy) and a mv88e6xxx user port directly under a bridge is broken. We adopt the same strategy, which is to make the standalone ports not find any ATU entry learned on a bridge port. Theory: the mv88e6xxx ATU is looked up by FID and MAC address. There are as many FIDs as VIDs (4096). The FID is derived from the VID when possible (the VTU maps a VID to a FID), with a fallback to the port based default FID value when not (802.1Q Mode is disabled on the port, or the classified VID isn't present in the VTU). The mv88e6xxx driver makes the following use of FIDs and VIDs: - the port's DefaultVID (to which untagged & pvid-tagged packets get classified) is 0 and is absent from the VTU, so this kind of packets is processed in FID 0, the default FID assigned by mv88e6xxx_setup_port. - every time a bridge VLAN is created, mv88e6xxx_port_vlan_join() -> mv88e6xxx_atu_new() associates a FID with that VID which increases linearly starting from 1. Like this: bridge vlan add dev lan0 vid 100 # FID 1 bridge vlan add dev lan1 vid 100 # still FID 1 bridge vlan add dev lan2 vid 1024 # FID 2 The FID allocation made by the driver is sub-optimal for the following reasons: (a) A standalone port has a DefaultPVID of 0 and a default FID of 0 too. A VLAN-unaware bridged port has a DefaultPVID of 0 and a default FID of 0 too. The difference is that the bridged ports may learn ATU entries, while the standalone port has the requirement that it must not, and must not find them either. Standalone ports must not use the same FID as ports belonging to a bridge. All standalone ports can use the same FID, since the ATU will never have an entry in that FID. (b) Multiple VLAN-unaware bridges will all use a DefaultPVID of 0 and a default FID of 0 on all their ports. The FDBs will not be isolated between these bridges. Every VLAN-unaware bridge must use the same FID on all its ports, different from the FID of other bridge ports. (c) Each bridge VLAN uses a unique FID which is useful for Independent VLAN Learning, but the same VLAN ID on multiple VLAN-aware bridges will result in the same FID being used by mv88e6xxx_atu_new(). The correct behavior is for VLAN 1 in br0 to have a different FID compared to VLAN 1 in br1. This patch cannot fix all the above. Traditionally the DSA framework did not care about this, and the reality is that DSA core involvement is needed for the aforementioned issues to be solved. The only thing we can solve here is an issue which does not require API changes, and that is issue (a), aka use a different FID for standalone ports vs ports under VLAN-unaware bridges. The first step is deciding what VID and FID to use for standalone ports, and what VID and FID for bridged ports. The 0/0 pair for standalone ports is what they used up till now, let's keep using that. For bridged ports, there are 2 cases: - VLAN-aware ports will never end up using the port default FID, because packets will always be classified to a VID in the VTU or dropped otherwise. The FID is the one associated with the VID in the VTU. - On VLAN-unaware ports, we _could_ leave their DefaultVID (pvid) at zero (just as in the case of standalone ports), and just change the port's default FID from 0 to a different number (say 1). However, Tobias points out that there is one more requirement to cater to: cross-chip bridging. The Marvell DSA header does not carry the FID in it, only the VID. So once a packet crosses a DSA link, if it has a VID of zero it will get classified to the default FID of that cascade port. Relying on a port default FID for upstream cascade ports results in contradictions: a default FID of 0 breaks ATU isolation of bridged ports on the downstream switch, a default FID of 1 breaks standalone ports on the downstream switch. So not only must standalone ports have different FIDs compared to bridged ports, they must also have different DefaultVID values. IEEE 802.1Q defines two reserved VID values: 0 and 4095. So we simply choose 4095 as the DefaultVID of ports belonging to VLAN-unaware bridges, and VID 4095 maps to FID 1. For the xmit operation to look up the same ATU database, we need to put VID 4095 in DSA tags sent to ports belonging to VLAN-unaware bridges too. All shared ports are configured to map this VID to the bridging FID, because they are members of that VLAN in the VTU. Shared ports don't need to have 802.1QMode enabled in any way, they always parse the VID from the DSA header, they don't need to look at the 802.1Q header. We install VID 4095 to the VTU in mv88e6xxx_setup_port(), with the mention that mv88e6xxx_vtu_setup() which was located right below that call was flushing the VTU so those entries wouldn't be preserved. So we need to relocate the VTU flushing prior to the port initialization during ->setup(). Also note that this is why it is safe to assume that VID 4095 will get associated with FID 1: the user ports haven't been created, so there is no avenue for the user to create a bridge VLAN which could otherwise race with the creation of another FID which would otherwise use up the non-reserved FID value of 1. [ Currently mv88e6xxx_port_vlan_join() doesn't have the option of specifying a preferred FID, it always calls mv88e6xxx_atu_new(). ] mv88e6xxx_port_db_load_purge() is the function to access the ATU for FDB/MDB entries, and it used to determine the FID to use for VLAN-unaware FDB entries (VID=0) using mv88e6xxx_port_get_fid(). But the driver only called mv88e6xxx_port_set_fid() once, during probe, so no surprises, the port FID was always 0, the call to get_fid() was redundant. As much as I would have wanted to not touch that code, the logic is broken when we add a new FID which is not the port-based default. Now the port-based default FID only corresponds to standalone ports, and FDB/MDB entries belong to the bridging service. So while in the future, when the DSA API will support FDB isolation, we will have to figure out the FID based on the bridge number, for now there's a single bridging FID, so hardcode that. Lastly, the tagger needs to check, when it is transmitting a VLAN untagged skb, whether it is sending it towards a bridged or a standalone port. When we see it is bridged we assume the bridge is VLAN-unaware. Not because it cannot be VLAN-aware but: - if we are transmitting from a VLAN-aware bridge we are likely doing so using TX forwarding offload. That code path guarantees that skbs have a vlan hwaccel tag in them, so we would not enter the "else" branch of the "if (skb->protocol == htons(ETH_P_8021Q))" condition. - if we are transmitting on behalf of a VLAN-aware bridge but with no TX forwarding offload (no PVT support, out of space in the PVT, whatever), we would indeed be transmitting with VLAN 4095 instead of the bridge device's pvid. However we would be injecting a "From CPU" frame, and the switch won't learn from that - it only learns from "Forward" frames. So it is inconsequential for address learning. And VLAN 4095 is absolutely enough for the frame to exit the switch, since we never remove that VLAN from any port. Fixes: 57e661aae6a8 ("net: dsa: mv88e6xxx: Link aggregation support") Reported-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-08bpf: Support writable context for bare tracepointHou Tao
Commit 9df1c28bb752 ("bpf: add writable context for raw tracepoints") supports writable context for tracepoint, but it misses the support for bare tracepoint which has no associated trace event. Bare tracepoint is defined by DECLARE_TRACE(), so adding a corresponding DECLARE_TRACE_WRITABLE() macro to generate a definition in __bpf_raw_tp_map section for bare tracepoint in a similar way to DEFINE_TRACE_WRITABLE(). Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211004094857.30868-2-hotforest@gmail.com
2021-10-08Merge tag 'for-linus-5.15b-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - fix two minor issues in the Xen privcmd driver plus a cleanup patch for that driver - fix multiple issues related to running as PVH guest and some related earlyprintk fixes for other Xen guest types - fix an issue introduced in 5.15 the Xen balloon driver * tag 'for-linus-5.15b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: fix cancelled balloon action xen/x86: adjust data placement x86/PVH: adjust function/data placement xen/x86: hook up xen_banner() also for PVH xen/x86: generalize preferred console model from PV to PVH Dom0 xen/x86: make "earlyprintk=xen" work for HVM/PVH DomU xen/x86: allow "earlyprintk=xen" to work for PV Dom0 xen/x86: make "earlyprintk=xen" work better for PVH Dom0 xen/x86: allow PVH Dom0 without XEN_PV=y xen/x86: prevent PVH type from getting clobbered xen/privcmd: drop "pages" parameter from xen_remap_pfn() xen/privcmd: fix error handling in mmap-resource processing xen/privcmd: replace kcalloc() by kvcalloc() when allocating empty pages
2021-10-08Merge tag 'asm-generic-fixes-5.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fixes from Arnd Bergmann: "There is one build fix for Arm platforms that ended up impacting most architectures because of the way the drivers/firmware Kconfig file is wired up: The CONFIG_QCOM_SCM dependency have caused a number of randconfig regressions over time, and some still remain in v5.15-rc4. The fix we agreed on in the end is to make this symbol selected by any driver using it, and then building it even for non-Arm platforms with CONFIG_COMPILE_TEST. To make this work on all architectures, the drivers/firmware/Kconfig file needs to be included for all architectures to make the symbol itself visible. In a separate discussion, we found that a sound driver patch that is pending for v5.16 needs the same change to include this Kconfig file, so the easiest solution seems to have my Kconfig rework included in v5.15. Finally, the branch also includes a small unrelated build fix for NOMMU architectures" Link: https://lore.kernel.org/all/20210928153508.101208f8@canb.auug.org.au/ Link: https://lore.kernel.org/all/20210928075216.4193128-1-arnd@kernel.org/ Link: https://lore.kernel.org/all/20211007151010.333516-1-arnd@kernel.org/ * tag 'asm-generic-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic/io.h: give stub iounmap() on !MMU same prototype as elsewhere qcom_scm: hide Kconfig symbol firmware: include drivers/firmware/Kconfig unconditionally
2021-10-08coredump: Limit coredumps to a single thread groupEric W. Biederman
Today when a signal is delivered with a handler of SIG_DFL whose default behavior is to generate a core dump not only that process but every process that shares the mm is killed. In the case of vfork this looks like a real world problem. Consider the following well defined sequence. if (vfork() == 0) { execve(...); _exit(EXIT_FAILURE); } If a signal that generates a core dump is received after vfork but before the execve changes the mm the process that called vfork will also be killed (as the mm is shared). Similarly if the execve fails after the point of no return the kernel delivers SIGSEGV which will kill both the exec'ing process and because the mm is shared the process that called vfork as well. As far as I can tell this behavior is a violation of people's reasonable expectations, POSIX, and is unnecessarily fragile when the system is low on memory. Solve this by making a userspace visible change to only kill a single process/thread group. This is possible because Jann Horn recently modified[1] the coredump code so that the mm can safely be modified while the coredump is happening. With LinuxThreads long gone I don't expect anyone to have a notice this behavior change in practice. To accomplish this move the core_state pointer from mm_struct to signal_struct, which allows different thread groups to coredump simultatenously. In zap_threads remove the work to kill anything except for the current thread group. v2: Remove core_state from the VM_BUG_ON_MM print to fix compile failure when CONFIG_DEBUG_VM is enabled. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> [1] a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: d89f3847def4 ("[PATCH] thread-aware coredumps, 2.5.43-C3") History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Link: https://lkml.kernel.org/r/87y27mvnke.fsf@disp2133 Link: https://lkml.kernel.org/r/20211007144701.67592574@canb.auug.org.au Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-08net: introduce a function to check if a netdev name is in useAntoine Tenart
__dev_get_by_name is currently used to either retrieve a net device reference using its name or to check if a name is already used by a registered net device (per ns). In the later case there is no need to return a reference to a net device. Introduce a new helper, netdev_name_in_use, to check if a name is currently used by a registered net device without leaking a reference the corresponding net device. This helper uses netdev_name_node_lookup instead of __dev_get_by_name as we don't need the extra logic retrieving a reference to the corresponding net device. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08vsock: Enable y2038 safe timeval for timeoutRichard Palethorpe
Reuse the timeval compat code from core/sock to handle 32-bit and 64-bit timeval structures. Also introduce a new socket option define to allow using y2038 safe timeval under 32-bit. The existing behavior of sock_set_timeout and vsock's timeout setter differ when the time value is out of bounds. vsocks current behavior is retained at the expense of not being able to share the full implementation. This allows the LTP test vsock01 to pass under 32-bit compat mode. Fixes: fe0c72f3db11 ("socket: move compat timeout handling into sock.c") Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com> Cc: Richard Palethorpe <rpalethorpe@richiejp.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08eth: platform: add a helper for loading netdev->dev_addrJakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. There is a handful of drivers which pass netdev->dev_addr as the destination buffer to eth_platform_get_mac_address(). Add a helper which takes a dev pointer instead, so it can call an appropriate helper. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08asm-generic/io.h: give stub iounmap() on !MMU same prototype as elsewhereAdam Borowski
It made -Werror sad. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-08arm64: dts: mt8183: Add the mmsys reset bit to reset the dsi0Enric Balletbo i Serra
Reset the DSI hardware is needed to prevent different settings between the bootloader and the kernel. While here, also remove the undocumented and also not used 'mediatek,syscon-dsi' property. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210930103105.v4.5.I933f1532d7a1b2910843a9644c86a7d94a4b44e1@changeid Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2021-10-08arm64: dts: mt8173: Add the mmsys reset bit to reset the dsi0Enric Balletbo i Serra
Reset the DSI hardware is needed to prevent different settings between the bootloader and the kernel. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210930103105.v4.4.I7bd7d9a8da5e2894711b700a1127e6902a2b2f1d@changeid Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2021-10-08arm64: dts: mediatek: Move reset controller constants into common locationEnric Balletbo i Serra
The DT binding includes for reset controllers are located in include/dt-bindings/reset/. Move the Mediatek reset constants in there. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/20210930103105.v4.1.I514d9aafff3a062f751b37d3fea7402f67595b86@changeid Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2021-10-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07Merge series "spi: Various Cleanups" from Uwe Kleine-König ↵Mark Brown
<u.kleine-koenig@pengutronix.de>: Hello, while trying to understand how the spi framework makes use of the core device driver stuff (to fix a deadlock) I found these simplifications and improvements. They are build-tested with allmodconfig on arm64, m68k, powerpc, riscv, s390, sparc64 and x86_64. Best regards Uwe Uwe Kleine-König (4): spi: Move comment about chipselect check to the right place spi: Remove unused function spi_busnum_to_master() spi: Reorder functions to simplify the next commit spi: Make several public functions private to spi.c Documentation/spi/spi-summary.rst | 8 - drivers/spi/spi.c | 237 ++++++++++++------------------ include/linux/spi/spi.h | 55 ------- 3 files changed, 95 insertions(+), 205 deletions(-) base-commit: 9e1ff307c779ce1f0f810c7ecce3d95bbae40896 -- 2.30.2
2021-10-07Merge branch 'spi-5.15' into spi-5.16Mark Brown
2021-10-07Merge tag 'armsoc-fixes-5.15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "This is a larger than normal update for Arm SoC specific code, most of it in device trees, but also drivers and the omap and at91/sama7 platforms: - There are four new entries to the MAINTAINERS file: Sven Peter and Alyssa Rosenzweig for Apple M1, Romain Perier for Mstar/sigmastar, and Vignesh Raghavendra for TI K3 - Build fixes to address randconfig warnings in sharpsl, dove, omap1, and qcom platforms as well as the scmi and op-tee subsystems - Regression fixes for missing CONFIG_FB and other options for several defconfigs - Several bug fixes for the newly added Microchip SAMA7 platform, mostly regarding power management - Missing SMP barriers to protect accesses to SCMI virtio device - Regression fixes for TI OMAP, including a boot-time hang on am335x. - Lots of bug fixes for NXP i.MX, mostly addressing incorrect settings in devicetree files, and one revert for broken suspend. - Fixes for ARM Juno/Vexpress devicetree files, addressing a couple of schema warnings. - Regression fixes for qualcomm SoC specific drivers and devicetree files, reverting an mdt_loader change and at least pastially reverting some of the 5.15 DTS changes, plus some minor bugfixes" * tag 'armsoc-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (64 commits) MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer firmware: arm_scmi: Add proper barriers to scmi virtio device firmware: arm_scmi: Simplify spinlocks in virtio transport ARM: dts: omap3430-sdp: Fix NAND device node bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893 ARM: sharpsl_param: work around -Wstringop-overread warning ARM: defconfig: gemini: Restore framebuffer ARM: dove: mark 'putc' as inline ARM: omap1: move omap15xx local bus handling to usb.c MAINTAINERS: Add Vignesh to TI K3 platform maintainership arm64: dts: imx8m*-venice-gw7902: fix M2_RST# gpio ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence arm64: dts: ls1028a: fix eSDHC2 node arm64: dts: imx8mm-kontron-n801x-som: do not allow to switch off buck2 ARM: dts: at91: sama7g5ek: to not touch slew-rate for SDMMC pins ARM: dts: at91: sama7g5ek: use proper slew-rate settings for GMACs ARM: at91: pm: preload base address of controllers in tlb ARM: at91: pm: group constants and addresses loading ARM: dts: at91: sama7g5ek: add suspend voltage for ddr3l rail ...
2021-10-07Merge tag 'misc-fixes-20211007' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfslib, cachefiles and afs fixes from David Howells: - Fix another couple of oopses in cachefiles tracing stemming from the possibility of passing in a NULL object pointer - Fix netfs_clear_unread() to set READ on the iov_iter so that source it is passed to doesn't do the wrong thing (some drivers look at the flag on iov_iter rather than other available information to determine the direction) - Fix afs_launder_page() to write back at the correct file position on the server so as not to corrupt data * tag 'misc-fixes-20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix afs_launder_page() to set correct start file position netfs: Fix READ/WRITE confusion when calling iov_iter_xarray() cachefiles: Fix oops with cachefiles_cull() due to NULL object
2021-10-07ipvs: add sysctl_run_estimation to support disable estimationDust Li
estimation_timer will iterate the est_list to do estimation for each ipvs stats. When there are lots of services, the list can be very large. We found that estimation_timer() run for more then 200ms on a machine with 104 CPU and 50K services. yunhong-cgl jiang report the same phenomenon before: https://www.spinics.net/lists/lvs-devel/msg05426.html In some cases(for example a large K8S cluster with many ipvs services), ipvs estimation may not be needed. So adding a sysctl blob to allow users to disable this completely. Default is: 1 (enable) Cc: yunhong-cgl jiang <xintian1976@gmail.com> Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-07ACPI: tools: fix compilation errorMiguel Bernal Marin
When ACPI tools are compiled, the following error is showed: $ cd tools/power/acpi $ make DESCEND tools/acpidbg MKDIR include CP include CC tools/acpidbg/acpidbg.o In file included from /home/linux/tools/power/acpi/include/acpi/platform/acenv.h:152, from /home/linux/tools/power/acpi/include/acpi/acpi.h:22, from acpidbg.c:9: /home/linux/tools/power/acpi/include/acpi/platform/acgcc.h:25:10: fatal error: linux/stdarg.h: No such file or directory 29 | #include <linux/stdarg.h> | ^~~~~~~~~~~~~~~~ compilation terminated. Use the ACPICA logic: just identify when it is used inside the kernel or by an ACPI tool. Fixes: c0891ac15f04 ("isystem: ship and use stdarg.h") Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-07Merge branches 'fixes.2021.10.07a', 'scftorture.2021.09.16a', ↵Paul E. McKenney
'tasks.2021.09.15a', 'torture.2021.09.13b' and 'torturescript.2021.09.16a' into HEAD fixes.2021.10.07a: Miscellaneous fixes. scftorture.2021.09.16a: smp_call_function torture-test updates. tasks.2021.09.15a: Tasks-trace RCU updates. torture.2021.09.13b: Other torture-test updates. torturescript.2021.09.16a: Torture-test scripting updates.
2021-10-07Merge tag 'net-5.15-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from xfrm, bpf, netfilter, and wireless. Current release - regressions: - xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting a new value in the middle of an enum - unix: fix an issue in unix_shutdown causing the other end read/write failures - phy: mdio: fix memory leak Current release - new code bugs: - mlx5e: improve MQPRIO resiliency against bad configs Previous releases - regressions: - bpf: fix integer overflow leading to OOB access in map element pre-allocation - stmmac: dwmac-rk: fix ethernet on rk3399 based devices - netfilter: conntrack: fix boot failure with nf_conntrack.enable_hooks=1 - brcmfmac: revert using ISO3166 country code and 0 rev as fallback - i40e: fix freeing of uninitialized misc IRQ vector - iavf: fix double unlock of crit_lock Previous releases - always broken: - bpf, arm: fix register clobbering in div/mod implementation - netfilter: nf_tables: correct issues in netlink rule change event notifications - dsa: tag_dsa: fix mask for trunked packets - usb: r8152: don't resubmit rx immediately to avoid soft lockup on device unplug - i40e: fix endless loop under rtnl if FW fails to correctly respond to capability query - mlx5e: fix rx checksum offload coexistence with ipsec offload - mlx5: force round second at 1PPS out start time and allow it only in supported clock modes - phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable sequence Misc: - xfrm: slightly rejig the new policy uAPI to make it less cryptic" * tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) net: prefer socket bound to interface when not in VRF iavf: fix double unlock of crit_lock i40e: Fix freeing of uninitialized misc IRQ vector i40e: fix endless loop under rtnl dt-bindings: net: dsa: marvell: fix compatible in example ionic: move filter sync_needed bit set gve: report 64bit tx_bytes counter from gve_handle_report_stats() gve: fix gve_get_stats() rtnetlink: fix if_nlmsg_stats_size() under estimation gve: Properly handle errors in gve_assign_qpl gve: Avoid freeing NULL pointer gve: Correct available tx qpl check unix: Fix an issue in unix_shutdown causing the other end read/write failures net: stmmac: trigger PCS EEE to turn off on link down net: pcs: xpcs: fix incorrect steps on disable EEE netlink: annotate data races around nlk->bound net: pcs: xpcs: fix incorrect CL37 AN sequence net: sfp: Fix typo in state machine debug string net/sched: sch_taprio: properly cancel timer from taprio_destroy() net: bridge: fix under estimation in br_get_linkxstats_size() ...
2021-10-07Merge tag 'hyperv-fixes-signed-20211007' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Replace uuid.h with types.h in a header (Andy Shevchenko) - Avoid sleeping in atomic context in PCI driver (Long Li) - Avoid sending IPI to self when it shouldn't (Vitaly Kuznetsov) * tag 'hyperv-fixes-signed-20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Avoid erroneously sending IPI to 'self' hyper-v: Replace uuid.h with types.h PCI: hv: Fix sleep while in non-sleep context when removing child devices from the bus
2021-10-07qcom_scm: hide Kconfig symbolArnd Bergmann
Now that SCM can be a loadable module, we have to add another dependency to avoid link failures when ipa or adreno-gpu are built-in: aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe': ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available' ld.lld: error: undefined symbol: qcom_scm_is_available >>> referenced by adreno_gpu.c >>> gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in archive drivers/built-in.a This can happen when CONFIG_ARCH_QCOM is disabled and we don't select QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd use a similar dependency here to what we have for QCOM_RPROC_COMMON, but that causes dependency loops from other things selecting QCOM_SCM. This appears to be an endless problem, so try something different this time: - CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on' but that is simply selected by all of its users - All the stubs in include/linux/qcom_scm.h can go away - arm-smccc.h needs to provide a stub for __arm_smccc_smc() to allow compile-testing QCOM_SCM on all architectures. - To avoid a circular dependency chain involving RESET_CONTROLLER and PINCTRL_SUNXI, drop the 'select RESET_CONTROLLER' statement. According to my testing this still builds fine, and the QCOM platform selects this symbol already. Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Alex Elder <elder@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-10-07spi: Make several public functions private to spi.cUwe Kleine-König
All these functions have no callers apart from drivers/spi/spi.c. So drop their declarations in include/linux/spi/spi.h and don't export them. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20211007121415.2401638-5-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07spi: Remove unused function spi_busnum_to_master()Uwe Kleine-König
The last user is gone since commit 2962db71c703 ("staging/fbtft: Remove fbtft_device") in 2019. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20211007121415.2401638-3-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2021-10-07PCI: Add PCI_EXP_DEVCTL_PAYLOAD_* macrosPali Rohár
Define a macro PCI_EXP_DEVCTL_PAYLOAD_* for every possible Max Payload Size in linux/pci_regs.h, in the same style as PCI_EXP_DEVCTL_READRQ_*. Link: https://lore.kernel.org/r/20211005180952.6812-2-kabel@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
2021-10-07dma-buf: add dma_resv_for_each_fence v3Christian König
A simpler version of the iterator to be used when the dma_resv object is locked. v2: fix index check here as well v3: minor coding improvement, some documentation cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211006123609.2026-1-christian.koenig@amd.com
2021-10-07Merge tag 'wireless-drivers-next-2021-10-07' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.16 First set of patches for v5.16. ath11k getting most of new features this time. Other drivers also have few new features, and of course the usual set of fixes and cleanups all over. Major changes: rtw88 * support adaptivity for ETSI/JP DFS region * 8821c: support RFE type4 wifi NIC brcmfmac * DMI nvram filename quirk for Cyberbook T116 tablet ath9k * load calibration data and pci init values via nvmem subsystem ath11k * include channel rx and tx time in survey dump statistics * support for setting fixed Wi-Fi 6 rates from user space * support for 80P80 and 160 MHz bandwidths * spectral scan support for QCN9074 * support for calibration data files per radio * support for calibration data via eeprom * support for rx decapsulation offload (data frames in 802.3 format) * support channel 2 in 6 GHz band ath10k * include frame time stamp in beacon and probe response frames wcn36xx * enable Idle Mode Power Save (IMPS) to reduce power consumption during idle ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07eth: fwnode: add a helper for loading netdev->dev_addrJakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. There is a handful of drivers which pass netdev->dev_addr as the destination buffer to device_get_mac_address(). Add a helper which takes a dev pointer instead, so it can call an appropriate helper. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07eth: fwnode: remove the addr len from mac helpersJakub Kicinski
All callers pass in ETH_ALEN and the function itself will return -EINVAL for any other address length. Just assume it's ETH_ALEN like all other mac address helpers (nvm, of, platform). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07eth: fwnode: change the return type of mac address helpersJakub Kicinski
fwnode_get_mac_address() and device_get_mac_address() return a pointer to the buffer that was passed to them on success or NULL on failure. None of the callers care about the actual value, only if it's NULL or not. These semantics differ from of_get_mac_address() which returns an int so to avoid confusion make the device helpers return an errno. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07device property: move mac addr helpers to eth.cJakub Kicinski
Move the mac address helpers out, eth.c already contains a bunch of similar helpers. Suggested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07of: net: add a helper for loading netdev->dev_addrJakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. There are roughly 40 places where netdev->dev_addr is passed as the destination to a of_get_mac_address() call. Add a helper which takes a dev pointer instead, so it can call an appropriate helper. Note that of_get_mac_address() already assumes the address is 6 bytes long (ETH_ALEN) so use eth_hw_addr_set(). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07of: net: move of_net under net/Jakub Kicinski
Rob suggests to move of_net.c from under drivers/of/ somewhere to the networking code. Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07sched,rcu: Rework try_invoke_on_locked_down_task()Peter Zijlstra
Give try_invoke_on_locked_down_task() a saner name and have it return an int so that the caller might distinguish between different reasons of failure. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Vasily Gorbik <gor@linux.ibm.com> # on s390 Link: https://lkml.kernel.org/r/20210929152428.649944917@infradead.org
2021-10-07futex: Implement sys_futex_waitv()André Almeida
Add support to wait on multiple futexes. This is the interface implemented by this syscall: futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes, unsigned int flags, struct timespec *timeout, clockid_t clockid) struct futex_waitv { __u64 val; __u64 uaddr; __u32 flags; __u32 __reserved; }; Given an array of struct futex_waitv, wait on each uaddr. The thread wakes if a futex_wake() is performed at any uaddr. The syscall returns immediately if any waiter has *uaddr != val. *timeout is an optional absolute timeout value for the operation. This syscall supports only 64bit sized timeout structs. The flags argument of the syscall should be empty, but it can be used for future extensions. Flags for shared futexes, sizes, etc. should be used on the individual flags of each waiter. __reserved is used for explicit padding and should be 0, but it might be used for future extensions. If the userspace uses 32-bit pointers, it should make sure to explicitly cast it when assigning to waitv::uaddr. Returns the array index of one of the woken futexes. There’s no given information of how many were woken, or any particular attribute of it (if it’s the first woken, if it is of the smaller index...). Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210923171111.300673-17-andrealmeid@collabora.com
2021-10-07futex: Split out syscallsPeter Zijlstra
Put the syscalls in their own little file. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: André Almeida <andrealmeid@collabora.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: André Almeida <andrealmeid@collabora.com> Link: https://lore.kernel.org/r/20210923171111.300673-3-andrealmeid@collabora.com
2021-10-07Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/David S. Miller
ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2021-10-07 1) Fix a sysbot reported shift-out-of-bounds in xfrm_get_default. From Pavel Skripkin. 2) Fix XFRM_MSG_MAPPING ABI breakage. The new XFRM_MSG_MAPPING messages were accidentally not paced at the end. Fix by Eugene Syromiatnikov. 3) Fix the uapi for the default policy, use explicit field and macros and make it accessible to userland. From Nicolas Dichtel. 4) Fix a missing rcu lock in xfrm_notify_userpolicy(). From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-07ALSA: hda: intel: Allow repeatedly probing on codec configuration errorsTakashi Iwai
It seems that a few recent AMD systems show the codec configuration errors at the early boot, while loading the driver at a later stage works magically. Although the root cause of the error isn't clear, it's certainly not bad to allow retrying the codec probe in such a case if that helps. This patch adds the capability for retrying the probe upon codec probe errors on the certain AMD platforms. The probe_work is changed to a delayed work, and at the secondary call, it'll jump to the codec probing. Note that, not only adding the re-probing, this includes the behavior changes in the codec configuration function. Namely, snd_hda_codec_configure() won't unregister the codec at errors any longer. Instead, its caller, azx_codec_configure() unregisters the codecs with the probe failures *if* any codec has been successfully configured. If all codec probe failed, it doesn't unregister but let it re-probed -- which is the most case we're seeing and this patch tries to improve. Even if the driver doesn't re-probe or give up, it will go to the "free-all" error path, hence the leftover codecs shall be disabled / deleted in anyway. BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1190801 Link: https://lore.kernel.org/r/20211006141940.2897-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-06net: mdio: add mdiobus_modify_changed()Russell King (Oracle)
Add mdiobus_modify_changed() helper to reflect the phylib and similar equivalents. This will avoid this functionality being open-coded, as has already happened in phylink, and it looks like other users will be appearing soon. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06ethtool: Add transceiver module extended stateIdo Schimmel
Add an extended state and sub-state to describe link issues related to transceiver modules. The 'ETHTOOL_LINK_EXT_SUBSTATE_MODULE_CMIS_NOT_READY' extended sub-state tells user space that port is unable to gain a carrier because the CMIS Module State Machine did not reach the ModuleReady (Fully Operational) state. For example, if the module is stuck at ModuleLowPwr or ModuleFault state. In case of the latter, user space can read the fault reason from the module's EEPROM and potentially reset it. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06ethtool: Add ability to control transceiver modules' power modeIdo Schimmel
Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06kunit: fix kernel-doc warnings due to mismatched arg namesDaniel Latypov
Commit 7122debb4367 ("kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers") added new functions but called last arg `flags`, unlike the existing code that used `gfp`. This only is an issue in test.h, test.c still used `gfp`. But the documentation was copy-pasted with the old names, leading to kernel-doc warnings. Do s/flags/gfp to make the names consistent and fix the warnings. Fixes: 7122debb4367 ("kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Latypov <dlatypov@google.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-06RDMA/efa: CQ notificationsGal Pressman
This patch adds support for CQ notifications through the standard verbs api. In order to achieve that, a new event queue (EQ) object is introduced, which is in charge of reporting completion events to the driver. On driver load, EQs are allocated and their affinity is set to a single cpu. When a user app creates a CQ with a completion channel, the completion vector number is converted to a EQ number, which is in charge of reporting the CQ events. In addition, the CQ creation admin command now returns an offset for the CQ doorbell, which is mapped to the userspace provider and is used to arm the CQ when requested by the user. The EQs use a single doorbell (located on the registers BAR), which encodes the EQ number and arm as part of the doorbell value. The EQs are polled by the driver on each new EQE, and arm it when the poll is completed. Link: https://lore.kernel.org/r/20211003105605.29222-1-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-07Merge branch 'objtool/urgent'Peter Zijlstra
Fixup conflicts. # Conflicts: # tools/objtool/check.c
2021-10-06coredump: Don't perform any cleanups before dumping coreEric W. Biederman
Rename coredump_exit_mm to coredump_task_exit and call it from do_exit before PTRACE_EVENT_EXIT, and before any cleanup work for a task happens. This ensures that an accurate copy of the process can be captured in the coredump as no cleanup for the process happens before the coredump completes. This also ensures that PTRACE_EVENT_EXIT will not be visited by any thread until the coredump is complete. Add a new flag PF_POSTCOREDUMP so that tasks that have passed through coredump_task_exit can be recognized and ignored in zap_process. Now that all of the coredumping happens before exit_mm remove code to test for a coredump in progress from mm_release. Replace "may_ptrace_stop()" with a simple test of "current->ptrace". The other tests in may_ptrace_stop all concern avoiding stopping during a coredump. These tests are no longer necessary as it is now guaranteed that fatal_signal_pending will be set if the code enters ptrace_stop during a coredump. The code in ptrace_stop is guaranteed not to stop if fatal_signal_pending returns true. Until this change "ptrace_event(PTRACE_EVENT_EXIT)" could call ptrace_stop without fatal_signal_pending being true, as signals are dequeued in get_signal before calling do_exit. This is no longer an issue as "ptrace_event(PTRACE_EVENT_EXIT)" is no longer reached until after the coredump completes. Link: https://lkml.kernel.org/r/874kaax26c.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-06ptrace: Remove the unnecessary arguments from arch_ptrace_stopEric W. Biederman
Both arch_ptrace_stop_needed and arch_ptrace_stop are called with an exit_code and a siginfo structure. Neither argument is used by any of the implementations so just remove the unneeded arguments. The two arechitectures that implement arch_ptrace_stop are ia64 and sparc. Both architectures flush their register stacks before a ptrace_stack so that all of the register information can be accessed by debuggers. As the question of if a register stack needs to be flushed is independent of why ptrace is stopping not needing arguments make sense. Cc: David Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Link: https://lkml.kernel.org/r/87lf3mx290.fsf@disp2133 Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-06hyper-v: Replace uuid.h with types.hAndy Shevchenko
There is no user of anything in uuid.h in the hyperv.h. Replace it with more appropriate types.h. Fixes: f081bbb3fd03 ("hyper-v: Remove internal types from UAPI header") Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/20211001135544.1823-1-andriy.shevchenko@linux.intel.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-06dma-buf: add dma_resv_for_each_fence_unlocked v8Christian König
Abstract the complexity of iterating over all the fences in a dma_resv object. The new loop handles the whole RCU and retry dance and returns only fences where we can be sure we grabbed the right one. v2: fix accessing the shared fences while they might be freed, improve kerneldoc, rename _cursor to _iter, add dma_resv_iter_is_exclusive, add dma_resv_iter_begin/end v3: restructor the code, move rcu_read_lock()/unlock() into the iterator, add dma_resv_iter_is_restarted() v4: fix NULL deref when no explicit fence exists, drop superflous rcu_read_lock()/unlock() calls. v5: fix typos in the documentation v6: fix coding error when excl fence is NULL v7: one more logic fix v8: fix index check in dma_resv_iter_is_exclusive() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v7) Link: https://patchwork.freedesktop.org/patch/msgid/20211005113742.1101-2-christian.koenig@amd.com
2021-10-05bpf: selftests: Add selftests for module kfunc supportKumar Kartikeya Dwivedi
This adds selftests that tests the success and failure path for modules kfuncs (in presence of invalid kfunc calls) for both libbpf and gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can add module BTF ID set from bpf_testmod. This also introduces a couple of test cases to verifier selftests for validating whether we get an error or not depending on if invalid kfunc call remains after elimination of unreachable instructions. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com