linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-03-23	pwm: cros-ec: Explicitly set .polarity in .get_state()	Uwe Kleine-König
	The driver only supports normal polarity. Complete the implementation of .get_state() by setting .polarity accordingly. Reviewed-by: Guenter Roeck <groeck@chromium.org> Fixes: 1f0d3bb02785 ("pwm: Add ChromeOS EC PWM driver") Link: https://lore.kernel.org/r/20230228135508.1798428-3-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2023-03-23	pwm: hibvt: Explicitly set .polarity in .get_state()	Uwe Kleine-König
	The driver only both polarities. Complete the implementation of .get_state() by setting .polarity according to the configured hardware state. Fixes: d09f00810850 ("pwm: Add PWM driver for HiSilicon BVT SOCs") Link: https://lore.kernel.org/r/20230228135508.1798428-2-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2023-03-23	drm/amd/display: Set dcn32 caps.seamless_odm	Hersen Wu
	[Why & How] seamless_odm set was not picked up while merging commit 2d017189e2b3 ("drm/amd/display: Blank eDP on enable drv if odm enabled") Fixes: 2d017189e2b3 ("drm/amd/display: Blank eDP on enable drv if odm enabled") Reviewed-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amd/display: fix wrong index used in dccg32_set_dpstreamclk	Hersen Wu
	[Why & How] When merging commit 9af611f29034 ("drm/amd/display: Fix DCN32 DPSTREAMCLK_CNTL programming"), index change was not picked up. Cc: stable@vger.kernel.org Cc: Mario Limonciello <mario.limonciello@amd.com> Fixes: 9af611f29034 ("drm/amd/display: Fix DCN32 DPSTREAMCLK_CNTL programming") Reviewed-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi	Kai-Heng Feng
	S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default"). The root cause is still not clear for now. So extend and apply the ASPM quirk from commit e02fe3bc7aba ("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to workaround the issue on Navi cards too. Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-03-23	drm/amd/display: remove outdated 8bpc comments	Alex Hung
	[Why] The commit c76e483cd916 ("drm/amd/display: Don't restrict bpc to 8 bpc") removes the historical 8bpc dependency and sets max_bpc to 16. [How] The comment that states "8bpc for non-edp" needs to be removed as well. Reviewed-by: Harry Wentland <Harry.Wentland@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amdgpu/gfx: set cg flags to enter/exit safe mode	Jane Jian
	sriov needs to enter/exit safe mode in update umd p state add the cg flag to let it enter or exit while needed Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs	YuBiao Wang
	[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. The fences in this case are never signaled and next time when we try to free the sa_bo, kernel will hang. [How] During pre_asic_reset, driver will clear job fences and afterwards the fences' refcount will be reduced to 1. For drm_sched_jobs it will be released in job_free_cb, and for non-sched jobs like ib_test, it's meant to be released in sa_bo_free but only when the fences are signaled. So we have to force signal the non_sched bad job's fence during pre_asic_reset or the clear is not complete. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amdgpu: add mes resume when do gfx post soft reset	Tong Liu01
	[why] when gfx do soft reset, mes will also do reset, if mes is not resumed when do recover from soft reset, mes is unable to respond in later sequence [how] resume mes when do gfx post soft reset Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23	drm/amdgpu: skip ASIC reset for APUs when go to S4	Tim Huang
	For GC IP v11.0.4/11, PSP TMR need to be reserved for ASIC mode2 reset. But for S4, when psp suspend, it will destroy the TMR that fails the ASIC reset. [ 96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset [ 100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002 [ 100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed! [ 100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62 [ 100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62 [ 100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs [ 100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62 We can skip the reset on APUs, assuming we can resume them properly. Verified on some GFX11, GFX10 and old GFX9 APUs. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-03-23	drm/amdgpu: reposition the gpu reset checking for reuse	Tim Huang
	Move the amdgpu_acpi_should_gpu_reset out of CONFIG_SUSPEND to share it with hibernate case. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-03-23	efi/libstub: Use relocated version of kernel's struct screen_info	Ard Biesheuvel
	In some cases, we expose the kernel's struct screen_info to the EFI stub directly, so it gets populated before even entering the kernel. This means the early console is available as soon as the early param parsing happens, which is nice. It also means we need two different ways to pass this information, as this trick only works if the EFI stub is baked into the core kernel image, which is not always the case. Huacai reports that the preparatory refactoring that was needed to implement this alternative method for zboot resulted in a non-functional efifb earlycon for other cases as well, due to the reordering of the kernel image relocation with the population of the screen_info struct, and the latter now takes place after copying the image to its new location, which means we copy the old, uninitialized state. So let's ensure that the same-image version of alloc_screen_info() produces the correct screen_info pointer, by taking the displacement of the loaded image into account. Reported-by: Huacai Chen <chenhuacai@loongson.cn> Tested-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/linux-efi/20230310021749.921041-1-chenhuacai@loongson.cn/ Fixes: 42c8ea3dca094ab8 ("efi: libstub: Factor out EFI stub entrypoint into separate file") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-03-23	mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25	Bhavya Kapoor
	Timing Information in Datasheet assumes that HIGH_SPEED_ENA=1 should be set for SDR12 and SDR25 modes. But sdhci_am654 driver clears HIGH_SPEED_ENA register. Thus, Modify sdhci_am654 to not clear HIGH_SPEED_ENA (HOST_CONTROL[2]) bit for SDR12 and SDR25 speed modes. Fixes: e374e87538f4 ("mmc: sdhci_am654: Clear HISPD_ENA in some lower speed modes") Signed-off-by: Bhavya Kapoor <b-kapoor@ti.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230317092711.660897-1-b-kapoor@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-03-23	net: mdio: thunder: Add missing fwnode_handle_put()	Liang He
	In device_for_each_child_node(), we should add fwnode_handle_put() when break out of the iteration device_for_each_child_node() as it will automatically increase and decrease the refcounter. Fixes: 379d7ac7ca31 ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	net/sched: act_api: use the correct TCA_ACT attributes in dump	Pedro Tammela
	4 places in the act api code are using 'TCA_' definitions where they should be using 'TCA_ACT_', which is confusing for the reader, although functionally they are equivalent. Cc: Hangbin Liu <haliu@redhat.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Hangbin Liu <haliu@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	Merge branch '1GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-21 (igb, igbvf, igc) This series contains updates to igb, igbvf, and igc drivers. Andrii changes igb driver to utilize diff_by_scaled_ppm() implementation over an open-coded version. Dawid adds pci_error_handlers for reset_prepare and reset_done for igbvf. Sasha removes unnecessary code in igc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	Merge branch 'ipv4-address-protocol'	David S. Miller
	Petr Machata says: ==================== net: Allow changing IPv4 address protocol IPv4 and IPv6 addresses can be assigned a protocol value that indicates the provenance of the IP address. The attribute is modeled after ip route protocols, and essentially allows the administrator or userspace stack to tag addresses in some way that makes sense to the actor in question. When IP address protocol field was added in commit 47f0bd503210 ("net: Add new protocol attribute to IP addresses"), the semantics included the ability to change the protocol for IPv6 addresses, but not for IPv4 addresses. It seems this was not deliberate, but rather by accident. One particular use case is tagging the addresses differently depending on whether the routing stack should advertise them or not. Without support for protocol replacement, this can not be done. In this patchset, extend IPv4 to allow changing the protocol defined at an address (in patch #1). Then in patches #2 and #3 add selftest coverage for ip address protocols. Currently the kernel simply ignores the new value. Thus allowing the replacement changes the observable behavior. However, since IPv6 already behaves like this anyway, and since the feature as such is relatively new, it seems like the change is safe to make. An example session with the feature in action: bash-5.2# ip address add dev d 192.0.2.1/28 proto 0xab bash-5.2# ip address show dev d 4: d: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 06:29:74:fd:1f:eb brd ff:ff:ff:ff:ff:ff inet 192.0.2.1/28 scope global proto 0xab d valid_lft forever preferred_lft forever bash-5.2# ip address replace dev d 192.0.2.1/28 proto 0x11 bash-5.2# ip address show dev d 4: d: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether 06:29:74:fd:1f:eb brd ff:ff:ff:ff:ff:ff inet 192.0.2.1/28 scope global proto 0x11 d valid_lft forever preferred_lft forever ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	selftests: rtnetlink: Add an address proto test	Petr Machata
	Add coverage of "ip address {add,replace} ... proto" support. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	selftests: rtnetlink: Make the set of tests to run configurable	Petr Machata
	Extract the list of all tests into a variable, ALL_TESTS. Then assume the environment variable TESTS holds the list of tests to actually run, falling back to ALL_TESTS if TESTS is empty. This is the same interface that forwarding selftests use to make the set of tests to run configurable. In addition to this, allow setting the value explicitly through a command line option "-t" along the lines of what fib_nexthops.sh does. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	net: ipv4: Allow changing IPv4 address protocol	Petr Machata
	When IP address protocol field was added in commit 47f0bd503210 ("net: Add new protocol attribute to IP addresses"), the semantics included the ability to change the protocol for IPv6 addresses, but not for IPv4 addresses. It seems this was not deliberate, but rather by accident. A userspace that wants to change the protocol of an address might drop and recreate the address, but that disrupts routing and is just impractical. So in this patch, when an IPv4 address is replaced (through RTM_NEWADDR request with NLM_F_REPLACE flag), update the proto at the address to the one given in the request, or zero if none is given. This matches the behavior of IPv6. Previously, any new value given was simply ignored. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-23	RDMA/cma: Allow UD qp_type to join multicast only	Mark Zhang
	As for multicast: - The SIDR is the only mode that makes sense; - Besides PS_UDP, other port spaces like PS_IB is also allowed, as it is UD compatible. In this case qkey also needs to be set [1]. This patch allows only UD qp_type to join multicast, and set qkey to default if it's not set, to fix an uninit-value error: the ib->rec.qkey field is accessed without being initialized. ===================================================== BUG: KMSAN: uninit-value in cma_set_qkey drivers/infiniband/core/cma.c:510 [inline] BUG: KMSAN: uninit-value in cma_make_mc_event+0xb73/0xe00 drivers/infiniband/core/cma.c:4570 cma_set_qkey drivers/infiniband/core/cma.c:510 [inline] cma_make_mc_event+0xb73/0xe00 drivers/infiniband/core/cma.c:4570 cma_iboe_join_multicast drivers/infiniband/core/cma.c:4782 [inline] rdma_join_multicast+0x2b83/0x30a0 drivers/infiniband/core/cma.c:4814 ucma_process_join+0xa76/0xf60 drivers/infiniband/core/ucma.c:1479 ucma_join_multicast+0x1e3/0x250 drivers/infiniband/core/ucma.c:1546 ucma_write+0x639/0x6d0 drivers/infiniband/core/ucma.c:1732 vfs_write+0x8ce/0x2030 fs/read_write.c:588 ksys_write+0x28c/0x520 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_32_irqs_on arch/x86/entry/common.c:114 [inline] __do_fast_syscall_32+0x96/0xf0 arch/x86/entry/common.c:180 do_fast_syscall_32+0x34/0x70 arch/x86/entry/common.c:205 do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:248 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c Local variable ib.i created at: cma_iboe_join_multicast drivers/infiniband/core/cma.c:4737 [inline] rdma_join_multicast+0x586/0x30a0 drivers/infiniband/core/cma.c:4814 ucma_process_join+0xa76/0xf60 drivers/infiniband/core/ucma.c:1479 CPU: 0 PID: 29874 Comm: syz-executor.3 Not tainted 5.16.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ===================================================== [1] https://lore.kernel.org/linux-rdma/20220117183832.GD84788@nvidia.com/ Fixes: b5de0c60cc30 ("RDMA/cma: Fix use after free race in roce multicast join") Reported-by: syzbot+8fcbb77276d43cc8b693@syzkaller.appspotmail.com Signed-off-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/58a4a98323b5e6b1282e83f6b76960d06e43b9fa.1679309909.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-23	modpost: Fix processing of CRCs on 32-bit build machines	Ben Hutchings
	modpost now reads CRCs from ..cmd files, parsing them using strtol(). This is inconsistent with its parsing of Module.symvers and with their definition as unsigned* 32-bit values. strtol() clamps values to [LONG_MIN, LONG_MAX], and when building on a 32-bit system this changes all CRCs >= 0x80000000 to be 0x7fffffff. Change extract_crcs_for_object() to use strtoul() instead. Cc: stable@vger.kernel.org Fixes: f292d875d0dc ("modpost: extract symbol versions from *.cmd files") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-03-23	scripts: merge_config: Fix typo in variable name.	Mirsad Goran Todorovac
	${WARNOVERRIDE} was misspelled as ${WARNOVVERIDE}, which caused a shell syntax error in certain paths of the script execution. Fixes: 46dff8d7e381 ("scripts: merge_config: Add option to suppress warning on overrides") Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-03-22	Merge branch 'Transit between BPF TCP congestion controls.'	Martin KaFai Lau
	Kui-Feng Lee says: ==================== Major changes: - Create bpf_links in the kernel for BPF struct_ops to register and unregister it. - Enables switching between implementations of bpf-tcp-cc under a name instantly by replacing the backing struct_ops map of a bpf_link. Previously, BPF struct_ops didn't go off, as even when the user program creating it was terminated, none of these ever were pinned. For instance, the TCP congestion control subsystem indirectly maintains a reference count on the struct_ops of any registered BPF implemented algorithm. Thus, the algorithm won't be deactivated until someone deliberately unregisters it. For compatibility with other BPF programs, bpf_links have been created to work in coordination with struct_ops maps. This ensures that the registration and unregistration of these respective maps is carried out at the start and end of the bpf_link. We also faced complications when attempting to replace an existing TCP congestion control algorithm with a new implementation on the fly. A struct_ops map was used to register a TCP congestion control algorithm with a unique name. We had to either register the alternative implementation with a new name and move over or unregister the current one before being able to reregistration with the same name. To fix this problem, we can an option to migrate the registration of the algorithm from struct_ops maps to bpf_links. By modifying the backing map of a bpf_link, it suddenly becomes possible to replace an existing TCP congestion control algorithm with ease. --- The major differences from v11: - Fix incorrectly setting both old_prog_fd and old_map_fd. The major differences from v10: - Add old_map_fd as an additional field instead of an union in bpf_link_update_opts. The major differences from v9: - Add test case for BPF_F_LINK. Includes adding old_map_fd to struct bpf_link_update_opts in patch 6. - Return -EPERM instead of -EINVAL when the old map fd doesn't match with BPF_F_LINK. - Fix -EBUSY case in bpf_map__attach_struct_ops(). The major differences form v8: - Check bpf_struct_ops::{validate,update} in bpf_struct_ops_map_alloc() The major differences from v7: - Use synchronize_rcu_mult(call_rcu, call_rcu_tasks) to replace synchronize_rcu() and synchronize_rcu_tasks(). - Call synchronize_rcu() in tcp_update_congestion_control(). - Handle -EBUSY in bpf_map__attach_struct_ops() to allow a struct_ops can be used to create links more than once. Include a test case. - Add old_map_fd to bpf_attr and handle BPF_F_REPLACE in bpf_struct_ops_map_link_update(). - Remove changes in bpf_dummy_struct_ops.c and add a check of .update function pointer of bpf_struct_ops. The major differences from v6: - Reword commit logs of the patch 1, 2, and 8. - Call synchronize_rcu_tasks() as well in bpf_struct_ops_map_free(). - Refactor bpf_struct_ops_map_free() so that bpf_struct_ops_map_alloc() can free a struct_ops without waiting for a RCU grace period. The major differences from v5: - Add a new step to bpf_object__load() to prepare vdata. - Accept BPF_F_REPLACE. - Check section IDs in find_struct_ops_map_by_offset() - Add a test case to check mixing w/ and w/o link struct_ops. - Add a test case of using struct_ops w/o link to update a link. - Improve bpf_link__detach_struct_ops() to handle the w/ link case. The major differences from v4: - Rebase. - Reorder patches and merge part 4 to part 2 of the v4. The major differences from v3: - Remove bpf_struct_ops_map_free_rcu(), and use synchronize_rcu(). - Improve the commit log of the part 1. - Before transitioning to the READY state, we conduct a value check to ensure that struct_ops can be successfully utilized and links created later. The major differences from v2: - Simplify states - Remove TOBEUNREG. - Rename UNREG to READY. - Stop using the refcnt of the kvalue of a struct_ops. Explicitly increase and decrease the refcount of struct_ops. - Prepare kernel vdata during the load phase of libbpf. The major differences from v1: - Added bpf_struct_ops_link to replace the previous union-based approach. - Added UNREG and TOBEUNREG to the state of bpf_struct_ops_map. - bpf_struct_ops_transit_state() maintains state transitions. - Fixed synchronization issue. - Prepare kernel vdata of struct_ops during the loading phase of bpf_object. - Merged previous patch 3 to patch 1. v11: https://lore.kernel.org/all/20230323010409.2265383-1-kuifeng@meta.com/ v10: https://lore.kernel.org/all/20230321232813.3376064-1-kuifeng@meta.com/ v9: https://lore.kernel.org/all/20230320195644.1953096-1-kuifeng@meta.com/ v8: https://lore.kernel.org/all/20230318053144.1180301-1-kuifeng@meta.com/ v7: https://lore.kernel.org/all/20230316023641.2092778-1-kuifeng@meta.com/ v6: https://lore.kernel.org/all/20230310043812.3087672-1-kuifeng@meta.com/ v5: https://lore.kernel.org/all/20230308005050.255859-1-kuifeng@meta.com/ v4: https://lore.kernel.org/all/20230307232913.576893-1-andrii@kernel.org/ v3: https://lore.kernel.org/all/20230303012122.852654-1-kuifeng@meta.com/ v2: https://lore.kernel.org/bpf/20230223011238.12313-1-kuifeng@meta.com/ v1: https://lore.kernel.org/bpf/20230214221718.503964-1-kuifeng@meta.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	selftests/bpf: Test switching TCP Congestion Control algorithms.	Kui-Feng Lee
	Create a pair of sockets that utilize the congestion control algorithm under a particular name. Then switch up this congestion control algorithm to another implementation and check whether newly created connections using the same cc name now run the new implementation. Also, try to update a link with a struct_ops that is without BPF_F_LINK or with a wrong or different name. These cases should fail due to the violation of assumptions. To update a bpf_link of a struct_ops, it must be replaced with another struct_ops that is identical in type and name and has the BPF_F_LINK flag. The other test case is to create links from the same struct_ops more than once. It makes sure a struct_ops can be used repeatly. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-9-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	libbpf: Use .struct_ops.link section to indicate a struct_ops with a link.	Kui-Feng Lee
	Flags a struct_ops is to back a bpf_link by putting it to the ".struct_ops.link" section. Once it is flagged, the created struct_ops can be used to create a bpf_link or update a bpf_link that has been backed by another struct_ops. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-8-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	libbpf: Update a bpf_link with another struct_ops.	Kui-Feng Lee
	Introduce bpf_link__update_map(), which allows to atomically update underlying struct_ops implementation for given struct_ops BPF link. Also add old_map_fd to struct bpf_link_update_opts to handle BPF_F_REPLACE feature. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-7-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	bpf: Update the struct_ops of a bpf_link.	Kui-Feng Lee
	By improving the BPF_LINK_UPDATE command of bpf(), it should allow you to conveniently switch between different struct_ops on a single bpf_link. This would enable smoother transitions from one struct_ops to another. The struct_ops maps passing along with BPF_LINK_UPDATE should have the BPF_F_LINK flag. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-6-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	libbpf: Create a bpf_link in bpf_map__attach_struct_ops().	Kui-Feng Lee
	bpf_map__attach_struct_ops() was creating a dummy bpf_link as a placeholder, but now it is constructing an authentic one by calling bpf_link_create() if the map has the BPF_F_LINK flag. You can flag a struct_ops map with BPF_F_LINK by calling bpf_map__set_map_flags(). Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-5-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	bpf: Create links for BPF struct_ops maps.	Kui-Feng Lee
	Make bpf_link support struct_ops. Previously, struct_ops were always used alone without any associated links. Upon updating its value, a struct_ops would be activated automatically. Yet other BPF program types required to make a bpf_link with their instances before they could become active. Now, however, you can create an inactive struct_ops, and create a link to activate it later. With bpf_links, struct_ops has a behavior similar to other BPF program types. You can pin/unpin them from their links and the struct_ops will be deactivated when its link is removed while previously need someone to delete the value for it to be deactivated. bpf_links are responsible for registering their associated struct_ops. You can only use a struct_ops that has the BPF_F_LINK flag set to create a bpf_link, while a structs without this flag behaves in the same manner as before and is registered upon updating its value. The BPF_LINK_TYPE_STRUCT_OPS serves a dual purpose. Not only is it used to craft the links for BPF struct_ops programs, but also to create links for BPF struct_ops them-self. Since the links of BPF struct_ops programs are only used to create trampolines internally, they are never seen in other contexts. Thus, they can be reused for struct_ops themself. To maintain a reference to the map supporting this link, we add bpf_struct_ops_link as an additional type. The pointer of the map is RCU and won't be necessary until later in the patchset. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-4-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	net: Update an existing TCP congestion control algorithm.	Kui-Feng Lee
	This feature lets you immediately transition to another congestion control algorithm or implementation with the same name. Once a name is updated, new connections will apply this new algorithm. The purpose is to update a customized algorithm implemented in BPF struct_ops with a new version on the flight. The following is an example of using the userspace API implemented in later BPF patches. link = bpf_map__attach_struct_ops(skel->maps.ca_update_1); ....... err = bpf_link__update_map(link, skel->maps.ca_update_2); We first load and register an algorithm implemented in BPF struct_ops, then swap it out with a new one using the same name. After that, newly created connections will apply the updated algorithm, while older ones retain the previous version already applied. This patch also takes this chance to refactor the ca validation into the new tcp_validate_congestion_control() function. Cc: netdev@vger.kernel.org, Eric Dumazet <edumazet@google.com> Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-3-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	bpf: Retire the struct_ops map kvalue->refcnt.	Kui-Feng Lee
	We have replaced kvalue-refcnt with synchronize_rcu() to wait for an RCU grace period. Maintenance of kvalue->refcnt was a complicated task, as we had to simultaneously keep track of two reference counts: one for the reference count of bpf_map. When the kvalue->refcnt reaches zero, we also have to reduce the reference count on bpf_map - yet these steps are not performed in an atomic manner and require us to be vigilant when managing them. By eliminating kvalue->refcnt, we can make our maintenance more straightforward as the refcount of bpf_map is now solely managed! To prevent the trampoline image of a struct_ops from being released while it is still in use, we wait for an RCU grace period. The setsockopt(TCP_CONGESTION, "...") command allows you to change your socket's congestion control algorithm and can result in releasing the old struct_ops implementation. It is fine. However, this function is exposed through bpf_setsockopt(), it may be accessed by BPF programs as well. To ensure that the trampoline image belonging to struct_op can be safely called while its method is in use, the trampoline safeguarde the BPF program with rcu_read_lock(). Doing so prevents any destruction of the associated images before returning from a trampoline and requires us to wait for an RCU grace period. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-2-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-03-22	Merge tag 'mlx5-fixes-2023-03-21' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2023-03-21 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2023-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: E-Switch, Fix an Oops in error handling code net/mlx5: Read the TC mapping of all priorities on ETS query net/mlx5e: Overcome slow response for first macsec ASO WQE net/mlx5e: Initialize link speed to zero net/mlx5: Fix steering rules cleanup net/mlx5e: Block entering switchdev mode with ns inconsistency net/mlx5e: Set uplink rep as NETNS_LOCAL ==================== Link: https://lore.kernel.org/r/20230321211135.47711-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-21 (ice) This series contains updates to ice driver only. Piotr sets first_desc field for proper handling of Flow Director packets. Michal moves error checking for VF earlier in function to properly return error before other checks/reporting; he also corrects VSI filter removal to be done during VSI removal and not rebuild. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: remove filters only if VSI is deleted ice: check if VF exists before mode check ice: fix rx buffers handling for flow director packets ==================== Link: https://lore.kernel.org/r/20230321183641.2849726-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch 'net-ipa-fully-support-ipa-v5-0'	Jakub Kicinski
	Alex Elder says: ==================== net: ipa: fully support IPA v5.0 At long last, add the IPA and GSI register definitions, and the configuration data required to support IPA v5.0. This enables IPA support for the Qualcomm SDX65 SoC. The first version of this series had build errors due to a non-existent source file being required. This version addresses that by changing how required files are specified in the Makefile. Note that the second patch has some warnings about lines starting with spaces; those spaces align text with the open parenthesis on the previous line. ==================== Link: https://lore.kernel.org/r/20230321182644.2143990-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: ipa: add IPA v5.0 configuration data	Alex Elder
	Add the configuration data required for IPA v5.0, which is used in the SDX65 SoC. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: ipa: add IPA v5.0 GSI register definitions	Alex Elder
	Add the definitions of GSI register offsets and fields for IPA v5.0. These are used for the SDX65 SoC. Increase the maximum channel and event ring counts supported by the driver, so those implemented by the SDX65 are supported. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: ipa: add IPA v5.0 register definitions	Alex Elder
	Add the definitions of IPA register offsets and fields for IPA v5.0. These are used for the SDX65 SoC. In the Makefile, split IPA_VERSIONS to use IPA_REG_VERSIONS and IPA_DATA_VERSIONS instead, to allow IPA register definitions for a new version to be added separate from the IPA data. Rename GSI_IPA_VERSIONS to be GSI_REG_VERSIONS for consistency. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch 'quirk-for-oem-sfp-2-5g-t-copper-module'	Jakub Kicinski
	Russell King says: ==================== Quirk for OEM SFP-2.5G-T copper module Frank Wunderlich reports that this copper module requires a quirk in order to function - in that the module needs to use 2500base-X. Moreover, negotiation must be disabled. An example of this device would be: https://www.optcore.net/product/sfp-2g-t-gen ==================== Link: https://lore.kernel.org/r/ZBniMlTDZJQ242DP@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: sfp: add quirk for 2.5G copper SFP	Russell King (Oracle)
	Add a quirk for a copper SFP that identifies itself as "OEM" "SFP-2.5G-T". This module's PHY is inaccessible, and can only run at 2500base-X with the host without negotiation. Add a quirk to enable the 2500base-X interface mode with 2500base-T support, and disable autonegotiation. Reported-by: Frank Wunderlich <frank-w@public-files.de> Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: sfp-bus: allow SFP quirks to override Autoneg and pause bits	Russell King (Oracle)
	Allow SFP quirks to override the Autoneg, Pause and Asym_Pause bits in the support mask. Some modules have an inaccessible PHY on which is only accessible via 2500base-X without Autonegotiation. We therefore want to be able to clear the Autoneg bit. Rearrange sfp_parse_support() to allow a SFP modes quirk to override this bit. Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch 'net-remove-some-skb_mac_header-assumptions'	Jakub Kicinski
	Eric Dumazet says: ==================== net: remove some skb_mac_header assumptions Historically, we tried o maintain skb_mac_header available in most of networking paths. When reaching ndo_start_xmit() handlers, skb_mac_header() should always be skb->data. With recent additions of skb_mac_header_was_set() and DEBUG_NET_WARN_ON_ONCE() in skb_mac_header(), we can attempt to remove our reliance on skb_mac_header in TX paths. When this effort completes we will remove skb_reset_mac_header() from __dev_queue_xmit() and replace it by skb_unset_mac_header() on DEBUG_NET builds. ==================== Link: https://lore.kernel.org/r/20230321164519.1286357-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net/sched: remove two skb_mac_header() uses	Eric Dumazet
	tcf_mirred_act() and tcf_mpls_act() can use skb_network_offset() instead of relying on skb_mac_header(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	sch_cake: do not use skb_mac_header() in cake_overhead()	Eric Dumazet
	We want to remove our use of skb_mac_header() in tx paths, eg remove skb_reset_mac_header() from __dev_queue_xmit(). Idea is that ndo_start_xmit() can get the mac header simply looking at skb->data. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: do not use skb_mac_header() in qdisc_pkt_len_init()	Eric Dumazet
	We want to remove our use of skb_mac_header() in tx paths, eg remove skb_reset_mac_header() from __dev_queue_xmit(). Idea is that ndo_start_xmit() can get the mac header simply looking at skb->data. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch 'remove-phylink_state-s-an_enabled-member'	Jakub Kicinski
	Russell King says: ==================== Remove phylink_state's an_enabled member Now that all the fixes and correctness patches have been merged, it is time to switch the two users that make use of .an_enabled to check the Autoneg bit in the advertising mask, and finally remove the .an_enabled member. The first two patches remove the last uses of .an_enabled, which are in DPAA2 and XPCS. The final patch removes the member. ==================== Link: https://lore.kernel.org/r/ZBnT6yW9UY1sAsiy@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: phylink: remove an_enabled	Russell King (Oracle)
	The Autoneg bit in the advertising bitmap and state->an_enabled are always identical. state->an_enabled is now no longer used by any drivers, so lets kill this duplication. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: pcs: xpcs: use Autoneg bit rather than an_enabled	Russell King (Oracle)
	The Autoneg bit in the advertising bitmap and state->an_enabled are always identical. Thus, we will be removing state->an_enabled. Use the Autoneg bit in the advertising bitmap to indicate whether autonegotiation should be used, rather than using the an_enabled member which will be going away. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	net: dpaa2-mac: use Autoneg bit rather than an_enabled	Russell King (Oracle)
	The Autoneg bit in the advertising bitmap and state->an_enabled are always identical. Thus, we will be removing state->an_enabled. Use the Autoneg bit in the advertising bitmap to indicate whether autonegotiation should be used, rather than using the an_enabled member which will be going away. This means we use the same condition as phylink_mii_c22_pcs_config(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22	Merge branch '40GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-03-21 (iavf, i40e) This series contains updates to iavf and i40e drivers. Stefan Assmann adds check, and return, if driver has already gone through remove to prevent hang for iavf. Radoslaw adds zero initialization to ensure Flow Director packets are populated with correct values for i40e. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: i40e: fix flow director packet filter programming iavf: fix hang on reboot with ice ==================== Link: https://lore.kernel.org/r/20230321183548.2849671-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>