summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-03-27ethtool: Add support for configuring tx_push_buf_lenShay Agroskin
This attribute, which is part of ethtool's ring param configuration allows the user to specify the maximum number of the packet's payload that can be written directly to the device. Example usage: # ethtool -G [interface] tx-push-buf-len [number of bytes] Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27netlink: Add a macro to set policy message with format stringShay Agroskin
Similar to NL_SET_ERR_MSG_FMT, add a macro which sets netlink policy error message with a format string. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-27net: introduce a config option to tweak MAX_SKB_FRAGSEric Dumazet
Currently, MAX_SKB_FRAGS value is 17. For standard tcp sendmsg() traffic, no big deal because tcp_sendmsg() attempts order-3 allocations, stuffing 32768 bytes per frag. But with zero copy, we use order-0 pages. For BIG TCP to show its full potential, we add a config option to be able to fit up to 45 segments per skb. This is also needed for BIG TCP rx zerocopy, as zerocopy currently does not support skbs with frag list. We have used MAX_SKB_FRAGS=45 value for years at Google before we deployed 4K MTU, with no adverse effect, other than a recent issue in mlx4, fixed in commit 26782aad00cc ("net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS") Back then, goal was to be able to receive full size (64KB) GRO packets without the frag_list overhead. Note that /proc/sys/net/core/max_skb_frags can also be used to limit the number of fragments TCP can use in tx packets. By default we keep the old/legacy value of 17 until we get more coverage for the updated values. Sizes of struct skb_shared_info on 64bit arches MAX_SKB_FRAGS | sizeof(struct skb_shared_info): ============================================== 17 320 21 320+64 = 384 25 320+128 = 448 29 320+192 = 512 33 320+256 = 576 37 320+320 = 640 41 320+384 = 704 45 320+448 = 768 This inflation might cause problems for drivers assuming they could pack both the incoming packet (for MTU=1500) and skb_shared_info in half a page, using build_skb(). v3: fix build error when CONFIG_NET=n v2: fix two build errors assuming MAX_SKB_FRAGS was "unsigned long" Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20230323162842.1935061-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 6e9d51b1a5cb ("net/mlx5e: Initialize link speed to zero") 1bffcea42926 ("net/mlx5e: Add devlink hairpin queues parameters") https://lore.kernel.org/all/20230324120623.4ebbc66f@canb.auug.org.au/ https://lore.kernel.org/all/20230321211135.47711-1-saeed@kernel.org/ Adjacent changes: drivers/net/phy/phy.c 323fe43cf9ae ("net: phy: Improved PHY error reporting in state machine") 4203d84032e2 ("net: phy: Ensure state transitions are processed from phy_stop()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-24Merge tag 'net-6.3-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, wifi and bluetooth. Current release - regressions: - wifi: mt76: mt7915: add back 160MHz channel width support for MT7915 - libbpf: revert poisoning of strlcpy, it broke uClibc-ng Current release - new code bugs: - bpf: improve the coverage of the "allow reads from uninit stack" feature to fix verification complexity problems - eth: am65-cpts: reset PPS genf adj settings on enable Previous releases - regressions: - wifi: mac80211: serialize ieee80211_handle_wake_tx_queue() - wifi: mt76: do not run mt76_unregister_device() on unregistered hw, fix null-deref - Bluetooth: btqcomsmd: fix command timeout after setting BD address - eth: igb: revert rtnl_lock() that causes a deadlock - dsa: mscc: ocelot: fix device specific statistics Previous releases - always broken: - xsk: add missing overflow check in xdp_umem_reg() - wifi: mac80211: - fix QoS on mesh interfaces - fix mesh path discovery based on unicast packets - Bluetooth: - ISO: fix timestamped HCI ISO data packet parsing - remove "Power-on" check from Mesh feature - usbnet: more fixes to drivers trusting packet length - wifi: iwlwifi: mvm: fix mvmtxq->stopped handling - Bluetooth: btintel: iterate only bluetooth device ACPI entries - eth: iavf: fix inverted Rx hash condition leading to disabled hash - eth: igc: fix the validation logic for taprio's gate list - dsa: tag_brcm: legacy: fix daisy-chained switches Misc: - bpf: adjust insufficient default bpf_jit_limit to account for growth of BPF use over the last 5 years - xdp: bpf_xdp_metadata() use EOPNOTSUPP as unique errno indicating no driver support" * tag 'net-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) Bluetooth: HCI: Fix global-out-of-bounds Bluetooth: mgmt: Fix MGMT add advmon with RSSI command Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work Bluetooth: L2CAP: Fix responding with wrong PDU type Bluetooth: btqcomsmd: Fix command timeout after setting BD address Bluetooth: btinel: Check ACPI handle for NULL before accessing net: mdio: thunder: Add missing fwnode_handle_put() net: dsa: mt7530: move setting ssc_delta to PHY_INTERFACE_MODE_TRGMII case net: dsa: mt7530: move lowering TRGMII driving to mt7530_setup() net: dsa: mt7530: move enabling disabling core clock to mt7530_pll_setup() net: asix: fix modprobe "sysfs: cannot create duplicate filename" gve: Cache link_speed value from device tools: ynl: Fix genlmsg header encoding formats net: enetc: fix aggregate RMON counters not showing the ranges Bluetooth: Remove "Power-on" check from Mesh feature Bluetooth: Fix race condition in hci_cmd_sync_clear Bluetooth: btintel: Iterate only bluetooth device ACPI entries Bluetooth: ISO: fix timestamped HCI ISO data packet parsing Bluetooth: btusb: Remove detection of ISO packets over bulk Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet ...
2023-03-23Merge branch 'main' of ↵Jakub Kicinski
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter updates for net-next This pull request contains changes for the *net-next* tree. 1. Change IPv6 stack to keep conntrack references until ipsec policy checks are done, like ipv4, from Madhu Koriginja. This update was missed when IPv6 NAT support was added 10 years ago. 2. get rid of old 'compat' structure layout in nf_nat_redirect core and move the conversion to the only user that needs the old layout for abi reasons. From Jeremy Sowden. 3. Compact some common code paths in nft_redir, also from Jeremy. 4. Time to remove the 'default y' knob so iptables 32bit compat interface isn't compiled in by default anymore, from myself. 5. Move ip(6)tables builtin icmp matches to the udptcp one. This has the advantage that icmp/icmpv6 match doesn't load the iptables/ip6tables modules anymore when iptables-nft is used. Also from myself. * 'main' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: keep conntrack reference until IPsecv6 policy checks are done xtables: move icmp/icmpv6 logic to xt_tcpudp netfilter: xtables: disable 32bit compat interface by default netfilter: nft_masq: deduplicate eval call-backs netfilter: nft_redir: use `struct nf_nat_range2` throughout and deduplicate eval call-backs ==================== Link: https://lore.kernel.org/r/20230322210802.6743-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-23docs: networking: document NAPIJakub Kicinski
Add basic documentation about NAPI. We can stop linking to the ancient doc on the LF wiki. Link: https://lore.kernel.org/all/20230315223044.471002-1-kuba@kernel.org/ Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Pavel Pisa <pisa@cmp.felk.cvut.cz> # for ctucanfd-driver.rst Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230322053848.198452-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22net: phylink: remove an_enabledRussell King (Oracle)
The Autoneg bit in the advertising bitmap and state->an_enabled are always identical. state->an_enabled is now no longer used by any drivers, so lets kill this duplication. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22netdev: Enforce index cap in netdev_get_tx_queueNick Child
When requesting a TX queue at a given index, warn on out-of-bounds referencing if the index is greater than the allocated number of queues. Specifically, since this function is used heavily in the networking stack use DEBUG_NET_WARN_ON_ONCE to avoid executing a new branch on every packet. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20230321150725.127229-2-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22Merge tag 'ipsec-libreswan-mlx5' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Leon Romanovsky says: ==================== Extend packet offload to fully support libreswan The following patches are an outcome of Raed's work to add packet offload support to libreswan [1]. The series includes: * Priority support to IPsec policies * Statistics per-SA (visible through "ip -s xfrm state ..." command) * Support to IKE policy holes * Fine tuning to acquire logic. [1] https://github.com/libreswan/libreswan/pull/986 Link: https://lore.kernel.org/all/cover.1678714336.git.leon@kernel.org * tag 'ipsec-libreswan-mlx5' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5e: Update IPsec per SA packets/bytes count net/mlx5e: Use one rule to count all IPsec Tx offloaded traffic net/mlx5e: Support IPsec acquire default SA net/mlx5e: Allow policies with reqid 0, to support IKE policy holes xfrm: copy_to_user_state fetch offloaded SA packets/bytes statistics xfrm: add new device offload acquire flag net/mlx5e: Use chains for IPsec policy priority offload net/mlx5: fs_core: Allow ignore_flow_level on TX dest net/mlx5: fs_chains: Refactor to detach chains from tc usage ==================== Link: https://lore.kernel.org/r/20230320094722.1009304-1-leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-22Bluetooth: btintel: Iterate only bluetooth device ACPI entriesKiran K
Current flow interates over entire ACPI table entries looking for Bluetooth Per Platform Antenna Gain(PPAG) entry. This patch iterates over ACPI entries relvant to Bluetooth device only. Fixes: c585a92b2f9c ("Bluetooth: btintel: Set Per Platform Antenna Gain(PPAG)") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22netfilter: nft_redir: use `struct nf_nat_range2` throughout and deduplicate ↵Jeremy Sowden
eval call-backs `nf_nat_redirect_ipv4` takes a `struct nf_nat_ipv4_multi_range_compat`, but converts it internally to a `struct nf_nat_range2`. Change the function to take the latter, factor out the code now shared with `nf_nat_redirect_ipv6`, move the conversion to the xt_REDIRECT module, and update the ipv4 range initialization in the nft_redir module. Replace a bare hex constant for 127.0.0.1 with a macro. Remove `WARN_ON`. `nf_nat_setup_info` calls `nf_ct_is_confirmed`: /* Can't setup nat info for confirmed ct. */ if (nf_ct_is_confirmed(ct)) return NF_ACCEPT; This means that `ct` cannot be null or the kernel will crash, and implies that `ctinfo` is `IP_CT_NEW` or `IP_CT_RELATED`. nft_redir has separate ipv4 and ipv6 call-backs which share much of their code, and an inet one switch containing a switch that calls one of the others based on the family of the packet. Merge the ipv4 and ipv6 ones into the inet one in order to get rid of the duplicate code. Const-qualify the `priv` pointer since we don't need to write through it. Assign `priv->flags` to the range instead of OR-ing it in. Set the `NF_NAT_RANGE_PROTO_SPECIFIED` flag once during init, rather than on every eval. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-03-21net: remove rcu_dereference_bh_rtnl()Eric Dumazet
This helper is no longer used in the tree. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21neighbour: switch to standard rcu, instead of rcu_bhEric Dumazet
rcu_bh is no longer a win, especially for objects freed with standard call_rcu(). Switch neighbour code to no longer disable BH when not necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-20net: pcs: add driver for MediaTek SGMII PCSDaniel Golle
The SGMII core found in several MediaTek SoCs is identical to what can also be found in MediaTek's MT7531 Ethernet switch IC. As this has not always been clear, both drivers developed different implementations to deal with the PCS. Recently Alexander Couzens pointed out this fact which lead to the development of this shared driver. Add a dedicated driver, mostly by copying the code now found in the Ethernet driver. The now redundant code will be removed by a follow-up commit. Suggested-by: Alexander Couzens <lynxis@fe80.eu> Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Tested-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-20Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client fixes from Anna Schumaker: - Fix /proc/PID/io read_bytes accounting - Fix setting NLM file_lock start and end during decoding testargs - Fix timing for setting access cache timestamps * tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Correct timing for assigning access cache timestamp lockd: set file_lock start and end when decoding nlm4 testargs NFS: Fix /proc/PID/io read_bytes for buffered reads
2023-03-20net: phy: smsc: export functions for use by meson-gxl PHY driverHeiner Kallweit
The Amlogic Meson internal PHY's have the same register layout as certain SMSC PHY's (also for non-c22-standard registers). This seems to be more than just coincidence. Apparently they also need the same workaround for EDPD mode (energy detect power down). Therefore let's export SMSC PHY driver functionality for use by the meson-gxl PHY driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Chris Healy <healych@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-20xfrm: add new device offload acquire flagRaed Salem
During XFRM acquire flow, a default SA is created to be updated later, once acquire netlink message is handled in user space. When the relevant policy is offloaded this default SA is also offloaded to IPsec offload supporting driver, however this SA does not have context suitable for offloading in HW, nor is interesting to offload to HW, consequently needs a special driver handling apart from other offloaded SA(s). Add a special flag that marks such SA so driver can handle it correctly. Signed-off-by: Raed Salem <raeds@nvidia.com> Link: https://lore.kernel.org/r/f5da0834d8c6b82ab9ba38bd4a0c55e71f0e3dab.1678714336.git.leon@kernel.org Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-20net: mscc: ocelot: expose serdes configuration functionColin Foster
During chip initialization, ports that use SGMII / QSGMII to interface to external phys need to be configured on the VSC7513 and VSC7514. Expose this configuration routine, so it can be used by DSA drivers. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-20net: mscc: ocelot: expose generic phylink_mac_config routineColin Foster
The ocelot-switch driver can utilize the phylink_mac_config routine. Move this to the ocelot library location and export the symbol to make this possible. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-20net: mscc: ocelot: expose ocelot_pll5_init routineColin Foster
Ocelot chips have an internal PLL that must be used when communicating through external phys. Expose the init routine, so it can be used by other drivers. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19Merge tag 'char-misc-6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few small char/misc/other driver subsystem patches to resolve reported problems for 6.3-rc3. Included in here are: - Interconnect driver fixes for reported problems - Memory driver fixes for reported problems - nvmem core fix - firmware driver fix for reported problem All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits) memory: tegra30-emc: fix interconnect registration race memory: tegra20-emc: fix interconnect registration race memory: tegra124-emc: fix interconnect registration race memory: tegra: fix interconnect registration race interconnect: exynos: drop redundant link destroy interconnect: exynos: fix registration race interconnect: exynos: fix node leak in probe PM QoS error path interconnect: qcom: msm8974: fix registration race interconnect: qcom: rpmh: fix registration race interconnect: qcom: rpmh: fix probe child-node error handling interconnect: qcom: rpm: fix registration race nvmem: core: return -ENOENT if nvmem cell is not found firmware: xilinx: don't make a sleepable memory allocation from an atomic context interconnect: qcom: rpm: fix probe child-node error handling interconnect: qcom: osm-l3: fix registration race interconnect: imx: fix registration race interconnect: fix provider registration API interconnect: fix icc_provider_del() error handling interconnect: fix mem leak when freeing nodes interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT ...
2023-03-19net: stmmac: Fix for mismatched host/device DMA address widthJochen Henneberg
Currently DMA address width is either read from a RO device register or force set from the platform data. This breaks DMA when the host DMA address width is <=32it but the device is >32bit. Right now the driver may decide to use a 2nd DMA descriptor for another buffer (happens in case of TSO xmit) assuming that 32bit addressing is used due to platform configuration but the device will still use both descriptor addresses as one address. This can be observed with the Intel EHL platform driver that sets 32bit for addr64 but the MAC reports 40bit. The TX queue gets stuck in case of TCP with iptables NAT configuration on TSO packets. The logic should be like this: Whatever we do on the host side (memory allocation GFP flags) should happen with the host DMA width, whenever we decide how to set addresses on the device registers we must use the device DMA address width. This patch renames the platform address width field from addr64 (term used in device datasheet) to host_addr and uses this value exclusively for host side operations while all chip operations consider the device DMA width as read from the device register. Fixes: 7cfc4486e7ea ("stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing") Signed-off-by: Jochen Henneberg <jh@henneberg-systemdesign.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: mdio: fix owner field for mdio buses registered using ACPIFlorian Fainelli
Bus ownership is wrong when using acpi_mdiobus_register() to register an mdio bus. That function is not inline, so when it calls mdiobus_register() the wrong THIS_MODULE value is captured. CC: Maxime Bizon <mbizon@freebox.fr> Fixes: 803ca24d2f92 ("net: mdio: Add ACPI support code for mdio") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-19net: mdio: fix owner field for mdio buses registered using device-treeMaxime Bizon
Bus ownership is wrong when using of_mdiobus_register() to register an mdio bus. That function is not inline, so when it calls mdiobus_register() the wrong THIS_MODULE value is captured. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs") [florian: fix kdoc, added Fixes tag] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18tcp: preserve const qualifier in tcp_sk()Eric Dumazet
We can change tcp_sk() to propagate its argument const qualifier, thanks to container_of_const(). We have two places where a const sock pointer has to be upgraded to a write one. We have been using const qualifier for lockless listeners to clearly identify points where writes could happen. Add tcp_sk_rw() helper to better document these. tcp_inbound_md5_hash(), __tcp_grow_window(), tcp_reset_check() and tcp_rack_reo_wnd() get an additional const qualififer for their @tp local variables. smc_check_reset_syn_req() also needs a similar change. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18x25: preserve const qualifier in [a]x25_sk()Eric Dumazet
We can change [a]x25_sk() to propagate their argument const qualifier, thanks to container_of_const(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18af_unix: preserve const qualifier in unix_sk()Eric Dumazet
We can change unix_sk() to propagate its argument const qualifier, thanks to container_of_const(). We need to change dump_common_audit_data() 'struct unix_sock *u' local var to get a const attribute. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18dccp: preserve const qualifier in dccp_sk()Eric Dumazet
We can change dccp_sk() to propagate its argument const qualifier, thanks to container_of_const(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18ipv6: raw: preserve const qualifier in raw6_sk()Eric Dumazet
We can change raw6_sk() to propagate its argument const qualifier, thanks to container_of_const(). Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18raw: preserve const qualifier in raw_sk()Eric Dumazet
We can change raw_sk() to propagate const qualifier of its argument, thanks to container_of_const() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-18udp: preserve const qualifier in udp_sk()Eric Dumazet
We can change udp_sk() to propagate const qualifier of its argument, thanks to container_of_const() This should avoid some potential errors caused by accidental (const -> not_const) promotion. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17net/mlx5e: TC, Add support for VxLAN GBP encap/decap flows offloadGavin Li
Add HW offloading support for TC flows with VxLAN GBP encap/decap. Example of encap rule: tc filter add dev eth0 protocol ip ingress flower \ action tunnel_key set id 42 vxlan_opts 512 \ action mirred egress redirect dev vxlan1 Example of decap rule: tc filter add dev vxlan1 protocol ip ingress flower \ enc_key_id 42 enc_dst_port 4789 vxlan_opts 1024 \ action tunnel_key unset action mirred egress redirect dev eth0 Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17ip_tunnel: Preserve pointer const in ip_tunnel_info_optsGavin Li
Change ip_tunnel_info_opts( ) from static function to macro to cast return value and preserve the const-ness of the pointer. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17vxlan: Expose helper vxlan_build_gbp_hdrGavin Li
The function vxlan_build_gbp_hdr will be used by other modules to build gbp option in vxlan header according to gbp flags. Signed-off-by: Gavin Li <gavinl@nvidia.com> Reviewed-by: Gavi Teitz <gavi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17wwan: core: Support slicing in port TX flow of WWAN subsystemhaozhe chang
wwan_port_fops_write inputs the SKB parameter to the TX callback of the WWAN device driver. However, the WWAN device (e.g., t7xx) may have an MTU less than the size of SKB, causing the TX buffer to be sliced and copied once more in the WWAN device driver. This patch implements the slicing in the WWAN subsystem and gives the WWAN devices driver the option to slice(by frag_len) or not. By doing so, the additional memory copy is reduced. Meanwhile, this patch gives WWAN devices driver the option to reserve headroom in fragments for the device-specific metadata. Signed-off-by: haozhe chang <haozhe.chang@mediatek.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/20230316095826.181904-1-haozhe.chang@mediatek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17ptp: kvm: Use decrypted memory in confidential guest on x86Jeremi Piotrowski
KVM_HC_CLOCK_PAIRING currently fails inside SEV-SNP guests because the guest passes an address to static data to the host. In confidential computing the host can't access arbitrary guest memory so handling the hypercall runs into an "rmpfault". To make the hypercall work, the guest needs to explicitly mark the memory as decrypted. Do that in kvm_arch_ptp_init(), but retain the previous behavior for non-confidential guests to save us from having to allocate memory. Add a new arch-specific function (kvm_arch_ptp_exit()) to free the allocation and mark the memory as encrypted again. Signed-off-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Link: https://lore.kernel.org/r/20230308150531.477741-1-jpiotrowski@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/wireless/nl80211.c b27f07c50a73 ("wifi: nl80211: fix puncturing bitmap policy") cbbaf2bb829b ("wifi: nl80211: add a command to enable/disable HW timestamping") https://lore.kernel.org/all/20230314105421.3608efae@canb.auug.org.au tools/testing/selftests/net/Makefile 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 13715acf8ab5 ("selftest: Add test for bind() conflicts.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-17Merge tag 'net-6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wifi and ipsec. A little more changes than usual, but it's pretty normal for us that the rc3/rc4 PRs are oversized as people start testing in earnest. Possibly an extra boost from people deploying the 6.1 LTS but that's more of an unscientific hunch. Current release - regressions: - phy: mscc: fix deadlock in phy_ethtool_{get,set}_wol() - virtio: vsock: don't use skbuff state to account credit - virtio: vsock: don't drop skbuff on copy failure - virtio_net: fix page_to_skb() miscalculating the memory size Current release - new code bugs: - eth: correct xdp_features after device reconfig - wifi: nl80211: fix the puncturing bitmap policy - net/mlx5e: flower: - fix raw counter initialization - fix missing error code - fix cloned flow attribute - ipa: - fix some register validity checks - fix a surprising number of bad offsets - kill FILT_ROUT_CACHE_CFG IPA register Previous releases - regressions: - tcp: fix bind() conflict check for dual-stack wildcard address - veth: fix use after free in XDP_REDIRECT when skb headroom is small - ipv4: fix incorrect table ID in IOCTL path - ipvlan: make skb->skb_iif track skb->dev for l3s mode - mptcp: - fix possible deadlock in subflow_error_report - fix UaFs when destroying unaccepted and listening sockets - dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290 Previous releases - always broken: - tcp: tcp_make_synack() can be called from process context, don't assume preemption is disabled when updating stats - netfilter: correct length for loading protocol registers - virtio_net: add checking sq is full inside xdp xmit - bonding: restore IFF_MASTER/SLAVE flags on bond enslave Ethertype change - phy: nxp-c45-tja11xx: fix MII_BASIC_CONFIG_REV bit number - eth: i40e: fix crash during reboot when adapter is in recovery mode - eth: ice: avoid deadlock on rtnl lock when auxiliary device plug/unplug meets bonding - dsa: mt7530: - remove now incorrect comment regarding port 5 - set PLL frequency and trgmii only when trgmii is used - eth: mtk_eth_soc: reset PCS state when changing interface types Misc: - ynl: another license adjustment - move the TCA_EXT_WARN_MSG attribute for tc action" * tag 'net-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits) selftests: bonding: add tests for ether type changes bonding: restore bond's IFF_SLAVE flag if a non-eth dev enslave fails bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change net: renesas: rswitch: Fix GWTSDIE register handling net: renesas: rswitch: Fix the output value of quote from rswitch_rx() ethernet: sun: add check for the mdesc_grab() net: ipa: fix some register validity checks net: ipa: kill FILT_ROUT_CACHE_CFG IPA register net: ipa: add two missing declarations net: ipa: reg: include <linux/bug.h> net: xdp: don't call notifiers during driver init net/sched: act_api: add specific EXT_WARN_MSG for tc action Revert "net/sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy" net: dsa: microchip: fix RGMII delay configuration on KSZ8765/KSZ8794/KSZ8795 ynl: make the tooling check the license ynl: broaden the license even more tools: ynl: make definitions optional again hsr: ratelimit only when errors are printed qed/qed_mng_tlv: correctly zero out ->min instead of ->hour selftests: net: devlink_port_split.py: skip test if no suitable device available ...
2023-03-17Merge tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "A bit bigger than usual, as the NVMe pull request missed last weeks submission. In detail: - NVMe pull request via Christoph: - Avoid potential UAF in nvmet_req_complete (Damien Le Moal) - More quirks (Elmer Miroslav Mosher Golovin, Philipp Geulen) - Fix a memory leak in the nvme-pci probe teardown path (Irvin Cote) - Repair the MAINTAINERS entry (Lukas Bulwahn) - Fix handling single range discard request (Ming Lei) - Show more opcode names in trace events (Minwoo Im) - Fix nvme-tcp timeout reporting (Sagi Grimberg) - MD pull request via Song: - Two fixes for old issues (Neil) - Resource leak in device stopping (Xiao) - Bio based device stats fix (Yu) - Kill unused CONFIG_BLOCK_COMPAT (Lukas) - sunvdc missing mdesc_grab() failure check (Liang) - Fix for reversal of request ordering upon issue for certain cases (Jan) - null_blk timeout fixes (Damien) - Loop use-after-free fix (Bart) - blk-mq SRCU fix for BLK_MQ_F_BLOCKING devices (Chris)" * tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linux: block: remove obsolete config BLOCK_COMPAT md: select BLOCK_LEGACY_AUTOLOAD block: count 'ios' and 'sectors' when io is done for bio-based device block: sunvdc: add check for mdesc_grab() returning NULL nvmet: avoid potential UAF in nvmet_req_complete() nvme-trace: show more opcode names nvme-tcp: add nvme-tcp pdu size build protection nvme-tcp: fix opcode reporting in the timeout handler nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620 nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000 nvme-pci: fixing memory leak in probe teardown path nvme: fix handling single range discard request MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS block: null_blk: cleanup null_queue_rq() block: null_blk: Fix handling of fake timeout request blk-mq: fix "bad unlock balance detected" on q->srcu in __blk_mq_run_dispatch_ops loop: Fix use-after-free issues block: do not reverse request order when flushing plug list md: avoid signed overflow in slot_store() md: Free resources in __md_stop
2023-03-17Merge tag 'acpi-6.3-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These add some new quirks, fix PPTT handling, fix an ACPI utility and correct a mistake in the ACPI documentation. Specifics: - Fix ACPI PPTT handling to avoid sleep in the atomic context when it is not present (Sudeep Holla) - Add 'backlight=native' DMI quirk for Dell Vostro 15 3535 to the ACPI video driver (Chia-Lin Kao) - Add ACPI quirks for I2C device enumeration on Lenovo Yoga Book X90 and Acer Iconia One 7 B1-750 (Hans de Goede) - Fix handling of invalid command line option values in the ACPI pfrut utility (Chen Yu) - Fix references to I2C device data type in the ACPI documentation for device enumeration (Andy Shevchenko)" * tag 'acpi-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range ACPI: PPTT: Fix to avoid sleep in the atomic context when PPTT is absent ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90 ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750 ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535 ACPI: docs: enumeration: Correct reference to the I²C device data type
2023-03-17Merge tag 'for-linus-6.3-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - cleanup for xen time handling - enable the VGA console in a Xen PVH dom0 - cleanup in the xenfs driver * tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: remove unnecessary (void*) conversions x86/PVH: obtain VGA console info in Dom0 x86/xen/time: cleanup xen_tsc_safe_clocksource xen: update arch/x86/include/asm/xen/cpuid.h
2023-03-17Merge tag 's390-6.3-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Update defconfigs - Fix early boot code by adding missing intersection check to prevent potential overwriting of the ipl report - Fix a use-after-free issue in s390-specific code related to PCI resources being retained after hot-unplugging individual functions, by removing the resources from the PCI bus's resource list and using the zpci_bar_struct's resource pointer directly * tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: update defconfigs PCI: s390: Fix use-after-free of PCI resources with per-function hotplug s390/ipl: add missing intersection check to ipl_report handling
2023-03-17Merge tag 'drm-fixes-2023-03-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Seems like a pretty regular rc3, i915 and amdgpu with the usual selection of fixes, then a scattering of fixes across misc drivers and other areas: accel: - build fix for accel edid: - fix info leak in edid ttm: - fix NULL ptr deref - reference counting fix i915: - Fix hwmon PL1 power limit enabling - Fix audio ELD handling for DP MST - Fix PSR io and wake line calculations - Fix DG2 HDMI modes with 267.30 and 319.89 MHz pixel clocks - Fix SSEU subslice out-of-bounds access - Fix misuse of non-idle barriers as fence trackers amdgpu: - SMU 13 update - RDNA2 suspend/resume fix when overclocking is enabled - SRIOV VCN fixes - HDCP suspend/resume fix - Fix drm polling splat regression - Fix dirty rectangle tracking for PSR - Fix vangogh regression on certain BIOSes - Misc display fixes - Suspend/resume IOMMU regression fix amdkfd: - Fix BO offset for multi-VMA page migration - Fix a possible double free - Fix potential use after free - Fix process cleanup on module exit bridge: - fix returned array size name documentation fbdev: - ref-counting fix for fbdev deferred I/O virtio: - dma sync fix shmem-helper: - error path fix msm: - shrinker blocking fix panfrost: - shrinker rpm fix chipsfb: - fix error code meson: - fix 1px pink line - fix regulator interaction sun4i: - fix missing component unbind" * tag 'drm-fixes-2023-03-17' of git://anongit.freedesktop.org/drm/drm: (38 commits) drm/ttm: drop extra ttm_bo_put in ttm_bo_cleanup_refs drm/amdgpu: Don't resume IOMMU after incomplete init drm/amdkfd: Fixed kfd_process cleanup on module exit. drm/amd/display: disconnect MPCC only on OTG change drm/amd/display: Fix DP MST sinks removal issue drm/amd/display: Do not set DRR on pipe Commit drm/amd/display: Remove OTG DIV register write for Virtual signals. drm/meson: dw-hdmi: Fix devm_regulator_*get_enable*() conversion again drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes drm/amdgpu/nv: fix codec array for SR_IOV drm/amd/display: Write to correct dirty_rect drm/amdgpu: move poll enabled/disable into non DC path drm/amd/display: Fix HDCP failing to enable after suspend drm/amdkfd: fix potential kgd_mem UAFs drm/amdgpu/vcn: custom video info caps for sriov drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume drm/amd/pm: bump SMU 13.0.4 driver_if header version drm/amdkfd: fix a potential double free in pqm_create_queue drm/amdkfd: Get prange->offset after svm_range_vram_node_new ...
2023-03-17Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Ten patches, eight in drivers and two in the core, which correct a regression from directory removal and add a no VPD size quirk also to fix a regression. All pretty small" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: mcq: Use active_reqs to check busy in clock scaling scsi: core: Fix a procfs host directory removal regression scsi: core: Add BLIST_NO_VPD_SIZE for some VDASD scsi: mpi3mr: Fix expander node leak in mpi3mr_remove() scsi: mpi3mr: Fix memory leaks in mpi3mr_init_ioc() scsi: mpi3mr: Fix sas_hba.phy memory leak in mpi3mr_remove() scsi: mpi3mr: Fix mpi3mr_hba_port memory leak in mpi3mr_remove() scsi: mpi3mr: Fix config page DMA memory leak scsi: mpi3mr: Fix throttle_groups memory leak scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
2023-03-17Merge branches 'acpi-video', 'acpi-x86', 'acpi-tools' and 'acpi-docs'Rafael J. Wysocki
Merge a new ACPI backlight quirk, new ACPI quirks for I2C device enumeration on some platforms, a pfrut utility fix and an ACPI documentation fix for 6.3-rc3: - Add backlight=native DMI quirk for Dell Vostro 15 3535 to the ACPI video driver (Chia-Lin Kao). - Add ACPI quirks for I2C devices enumeration on Lenovo Yoga Book X90 and Acer Iconia One 7 B1-750 (Hans de Goede). - Fix handling of invalid command line option values in the ACPI pfrut utility (Chen Yu). - Fix references to I2C device data type in the ACPI documentation for device enumeration (Andy Shevchenko). * acpi-video: ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535 * acpi-x86: ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90 ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750 ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper * acpi-tools: ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range * acpi-docs: ACPI: docs: enumeration: Correct reference to the I²C device data type
2023-03-17ipv4: raw: constify raw_v4_match() socket argumentEric Dumazet
This clarifies raw_v4_match() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv6: raw: constify raw_v6_match() socket argumentEric Dumazet
This clarifies raw_v6_match() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv6: constify inet6_mc_check()Eric Dumazet
inet6_mc_check() is essentially a read-only function. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17ipv4: constify ip_mc_sf_allow() socket argumentEric Dumazet
This clarifies ip_mc_sf_allow() intent. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>