summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-05-28mptcp: fix sk_forward_memory corruption on retransmissionPaolo Abeni
MPTCP sk_forward_memory handling is a bit special, as such field is protected by the msk socket spin_lock, instead of the plain socket lock. Currently we have a code path updating such field without handling the relevant lock: __mptcp_retrans() -> __mptcp_clean_una_wakeup() Several helpers in __mptcp_clean_una_wakeup() will update sk_forward_alloc, possibly causing such field corruption, as reported by Matthieu. Address the issue providing and using a new variant of blamed function which explicitly acquires the msk spin lock. Fixes: 64b9cea7a0af ("mptcp: fix spurious retransmissions") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/172 Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28netfilter: add and use nft_set_do_lookup helperFlorian Westphal
Followup patch will add a CONFIG_RETPOLINE wrapper to avoid the ops->lookup() indirection cost for retpoline builds. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR stateStefano Brivio
We don't need a valid MXCSR state for the lookup routines, none of the instructions we use rely on or affect any bit in the MXCSR register. Instead of calling kernel_fpu_begin(), we can pass 0 as mask to kernel_fpu_begin_mask() and spare one LDMXCSR instruction. Commit 49200d17d27d ("x86/fpu/64: Don't FNINIT in kernel_fpu_begin()") already speeds up lookups considerably, and by dropping the MCXSR initialisation we can now get a much smaller, but measurable, increase in matching rates. The table below reports matching rates and a wild approximation of clock cycles needed for a match in a "port,net" test with 10 entries from selftests/netfilter/nft_concat_range.sh, limited to the first field, i.e. the port (with nft_set_rbtree initialisation skipped), run on a single AMD Epyc 7351 thread (2.9GHz, 512 KiB L1D$, 8 MiB L2$). The (very rough) estimation of clock cycles is obtained by simply dividing frequency by matching rate. The "cycles spared" column refers to the difference in cycles compared to the previous row, and the rate increase also refers to the previous row. Results are averages of six runs. Merely for context, I'm also reporting packet rates obtained by skipping kernel_fpu_begin() and kernel_fpu_end() altogether (which shows a very limited impact now), as well as skipping the whole lookup function, compared to simply counting and dropping all packets using the netdev hook drop (see nft_concat_range.sh for details). This workload also includes packet generation with pktgen and the receive path of veth. |matching| est. | cycles | rate | | rate | cycles | spared |increase| | (Mpps) | | | | --------------------------------------|--------|--------|--------|--------| FNINIT, LDMXCSR (before 49200d17d27d) | 5.245 | 553 | - | - | LDMXCSR only (with 49200d17d27d) | 6.347 | 457 | 96 | 21.0% | Without LDMXCSR (this patch) | 6.461 | 449 | 8 | 1.8% | -------- for reference only: ---------|--------|--------|--------|--------| Without kernel_fpu_begin() | 6.513 | 445 | 4 | 0.8% | Without actual matching (return true) | 7.649 | 379 | 66 | 17.4% | Without lookup operation (netdev drop)| 10.320 | 281 | 98 | 34.9% | The clock cycles spared by avoiding LDMXCSR appear to be in line with CPI and latency indicated in the manuals of comparable architectures: Intel Skylake (CPI: 1, latency: 7) and AMD 12h (latency: 12) -- I couldn't find this information for AMD 17h. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28netfilter: nft_exthdr: Support SCTP chunksPhil Sutter
Chunks are SCTP header extensions similar in implementation to IPv6 extension headers or TCP options. Reusing exthdr expression to find and extract field values from them is therefore pretty straightforward. For now, this supports extracting data from chunks at a fixed offset (and length) only - chunks themselves are an extensible data structure; in order to make all fields available, a nested extension search is needed. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-27Merge tag 'mlx5-updates-2021-05-26' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-05-26 Misc update for mlx5 driver, 1) Clean up patches for lag and SF 2) Reserve bit 31 in steering register C1 for IPSec offload usage 3) Move steering tables pool logic into the steering core and increase the maximum table size to 2G entries when software steering is enabled. * tag 'mlx5-updates-2021-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Fix lag port remapping logic net/mlx5: Use boolean arithmetic to evaluate roce_lag net/mlx5: Remove unnecessary spin lock protection net/mlx5: Cap the maximum flow group size to 16M entries net/mlx5: DR, Set max table size to 2G entries net/mlx5: Move chains ft pool to be used by all firmware steering net/mlx5: Move table size calculation to steering cmd layer net/mlx5: Add case for FS_FT_NIC_TX FT in MLX5_CAP_FLOWTABLE_TYPE net/mlx5: DR, Remove unused field of send_ring struct net/mlx5e: RX, Remove unnecessary check in RX CQE compression handling net/mlx5e: IPsec/rep_tc: Fix rep_tc_update_skb drops IPsec packet net/mlx5e: TC: Reserved bit 31 of REG_C1 for IPsec offload net/mlx5e: TC: Use bit counts for register mapping net/mlx5: CT: Avoid reusing modify header context for natted entries net/mlx5e: CT, Remove newline from ct_dbg call ==================== Link: https://lore.kernel.org/r/20210527185624.694304-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27ixgbe: Fix out-bounds warning in ixgbe_host_interface_command()Gustavo A. R. Silva
Replace union with a couple of pointers in order to fix the following out-of-bounds warning: CC [M] drivers/net/ethernet/intel/ixgbe/ixgbe_common.o drivers/net/ethernet/intel/ixgbe/ixgbe_common.c: In function ‘ixgbe_host_interface_command’: drivers/net/ethernet/intel/ixgbe/ixgbe_common.c:3729:13: warning: array subscript 1 is above array bounds of ‘u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 3729 | bp->u32arr[bi] = IXGBE_READ_REG_ARRAY(hw, IXGBE_FLEX_MNG, bi); | ~~~~~~~~~~^~~~ drivers/net/ethernet/intel/ixgbe/ixgbe_common.c:3682:7: note: while referencing ‘u32arr’ 3682 | u32 u32arr[1]; | ^~~~~~ This helps with the ongoing efforts to globally enable -Warray-bounds. Link: https://github.com/KSPP/linux/issues/109 Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20210527173424.362456-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27Merge branch 'add-4-rx-tx-queue-support-for-mikrotik-10-25g-nic'Jakub Kicinski
Gatis Peisenieks says: ==================== add 4 RX/TX queue support for Mikrotik 10/25G NIC More RX/TX queues on a network card help spread the CPU load among cores and achieve higher overall networking performance. This patch set adds support for 4 RX/TX queues available on Mikrotik 10/25G NIC. v4: - addressed comments from Jakub Kicinski: - split up the change in more manageable chunks - changed member order in structs for tighter packing - fixed style issues - reverted to calling napi_alloc_skb only from within poll as before v3: - fix kernel-doc complaints on comments as pointed out by David Miller v2: - rebase on net-next master as requested by David Miller ==================== Link: https://lore.kernel.org/r/20210527144423.3395719-1-gatis@mikrotik.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27atl1c: add 4 RX/TX queue support for Mikrotik 10/25G NICGatis Peisenieks
More RX/TX queues on a network card help spread the CPU load among cores and achieve higher overall networking performance. The new Mikrotik 10/25G NIC supports 4 RX and 4 TX queues. TX queues are treated with equal priority. RX queue balancing is fixed based on L2/L3/L4 hash. This adds support for 4 RX/TX queues while maintaining backwards compatibility with older hardware. Simultaneous TX + RX performance on AMD Threadripper 3960X with Mikrotik 10/25G NIC improved from 1.6Mpps to 3.2Mpps per port. Backwards compatiblitiy was verified with AR8151 and AR8131 based NICs. Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27atl1c: prepare for multiple rx queuesGatis Peisenieks
Move napi and other per queue members into per rx queue struct. Allocate max rx queues that any hw supported by the driver might have. Patch that actually enables multiple rx queues will follow. Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27atl1c: move tx napi into tpd_ringGatis Peisenieks
To get more performance from using multiple tx queues one needs a per tx queue napi. Move tx napi from per adapter struct into per tx queue struct. Patch that actually enables multiple tx queues will follow. Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27atl1c: detect NIC type earlyGatis Peisenieks
To support NICs that allow for more than one tx queue it is required to detect NIC type early during probe. This is moves NIC type detection before netdev_alloc to prepare for that. Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Fix incorrect sockopts unregistration from error path, from Florian Westphal. 2) A few patches to provide better error reporting when missing kernel netfilter options are missing in .config. 3) Fix dormant table flag updates. 4) Memleak in IPVS when adding service with IP_VS_SVC_F_HASHED flag. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service netfilter: nf_tables: fix table flag updates netfilter: nf_tables: extended netlink error reporting for chain type netfilter: nf_tables: missing error reporting for not selected expressions netfilter: conntrack: unregister ipv4 sockopts on error unwind ==================== Link: https://lore.kernel.org/r/20210527190115.98503-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27net/sched: act_ct: Fix ct template allocation for zone 0Ariel Levkovich
Fix current behavior of skipping template allocation in case the ct action is in zone 0. Skipping the allocation may cause the datapath ct code to ignore the entire ct action with all its attributes (commit, nat) in case the ct action in zone 0 was preceded by a ct clear action. The ct clear action sets the ct_state to untracked and resets the skb->_nfct pointer. Under these conditions and without an allocated ct template, the skb->_nfct pointer will remain NULL which will cause the tc ct action handler to exit without handling commit and nat actions, if such exist. For example, the following rule in OVS dp: recirc_id(0x2),ct_state(+new-est-rel-rpl+trk),ct_label(0/0x1), \ in_port(eth0),actions:ct_clear,ct(commit,nat(src=10.11.0.12)), \ recirc(0x37a) Will result in act_ct skipping the commit and nat actions in zone 0. The change removes the skipping of template allocation for zone 0 and treats it the same as any other zone. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/20210526170110.54864-1-lariel@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27net/sched: act_ct: Offload connections with commit actionPaul Blakey
Currently established connections are not offloaded if the filter has a "ct commit" action. This behavior will not offload connections of the following scenario: $ tc_filter add dev $DEV ingress protocol ip prio 1 flower \ ct_state -trk \ action ct commit action goto chain 1 $ tc_filter add dev $DEV ingress protocol ip chain 1 prio 1 flower \ action mirred egress redirect dev $DEV2 $ tc_filter add dev $DEV2 ingress protocol ip prio 1 flower \ action ct commit action goto chain 1 $ tc_filter add dev $DEV2 ingress protocol ip prio 1 chain 1 flower \ ct_state +trk+est \ action mirred egress redirect dev $DEV Offload established connections, regardless of the commit flag. Fixes: 46475bb20f4b ("net/sched: act_ct: Software offload of established flows") Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Link: https://lore.kernel.org/r/1622029449-27060-1-git-send-email-paulb@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27Merge branch 'mlx-devlink-dev-info-versions-adjustments'Jakub Kicinski
Jiri Pirko says: ==================== mlx*: devlink dev info versions adjustments Couple of adjustments in Mellanox drivers regarding devlink dev versions fill-up. ==================== Link: https://lore.kernel.org/r/20210526104509.761807-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27mlxsw: core: use PSID string define in devlink infoJiri Pirko
Instead of having the string spelled out in the driver, use the global define with the same value. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27mlxsw: core: Expose FW version over defined keywordJiri Pirko
To be aligned with the rest of the drivers, expose FW version under "fw" keyword in devlink dev info, in addition to the existing "fw.version", which is currently Mellanox-specific. devlink output before: running: fw.version 30.2008.2018 after: running: fw.version 30.2008.2018 fw 30.2008.2018 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27net/mlx5: Expose FW version over defined keywordJiri Pirko
To be aligned with the rest of the drivers, expose FW version under "fw" keyword in devlink dev info, in addition to the existing "fw.version", which is currently Mellanox-specific. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27selftests: devlink_lib: add check for devlink device existenceJiri Pirko
If user passes devlink handle over DEVLINK_DEV variable, check if the device exists. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20210527105515.790330-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27devlink: Correct VIRTUAL port to not have phys_port attributesParav Pandit
Physical port name, port number attributes do not belong to virtual port flavour. When VF or SF virtual ports are registered they incorrectly append "np0" string in the netdevice name of the VF/SF. Before this fix, VF netdevice name were ens2f0np0v0, ens2f0np0v1 for VF 0 and 1 respectively. After the fix, they are ens2f0v0, ens2f0v1. With this fix, reading /sys/class/net/ens2f0v0/phys_port_name returns -EOPNOTSUPP. Also devlink port show example for 2 VFs on one PF to ensure that any physical port attributes are not exposed. $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false pci/0000:06:00.3/196608: type eth netdev ens2f0v0 flavour virtual splittable false pci/0000:06:00.4/262144: type eth netdev ens2f0v1 flavour virtual splittable false This change introduces a netdevice name change on systemd/udev version 245 and higher which honors phys_port_name sysfs file for generation of netdevice name. This also aligns to phys_port_name usage which is limited to switchdev ports as described in [1]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/Documentation/networking/switchdev.rst Fixes: acf1ee44ca5d ("devlink: Introduce devlink port flavour virtual") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20210526200027.14008-1-parav@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27Merge tag 'linux-can-next-for-5.14-20210527' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== can-next 2021-05-27 The first 2 patches are by Geert Uytterhoeven and convert the rcan_can and rcan_canfd device tree bindings to yaml. The next 2 patches are by Oliver Hartkopp and me and update the CAN uapi headers. zuoqilin's patch removes an unnecessary variable from the CAN proc code. Patrick Menschel contributes 3 patches for CAN ISOTP to enhance the error messages. Jiapeng Chong's patch removes two dead stores from the softing driver. The next 4 patches are by me and silence several warnings found by clang compiler. Jimmy Assarsson's patches for the kvaser_usb driver add support for the Kvaser hydra devices. Dario Binacchi provides 2 patches for the c_can driver, first removing an unused variable, then adding basic ethtool support to query driver and ring parameter info. The last 4 patches are by Torin Cooper-Bennun and clean up the m_can driver. * tag 'linux-can-next-for-5.14-20210527' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (21 commits) can: m_can: fix whitespace in a few comments can: m_can: make TXESC, RXESC config more explicit can: m_can: clean up CCCR reg defs, order by revs can: m_can: use bits.h macros for all regmasks can: c_can: add ethtool support can: c_can: remove unused variable struct c_can_priv::rxmasked can: kvaser_usb: Add new Kvaser hydra devices can: kvaser_usb: Rename define USB_HYBRID_{,PRO_}CANLIN_PRODUCT_ID can: at91_can: silence clang warning can: mcp251xfd: silence clang warning can: mcp251x: mcp251x_can_probe(): silence clang warning can: hi311x: hi3110_can_probe(): silence clang warning can: softing: Remove redundant variable ptr can: isotp: Add error message if txqueuelen is too small can: isotp: add symbolic error message to isotp_module_init() can: isotp: change error format from decimal to symbolic error names can: proc: remove unnecessary variables can: uapi: introduce CANFD_FDF flag for mixed content in struct canfd_frame can: uapi: update CAN-FD frame description dt-bindings: can: rcar_canfd: Convert to json-schema ... ==================== Link: https://lore.kernel.org/r/20210527084532.1384031-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27devlink: append split port number to the port nameJiri Pirko
Instead of doing sprintf twice in case the port is split or not, append the split port suffix in case the port is split. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20210527104819.789840-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27net/mlx5: Fix lag port remapping logicEli Cohen
Fix the logic so that if both ports netdevices are enabled or disabled, use the trivial mapping without swapping. If only one of the netdevice's tx is enabled, use it to remap traffic to that port. Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Use boolean arithmetic to evaluate roce_lagEli Cohen
Avoid mixing boolean and bit arithmetic when evaluating validity of roce_lag. Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Remove unnecessary spin lock protectionEli Cohen
Taking lag_lock to access ldev->tracker is meaningless in the context of do_bond() and mlx5_lag_netdev_event(). Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Cap the maximum flow group size to 16M entriesPaul Blakey
The maximum number of large flow groups applies to both small and large tables. For very large tables (such as the 2G SW steering tables) this may create a small number of flow groups each with an unrealistic entries domain (> 16M). Set the maximum number of large flow groups to at least what user requested, but with a maximum per group size of 16M entries. For software steering, if user requested less than 128 large flow groups, it will gives us about 128 16M groups in a 2G entries tables. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: DR, Set max table size to 2G entriesPaul Blakey
SW steering has no table size limitations. However, fs_core API is size aware. Set SW steering tables to the maximum possible table size (INT_MAX). Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Move chains ft pool to be used by all firmware steeringPaul Blakey
Firmware FT pool is per device, but the software tracking of this pool only services fs_chains users, and if another layer takes a flow table, the pool will not be updated, and fs_chains will fail creating a flow table, with no recovery till the flow table is returned. Move FT pool to be global per device, and stored at the cmd level, so all layers can use it. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Move table size calculation to steering cmd layerPaul Blakey
Currently the table size is calculated by the fs_core layer. However, each steering cmd instance has a different allocation logic. FW steering uses a predefined pools of multiple sizes. SW steering doesn't have a pool, and can allocate any size of tables. Move the table size calculation to the steering cmd layer as a pre-step for moving fs_chains pool logic globally to firmware steering, and increasing table sizes for software steering. In addition, change the size parameter to absolute size to allow the special zero value to mean "get next available maximum size". Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: Add case for FS_FT_NIC_TX FT in MLX5_CAP_FLOWTABLE_TYPEPaul Blakey
Commit 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters") added MLX5_CAP_FLOWTABLE_TYPE but forgot to account for FS_FT_NIC_TX case in the expression. Although the expression will return 1 for this case instead of the actual cap, there isn't currently no known side affects of missing this case. Add the FS_FT_NIC_TX case. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: DR, Remove unused field of send_ring structYevgeny Kliteynik
Remove unused field of struct mlx5dr_send_ring Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5e: RX, Remove unnecessary check in RX CQE compression handlingTariq Toukan
There are two reasons for exiting mlx5e_decompress_cqes_cont(): 1. The compression session is completed (cqd.left == 0). 2. The budget is exhausted (work_done == budget). If after calling mlx5e_decompress_cqes_cont() we have cqd.left > 0, it necessarily implies that budget is exhausted. The first part of the complex condition is covered by the second, hence we remove it here. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5e: IPsec/rep_tc: Fix rep_tc_update_skb drops IPsec packetHuy Nguyen
rep_tc copy REG_C1 to REG_B. IPsec crypto utilizes the whole REG_B register with BIT31 as IPsec marker. rep_tc_update_skb drops IPsec because it thought REG_B contains bad value. In previous patch, BIT 31 of REG_C1 is reserved for IPsec. Skip the rep_tc_update_skb if BIT31 of REG_B is set. Signed-off-by: Huy Nguyen <huyn@nvidia.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5e: TC: Reserved bit 31 of REG_C1 for IPsec offloadHuy Nguyen
Currently ASAP features fully utilize all the bits of the CQE's flow tag and ft_metadata field. The flow tag field cannot be used because the flow table tagging in FTE does not allow partial write. We agree to reserve bit 31 of CQE's ft_metadata for IPsec to avoid ASAP CT from dropping IPsec offloaded packet Here is the new bit layout of REG_C1. Tunnel option id is reduced to 11 bits: < IPSEC MARKER (1) | ESW_TUN_ID(12) | ESW_TUN_OPTS(11) | ESW_ZONE_ID(8) > Signed-off-by: Huy Nguyen <huyn@nvidia.com> Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com>
2021-05-27net/mlx5e: TC: Use bit counts for register mappingPaul Blakey
To prepare for next patch where we will use a non-byte aligned mapping, change all byte counts in register mapping to bits. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5: CT: Avoid reusing modify header context for natted entriesPaul Blakey
Currently the driver is designed to reuse header modify context entries. Natted entries will always have a unique modify header, as such the modify header hashtable lookup is introducing an overhead. When the hashtable size exceeded 200k entries the tested insertion rate dropped from ~10k entries/sec to ~300 entries/sec. Don't use the re-use mechanism when creating modify headers for natted tuples. Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27net/mlx5e: CT, Remove newline from ct_dbg callRoi Dayan
ct_dbg() already adds a newline. Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-05-27kbuild: Quote OBJCOPY var to avoid a pahole call break the buildJavier Martinez Canillas
The ccache tool can be used to speed up cross-compilation, by calling the compiler and binutils through ccache. For example, following should work: $ export ARCH=arm64 CROSS_COMPILE="ccache aarch64-linux-gnu-" $ make M=drivers/gpu/drm/rockchip/ but pahole fails to extract the BTF info from DWARF, breaking the build: CC [M] drivers/gpu/drm/rockchip//rockchipdrm.mod.o LD [M] drivers/gpu/drm/rockchip//rockchipdrm.ko BTF [M] drivers/gpu/drm/rockchip//rockchipdrm.ko aarch64-linux-gnu-objcopy: invalid option -- 'J' Usage: aarch64-linux-gnu-objcopy [option(s)] in-file [out-file] Copies a binary file, possibly transforming it in the process ... make[1]: *** [scripts/Makefile.modpost:156: __modpost] Error 2 make: *** [Makefile:1866: modules] Error 2 this fails because OBJCOPY is set to "ccache aarch64-linux-gnu-copy" and later pahole is executed with the following command line: LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@ which gets expanded to: LLVM_OBJCOPY=ccache aarch64-linux-gnu-objcopy pahole -J ... instead of: LLVM_OBJCOPY="ccache aarch64-linux-gnu-objcopy" pahole -J ... Fixes: 5f9ae91f7c0d ("kbuild: Build kernel module BTFs if BTF is enabled and pahole supports it") Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/bpf/20210526215228.3729875-1-javierm@redhat.com
2021-05-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27Bluetooth: fix the erroneous flush_work() orderLin Ma
In the cleanup routine for failed initialization of HCI device, the flush_work(&hdev->rx_work) need to be finished before the flush_work(&hdev->cmd_work). Otherwise, the hci_rx_work() can possibly invoke new cmd_work and cause a bug, like double free, in late processings. This was assigned CVE-2021-3564. This patch reorder the flush_work() to fix this bug. Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: linux-bluetooth@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: Hao Xiong <mart1n@zju.edu.cn> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-05-27ipvs: ignore IP_VS_SVC_F_HASHED flag when adding serviceJulian Anastasov
syzbot reported memory leak [1] when adding service with HASHED flag. We should ignore this flag both from sockopt and netlink provided data, otherwise the service is not hashed and not visible while releasing resources. [1] BUG: memory leak unreferenced object 0xffff888115227800 (size 512): comm "syz-executor263", pid 8658, jiffies 4294951882 (age 12.560s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff83977188>] kmalloc include/linux/slab.h:556 [inline] [<ffffffff83977188>] kzalloc include/linux/slab.h:686 [inline] [<ffffffff83977188>] ip_vs_add_service+0x598/0x7c0 net/netfilter/ipvs/ip_vs_ctl.c:1343 [<ffffffff8397d770>] do_ip_vs_set_ctl+0x810/0xa40 net/netfilter/ipvs/ip_vs_ctl.c:2570 [<ffffffff838449a8>] nf_setsockopt+0x68/0xa0 net/netfilter/nf_sockopt.c:101 [<ffffffff839ae4e9>] ip_setsockopt+0x259/0x1ff0 net/ipv4/ip_sockglue.c:1435 [<ffffffff839fa03c>] raw_setsockopt+0x18c/0x1b0 net/ipv4/raw.c:857 [<ffffffff83691f20>] __sys_setsockopt+0x1b0/0x360 net/socket.c:2117 [<ffffffff836920f2>] __do_sys_setsockopt net/socket.c:2128 [inline] [<ffffffff836920f2>] __se_sys_setsockopt net/socket.c:2125 [inline] [<ffffffff836920f2>] __x64_sys_setsockopt+0x22/0x30 net/socket.c:2125 [<ffffffff84350efa>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 [<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-and-tested-by: syzbot+e562383183e4b1766930@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-27can: m_can: fix whitespace in a few commentsTorin Cooper-Bennun
Fixes whitespace in comments titling sections of register masks. Link: https://lore.kernel.org/r/20210504125123.500553-5-torin@maxiluxsystems.com Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: m_can: make TXESC, RXESC config more explicitTorin Cooper-Bennun
Introduce masks for the three RXESC fields (RBDS, F1DS, F0DS) and the one TXESC field (TBDS). Update m_can_chip_config() to explicitly set all four fields to the 64-byte option (0x7) (and these defs are renamed to be more concise). This is an improvement in maintainability, and also makes it easier to implement more flexible configuration of the M_CAN buffers in the future. Link: https://lore.kernel.org/r/20210504125123.500553-4-torin@maxiluxsystems.com Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: m_can: clean up CCCR reg defs, order by revsTorin Cooper-Bennun
Ensures that the different CCCR regmasks for m_can revs 3.0.x, 3.1.x, 3.2.x and 3.3.x are clearly distinguishable. Removes incorrect CCCR_CANFD define. Adds bit fields UTSU and WMM for rev 3.3.x, for completeness. Link: https://lore.kernel.org/r/20210504125123.500553-3-torin@maxiluxsystems.com Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: m_can: use bits.h macros for all regmasksTorin Cooper-Bennun
This updates m_can.c to exclusively use GENMASK, FIELD_GET, FIELD_PREP and FIELD_MAX for regmask ops, as is convention in the current kernel (far less error-prone, far more concise). Link: https://lore.kernel.org/r/20210504125123.500553-2-torin@maxiluxsystems.com Signed-off-by: Torin Cooper-Bennun <torin@maxiluxsystems.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: c_can: add ethtool supportDario Binacchi
With commit 132f2d45fb23 ("can: c_can: add support to 64 message objects") the number of message objects used for reception / transmission depends on FIFO size. The ethtools API support allows you to retrieve this info. Driver info has been added too. Link: https://lore.kernel.org/r/20210514165549.14365-2-dariobin@libero.it Signed-off-by: Dario Binacchi <dariobin@libero.it> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: c_can: remove unused variable struct c_can_priv::rxmaskedDario Binacchi
The member rxmasked of struct c_can_priv is initialized by c_can_chip_config(), but's it's never used, so remove it. Link: https://lore.kernel.org/r/20210509124309.30024-2-dariobin@libero.it Signed-off-by: Dario Binacchi <dariobin@libero.it> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: kvaser_usb: Add new Kvaser hydra devicesJimmy Assarsson
Add new Kvaser hydra devices. Link: https://lore.kernel.org/r/20210429093730.499263-2-extja@kvaser.com Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: kvaser_usb: Rename define USB_HYBRID_{,PRO_}CANLIN_PRODUCT_IDJimmy Assarsson
Rename define USB_HYBRID_{,PRO_}CANLIN_PRODUCT_ID to USB_HYBRID_{,PRO_}2CANLIN_PRODUCT_ID, to reflect the channel count. Link: https://lore.kernel.org/r/20210429093730.499263-1-extja@kvaser.com Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27can: at91_can: silence clang warningMarc Kleine-Budde
This patch fixes the following clang warning, by marking the functions as maybe unused. gcc doesn't complain about unused inline functions. | drivers/net/can/at91_can.c:178:1: warning: unused function 'at91_is_sam9X5' [-Wunused-function] | AT91_IS(9X5); | ^ | drivers/net/can/at91_can.c:172:19: note: expanded from macro 'AT91_IS' | static inline int at91_is_sam##_model(const struct at91_priv *priv) \ | ^ | <scratch space>:66:1: note: expanded from here | at91_is_sam9X5 | ^ Link: https://lore.kernel.org/r/20210514153741.1958041-2-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>