summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
AgeCommit message (Collapse)Author
2020-01-06Revert "net/mlx5: Support lockless FTE read lookups"Parav Pandit
This reverts commit 7dee607ed0e04500459db53001d8e02f8831f084. During cleanup path, FTE's parent node group is removed which is referenced by the FTE while freeing the FTE. Hence FTE's lockless read lookup optimization done in cited commit is not possible at the moment. Hence, revert the commit. This avoid below KAZAN call trace. [ 110.390896] BUG: KASAN: use-after-free in find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391048] Read of size 4 at addr ffff888c19e6d220 by task swapper/12/0 [ 110.391219] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.5.0-rc1+ [ 110.391222] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014 [ 110.391225] Call Trace: [ 110.391229] <IRQ> [ 110.391246] dump_stack+0x95/0xd5 [ 110.391307] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391320] print_address_description.constprop.5+0x20/0x320 [ 110.391379] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391435] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391441] __kasan_report+0x149/0x18c [ 110.391499] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391504] kasan_report+0x12/0x20 [ 110.391511] __asan_report_load4_noabort+0x14/0x20 [ 110.391567] find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391625] del_sw_fte_rcu+0x4a/0x100 [mlx5_core] [ 110.391633] rcu_core+0x404/0x1950 [ 110.391640] ? rcu_accelerate_cbs_unlocked+0x100/0x100 [ 110.391649] ? run_rebalance_domains+0x201/0x280 [ 110.391654] rcu_core_si+0xe/0x10 [ 110.391661] __do_softirq+0x181/0x66c [ 110.391670] irq_exit+0x12c/0x150 [ 110.391675] smp_apic_timer_interrupt+0xf0/0x370 [ 110.391681] apic_timer_interrupt+0xf/0x20 [ 110.391684] </IRQ> [ 110.391695] RIP: 0010:cpuidle_enter_state+0xfa/0xba0 [ 110.391703] Code: 3d c3 9b b5 50 e8 56 75 6e fe 48 89 45 c8 0f 1f 44 00 00 31 ff e8 a6 94 6e fe 45 84 ff 0f 85 f6 02 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 db 06 00 00 4d 63 fe 4b 8d 04 7f 49 8d 04 87 49 8d [ 110.391706] RSP: 0018:ffff888c23a6fce8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 110.391712] RAX: dffffc0000000000 RBX: ffffe8ffff7002f8 RCX: 000000000000001f [ 110.391715] RDX: 1ffff11184ee6cb5 RSI: 0000000040277d83 RDI: ffff888c277365a8 [ 110.391718] RBP: ffff888c23a6fd40 R08: 0000000000000002 R09: 0000000000035280 [ 110.391721] R10: ffff888c23a6fc80 R11: ffffed11847485d0 R12: ffffffffb1017740 [ 110.391723] R13: 0000000000000003 R14: 0000000000000003 R15: 0000000000000000 [ 110.391732] ? cpuidle_enter_state+0xea/0xba0 [ 110.391738] cpuidle_enter+0x4f/0xa0 [ 110.391747] call_cpuidle+0x6d/0xc0 [ 110.391752] do_idle+0x360/0x430 [ 110.391758] ? arch_cpu_idle_exit+0x40/0x40 [ 110.391765] ? complete+0x67/0x80 [ 110.391771] cpu_startup_entry+0x1d/0x20 [ 110.391779] start_secondary+0x2f3/0x3c0 [ 110.391784] ? set_cpu_sibling_map+0x2500/0x2500 [ 110.391795] secondary_startup_64+0xa4/0xb0 [ 110.391841] Allocated by task 290: [ 110.391917] save_stack+0x21/0x90 [ 110.391921] __kasan_kmalloc.constprop.8+0xa7/0xd0 [ 110.391925] kasan_kmalloc+0x9/0x10 [ 110.391929] kmem_cache_alloc_trace+0xf6/0x270 [ 110.391987] create_root_ns.isra.36+0x58/0x260 [mlx5_core] [ 110.392044] mlx5_init_fs+0x5fd/0x1ee0 [mlx5_core] [ 110.392092] mlx5_load_one+0xc7a/0x3860 [mlx5_core] [ 110.392139] init_one+0x6ff/0xf90 [mlx5_core] [ 110.392145] local_pci_probe+0xde/0x190 [ 110.392150] work_for_cpu_fn+0x56/0xa0 [ 110.392153] process_one_work+0x678/0x1140 [ 110.392157] worker_thread+0x573/0xba0 [ 110.392162] kthread+0x341/0x400 [ 110.392166] ret_from_fork+0x1f/0x40 [ 110.392218] Freed by task 2742: [ 110.392288] save_stack+0x21/0x90 [ 110.392292] __kasan_slab_free+0x137/0x190 [ 110.392296] kasan_slab_free+0xe/0x10 [ 110.392299] kfree+0x94/0x250 [ 110.392357] tree_put_node+0x257/0x360 [mlx5_core] [ 110.392413] tree_remove_node+0x63/0xb0 [mlx5_core] [ 110.392469] clean_tree+0x199/0x240 [mlx5_core] [ 110.392525] mlx5_cleanup_fs+0x76/0x580 [mlx5_core] [ 110.392572] mlx5_unload+0x22/0xc0 [mlx5_core] [ 110.392619] mlx5_unload_one+0x99/0x260 [mlx5_core] [ 110.392666] remove_one+0x61/0x160 [mlx5_core] [ 110.392671] pci_device_remove+0x10b/0x2c0 [ 110.392677] device_release_driver_internal+0x1e4/0x490 [ 110.392681] device_driver_detach+0x36/0x40 [ 110.392685] unbind_store+0x147/0x200 [ 110.392688] drv_attr_store+0x6f/0xb0 [ 110.392693] sysfs_kf_write+0x127/0x1d0 [ 110.392697] kernfs_fop_write+0x296/0x420 [ 110.392702] __vfs_write+0x66/0x110 [ 110.392707] vfs_write+0x1a0/0x500 [ 110.392711] ksys_write+0x164/0x250 [ 110.392715] __x64_sys_write+0x73/0xb0 [ 110.392720] do_syscall_64+0x9f/0x3a0 [ 110.392725] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 7dee607ed0e0 ("net/mlx5: Support lockless FTE read lookups") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-12-09treewide: Use sizeof_field() macroPankaj Bharadiya
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" | while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net
2019-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock from commit c8183f548902 ("s390/qeth: fix potential deadlock on workqueue flush"), removed the code which was removed by commit 9897d583b015 ("s390/qeth: consolidate some duplicated HW cmd code"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-20net/mlx5: Fix auto group size calculationMaor Gottlieb
Once all the large flow groups (defined by the user when the flow table is created - max_num_groups) were created, then all the following new flow groups will have only one flow table entry, even though the flow table has place to larger groups. Fix the condition to prefer large flow group. Fixes: f0d22d187473 ("net/mlx5_core: Introduce flow steering autogrouped flow table") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux 1) New generic devlink param "enable_roce", for downstream devlink reload support 2) Do vport ACL configuration on per vport basis when enabling/disabling a vport. This enables to have vports enabled/disabled outside of eswitch config for future 3) Split the code for legacy vs offloads mode and make it clear 4) Tide up vport locking and workqueue usage 5) Fix metadata enablement for ECPF 6) Make explicit use of VF property to publish IB_DEVICE_VIRTUAL_FUNCTION 7) E-Switch and flow steering core low level support and refactoring for netfilter flowtables offload Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Add new chain for netfilter flow table offloadPaul Blakey
Netfilter tables (nftables) implements a software datapath that comes after tc ingress datapath. The datapath supports offloading such rules via the flow table offload API. This API is currently only used by NFT and it doesn't provide the global priority in regards to tc offload, so we assume offloading such rules must come after tc. It does provide a flow table priority parameter, so we need to provide some supported priority range. For that, split fastpath prio to two, flow table offload and tc offload, with one dedicated priority chain for flow table offload. Next patch will re-use the multi chain API to access this chain by allowing access to this chain by the fdb_sub_namespace. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Refactor creating fast path prio chainsPaul Blakey
Next patch will re-use this to add a new chain but in a different prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Accumulate levels for chains prio namespacesPaul Blakey
Tc chains are implemented by creating a chained prio steering type, and inside it there is a namespace for each chain (FDB_TC_MAX_CHAINS). Each of those has a list of priorities. Currently, all namespaces in a prio start at the parent prio level. But since we can jump from chain (namespace) to another chain in the same prio, we need the levels for higher chains to be higher as well. So we created unused prios to account for levels in previous namespaces. Fix that by accumulating the namespaces levels if we are inside a chained type prio, and removing the unused prios. Fixes: 328edb499f99 ('net/mlx5: Split FDB fast path prio to multiple namespaces') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Define fdb tc levels per prioPaul Blakey
Define FDB_TC_LEVELS_PER_PRIO instead of magic number 2. This is the number of levels used by each tc prio table in the fdb. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-13net/mlx5: Rename FDB_* tc related defines to FDB_TC_* definesPaul Blakey
Rename it to prepare for next patch that will add a different type of offload to the FDB. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-01net/mlx5: Support lockless FTE read lookupsParav Pandit
During connection tracking offloads with high number of connections, (40K connections per second), flow table group lock contention is observed. To improve the performance by reducing lock contention, lockless FTE read lookup is performed as described below. Each flow table entry is refcounted. Flow table entry is removed when refcount drops to zero. rhash table allows rcu protected lookup. Each hash table entry insertion and removal is write lock protected. Hence, it is possible to perform lockless lookup in rhash table using following scheme. (a) Guard FTE entry lookup per group using rcu read lock. (b) Before freeing the FTE entry, wait for all readers to finish accessing the FTE. Below example of one reader and write in parallel racing, shows protection in effect with rcu lock. lookup_fte_locked() rcu_read_lock(); search_hash_table() existing_flow_group_write_lock(); tree_put_node(fte) drop_ref_cnt(fte) del_sw_fte(fte) del_hash_table_entry(); call_rcu(); existing_flow_group_write_unlock(); get_ref_cnt(fte) fails rcu_read_unlock(); rcu grace period(); [..] kmem_cache_free(fte); Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-11-01net/mlx5: Do not hold group lock while allocating FTE in softwareParav Pandit
FTE memory allocation using alloc_fte() doesn't have any dependency on the flow group. Hence, do not hold flow group lock while performing alloc_fte(). This helps to reduce contention of flow group lock. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03net/mlx5: Add API to set the namespace steering modeMaor Gottlieb
Add API to set the flow steering root namesapce mode. Setting new mode should be called before any steering operation is executed on the namespace. This API is going to be used by steering users such switchdev. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03net/mlx5: Add direct rule fs_cmd implementationMaor Gottlieb
Add support to create flow steering objects via direct rule API (SW steering). New layer is added - fs_dr, this layer translates the command that fs_core sends to the FW into direct rule API. In case that direct rule is not supported in some feature then -EOPNOTSUPP is returned. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-03net/mlx5: Add flow steering actions to fs_cmd shim layerMaor Gottlieb
Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-21net/mlx5: Create bypass and loopback flow steering namespaces for RDMA RXMark Zhang
Use different namespaces for bypass and switchdev loopback because they have different priorities and default table miss action requirement: 1. bypass: with multiple priorities support, and MLX5_FLOW_TABLE_MISS_ACTION_DEF as the default table miss action; 2. switchdev loopback: with single priority support, and MLX5_FLOW_TABLE_MISS_ACTION_SWITCH_DOMAIN as the default table miss action. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-08-21net/mlx5: Add per-namespace flow table default miss action supportMark Zhang
Currently all the namespaces under the same steering domain share the same default table miss action, however in some situations (e.g., RDMA RX) different actions are required. This patch adds a per-namespace default table miss action instead of using the miss action of the steering domain. Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-07-04Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Misc updates from mlx5-next branch: 1) Add the required HW definitions and structures for upcoming TLS support. 2) Add support for MCQI and MCQS hardware registers for fw version query. 3) Added hardware bits and structures definitions for sub-functions 4) Small code cleanup and improvement for PF pci driver. 5) Bluefield (ECPF) updates and refactoring for better E-Switch management on ECPF embedded CPU NIC: 5.1) Consolidate querying eswitch number of VFs 5.2) Register event handler at the correct E-Switch init stage 5.3) Setup PF's inline mode and vlan pop when the ECPF is the E-Swtich manager ( the host PF is basically a VF ). 5.4) Handle Vport UC address changes in switchdev mode. 6) Cleanup the rep and netdev reference when unloading IB rep. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> i# All conflicts fixed but you are still merging.
2019-07-03net/mlx5: Introduce and use mlx5_eswitch_get_total_vports()Parav Pandit
Instead MLX5_TOTAL_VPORTS, use mlx5_eswitch_get_total_vports(). mlx5_eswitch_get_total_vports() in subsequent patch accounts for SF vports as well. Expanding MLX5_TOTAL_VPORTS macro would require exposing SF internals to more generic vport.h header file. Such exposure is not desired. Hence a mlx5_eswitch_get_total_vports() is introduced. Given that mlx5_eswitch_get_total_vports() API wants to work on const mlx5_core_dev*, change its helper functions also to accept const *dev. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-28net/mlx5e: reduce stack usage in mlx5_eswitch_termtbl_createArnd Bergmann
Putting an empty 'mlx5_flow_spec' structure on the stack is a bit wasteful and causes a warning on 32-bit architectures when building with clang -fsanitize-coverage: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c: In function 'mlx5_eswitch_termtbl_create': drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c:90:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Since the structure is never written to, we can statically allocate it to avoid the stack usage. To be on the safe side, mark all subsequent function arguments that we pass it into as 'const' as well. Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-28Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Misc updates from mlx5-next branch: 1) E-Switch vport metadata support for source vport matching 2) Convert mkey_table to XArray 3) Shared IRQs and to use single IRQ for all async EQs Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-26net/mlx5: Add flow context for flow tagJianbo Liu
Refactor the flow data structures, add new flow_context and move flow_tag into it, as flow_tag doesn't belong to the rule action. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28net/mlx5: Allocate root ns memory using kzalloc to match kfreeParav Pandit
root ns is yet another fs core node which is freed using kfree() by tree_put_node(). Rest of the other fs core objects are also allocated using kmalloc variants. However, root ns memory is allocated using kvzalloc(). Hence allocate root ns memory using kzalloc(). Fixes: 2530236303d9e ("net/mlx5_core: Flow steering tree initialization") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28net/mlx5: Avoid double free in fs init error unwinding pathParav Pandit
In below code flow, for ingress acl table root ns memory leads to double free. mlx5_init_fs init_ingress_acls_root_ns() init_ingress_acl_root_ns kfree(steering->esw_ingress_root_ns); /* steering->esw_ingress_root_ns is not marked NULL */ mlx5_cleanup_fs cleanup_ingress_acls_root_ns steering->esw_ingress_root_ns non NULL check passes. kfree(steering->esw_ingress_root_ns); /* double free */ Similar issue exist for other tables. Hence zero out the pointers to not process the table again. Fixes: 9b93ab981e3bf ("net/mlx5: Separate ingress/egress namespaces for each vport") Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-28net/mlx5: Avoid double free of root ns in the error flow pathParav Pandit
When root ns setup for rdma, sniffer tx and sniffer rx fails, such root ns cleanup is done by the error unwinding path of mlx5_cleanup_fs(). Below call graph shows an example for sniffer_rx_root_ns. mlx5_init_fs() init_sniffer_rx_root_ns() cleanup_root_ns(steering->sniffer_rx_root_ns); mlx5_cleanup_fs() cleanup_root_ns(steering->sniffer_rx_root_ns); /* double free of sniffer_rx_root_ns */ Hence, use the existing cleanup_fs to cleanup. Fixes: d83eb50e29de3 ("net/mlx5: Add support in RDMA RX steering") Fixes: 87d22483ce68e ("net/mlx5: Add sniffer namespaces") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-05-17net/mlx5e: Additional check for flow destination comparisonDmytro Linkin
Flow destination comparison has an inaccuracy: code see no difference between same vf ports, which belong to different pfs. Example: If start ping from VF0 (PF1) to VF1 (PF1) and mirror all traffic to VF0 (PF2), icmp reply to VF0 (PF1) and mirrored flow to VF0 (PF2) would be determined as same destination. It lead to creating flow handler with rule nodes, which not added to node tree. When later driver try to delete this flow rules we got kernel crash. Add comparison of vhca_id field to avoid this. Fixes: 1228e912c934 ("net/mlx5: Consider encapsulation properties when comparing destinations") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-29net/mlx5: Add new miss flow table actionMaor Gottlieb
Flow table supports three types of miss action: 1. Default miss action - go to default miss table according to table. 2. Go to specific table. 3. Switch domain - go to the root table of an alternative steering table domain. New table miss action was added - switch_domain. The next domain for RDMA_RX namespace is the NIC RX domain. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-29net/mlx5: Add support in RDMA RX steeringMaor Gottlieb
Add new flow steering namespace - MLX5_FLOW_NAMESPACE_RDMA_RX. Flow steering rules in this namespace are used to filter RDMA traffic. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-29net/mlx5: Pass flow steering objects to fs_cmdMaor Gottlieb
Pass the flow steering objects instead of their attributes to fs_cmd in order to decrease number of arguments and in addition it will be used to update object fields. Pass the flow steering root namespace instead of the device so will have context to the namespace in the fs_cmd layer. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-22Merge tag 'v5.1-rc1' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mlx5-next Linux 5.1-rc1 We forgot to reset the branch last merge window thus mlx5-next is outdated and still based on 5.0-rc2. This merge commit is needed to sync mlx5-next branch with 5.1-rc1. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-10net/mlx5: E-Switch, add a new prio to be used by the RDMA sideMark Bloch
Create a new prio in the FDB, it will be used when inserting steering rules into the FDB from the RDMA side. We create a new PRIO so rules from the net side and rules from the RDMA side won't be inserted to the same PRIO, each side has it's own sandbox to play in. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-10net/mlx5: E-Switch, don't use hardcoded values for FDB priosMark Bloch
When creating the FDB prios, use the enum values already defined and not the hardcoded values. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-04-02net/mlx5: Fix false compilation warningTariq Toukan
Fix the following warning: drivers/net/ethernet/mellanox/mlx5/core//fs_core.c:845:5: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized] No real issue here. This is only a false compiler warning. The 'err' variable is guaranteed to be init by time of usage. gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4) Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-11net/mlx5: Consolidate update FTE for all removal changesEli Britstein
With commit a18e879d4e45 ("net/mlx5e: Annul encap action ordering requirement") and a use-case of e-switch remote mirroring, the incremental/stepped FTE removal process done by the fs core got us to illegal transient states and FW errors: SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x9c2e40) To avoid that and improve FTE removal performance, aggregate the FTE's updates that should be applied. Remove the FTE if it is empty, or apply one FW update command with the aggregated updates. Fixes: a18e879d4e45 ("net/mlx5e: Annul encap action ordering requirement") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-11net/mlx5: Add a locked flag to node removal functionsEli Britstein
Add a locked flag to the node removal functions to signal if the parent is already locked from the caller function or not as a pre-step towards outside lock. Currently always use false with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-11net/mlx5: Add modify FTE helper functionEli Britstein
Add modify FTE helper function and use it when deleting a rule, as a pre-step towards consolidated FTE modification, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-11net/mlx5: Fix multiple updates of steering rules in parallelEli Britstein
There might be a condition where the fte found is not active yet. In this case we should not use it, but continue to search for another, or allocate a new one. Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next shared branched into net-next, From Bodong Wang: 1) Introduction of ECPF (Embedded CPU Physical Function), and low level bits for mlx5 SmartNic capabilities support. 2) Vport enumeration refactoring that affect mlx5_ib and mlx5_core From Aya Levin, 3) Add support for 50Gbps per lane link modes in the Port Type and Speed register (PTYS) 4) Refactor low level query functions for PTYS register 5) Add support for 50Gbps per lane link modes to mlx5_ib Note: due to a change in API in mlx5/core and a later patch from net-next, a fixup was squashed with this merge commit that replaces FDB_UPLINK_VPORT with MLX5_VPORT_UPLINK which exists only in upstream net-next. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Relocate vport macros to the vport header fileBodong Wang
These are two macros in the driver general header which deal with the number of total vports and if a vport is vport manager. Such macros are vport entities, better to place them at the vport header file. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-04net/mlx5: Fix code style issue in mlx driverTonghao Zhang
Add the tab before '}' and keep the code style consistent. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25net/mlx5: Add trace points for flow tables create/destroyOr Gerlitz
We were not tracking flow tables so far, add it up. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19net/mlx5: Typo fix in del_sw_hw_ruleYuval Avnery
Expression terminated with "," instead of ";", resulted in set_fte getting bad value for modify_enable_mask field. Fixes: bd5251dbf156 ("net/mlx5_core: Introduce flow steering destination of type counter") Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11net/mlx5: Consider encapsulation properties when comparing destinationsEli Britstein
Currently the driver identifies identical vport destinations by comparing the vport ID. The FW extended destination feature enables the driver to forward the packet to the same vport with multiple encapsulation properties. Change the vport destination comparison logic to compare the encapsulation properties in addition to the vport ID. Signed-off-by: Eli Britstein <elibr@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-22net/mlx5: Allocate enough space for the FDB sub-namespacesDan Carpenter
FDB_MAX_CHAIN is three. We wanted to allocate enough memory to hold four structs but there are missing parentheses so we only allocate enough memory for three structs and the first byte of the fourth one. Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17net/mlx5: Add a no-append flow insertion modePaul Blakey
If no-append flag is set, we will add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. While here, move the has_flow_tag boolean indicator to be a flag too. This patch doesn't change any functionality. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Split FDB fast path prio to multiple namespacesPaul Blakey
Towards supporting multi-chains and priorities, split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. This patch adds a new flow steering type, FS_TYPE_PRIO_CHAINS, which is like current FS_TYPE_PRIO, but may contain only namespaces, and those will be in parallel to one another in terms of managing of the flow tables connections inside them. Meaning, while searching for the next or previous flow table to connect for a new table inside such namespace we skip the parallel namespaces in the same level under the FS_TYPE_PRIO_CHAINS prio we originated from. We use this new type for splitting the fast path prio into multiple parallel namespaces, each containing normal prios. The prios inside them (and their tables) will be connected to one another, but not from one parallel namespace to another, instead the last prio in each namespace will be connected to the next prio in the containing FDB namespace, which is the slow path prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Use flow counter IDs and not the wrapping cache objectMark Bloch
Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: E-Switch, Get counters for offloaded flows from callersMark Bloch
There's no real reason for the e-switch logic to manage the creation of counters for offloaded flows. The API already has the directive for the caller to denote they want to attach a counter to the created flow. As such, we go and move the management of flow counters to the mlx5e tc offload logic. This also lets us remove an inelegant interface where the FS layer had to provide a way to retrieve a counter from a flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17Merge branch 'mlx5-next' of ↵Saeed Mahameed
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-next mlx5 updates for both net-next and rdma-next * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits) net/mlx5: Expose DC scatter to CQE capability bit net/mlx5: Update mlx5_ifc with DEVX UID bits net/mlx5: Set uid as part of DCT commands net/mlx5: Set uid as part of SRQ commands net/mlx5: Set uid as part of SQ commands net/mlx5: Set uid as part of RQ commands net/mlx5: Set uid as part of QP commands net/mlx5: Set uid as part of CQ commands net/mlx5: Rename incorrect naming in IFC file net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic net/mlx5: Add memic command opcode to command checker ... Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>