summaryrefslogtreecommitdiff
path: root/net/netfilter/nf_tables_api.c
AgeCommit message (Collapse)Author
2020-03-24netfilter: nf_tables: Allow set back-ends to report partial overlaps on ↵Pablo Neira Ayuso
insertion Currently, the -EEXIST return code of ->insert() callbacks is ambiguous: it might indicate that a given element (including intervals) already exists as such, or that the new element would clash with existing ones. If identical elements already exist, the front-end is ignoring this without returning error, in case NLM_F_EXCL is not set. However, if the new element can't be inserted due an overlap, we should report this to the user. To this purpose, allow set back-ends to return -ENOTEMPTY on collision with existing elements, translate that to -EEXIST, and return that to userspace, no matter if NLM_F_EXCL was set. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-05netfilter: nf_tables: fix infinite loop when expr is not availableFlorian Westphal
nft will loop forever if the kernel doesn't support an expression: 1. nft_expr_type_get() appends the family specific name to the module list. 2. -EAGAIN is returned to nfnetlink, nfnetlink calls abort path. 3. abort path sets ->done to true and calls request_module for the expression. 4. nfnetlink replays the batch, we end up in nft_expr_type_get() again. 5. nft_expr_type_get attempts to append family-specific name. This one already exists on the list, so we continue 6. nft_expr_type_get adds the generic expression name to the module list. -EAGAIN is returned, nfnetlink calls abort path. 7. abort path encounters the family-specific expression which has 'done' set, so it gets removed. 8. abort path requests the generic expression name, sets done to true. 9. batch is replayed. If the expression could not be loaded, then we will end up back at 1), because the family-specific name got removed and the cycle starts again. Note that userspace can SIGKILL the nft process to stop the cycle, but the desired behaviour is to return an error after the generic expr name fails to load the expression. Fixes: eb014de4fd418 ("netfilter: nf_tables: autoload modules from the abort path") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-05netfilter: nf_tables: dump NFTA_CHAIN_FLAGS attributePablo Neira Ayuso
Missing NFTA_CHAIN_FLAGS netlink attribute when dumping basechain definitions. Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-04netfilter: nf_tables: free flowtable hooks on hook register errorFlorian Westphal
If hook registration fails, the hooks allocated via nft_netdev_hook_alloc need to be freed. We can't change the goto label to 'goto 5' -- while it does fix the memleak it does cause a warning splat from the netfilter core (the hooks were not registered). Fixes: 3f0465a9ef02 ("netfilter: nf_tables: dynamically allocate hooks per net_device in flowtables") Reported-by: syzbot+a2ff6fa45162a5ed4dd3@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-27netfilter: nf_tables: Support for sets with multiple ranged fieldsStefano Brivio
Introduce a new nested netlink attribute, NFTA_SET_DESC_CONCAT, used to specify the length of each field in a set concatenation. This allows set implementations to support concatenation of multiple ranged items, as they can divide the input key into matching data for every single field. Such set implementations would be selected as they specify support for NFT_SET_INTERVAL and allow desc->field_count to be greater than one. Explicitly disallow this for nft_set_rbtree. In order to specify the interval for a set entry, userspace would include in NFTA_SET_DESC_CONCAT attributes field lengths, and pass range endpoints as two separate keys, represented by attributes NFTA_SET_ELEM_KEY and NFTA_SET_ELEM_KEY_END. While at it, export the number of 32-bit registers available for packet matching, as nftables will need this to know the maximum number of field lengths that can be specified. For example, "packets with an IPv4 address between 192.0.2.0 and 192.0.2.42, with destination port between 22 and 25", can be expressed as two concatenated elements: NFTA_SET_ELEM_KEY: 192.0.2.0 . 22 NFTA_SET_ELEM_KEY_END: 192.0.2.42 . 25 and NFTA_SET_DESC_CONCAT attribute would contain: NFTA_LIST_ELEM NFTA_SET_FIELD_LEN: 4 NFTA_LIST_ELEM NFTA_SET_FIELD_LEN: 2 v4: No changes v3: Complete rework, NFTA_SET_DESC_CONCAT instead of NFTA_SET_SUBKEY v2: No changes Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-27netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attributePablo Neira Ayuso
Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the interval between kernel and userspace. This patch also adds the NFT_SET_EXT_KEY_END extension to store the closing element value in this interval. v4: No changes v3: New patch [sbrivio: refactor error paths and labels; add corresponding nft_set_ext_type for new key; rebase] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-27netfilter: nf_tables: add nft_setelem_parse_key()Pablo Neira Ayuso
Add helper function to parse the set element key netlink attribute. v4: No changes v3: New patch [sbrivio: refactor error paths and labels; use NFT_DATA_VALUE_MAXLEN instead of sizeof(*key) in helper, value can be longer than that; rebase] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-24netfilter: nf_tables: autoload modules from the abort pathPablo Neira Ayuso
This patch introduces a list of pending module requests. This new module list is composed of nft_module_request objects that contain the module name and one status field that tells if the module has been already loaded (the 'done' field). In the first pass, from the preparation phase, the netlink command finds that a module is missing on this list. Then, a module request is allocated and added to this list and nft_request_module() returns -EAGAIN. This triggers the abort path with the autoload parameter set on from nfnetlink, request_module() is called and the module request enters the 'done' state. Since the mutex is released when loading modules from the abort phase, the module list is zapped so this is iteration occurs over a local list. Therefore, the request_module() calls happen when object lists are in consistent state (after fulling aborting the transaction) and the commit list is empty. On the second pass, the netlink command will find that it already tried to load the module, so it does not request it again and nft_request_module() returns 0. Then, there is a look up to find the object that the command was missing. If the module was successfully loaded, the command proceeds normally since it finds the missing object in place, otherwise -ENOENT is reported to userspace. This patch also updates nfnetlink to include the reason to enter the abort phase, which is required for this new autoload module rationale. Fixes: ec7470b834fe ("netfilter: nf_tables: store transaction list locally while requesting module") Reported-by: syzbot+29125d208b3dae9a7019@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-24netfilter: nf_tables: add __nft_chain_type_get()Pablo Neira Ayuso
This new helper function validates that unknown family and chain type coming from userspace do not trigger an out-of-bound array access. Bail out in case __nft_chain_type_get() returns NULL from nft_chain_parse_hook(). Fixes: 9370761c56b6 ("netfilter: nf_tables: convert built-in tables/chains to chain types") Reported-by: syzbot+156a04714799b1d480bc@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: nf_tables: fix flowtable list del corruptionFlorian Westphal
syzbot reported following crash: list_del corruption, ffff88808c9bb000->prev is LIST_POISON2 (dead000000000122) [..] Call Trace: __list_del_entry include/linux/list.h:131 [inline] list_del_rcu include/linux/rculist.h:148 [inline] nf_tables_commit+0x1068/0x3b30 net/netfilter/nf_tables_api.c:7183 [..] The commit transaction list has: NFT_MSG_NEWTABLE NFT_MSG_NEWFLOWTABLE NFT_MSG_DELFLOWTABLE NFT_MSG_DELTABLE A missing generation check during DELTABLE processing causes it to queue the DELFLOWTABLE operation a second time, so we corrupt the list here: case NFT_MSG_DELFLOWTABLE: list_del_rcu(&nft_trans_flowtable(trans)->list); nf_tables_flowtable_notify(&trans->ctx, because we have two different DELFLOWTABLE transactions for the same flowtable. We then call list_del_rcu() twice for the same flowtable->list. The object handling seems to suffer from the same bug so add a generation check too and only queue delete transactions for flowtables/objects that are still active in the next generation. Reported-by: syzbot+37a6804945a3a13b1572@syzkaller.appspotmail.com Fixes: 3b49e2e94e6eb ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()Dan Carpenter
Syzbot detected a leak in nf_tables_parse_netdev_hooks(). If the hook already exists, then the error handling doesn't free the newest "hook". Reported-by: syzbot+f9d4095107fc8749c69c@syzkaller.appspotmail.com Fixes: b75a3e8371bc ("netfilter: nf_tables: allow netdevice to be used only once per flowtable") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: nf_tables: remove WARN and add NLA_STRING upper limitsFlorian Westphal
This WARN can trigger because some of the names fed to the module autoload function can be of arbitrary length. Remove the WARN and add limits for all NLA_STRING attributes. Reported-by: syzbot+0e63ae76d117ae1c3a01@syzkaller.appspotmail.com Fixes: 452238e8d5ffd8 ("netfilter: nf_tables: add and use helper for module autoload") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-16netfilter: nf_tables: store transaction list locally while requesting modulePablo Neira Ayuso
This patch fixes a WARN_ON in nft_set_destroy() due to missing set reference count drop from the preparation phase. This is triggered by the module autoload path. Do not exercise the abort path from nft_request_module() while preparation phase cleaning up is still pending. WARNING: CPU: 3 PID: 3456 at net/netfilter/nf_tables_api.c:3740 nft_set_destroy+0x45/0x50 [nf_tables] [...] CPU: 3 PID: 3456 Comm: nft Not tainted 5.4.6-arch3-1 #1 RIP: 0010:nft_set_destroy+0x45/0x50 [nf_tables] Code: e8 30 eb 83 c6 48 8b 85 80 00 00 00 48 8b b8 90 00 00 00 e8 dd 6b d7 c5 48 8b 7d 30 e8 24 dd eb c5 48 89 ef 5d e9 6b c6 e5 c5 <0f> 0b c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 7f 10 e9 52 RSP: 0018:ffffac4f43e53700 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ffff99d63a154d80 RCX: 0000000001f88e03 RDX: 0000000001f88c03 RSI: ffff99d6560ef0c0 RDI: ffff99d63a101200 RBP: ffff99d617721de0 R08: 0000000000000000 R09: 0000000000000318 R10: 00000000f0000000 R11: 0000000000000001 R12: ffffffff880fabf0 R13: dead000000000122 R14: dead000000000100 R15: ffff99d63a154d80 FS: 00007ff3dbd5b740(0000) GS:ffff99d6560c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00001cb5de6a9000 CR3: 000000016eb6a004 CR4: 00000000001606e0 Call Trace: __nf_tables_abort+0x3e3/0x6d0 [nf_tables] nft_request_module+0x6f/0x110 [nf_tables] nft_expr_type_request_module+0x28/0x50 [nf_tables] nf_tables_expr_parse+0x198/0x1f0 [nf_tables] nft_expr_init+0x3b/0xf0 [nf_tables] nft_dynset_init+0x1e2/0x410 [nf_tables] nf_tables_newrule+0x30a/0x930 [nf_tables] nfnetlink_rcv_batch+0x2a0/0x640 [nfnetlink] nfnetlink_rcv+0x125/0x171 [nfnetlink] netlink_unicast+0x179/0x210 netlink_sendmsg+0x208/0x3d0 sock_sendmsg+0x5e/0x60 ____sys_sendmsg+0x21b/0x290 Update comment on the code to describe the new behaviour. Reported-by: Marco Oliverio <marco.oliverio@tanaza.com> Fixes: 452238e8d5ff ("netfilter: nf_tables: add and use helper for module autoload") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-05netfilter: nf_tables: unbind callbacks from flowtable destroy pathPablo Neira Ayuso
Callback unbinding needs to be done after nf_flow_table_free(), otherwise entries are not removed from the hardware. Update nft_unregister_flowtable_net_hooks() to call nf_unregister_net_hook() instead since the commit/abort paths do not deal with the callback unbinding anymore. Add a comment to nft_flowtable_event() to clarify that flow_offload_netdev_event() already removes the entries before the callback unbinding. Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Fixes ff4bf2f42a40 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: wenxu <wenxu@ucloud.cn>
2019-12-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso, including adding a missing ipv6 match description. 2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi Bhat. 3) Fix uninit value in bond_neigh_init(), from Eric Dumazet. 4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold. 5) Fix use after free in tipc_disc_rcv(), from Tuong Lien. 6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul Chaignon. 7) Multicast MAC limit test is off by one in qede, from Manish Chopra. 8) Fix established socket lookup race when socket goes from TCP_ESTABLISHED to TCP_LISTEN, because there lacks an intervening RCU grace period. From Eric Dumazet. 9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet. 10) Fix active backup transition after link failure in bonding, from Mahesh Bandewar. 11) Avoid zero sized hash table in gtp driver, from Taehee Yoo. 12) Fix wrong interface passed to ->mac_link_up(), from Russell King. 13) Fix DSA egress flooding settings in b53, from Florian Fainelli. 14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost. 15) Fix double free in dpaa2-ptp code, from Ioana Ciornei. 16) Reject invalid MTU values in stmmac, from Jose Abreu. 17) Fix refcount leak in error path of u32 classifier, from Davide Caratti. 18) Fix regression causing iwlwifi firmware crashes on boot, from Anders Kaseorg. 19) Fix inverted return value logic in llc2 code, from Chan Shu Tak. 20) Disable hardware GRO when XDP is attached to qede, frm Manish Chopra. 21) Since we encode state in the low pointer bits, dst metrics must be at least 4 byte aligned, which is not necessarily true on m68k. Add annotations to fix this, from Geert Uytterhoeven. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits) sfc: Include XDP packet headroom in buffer step size. sfc: fix channel allocation with brute force net: dst: Force 4-byte alignment of dst_metrics selftests: pmtu: fix init mtu value in description hv_netvsc: Fix unwanted rx_table reset net: phy: ensure that phy IDs are correctly typed mod_devicetable: fix PHY module format qede: Disable hardware gro when xdp prog is installed net: ena: fix issues in setting interrupt moderation params in ethtool net: ena: fix default tx interrupt moderation interval net/smc: unregister ib devices in reboot_event net: stmmac: platform: Fix MDIO init for platforms without PHY llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c) net: hisilicon: Fix a BUG trigered by wrong bytes_compl net: dsa: ksz: use common define for tag len s390/qeth: don't return -ENOTSUPP to userspace s390/qeth: fix promiscuous mode after reset s390/qeth: handle error due to unsupported transport mode cxgb4: fix refcount init for TC-MQPRIO offload tc-testing: initial tdc selftests for cls_u32 ...
2019-12-09treewide: Use sizeof_field() macroPankaj Bharadiya
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except at places where these are defined. Later patches will remove the unused definition of FIELD_SIZEOF(). This patch is generated using following script: EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h" git grep -l -e "\bFIELD_SIZEOF\b" | while read file; do if [[ "$file" =~ $EXCLUDE_FILES ]]; then continue fi sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file; done Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: David Miller <davem@davemloft.net> # for net
2019-12-09netfilter: nf_tables: skip module reference count bump on object updatesPablo Neira Ayuso
Use __nft_obj_type_get() instead, otherwise there is a module reference counter leak. Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-09netfilter: nf_tables: validate NFT_DATA_VALUE after nft_data_init()Pablo Neira Ayuso
Userspace might bogusly sent NFT_DATA_VERDICT in several netlink attributes that assume NFT_DATA_VALUE. Moreover, make sure that error path invokes nft_data_release() to decrement the reference count on the chain object. Fixes: 96518518cc41 ("netfilter: add nftables") Fixes: 0f3cd9b36977 ("netfilter: nf_tables: add range expression") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-09netfilter: nf_tables: validate NFT_SET_ELEM_INTERVAL_ENDPablo Neira Ayuso
Only NFTA_SET_ELEM_KEY and NFTA_SET_ELEM_FLAGS make sense for elements whose NFT_SET_ELEM_INTERVAL_END flag is set on. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-11-26Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle were: - Dynamic tick (nohz) updates, perhaps most notably changes to force the tick on when needed due to lengthy in-kernel execution on CPUs on which RCU is waiting. - Linux-kernel memory consistency model updates. - Replace rcu_swap_protected() with rcu_prepace_pointer(). - Torture-test updates. - Documentation updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer() net/sched: Replace rcu_swap_protected() with rcu_replace_pointer() net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer() net/core: Replace rcu_swap_protected() with rcu_replace_pointer() bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer() fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer() drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer() drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer() x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer() rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer() rcu: Suppress levelspread uninitialized messages rcu: Fix uninitialized variable in nocb_gp_wait() rcu: Update descriptions for rcu_future_grace_period tracepoint rcu: Update descriptions for rcu_nocb_wake tracepoint rcu: Remove obsolete descriptions for rcu_barrier tracepoint rcu: Ensure that ->rcu_urgent_qs is set before resched IPI workqueue: Convert for_each_wq to use built-in list check rcu: Several rcu_segcblist functions can be static rcu: Remove unused function hlist_bl_del_init_rcu() Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch() ...
2019-11-15netfilter: nf_tables: add nft_unregister_flowtable_hook()Pablo Neira Ayuso
Unbind flowtable callback if hook is unregistered. This patch is implicitly fixing the error path of nf_tables_newflowtable() and nft_flowtable_event(). Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Reported-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-11-15netfilter: nf_tables: check if bind callback fails and unbind if hook ↵wenxu
registration fails Undo the callback binding before unregistering the existing hooks. This should also check for error of the bind setup call. Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-11-15netfilter: nf_tables_offload: undo updates if transaction failsPablo Neira Ayuso
The nft_flow_rule_offload_commit() function might fail after several successful commands, thus, leaving the hardware filtering policy in inconsistent state. This patch adds nft_flow_rule_offload_abort() function which undoes the updates that have been already processed if one command in this transaction fails. Hence, the hardware ruleset is left as it was before this aborted transaction. The deletion path needs to create the flow_rule object too, in case that an existing rule needs to be re-added from the abort path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-11-12netfilter: nf_tables: add flowtable offload control planePablo Neira Ayuso
This patch adds the NFTA_FLOWTABLE_FLAGS attribute that allows users to specify the NF_FLOWTABLE_HW_OFFLOAD flag. This patch also adds a new setup interface for the flowtable type to perform the flowtable offload block callback configuration. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
One conflict in the BPF samples Makefile, some fixes in 'net' whilst we were converting over to Makefile.target rules in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-04netfilter: nf_tables: bogus EOPNOTSUPP on basechain updatePablo Neira Ayuso
Userspace never includes the NFT_BASE_CHAIN flag, this flag is inferred from the NFTA_CHAIN_HOOK atribute. The chain update path does not allow to update flags at this stage, the existing sanity check bogusly hits EOPNOTSUPP in the basechain case if the offload flag is set on. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-11-04netfilter: nf_tables: fix unexpected EOPNOTSUPP errorFernando Fernandez Mancera
If the object type doesn't implement an update operation and the user tries to update it will silently ignore the update operation. Fixes: aa4095a156b5 ("netfilter: nf_tables: fix possible null-pointer dereference in object update") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-30net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()Paul E. McKenney
This commit replaces the use of rcu_swap_protected() with the more intuitively appealing rcu_replace_pointer() as a step towards removing rcu_swap_protected(). Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/ Reported-by: Linus Torvalds <torvalds@linux-foundation.org> [ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: <netfilter-devel@vger.kernel.org> Cc: <coreteam@netfilter.org> Cc: <netdev@vger.kernel.org>
2019-10-23netfilter: nf_tables: support for multiple devices per netdev hookPablo Neira Ayuso
This patch allows you to register one netdev basechain to multiple devices. This adds a new NFTA_HOOK_DEVS netlink attribute to specify the list of netdevices. Basechains store a list of hooks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: increase maximum devices number per flowtablePablo Neira Ayuso
Rise the maximum limit of devices per flowtable up to 256. Rename NFT_FLOWTABLE_DEVICE_MAX to NFT_NETDEVICE_MAX in preparation to reuse the netdev hook parser for ingress basechain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: allow netdevice to be used only once per flowtablePablo Neira Ayuso
Allow netdevice only once per flowtable, otherwise hit EEXIST. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_tables: dynamically allocate hooks per net_device in flowtablesPablo Neira Ayuso
Use a list of hooks per device instead an array. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23netfilter: nf_flow_table: move priority to struct nf_flowtablePablo Neira Ayuso
Hardware offload needs access to the priority field, store this field in the nf_flowtable object. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-25netfilter: nf_tables: bogus EBUSY when deleting flowtable after flushLaura Garcia Liebana
The deletion of a flowtable after a flush in the same transaction results in EBUSY. This patch adds an activation and deactivation of flowtables in order to update the _use_ counter. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-20netfilter: nf_tables: allow lookups in dynamic setsFlorian Westphal
This un-breaks lookups in sets that have the 'dynamic' flag set. Given this active example configuration: table filter { set set1 { type ipv4_addr size 64 flags dynamic,timeout timeout 1m } chain input { type filter hook input priority 0; policy accept; } } ... this works: nft add rule ip filter input add @set1 { ip saddr } -> whenever rule is triggered, the source ip address is inserted into the set (if it did not exist). This won't work: nft add rule ip filter input ip saddr @set1 counter Error: Could not process rule: Operation not supported In other words, we can add entries to the set, but then can't make matching decision based on that set. That is just wrong -- all set backends support lookups (else they would not be very useful). The failure comes from an explicit rejection in nft_lookup.c. Looking at the history, it seems like NFT_SET_EVAL used to mean 'set contains expressions' (aka. "is a meter"), for instance something like nft add rule ip filter input meter example { ip saddr limit rate 10/second } or nft add rule ip filter input meter example { ip saddr counter } The actual meaning of NFT_SET_EVAL however, is 'set can be updated from the packet path'. 'meters' and packet-path insertions into sets, such as 'add @set { ip saddr }' use exactly the same kernel code (nft_dynset.c) and thus require a set backend that provides the ->update() function. The only set that provides this also is the only one that has the NFT_SET_EVAL feature flag. Removing the wrong check makes the above example work. While at it, also fix the flag check during set instantiation to allow supported combinations only. Fixes: 8aeff920dcc9b3f ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-20netfilter: nf_tables: add NFT_CHAIN_POLICY_UNSET and use itPablo Neira Ayuso
Default policy is defined as a unsigned 8-bit field, do not use a negative value to leave it unset, use this new NFT_CHAIN_POLICY_UNSET instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13netfilter: nf_tables_offload: remove rules when the device unregisterswenxu
If the net_device unregisters, clean up the offload rules before the chain is destroy. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-10netfilter: nft_{fwd,dup}_netdev: add offload supportPablo Neira Ayuso
This patch adds support for packet mirroring and redirection. The nft_fwd_dup_netdev_offload() function configures the flow_action object for the fwd and the dup actions. Extend nft_flow_rule_destroy() to release the net_device object when the flow_rule object is released, since nft_fwd_dup_netdev_offload() bumps the net_device reference counter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: wenxu <wenxu@ucloud.cn>
2019-09-08netfilter: nf_tables_offload: move indirect flow_block callback logic to corePablo Neira Ayuso
Add nft_offload_init() and nft_offload_exit() function to deal with the init and the exit path of the offload infrastructure. Rename nft_indr_block_get_and_ing_cmd() to nft_indr_block_cb(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08netfilter: nf_tables: Fix an Oops in nf_tables_updobj() error handlingDan Carpenter
The "newobj" is an error pointer so we can't pass it to kfree(). It doesn't need to be freed so we can remove that and I also renamed the error label. Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-05netfilter: nf_tables: fix possible null-pointer dereference in object updateFernando Fernandez Mancera
Not all objects have an update operation. If the object type doesn't implement an update operation and the user tries to update it will hit EOPNOTSUPP. Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-03netfilter: nf_tables: Introduce stateful object update operationFernando Fernandez Mancera
This patch adds the infrastructure needed for the stateful object update support. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Merge conflict of mlx5 resolved using instructions in merge commit 9566e650bf7fdf58384bb06df634f7531ca3a97e. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18netfilter: nf_tables: map basechain priority to hardware priorityPablo Neira Ayuso
This patch adds initial support for offloading basechains using the priority range from 1 to 65535. This is restricting the netfilter priority range to 16-bit integer since this is what most drivers assume so far from tc. It should be possible to extend this range of supported priorities later on once drivers are updated to support for 32-bit integer priorities. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09netfilter: nf_tables: use-after-free in failing rule with bound setPablo Neira Ayuso
If a rule that has already a bound anonymous set fails to be added, the preparation phase releases the rule and the bound set. However, the transaction object from the abort path still has a reference to the set object that is stale, leading to a use-after-free when checking for the set->bound field. Add a new field to the transaction that specifies if the set is bound, so the abort path can skip releasing it since the rule command owns it and it takes care of releasing it. After this update, the set->bound field is removed. [ 24.649883] Unable to handle kernel paging request at virtual address 0000000000040434 [ 24.657858] Mem abort info: [ 24.660686] ESR = 0x96000004 [ 24.663769] Exception class = DABT (current EL), IL = 32 bits [ 24.669725] SET = 0, FnV = 0 [ 24.672804] EA = 0, S1PTW = 0 [ 24.675975] Data abort info: [ 24.678880] ISV = 0, ISS = 0x00000004 [ 24.682743] CM = 0, WnR = 0 [ 24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000 [ 24.692207] [0000000000040434] pgd=0000000000000000 [ 24.697119] Internal error: Oops: 96000004 [#1] SMP [...] [ 24.889414] Call trace: [ 24.891870] __nf_tables_abort+0x3f0/0x7a0 [ 24.895984] nf_tables_abort+0x20/0x40 [ 24.899750] nfnetlink_rcv_batch+0x17c/0x588 [ 24.904037] nfnetlink_rcv+0x13c/0x190 [ 24.907803] netlink_unicast+0x18c/0x208 [ 24.911742] netlink_sendmsg+0x1b0/0x350 [ 24.915682] sock_sendmsg+0x4c/0x68 [ 24.919185] ___sys_sendmsg+0x288/0x2c8 [ 24.923037] __sys_sendmsg+0x7c/0xd0 [ 24.926628] __arm64_sys_sendmsg+0x2c/0x38 [ 24.930744] el0_svc_common.constprop.0+0x94/0x158 [ 24.935556] el0_svc_handler+0x34/0x90 [ 24.939322] el0_svc+0x8/0xc [ 24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863) [ 24.948336] ---[ end trace cebbb9dcbed3b56f ]--- Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-08netfilter: nf_tables_offload: support indr block callwenxu
nftable support indr-block call. It makes nftable an offload vlan and tunnel device. nft add table netdev firewall nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; } nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0 nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; } nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0 Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-19net: flow_offload: add flow_block structure and use itPablo Neira Ayuso
This object stores the flow block callbacks that are attached to this block. Update flow_block_cb_lookup() to take this new object. This patch restores the block sharing feature. Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-16netfilter: nf_tables: don't fail when updating base chain policyFlorian Westphal
The following nftables test case fails on nf-next: tests/shell/run-tests.sh tests/shell/testcases/transactions/0011chain_0 The test case contains: add chain x y { type filter hook input priority 0; } add chain x y { policy drop; }" The new test if (chain->flags ^ flags) return -EOPNOTSUPP; triggers here, because chain->flags has NFT_BASE_CHAIN set, but flags is 0 because no flag attribute was present in the policy update. Just fetch the current flag settings of a pre-existing chain in case userspace did not provide any. Fixes: c9626a2cbdb20 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-09netfilter: nf_tables: add hardware offload supportPablo Neira Ayuso
This patch adds hardware offload support for nftables through the existing netdev_ops->ndo_setup_tc() interface, the TC_SETUP_CLSFLOWER classifier and the flow rule API. This hardware offload support is available for the NFPROTO_NETDEV family and the ingress hook. Each nftables expression has a new ->offload interface, that is used to populate the flow rule object that is attached to the transaction object. There is a new per-table NFT_TABLE_F_HW flag, that is set on to offload an entire table, including all of its chains. This patch supports for basic metadata (layer 3 and 4 protocol numbers), 5-tuple payload matching and the accept/drop actions; this also includes basechain hardware offload only. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-06netfilter: nf_tables: force module load in case select_ops() returns -EAGAINPablo Neira Ayuso
nft_meta needs to pull in the nft_meta_bridge module in case that this is a bridge family rule from the select_ops() path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>