summaryrefslogtreecommitdiff
path: root/net/sched/act_vlan.c
AgeCommit message (Collapse)Author
2018-03-21net/sched: fix idr leak in the error path of tcf_vlan_init()Davide Caratti
tcf_vlan_init() can fail after the idr has been successfully reserved. When this happens, every subsequent attempt to configure vlan rules using the same idr value will systematically fail with -ENOSPC, unless the first attempt was done using the 'replace' keyword. # tc action add action vlan pop index 100 RTNETLINK answers: Cannot allocate memory We have an error talking to the kernel # tc action add action vlan pop index 100 RTNETLINK answers: No space left on device We have an error talking to the kernel # tc action add action vlan pop index 100 RTNETLINK answers: No space left on device We have an error talking to the kernel ... Fix this in tcf_vlan_init(), ensuring that tcf_idr_release() is called on the error path when the idr has been reserved, but not yet inserted. Also, don't test 'ovr' in the error path, to avoid a 'replace' failure implicitly become a 'delete' that leaks refcount in act_vlan module: # rmmod act_vlan; modprobe act_vlan # tc action add action vlan push id 5 index 100 # tc action replace action vlan push id 7 index 100 RTNETLINK answers: Cannot allocate memory We have an error talking to the kernel # tc action list action vlan # # rmmod act_vlan rmmod: ERROR: Module act_vlan is in use Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR") Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-17net/sched: fix NULL dereference in the error path of tcf_vlan_init()Davide Caratti
when the following command # tc actions replace action vlan pop index 100 is run for the first time, and tcf_vlan_init() fails allocating struct tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: __call_rcu+0x23/0x2b0 PGD 80000000760a2067 P4D 80000000760a2067 PUD 742c1067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_vlan(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel mbcache snd_hda_codec jbd2 snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev soundcore virtio_balloon pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net ata_piix crc32c_intel libata virtio_pci i2c_core virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan] CPU: 3 PID: 3119 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__call_rcu+0x23/0x2b0 RSP: 0018:ffffaac3005fb798 EFLAGS: 00010246 RAX: ffffffffc0704080 RBX: ffff97f2b4bbe900 RCX: 00000000ffffffff RDX: ffffffffabca5f00 RSI: 0000000000000010 RDI: 0000000000000010 RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000044 R10: 00000000fd003000 R11: ffff97f2faab5b91 R12: 0000000000000000 R13: ffffffffabca5f00 R14: ffff97f2fb80202c R15: 00000000fffffff4 FS: 00007f68f75b4740(0000) GS:ffff97f2ffd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000072b52001 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_vlan_init+0x168/0x270 [act_vlan] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f68f69c5ba0 RSP: 002b:00007fffd79c1118 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fffd79c1240 RCX: 00007f68f69c5ba0 RDX: 0000000000000000 RSI: 00007fffd79c1190 RDI: 0000000000000003 RBP: 000000005aaa708e R08: 0000000000000002 R09: 0000000000000000 R10: 00007fffd79c0ba0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffd79c1254 R14: 0000000000000001 R15: 0000000000669f60 Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 RIP: __call_rcu+0x23/0x2b0 RSP: ffffaac3005fb798 CR2: 0000000000000018 fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called only when p is not NULL. Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Manish Kurup <manish.kurup@verizon.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13net_sched: switch to exit_batch for action pernet opsCong Wang
Since we now hold RTNL lock in tc_action_net_exit(), it is good to batch them to speedup tc action dismantle. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05net_sched: remove unused parameter from act cleanup opsCong Wang
No one actually uses it. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10act_vlan: VLAN action rewrite to use RCU lock/unlock and updateManish Kurup
Using a spinlock in the VLAN action causes performance issues when the VLAN action is used on multiple cores. Rewrote the VLAN action to use RCU read locking for reads and updates instead. All functions now use an RCU dereferenced pointer to access the VLAN action context. Modified helper functions used by other modules, to use the RCU as opposed to directly accessing the structure. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Manish Kurup <manish.kurup@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10act_vlan: Change stats update to use per-core statsManish Kurup
The VLAN action maintains one set of stats across all cores, and uses a spinlock to synchronize updates to it from the same. Changed this to use a per-CPU stats context instead. This change will result in better performance. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Manish Kurup <manish.kurup@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09Revert "net_sched: hold netns refcnt for each action"Cong Wang
This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59. If we hold that refcnt, the netns can never be destroyed until all actions are destroyed by user, this breaks our netns design which we expect all actions are destroyed when we destroy the whole netns. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net_sched: hold netns refcnt for each actionCong Wang
TC actions have been destroyed asynchronously for a long time, previously in a RCU callback and now in a workqueue. If we don't hold a refcnt for its netns, we could use the per netns data structure, struct tcf_idrinfo, after it has been freed by netns workqueue. Hold refcnt to ensure netns destroy happens after all actions are gone. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Tested-by: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30net/sched: Change act_api and act_xxx modules to use IDRChris Mi
Typically, each TC filter has its own action. All the actions of the same type are saved in its hash table. But the hash buckets are too small that it degrades to a list. And the performance is greatly affected. For example, it takes about 0m11.914s to insert 64K rules. If we convert the hash table to IDR, it only takes about 0m1.500s. The improvement is huge. But please note that the test result is based on previous patch that cls_flower uses IDR. Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13netlink: pass extended ACK struct to parsing functionsJohannes Berg
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18netns: make struct pernet_operations::id unsigned intAlexey Dobriyan
Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-03net/sched: act_vlan: Push skb->data to mac_header prior calling skb_vlan_*() ↵Shmulik Ladkani
functions Generic skb_vlan_push/skb_vlan_pop functions don't properly handle the case where the input skb data pointer does not point at the mac header: - They're doing push/pop, but fail to properly unwind data back to its original location. For example, in the skb_vlan_push case, any subsequent 'skb_push(skb, skb->mac_len)' calls make the skb->data point 4 bytes BEFORE start of frame, leading to bogus frames that may be transmitted. - They update rcsum per the added/removed 4 bytes tag. Alas if data is originally after the vlan/eth headers, then these bytes were already pulled out of the csum. OTOH calling skb_vlan_push/skb_vlan_pop with skb->data at mac_header present no issues. act_vlan is the only caller to skb_vlan_*() that has skb->data pointing at network header (upon ingress). Other calles (ovs, bpf) already adjust skb->data at mac_header. This patch fixes act_vlan to point to the mac_header prior calling skb_vlan_*() functions, as other callers do. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Pravin Shelar <pshelar@ovn.org> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22net/sched: act_vlan: Introduce TCA_VLAN_ACT_MODIFY vlan actionShmulik Ladkani
TCA_VLAN_ACT_MODIFY allows one to change an existing tag. It accepts same attributes as TCA_VLAN_ACT_PUSH (protocol, id, priority). If packet is vlan tagged, then the tag gets overwritten according to user specified attributes. For example, this allows user to replace a tag's vid while preserving its priority bits (as opposed to "action vlan pop pipe action vlan push"). Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-18net_sched: act_vlan: Add priority optionHadar Hen Zion
The current vlan push action supports only vid and protocol options. Add priority option. Example script that adds vlan push action with vid and priority: tc filter add dev veth0 protocol ip parent ffff: \ flower \ indev veth0 \ action vlan push id 100 priority 5 Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25net_sched: move tc_action into tcf_commonWANG Cong
struct tc_action is confusing, currently we use it for two purposes: 1) Pass in arguments and carry out results from helper functions 2) A generic representation for tc actions The first one is error-prone, since we need to make sure we don't miss anything. This patch aims to get rid of this use, by moving tc_action into tcf_common, so that they are allocated together in hashtable and can be cast'ed easily. And together with the following patch, we could really make tc_action a generic representation for all tc actions and each type of action can inherit from it. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15net_sched: make tcf_hash_check() booleanWANG Cong
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07net sched: indentation and other OCD stylistic fixesJamal Hadi Salim
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
2016-06-07net sched actions: aggregate dumping of actions timeinfoJamal Hadi Salim
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07net sched actions: introduce timestamp for firsttime useJamal Hadi Salim
Useful to know when the action was first used for accounting (and debugging) Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07net sched: actions use tcf_lastuse_update for consistencyJamal Hadi Salim
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next' because we no longer have a per-netns conntrack hash. The ip_gre.c conflict as well as the iwlwifi ones were cases of overlapping changes. Conflicts: drivers/net/wireless/intel/iwlwifi/mvm/tx.c net/ipv4/ip_gre.c net/netfilter/nf_conntrack_core.c Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-10net sched: vlan action fix late bindingJamal Hadi Salim
Late vlan action binding was broken and is fixed with this patch. //add a vlan action to pop and give it an instance id of 1 sudo tc actions add action vlan pop index 1 //create filter which binds to vlan action id 1 sudo tc filter add dev $DEV parent ffff: protocol ip prio 1 u32 \ match ip dst 17.0.0.1/32 flowid 1:1 action vlan index 1 current message(before bug fix) was: RTNETLINK answers: Invalid argument We have an error talking to the kernel Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26sched: align nlattr properly when neededNicolas Dichtel
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25net_sched: add network namespace support for tc actionsWANG Cong
Currently tc actions are stored in a per-module hashtable, therefore are visible to all network namespaces. This is probably the last part of the tc subsystem which is not aware of netns now. This patch makes them per-netns, several tc action API's need to be adjusted for this. The tc action API code is ugly due to historical reasons, we need to refactor that code in the future. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-08net: sched: add percpu stats to actionsEric Dumazet
Reuse existing percpu infrastructure John Fastabend added for qdisc. This patch adds a new cpustats parameter to tcf_hash_create() and all actions pass false, meaning this patch should have no effect yet. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21sched: introduce vlan actionJiri Pirko
This tc action allows to work with vlan tagged skbs. Two supported sub-actions are header pop and header push. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>