2017-05-18vhost_net: try batch dequing from skb arrayJason Wang
We used to dequeue one skb at a time from skb_array during recvmsg(). This can be inefficient because of poor cache utilization and because the spinlock is touched for each packet. This patch batches the dequeues by calling the batch dequeuing helpers explicitly on the exported skb array, then passes each skb back through msg_control so the underlying socket can finish the copy to userspace. Batch dequeuing is also a prerequisite for further batching improvements on the receive path.

Tests were done by pktgen on tap with XDP1 in guest. Host is Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz.

rx batch | pps
0        | 2.25Mpps
1        | 2.33Mpps (+3.56%)
4        | 2.33Mpps (+3.56%)
16       | 2.35Mpps (+4.44%)
64       | 2.42Mpps (+7.56%) <- Default rx batching
128      | 2.40Mpps (+6.67%)
256      | 2.38Mpps (+5.78%)

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
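The batching idea in this commit message can be illustrated with a small user-space sketch. This is not the vhost_net code: `toy_source`, `toy_rx_buf`, `toy_source_dequeue_batch()` and `toy_rx_next()` are hypothetical names, and the source here is a plain array rather than an skb array protected by a spinlock. The point is only that the shared source is touched once per batch instead of once per packet.

```c
#include <stddef.h>

#define RX_BATCH 64 /* default rx batch size quoted in the commit message */

/* Hypothetical packet source standing in for the exported skb array. */
struct toy_source {
    void **items;   /* packets waiting to be received */
    size_t count;   /* number of entries in items */
    size_t pos;     /* next entry to hand out */
    size_t refills; /* how many times the source was touched */
};

/* Local cache of dequeued packets, owned by the receiver. */
struct toy_rx_buf {
    void *queue[RX_BATCH];
    size_t head; /* next cached packet to return */
    size_t tail; /* number of valid cached packets */
};

/* Dequeue up to n packets in one call; returns how many were copied. */
size_t toy_source_dequeue_batch(struct toy_source *s, void **array, size_t n)
{
    size_t i = 0;

    s->refills++;
    while (i < n && s->pos < s->count)
        array[i++] = s->items[s->pos++];
    return i;
}

/* Return the next packet, refilling the cache only when it runs empty. */
void *toy_rx_next(struct toy_rx_buf *buf, struct toy_source *s)
{
    if (buf->head == buf->tail) {
        buf->head = 0;
        buf->tail = toy_source_dequeue_batch(s, buf->queue, RX_BATCH);
        if (buf->tail == 0)
            return NULL; /* nothing pending */
    }
    return buf->queue[buf->head++];
}
```

With three packets pending, three calls to `toy_rx_next()` touch the source once; a fourth call touches it again and returns NULL.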
2017-05-18tap: support receiving skb from msg_controlJason Wang
This patch lets tap_recvmsg() receive an skb from its caller through msg_control. Vhost_net will be the first user. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18tun: support receiving skb through msg_controlJason Wang
This patch lets tun_recvmsg() receive an skb from its caller through msg_control. Vhost_net will be the first user. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18tap: export skb_arrayJason Wang
This patch exports skb_array through tap_get_skb_array(). Caller can then manipulate skb array directly. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18tun: export skb_arrayJason Wang
This patch exports skb_array through tun_get_skb_array(). Caller can then manipulate skb array directly. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18skb_array: introduce batch dequeuingJason Wang
Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18ptr_ring: introduce batch dequeuingJason Wang
This patch introduces a batched version of consuming: a consumer can dequeue more than one pointer from the ring at a time. We don't care about reordering of the reads here, so no compiler barrier is needed. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
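As a rough illustration of batched consuming, here is a single-threaded user-space sketch of a pointer ring in which a NULL slot means "empty". The names (`toy_ring`, `toy_ring_produce()`, `toy_ring_consume()`, `toy_ring_consume_batched()`) and the fixed size are assumptions made for the example; locking and memory ordering, which the real ring has to consider, are left out.

```c
#include <stddef.h>

#define TOY_RING_SIZE 8 /* hypothetical capacity for the sketch */

struct toy_ring {
    void *queue[TOY_RING_SIZE]; /* NULL marks an empty slot */
    size_t producer;
    size_t consumer;
};

/* Returns 0 on success, -1 if ptr is NULL or the ring is full. */
int toy_ring_produce(struct toy_ring *r, void *ptr)
{
    if (!ptr || r->queue[r->producer])
        return -1;
    r->queue[r->producer] = ptr;
    r->producer = (r->producer + 1) % TOY_RING_SIZE;
    return 0;
}

/* Dequeue a single pointer, or return NULL if the ring is empty. */
void *toy_ring_consume(struct toy_ring *r)
{
    void *ptr = r->queue[r->consumer];

    if (ptr) {
        r->queue[r->consumer] = NULL;
        r->consumer = (r->consumer + 1) % TOY_RING_SIZE;
    }
    return ptr;
}

/* Dequeue up to n pointers into array; returns how many were dequeued. */
size_t toy_ring_consume_batched(struct toy_ring *r, void **array, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        void *ptr = toy_ring_consume(r);

        if (!ptr)
            break;
        array[i] = ptr;
    }
    return i;
}
```

A caller that asks for more entries than are queued simply gets a short count back, so it never needs to know the ring's fill level in advance.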
2017-05-18skb_array: introduce skb_array_unconsumeJason Wang
Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18ptr_ring: add ptr_ring_unconsumeMichael S. Tsirkin
Applications that consume a batch of entries in one go can benefit from the ability to return some of them to the ring. Add an API for that, assuming there's space. If there's no space we naturally can't do this and have to drop entries, but that implies the ring is full, so we'd likely drop some anyway. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
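One way to picture "returning entries to the ring" is to step the consumer index backwards and refill the slots it passes, stopping as soon as a slot is still occupied. The sketch below does that for a single-threaded toy ring; `mini_ring` and `mini_ring_unconsume()` are hypothetical names, and the choice to report the leftover count to the caller (instead of destroying entries through a callback) is an assumption of this example.

```c
#include <stddef.h>

#define MINI_RING_SIZE 4 /* hypothetical capacity for the sketch */

struct mini_ring {
    void *queue[MINI_RING_SIZE]; /* NULL marks an empty slot */
    size_t consumer;             /* index of the next slot to consume */
};

/*
 * Put batch[0..n-1] back in front of the consumer so that they would be
 * consumed again in their original order. The batch is walked backwards
 * while the consumer index steps back one slot at a time. Returns the
 * number of entries (batch[0..ret-1]) that did not fit because the ring
 * was full; the caller has to drop those.
 */
size_t mini_ring_unconsume(struct mini_ring *r, void **batch, size_t n)
{
    while (n) {
        size_t head = (r->consumer + MINI_RING_SIZE - 1) % MINI_RING_SIZE;

        if (r->queue[head])
            break; /* no space left */
        r->queue[head] = batch[--n];
        r->consumer = head;
    }
    return n;
}
```

On an empty ring every entry fits; on a nearly full ring only the newest entries of the batch are restored and the remainder is reported back.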
2017-05-18net: x25: fix one potential use-after-free issuelinzhang
The function x25_init does not properly unregister the related resources in its error path. This results in a kernel oops if x25_init fails, so add the proper unregister calls to the error path. Also adjust the coding style and make x25_register_sysctl properly return failure. Signed-off-by: linzhang <xiaolou4617@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
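The usual shape of such an error path is a chain of labels that undoes the completed steps in reverse order. The sketch below shows the pattern in isolation; it is not the x25 code, and `toy_register()`, `toy_unregister()`, `toy_init()` and the `fail_at` parameter are hypothetical names used only to make the example testable.

```c
/* Tracks which hypothetical resources are currently registered. */
int toy_registered[3];

/* Register one step; fails with -1 when step equals fail_at. */
int toy_register(int step, int fail_at)
{
    if (step == fail_at)
        return -1;
    toy_registered[step] = 1;
    return 0;
}

void toy_unregister(int step)
{
    toy_registered[step] = 0;
}

/* Init with goto-based unwinding: each failure undoes the earlier steps. */
int toy_init(int fail_at)
{
    int rc;

    rc = toy_register(0, fail_at);
    if (rc)
        goto out;
    rc = toy_register(1, fail_at);
    if (rc)
        goto out_unreg0;
    rc = toy_register(2, fail_at);
    if (rc)
        goto out_unreg1;
    return 0;

out_unreg1:
    toy_unregister(1);
out_unreg0:
    toy_unregister(0);
out:
    return rc;
}
```

If the third step fails, the first two are unregistered before the error is returned, so nothing stays registered after a failed init.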
2017-05-18can: m_can: add deep Suspend/Resume supportQuentin Schulz
This adds Power Management deep Suspend/Resume support for Bosch M_CAN chip. When entering deep sleep, the clocks are gated, the interrupts are disabled. When resuming from deep sleep, the chip needs to be reinitialized, the clocks ungated and the interrupts enabled. Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-05-18can: m_can: factorize clock gating and ungatingQuentin Schulz
This creates a function to ungate M_CAN clocks and another to gate the same clocks, then swaps all gating/ungating code with their respective function. Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-05-18can: m_can: make m_can_start and m_can_stop symmetricQuentin Schulz
This moves clock gating out of the m_can_stop function, as the m_can_start function does not (and cannot, at least in the current implementation) ungate clocks. This way, both functions can now be used symmetrically. Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-05-18can: m_can: move Message RAM initialization to functionQuentin Schulz
To avoid possible ECC/parity checksum errors when reading an uninitialized buffer, the entire Message RAM is initialized when probing the driver. This initialization is done in the same function reading the Device Tree properties. This patch moves the RAM initialization to a separate function so it can be called separately from device initialization from Device Tree. Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-05-17bpf: adjust verifier heuristicsDaniel Borkmann
Current limits with regards to processing program paths do not really reflect today's needs anymore due to programs becoming more complex and verifier smarter, keeping track of more data such as const ALU operations, alignment tracking, spilling of PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for smarter matching of what LLVM generates.

This also comes with the side-effect that we result in fewer opportunities to prune search states and thus often need to do more work to prove safety than in the past due to different register states and stack layout where we mismatch. Generally, it's quite hard to determine what caused a sudden increase in complexity, it could be caused by something as trivial as a single branch somewhere at the beginning of the program where LLVM assigned a stack slot that is marked differently throughout other branches and thus causing a mismatch, where verifier then needs to prove safety for the whole rest of the program. Subsequently, programs with even less than half the insn size limit can get rejected.

We noticed that while some programs load fine under pre 4.11, they get rejected due to hitting limits on more recent kernels. We saw that in the vast majority of cases (90+%) pruning failed due to register mismatches. In case of stack mismatches, majority of cases failed due to different stack slot types (invalid, spill, misc) rather than differences in spilled registers.

This patch makes pruning more aggressive by also adding markers that sit at conditional jumps as well. Currently, we only mark jump targets for pruning. For example in direct packet access, these are usually error paths where we bail out. We found that adding these markers, it can reduce number of processed insns by up to 30%.

Another option is to ignore reg->id in probing PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning slightly as well by up to 7% observed complexity reduction as stand-alone. Meaning, if a previous path with register type PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then in the current state a PTR_TO_MAP_VALUE_OR_NULL register for the same map X must be safe as well.

Last but not least the patch also adds a scheduling point and bumps the current limit for instructions to be processed to a more adequate value.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
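The reg->id relaxation described in this message can be sketched as a comparison between a register state from an already verified path and the current one. This is a heavily simplified toy, not the verifier's state comparison: `toy_reg`, `toy_reg_safe_strict()` and `toy_reg_safe()` are hypothetical, and real register state carries far more than a type, an id and a map pointer.

```c
#include <stdbool.h>

enum toy_reg_type {
    TOY_SCALAR,
    TOY_PTR_TO_MAP_VALUE_OR_NULL,
};

/* Hypothetical, stripped-down register state. */
struct toy_reg {
    enum toy_reg_type type;
    int id;          /* identifier assigned when the pointer was obtained */
    const void *map; /* map the pointer refers to, if any */
};

/* Strict comparison: every field must match for the states to be equal. */
bool toy_reg_safe_strict(const struct toy_reg *old, const struct toy_reg *cur)
{
    return old->type == cur->type && old->id == cur->id &&
           old->map == cur->map;
}

/*
 * Relaxed comparison: for PTR_TO_MAP_VALUE_OR_NULL the id is ignored, so a
 * state already proven safe for map X also covers the current register when
 * it points to the same map X.
 */
bool toy_reg_safe(const struct toy_reg *old, const struct toy_reg *cur)
{
    if (old->type != cur->type)
        return false;
    if (old->type == TOY_PTR_TO_MAP_VALUE_OR_NULL)
        return old->map == cur->map;
    return toy_reg_safe_strict(old, cur);
}
```

Two registers that differ only in id fail the strict check but pass the relaxed one, which is the case where the search state can be pruned.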
2017-05-17ipv6: Check ip6_find_1stfragopt() return value properly.David S. Miller
Do not use unsigned variables to see if it returns a negative error or not. Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
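The pitfall is easy to reproduce outside the kernel: a negative error stored in an unsigned variable becomes a large positive number, so a later "less than zero" test can never fire. In the sketch below `toy_find_first_frag_opt()` is a hypothetical stand-in that returns either an offset or a negative errno; it is not the real ip6_find_1stfragopt().

```c
#include <errno.h>

/* Hypothetical helper: returns a header offset, or a negative errno. */
int toy_find_first_frag_opt(int malformed)
{
    return malformed ? -EINVAL : 40;
}

/* Checked pattern: keep the result signed, test it, then convert. */
int toy_header_len_checked(int malformed, unsigned int *hlen)
{
    int ret = toy_find_first_frag_opt(malformed);

    if (ret < 0)
        return ret; /* propagate the error */
    *hlen = (unsigned int)ret;
    return 0;
}

/*
 * Unchecked pattern: the result goes straight into an unsigned variable,
 * so an error turns into a huge bogus length instead of being detected.
 */
unsigned int toy_header_len_unchecked(int malformed)
{
    unsigned int hlen = toy_find_first_frag_opt(malformed);

    return hlen;
}
```

For a malformed packet the checked variant returns -EINVAL, while the unchecked one hands back a length in the billions.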
2017-05-17selftests/bpf: fix broken build due to types.hYonghong Song
Commit 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf selftests.") caused a build failure for tools/testing/selftests/bpf because of some missing types:

$ make -C tools/testing/selftests/bpf/
...
In file included from /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
../../../include/uapi/linux/bpf.h:170:3: error: unknown type name '__aligned_u64'
  __aligned_u64 key;
...
/usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
...

The type __aligned_u64 is defined in linux:include/uapi/linux/types.h. The fix is to copy the missing type definition into tools/testing/selftests/bpf/include/uapi/linux/types.h. Adding an additional include of "string.h" resolves the __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17Merge tag 'for-4.12/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

- a couple DM thin provisioning fixes
- a few request-based DM and DM multipath fixes for issues that were made when merging Christoph's changes with Bart's changes for 4.12
- a DM bufio unsigned overflow fix
- a couple pure fixes for the DM cache target.
- various very small tweaks to the DM cache target that enable considerable speed improvements in the face of continuous IO. Given that the cache target was significantly reworked for 4.12 I see no reason to sit on these advances until 4.13 considering the favorable results associated with such minimalist tweaks.

* tag 'for-4.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: handle kmalloc failure allocating background_tracker struct
  dm bufio: make the parameter "retain_bytes" unsigned long
  dm mpath: multipath_clone_and_map must not return -EIO
  dm mpath: don't return -EIO from dm_report_EIO
  dm rq: add a missing break to map_request
  dm space map disk: fix some book keeping in the disk space map
  dm thin metadata: call precommit before saving the roots
  dm cache policy smq: don't do any writebacks unless IDLE
  dm cache: simplify the IDLE vs BUSY state calculation
  dm cache: track all IO to the cache rather than just the origin device's IO
  dm cache policy smq: stop preemptively demoting blocks
  dm cache policy smq: put newly promoted entries at the top of the multiqueue
  dm cache policy smq: be more aggressive about triggering a writeback
  dm cache policy smq: only demote entries in bottom half of the clean multiqueue
  dm cache: fix incorrect 'idle_time' reset in IO tracker
2017-05-17Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Here are some bugfixes from I2C, especially removing a wrongly displayed error message for all i2c muxes"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: xgene: Set ACPI_COMPANION_I2C
  i2c: mv64xxx: don't override deferred probing when getting irq
  i2c: mux: only print failure message on error
  i2c: mux: reg: rename label to indicate what it does
  i2c: mux: reg: put away the parent i2c adapter on probe failure
2017-05-17Merge branch 'phy-marvell-cleanups'David S. Miller
Andrew Lunn says:

====================
net: phy: marvell: Checkpatch cleanup

I will be contributing a few new features to the Marvell PHY driver soon. Start by making the code mostly checkpatch clean. There should not be any functional changes. Just comments set into the correct format, missing blank lines, turn some comparisons around, and refactoring to reduce indentation depth.

There is still one camel in the code, but it actually makes sense, so leave it in peace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: marvell: checkpatch - Fix remaining long linesAndrew Lunn
Fold lines longer than 80 characters Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: marvell: Add helpers to get/set pageAndrew Lunn
Makes the code a bit more readable, and solves quite a few checkpatch warnings of lines longer than 80 characters. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: marvell: Refactor some bigger functionsAndrew Lunn
Break big functions up by using a number of smaller helper functions. Solves some of the over 80 lines warnings, by reducing the indentation level. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: marvell: Checkpatch - assignments and comparisonsAndrew Lunn
Avoid multiple assignments. Comparisons should place the constant on the right side of the test. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: marvell: Checkpatch - Missing or extra blank linesAndrew Lunn
Remove the extra blank lines, add one in where recommended. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: phy: Marvell: checkpatch - CommentsAndrew Lunn
Use net style comment blocks, and wrap one block with long lines. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17Merge branch 'tcp-TCP-TS-option-use-1-ms-clock'David S. Miller
Eric Dumazet says:

====================
tcp: TCP TS option use 1 ms clock

TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal 'jiffies' variable, because it had been a cheap and good enough generator.

Unfortunately some distros use HZ=250 or even HZ=100, leading to not very useful TCP timestamps.

For TCP flows in the DC, Google has used usec resolution for more than two years with great success [1]. RCVBUF autotuning is more precise.

This series converts tp->tcp_mstamp to a plain u64 value storing a 1 usec TCP clock. This choice will allow us to upstream the 1 usec TS option as discussed in IETF 97.

Kathleen Nichols [2] and others advocate for 1ms TS clocks for network analysis. (1ms being the lowest value supported by RFC 7323.)

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
[2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: switch TCP TS option (RFC 7323) to 1ms clockEric Dumazet
TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal 'jiffies' variable, because it had been a cheap and good enough generator.

For TCP flows on the Internet, 1 ms resolution would be much better than 4 ms or 10 ms (HZ=250 or HZ=100 respectively).

For TCP flows in the DC, Google has used usec resolution for more than two years with great success [1]. Receive size autotuning (DRS) is indeed more precise and converges faster to the optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing a 1 usec TCP clock. This choice will allow us to upstream the 1 usec TS option as discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
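The arithmetic behind these numbers is small enough to sketch. The example assumes a 64-bit microsecond counter and a timestamp clock of 1000 ticks per second; `TOY_TS_HZ`, `toy_ts_from_usec()` and `toy_tick_ms()` are hypothetical names, not the kernel's helpers.

```c
#include <stdint.h>

#define TOY_USEC_PER_SEC 1000000ULL
#define TOY_TS_HZ 1000ULL /* assumed 1 ms timestamp clock */

/* Derive a 32-bit, 1 ms timestamp value from a 64-bit usec clock. */
uint32_t toy_ts_from_usec(uint64_t usec)
{
    return (uint32_t)(usec / (TOY_USEC_PER_SEC / TOY_TS_HZ));
}

/* Granularity in ms of a jiffies-based clock for a given HZ. */
unsigned int toy_tick_ms(unsigned int hz)
{
    return 1000u / hz;
}
```

`toy_tick_ms(250)` and `toy_tick_ms(100)` give the 4 ms and 10 ms granularities mentioned in the message, while the usec-derived value advances every millisecond regardless of HZ.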
2017-05-17tcp: replace misc tcp_time_stamp to tcp_jiffies32Eric Dumazet
After this patch, all uses of tcp_time_stamp will require a change when we introduce 1 ms and/or 1 us TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_lp: cache tcp_time_stampEric Dumazet
tcp_time_stamp will become slightly more expensive soon, cache its value. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_westwood: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet
This CC does not need 1 ms tcp_time_stamp and can use the jiffy based 'timestamp'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 in __tcp_oow_rate_limited()Eric Dumazet
This place wants to use tcp_jiffies32, this is good enough. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: uses jiffies_32 to feed tp->chrono_startEric Dumazet
tcp_time_stamp will no longer be tied to jiffies. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed probe_timestampEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 for rcv_tstamp and lrcvtimeEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: bic, cubic: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp_bbr: use tcp_jiffies32 instead of tcp_time_stampEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp, since tcp_time_stamp will soon be only used for TCP TS option. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed tp->snd_cwnd_stampEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->snd_cwnd_stamp. tcp_time_stamp will soon be a little bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: use tcp_jiffies32 to feed tp->lsndtimeEric Dumazet
Use tcp_jiffies32 instead of tcp_time_stamp to feed tp->lsndtime. tcp_time_stamp will soon be a little bit more expensive than simply reading 'jiffies'. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17dccp: do not use tcp_time_stampEric Dumazet
Use our own macro instead of abusing tcp_time_stamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17tcp: introduce tcp_jiffies32Eric Dumazet
We abuse tcp_time_stamp for two different cases:

1) As the base to generate TCP Timestamp options (RFC 7323).
2) As a 32bit version of jiffies, since some TCP fields are 32bit wide to save memory.

Since we want to have a 1ms TCP TS clock in the future, regardless of the HZ value, we want to clean things up. tcp_jiffies32 is the truncated jiffies value, which will be used only in places where we want a 'host' timestamp.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
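A truncated 32-bit tick value remains usable for measuring intervals because unsigned subtraction wraps around. The sketch below shows this with a plain 64-bit counter standing in for jiffies; `toy_ticks32()` and `toy_ticks_elapsed()` are hypothetical helpers written for illustration.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit tick counter. */
uint32_t toy_ticks32(uint64_t ticks64)
{
    return (uint32_t)ticks64;
}

/*
 * Elapsed ticks between two truncated samples. Unsigned subtraction is
 * done modulo 2^32, so the result stays correct across one wraparound.
 */
uint32_t toy_ticks_elapsed(uint32_t now, uint32_t then)
{
    return now - then;
}
```

A sample taken at 0xFFFFFFFE and another taken at 3 are five ticks apart, even though the second value is numerically smaller.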
2017-05-17tcp: use tp->tcp_mstamp in output pathEric Dumazet
Idea is to later convert tp->tcp_mstamp to a full u64 counter using usec resolution, so that we can later have fine grained TCP TS clock (RFC 7323), regardless of HZ value. We try to refresh tp->tcp_mstamp only when necessary. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17sch_dsmark: Fix uninitialized variable warning.David S. Miller
We still need to initialize err to -EINVAL for the case where 'opt' is NULL in dsmark_init(). Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17Merge branch 'net-sched-multichain-filters'David S. Miller
Jiri Pirko says:

====================
net: sched: introduce multichain support for filters

Currently, each classful qdisc holds one chain of filters. This chain is traversed and each filter could be matched on, which may lead to execution of a list of actions. One such action could be "reclassify", which would "reset" the processing of the filter chain. So this filter chain could be looked at as a flat table.

Sometimes it is convenient for the user to configure a hierarchy of tables. An example use case is encapsulation. A hierarchy of tables is a common way to do this in HW pipelines, so it is much more convenient to offload.

This patchset contains two major patches:
8/10 - This patch introduces the support for having multiple chains of filters.
10/10 - This patch adds a new control action to allow going to a specified chain.

The rest of the patches are smaller or bigger dependencies of those 2. Please see individual patch descriptions for details.

Corresponding iproute2 patches are appended as a reply to this cover letter.

Simple example:

$ tc qdisc add dev eth0 ingress
$ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
$ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
$ tc filter show dev eth0 root
filter parent ffff: protocol ip pref 33 flower chain 0
filter parent ffff: protocol ip pref 33 flower chain 0 handle 0x1
  dst_mac 52:54:00:3d:c7:6d
  eth_type ipv4
        action order 1: gact action goto chain 11
         random type none pass val 0
         index 2 ref 1 bind 1

filter parent ffff: protocol ip pref 22 flower chain 11
filter parent ffff: protocol ip pref 22 flower chain 11 handle 0x1
  eth_type ipv4
  dst_ip 192.168.40.1
        action order 1: gact action drop
         random type none pass val 0
         index 3 ref 1 bind 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: add termination action to allow goto chainJiri Pirko
Introduce new type of termination action called "goto_chain". This allows user to specify a chain to be processed. This action type is then processed as a return value in tcf_classify loop in similar way as "reclassify" is, only it does not reset to the first filter in chain but rather reset to the first filter of the desired chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
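The control flow described here can be sketched as a classification loop in which "reclassify" restarts at chain 0 and "goto chain" restarts at the first filter of the named chain. Everything in this example is hypothetical (`toy_filter`, `toy_chain`, `toy_classify()`, the single integer match key, and the restart limit of 4); it illustrates the loop shape only and is not the tcf_classify implementation.

```c
#include <stddef.h>

enum toy_action { TOY_PASS, TOY_DROP, TOY_RECLASSIFY, TOY_GOTO_CHAIN };

struct toy_filter {
    int match_key;                 /* matches when the packet key equals this */
    enum toy_action action;
    unsigned int goto_index;       /* used only by TOY_GOTO_CHAIN */
    const struct toy_filter *next; /* next filter in the same chain */
};

struct toy_chain {
    unsigned int index;
    const struct toy_filter *first;
};

#define TOY_MAX_RESTARTS 4 /* hypothetical guard against endless loops */

/* Find the first filter of the chain with the given index, if it exists. */
const struct toy_filter *toy_chain_first(const struct toy_chain *chains,
                                         size_t n, unsigned int index)
{
    for (size_t i = 0; i < n; i++)
        if (chains[i].index == index)
            return chains[i].first;
    return NULL;
}

enum toy_action toy_classify(const struct toy_chain *chains, size_t n, int key)
{
    const struct toy_filter *f = toy_chain_first(chains, n, 0);
    int restarts = 0;

    while (f) {
        if (f->match_key != key) {
            f = f->next;
            continue;
        }
        if (f->action == TOY_RECLASSIFY || f->action == TOY_GOTO_CHAIN) {
            unsigned int target =
                f->action == TOY_GOTO_CHAIN ? f->goto_index : 0;

            if (++restarts > TOY_MAX_RESTARTS)
                return TOY_DROP;
            /* restart at the first filter of the target chain */
            f = toy_chain_first(chains, n, target);
            continue;
        }
        return f->action;
    }
    return TOY_PASS; /* no filter matched */
}
```

With a chain 0 filter that jumps to chain 11 and a chain 11 filter that drops, a matching packet ends in a drop, and a non-matching packet falls through untouched.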
2017-05-17net: sched: push tp down to action initJiri Pirko
Tp pointer will be needed by the next patch in order to get the chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: introduce multichain support for filtersJiri Pirko
Instead of having only one filter chain per block, introduce a list of chains for every block. Create chain 0 by default. UAPI is extended so the user can specify which chain to change. If the new attribute is not specified, chain 0 is used, which maintains backward compatibility. If the chain does not exist and the user wants to manipulate it, a new chain is created with the specified index. Also, when the last filter is removed from the chain, the chain is destroyed. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
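The create-on-demand and destroy-when-empty lifecycle can be sketched with a linked list of chains hanging off a block. The names (`lc_block`, `lc_chain`, `lc_chain_get()`, `lc_chain_put()`) and the filter counter are hypothetical, and keeping chain 0 alive even when it is empty is an assumption of this sketch based on "Create chain 0 by default".

```c
#include <stdlib.h>

struct lc_chain {
    unsigned int index;
    unsigned int nfilters; /* number of filters currently in the chain */
    struct lc_chain *next;
};

struct lc_block {
    struct lc_chain *chains;
};

/* Look up the chain with the given index, creating it if it is missing. */
struct lc_chain *lc_chain_get(struct lc_block *b, unsigned int index)
{
    struct lc_chain *c;

    for (c = b->chains; c; c = c->next)
        if (c->index == index)
            return c;

    c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->index = index;
    c->next = b->chains;
    b->chains = c;
    return c;
}

/* Destroy the chain once it holds no filters (chain 0 is kept: assumption). */
void lc_chain_put(struct lc_block *b, struct lc_chain *c)
{
    struct lc_chain **pp;

    if (c->nfilters || c->index == 0)
        return;
    for (pp = &b->chains; *pp; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            free(c);
            return;
        }
    }
}
```

Asking for chain 11 twice returns the same object; once its last filter is gone, a put removes it from the block while chain 0 stays in place.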
2017-05-17net: sched: push chain dump to a separate functionJiri Pirko
Since there will be multiple chains to dump, push chain dumping code to a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: introduce helpers to work with filter chainsJiri Pirko
Introduce struct tcf_chain object and set of helpers around it. Wraps up insertion, deletion and search in the filter chain. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17net: sched: move TC_H_MAJ macro call into tcf_auto_prioJiri Pirko
Call the helper from the function rather than to always adjust the return value of the function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>