Age | Commit message (Collapse) | Author |
|
Add helper functions increment and decrement the hook static keys.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Just like for MASQ, inspect the reply packets coming from DR/TUN
real servers and alter the connection's state and timeout
according to the protocol.
It's ipvs's duty to do traffic statistic if packets get hit,
no matter what mode it is.
Signed-off-by: longguang.yue <bigclouds@163.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Move the bpf/bpf-next patch processing queue to patchwork.kernel.org.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201011200149.66537-1-alexei.starovoitov@gmail.com
|
|
The bpf_fib_lookup() helper performs a neighbour lookup for the destination
IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
that the BPF program will pass the packet up the stack in this case.
However, with the addition of bpf_redirect_neigh() that can be used instead
to perform the neighbour lookup, at the cost of a bit of duplicated work.
For that we still need the target ifindex, and since bpf_fib_lookup()
already has that at the time it performs the neighbour lookup, there is
really no reason why it can't just return it in any case. So let's just
always return the ifindex if the FIB lookup itself succeeds.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Link: https://lore.kernel.org/bpf/20201009184234.134214-1-toke@redhat.com
|
|
"Daniel T. Lee" says:
====================
To avoid confusion caused by the increasing fragmentation of the BPF
Loader program, this commit would like to convert the previous bpf_load
loader with the libbpf loader.
Thanks to libbpf's bpf_link interface, managing the tracepoint BPF
program is much easier. bpf_program__attach_tracepoint manages the
enable of tracepoint event and attach of BPF programs to it with a
single interface bpf_link, so there is no need to manage event_fd and
prog_fd separately.
And due to addition of generic bpf_program__attach() to libbpf, it is
now possible to attach BPF programs with __attach() instead of
explicitly calling __attach_<type>().
This patchset refactors xdp_monitor with using this libbpf API, and the
bpf_load is removed and migrated to libbpf. Also, attach_tracepoint()
is replaced with the generic __attach() method in xdp_redirect_cpu.
Moreover, maps in kern program have been converted to BTF-defined map.
---
Changes in v2:
- added cleanup logic for bpf_link and bpf_object in xdp_monitor
- program section match with bpf_program__is_<type> instead of strncmp
- revert BTF key/val type to default of BPF_MAP_TYPE_PERF_EVENT_ARRAY
- split increment into seperate satement
- refactor pointer array initialization
- error code cleanup
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Most of the samples were converted to use the new BTF-defined MAP as
they moved to libbpf, but some of the samples were missing.
Instead of using the previous BPF MAP definition, this commit refactors
xdp_monitor and xdp_sample_pkts_kern MAP definition with the new
BTF-defined MAP format.
Also, this commit removes the max_entries attribute at PERF_EVENT_ARRAY
map type. The libbpf's bpf_object__create_map() will automatically
set max_entries to the maximum configured number of CPUs on the host.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-4-danieltimlee@gmail.com
|
|
>From commit d7a18ea7e8b6 ("libbpf: Add generic bpf_program__attach()"),
for some BPF programs, it is now possible to attach BPF programs
with __attach() instead of explicitly calling __attach_<type>().
This commit refactors the __attach_tracepoint() with libbpf's generic
__attach() method. In addition, this refactors the logic of setting
the map FD to simplify the code. Also, the missing removal of
bpf_load.o in Makefile has been fixed.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-3-danieltimlee@gmail.com
|
|
To avoid confusion caused by the increasing fragmentation of the BPF
Loader program, this commit would like to change to the libbpf loader
instead of using the bpf_load.
Thanks to libbpf's bpf_link interface, managing the tracepoint BPF
program is much easier. bpf_program__attach_tracepoint manages the
enable of tracepoint event and attach of BPF programs to it with a
single interface bpf_link, so there is no need to manage event_fd and
prog_fd separately.
This commit refactors xdp_monitor with using this libbpf API, and the
bpf_load is removed and migrated to libbpf.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010181734.1109-2-danieltimlee@gmail.com
|
|
Vladimir Oltean says:
====================
Offload tc-vlan mangle to mscc_ocelot switch
This series offloads one more action to the VCAP IS1 ingress TCAM, which
is to change the classified VLAN for packets, according to the VCAP IS1
keys (VLAN, source MAC, source IP, EtherType, etc).
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create a test that changes a VLAN ID from 200 to 300.
We also need to modify the preferences of the filters installed for the
other rules so that they are unique, because we now install the "tc-vlan
modify" filter in VCAP IS1 only temporarily, and we need to perform the
deletion by filter preference number.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the Extraction Frame Header contains a valid classified VLAN, use
that instead of the VLAN header present in the packet.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VCAP_IS1_ACT_VID_REPLACE_ENA action, from the VCAP IS1 ingress TCAM,
changes the classified VLAN.
We are only exposing this ability for switch ports that are under VLAN
aware bridges. This is because in standalone ports mode and under a
bridge with vlan_filtering=0, the ocelot driver configures the switch to
operate as VLAN-unaware, so the classified VLAN is not derived from the
802.1Q header from the packet, but instead is always equal to the
port-based VLAN ID of the ingress port. We _can_ still change the
classified VLAN for packets when operating in this mode, but the end
result will most likely be a drop, since both the ingress and the egress
port need to be members of the modified VLAN. And even if we install the
new classified VLAN into the VLAN table of the switch, the result would
still not be as expected: we wouldn't see, on the output port, the
modified VLAN tag, but the original one, even though the classified VLAN
was indeed modified. This is because of how the hardware works: on
egress, what is pushed to the frame is a "port tag", which gives us the
following options:
- Tag all frames with port tag (derived from the classified VLAN)
- Tag all frames with port tag, except if the classified VLAN is 0 or
equal to the native VLAN of the egress port
- No port tag
Needless to say, in VLAN-unaware mode we are disabling the port tag.
Otherwise, the existing VLAN tag would be ignored, and a second VLAN
tag (the port tag), holding the classified VLAN, would be pushed
(instead of replacing the existing 802.1Q tag). This is definitely not
what the user wanted when installing a "vlan modify" action.
So it is simply not worth bothering with VLAN modify rules under other
configurations except when the ports are fully VLAN-aware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Claudiu Manoil says:
====================
enetc: Migrate to PHYLINK and PCS_LYNX
Transitioning the enetc driver from phylib to phylink.
Offloading the serdes configuration to the PCS_LYNX
module is a mandatory part of this transition. Aiming
for a cleaner, more maintainable design, and better
code reuse.
The first 2 patches are clean up prerequisites.
Tested on a p1028rdb board.
v2: validate() explicitly rejects now all interface modes not
supported by the driver instead of relying on the device tree
to provide only supported interfaces, and dropped redundant
activation of pcs_poll (addressing Ioana's findings)
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a methodical transition of the driver from phylib
to phylink, following the guidelines from sfp-phylink.rst.
The MAC register configurations based on interface mode
were moved from the probing path to the mac_config() hook.
MAC enable and disable commands (enabling Rx and Tx paths
at MAC level) were also extracted and assigned to their
corresponding phylink hooks.
As part of the migration to phylink, the serdes configuration
from the driver was offloaded to the PCS_LYNX module,
introduced in commit 0da4c3d393e4 ("net: phy: add Lynx PCS module"),
the PCS_LYNX module being a mandatory component required to
make the enetc driver work with phylink.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.cionei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As part of the transition of the enetc ethernet driver from phylib
to phylink, the in-band operation mode of the SGMII interface
from enetc port 0 needs to be specified explicitly for phylink.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Decouple internal mdio bus creation from serdes
configuration, as a prerequisite to offloading
serdes configuration to a different module.
Group together mdio bus creation routines, cleanup.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Decouple level MAC configuration based on phy interface type
from general port configuration.
Group together MAC and link configuration code.
Decouple external mdio bus creation from interface type
parsing. No longer return an (unhandled) error code when
phy_node not found, use phy_node to indicate whether the
port has a phy or not. No longer fall-through when serdes
configuration fails for the link modes that require
internal link configuration.
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Daniel Borkmann says:
====================
This series addresses most of the feedback [0] that was to be followed
up from the last series, that is, UAPI helper comment improvements and
getting rid of the ifindex obj file hacks in the selftest by using a
BPF map instead. The __sk_buff data/data_end pointer work, I'm planning
to do in a later round as well as the mem*() BPF improvements we have
in Cilium for libbpf. Next, the series adds two features, i) a helper
called redirect_peer() to improve latency on netns switch, and ii) to
allow map in map with dynamic inner array map sizes. Selftests for each
are added as well. For details, please check individual patches, thanks!
[0] https://lore.kernel.org/bpf/cover.1601477936.git.daniel@iogearbox.net/
v5 -> v6:
- Going with Andrii's suggestion to make the misconfigured verifier
test more robust, and only probe on -EOPNOTSUPP (Andrii)
v4 -> v5:
- Replace cnt == -EOPNOTSUPP check with cnt < 0; I've used < 0
here as I think it's useful to keep the existing cnt == 0 ||
cnt >= ARRAY_SIZE(insn_buf) for error detection (Andrii)
v3 -> v4:
- Rename new array map flag to BPF_F_INNER_MAP (Alexei)
v2 -> v3:
- Remove tab that slipped into uapi helper desc (Jakub)
- Rework map in map for array to error from map_gen_lookup (Andrii)
v1 -> v2:
- Fixed selftest comment wrt inner1/inner2 value (Yonghong)
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Extend the test_tc_redirect test and add a small test that exercises the new
redirect_peer() helper for the IPv4 and IPv6 case.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-7-daniel@iogearbox.net
|
|
Rename into test_tc_redirect.sh and move setup and test code into separate
functions so they can be reused for newly added tests in here. Also remove
the crude hack to override ifindex inside the object file via xxd and sed
and just use a simple map instead. Map given iproute2 does not support BTF
fully and therefore neither global data at this point.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201010234006.7075-6-daniel@iogearbox.net
|
|
Extend the "diff_size" subtest to also include a non-inlined array map variant
where dynamic inner #elems are possible.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201010234006.7075-5-daniel@iogearbox.net
|
|
Recent work in f4d05259213f ("bpf: Add map_meta_equal map ops") and 134fede4eecf
("bpf: Relax max_entries check for most of the inner map types") added support
for dynamic inner max elements for most map-in-map types. Exceptions were maps
like array or prog array where the map_gen_lookup() callback uses the maps'
max_entries field as a constant when emitting instructions.
We recently implemented Maglev consistent hashing into Cilium's load balancer
which uses map-in-map with an outer map being hash and inner being array holding
the Maglev backend table for each service. This has been designed this way in
order to reduce overall memory consumption given the outer hash map allows to
avoid preallocating a large, flat memory area for all services. Also, the
number of service mappings is not always known a-priori.
The use case for dynamic inner array map entries is to further reduce memory
overhead, for example, some services might just have a small number of back
ends while others could have a large number. Right now the Maglev backend table
for small and large number of backends would need to have the same inner array
map entries which adds a lot of unneeded overhead.
Dynamic inner array map entries can be realized by avoiding the inlined code
generation for their lookup. The lookup will still be efficient since it will
be calling into array_map_lookup_elem() directly and thus avoiding retpoline.
The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips
inline code generation and relaxes array_map_meta_equal() check to ignore both
maps' max_entries. This also still allows to have faster lookups for map-in-map
when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed.
Example code generation where inner map is dynamic sized array:
# bpftool p d x i 125
int handle__sys_enter(void * ctx):
; int handle__sys_enter(void *ctx)
0: (b4) w1 = 0
; int key = 0;
1: (63) *(u32 *)(r10 -4) = r1
2: (bf) r2 = r10
;
3: (07) r2 += -4
; inner_map = bpf_map_lookup_elem(&outer_arr_dyn, &key);
4: (18) r1 = map[id:468]
6: (07) r1 += 272
7: (61) r0 = *(u32 *)(r2 +0)
8: (35) if r0 >= 0x3 goto pc+5
9: (67) r0 <<= 3
10: (0f) r0 += r1
11: (79) r0 = *(u64 *)(r0 +0)
12: (15) if r0 == 0x0 goto pc+1
13: (05) goto pc+1
14: (b7) r0 = 0
15: (b4) w6 = -1
; if (!inner_map)
16: (15) if r0 == 0x0 goto pc+6
17: (bf) r2 = r10
;
18: (07) r2 += -4
; val = bpf_map_lookup_elem(inner_map, &key);
19: (bf) r1 = r0 | No inlining but instead
20: (85) call array_map_lookup_elem#149280 | call to array_map_lookup_elem()
; return val ? *val : -1; | for inner array lookup.
21: (15) if r0 == 0x0 goto pc+1
; return val ? *val : -1;
22: (61) r6 = *(u32 *)(r0 +0)
; }
23: (bc) w0 = w6
24: (95) exit
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net
|
|
Add an efficient ingress to ingress netns switch that can be used out of tc BPF
programs in order to redirect traffic from host ns ingress into a container
veth device ingress without having to go via CPU backlog queue [0]. For local
containers this can also be utilized and path via CPU backlog queue only needs
to be taken once, not twice. On a high level this borrows from ipvlan which does
similar switch in __netif_receive_skb_core() and then iterates via another_round.
This helps to reduce latency for mentioned use cases.
Pod to remote pod with redirect(), TCP_RR [1]:
# percpu_netperf 10.217.1.33
RT_LATENCY: 122.450 (per CPU: 122.666 122.401 122.333 122.401 )
MEAN_LATENCY: 121.210 (per CPU: 121.100 121.260 121.320 121.160 )
STDDEV_LATENCY: 120.040 (per CPU: 119.420 119.910 125.460 115.370 )
MIN_LATENCY: 46.500 (per CPU: 47.000 47.000 47.000 45.000 )
P50_LATENCY: 118.500 (per CPU: 118.000 119.000 118.000 119.000 )
P90_LATENCY: 127.500 (per CPU: 127.000 128.000 127.000 128.000 )
P99_LATENCY: 130.750 (per CPU: 131.000 131.000 129.000 132.000 )
TRANSACTION_RATE: 32666.400 (per CPU: 8152.200 8169.842 8174.439 8169.897 )
Pod to remote pod with redirect_peer(), TCP_RR:
# percpu_netperf 10.217.1.33
RT_LATENCY: 44.449 (per CPU: 43.767 43.127 45.279 45.622 )
MEAN_LATENCY: 45.065 (per CPU: 44.030 45.530 45.190 45.510 )
STDDEV_LATENCY: 84.823 (per CPU: 66.770 97.290 84.380 90.850 )
MIN_LATENCY: 33.500 (per CPU: 33.000 33.000 34.000 34.000 )
P50_LATENCY: 43.250 (per CPU: 43.000 43.000 43.000 44.000 )
P90_LATENCY: 46.750 (per CPU: 46.000 47.000 47.000 47.000 )
P99_LATENCY: 52.750 (per CPU: 51.000 54.000 53.000 53.000 )
TRANSACTION_RATE: 90039.500 (per CPU: 22848.186 23187.089 22085.077 21919.130 )
[0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
[1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net
|
|
Follow-up to address David's feedback that we should better describe internals
of the bpf_redirect_neigh() helper.
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Link: https://lore.kernel.org/bpf/20201010234006.7075-2-daniel@iogearbox.net
|
|
Move the skb_headroom check out of fr_hard_header and into pvc_xmit.
This has two benefits:
1. Originally we only do this check for skbs sent by users on Ethernet-
emulating PVC devices. After the change we do this check for skbs sent on
normal PVC devices, too.
(Also add a comment to make it clear that this is only a protection
against upper layers that don't take dev->needed_headroom into account.
Such upper layers should be rare and I believe they should be fixed.)
2. After the change we can simplify the parameter list of fr_hard_header.
We no longer need to use a pointer to pointers (skb_p) because we no
longer need to replace the skb inside fr_hard_header.
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tobias reported regressions in IPsec tests following the patch
referenced by the Fixes tag below. The root cause is dropping the
reset of the flowi4_oif after the fib_lookup. Apparently it is
needed for xfrm cases, so restore the oif update to ip_route_output_flow
right before the call to xfrm_lookup_route.
Fixes: 2fbc6e89b2f1 ("ipv4: Update exception handling for multipath routes via same device")
Reported-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MTU setting for this DSA switch is global so we need
to keep track of the MTU set for each port, then as soon
as any MTU changes, roof the MTU to the biggest common
denominator and poke that into the switch MTU setting.
To achieve this we need a per-chip-variant state container
for the RTL8366RB to use for the RTL8366RB-specific
stuff. Other SMI switches does seem to have per-port
MTU setting capabilities.
Fixes: 5f4a8ef384db ("net: dsa: rtl8366rb: Support setting MTU")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Paolo Abeni says:
====================
mptcp: some fallback fixes
pktdrill pointed-out we currently don't handle properly some
fallback scenario for MP_JOIN subflows
The first patch addresses such issue.
Patch 2/2 fixes a related pre-existing issue that is more
evident after 1/2: we could keep using for MPTCP signaling
closed subflows.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The msk can close MP_JOIN subflows if the initial handshake
fails. Currently such subflows are kept alive in the
conn_list until the msk itself is closed.
Beyond the wasted memory, we could end-up sending the
DATA_FIN and the DATA_FIN ack on such socket, even after a
reset.
Fixes: 43b54c6ee382 ("mptcp: Use full MPTCP-level disconnect state machine")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Additional/MP_JOIN subflows that do not pass some initial handshake
tests currently causes fallback to TCP. That is an RFC violation:
we should instead reset the subflow and leave the the msk untouched.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/91
Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Better place for of_mdio.c is drivers/net/mdio.
Move of_mdio.c from drivers/of to drivers/net/mdio
Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When packets are received on the error queue, this function under
net_ratelimit():
netif_err(priv, hw, net_dev, "Err FD status = 0x%08x\n");
does not get printed. Instead we only see:
[ 3658.845592] net_ratelimit: 244 callbacks suppressed
[ 3663.969535] net_ratelimit: 230 callbacks suppressed
[ 3669.085478] net_ratelimit: 228 callbacks suppressed
Enabling NETIF_MSG_HW fixes this issue, and we can see some information
about the frame descriptors of packets.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Factor out handling the private packet/byte counters to new
functions rtl_get_priv_stats() and rtl_inc_priv_stats().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Obviously this driver version doesn't make sense. Go with the default
and let ethtool display the kernel version.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For older versions of gcc, the array = {0}; will cause warnings:
net/smc/smc_llc.c: In function 'smc_llc_add_link_local':
net/smc/smc_llc.c:1212:9: warning: missing braces around initializer [-Wmissing-braces]
struct smc_llc_msg_add_link add_llc = {0};
^
net/smc/smc_llc.c:1212:9: warning: (near initialization for 'add_llc.hd') [-Wmissing-braces]
net/smc/smc_llc.c: In function 'smc_llc_srv_delete_link_local':
net/smc/smc_llc.c:1245:9: warning: missing braces around initializer [-Wmissing-braces]
struct smc_llc_msg_del_link del_llc = {0};
^
net/smc/smc_llc.c:1245:9: warning: (near initialization for 'del_llc.hd') [-Wmissing-braces]
2 warnings generated
Fixes: 4dadd151b265 ("net/smc: enqueue local LLC messages")
Signed-off-by: Pujin Shi <shipujin.t@gmail.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For older versions of gcc, the array = {0}; will cause warnings:
net/smc/smc_llc.c: In function 'smc_llc_send_link_delete_all':
net/smc/smc_llc.c:1317:9: warning: missing braces around initializer [-Wmissing-braces]
struct smc_llc_msg_del_link delllc = {0};
^
net/smc/smc_llc.c:1317:9: warning: (near initialization for 'delllc.hd') [-Wmissing-braces]
1 warnings generated
Fixes: f3811fd7bc97 ("net/smc: send DELETE_LINK, ALL message and wait for send to complete")
Signed-off-by: Pujin Shi <shipujin.t@gmail.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make use of the new struct_size() helper instead of the offsetof() idiom.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
====================
linux-can-fixes-for-5.9-20201008
The first patch is by Lucas Stach and fixes m_can driver by removing an
erroneous call to m_can_class_suspend() in runtime suspend. Which causes the
pinctrl state to get stuck on the "sleep" state, which breaks all CAN
functionality on SoCs where this state is defined.
The last two patches target the j1939 protocol: Cong Wang fixes a syzbot
finding of an uninitialized variable in the j1939 transport protocol. I
contribute a patch, that fixes the initialization of a same uninitialized
variable in a different function.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.10
Fourth and last set of patches for v5.10. Most of these are iwlwifi
patches, but few small fixes to other drivers as well.
Major changes:
iwlwifi
* PNVM support (platform-specific phy config data)
* bump the FW API support to 59
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
A handful of changes:
* fixes for the recent S1G work
* a docbook build time improvement
* API to pass beacon rate to lower-level driver
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Johannes Berg says:
====================
netlink: export policy on validation failures
Export the policy used for attribute validation when it fails,
so e.g. for an out-of-range attribute userspace immediately gets
the valid ranges back.
v2 incorporates the suggestion from Jakub to have a function to
estimate the size (netlink_policy_dump_attr_size_estimate()) and
check that it does the right thing on the *normal* policy dumps,
not (just) when calling it from the error scenario.
v3 only addresses a few minor style issues.
v4 fixes up a forgotten 'git add' ... sorry.
v5 is a resend, I messed up v4's cover letter subject (saying v3)
and apparently the second patch didn't go out at all.
Tested using nl80211/iw in a few scenarios, seems to work fine
and return the policy back, e.g.
kernel reports: integer out of range
policy: 04 00 0b 00 0c 00 04 00 01 00 00 00 00 00 00 00
^ padding
^ minimum allowed value
policy: 04 00 0b 00 0c 00 05 00 ff ff ff ff 00 00 00 00
^ padding
^ maximum allowed value
policy: 08 00 01 00 04 00 00 00
^ type 4 == U32
for an out-of-range case.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new attribute NLMSGERR_ATTR_POLICY to the extended ACK
to advertise the policy, e.g. if an attribute was out of range,
you'll know the range that's permissible.
Add new NL_SET_ERR_MSG_ATTR_POL() and NL_SET_ERR_MSG_ATTR_POL()
macros to set this, since realistically it's only useful to do
this when the bad attribute (offset) is also returned.
Use it in lib/nlattr.c which practically does all the policy
validation.
v2:
- add and use netlink_policy_dump_attr_size_estimate()
v3:
- remove redundant break
v4:
- really remove redundant break ... sorry
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor the per-attribute policy writing into a new
helper function, to be used later for dumping out the
policy of a rejected attribute.
v2:
- fix some indentation
v3:
- change variable order in netlink_policy_dump_write()
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the function node_lost_contact(), we call __skb_queue_purge() without
grabbing the list->lock. This can cause to a race-condition why processing
the list 'namedq' in calling path tipc_named_rcv()->tipc_named_dequeue().
[] BUG: kernel NULL pointer dereference, address: 0000000000000000
[] #PF: supervisor read access in kernel mode
[] #PF: error_code(0x0000) - not-present page
[] PGD 7ca63067 P4D 7ca63067 PUD 6c553067 PMD 0
[] Oops: 0000 [#1] SMP NOPTI
[] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G O 5.9.0-rc6+ #2
[] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [...]
[] RIP: 0010:tipc_named_rcv+0x103/0x320 [tipc]
[] Code: 41 89 44 24 10 49 8b 16 49 8b 46 08 49 c7 06 00 00 00 [...]
[] RSP: 0018:ffffc900000a7c58 EFLAGS: 00000282
[] RAX: 00000000000012ec RBX: 0000000000000000 RCX: ffff88807bde1270
[] RDX: 0000000000002c7c RSI: 0000000000002c7c RDI: ffff88807b38f1a8
[] RBP: ffff88807b006288 R08: ffff88806a367800 R09: ffff88806a367900
[] R10: ffff88806a367a00 R11: ffff88806a367b00 R12: ffff88807b006258
[] R13: ffff88807b00628a R14: ffff888069334d00 R15: ffff88806a434600
[] FS: 0000000000000000(0000) GS:ffff888079480000(0000) knlGS:0[...]
[] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[] CR2: 0000000000000000 CR3: 0000000077320000 CR4: 00000000000006e0
[] Call Trace:
[] ? tipc_bcast_rcv+0x9a/0x1a0 [tipc]
[] tipc_rcv+0x40d/0x670 [tipc]
[] ? _raw_spin_unlock+0xa/0x20
[] tipc_l2_rcv_msg+0x55/0x80 [tipc]
[] __netif_receive_skb_one_core+0x8c/0xa0
[] process_backlog+0x98/0x140
[] net_rx_action+0x13a/0x420
[] __do_softirq+0xdb/0x316
[] ? smpboot_thread_fn+0x2f/0x1e0
[] ? smpboot_thread_fn+0x74/0x1e0
[] ? smpboot_thread_fn+0x14e/0x1e0
[] run_ksoftirqd+0x1a/0x40
[] smpboot_thread_fn+0x149/0x1e0
[] ? sort_range+0x20/0x20
[] kthread+0x131/0x150
[] ? kthread_unuse_mm+0xa0/0xa0
[] ret_from_fork+0x22/0x30
[] Modules linked in: veth tipc(O) ip6_udp_tunnel udp_tunnel [...]
[] CR2: 0000000000000000
[] ---[ end trace 65c276a8e2e2f310 ]---
To fix this, we need to grab the lock of the 'namedq' list on both
path calling.
Fixes: cad2929dc432 ("tipc: update a binding service via broadcast")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
skb_unshare() drops a reference count on the old skb unconditionally,
so in the failure case, we end up freeing the skb twice here.
And because the skb is allocated in fclone and cloned by caller
tipc_msg_reassemble(), the consequence is actually freeing the
original skb too, thus triggered the UAF by syzbot.
Fix this by replacing this skb_unshare() with skb_cloned()+skb_copy().
Fixes: ff48b6222e65 ("tipc: use skb_unshare() instead in tipc_buf_append()")
Reported-and-tested-by: syzbot+e96a7ba46281824cc46a@syzkaller.appspotmail.com
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Karsten Graul says:
====================
net/smc: updates 2020-10-07
Patch 1 and 2 address warnings from static code checkers, and patch 3
handles a case when all proposed ISM V2 devices fail to init and no V1
devices are tried afterwards.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Field ini->smcd_version is set to SMC_V2 before calling
smc_listen_ism_init(). This clears the V1 bit that may be set. When all
matching ISM V2 devices fail to initialize then the smcd_version field
needs to get restored to allow any possible V1 devices to initialize.
And be consistent, always go to the not_found label when no device was
found.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
coccinelle informs about
net/smc/af_smc.c:1770:10-11: WARNING: opportunity for kzfree/kvfree_sensitive
Its not that kzfree() would help here, the memset() is done to prepare
the buffer for another socket receive.
Fix that warning message by reordering the calls, while at it eliminate
the unneeded variable cclc2 and use sizeof(*buf) as above in the same
function. No functional changes.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Static code checkers warn of inconsistent returns because the lgr mutex
is locked in one function and unlocked in a function called by the
locking function:
net/smc/af_smc.c:823 smc_connect_rdma() warn: inconsistent returns 'smc_client_lgr_pending'.
net/smc/af_smc.c:897 smc_connect_ism() warn: inconsistent returns 'smc_server_lgr_pending'.
Make the code consistent by doing the unlock in the same function that
fetches the lock. No functional changes.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
linux-can-next-for-5.10-20201007
The first 3 patches are by me and fix several warnings found
when compiling the kernel with W=1.
Lukas Bulwahn's patch adjusts the MAINTAINERS file, to accommodate
the renaming of the mcp251xfd driver.
Vincent Mailhol contributes 3 patches for the CAN networking layer.
First error queue support is added the the CAN RAW protocol.
The second patch converts the get_can_dlc() and get_canfd_dlc()
in-Kernel-only macros from using __u8 to u8.
The third patch adds a helper function to calculate the length of
one bit in in multiple of time quanta.
Oliver Hartkopp's patch add support for the ISO 15765-2:2016
transport protocol to the CAN stack.
Three patches by Lad Prabhakar add documentation for various
new rcar controllers to the device tree bindings of the rcar_can
and rcan_canfd driver.
Michael Walle's patch adds various processors to the flexcan
driver binding documentation.
The next two patches are by me and target the flexcan driver aswell.
The remove the ack_grp and ack_bit from the fsl,stop-mode DT property
and the driver, as they are not used anymore. As these are the last
two arguments this change will not break existing device trees.
The last three patches are by Srinivas Neeli and target
the xilinx_can driver.
The first one increases the lower limit for the bit rate
prescaler to 2, the other two fix sparse and coverity findings.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|