Age | Commit message (Collapse) | Author |
|
If we move all cached dst into the exception table under the main route,
current rt6_clean_tohost() will no longer be able to access them.
This commit makes fib6_clean_tohost() to also go through all cached
routes in exception table and removes cached gateway routes to the
passed in gateway.
This is a preparation in order to move all cached routes into the
exception table.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If we move all cached dst into the exception table under the main route,
current rt6_mtu_change() will no longer be able to access them.
This commit makes rt6_mtu_change_route() function to also go through all
cached routes in the exception table under the main route and do proper
updates on the mtu.
This is a preparation in order to move all cached routes into the
exception table.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After we move cached dst entries into the exception table under its
parent route, current fib6_remove_prefsrc() no longer can access them.
This commit makes fib6_remove_prefsrc() also go through all routes
in the exception table to remove the pref src.
This is a preparation patch in order to move all cached dst into the
exception table.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a hash table into struct rt6_info in order to store dst caches
created by pmtu discovery and ip redirect in ipv6 routing code.
APIs to add dst cache, delete dst cache, find dst cache and update
dst cache in the hash table are implemented and will be used in later
commits.
This is a preparation work to move all cache routes into the exception
table instead of getting inserted into the fib6 tree.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This function takes a route as input and tries to update the sernum in
the fib6_node this route is associated with. It will be used in later
commit when adding a cached route into the exception table under that
route.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using a linear list to store all skbs in write queue has been okay
for quite a while : O(N) is not too bad when N < 500.
Things get messy when N is the order of 100,000 : Modern TCP stacks
want 10Gbit+ of throughput even with 200 ms RTT flows.
40 ns per cache line miss means a full scan can use 4 ms,
blowing away CPU caches.
SACK processing often can use various hints to avoid parsing
whole retransmit queue. But with high packet losses and/or high
reordering, hints no longer work.
Sender has to process thousands of unfriendly SACK, accumulating
a huge socket backlog, burning a cpu and massively dropping packets.
Using an rb-tree for retransmit queue has been avoided for years
because it added complexity and overhead, but now is the time
to be more resistant and say no to quadratic behavior.
1) RTX queue is no longer part of the write queue : already sent skbs
are stored in one rb-tree.
2) Since reaching the head of write queue no longer needs
sk->sk_send_head, we added an union of sk_send_head and tcp_rtx_queue
Tested:
On receiver :
netem on ingress : delay 150ms 200us loss 1
GRO disabled to force stress and SACK storms.
for f in `seq 1 10`
do
./netperf -H lpaa6 -l30 -- -K bbr -o THROUGHPUT|tail -1
done | awk '{print $0} {sum += $0} END {printf "%7u\n",sum}'
Before patch :
323.87
351.48
339.59
338.62
306.72
204.07
304.93
291.88
202.47
176.88
2840
After patch:
1700.83
2207.98
2070.17
1544.26
2114.76
2124.89
1693.14
1080.91
2216.82
1299.94
18053
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No need to recompute previous skb, as it will be a bit more
expensive when rtx queue is converted to RB tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With upcoming rb-tree implementation, the checks will trigger
more often, and this is expected.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It will be a bit more expensive to get the head of rtx queue
once rtx queue is converted to an rb-tree.
We can avoid this extra cost in case tp->lost_skb_hint is set.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp_write_queue_tail() call can be factorized.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since the upcoming rtx rbtree will add some extra code,
it is time to not inline this fat function anymore.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Geeralize private netem_rb_to_skb()
TCP rtx queue will soon be converted to rb-tree,
so we will need skb_rbtree_walk() helpers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
CC net/bluetooth/selftest.o
net/bluetooth/selftest.c: In function ‘bt_selftest_init’:
net/bluetooth/selftest.c:246:3: warning: ‘duration’ may be used uninitialized in this function [-Wmaybe-uninitialized]
snprintf(test_ecdh_buffer, sizeof(test_ecdh_buffer),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"PASS (%llu usecs)\n", duration);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/bluetooth/selftest.c:203:21: note: ‘duration’ was declared here
unsigned long long duration;
^~~~~~~~
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
|
|
That Bluetooth SMP knows about the private key is pointless, since the
detection of debug key usage is actually via the public key portion.
With this patch, the Bluetooth SMP will stop keeping a copy of the
ecdh private key and will let the crypto subsystem to generate and
handle the ecdh private key, potentially benefiting of hardware
ecc private key generation and retention.
The loop that tries to generate a correct private key is now removed and
we trust the crypto subsystem to generate a correct private key. This
backup logic should be done in crypto, if really needed.
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
tmp buffer contains the swapped private key. In case the setkey call
failed, the tmp buffer was freed without clearing the private key.
Zeroize the temporary buffer so we don't leak the private key.
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
ecdh_helper functions were hiding the error codes and chose to return
the return value of an relational operator, "==". Remove the unnecessary
query and reveal the error codes.
While updating the return values, code in a way that compilers will
warn in case of uninitialized err.
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Before this change, a new crypto tfm was allocated, each time,
for both key generation and shared secret computation.
Allocate a single tfm for both cases.
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- Cleanup patches to make checkpatch happy, by Sven Eckelmann (3 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit cc71b7b07119 ("net/ipv6: remove unused err variable on
icmpv6_push_pending_frames") exposed icmpv6_push_pending_frames
return value not being used.
Remove now unnecessary int err declarations and uses.
Miscellanea:
o Remove unnecessary goto and out: labels
o Realign arguments
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
xt_copy_counters_from_user
syzkaller reports an out of bound read in strlcpy(), triggered
by xt_copy_counters_from_user()
Fix this by using memcpy(), then forcing a zero byte at the last position
of the destination, as Florian did for the non COMPAT code.
Fixes: d7591f0c41ce ("netfilter: x_tables: introduce and use xt_copy_counters_from_user")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Chain counters are only enabled on demand since 9f08ea848117, skip them
when dumping them via netlink.
Fixes: 9f08ea848117 ("netfilter: nf_tables: keep chain counters away from hot path")
Reported-by: Johny Mattsson <johny.mattsson+kernel@gmail.com>
Tested-by: Johny Mattsson <johny.mattsson+kernel@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This requires adding a pointer back
to the sta_info since container_of() can't resolve the sta_info.
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Merging this brings in the timer_setup() change, which allows
me to apply Kees's mac80211 changes for it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
int err is unused by icmpv6_push_pending_frames(), this patch returns removes the variable and returns the function with 0.
git bisect shows this variable has been around since linux has been in git in commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2.
This was found by running make coccicheck M=net/ipv6/ on linus' tree on commit 77ede3a014a32746002f7889211f0cecf4803163 (current HEAD as of this patch).
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Storing the left length of skb into 'len' actually has no effect
so we can remove it.
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Refactor the RACK loop to improve readability and speed up the checks.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the new time-ordered list to speed up RACK. The detection
logic is identical. But since the list is chronologically ordered
by skb_mstamp and contains only skbs not yet acked or sacked,
RACK can abort the loop upon hitting skbs that were sent more
recently. On YouTube servers this patch reduces the iterations on
write queue by 40x. The improvement is even bigger with large
BDP networks.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a new queue (list) that tracks the sent but not yet
acked or SACKed skbs for a TCP connection. The list is chronologically
ordered by skb->skb_mstamp (the head is the oldest sent skb).
This list will be used to optimize TCP Rack recovery, which checks
an skb's timestamp to judge if it has been lost and needs to be
retransmitted. Since TCP write queue is ordered by sequence instead
of sent time, RACK has to scan over the write queue to catch all
eligible packets to detect lost retransmission, and iterates through
SACKed skbs repeatedly.
Special cares for rare events:
1. TCP repair fakes skb transmission so the send queue needs adjusted
2. SACK reneging would require re-inserting SACKed skbs into the
send queue. For now I believe it's not worth the complexity to
make RACK work perfectly on SACK reneging, so we do nothing here.
3. Fast Open: currently for non-TFO, send-queue correctly queues
the pure SYN packet. For TFO which queues a pure SYN and
then a data packet, send-queue only queues the data packet but
not the pure SYN due to the structure of TFO code. This is okay
because the SYN receiver would never respond with a SACK on a
missing SYN (i.e. SYN is never fast-retransmitted by SACK/RACK).
In order to not grow sk_buff, we use an union for the new list and
_skb_refdst/destructor fields. This is a bit complicated because
we need to make sure _skb_refdst and destructor are properly zeroed
before skb is cloned/copied at transmit, and before being freed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use max_1m_mrs/max_8k_mrs while setting max_items, as the former
variables are set based on the underlying device attributes.
Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the scope of has_fr and has_fmr variables as they are
needed only in rds_ib_add_one().
Signed-off-by: Avinash Repaka <avinash.repaka@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
int rc is unmodified after initalization in net/ipv4/route.c, this patch simply cleans up that variable and returns 0.
This was found with coccicheck M=net/ipv4/ on linus' tree.
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit does a cleanup and moves tcp_rearm_rto() call in the TFO
server case into a previous spot in tcp_rcv_state_process() to make
it more compact.
This is only a cosmetic change.
Suggested-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently in the TCP code, the initialization sequence for cached
metrics, congestion control, BPF, etc, after successful connection
is very inconsistent. This introduces inconsistent bevhavior and is
prone to bugs. The current call sequence is as follows:
(1) for active case (tcp_finish_connect() case):
tcp_mtup_init(sk);
icsk->icsk_af_ops->rebuild_header(sk);
tcp_init_metrics(sk);
tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB);
tcp_init_congestion_control(sk);
tcp_init_buffer_space(sk);
(2) for passive case (tcp_rcv_state_process() TCP_SYN_RECV case):
icsk->icsk_af_ops->rebuild_header(sk);
tcp_call_bpf(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
tcp_init_congestion_control(sk);
tcp_mtup_init(sk);
tcp_init_buffer_space(sk);
tcp_init_metrics(sk);
(3) for TFO passive case (tcp_fastopen_create_child()):
inet_csk(child)->icsk_af_ops->rebuild_header(child);
tcp_init_congestion_control(child);
tcp_mtup_init(child);
tcp_init_metrics(child);
tcp_call_bpf(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
tcp_init_buffer_space(child);
This commit uniforms the above functions to have the following sequence:
tcp_mtup_init(sk);
icsk->icsk_af_ops->rebuild_header(sk);
tcp_init_metrics(sk);
tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE/PASSIVE_ESTABLISHED_CB);
tcp_init_congestion_control(sk);
tcp_init_buffer_space(sk);
This sequence is the same as the (1) active case. We pick this sequence
because this order correctly allows BPF to override the settings
including congestion control module and initial cwnd, etc from
the route, and then allows the CC module to see those settings.
Suggested-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds the sock_diag interface for querying sockets from
userspace. Tools like ss(8) and netstat(8) can use this interface to
list open sockets.
The userspace ABI is defined in <linux/vm_sockets_diag.h> and includes
netlink request and response structs. The request can query sockets
based on their sk_state (e.g. listening sockets only) and the response
contains socket information fields including the local/remote addresses,
inode number, etc.
This patch does not dump VMCI pending sockets because I have only tested
the virtio transport, which does not use pending sockets. Support can
be added later by extending vsock_diag_dump() if needed by VMCI users.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are two state fields: socket->state and sock->sk_state. The
socket->state field uses SS_UNCONNECTED, SS_CONNECTED, etc while the
sock->sk_state typically uses values that match TCP state constants
(TCP_CLOSE, TCP_ESTABLISHED). AF_VSOCK does not follow this convention
and instead uses SS_* constants for both fields.
The sk_state field will be exposed to userspace through the vsock_diag
interface for ss(8), netstat(8), and other programs.
This patch switches sk_state to TCP state constants so that the meaning
of this field is consistent with other address families. Not just
AF_INET and AF_INET6 use the TCP constants, AF_UNIX and others do too.
The following mapping was used to convert the code:
SS_FREE -> TCP_CLOSE
SS_UNCONNECTED -> TCP_CLOSE
SS_CONNECTING -> TCP_SYN_SENT
SS_CONNECTED -> TCP_ESTABLISHED
SS_DISCONNECTING -> TCP_CLOSING
VSOCK_SS_LISTEN -> TCP_LISTEN
In __vsock_create() the sk_state initialization was dropped because
sock_init_data() already initializes sk_state to TCP_CLOSE.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The vsock_diag.ko module will need to check socket table membership.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The socket table symbols need to be exported from vsock.ko so that the
vsock_diag.ko module will be able to traverse sockets.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Just simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Check iwlwifi 9000 reorder buffer out-of-space condition properly,
from Sara Sharon.
2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov
Kasiviswanathan.
3) Fix session and tunnel release races in l2tp, from Guillaume Nault
and Sabrina Dubroca.
4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter.
5) Several mlx5 driver fixes from the Mellanox folks (max flow counters
cap check, invalid memory access in IPoIB support, etc.)
6) tun_get_user() should bail if skb->len is zero, from Alexander
Potapenko.
7) Fix RCU lookups in inetpeer, from Eric Dumazet.
8) Fix locking in packet_do_bund().
9) Handle cb->start() error properly in netlink dump code, from Jason
A. Donenfeld.
10) Handle multicast properly in UDP socket early demux code. From Paolo
Abeni.
11) Several erspan bug fixes in ip_gre, from Xin Long.
12) Fix use-after-free in socket filter code, in order to handle the
fact that listener lock is no longer taken during the three-way TCP
handshake. From Eric Dumazet.
13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov.
14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
net: 8021q: skip packets if the vlan is down
bpf: fix bpf_tail_call() x64 JIT
net: stmmac: dwmac-rk: Add RK3128 GMAC support
rndis_host: support Novatel Verizon USB730L
net: rtnetlink: fix info leak in RTM_GETSTATS call
socket, bpf: fix possible use after free
mlxsw: spectrum_router: Track RIF of IPIP next hops
mlxsw: spectrum_router: Move VRF refcounting
net: hns3: Fix an error handling path in 'hclge_rss_init_hw()'
net: mvpp2: Fix clock resource by adding an optional bus clock
r8152: add Linksys USB3GIGV1 id
l2tp: fix l2tp_eth module loading
ip_gre: erspan device should keep dst
ip_gre: set tunnel hlen properly in erspan_tunnel_init
ip_gre: check packet length and mtu correctly in erspan_xmit
ip_gre: get key from session_id correctly in erspan_rcv
tipc: use only positive error codes in messages
ppp: fix __percpu annotation
udp: perform source validation for mcast early demux
IPv4: early demux can return an error code
...
|
|
Pass extack arg to br_add_if. Add messages for a couple of failures
and pass arg to netdev_master_upper_dev_link.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass extack to do_set_master and down to ndo_add_slave
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add netlink_ext_ack to netdev_notifier_info to allow notifier
handlers to return errors to userspace.
Clean up the initialization in dev.c such that extack is easily
added in subsequent patches where relevant. Specifically, remove
the init call in call_netdevice_notifiers_info and have callers
initalize on stack when info is declared.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the vlan is down, free the packet instead of proceeding with other
processing, or counting it as received. If vlan interfaces are used
as slaves for bonding, with arp monitoring for connectivity, if the rx
counter is seen to be incrementing, then the bond device will not
observe that the interface is down.
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Vishakha Narvekar <Vishakha.Narvekar@dell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
x-netns interfaces are bound to two netns: the link netns and the upper
netns. Usually, this kind of interfaces is created in the link netns and
then moved to the upper netns. At the end, the interface is visible only
in the upper netns. The link nsid is advertised via netlink in the upper
netns, thus the user always knows where is the link part.
There is no such mechanism in the link netns. When the interface is moved
to another netns, the user cannot "follow" it.
This patch adds a new netlink attribute which helps to follow an interface
which moves to another netns. When the interface is unregistered, the new
nsid is advertised. If the interface is a x-netns interface (ie
rtnl_link_ops->get_link_net is defined), the nsid is allocated if needed.
CC: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Compiler does not really know that skb_shinfo(to|from) are constants
in skb_try_coalesce(), lets cache their values to shrink code.
We might even take care of skb_zcopy() calls later.
$ size net/core/skbuff.o.before net/core/skbuff.o
text data bss dec hex filename
40727 1298 0 42025 a429 net/core/skbuff.o.before
40631 1298 0 41929 a3c9 net/core/skbuff.o
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
switch the only caller to rtnl_af_unregister.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
no users in the tree.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
net/core/dev.c:1306: warning: No description found for parameter 'name'
net/core/dev.c:1306: warning: Excess function parameter 'alias' description in 'dev_get_alias'
Fixes: 6c5570016b97 ("net: core: decouple ifalias get/set from rtnl lock")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|