Age | Commit message (Collapse) | Author |
|
Now that smp->remote_key_dist is tracking the keys we're still waiting
for we can use it to simplify the logic for checking whether we're done
with key distribution or not. At the same time the reliance on the
"force" parameter of smp_distribute_keys goes away and it can completely
be removed in a subsequent patch.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
To make is easier to track which keys we've received and which ones
we're still waiting for simply clear the corresponding key bits from
smp->remote_key_dist as they get received. This will allow us to
simplify the code for checking for SMP completion in subsequent patches.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
This option has the same semantic as IP_PMTUDISC_OMIT for IPv4 which
got recently introduced. It doesn't honor the path mtu discovered by the
host but in contrary to IPV6_PMTUDISC_INTERFACE allows the generation of
fragments if the packet size exceeds the MTU of the outgoing interface
MTU.
Fixes: 93b36cf3425b9b ("ipv6: support IPV6_PMTU_INTERFACE on sockets")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IP_PMTUDISC_INTERFACE has a design error: because it does not allow the
generation of fragments if the interface mtu is exceeded, it is very
hard to make use of this option in already deployed name server software
for which I introduced this option.
This patch adds yet another new IP_MTU_DISCOVER option to not honor any
path mtu information and not accepting new icmp notifications destined for
the socket this option is enabled on. But we allow outgoing fragmentation
in case the packet size exceeds the outgoing interface mtu.
As such this new option can be used as a drop-in replacement for
IP_PMTUDISC_DONT, which is currently in use by most name server software
making the adoption of this option very smooth and easy.
The original advantage of IP_PMTUDISC_INTERFACE is still maintained:
ignoring incoming path MTU updates and not honoring discovered path MTUs
in the output path.
Fixes: 482fc6094afad5 ("ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE")
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ip_skb_dst_mtu mostly falls back to ip_dst_mtu_maybe_forward if no socket
is attached to the skb (in case of forwarding) or determines the mtu like
we do in ip_finish_output, which actually checks if we should branch to
ip_fragment. Thus use the same function to determine the mtu here, too.
This is important for the introduction of IP_PMTUDISC_OMIT, where we
want the packets getting cut in pieces of the size of the outgoing
interface mtu. IPv6 already does this correctly.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
iproute2 arpd seems to expect this as there's code and comments
to handle netlink probes with NUD_PROBE set. It is used to flush
the arpd cached mappings.
opennhrp instead turns off unicast probes (so it can handle all
neighbour discovery). Without this change it will not see NUD_PROBE
probes and cannot reconfirm the mapping. Thus currently neigh entry
will just fail and can cause few packets dropped until broadcast
discovery is restarted.
Earlier discussion on the subject:
http://marc.info/?t=139305877100001&r=1&w=2
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These info messages are rather pointless without any means to identify
the source of the bogus packets. Logging the src and dst addresses and
ports may help a bit.
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a sysfs file to enable user space to query the device
port number used by a netdevice instance. This is needed for
devices that have multiple ports on the same PCI function.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Three counters are added:
- one to track when we went from non-zero to zero window
- one to track the reverse
- one counter incremented when we want to announce zero window,
but can't because we would shrink current window.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES can only be incremented
in tcp_transmit_skb() from softirq (incoming message or timer
activation), it is better to use NET_INC_STATS() instead of
NET_INC_STATS_BH() as tcp_transmit_skb() can be called from process
context.
This will avoid copy/paste confusion when/if we want to add
other SNMP counters in tcp_transmit_skb()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Set the RxRPC header flag to request an ACK packet for every odd-numbered DATA
packet unless it's the last one (which implicitly requests an ACK anyway).
This is similar to how librx appears to work.
If we don't do this, we'll send out a full window of packets and then just sit
there until the other side gets bored and sends an ACK to indicate that it's
been idle for a while and has received no new packets.
Requesting a lot of ACKs shouldn't be a problem as ACKs should be merged when
possible.
As AF_RXRPC currently works, it will schedule an ACK to be generated upon
receipt of a DATA packet with the ACK-request packet set - and in the time
taken to schedule this in a work queue, several other packets are likely to
arrive and then all get ACK'd together.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Expose RxRPC parameters via sysctls to control the Rx window size, the Rx MTU
maximum size and the number of packets that can be glued into a jumbo packet.
More info added to Documentation/networking/rxrpc.txt.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Improve ACK production by the following means:
(1) Don't send an ACK_REQUESTED ack immediately even if the RXRPC_MORE_PACKETS
flag isn't set on a data packet that has also has RXRPC_REQUEST_ACK set.
MORE_PACKETS just means that the sender just emptied its Tx data buffer.
More data will be forthcoming unless RXRPC_LAST_PACKET is also flagged.
It is possible to see runs of DATA packets with MORE_PACKETS unset that
aren't waiting for an ACK.
It is therefore better to wait a small instant to see if we can combine an
ACK for several packets.
(2) Don't send an ACK_IDLE ack immediately unless we're responding to the
terminal data packet of a call.
Whilst sending an ACK_IDLE mid-call serves to let the other side know
that we won't be asking it to resend certain Tx buffers and that it can
discard them, spamming it with loads of acks just because we've
temporarily run out of data just distracts it.
(3) Put the ACK_IDLE ack generation timeout up to half a second rather than a
single jiffy. Just because we haven't been given more data immediately
doesn't mean that more isn't forthcoming. The other side may be busily
finding the data to send to us.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add sysctls for configuring RxRPC protocol handling, specifically controls on
delays before ack generation, the delay before resending a packet, the maximum
lifetime of a call and the expiration times of calls, connections and
transports that haven't been recently used.
More info added in Documentation/networking/rxrpc.txt.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
AF_RXRPC sends UDP packets with the "Don't Fragment" bit set in an attempt to
determine the maximum packet size between the local socket and the peer by
invoking the generation of ICMP_FRAG_NEEDED packets.
Once a packet is sent with the "Don't Fragment" bit set, it is then
inconvenient to break it up as that requires recalculating all the rxrpc serial
and sequence numbers and reencrypting all the fragments, so we switch off the
"Don't Fragment" service temporarily and send the bounced packet again. Future
packets then use the new MTU.
That's all fine. The problem lies in rxrpc_UDP_error_report() where the code
that deals with ICMP_FRAG_NEEDED packets lives. Packets of this type have a
field (ee_info) to indicate the maximum packet size at the reporting node - but
sometimes ee_info isn't filled in and is just left as 0 and the code must allow
for this.
When ee_info is 0, the code should take the MTU size we're currently using and
reduce it for the next packet we want to send. However, it takes ee_info
(which is known to be 0) and tries to reduce that instead.
This was discovered by Coverity.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
When a policy is unlinked from the lists in thread context,
the xfrm timer can fire before we can mark this policy as dead.
So reinitialize the bydst hlist, then hlist_unhashed() will
notice that this policy is not linked and will avoid a
doulble unlink of that policy.
Reported-by: Xianpeng Zhao <673321875@qq.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
netlink_sendmsg() was changed to prevent non-root processes from sending
messages with dst_pid != 0.
netlink_connect() however still only checks if nladdr->nl_groups is set.
This patch modifies netlink_connect() to check for the same condition.
Signed-off-by: Mike Pecovnik <mike.pecovnik@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the UFO fragmentation process does not correctly handle inner
UDP frames.
(The following tcpdumps are captured on the parent interface with ufo
disabled while tunnel has ufo enabled, 2000 bytes payload, mtu 1280,
both sit device):
IPv6:
16:39:10.031613 IP (tos 0x0, ttl 64, id 3208, offset 0, flags [DF], proto IPv6 (41), length 1300)
192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 1240) 2001::1 > 2001::8: frag (0x00000001:0|1232) 44883 > distinct: UDP, length 2000
16:39:10.031709 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPv6 (41), length 844)
192.168.122.151 > 1.1.1.1: IP6 (hlim 64, next-header Fragment (44) payload length: 784) 2001::1 > 2001::8: frag (0x00000001:0|776) 58979 > 46366: UDP, length 5471
We can see that fragmentation header offset is not correctly updated.
(fragmentation id handling is corrected by 916e4cf46d0204 ("ipv6: reuse
ip6_frag_id from ip6_ufo_append_data")).
IPv4:
16:39:57.737761 IP (tos 0x0, ttl 64, id 3209, offset 0, flags [DF], proto IPIP (4), length 1296)
192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57034, offset 0, flags [none], proto UDP (17), length 1276)
192.168.99.1.35961 > 192.168.99.2.distinct: UDP, length 2000
16:39:57.738028 IP (tos 0x0, ttl 64, id 3210, offset 0, flags [DF], proto IPIP (4), length 792)
192.168.122.151 > 1.1.1.1: IP (tos 0x0, ttl 64, id 57035, offset 0, flags [none], proto UDP (17), length 772)
192.168.99.1.13531 > 192.168.99.2.20653: UDP, length 51109
In this case fragmentation id is incremented and offset is not updated.
First, I aligned inet_gso_segment and ipv6_gso_segment:
* align naming of flags
* ipv6_gso_segment: setting skb->encapsulation is unnecessary, as we
always ensure that the state of this flag is left untouched when
returning from upper gso segmenation function
* ipv6_gso_segment: move skb_reset_inner_headers below updating the
fragmentation header data, we don't care for updating fragmentation
header data
* remove currently unneeded comment indicating skb->encapsulation might
get changed by upper gso_segment callback (gre and udp-tunnel reset
encapsulation after segmentation on each fragment)
If we encounter an IPIP or SIT gso skb we now check for the protocol ==
IPPROTO_UDP and that we at least have already traversed another ip(6)
protocol header.
The reason why we have to special case GSO_IPIP and GSO_SIT is that
we reset skb->encapsulation to 0 while skb_mac_gso_segment the inner
protocol of GSO_UDP_TUNNEL or GSO_GRE packets.
Reported-by: Wolfgang Walter <linux@stwm.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Core Specification (4.1) leaves room for sending an SMP Identity
Address Information PDU with an all-zeros BD_ADDR value. This
essentially means that we would not have an Identity Address for the
device and the only means of identifying it would be the IRK value
itself.
Due to lack of any known implementations behaving like this it's best to
keep our implementation as simple as possible as far as handling such
situations is concerned. This patch updates the Identity Address
Information handler function to simply ignore the IRK received from such
a device.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
When the connectable setting is toggled using mgmt_set_connectable the
HCI_CONNECTABLE flag will only be set once the related HCI commands
succeed. When determining what kind of advertising to do we need to
therefore also check whether there is a pending Set Connectable command
in addition to the current flag value.
The enable_advertising function was already taking care of this for the
advertising type with the help of the get_adv_type function, but was
failing to do the same for the address type selection. This patch
converts the get_adv_type function to be more generic in that it returns
the expected connectable state and updates the enable_advertising
function to use the return value both for the advertising type as well
as the advertising address type.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
When trying to allocate skb for new PDU, l2cap_chan is unlocked so we
can sleep waiting for memory as otherwise there's possible deadlock as
fixed in e454c84464. However, in a6a5568c03 lock was moved from socket
to channel level and it's no longer safe to just unlock and lock again
without checking l2cap_chan state since channel can be disconnected
when lock is not held.
This patch adds missing checks for l2cap_chan state when returning from
call which allocates skb.
Scenario is easily reproducible by running rfcomm-tester in a loop.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
PGD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 7 PID: 4038 Comm: krfcommd Not tainted 3.14.0-rc2+ #15
Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A10 11/24/2011
task: ffff8802bdd731c0 ti: ffff8801ec986000 task.ti: ffff8801ec986000
RIP: 0010:[<ffffffffa0442169>] [<ffffffffa0442169>] l2cap_do_send+0x29/0x120
RSP: 0018:ffff8801ec987ad8 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8800c5796800 RCX: 0000000000000000
RDX: ffff880410e7a800 RSI: ffff8802b6c1da00 RDI: ffff8800c5796800
RBP: ffff8801ec987af8 R08: 00000000000000c0 R09: 0000000000000300
R10: 000000000000573b R11: 000000000000573a R12: ffff8802b6c1da00
R13: 0000000000000000 R14: ffff8802b6c1da00 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000041257c000 CR4: 00000000000407e0
Stack:
ffff8801ec987d78 ffff8800c5796800 ffff8801ec987d78 0000000000000000
ffff8801ec987ba8 ffffffffa0449e37 0000000000000004 ffff8801ec987af0
ffff8801ec987d40 0000000000000282 0000000000000000 ffffffff00000004
Call Trace:
[<ffffffffa0449e37>] l2cap_chan_send+0xaa7/0x1120 [bluetooth]
[<ffffffff81770100>] ? _raw_spin_unlock_bh+0x20/0x40
[<ffffffffa045188b>] l2cap_sock_sendmsg+0xcb/0x110 [bluetooth]
[<ffffffff81652b0f>] sock_sendmsg+0xaf/0xc0
[<ffffffff810a8381>] ? update_curr+0x141/0x200
[<ffffffff810a8961>] ? dequeue_entity+0x181/0x520
[<ffffffff81652b60>] kernel_sendmsg+0x40/0x60
[<ffffffffa04a8505>] rfcomm_send_frame+0x45/0x70 [rfcomm]
[<ffffffff810766f0>] ? internal_add_timer+0x20/0x50
[<ffffffffa04a8564>] rfcomm_send_cmd+0x34/0x60 [rfcomm]
[<ffffffffa04a8605>] rfcomm_send_disc+0x75/0xa0 [rfcomm]
[<ffffffffa04aacec>] rfcomm_run+0x8cc/0x1a30 [rfcomm]
[<ffffffffa04aa420>] ? rfcomm_check_accept+0xc0/0xc0 [rfcomm]
[<ffffffff8108e3a9>] kthread+0xc9/0xe0
[<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
[<ffffffff817795fc>] ret_from_fork+0x7c/0xb0
[<ffffffff8108e2e0>] ? flush_kthread_worker+0xb0/0xb0
Code: 00 00 66 66 66 66 90 55 48 89 e5 48 83 ec 20 f6 05 d6 a3 02 00 04
RIP [<ffffffffa0442169>] l2cap_do_send+0x29/0x120 [bluetooth]
RSP <ffff8801ec987ad8>
CR2: 0000000000000000
Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Commit "nl80211: send event when AP operation is stopped" added an
event to notify user space that an AP interface has been stopped, to
handle cases such as suspend etc. The event is sent regardless
if the stop AP flow was triggered by user space or due to internal state
change.
This might cause issues with wpa_supplicant/hostapd flows that consider
stop AP flow as a synchronous one, e.g., AP/GO channel change in the
absence of CSA support. In such cases, the flow will restart the AP
immediately after the stop AP flow is done, and only handle the stop
AP event after the current flow is done, and as a result stop the AP
again.
Change the current implementation to only send the event in case the
stop AP was triggered due to an internal reason.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Send Channel Availability Check time as a parameter
of start_radar_detection() callback.
Get CAC time from regulatory database.
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Introduce DFS CAC time as a regd param, configured per REG_RULE and
set per channel in cfg80211. DFS CAC time is close connected with
regulatory database configuration. Instead of using hardcoded values,
get DFS CAC time form regulatory database. Pass DFS CAC time to user
mode (mainly for iw reg get, iw list, iw info). Allow setting DFS CAC
time via CRDA. Add support for internal regulatory database.
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
[rewrap commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Allow to set world regulatory domain in case of user
request (iw reg set 00).
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Reset regdomain to world regdomain in case
of errors in set_regdom() function.
This will fix a problem with such scenario:
- iw reg set US
- iw reg set 00
- iw reg set US
The last step always fail and we get deadlock
in kernel regulatory code. Next setting new
regulatory wasn't possible due to:
Pending regulatory request, waiting for it to be processed...
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Acked-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
There's no need for the struct device_type with the uevent function
etc., just fill the country alpha2 when sending the event.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Allow userspace to specify the queue number or the errno code for QUEUE
and DROP verdicts.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add a lockdep_nfnl_is_held() function and a nfnl_dereference() macro for
RCU dereferences protected by a NFNL subsystem mutex.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The next patch will introduce a nfnl_dereference() macro that actually
checks that the appropriate mutex is held and therefore needs a
subsystem argument.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
vti4 is now fully namespace aware, so allow namespace changing
for vti devices
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The tunnel endpoints of the xfrm_state we got from the xfrm_lookup
must match the tunnel endpoints of the vti interface. This patch
ensures this matching.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
With this patch we can tunnel ipv6 traffic via a vti4
interface. A vti4 interface can now have an ipv6 address
and ipv6 traffic can be routed via a vti4 interface.
The resulting traffic is xfrm transformed and tunneled
throuhg ipv4 if matching IPsec policies and states are
present.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
We need to be protocol family indepenent to support
inter addresss family tunneling with vti. So use a
dst_entry instead of the ipv4 rtable in vti_tunnel_xmit.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This was used from vti and is replaced by the IPsec protocol
multiplexer hooks. It is now unused, so remove it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
With this patch, vti uses the IPsec protocol multiplexer to
register it's own receive side hooks for ESP, AH and IPCOMP.
Vti now does the following on receive side:
1. Do an input policy check for the IPsec packet we received.
This is required because this packet could be already
prosecces by IPsec, so an inbuond policy check is needed.
2. Mark the packet with the i_key. The policy and the state
must match this key now. Policy and state belong to the outer
namespace and policy enforcement is done at the further layers.
3. Call the generic xfrm layer to do decryption and decapsulation.
4. Wait for a callback from the xfrm layer to properly clean the
skb to not leak informations on namespace and to update the
device statistics.
On transmit side:
1. Mark the packet with the o_key. The policy and the state
must match this key now.
2. Do a xfrm_lookup on the original packet with the mark applied.
3. Check if we got an IPsec route.
4. Clean the skb to not leak informations on namespace
transitions.
5. Attach the dst_enty we got from the xfrm_lookup to the skb.
6. Call dst_output to do the IPsec processing.
7. Do the device statistics.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Vti uses the o_key to mark packets that were transmitted or received
by a vti interface. Unfortunately we can't apply different marks
to in and outbound packets with only one key availabe. Vti interfaces
typically use wildcard selectors for vti IPsec policies. On forwarding,
the same output policy will match for both directions. This generates
a loop between the IPsec gateways until the ttl of the packet is
exceeded.
The gre i_key/o_key are usually there to find the right gre tunnel
during a lookup. When vti uses the i_key to mark packets, the tunnel
lookup does not work any more because vti does not use the gre keys
as a hash key for the lookup.
This patch workarounds this my not including the i_key when comupting
the hash for the tunnel lookup in case of vti tunnels.
With this we have separate keys available for the transmitting and
receiving side of the vti interface.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
IPsec vti_rcv needs to remind the tunnel pointer to
check it later at the vti_rcv_cb callback. So add
this pointer to the IPsec common buffer, initialize
it and check it to avoid transport state matching of
a tunneled packet.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Switch ipcomp4 to use the new IPsec protocol multiplexer.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Switch ah4 to use the new IPsec protocol multiplexer.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Switch esp4 to use the new IPsec protocol multiplexer.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch add an IPsec protocol multiplexer. With this
it is possible to add alternative protocol handlers as
needed for IPsec virtual tunnel interfaces.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Convert the uses of memcpy to ether_addr_copy because
for some architectures it is smaller and faster.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert the more obvious uses of memcpy to ether_addr_copy.
There are still uses of memcpy that could be converted but
these addresses are __aligned(2).
Convert a couple uses of 6 in gr_private.h to ETH_ALEN.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp_is_cwnd_limited() allows GSO/TSO enabled flows to increase
their cwnd to allow a full size (64KB) TSO packet to be sent.
Non GSO flows only allow an extra room of 3 MSS.
For most flows with a BDP below 10 MSS, this results in a bloat
of cwnd reaching 90, and an inflate of RTT.
Thanks to TSO auto sizing, we can restrict the bloat to the number
of MSS contained in a TSO packet (tp->xmit_size_goal_segs), to keep
original intent without performance impact.
Because we keep cwnd small, it helps to keep TSO packet size to their
optimal value.
Example for a 10Mbit flow, with low TCP Small queue limits (no more than
2 skb in qdisc/device tx ring)
Before patch :
lpk51:~# ./ss -i dst lpk52:44862 | grep cwnd
cubic wscale:6,6 rto:215 rtt:15.875/2.5 mss:1448 cwnd:96
ssthresh:96
send 70.1Mbps unacked:14 rcv_space:29200
After patch :
lpk51:~# ./ss -i dst lpk52:52916 | grep cwnd
cubic wscale:6,6 rto:206 rtt:5.206/0.036 mss:1448 cwnd:15
ssthresh:14
send 33.4Mbps unacked:4 rcv_space:29200
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Van Jacobson <vanj@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The documentation misses a few of the supported flags. Fix this. Also
respect the dependency to CONFIG_XFRM for the IPSEC flag.
Cc: Fan Du <fan.du@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 'out' label is just a relict from previous times as pgctrl_write()
had multiple error paths. Get rid of it and simply return right away
on errors.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a privileged user writes an empty string to /proc/net/pktgen/pgctrl
the code for stripping the (then non-existent) '\n' actually writes the
zero byte at index -1 of data[]. The then still uninitialized array will
very likely fail the command matching tests and the pr_warning() at the
end will therefore leak stack bytes to the kernel log.
Fix those issues by simply ensuring we're passed a non-empty string as
the user API apparently expects a trailing '\n' for all commands.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|