Age | Commit message (Collapse) | Author |
|
syzbot reported a use-after-free in tls_sk_proto_close
Add a boolean value to cleanup a bit this function.
BUG: KASAN: use-after-free in tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
Read of size 1 at addr ffff8801ae40a858 by task syz-executor363/4503
CPU: 0 PID: 4503 Comm: syz-executor363 Not tainted 4.17.0-rc3+ #34
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
sock_release+0x96/0x1b0 net/socket.c:594
sock_close+0x16/0x20 net/socket.c:1149
__fput+0x34d/0x890 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x1e4/0x290 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x1aee/0x2730 kernel/exit.c:865
do_group_exit+0x16f/0x430 kernel/exit.c:968
get_signal+0x886/0x1960 kernel/signal.c:2469
do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4457b9
RSP: 002b:00007fdf4d766da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006dac3c RCX: 00000000004457b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000006dac3c
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac38
R13: 3692738801137283 R14: 6bf92c39443c4c1d R15: 0000000000000006
Allocated by task 4498:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
kmalloc include/linux/slab.h:512 [inline]
kzalloc include/linux/slab.h:701 [inline]
create_ctx net/tls/tls_main.c:521 [inline]
tls_init+0x1f9/0xb00 net/tls/tls_main.c:633
tcp_set_ulp+0x1bc/0x520 net/ipv4/tcp_ulp.c:153
do_tcp_setsockopt.isra.39+0x44a/0x2600 net/ipv4/tcp.c:2588
tcp_setsockopt+0xc1/0xe0 net/ipv4/tcp.c:2893
sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
__sys_setsockopt+0x1bd/0x390 net/socket.c:1903
__do_sys_setsockopt net/socket.c:1914 [inline]
__se_sys_setsockopt net/socket.c:1911 [inline]
__x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 4503:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xd9/0x260 mm/slab.c:3813
tls_sw_free_resources+0x2a3/0x360 net/tls/tls_sw.c:1037
tls_sk_proto_close+0x67c/0x9c0 net/tls/tls_main.c:288
inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
sock_release+0x96/0x1b0 net/socket.c:594
sock_close+0x16/0x20 net/socket.c:1149
__fput+0x34d/0x890 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243
task_work_run+0x1e4/0x290 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x1aee/0x2730 kernel/exit.c:865
do_group_exit+0x16f/0x430 kernel/exit.c:968
get_signal+0x886/0x1960 kernel/signal.c:2469
do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8801ae40a800
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 88 bytes inside of
256-byte region [ffff8801ae40a800, ffff8801ae40a900)
The buggy address belongs to the page:
page:ffffea0006b90280 count:1 mapcount:0 mapping:ffff8801ae40a080 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffff8801ae40a080 0000000000000000 000000010000000c
raw: ffffea0006bea9e0 ffffea0006bc94a0 ffff8801da8007c0 0000000000000000
page dumped because: kasan: bad access detected
Fixes: dd0bed1665d6 ("tls: support for Inline tls record")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Atul Gupta <atul.gupta@chelsio.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Ilya Lesokhin <ilyal@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: Dave Watson <davejwatson@fb.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now sctp only delays the authentication for the normal cookie-echo
chunk by setting chunk->auth_chunk in sctp_endpoint_bh_rcv(). But
for the duplicated one with auth, in sctp_assoc_bh_rcv(), it does
authentication first based on the old asoc, which will definitely
fail due to the different auth info in the old asoc.
The duplicated cookie-echo chunk will create a new asoc with the
auth info from this chunk, and the authentication should also be
done with the new asoc's auth info for all of the collision 'A',
'B' and 'D'. Otherwise, the duplicated cookie-echo chunk with auth
will never pass the authentication and create the new connection.
This issue exists since very beginning, and this fix is to make
sctp_assoc_bh_rcv() follow the way sctp_endpoint_bh_rcv() does
for the normal cookie-echo chunk to delay the authentication.
While at it, remove the unused params from sctp_sf_authenticate()
and define sctp_auth_chunk_verify() used for all the places that
do the delayed authentication.
v1->v2:
fix the typo in changelog as Marcelo noticed.
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The v9fs_get_trans_by_name(char *s) variable name is not "name" but "s".
Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The vlan_flags enum is defined in include/uapi/linux/if_vlan.h file.
not in include/linux/if_vlan.h file.
Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Minor conflict, a CHECK was placed into an if() statement
in net-next, whilst a newline was added to that CHECK
call in 'net'. Thanks to Daniel for the merge resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver will process the RSSI if available and send it to mac80211.
mac80211 will compute the weighted average of ack RSSI for stations.
Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Average ack rssi will be given to userspace via NL80211 interface
if firmware is capable. Userspace tool ‘iw’ can process this
information and give the output as one of the fields in
‘iw dev wlanX station dump’.
Example output :
localhost ~ #iw dev wlan-5000mhz station dump Station
34:f3:9a:aa:3b:29 (on wlan-5000mhz)
inactive time: 5370 ms
rx bytes: 85321
rx packets: 576
tx bytes: 14225
tx packets: 71
tx retries: 0
tx failed: 2
beacon loss: 0
rx drop misc: 0
signal: -54 dBm
signal avg: -53 dBm
tx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
rx bitrate: 866.7 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 2
avg ack signal: -56 dBm
authorized: yes
authenticated: yes
associated: yes
preamble: short
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 2
beacon interval:100
short preamble: yes
short slot time:yes
connected time: 203 seconds
Main use case is to measure the signal strength of a connected station
to AP. Data packet transmit rates and bandwidth used by station can vary
a lot even if the station is at fixed location, especially if the rates
used are multi stream(2stream, 3stream) rates with different bandwidth(20/40/80 Mhz).
These multi stream rates are sensitive and station can use different transmit power
for each of the rate and bandwidth combinations. RSSI measured from these RX packets
on AP will be not stable and can vary a lot with in a short time.
Whereas 802.11 ack frames from station are sent relatively at a constant
rate (6/12/24 Mbps) with constant bandwidth(20 Mhz).
So average rssi of the ack packets is good and more accurate.
Signed-off-by: Balaji Pothunoori <bpothuno@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Currently the regulatory core does not call the regulatory callback
reg_notifier for self managed wiphys, but regulatory_hint_user() call is
independent of wiphy and is meant for all wiphys in the system. Even a
self managed wiphy may be interested in regulatory_hint_user() to know
the country code from a trusted regulatory domain change like a cellular
base station. Therefore, for the regulatory source
NL80211_REGDOM_SET_BY_USER and the user hint type
NL80211_USER_REG_HINT_CELL_BASE, call the regulatory notifier.
No current wlan driver uses the REGULATORY_WIPHY_SELF_MANAGED flag while
also registering the reg_notifier regulatory callback, therefore there
will be no impact on existing drivers without them being explicitly
modified to take advantage of this new possibility.
Signed-off-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This will serve userspace entity to maintain its regulatory limitation.
More specifcally APs can use this data to calculate the WMM IE when
building: beacons, probe responses, assoc responses etc...
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Sometimes the most updated CSA counter values are known only
to the device. Add an API to pass this data to mac80211.
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The data structure is initialized to all zeroes, and
we already rely on that in other places, so remove the
pointless assignment to 0.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Rather than just setting the valid flags to 0 set the
whole struct to 0 since other places might rely on it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
There's no need to do the same thing three times in
the different switch cases, pull that out to a single
place.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since all the HE data won't fit into struct ieee80211_rx_status,
we'll (have to) move that into the SKB proper. This means we'll
need to skip over more things in the future, so rename this to
remove the vendor-only notion from it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
2016 spec, section 10.24.2 specifies that the block ack
timeout in the ADD BA request is advisory.
That means we should check the value in the response and
act upon it (same as buffer size).
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The IEEE P802.11-REVmd D1.0 specification updated the SAE authentication
timeout to be 2000 milliseconds (see dot11RSNASAERetransPeriod). Update
the SAE timeout setting accordingly.
While at it, reduce some code duplication in the timeout configuration.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Make sure to free the rfkill device in case registration fails during
probe.
Fixes: 5e7ca3937fbe ("net: rfkill: gpio: convert to resource managed allocation")
Cc: stable <stable@vger.kernel.org> # 3.13
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
These days the dma mapping routines must be able to handle any address
supported by the device, be that using an iommu, or swiotlb if none is
supported. With that the PCI_DMA_BUS_IS_PHYS check in illegal_highdma
is not needed and can be removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next
tree, more relevant updates in this batch are:
1) Add Maglev support to IPVS. Moreover, store lastest server weight in
IPVS since this is needed by maglev, patches from from Inju Song.
2) Preparation works to add iptables flowtable support, patches
from Felix Fietkau.
3) Hand over flows back to conntrack slow path in case of TCP RST/FIN
packet is seen via new teardown state, also from Felix.
4) Add support for extended netlink error reporting for nf_tables.
5) Support for larger timeouts that 23 days in nf_tables, patch from
Florian Westphal.
6) Always set an upper limit to dynamic sets, also from Florian.
7) Allow number generator to make map lookups, from Laura Garcia.
8) Use hash_32() instead of opencode hashing in IPVS, from Vicent Bernat.
9) Extend ip6tables SRH match to support previous, next and last SID,
from Ahmed Abdelsalam.
10) Move Passive OS fingerprint nf_osf.c, from Fernando Fernandez.
11) Expose nf_conntrack_max through ctnetlink, from Florent Fourcot.
12) Several housekeeping patches for xt_NFLOG, x_tables and ebtables,
from Taehee Yoo.
13) Unify meta bridge with core nft_meta, then make nft_meta built-in.
Make rt and exthdr built-in too, again from Florian.
14) Missing initialization of tbl->entries in IPVS, from Cong Wang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This must now use a 64bit jiffies value, else we set
a bogus timeout on 32bit.
Fixes: 8e1102d5a1596 ("netfilter: nf_tables: support timeouts larger than 23 days")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
IPCTNL_MSG_CT_GET_STATS netlink command allow to monitor current number
of conntrack entries. However, if one wants to compare it with the
maximum (and detect exhaustion), the only solution is currently to read
sysctl value.
This patch add nf_conntrack_max value in netlink message, and simplify
monitoring for application built on netlink API.
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add nf_osf_ttl() and nf_osf_match() into nf_osf.c to prepare for
nf_tables support.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
IPv6 Segment Routing Header (SRH) contains a list of SIDs to be crossed
by SR encapsulated packet. Each SID is encoded as an IPv6 prefix.
When a Firewall receives an SR encapsulated packet, it should be able
to identify which node previously processed the packet (previous SID),
which node is going to process the packet next (next SID), and which
node is the last to process the packet (last SID) which represent the
final destination of the packet in case of inline SR mode.
An example use-case of using these features could be SID list that
includes two firewalls. When the second firewall receives a packet,
it can check whether the packet has been processed by the first firewall
or not. Based on that check, it decides to apply all rules, apply just
subset of the rules, or totally skip all rules and forward the packet to
the next SID.
This patch extends SRH match to support matching previous SID, next SID,
and last SID.
Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The modulus in the hash function was limited to > 1 as initially
there was no sense to create a hashing of just one element.
Nevertheless, there are certain cases specially for load balancing
where this case needs to be addressed.
This patch fixes the following error.
Error: Could not process rule: Numerical result out of range
add rule ip nftlb lb01 dnat to jhash ip saddr mod 1 map { 0: 192.168.0.10 }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The solution comes to force the hash to 0 when the modulus is 1.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
|
|
This patch includes a new attribute in the numgen structure to allow
the lookup of an element based on the number generator as a key.
For this purpose, different ops have been included to extend the
current numgen inc functions.
Currently, only supported for numgen incremental operations, but
it will be supported for random in a follow-up patch.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This slipped through the cracks in the followup set to the fib6_info flip.
Rename rt6_next to fib6_next.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the newly created LSM-hook for socketpair(). The default hook
return-value is 0, so behavior stays the same unless LSMs start using
this hook.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Morris <james.morris@microsoft.com>
|
|
Making sure the headers line up properly with the actual value output of the command
`cat /proc/net/netlink`
Before the patch:
<sk Eth Pid Groups Rmem Wmem Dump Locks Drops Inode
<ffff8cd2c2f7b000 0 909 00000550 0 0 0 2 0 18946
After the patch:
>sk Eth Pid Groups Rmem Wmem Dump Locks Drops Inode
>0000000033203952 0 897 00000113 0 0 0 2 0 14906
Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot caught an infinite recursion in nsh_gso_segment().
Problem here is that we need to make sure the NSH header is of
reasonable length.
BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48 max: 48!
48 locks held by syz-executor0/10189:
#0: (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
#1: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#1: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#2: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#2: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#3: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#3: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#4: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#4: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#5: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#5: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#6: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#6: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#7: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#7: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#8: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#8: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#9: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#9: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#10: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#10: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#11: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#11: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#12: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#12: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#13: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#13: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#14: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#14: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#15: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#15: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#16: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#16: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#17: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#17: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#18: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#18: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#19: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#19: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#20: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#20: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#21: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#21: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#22: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#22: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#23: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#23: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#24: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#24: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#25: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#25: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#26: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#26: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#27: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#27: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#28: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#28: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#29: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#29: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#30: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#30: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#31: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#31: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
#32: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#32: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#33: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#33: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#34: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#34: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#35: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#35: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#36: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#36: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#37: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#37: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#38: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#38: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#39: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#39: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#40: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#40: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#41: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#41: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#42: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#42: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#43: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#43: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#44: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#44: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#45: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#45: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#46: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#46: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#47: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#47: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
__lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
rcu_read_lock include/linux/rcupdate.h:632 [inline]
skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
__skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
skb_gso_segment include/linux/netdevice.h:4025 [inline]
validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
qdisc_restart net/sched/sch_generic.c:399 [inline]
__qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
__dev_xmit_skb net/core/dev.c:3243 [inline]
__dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
packet_snd net/packet/af_packet.c:2951 [inline]
packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:639
__sys_sendto+0x3d7/0x670 net/socket.c:1789
__do_sys_sendto net/socket.c:1801 [inline]
__se_sys_sendto net/socket.c:1797 [inline]
__x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: c411ed854584 ("nsh: add GSO support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Benc <jbenc@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ioc_data.dev_num can be controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
net/atm/lec.c:702 lec_vcc_attach() warn: potential spectre issue
'dev_lec'
Fix this by sanitizing ioc_data.dev_num before using it to index
dev_lec. Also, notice that there is another instance in which array
dev_lec is being indexed using ioc_data.dev_num at line 705:
lec_vcc_added(netdev_priv(dev_lec[ioc_data.dev_num]),
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If an OVS_ATTR_NESTED attribute type is found while walking
through netlink attributes, we call nlattr_set() recursively
passing the length table for the following nested attributes, if
different from the current one.
However, once we're done with those sub-nested attributes, we
should continue walking through attributes using the current
table, instead of using the one related to the sub-nested
attributes.
For example, given this sequence:
1 OVS_KEY_ATTR_PRIORITY
2 OVS_KEY_ATTR_TUNNEL
3 OVS_TUNNEL_KEY_ATTR_ID
4 OVS_TUNNEL_KEY_ATTR_IPV4_SRC
5 OVS_TUNNEL_KEY_ATTR_IPV4_DST
6 OVS_TUNNEL_KEY_ATTR_TTL
7 OVS_TUNNEL_KEY_ATTR_TP_SRC
8 OVS_TUNNEL_KEY_ATTR_TP_DST
9 OVS_KEY_ATTR_IN_PORT
10 OVS_KEY_ATTR_SKB_MARK
11 OVS_KEY_ATTR_MPLS
we switch to the 'ovs_tunnel_key_lens' table on attribute #3,
and we don't switch back to 'ovs_key_lens' while setting
attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS
evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is
15, we also get this kind of KASan splat while accessing the
wrong table:
[ 7654.586496] ==================================================================
[ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430
[ 7654.610983]
[ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1
[ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
[ 7654.631379] Call Trace:
[ 7654.634108] [<ffffffffb65a7c50>] dump_stack+0x19/0x1b
[ 7654.639843] [<ffffffffb53ff373>] print_address_description+0x33/0x290
[ 7654.647129] [<ffffffffc169b37b>] ? nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.654607] [<ffffffffb53ff812>] kasan_report.part.3+0x242/0x330
[ 7654.661406] [<ffffffffb53ff9b4>] __asan_report_load4_noabort+0x34/0x40
[ 7654.668789] [<ffffffffc169b37b>] nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.676076] [<ffffffffc167ef68>] ovs_nla_get_match+0x10c8/0x1900 [openvswitch]
[ 7654.684234] [<ffffffffb61e9cc8>] ? genl_rcv+0x28/0x40
[ 7654.689968] [<ffffffffb61e7733>] ? netlink_unicast+0x3f3/0x590
[ 7654.696574] [<ffffffffc167dea0>] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch]
[ 7654.705122] [<ffffffffb4f41b50>] ? unwind_get_return_address+0xb0/0xb0
[ 7654.712503] [<ffffffffb65d9355>] ? system_call_fastpath+0x1c/0x21
[ 7654.719401] [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
[ 7654.726298] [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
[ 7654.733195] [<ffffffffb53fe4b5>] ? kasan_unpoison_shadow+0x35/0x50
[ 7654.740187] [<ffffffffb53fe62a>] ? kasan_kmalloc+0xaa/0xe0
[ 7654.746406] [<ffffffffb53fec32>] ? kasan_slab_alloc+0x12/0x20
[ 7654.752914] [<ffffffffb53fe711>] ? memset+0x31/0x40
[ 7654.758456] [<ffffffffc165bf92>] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch]
[snip]
[ 7655.132484] The buggy address belongs to the variable:
[ 7655.138226] ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch]
[ 7655.145507]
[ 7655.147166] Memory state around the buggy address:
[ 7655.152514] ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
[ 7655.160585] ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
[ 7655.176701] ^
[ 7655.184372] ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05
[ 7655.192431] ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[ 7655.200490] ==================================================================
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: 982b52700482 ("openvswitch: Fix mask generation for nested attributes.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide an implementation for splice() when we are using SMC. See
smc_splice_read() for further details.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Preparatory work for splice() support.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Turn smc_rx_wait_data into a generic function that can be used at various
instances to wait on traffic to complete with varying criteria.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some of the conditions to exit recv() are common in two pathes - cleaning up
code by moving the check up so we have it only once.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Overlapping changes in selftests Makefile.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct xfrm_state is rather large (768 bytes here) and therefore wastes
quite a lot of memory as it falls into the kmalloc-1024 slab cache,
leaving 256 bytes of unused memory per XFRM state object -- a net waste
of 25%.
Using a dedicated slab cache for struct xfrm_state reduces the level of
internal fragmentation to a minimum.
On my configuration SLUB chooses to create a slab cache covering 4
pages holding 21 objects, resulting in an average memory waste of ~13
bytes per object -- a net waste of only 1.6%.
In my tests this led to memory savings of roughly 2.3MB for 10k XFRM
states.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Pull networking fixes from David Miller:
1) Various sockmap fixes from John Fastabend (pinned map handling,
blocking in recvmsg, double page put, error handling during redirect
failures, etc.)
2) Fix dead code handling in x86-64 JIT, from Gianluca Borello.
3) Missing device put in RDS IB code, from Dag Moxnes.
4) Don't process fast open during repair mode in TCP< from Yuchung
Cheng.
5) Move address/port comparison fixes in SCTP, from Xin Long.
6) Handle add a bond slave's master into a bridge properly, from
Hangbin Liu.
7) IPv6 multipath code can operate on unitialized memory due to an
assumption that the icmp header is in the linear SKB area. Fix from
Eric Dumazet.
8) Don't invoke do_tcp_sendpages() recursively via TLS, from Dave
Watson.
9) Fix memory leaks in x86-64 JIT, from Daniel Borkmann.
10) RDS leaks kernel memory to userspace, from Eric Dumazet.
11) DCCP can invoke a tasklet on a freed socket, take a refcount. Also
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
dccp: fix tasklet usage
smc: fix sendpage() call
net/smc: handle unregistered buffers
net/smc: call consolidation
qed: fix spelling mistake: "offloded" -> "offloaded"
net/mlx5e: fix spelling mistake: "loobpack" -> "loopback"
tcp: restore autocorking
rds: do not leak kernel memory to user land
qmi_wwan: do not steal interfaces from class drivers
ipv4: fix fnhe usage by non-cached routes
bpf: sockmap, fix error handling in redirect failures
bpf: sockmap, zero sg_size on error when buffer is released
bpf: sockmap, fix scatterlist update on error path in send with apply
net_sched: fq: take care of throttled flows before reuse
ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"
bpf, x64: fix memleak when not converging on calls
bpf, x64: fix memleak when not converging after image
net/smc: restrict non-blocking connect finish
8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
sctp: fix the issue that the cookie-ack with auth can't get processed
...
|
|
This adds a small BPF helper similar to bpf_skb_load_bytes() that
is able to load relative to mac/net header offset from the skb's
linear data. Compared to bpf_skb_load_bytes(), it takes a fifth
argument namely start_header, which is either BPF_HDR_START_MAC
or BPF_HDR_START_NET. This allows for a more flexible alternative
compared to LD_ABS/LD_IND with negative offset. It's enabled for
tc BPF programs as well as sock filter program types where it's
mainly useful in reuseport programs to ease access to lower header
data.
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2017-March/000698.html
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The main part of this work is to finally allow removal of LD_ABS
and LD_IND from the BPF core by reimplementing them through native
eBPF instead. Both LD_ABS/LD_IND were carried over from cBPF and
keeping them around in native eBPF caused way more trouble than
actually worth it. To just list some of the security issues in
the past:
* fdfaf64e7539 ("x86: bpf_jit: support negative offsets")
* 35607b02dbef ("sparc: bpf_jit: fix loads from negative offsets")
* e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
* 07aee9439454 ("bpf, sparc: fix usage of wrong reg for load_skb_regs after call")
* 6d59b7dbf72e ("bpf, s390x: do not reload skb pointers in non-skb context")
* 87338c8e2cbb ("bpf, ppc64: do not reload skb pointers in non-skb context")
For programs in native eBPF, LD_ABS/LD_IND are pretty much legacy
these days due to their limitations and more efficient/flexible
alternatives that have been developed over time such as direct
packet access. LD_ABS/LD_IND only cover 1/2/4 byte loads into a
register, the load happens in host endianness and its exception
handling can yield unexpected behavior. The latter is explained
in depth in f6b1b3bf0d5f ("bpf: fix subprog verifier bypass by
div/mod by 0 exception") with similar cases of exceptions we had.
In native eBPF more recent program types will disable LD_ABS/LD_IND
altogether through may_access_skb() in verifier, and given the
limitations in terms of exception handling, it's also disabled
in programs that use BPF to BPF calls.
In terms of cBPF, the LD_ABS/LD_IND is used in networking programs
to access packet data. It is not used in seccomp-BPF but programs
that use it for socket filtering or reuseport for demuxing with
cBPF. This is mostly relevant for applications that have not yet
migrated to native eBPF.
The main complexity and source of bugs in LD_ABS/LD_IND is coming
from their implementation in the various JITs. Most of them keep
the model around from cBPF times by implementing a fastpath written
in asm. They use typically two from the BPF program hidden CPU
registers for caching the skb's headlen (skb->len - skb->data_len)
and skb->data. Throughout the JIT phase this requires to keep track
whether LD_ABS/LD_IND are used and if so, the two registers need
to be recached each time a BPF helper would change the underlying
packet data in native eBPF case. At least in eBPF case, available
CPU registers are rare and the additional exit path out of the
asm written JIT helper makes it also inflexible since not all
parts of the JITer are in control from plain C. A LD_ABS/LD_IND
implementation in eBPF therefore allows to significantly reduce
the complexity in JITs with comparable performance results for
them, e.g.:
test_bpf tcpdump port 22 tcpdump complex
x64 - before 15 21 10 14 19 18
- after 7 10 10 7 10 15
arm64 - before 40 91 92 40 91 151
- after 51 64 73 51 62 113
For cBPF we now track any usage of LD_ABS/LD_IND in bpf_convert_filter()
and cache the skb's headlen and data in the cBPF prologue. The
BPF_REG_TMP gets remapped from R8 to R2 since it's mainly just
used as a local temporary variable. This allows to shrink the
image on x86_64 also for seccomp programs slightly since mapping
to %rsi is not an ereg. In callee-saved R8 and R9 we now track
skb data and headlen, respectively. For normal prologue emission
in the JITs this does not add any extra instructions since R8, R9
are pushed to stack in any case from eBPF side. cBPF uses the
convert_bpf_ld_abs() emitter which probes the fast path inline
already and falls back to bpf_skb_load_helper_{8,16,32}() helper
relying on the cached skb data and headlen as well. R8 and R9
never need to be reloaded due to bpf_helper_changes_pkt_data()
since all skb access in cBPF is read-only. Then, for the case
of native eBPF, we use the bpf_gen_ld_abs() emitter, which calls
the bpf_skb_load_helper_{8,16,32}_no_cache() helper unconditionally,
does neither cache skb data and headlen nor has an inlined fast
path. The reason for the latter is that native eBPF does not have
any extra registers available anyway, but even if there were, it
avoids any reload of skb data and headlen in the first place.
Additionally, for the negative offsets, we provide an alternative
bpf_skb_load_bytes_relative() helper in eBPF which operates
similarly as bpf_skb_load_bytes() and allows for more flexibility.
Tested myself on x64, arm64, s390x, from Sandipan on ppc64.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove all eBPF tests involving LD_ABS/LD_IND from test_bpf.ko. Reason
is that the eBPF tests from test_bpf module do not go via BPF verifier
and therefore any instruction rewrites from verifier cannot take place.
Therefore, move them into test_verifier which runs out of user space,
so that verfier can rewrite LD_ABS/LD_IND internally in upcoming patches.
It will have the same effect since runtime tests are also performed from
there. This also allows to finally unexport bpf_skb_vlan_{push,pop}_proto
and keep it internal to core kernel.
Additionally, also add further cBPF LD_ABS/LD_IND test coverage into
test_bpf.ko suite.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
No change in functionality, just remove the '__' prefix and replace it
with a 'bpf_' prefix instead. We later on add a couple of more helpers
for cBPF and keeping the scheme with '__' is suboptimal there.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In this commit, a new getsockopt is added: XDP_STATISTICS. This is
used to obtain stats from the sockets.
v2: getsockopt now returns size of stats structure.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Here, Tx support is added. The user fills the Tx queue with frames to
be sent by the kernel, and let's the kernel know using the sendmsg
syscall.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The new dev_direct_xmit will be used by AF_XDP in later commits.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Another setsockopt (XDP_TX_QUEUE) is added to let the process allocate
a queue, where the user process can pass frames to be transmitted by
the kernel.
The mmapping of the queue is done using the XDP_PGOFF_TX_QUEUE offset.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Here, we add another setsockopt for registered user memory (umem)
called XDP_UMEM_COMPLETION_QUEUE. Using this socket option, the
process can ask the kernel to allocate a queue (ring buffer) and also
mmap it (XDP_UMEM_PGOFF_COMPLETION_QUEUE) into the process.
The queue is used to explicitly pass ownership of umem frames from the
kernel to user process. This will be used by the TX path to tell user
space that a certain frame has been transmitted and user space can use
it for something else, if it wishes.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit wires up the xskmap to XDP_SKB layer.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit wires up the xskmap to XDP_DRV layer.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|