Age | Commit message (Collapse) | Author |
|
It's possible to crash the kernel in several different ways by sending
messages to the SMC_PNETID generic netlink family that are missing the
expected attributes:
- Missing SMC_PNETID_NAME => null pointer dereference when comparing
names.
- Missing SMC_PNETID_ETHNAME => null pointer dereference accessing
smc_pnetentry::ndev.
- Missing SMC_PNETID_IBNAME => null pointer dereference accessing
smc_pnetentry::smcibdev.
- Missing SMC_PNETID_IBPORT => out of bounds array access to
smc_ib_device::pattr[-1].
Fix it by validating that all expected attributes are present and that
SMC_PNETID_IBPORT is nonzero.
Reported-by: syzbot+5cd61039dc9b8bfa6e47@syzkaller.appspotmail.com
Fixes: 6812baabf24d ("smc: establish pnet table management")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Otherwise we will leak a reference to the network namespace.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
syzbot found a way to trigger an infinitie loop by overflowing
@offset variable that has been forced to use u16 for some very
obscure reason in the past.
We probably want to look at NEXTHDR_FRAGMENT handling which looks
wrong, in a separate patch.
In net-next, we shall try to use skb_header_pointer() instead of
pskb_may_pull().
watchdog: BUG: soft lockup - CPU#1 stuck for 134s! [syz-executor738:4553]
Modules linked in:
irq event stamp: 13885653
hardirqs last enabled at (13885652): [<ffffffff878009d5>] restore_regs_and_return_to_kernel+0x0/0x2b
hardirqs last disabled at (13885653): [<ffffffff87800905>] interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_napi_alloc_frags drivers/net/tun.c:1478 [inline]
softirqs last enabled at (13614028): [<ffffffff84df0809>] tun_get_user+0x1dd9/0x4290 drivers/net/tun.c:1825
softirqs last disabled at (13614032): [<ffffffff84df1b6f>] tun_get_user+0x313f/0x4290 drivers/net/tun.c:1942
CPU: 1 PID: 4553 Comm: syz-executor738 Not tainted 4.17.0-rc3+ #40
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:check_kcov_mode kernel/kcov.c:67 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x20/0x50 kernel/kcov.c:101
RSP: 0018:ffff8801d8cfe250 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: ffff8801d88a8080 RBX: ffff8801d7389e40 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffffffff868da4ad RDI: ffff8801c8a53277
RBP: ffff8801d8cfe250 R08: ffff8801d88a8080 R09: ffff8801d8cfe3e8
R10: ffffed003b19fc87 R11: ffff8801d8cfe43f R12: ffff8801c8a5327f
R13: 0000000000000000 R14: ffff8801c8a4e5fe R15: ffff8801d8cfe3e8
FS: 0000000000d88940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 00000001acab3000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
_decode_session6+0xc1d/0x14f0 net/ipv6/xfrm6_policy.c:150
__xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2368
xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline]
icmpv6_route_lookup+0x395/0x6e0 net/ipv6/icmp.c:372
icmp6_send+0x1982/0x2da0 net/ipv6/icmp.c:551
icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43
ip6_input_finish+0x14e1/0x1a30 net/ipv6/ip6_input.c:305
NF_HOOK include/linux/netfilter.h:288 [inline]
ip6_input+0xe1/0x5e0 net/ipv6/ip6_input.c:327
dst_input include/net/dst.h:450 [inline]
ip6_rcv_finish+0x29c/0xa10 net/ipv6/ip6_input.c:71
NF_HOOK include/linux/netfilter.h:288 [inline]
ipv6_rcv+0xeb8/0x2040 net/ipv6/ip6_input.c:208
__netif_receive_skb_core+0x2468/0x3650 net/core/dev.c:4646
__netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4711
netif_receive_skb_internal+0x126/0x7b0 net/core/dev.c:4785
napi_frags_finish net/core/dev.c:5226 [inline]
napi_gro_frags+0x631/0xc40 net/core/dev.c:5299
tun_get_user+0x3168/0x4290 drivers/net/tun.c:1951
tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1996
call_write_iter include/linux/fs.h:1784 [inline]
do_iter_readv_writev+0x859/0xa50 fs/read_write.c:680
do_iter_write+0x185/0x5f0 fs/read_write.c:959
vfs_writev+0x1c7/0x330 fs/read_write.c:1004
do_writev+0x112/0x2f0 fs/read_write.c:1039
__do_sys_writev fs/read_write.c:1112 [inline]
__se_sys_writev fs/read_write.c:1109 [inline]
__x64_sys_writev+0x75/0xb0 fs/read_write.c:1109
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reported-by: syzbot+0053c8...@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
Virtual interface drivers such as tun / tap interfaces, VLAN, etc tend
to initialize the interface throughput with some value for the sake of
having a throughput number to export via ethtool. This exported
throughput leaves batman-adv to conclude the interface throughput is
genuine (reflecting reality), thus no measurements are necessary.
Based on the observation that those interface types also tend to set
the link auto-negotiation to 'off', batman-adv shall check this
setting to differentiate between genuine link throughput information
and placeholders installed by virtual interfaces.
The "default throughput" setting exported via sysfs still allows to
configure the batman-adv throughput for the interface, thus disabling
the measurements.
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Fix handling of simultaneous open TCP connection in conntrack,
from Jozsef Kadlecsik.
2) Insufficient sanitify check of xtables extension names, from
Florian Westphal.
3) Skip unnecessary synchronize_rcu() call when transaction log
is already empty, from Florian Westphal.
4) Incorrect destination mac validation in ebt_stp, from Stephen
Hemminger.
5) xtables module reference counter leak in nft_compat, from
Florian Westphal.
6) Incorrect connection reference counting logic in IPVS
one-packet scheduler, from Julian Anastasov.
7) Wrong stats for 32-bits CPU in IPVS, also from Julian.
8) Calm down sparse error in netfilter core, also from Florian.
9) Use nla_strlcpy to fix compilation warning in nfnetlink_acct
and nfnetlink_cthelper, again from Florian.
10) Missing module alias in icmp and icmp6 xtables extensions,
from Florian Westphal.
11) Base chain statistics in nf_tables may be unset/null, from Florian.
12) Fix handling of large matchinfo size in nft_compat, this includes
one preparation for before this fix. From Florian.
13) Fix bogus EBUSY error when deleting chains due to incorrect reference
counting from the preparation phase of the two-phase commit protocol.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When CONFIG_PROC_FS isn't set, variable ipconfig_dir isn't used.
net/ipv4/ipconfig.c:167:31: warning: ‘ipconfig_dir’ defined but not used [-Wunused-variable]
static struct proc_dir_entry *ipconfig_dir;
^~~~~~~~~~~~
Move the declaration of ipconfig_dir inside the CONFIG_PROC_FS ifdef to
fix the warning.
Fixes: c04d2cb2009f ("ipconfig: Write NTP server IPs to /proc/net/ipconfig/ntp_servers")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Packet sockets allow construction of packets shorter than
dev->hard_header_len to accommodate protocols with variable length
link layer headers. These packets are padded to dev->hard_header_len,
because some device drivers interpret that as a minimum packet size.
packet_snd reserves dev->hard_header_len bytes on allocation.
SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that
link layer headers are stored in the reserved range. SOCK_RAW sockets
do the same in tpacket_snd, but not in packet_snd.
Syzbot was able to send a zero byte packet to a device with massive
116B link layer header, causing padding to cross over into skb_shinfo.
Fix this by writing from the start of the llheader reserved range also
in the case of packet_snd/SOCK_RAW.
Update skb_set_network_header to the new offset. This also corrects
it for SOCK_DGRAM, where it incorrectly double counted reserve due to
the skb_push in dev_hard_header.
Fixes: 9ed988cd5915 ("packet: validate variable length ll headers")
Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the -EBUSY error return path is not free'ing resources
allocated earlier, leaving a memory leak. Fix this by exiting via the
error exit label err5 that performs the necessary resource clean
up.
Detected by CoverityScan, CID#1432975 ("Resource leak")
Fixes: 9744a6fcefcb ("netfilter: nf_tables: check if same extensions are set when adding elements")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
A translation table TVLV changset sent with an OGM consists
of a number of headers (one per VLAN) plus the changeset
itself (addition and/or deletion of entries).
The per-VLAN headers are used by OGM recipients for consistency
checks. Said consistency check might determine that a full
translation table request is needed to restore consistency. If
the TT sender adds per-VLAN headers of empty VLANs into the OGM,
recipients are led to believe to have reached an inconsistent
state and thus request a full table update. The full table does
not contain empty VLANs (due to missing entries) the cycle
restarts when the next OGM is issued.
Consequently, when the translation table TVLV headers are
composed, empty VLANs are to be excluded.
Fixes: 21a57f6e7a3b ("batman-adv: make the TT CRC logic VLAN specific")
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
The previous TT sync fix so far only fixed TT responses issued by the
target node directly. So far, TT responses issued by intermediate nodes
still lead to the wrong flags being added, leading to CRC mismatches.
This behaviour was observed at Freifunk Hannover in a 800 nodes setup
where a considerable amount of nodes were still infected with 'WI'
TT flags even with (most) nodes having the previous TT sync fix applied.
I was able to reproduce the issue with intermediate TT responses in a
four node test setup and this patch fixes this issue by ensuring to
use the per originator instead of the summarized, OR'd ones.
Fixes: e9c00136a475 ("batman-adv: fix tt_global_entries flags update")
Reported-by: Leonardo Mörlein <me@irrelefant.net>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
|
|
The bpf syscall and selftests conflicts were trivial
overlapping changes.
The r8169 change involved moving the added mdelay from 'net' into a
different function.
A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts. I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Verify lengths of keys provided by the user is AF_KEY, from Kevin
Easton.
2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka.
3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R.
Silva.
4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most
grateful for this fix.
5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we
do appreciate.
6) Use after free in TLS code. Once again we are blessed by the
honorable Eric Dumazet with this fix.
7) Fix regression in TLS code causing stalls on partial TLS records.
This fix is bestowed upon us by Andrew Tomt.
8) Deal with too small MTUs properly in LLC code, another great gift
from Eric Dumazet.
9) Handle cached route flushing properly wrt. MTU locking in ipv4, to
Hangbin Liu we give thanks for this.
10) Fix regression in SO_BINDTODEVIC handling wrt. UDP socket demux.
Paolo Abeni, he gave us this.
11) Range check coalescing parameters in mlx4 driver, thank you Moshe
Shemesh.
12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother
David Howells.
13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens,
you're the best!
14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata
Benerjee saved us!
15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship
is now water tight, thanks to Andrey Ignatov.
16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes
everywhere!
17) Fix error path in tcf_proto_create, Jiri Pirko what would we do
without you!
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
net sched actions: fix refcnt leak in skbmod
net: sched: fix error path in tcf_proto_create() when modules are not configured
net sched actions: fix invalid pointer dereferencing if skbedit flags missing
ixgbe: fix memory leak on ipsec allocation
ixgbevf: fix ixgbevf_xmit_frame()'s return type
ixgbe: return error on unsupported SFP module when resetting
ice: Set rq_last_status when cleaning rq
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
bonding: send learning packets for vlans on slave
bonding: do not allow rlb updates to invalid mac
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net/mlx5: Free IRQs in shutdown path
rxrpc: Trace UDP transmission failure
rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages
rxrpc: Fix the min security level for kernel calls
rxrpc: Fix error reception on AF_INET6 sockets
rxrpc: Fix missing start of call timeout
qed: fix spelling mistake: "taskelt" -> "tasklet"
...
|
|
Pull NFS client fixes from Anna Schumaker:
"These patches fix both a possible corruption during NFSoRDMA MR
recovery, and a sunrpc tracepoint crash.
Additionally, Trond has a new email address to put in the MAINTAINERS
file"
* tag 'nfs-for-4.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
Change Trond's email address in MAINTAINERS
sunrpc: Fix latency trace point crashes
xprtrdma: Fix list corruption / DMAR errors during MR recovery
|
|
When application fails to pass flags in netlink TLV when replacing
existing skbmod action, the kernel will leak refcnt:
$ tc actions get action skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 1 bind 0
For example, at this point a buggy application replaces the action with
index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags,
however refcnt gets bumped:
$ tc actions get actions skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 2 bind 0
$
Tha patch fixes this by calling tcf_idr_release() on existing actions.
Fixes: 86da71b57383d ("net_sched: Introduce skbmod action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case modules are not configured, error out when tp->ops is null
and prevent later null pointer dereference.
Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the truncated bit is set only when 1) the mirrored packet
is larger than mtu and 2) the ipv4 packet tot_len is larger than
the actual skb->len. This patch adds another case for detecting
whether ipv6 packet is truncated or not, by checking the ipv6 header
payload_len and the skb->len.
Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fixes
Here are three fixes for AF_RXRPC and two tracepoints that were useful for
finding them:
(1) Fix missing start of expect-Rx-by timeout on initial packet
transmission so that calls will time out if the peer doesn't respond.
(2) Fix error reception on AF_INET6 sockets by using the correct family of
sockopts on the UDP transport socket.
(3) Fix setting the minimum security level on kernel calls so that they
can be encrypted.
(4) Add a tracepoint to log ICMP/ICMP6 and other error reports from the
transport socket.
(5) Add a tracepoint to log UDP sendmsg failure so that we can find out if
transmission failure occurred on the UDP socket.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When application fails to pass flags in netlink TLV for a new skbedit action,
the kernel results in the following oops:
[ 8.307732] BUG: unable to handle kernel paging request at 0000000000021130
[ 8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
[ 8.310595] Oops: 0000 [#1] SMP PTI
[ 8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
[ 8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
[ 8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
[ 8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
[ 8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
[ 8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
[ 8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
[ 8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
[ 8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
[ 8.325590] FS: 00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
[ 8.327001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
[ 8.329289] Call Trace:
[ 8.329735] tcf_skbedit_init+0xa7/0xb0
[ 8.330423] tcf_action_init_1+0x362/0x410
[ 8.331139] ? try_to_wake_up+0x44/0x430
[ 8.331817] tcf_action_init+0x103/0x190
[ 8.332511] tc_ctl_action+0x11a/0x220
[ 8.333174] rtnetlink_rcv_msg+0x23d/0x2e0
[ 8.333902] ? _cond_resched+0x16/0x40
[ 8.334569] ? __kmalloc_node_track_caller+0x5b/0x2c0
[ 8.335440] ? rtnl_calcit.isra.31+0xf0/0xf0
[ 8.336178] netlink_rcv_skb+0xdb/0x110
[ 8.336855] netlink_unicast+0x167/0x220
[ 8.337550] netlink_sendmsg+0x2a7/0x390
[ 8.338258] sock_sendmsg+0x30/0x40
[ 8.338865] ___sys_sendmsg+0x2c5/0x2e0
[ 8.339531] ? pagecache_get_page+0x27/0x210
[ 8.340271] ? filemap_fault+0xa2/0x630
[ 8.340943] ? page_add_file_rmap+0x108/0x200
[ 8.341732] ? alloc_set_pte+0x2aa/0x530
[ 8.342573] ? finish_fault+0x4e/0x70
[ 8.343332] ? __handle_mm_fault+0xbc1/0x10d0
[ 8.344337] ? __sys_sendmsg+0x53/0x80
[ 8.345040] __sys_sendmsg+0x53/0x80
[ 8.345678] do_syscall_64+0x4f/0x100
[ 8.346339] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 8.347206] RIP: 0033:0x7f591191da67
[ 8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
[ 8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
[ 8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
[ 8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
[ 8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
[ 8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 <8b> 53 30
74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
[ 8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
[ 8.359770] CR2: 0000000000021130
[ 8.360438] ---[ end trace 60c66be45dfc14f0 ]---
The caller calls action's ->init() and passes pointer to "struct tc_action *a",
which later may be initialized to point at the existing action, otherwise
"struct tc_action *a" is still invalid, and therefore dereferencing it is an
error as happens in tcf_idr_release, where refcnt is decremented.
So in case of missing flags tcf_idr_release must be called only for
existing actions.
v2:
- prepare patch for net tree
Fixes: 5e1567aeb7fe ("net sched: skbedit action fix late binding")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For some reason, Willem thought that the issue we fixed for TCP
in commit 7ec318feeed1 ("tcp: gso: avoid refcount_t warning from
tcp_gso_segment()") was not relevant for UDP GSO.
But syzbot found its way.
refcount_t: saturated; leaking memory.
WARNING: CPU: 0 PID: 10261 at lib/refcount.c:78 refcount_add_not_zero+0x2d4/0x320 lib/refcount.c:78
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 10261 Comm: syz-executor5 Not tainted 4.17.0-rc3+ #38
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
panic+0x22f/0x4de kernel/panic.c:184
__warn.cold.8+0x163/0x1b3 kernel/panic.c:536
report_bug+0x252/0x2d0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:178 [inline]
do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
RIP: 0010:refcount_add_not_zero+0x2d4/0x320 lib/refcount.c:78
RSP: 0018:ffff880196db6b90 EFLAGS: 00010282
RAX: 0000000000000026 RBX: 00000000ffffff01 RCX: ffffc900040d9000
RDX: 0000000000004a29 RSI: ffffffff8160f6f1 RDI: ffff880196db66f0
RBP: ffff880196db6c78 R08: ffff8801b33d6740 R09: 0000000000000002
R10: ffff8801b33d6740 R11: 0000000000000000 R12: 0000000000000000
R13: 00000000ffffffff R14: ffff880196db6c50 R15: 0000000000020101
refcount_add+0x1b/0x70 lib/refcount.c:102
__udp_gso_segment+0xaa5/0xee0 net/ipv4/udp_offload.c:272
udp4_ufo_fragment+0x592/0x7a0 net/ipv4/udp_offload.c:301
inet_gso_segment+0x639/0x12b0 net/ipv4/af_inet.c:1342
skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
__skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
skb_gso_segment include/linux/netdevice.h:4050 [inline]
validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3122
__dev_queue_xmit+0xbf8/0x34c0 net/core/dev.c:3579
dev_queue_xmit+0x17/0x20 net/core/dev.c:3620
neigh_direct_output+0x15/0x20 net/core/neighbour.c:1401
neigh_output include/net/neighbour.h:483 [inline]
ip_finish_output2+0xa5f/0x1840 net/ipv4/ip_output.c:229
ip_finish_output+0x828/0xf80 net/ipv4/ip_output.c:317
NF_HOOK_COND include/linux/netfilter.h:277 [inline]
ip_output+0x21b/0x850 net/ipv4/ip_output.c:405
dst_output include/net/dst.h:444 [inline]
ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124
ip_send_skb+0x40/0xe0 net/ipv4/ip_output.c:1434
udp_send_skb.isra.37+0x5eb/0x1000 net/ipv4/udp.c:825
udp_push_pending_frames+0x5c/0xf0 net/ipv4/udp.c:853
udp_v6_push_pending_frames+0x380/0x3e0 net/ipv6/udp.c:1105
udp_lib_setsockopt+0x59a/0x600 net/ipv4/udp.c:2403
udpv6_setsockopt+0x95/0xa0 net/ipv6/udp.c:1447
sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3046
__sys_setsockopt+0x1bd/0x390 net/socket.c:1903
__do_sys_setsockopt net/socket.c:1914 [inline]
__se_sys_setsockopt net/socket.c:1911 [inline]
__x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: ad405857b174 ("udp: better wmem accounting on gso")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
linux-4.16 got support for softirq based hrtimers.
TCP can switch its pacing hrtimer to this variant, since this
avoids going through a tasklet and some atomic operations.
pacing timer logic looks like other (jiffies based) tcp timers.
v2: use hrtimer_try_to_cancel() in tcp_clear_xmit_timers()
to correctly release reference on socket if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for PHYLINK within the DSA subsystem in order to support more
complex devices such as pluggable (SFP) and non-pluggable (SFF) modules, 10G
PHYs, and traditional PHYs. Using PHYLINK allows us to drop some amount of
complexity we had while probing fixed and non-fixed PHYs using Device Tree.
Because PHYLINK separates the Ethernet MAC/port configuration into different
stages, we let switch drivers implement those, and for now, we maintain
functionality by calling dsa_slave_adjust_link() during
phylink_mac_link_{up,down} which provides semantically equivalent steps.
Drivers willing to take advantage of PHYLINK should implement the phylink_mac_*
operations that DSA wraps.
We cannot quite remove the adjust_link() callback just yet, because a number of
drivers rely on that for configuring their "CPU" and "DSA" ports, this is done
dsa_port_setup_phy_of() and dsa_port_fixed_link_register_of() still.
Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2) will need
to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK
*does not* create a phy_device instance for fixed links.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since we use PHYLIB to manage the per-port link indication, this will
also be reflected correctly in the network device's carrier state, so we
can use ethtool_op_get_link() instead.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for adding support for PHYLINK within DSA, define a number of
operations that we will need and that switch drivers can start implementing.
Proper integration with PHYLINK will follow in subsequent patches.
We start selecting PHYLINK (which implies PHYLIB) in net/dsa/Kconfig
such that drivers can be guaranteed that this dependency is properly
taken care of and can start referencing PHYLINK helper functions without
requiring stubs or anything.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed
earlier in 919483096bfe.
* udp_sendmsg one was there since the beginning when linux sources were
first added to git;
* ping_v4_sendmsg one was copy/pasted in c319b4d76b9e.
Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options
have to be freed if they were allocated previously.
Add label so that future callers (if any) can use it instead of kfree()
before return that is easy to forget.
Fixes: c319b4d76b9e (net: ipv4: add IPPROTO_ICMP socket kind)
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a tracepoint to log transmission failure from the UDP transport socket
being used by AF_RXRPC.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add a tracepoint to log received ICMP/ICMP6 events and other error
messages.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix the kernel call initiation to set the minimum security level for kernel
initiated calls (such as from kAFS) from the sockopt value.
Fixes: 19ffa01c9c45 ("rxrpc: Use structs to hold connection params and protocol info")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
AF_RXRPC tries to turn on IP_RECVERR and IP_MTU_DISCOVER on the UDP socket
it just opened for communications with the outside world, regardless of the
type of socket. Unfortunately, this doesn't work with an AF_INET6 socket.
Fix this by turning on IPV6_RECVERR and IPV6_MTU_DISCOVER instead if the
socket is of the AF_INET6 family.
Without this, kAFS server and address rotation doesn't work correctly
because the algorithm doesn't detect received network errors.
Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The expect_rx_by call timeout is supposed to be set when a call is started
to indicate that we need to receive a packet by that point. This is
currently put back every time we receive a packet, but it isn't started
when we first send a packet. Without this, the call may wait forever if
the server doesn't deign to reply.
Fix this by setting the timeout upon a successful UDP sendmsg call for the
first DATA packet. The timeout is initiated only for initial transmission
and not for subsequent retries as we don't want the retry mechanism to
extend the timeout indefinitely.
Fixes: a158bdd3247b ("rxrpc: Fix call timeouts")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Provide a helper for doing a FIB and neighbor lookup in the kernel
tables from an XDP program. The helper provides a fastpath for forwarding
packets. If the packet is a local delivery or for any reason is not a
simple lookup and forward, the packet continues up the stack.
If it is to be forwarded, the forwarding can be done directly if the
neighbor is already known. If the neighbor does not exist, the first
few packets go up the stack for neighbor resolution. Once resolved, the
xdp program provides the fast path.
On successful lookup the nexthop dmac, current device smac and egress
device index are returned.
The API supports IPv4, IPv6 and MPLS protocols, but only IPv4 and IPv6
are implemented in this patch. The API includes layer 4 parameters if
the XDP program chooses to do deep packet inspection to allow compare
against ACLs implemented as FIB rules.
Header rewrite is left to the XDP program.
The lookup takes 2 flags:
- BPF_FIB_LOOKUP_DIRECT to do a lookup that bypasses FIB rules and goes
straight to the table associated with the device (expert setting for
those looking to maximize throughput)
- BPF_FIB_LOOKUP_OUTPUT to do a lookup from the egress perspective.
Default is an ingress lookup.
Initial performance numbers collected by Jesper, forwarded packets/sec:
Full stack XDP FIB lookup XDP Direct lookup
IPv4 1,947,969 7,074,156 7,415,333
IPv6 1,728,000 6,165,504 7,262,720
These number are single CPU core forwarding on a Broadwell
E5-1650 v4 @ 3.60GHz.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add stubs to retrieve a handle to an IPv6 FIB table, fib6_get_table,
a stub to do a lookup in a specific table, fib6_table_lookup, and
a stub for a full route lookup.
The stubs are needed for core bpf code to handle the case when the
IPv6 module is not builtin.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Similar to IPv4, IPv6 should use the FIB lookup result in the
tracepoint.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add IPv6 equivalent to fib_lookup. Does a fib lookup, including rules,
but returns a FIB entry, fib6_info, rather than a dst based rt6_info.
fib6_lookup is any where from 140% (MULTIPLE_TABLES config disabled)
to 60% faster than any of the dst based lookup methods (without custom
rules) and 25% faster with custom rules (e.g., l3mdev rule).
Since the lookup function has a completely different signature,
fib6_rule_action is split into 2 paths: the existing one is
renamed __fib6_rule_action and a new one for the fib6_info path
is added. fib6_rule_action decides which to call based on the
lookup_ptr. If it is fib6_table_lookup then the new path is taken.
Caller must hold rcu lock as no reference is taken on the returned
fib entry.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Move source address lookup from fib6_rule_action to a helper. It will be
used in a later patch by a second variant for fib6_rule_action.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
ip6_pol_route is used for ingress and egress FIB lookups. Refactor it
moving the table lookup into a separate fib6_table_lookup that can be
invoked separately and export the new function.
ip6_pol_route now calls fib6_table_lookup and uses the result to generate
a dst based rt6_info.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Rename rt6_multipath_select to fib6_multipath_select and export it.
A later patch wants access to it similar to IPv4's fib_select_path.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Rename fib6_lookup to fib6_node_lookup to better reflect what it
returns. The fib6_lookup name will be used in a later patch for
an IPv6 equivalent to IPv4's fib_lookup.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add sg table initialization to fix a BUG_ON encountered when enabling
CONFIG_DEBUG_SG.
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mirroring offload in mlxsw needs to check that a given VLAN is allowed
to ingress the bridge device. br_vlan_get_info() is the function that is
used for this, however currently it only supports bridge port devices.
Extend it to support bridge masters as well.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In Commit 1f45f78f8e51 ("sctp: allow GSO frags to access the chunk too"),
it held the chunk in sctp_ulpevent_make_rcvmsg to access it safely later
in recvmsg. However, it also added sctp_chunk_put in fail_mark err path,
which is only triggered before holding the chunk.
syzbot reported a use-after-free crash happened on this err path, where
it shouldn't call sctp_chunk_put.
This patch simply removes this call.
Fixes: 1f45f78f8e51 ("sctp: allow GSO frags to access the chunk too")
Reported-by: syzbot+141d898c5f24489db4aa@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This version has some suggestions by Eric Dumazet:
- Use a local variable for the mark in IPv6 instead of ctl_sk to avoid SMP
races.
- Use the more elegant "IP4_REPLY_MARK(net, skb->mark) ?: sk->sk_mark"
statement.
- Factorize code as sk_fullsock() check is not necessary.
Aidan McGurn from Openwave Mobility systems reported the following bug:
"Marked routing is broken on customer deployment. Its effects are large
increase in Uplink retransmissions caused by the client never receiving
the final ACK to their FINACK - this ACK misses the mark and routes out
of the incorrect route."
Currently marks are added to sk_buffs for replies when the "fwmark_reflect"
sysctl is enabled. But not for TW sockets that had sk->sk_mark set via
setsockopt(SO_MARK..).
Fix this in IPv4/v6 by adding tw->tw_mark for TIME_WAIT sockets. Copy the the
original sk->sk_mark in __inet_twsk_hashdance() to the new tw->tw_mark location.
Then progate this so that the skb gets sent with the correct mark. Do the same
for resets. Give the "fwmark_reflect" sysctl precedence over sk->sk_mark so that
netfilter rules are still honored.
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
INET_CSK_DEBUG is always set and only is used for 2 pr_debug calls.
EXPORT_SYMBOL(inet_csk_timer_bug_msg) is only used by these 2
pr_debug calls and is also unnecessary as the exported string can
be used directly by these calls.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
We only have a few fixes this time:
* WMM element validation
* SAE timeout
* add-BA timeout
* docbook parsing
* a few memory leaks in error paths
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
WARNING: lock held when returning to user space!
4.17.0-rc3+ #37 Not tainted
syz-executor1/27662 is leaving the kernel with locks still held!
1 lock held by syz-executor1/27662:
#0: 00000000f661aee7 (rcu_read_lock){....}, at: ip6_route_del+0xea/0x13f0 net/ipv6/route.c:3206
BUG: scheduling while atomic: syz-executor1/27662/0x00000002
INFO: lockdep is turned off.
Modules linked in:
Kernel panic - not syncing: scheduling while atomic
CPU: 1 PID: 27662 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #37
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
panic+0x22f/0x4de kernel/panic.c:184
__schedule_bug.cold.85+0xdf/0xdf kernel/sched/core.c:3290
schedule_debug kernel/sched/core.c:3307 [inline]
__schedule+0x139e/0x1e30 kernel/sched/core.c:3412
schedule+0xef/0x430 kernel/sched/core.c:3549
exit_to_usermode_loop+0x220/0x310 arch/x86/entry/common.c:152
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455979
RSP: 002b:00007fbf4051dc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: 0000000000000000 RBX: 00007fbf4051e6d4 RCX: 0000000000455979
RDX: 00000000200001c0 RSI: 000000000000890c RDI: 0000000000000013
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000003c8 R14: 00000000006f9b60 R15: 0000000000000000
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..
Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sysbot/KMSAN reported an uninit-value in recvmsg() that
I tracked down to tipc_sk_set_orig_addr(), missing
srcaddr->member.scope initialization.
This patches moves srcaddr->sock.scope init to follow
fields order and ease future verifications.
BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:226
CPU: 0 PID: 4549 Comm: syz-executor287 Not tainted 4.17.0-rc3+ #88
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
kmsan_internal_check_memory+0x135/0x1e0 mm/kmsan/kmsan.c:1157
kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
copy_to_user include/linux/uaccess.h:184 [inline]
move_addr_to_user+0x32e/0x530 net/socket.c:226
___sys_recvmsg+0x4e2/0x810 net/socket.c:2285
__sys_recvmsg net/socket.c:2328 [inline]
__do_sys_recvmsg net/socket.c:2338 [inline]
__se_sys_recvmsg net/socket.c:2335 [inline]
__x64_sys_recvmsg+0x325/0x460 net/socket.c:2335
do_syscall_64+0x154/0x220 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4455e9
RSP: 002b:00007fe3bd36ddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 00000000004455e9
RDX: 0000000000002002 RSI: 0000000020000400 RDI: 0000000000000003
RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff98ce4b6f R14: 00007fe3bd36e9c0 R15: 0000000000000003
Local variable description: ----addr@___sys_recvmsg
Variable was created at:
___sys_recvmsg+0xd5/0x810 net/socket.c:2246
__sys_recvmsg net/socket.c:2328 [inline]
__do_sys_recvmsg net/socket.c:2338 [inline]
__se_sys_recvmsg net/socket.c:2335 [inline]
__x64_sys_recvmsg+0x325/0x460 net/socket.c:2335
Byte 19 of 32 is uninitialized
Fixes: 31c82a2d9d51 ("tipc: add second source address to recvmsg()/recvfrom()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
In absence of VRF devices, after commit fb74c27735f0 ("net:
ipv4: add second dif to udp socket lookups") the dif mismatch
isn't fatal anymore for UDP socket lookup with non null
sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.
This changeset addresses the issue making the dif match mandatory
again in the above scenario.
Reported-by: Damir Mansurov <dnman@oktetlabs.ru>
Fixes: fb74c27735f0 ("net: ipv4: add second dif to udp socket lookups")
Fixes: 1801b570dd2a ("net: ipv6: add second dif to udp socket lookups")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After route cache is flushed via ipv4_sysctl_rtcache_flush(), we forget
to reset fnhe_mtu_locked in rt_bind_exception(). When pmtu is updated
in __ip_rt_update_pmtu(), it will return directly since the pmtu is
still locked. e.g.
+ ip netns exec client ping 10.10.1.1 -c 1 -s 1400 -M do
PING 10.10.1.1 (10.10.1.1) 1400(1428) bytes of data.
>From 10.10.0.254 icmp_seq=1 Frag needed and DF set (mtu = 0)
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 161d82de1ff8 ("net: bridge: Notify about !added_by_user FDB
entries") causes the below oops when bringing up a slave interface,
because dsa_port_fdb_add is still scheduled, but with a NULL address.
To fix this, keep the dsa_slave_switchdev_event function agnostic of the
notified info structure and handle the added_by_user flag in the
specific dsa_slave_switchdev_event_work function.
[ 75.512263] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 75.519063] pgd = (ptrval)
[ 75.520545] [00000000] *pgd=00000000
[ 75.522839] Internal error: Oops: 17 [#1] ARM
[ 75.525898] Modules linked in:
[ 75.527673] CPU: 0 PID: 9 Comm: kworker/u2:1 Not tainted 4.17.0-rc2 #78
[ 75.532988] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
[ 75.538153] Workqueue: dsa_ordered dsa_slave_switchdev_event_work
[ 75.542970] PC is at mv88e6xxx_port_db_load_purge+0x60/0x1b0
[ 75.547341] LR is at mdiobus_read_nested+0x6c/0x78
[ 75.550833] pc : [<804cd5c0>] lr : [<804bba84>] psr: 60070013
[ 75.555796] sp : 9f54bd78 ip : 9f54bd87 fp : 9f54bddc
[ 75.559719] r10: 00000000 r9 : 0000000e r8 : 9f6a6010
[ 75.563643] r7 : 00000000 r6 : 81203048 r5 : 9f6a6010 r4 : 9f6a601c
[ 75.568867] r3 : 00000000 r2 : 00000000 r1 : 0000000d r0 : 00000000
[ 75.574094] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 75.579933] Control: 10c53c7d Table: 9de20059 DAC: 00000051
[ 75.584384] Process kworker/u2:1 (pid: 9, stack limit = 0x(ptrval))
[ 75.589349] Stack: (0x9f54bd78 to 0x9f54c000)
[ 75.592406] bd60: 00000000 00000000
[ 75.599295] bd80: 00000391 9f299d10 9f299d68 8014317c 9f7f0000 8120af00 00006dc2 00000000
[ 75.606186] bda0: 8120af00 00000000 9f54bdec 1c9f5d92 8014317c 9f6a601c 9f6a6010 00000000
[ 75.613076] bdc0: 00000000 00000000 9dd1141c 8125a0b4 9f54be0c 9f54bde0 804cd8a8 804cd56c
[ 75.619966] bde0: 0000000e 80143680 00000001 9dce9c1c 81203048 9dce9c10 00000003 00000000
[ 75.626858] be00: 9f54be5c 9f54be10 806abcac 804cd864 9f54be54 80143664 8014317c 80143054
[ 75.633748] be20: ffcaa81d 00000000 812030b0 1c9f5d92 00000000 81203048 9f54beb4 00000003
[ 75.640639] be40: ffffffff 00000000 9dd1141c 8125a0b4 9f54be84 9f54be60 80138e98 806abb18
[ 75.647529] be60: 81203048 9ddc4000 9dce9c54 9f72a300 00000000 00000000 9f54be9c 9f54be88
[ 75.654420] be80: 801390bc 80138e50 00000000 9dce9c54 9f54beac 9f54bea0 806a9524 801390a0
[ 75.661310] bea0: 9f54bedc 9f54beb0 806a9c7c 806a950c 9f54becc 00000000 00000000 00000000
[ 75.668201] bec0: 9f540000 1c9f5d92 805fe604 9ddffc00 9f54befc 9f54bee0 806ab228 806a9c38
[ 75.675092] bee0: 806ab178 9ddffc00 9f4c1900 9f40d200 9f54bf34 9f54bf00 80131e30 806ab184
[ 75.681983] bf00: 9f40d214 9f54a038 9f40d200 9f40d200 9f4c1918 812119a0 9f40d214 9f54a038
[ 75.688873] bf20: 9f40d200 9f4c1900 9f54bf7c 9f54bf38 80132124 80131d1c 9f5f2dd8 00000000
[ 75.695764] bf40: 812119a0 9f54a038 812119a0 81259c5b 9f5f2dd8 9f5f2dc0 9f53dbc0 00000000
[ 75.702655] bf60: 9f4c1900 801320b4 9f5f2dd8 9f4f7e88 9f54bfac 9f54bf80 80137ad0 801320c0
[ 75.709544] bf80: 9f54a000 9f53dbc0 801379a0 00000000 00000000 00000000 00000000 00000000
[ 75.716434] bfa0: 00000000 9f54bfb0 801010e8 801379ac 00000000 00000000 00000000 00000000
[ 75.723324] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 75.730206] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 75.737083] Backtrace:
[ 75.738252] [<804cd560>] (mv88e6xxx_port_db_load_purge) from [<804cd8a8>] (mv88e6xxx_port_fdb_add+0x50/0x68)
[ 75.746795] r10:8125a0b4 r9:9dd1141c r8:00000000 r7:00000000 r6:00000000 r5:9f6a6010
[ 75.753323] r4:9f6a601c
[ 75.754570] [<804cd858>] (mv88e6xxx_port_fdb_add) from [<806abcac>] (dsa_switch_event+0x1a0/0x660)
[ 75.762238] r8:00000000 r7:00000003 r6:9dce9c10 r5:81203048 r4:9dce9c1c
[ 75.767655] [<806abb0c>] (dsa_switch_event) from [<80138e98>] (notifier_call_chain+0x54/0x94)
[ 75.774893] r10:8125a0b4 r9:9dd1141c r8:00000000 r7:ffffffff r6:00000003 r5:9f54beb4
[ 75.781423] r4:81203048
[ 75.782672] [<80138e44>] (notifier_call_chain) from [<801390bc>] (raw_notifier_call_chain+0x28/0x30)
[ 75.790514] r9:00000000 r8:00000000 r7:9f72a300 r6:9dce9c54 r5:9ddc4000 r4:81203048
[ 75.796982] [<80139094>] (raw_notifier_call_chain) from [<806a9524>] (dsa_port_notify+0x24/0x38)
[ 75.804483] [<806a9500>] (dsa_port_notify) from [<806a9c7c>] (dsa_port_fdb_add+0x50/0x6c)
[ 75.811371] [<806a9c2c>] (dsa_port_fdb_add) from [<806ab228>] (dsa_slave_switchdev_event_work+0xb0/0x10c)
[ 75.819635] r4:9ddffc00
[ 75.820885] [<806ab178>] (dsa_slave_switchdev_event_work) from [<80131e30>] (process_one_work+0x120/0x3a4)
[ 75.829241] r6:9f40d200 r5:9f4c1900 r4:9ddffc00 r3:806ab178
[ 75.833612] [<80131d10>] (process_one_work) from [<80132124>] (worker_thread+0x70/0x574)
[ 75.840415] r10:9f4c1900 r9:9f40d200 r8:9f54a038 r7:9f40d214 r6:812119a0 r5:9f4c1918
[ 75.846945] r4:9f40d200
[ 75.848191] [<801320b4>] (worker_thread) from [<80137ad0>] (kthread+0x130/0x160)
[ 75.854300] r10:9f4f7e88 r9:9f5f2dd8 r8:801320b4 r7:9f4c1900 r6:00000000 r5:9f53dbc0
[ 75.860830] r4:9f5f2dc0
[ 75.862076] [<801379a0>] (kthread) from [<801010e8>] (ret_from_fork+0x14/0x2c)
[ 75.867999] Exception stack(0x9f54bfb0 to 0x9f54bff8)
[ 75.871753] bfa0: 00000000 00000000 00000000 00000000
[ 75.878640] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 75.885519] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 75.890844] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:801379a0
[ 75.897377] r4:9f53dbc0 r3:9f54a000
[ 75.899663] Code: e3a02000 e3a03000 e14b26f4 e24bc055 (e5973000)
[ 75.904575] ---[ end trace fbca818a124dbf0d ]---
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit be47e41d77fb ("tipc: fix use-after-free in tipc_nametbl_stop")
we fixed a problem caused by premature release of service range items.
That fix is correct, and solved the problem. However, it doesn't address
the root of the problem, which is that we don't lookup the tipc_service
-> service_range -> publication items in the correct hierarchical
order.
In this commit we try to make this right, and as a side effect obtain
some code simplification.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|