summaryrefslogtreecommitdiff
path: root/net/sctp
AgeCommit message (Collapse)Author
2018-11-19sctp: not increase stream's incnt before sending addstrm_in requestXin Long
Different from processing the addstrm_out request, The receiver handles an addstrm_in request by sending back an addstrm_out request to the sender who will increase its stream's in and incnt later. Now stream->incnt has been increased since it sent out the addstrm_in request in sctp_send_add_streams(), with the wrong stream->incnt will even cause crash when copying stream info from the old stream's in to the new one's in sctp_process_strreset_addstrm_out(). This patch is to fix it by simply removing the stream->incnt change from sctp_send_add_streams(). Fixes: 242bd2d519d7 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19Revert "sctp: remove sctp_transport_pmtu_check"Xin Long
This reverts commit 22d7be267eaa8114dcc28d66c1c347f667d7878a. The dst's mtu in transport can be updated by a non sctp place like in xfrm where the MTU information didn't get synced between asoc, transport and dst, so it is still needed to do the pmtu check in sctp_packet_config. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: not allow to set asoc prsctp_enable by sockoptXin Long
As rfc7496#section4.5 says about SCTP_PR_SUPPORTED: This socket option allows the enabling or disabling of the negotiation of PR-SCTP support for future associations. For existing associations, it allows one to query whether or not PR-SCTP support was negotiated on a particular association. It means only sctp sock's prsctp_enable can be set. Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux sctp in another patchset. v1->v2: - drop the params.assoc_id check as Neil suggested. Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmitXin Long
Now sctp increases sk_wmem_alloc by 1 when doing set_owner_w for the skb allocked in sctp_packet_transmit and decreases by 1 when freeing this skb. But when this skb goes through networking stack, some subcomponents might change skb->truesize and add the same amount on sk_wmem_alloc. However sctp doesn't know the amount to decrease by, it would cause a leak on sk->sk_wmem_alloc and the sock can never be freed. Xiumei found this issue when it hit esp_output_head() by using sctp over ipsec, where skb->truesize is added and so is sk->sk_wmem_alloc. Since sctp has used sk_wmem_queued to count for writable space since Commit cd305c74b0f8 ("sctp: use sk_wmem_queued to check for writable space"), it's ok to fix it by counting sk_wmem_alloc by skb truesize in sctp_packet_transmit. Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: add sockopt SCTP_EVENTXin Long
This patch adds sockopt SCTP_EVENT described in rfc6525#section-6.2. With this sockopt users can subscribe to an event from a specified asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: rename enum sctp_event to sctp_event_typeXin Long
sctp_event is a structure name defined in RFC for sockopt SCTP_EVENT. To avoid the conflict, rename it. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: add subscribe per asocXin Long
The member subscribe should be per asoc, so that sockopt SCTP_EVENT in the next patch can subscribe a event from one asoc only. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19sctp: define subscribe in sctp_sock as __u16Xin Long
The member subscribe in sctp_sock is used to indicate to which of the events it is subscribed, more like a group of flags. So it's better to be defined as __u16 (2 bytpes), instead of struct sctp_event_subscribe (13 bytes). Note that sctp_event_subscribe is an UAPI struct, used on sockopt calls, and thus it will not be removed. This patch only changes the internal storage of the flags. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12sctp: process sk_reuseport in sctp_get_port_localXin Long
When socks' sk_reuseport is set, the same port and address are allowed to be bound into these socks who have the same uid. Note that the difference from sk_reuse is that it allows multiple socks to listen on the same port and address. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12sctp: add sock_reuseport for the sock in __sctp_hash_endpointXin Long
This is a part of sk_reuseport support for sctp. It defines a helper sctp_bind_addrs_check() to check if the bind_addrs in two socks are matched. It will add sock_reuseport if they are completely matched, and return err if they are partly matched, and alloc sock_reuseport if all socks are not matched at all. It will work until sk_reuseport support is added in sctp_get_port_local() in the next patch. v1->v2: - use 'laddr->valid && laddr2->valid' check instead as Marcelo pointed in sctp_bind_addrs_check(). Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-12sctp: do reuseport_select_sock in __sctp_rcv_lookup_endpointXin Long
This is a part of sk_reuseport support for sctp, and it selects a sock by the hashkey of lport, paddr and dport by default. It will work until sk_reuseport support is added in sctp_get_port_local() in the next patch. v1->v2: - define lport as __be16 instead of __be32 as Marcelo pointed in __sctp_rcv_lookup_endpoint(). Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-10sctp: Fix SKB list traversal in sctp_intl_store_ordered().David S. Miller
Same change as made to sctp_intl_store_reasm(). To be fully correct, an iterator has an undefined value when something like skb_queue_walk() naturally terminates. This will actually matter when SKB queues are converted over to list_head. Formalize what this code ends up doing with the current implementation. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-10sctp: Fix SKB list traversal in sctp_intl_store_reasm().David S. Miller
To be fully correct, an iterator has an undefined value when something like skb_queue_walk() naturally terminates. This will actually matter when SKB queues are converted over to list_head. Formalize what this code ends up doing with the current implementation. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-08net: Convert protocol error handlers from void to intStefano Brivio
We'll need this to handle ICMP errors for tunnels without a sending socket (i.e. FoU and GUE). There, we might have to look up different types of IP tunnels, registered as network protocols, before we get a match, so we want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both inet_protos and inet6_protos. These error codes will be used in the next patch. For consistency, return sensible error codes in protocol error handlers whenever handlers can't handle errors because, even if valid, they don't match a protocol or any of its states. This has no effect on existing error handling paths. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03sctp: define SCTP_SS_DEFAULT for Stream schedulersXin Long
According to rfc8260#section-4.3.2, SCTP_SS_DEFAULT is required to defined as SCTP_SS_FCFS or SCTP_SS_RR. SCTP_SS_FCFS is used for SCTP_SS_DEFAULT's value in this patch. Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) BPF verifier fixes from Daniel Borkmann. 2) HNS driver fixes from Huazhong Tan. 3) FDB only works for ethernet devices, reject attempts to install FDB rules for others. From Ido Schimmel. 4) Fix spectre V1 in vhost, from Jason Wang. 5) Don't pass on-stack object to irq_set_affinity_hint() in mvpp2 driver, from Marc Zyngier. 6) Fix mlx5e checksum handling when RXFCS is enabled, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) openvswitch: Fix push/pop ethernet validation net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules bpf: test make sure to run unpriv test cases in test_verifier bpf: add various test cases to test_verifier bpf: don't set id on after map lookup with ptr_to_map_val return bpf: fix partial copy of map_ptr when dst is scalar libbpf: Fix compile error in libbpf_attach_type_by_name kselftests/bpf: use ping6 as the default ipv6 ping binary if it exists selftests: mlxsw: qos_mc_aware: Add a test for UC awareness selftests: mlxsw: qos_mc_aware: Tweak for min shaper mlxsw: spectrum: Set minimum shaper on MC TCs mlxsw: reg: QEEC: Add minimum shaper fields net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset() net: hns3: bugfix for rtnl_lock's range in the hclge_reset() net: hns3: bugfix for handling mailbox while the command queue reinitialized net: hns3: fix incorrect return value/type of some functions net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read net: hns3: bugfix for is_valid_csq_clean_head() net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring() net: hns3: bugfix for the initialization of command queue's spin lock ...
2018-10-31mm: remove include/linux/bootmem.hMike Rapoport
Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-29sctp: check policy more carefully when getting pr statusXin Long
When getting pr_assocstatus and pr_streamstatus by sctp_getsockopt, it doesn't correctly process the case when policy is set with SCTP_PR_SCTP_ALL | SCTP_PR_SCTP_MASK. It even causes a slab-out-of-bounds in sctp_getsockopt_pr_streamstatus(). This patch fixes it by return -EINVAL for this case. Fixes: 0ac1077e3a54 ("sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL") Reported-by: syzbot+5da0d0a72a9e7d791748@syzkaller.appspotmail.com Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-29sctp: clear the transport of some out_chunk_list chunks in sctp_assoc_rm_peerXin Long
If a transport is removed by asconf but there still are some chunks with this transport queuing on out_chunk_list, later an use-after-free issue will be caused when accessing this transport from these chunks in sctp_outq_flush(). This is an old bug, we fix it by clearing the transport of these chunks in out_chunk_list when removing a transport in sctp_assoc_rm_peer(). Reported-by: syzbot+56a40ceee5fb35932f4d@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
net/sched/cls_api.c has overlapping changes to a call to nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL to the 5th argument, and another (from 'net-next') added cb->extack instead of NULL to the 6th argument. net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to code which moved (to mr_table_dump)) in 'net-next'. Thanks to David Ahern for the heads up. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18sctp: use sk_wmem_queued to check for writable spaceXin Long
sk->sk_wmem_queued is used to count the size of chunks in out queue while sk->sk_wmem_alloc is for counting the size of chunks has been sent. sctp is increasing both of them before enqueuing the chunks, and using sk->sk_wmem_alloc to check for writable space. However, sk_wmem_alloc is also increased by 1 for the skb allocked for sending in sctp_packet_transmit() but it will not wake up the waiters when sk_wmem_alloc is decreased in this skb's destructor. If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf, the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf() will keep waiting if there's a skb allocked in sctp_packet_transmit, and later even if this skb got freed, the waiting thread will never get waked up. This issue has been there since very beginning, so we change to use sk->sk_wmem_queued to check for writable space as sk_wmem_queued is not increased for the skb allocked for sending, also as TCP does. SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto tuning which I will add in another patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk sizeXin Long
Now it's confusing that asoc sndbuf_used is doing memory accounting with SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk). It also causes sctp_prsctp_prune to count with a wrong freed memory when sndbuf_policy is not set. To make this right and also keep consistent between asoc sndbuf_used, sk sk_wmem_alloc and sk_wmem_queued, use skb->truesize + sizeof(sctp_chunk) for them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17sctp: not free the new asoc when sctp_wait_for_connect returns errXin Long
When sctp_wait_for_connect is called to wait for connect ready for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could be triggered if cpu is scheduled out and the new asoc is freed elsewhere, as it will return err and later the asoc gets freed again in sctp_sendmsg. [ 285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100) [ 285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0 [ 285.846193] Kernel panic - not syncing: panic_on_warn set ... [ 285.846193] [ 285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584 [ 285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 285.852164] Call Trace: ... [ 285.872210] ? __list_del_entry_valid+0x50/0xa0 [ 285.872894] sctp_association_free+0x42/0x2d0 [sctp] [ 285.873612] sctp_sendmsg+0x5a4/0x6b0 [sctp] [ 285.874236] sock_sendmsg+0x30/0x40 [ 285.874741] ___sys_sendmsg+0x27a/0x290 [ 285.875304] ? __switch_to_asm+0x34/0x70 [ 285.875872] ? __switch_to_asm+0x40/0x70 [ 285.876438] ? ptep_set_access_flags+0x2a/0x30 [ 285.877083] ? do_wp_page+0x151/0x540 [ 285.877614] __sys_sendmsg+0x58/0xa0 [ 285.878138] do_syscall_64+0x55/0x180 [ 285.878669] entry_SYSCALL_64_after_hwframe+0x44/0xa9 This is a similar issue with the one fixed in Commit ca3af4dd28cf ("sctp: do not free asoc when it is already dead in sctp_sendmsg"). But this one can't be fixed by returning -ESRCH for the dead asoc in sctp_wait_for_connect, as it will break sctp_connect's return value to users. This patch is to simply set err to -ESRCH before it returns to sctp_sendmsg when any err is returned by sctp_wait_for_connect for sp->strm_interleave, so that no asoc would be freed due to this. When users see this error, they will know the packet hasn't been sent. And it also makes sense to not free asoc because waiting connect fails, like the second call for sctp_wait_for_connect in sctp_sendmsg_to_asoc. Fixes: 668c9beb9020 ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-17sctp: fix race on sctp_id2asocMarcelo Ricardo Leitner
syzbot reported an use-after-free involving sctp_id2asoc. Dmitry Vyukov helped to root cause it and it is because of reading the asoc after it was freed: CPU 1 CPU 2 (working on socket 1) (working on socket 2) sctp_association_destroy sctp_id2asoc spin lock grab the asoc from idr spin unlock spin lock remove asoc from idr spin unlock free(asoc) if asoc->base.sk != sk ... [*] This can only be hit if trying to fetch asocs from different sockets. As we have a single IDR for all asocs, in all SCTP sockets, their id is unique on the system. An application can try to send stuff on an id that matches on another socket, and the if in [*] will protect from such usage. But it didn't consider that as that asoc may belong to another socket, it may be freed in parallel (read: under another socket lock). We fix it by moving the checks in [*] into the protected region. This fixes it because the asoc cannot be freed while the lock is held. Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-16sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL insteadXin Long
According to rfc7496 section 4.3 or 4.4: sprstat_policy: This parameter indicates for which PR-SCTP policy the user wants the information. It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. If SCTP_PR_SCTP_ALL is used, the counters provided are aggregated over all supported policies. We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is an error to use SCTP_PR_SCTP_NONE in sprstat_policy. " Fixes: 826d253d57b1 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") Fixes: d229d48d183f ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15sctp: use the pmtu from the icmp packet to update transport pathmtuXin Long
Other than asoc pmtu sync from all transports, sctp_assoc_sync_pmtu is also processing transport pmtu_pending by icmp packets. But it's meaningless to use sctp_dst_mtu(t->dst) as new pmtu for a transport. The right pmtu value should come from the icmp packet, and it would be saved into transport->mtu_info in this patch and used later when the pmtu sync happens in sctp_sendmsg_to_asoc or sctp_packet_config. Besides, without this patch, as pmtu can only be updated correctly when receiving a icmp packet and no place is holding sock lock, it will take long time if the sock is busy with sending packets. Note that it doesn't process transport->mtu_info in .release_cb(), as there is no enough information for pmtu update, like for which asoc or transport. It is not worth traversing all asocs to check pmtu_pending. So unlike tcp, sctp does this in tx path, for which mtu_info needs to be atomic_t. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net' overlapped the renaming of a netlink attribute in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03sctp: fix fall-through annotationGustavo A. R. Silva
Replace "fallthru" with a proper "fall through" annotation. This fix is part of the ongoing efforts to enabling -Wimplicit-fallthrough Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Version bump conflict in batman-adv, take what's in net-next. iavf conflict, adjustment of netdev_ops in net-next conflicting with poll controller method removal in net. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20sctp: update dst pmtu with the correct daddrXin Long
When processing pmtu update from an icmp packet, it calls .update_pmtu with sk instead of skb in sctp_transport_update_pmtu. However for sctp, the daddr in the transport might be different from inet_sock->inet_daddr or sk->sk_v6_daddr, which is used to update or create the route cache. The incorrect daddr will cause a different route cache created for the path. So before calling .update_pmtu, inet_sock->inet_daddr/sk->sk_v6_daddr should be updated with the daddr in the transport, and update it back after it's done. The issue has existed since route exceptions introduction. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.") Reported-by: ian.periam@dialogic.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10sctp: Use skb_queue_is_first().David S. Miller
Instead of direct skb_queue_head pointer accesses. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-03sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabelXin Long
When users set params.spp_address and get a trans, ipv6_flowlabel flag should be applied into this trans. But even if this one is not an ipv6 trans, it should not go to apply it into all other transes of the asoc but simply ignore it. Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-03sctp: fix invalid reference to the index variable of the iteratorXin Long
Now in sctp_apply_peer_addr_params(), if SPP_IPV6_FLOWLABEL flag is set and trans is NULL, it would use trans as the index variable to traverse transport_addr_list, then trans is set as the last transport of it. Later, if SPP_DSCP flag is set, it would enter into the wrong branch as trans is actually an invalid reference. So fix it by using a new index variable to traverse transport_addr_list for both SPP_DSCP and SPP_IPV6_FLOWLABEL flags process. Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-27sctp: remove useless start_fail from sctp_ht_iter in procXin Long
After changing rhashtable_walk_start to return void, start_fail would never be set other value than 0, and the checking for start_fail is pointless, so remove it. Fixes: 97a6ec4ac021 ("rhashtable: Change rhashtable_walk_start to return void") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-27sctp: hold transport before accessing its asoc in sctp_transport_get_nextXin Long
As Marcelo noticed, in sctp_transport_get_next, it is iterating over transports but then also accessing the association directly, without checking any refcnts before that, which can cause an use-after-free Read. So fix it by holding transport before accessing the association. With that, sctp_transport_hold calls can be removed in the later places. Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Reported-by: syzbot+fe62a0c9aa6a85c6de16@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-11net/sctp: Replace in/out stream arrays with flex_arrayKonstantin Khorenko
This path replaces physically contiguous memory arrays allocated using kmalloc_array() with flexible arrays. This enables to avoid memory allocation failures on the systems under a memory stress. Signed-off-by: Oleg Babin <obabin@virtuozzo.com> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-11net/sctp: Make wrappers for accessing in/out streamsKonstantin Khorenko
This patch introduces wrappers for accessing in/out streams indirectly. This will enable to replace physically contiguous memory arrays of streams with flexible arrays (or maybe any other appropriate mechanism) which do memory allocation on a per-page basis. Signed-off-by: Oleg Babin <obabin@virtuozzo.com> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-24sctp: whitespace fixesStephen Hemminger
Remove blank line at EOF and trailing whitespace. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linuxDavid S. Miller
All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: fix the issue that pathmtu may be set lower than MINSEGMENTXin Long
After commit b6c5734db070 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs"), sctp_transport_update_pmtu would refetch pathmtu from the dst and set it to transport's pathmtu without any check. The new pathmtu may be lower than MINSEGMENT if the dst is obsolete and updated by .get_dst() in sctp_transport_update_pmtu. In this case, it could have a smaller MTU as well, and thus we should validate it against MINSEGMENT instead. Syzbot reported a warning in sctp_mtu_payload caused by this. This patch refetches the pathmtu by calling sctp_dst_mtu where it does the check against MINSEGMENT. v1->v2: - refetch the pathmtu by calling sctp_dst_mtu instead as Marcelo's suggestion. Fixes: b6c5734db070 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs") Reported-by: syzbot+f0d9d7cba052f9344b03@syzkaller.appspotmail.com Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: check for ipv6_pinfo legal sndflow with flowlabel in sctp_v6_get_dstXin Long
The transport with illegal flowlabel should not be allowed to send packets. Other transport protocols already denies this. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: add support for setting flowlabel when adding a transportXin Long
Struct sockaddr_in6 has the member sin6_flowinfo that includes the ipv6 flowlabel, it should also support for setting flowlabel when adding a transport whose ipaddr is from userspace. Note that addrinfo in sctp_sendmsg is using struct in6_addr for the secondary addrs, which doesn't contain sin6_flowinfo, and it needs to copy sin6_flowinfo from the primary addr. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparamsXin Long
spp_ipv6_flowlabel and spp_dscp are added in sctp_paddrparams in this patch so that users could set sctp_sock/asoc/transport dscp and flowlabel with spp_flags SPP_IPV6_FLOWLABEL or SPP_DSCP by SCTP_PEER_ADDR_PARAMS , as described section 8.1.12 in RFC6458. As said in last patch, it uses '| 0x100000' or '|0x1' to mark flowlabel or dscp is set, so that their values could be set to 0. Note that to guarantee that an old app built with old kernel headers could work on the newer kernel, the param's check in sctp_g/setsockopt_peer_addr_params() is also improved, which follows the way that sctp_g/setsockopt_delayed_ack() or some other sockopts' process that accept two types of params does. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: add support for dscp and flowlabel per transportXin Long
Like some other per transport params, flowlabel and dscp are added in transport, asoc and sctp_sock. By default, transport sets its value from asoc's, and asoc does it from sctp_sock. flowlabel only works for ipv6 transport. Other than that they need to be passed down in sctp_xmit, flow4/6 also needs to set them before looking up route in get_dst. Note that it uses '& 0x100000' to check if flowlabel is set and '& 0x1' (tos 1st bit is unused) to check if dscp is set by users, so that they could be set to 0 by sockopt in next patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-03Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-29sctp: add support for SCTP_REUSE_PORT sockoptXin Long
This feature is actually already supported by sk->sk_reuse which can be set by socket level opt SO_REUSEADDR. But it's not working exactly as RFC6458 demands in section 8.1.27, like: - This option only supports one-to-one style SCTP sockets - This socket option must not be used after calling bind() or sctp_bindx(). Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs. Otherwise, the programs with SCTP_REUSE_PORT from other systems will not work in linux. To separate it from the socket level version, this patch adds 'reuse' in sctp_sock and it works pretty much as sk->sk_reuse, but with some extra setup limitations that are needed when it is being enabled. "It should be noted that the behavior of the socket-level socket option to reuse ports and/or addresses for SCTP sockets is unspecified", so it leaves SO_REUSEADDR as is for the compatibility. Note that the name SCTP_REUSE_PORT is somewhat confusing, as its functionality is nearly identical to SO_REUSEADDR, but with some extra restrictions. Here it uses 'reuse' in sctp_sock instead of 'reuseport'. As for sk->sk_reuseport support for SCTP, it will be added in another patch. Thanks to Neil to make this clear. v1->v2: - add sctp_sk->reuse to separate it from the socket level version. v2->v3: - improve changelog according to Marcelo's suggestion. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-28Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLLLinus Torvalds
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-22rhashtable: split rhashtable.hNeilBrown
Due to the use of rhashtables in net namespaces, rhashtable.h is included in lots of the kernel, so a small changes can required a large recompilation. This makes development painful. This patch splits out rhashtable-types.h which just includes the major type declarations, and does not include (non-trivial) inline code. rhashtable.h is no longer included by anything in the include/ directory. Common include files only include rhashtable-types.h so a large recompilation is only triggered when that changes. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-21sctp: fix erroneous inc of snmp SctpFragUsrMsgsMarcelo Ricardo Leitner
Currently it is incrementing SctpFragUsrMsgs when the user message size is of the exactly same size as the maximum fragment size, which is wrong. The fix is to increment it only when user message is bigger than the maximum fragment size. Fixes: bfd2e4b8734d ("sctp: refactor sctp_datamsg_from_user") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Various netfilter fixlets from Pablo and the netfilter team. 2) Fix regression in IPVS caused by lack of PMTU exceptions on local routes in ipv6, from Julian Anastasov. 3) Check pskb_trim_rcsum for failure in DSA, from Zhouyang Jia. 4) Don't crash on poll in TLS, from Daniel Borkmann. 5) Revert SO_REUSE{ADDR,PORT} change, it regresses various things including Avahi mDNS. From Bart Van Assche. 6) Missing of_node_put in qcom/emac driver, from Yue Haibing. 7) We lack checking of the TCP checking in one special case during SYN receive, from Frank van der Linden. 8) Fix module init error paths of mac80211 hwsim, from Johannes Berg. 9) Handle 802.1ad properly in stmmac driver, from Elad Nachman. 10) Must grab HW caps before doing quirk checks in stmmac driver, from Jose Abreu. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits) net: stmmac: Run HWIF Quirks after getting HW caps neighbour: skip NTF_EXT_LEARNED entries during forced gc net: cxgb3: add error handling for sysfs_create_group tls: fix waitall behavior in tls_sw_recvmsg tls: fix use-after-free in tls_push_record l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl() l2tp: reject creation of non-PPP sessions on L2TPv2 tunnels mlxsw: spectrum_switchdev: Fix port_vlan refcounting mlxsw: spectrum_router: Align with new route replace logic mlxsw: spectrum_router: Allow appending to dev-only routes ipv6: Only emit append events for appended routes stmmac: added support for 802.1ad vlan stripping cfg80211: fix rcu in cfg80211_unregister_wdev mac80211: Move up init of TXQs mac80211_hwsim: fix module init error paths cfg80211: initialize sinfo in cfg80211_get_station nl80211: fix some kernel doc tag mistakes hv_netvsc: Fix the variable sizes in ipsecv2 and rsc offload rds: avoid unenecessary cong_update in loop transport l2tp: clean up stale tunnel or session in pppol2tp_connect's error path ...