summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-05-17netfilter: nf_tables: remove old nf_log based tracingFlorian Westphal
nfnetlink tracing is available since nft 0.6 (June 2016). Remove old nf_log based tracing to avoid rule counter in main loop. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-17netfilter: ebtables: handle string from userspace with carePaolo Abeni
strlcpy() can't be safely used on a user-space provided string, as it can try to read beyond the buffer's end, if the latter is not NULL terminated. Leveraging the above, syzbot has been able to trigger the following splat: BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline] BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] BUG: KASAN: stack-out-of-bounds in ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] BUG: KASAN: stack-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504 CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 strlcpy include/linux/string.h:300 [inline] compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline] ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline] size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline] compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194 compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285 compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367 compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline] compat_nf_setsockopt+0x9b/0x140 net/netfilter/nf_sockopt.c:156 compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279 inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041 compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901 compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050 __compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403 __do_compat_sys_setsockopt net/compat.c:416 [inline] __se_compat_sys_setsockopt net/compat.c:413 [inline] __ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413 do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline] do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7fb3cb9 RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000 RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0x2fffc0000000000() raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Fix the issue replacing the unsafe function with strscpy() and taking care of possible errors. Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support") Reported-and-tested-by: syzbot+4e42a04e0bc33cb6c087@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-17netfilter: nf_tables: fix NULL pointer dereference on nft_ct_helper_obj_dump()Taehee Yoo
In the nft_ct_helper_obj_dump(), always priv->helper4 is dereferenced. But if family is ipv6, priv->helper6 should be dereferenced. Steps to reproduces: #test.nft table ip6 filter { ct helper ftp { type "ftp" protocol tcp } chain input { type filter hook input priority 4; ct helper set "ftp" } } %nft -f test.nft %nft list ruleset we can see the below messages: [ 916.286233] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 916.294777] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 916.302613] Modules linked in: nft_objref nf_conntrack_sip nf_conntrack_snmp nf_conntrack_broadcast nf_conntrack_ftp nft_ct nf_conntrack nf_tables nfnetlink [last unloaded: nfnetlink] [ 916.318758] CPU: 1 PID: 2093 Comm: nft Not tainted 4.17.0-rc4+ #181 [ 916.326772] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [ 916.338773] RIP: 0010:strlen+0x1a/0x90 [ 916.342781] RSP: 0018:ffff88010ff0f2f8 EFLAGS: 00010292 [ 916.346773] RAX: dffffc0000000000 RBX: ffff880119b26ee8 RCX: ffff88010c150038 [ 916.354777] RDX: 0000000000000002 RSI: ffff880119b26ee8 RDI: 0000000000000010 [ 916.362773] RBP: 0000000000000010 R08: 0000000000007e88 R09: ffff88010c15003c [ 916.370773] R10: ffff88010c150037 R11: ffffed002182a007 R12: ffff88010ff04040 [ 916.378779] R13: 0000000000000010 R14: ffff880119b26f30 R15: ffff88010ff04110 [ 916.387265] FS: 00007f57a1997700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000 [ 916.394785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 916.402778] CR2: 00007f57a0ac80f0 CR3: 000000010ff02000 CR4: 00000000001006e0 [ 916.410772] Call Trace: [ 916.414787] nft_ct_helper_obj_dump+0x94/0x200 [nft_ct] [ 916.418779] ? nft_ct_set_eval+0x560/0x560 [nft_ct] [ 916.426771] ? memset+0x1f/0x40 [ 916.426771] ? __nla_reserve+0x92/0xb0 [ 916.434774] ? memcpy+0x34/0x50 [ 916.434774] nf_tables_fill_obj_info+0x484/0x860 [nf_tables] [ 916.442773] ? __nft_release_basechain+0x600/0x600 [nf_tables] [ 916.450779] ? lock_acquire+0x193/0x380 [ 916.454771] ? lock_acquire+0x193/0x380 [ 916.458789] ? nf_tables_dump_obj+0x148/0xcb0 [nf_tables] [ 916.462777] nf_tables_dump_obj+0x5f0/0xcb0 [nf_tables] [ 916.470769] ? __alloc_skb+0x30b/0x500 [ 916.474779] netlink_dump+0x752/0xb50 [ 916.478775] __netlink_dump_start+0x4d3/0x750 [ 916.482784] nf_tables_getobj+0x27a/0x930 [nf_tables] [ 916.490774] ? nft_obj_notify+0x100/0x100 [nf_tables] [ 916.494772] ? nf_tables_getobj+0x930/0x930 [nf_tables] [ 916.502579] ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables] [ 916.506774] ? nft_obj_notify+0x100/0x100 [nf_tables] [ 916.514808] nfnetlink_rcv_msg+0x8ab/0xa86 [nfnetlink] [ 916.518771] ? nfnetlink_rcv_msg+0x550/0xa86 [nfnetlink] [ 916.526782] netlink_rcv_skb+0x23e/0x360 [ 916.530773] ? nfnetlink_bind+0x200/0x200 [nfnetlink] [ 916.534778] ? debug_check_no_locks_freed+0x280/0x280 [ 916.542770] ? netlink_ack+0x870/0x870 [ 916.546786] ? ns_capable_common+0xf4/0x130 [ 916.550765] nfnetlink_rcv+0x172/0x16c0 [nfnetlink] [ 916.554771] ? sched_clock_local+0xe2/0x150 [ 916.558774] ? sched_clock_cpu+0x144/0x180 [ 916.566575] ? lock_acquire+0x380/0x380 [ 916.570775] ? sched_clock_local+0xe2/0x150 [ 916.574765] ? nfnetlink_net_init+0x130/0x130 [nfnetlink] [ 916.578763] ? sched_clock_cpu+0x144/0x180 [ 916.582770] ? lock_acquire+0x193/0x380 [ 916.590771] ? lock_acquire+0x193/0x380 [ 916.594766] ? lock_acquire+0x380/0x380 [ 916.598760] ? netlink_deliver_tap+0x262/0xa60 [ 916.602766] ? lock_acquire+0x193/0x380 [ 916.606766] netlink_unicast+0x3ef/0x5a0 [ 916.610771] ? netlink_attachskb+0x630/0x630 [ 916.614763] netlink_sendmsg+0x72a/0xb00 [ 916.618769] ? netlink_unicast+0x5a0/0x5a0 [ 916.626766] ? _copy_from_user+0x92/0xc0 [ 916.630773] __sys_sendto+0x202/0x300 [ 916.634772] ? __ia32_sys_getpeername+0xb0/0xb0 [ 916.638759] ? lock_acquire+0x380/0x380 [ 916.642769] ? lock_acquire+0x193/0x380 [ 916.646761] ? finish_task_switch+0xf4/0x560 [ 916.650763] ? __schedule+0x582/0x19a0 [ 916.655301] ? __sched_text_start+0x8/0x8 [ 916.655301] ? up_read+0x1c/0x110 [ 916.655301] ? __do_page_fault+0x48b/0xaa0 [ 916.655301] ? entry_SYSCALL_64_after_hwframe+0x59/0xbe [ 916.655301] __x64_sys_sendto+0xdd/0x1b0 [ 916.655301] do_syscall_64+0x96/0x3d0 [ 916.655301] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 916.655301] RIP: 0033:0x7f57a0ff5e03 [ 916.655301] RSP: 002b:00007fff6367e0a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 916.655301] RAX: ffffffffffffffda RBX: 00007fff6367f1e0 RCX: 00007f57a0ff5e03 [ 916.655301] RDX: 0000000000000020 RSI: 00007fff6367e110 RDI: 0000000000000003 [ 916.655301] RBP: 00007fff6367e100 R08: 00007f57a0ce9160 R09: 000000000000000c [ 916.655301] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff6367e110 [ 916.655301] R13: 0000000000000020 R14: 00007f57a153c610 R15: 0000562417258de0 [ 916.655301] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 fa 53 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df 48 89 fd 48 83 ec 08 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f [ 916.655301] RIP: strlen+0x1a/0x90 RSP: ffff88010ff0f2f8 [ 916.771929] ---[ end trace 1065e048e72479fe ]--- [ 916.777204] Kernel panic - not syncing: Fatal exception [ 916.778158] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-05-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-05-17 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Provide a new BPF helper for doing a FIB and neighbor lookup in the kernel tables from an XDP or tc BPF program. The helper provides a fast-path for forwarding packets. The API supports IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are implemented in this initial work, from David (Ahern). 2) Just a tiny diff but huge feature enabled for nfp driver by extending the BPF offload beyond a pure host processing offload. Offloaded XDP programs are allowed to set the RX queue index and thus opening the door for defining a fully programmable RSS/n-tuple filter replacement. Once BPF decided on a queue already, the device data-path will skip the conventional RSS processing completely, from Jakub. 3) The original sockmap implementation was array based similar to devmap. However unlike devmap where an ifindex has a 1:1 mapping into the map there are use cases with sockets that need to be referenced using longer keys. Hence, sockhash map is added reusing as much of the sockmap code as possible, from John. 4) Introduce BTF ID. The ID is allocatd through an IDR similar as with BPF maps and progs. It also makes BTF accessible to user space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data through BPF_OBJ_GET_INFO_BY_FD, from Martin. 5) Enable BPF stackmap with build_id also in NMI context. Due to the up_read() of current->mm->mmap_sem build_id cannot be parsed. This work defers the up_read() via a per-cpu irq_work so that at least limited support can be enabled, from Song. 6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND JIT conversion as well as implementation of an optimized 32/64 bit immediate load in the arm64 JIT that allows to reduce the number of emitted instructions; in case of tested real-world programs they were shrinking by three percent, from Daniel. 7) Add ifindex parameter to the libbpf loader in order to enable BPF offload support. Right now only iproute2 can load offloaded BPF and this will also enable libbpf for direct integration into other applications, from David (Beckett). 8) Convert the plain text documentation under Documentation/bpf/ into RST format since this is the appropriate standard the kernel is moving to for all documentation. Also add an overview README.rst, from Jesper. 9) Add __printf verification attribute to the bpf_verifier_vlog() helper. Though it uses va_list we can still allow gcc to check the format string, from Mathieu. 10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...' is a bash 4.0+ feature which is not guaranteed to be available when calling out to shell, therefore use a more portable variant, from Joe. 11) Fix a 64 bit division in xdp_umem_reg() by using div_u64() instead of relying on the gcc built-in, from Björn. 12) Fix a sock hashmap kmalloc warning reported by syzbot when an overly large key size is used in hashmap then causing overflows in htab->elem_size. Reject bogus attr->key_size early in the sock_hash_alloc(), from Yonghong. 13) Ensure in BPF selftests when urandom_read is being linked that --build-id is always enabled so that test_stacktrace_build_id[_nmi] won't be failing, from Alexei. 14) Add bitsperlong.h as well as errno.h uapi headers into the tools header infrastructure which point to one of the arch specific uapi headers. This was needed in order to fix a build error on some systems for the BPF selftests, from Sirio. 15) Allow for short options to be used in the xdp_monitor BPF sample code. And also a bpf.h tools uapi header sync in order to fix a selftest build failure. Both from Prashant. 16) More formally clarify the meaning of ID in the direct packet access section of the BPF documentation, from Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/sched: fix refcnt leak in the error path of tcf_vlan_init()Davide Caratti
Similarly to what was done with commit a52956dfc503 ("net sched actions: fix refcnt leak in skbmod"), fix the error path of tcf_vlan_init() to avoid refcnt leaks when wrong value of TCA_VLAN_PUSH_VLAN_PROTOCOL is given. Fixes: 5026c9b1bafc ("net sched: vlan action fix late binding") CC: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16sched: manipulate __QDISC_STATE_RUNNING in qdisc_run_* helpersPaolo Abeni
Currently NOLOCK qdiscs pay a measurable overhead to atomically manipulate the __QDISC_STATE_RUNNING. Such bit is flipped twice per packet in the uncontended scenario with packet rate below the line rate: on packed dequeue and on the next, failing dequeue attempt. This changeset moves the bit manipulation into the qdisc_run_{begin,end} helpers, so that the bit is now flipped only once per packet, with measurable performance improvement in the uncontended scenario. This also allows simplifying the qdisc teardown code path - since qdisc_is_running() is now effective for each qdisc type - and avoid a possible race between qdisc_run() and dev_deactivate_many(), as now the some_qdisc_is_busy() can properly detect NOLOCK qdiscs being busy dequeuing packets. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16tcp: purge write queue in tcp_connect_init()Eric Dumazet
syzkaller found a reliable way to crash the host, hitting a BUG() in __tcp_retransmit_skb() Malicous MSG_FASTOPEN is the root cause. We need to purge write queue in tcp_connect_init() at the point we init snd_una/write_seq. This patch also replaces the BUG() by a less intrusive WARN_ON_ONCE() kernel BUG at net/ipv4/tcp_output.c:2837! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 5276 Comm: syz-executor0 Not tainted 4.17.0-rc3+ #51 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__tcp_retransmit_skb+0x2992/0x2eb0 net/ipv4/tcp_output.c:2837 RSP: 0000:ffff8801dae06ff8 EFLAGS: 00010206 RAX: ffff8801b9fe61c0 RBX: 00000000ffc18a16 RCX: ffffffff864e1a49 RDX: 0000000000000100 RSI: ffffffff864e2e12 RDI: 0000000000000005 RBP: ffff8801dae073a0 R08: ffff8801b9fe61c0 R09: ffffed0039c40dd2 R10: ffffed0039c40dd2 R11: ffff8801ce206e93 R12: 00000000421eeaad R13: ffff8801ce206d4e R14: ffff8801ce206cc0 R15: ffff8801cd4f4a80 FS: 0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000096bc900 CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 CR2: 0000000020000000 CR3: 00000001c47b6000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> tcp_retransmit_skb+0x2e/0x250 net/ipv4/tcp_output.c:2923 tcp_retransmit_timer+0xc50/0x3060 net/ipv4/tcp_timer.c:488 tcp_write_timer_handler+0x339/0x960 net/ipv4/tcp_timer.c:573 tcp_write_timer+0x111/0x1d0 net/ipv4/tcp_timer.c:593 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:525 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: check for pending terminationKarsten Graul
Avoid to run the processing in smc_lgr_terminate() more than once, remember when the link group termination is triggered. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: drop messages when link state is inactiveKarsten Graul
Drop incoming messages when the link is flagged as inactive. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: set link inactive before calling smc_lgr_free()Karsten Graul
Before smc_lgr_free() is called the link must be set inactive by calling smc_llc_link_inactive(). Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: handle all error codes from smc_conn_create()Karsten Graul
Always set a reason_code when smc_conn_create() returns an error code. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: use a workqueue to defer llc sendKarsten Graul
SMC handles deferred work in tasklets. As tasklets cannot sleep this can result in rare EBUSY conditions, so defer this work in a work queue. The high level api functions do not defer work because they can sleep until the llc send is actually completed. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: move link llc initialization to llc layerKarsten Graul
Move the llc layer specific initialization and cleanup out of smc_core.c into smc_llc.c (smc_llc_link_init and smc_llc_link_clear). Move all initialization of a link into the new init function. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: simplify test_link function usageKarsten Graul
Make smc_llc_send_test_link() static and remove it from the header file. And to send a test_link response set the response flag and send the message back as-is, without using smc_llc_send_test_link(). And because smc_llc_send_test_link() must no longer send responses, remove the response flag handling from the function. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: remove unnecessary castKarsten Graul
Remove an unneeded (void *) cast from the calls to smc_llc_send_message(). No functional changes. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: register new rmbs with the peerKarsten Graul
Register new rmb buffers with the remote peer by exchanging a confirm_rkey llc message. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16net/smc: no tx work trigger for fallback socketsUrsula Braun
If TCP_NODELAY is set or TCP_CORK is reset, setsockopt triggers the tx worker. This does not make sense, if the SMC socket switched to the TCP fallback when the connection is created. This patch adds the additional check for the fallback case. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-16isdn: replace ->proc_fops with ->proc_showChristoph Hellwig
And switch to proc_create_single_data. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16atm: switch to proc_create_seq_privateChristoph Hellwig
And remove proc boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16atm: simplify procfs codeChristoph Hellwig
Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_seq where applicable. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16bluetooth: switch to proc_create_seq_dataChristoph Hellwig
And use proc private data directly instead of doing a detour through seq->private and private state. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16netfilter/x_tables: switch to proc_create_seq_privateChristoph Hellwig
And remove proc boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16netfilter/xt_hashlimit: switch to proc_create_{seq,single}_dataChristoph Hellwig
And use proc private data directly instead of doing a detour through seq->private. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16neigh: switch to proc_create_seq_dataChristoph Hellwig
And use proc private data directly instead of doing a detour through seq->private. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16proc: introduce proc_create_net_singleChristoph Hellwig
Variant of proc_create_data that directly take a seq_file show callback and deals with network namespaces in ->open and ->release. All callers of proc_create + single_open_net converted over, and single_{open,release}_net are removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16proc: introduce proc_create_net{,_data}Christoph Hellwig
Variants of proc_create{,_data} that directly take a struct seq_operations and deal with network namespaces in ->open and ->release. All callers of proc_create + seq_open_net converted over, and seq_{open,release}_net are removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16netfilter/x_tables: simplify ѕeq_file codeChristoph Hellwig
Just use the address family from the proc private data instead of copying it into per-file data. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16net/kcm: simplify proc registrationChristoph Hellwig
Remove a couple indirections to make the code look like most other protocols. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16ipv6/flowlabel: simplify pid namespace lookupChristoph Hellwig
The code should be using the pid namespace from the procfs mount instead of trying to look it up during open. Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16ipv{4,6}/raw: simplify ѕeq_file codeChristoph Hellwig
Pass the hashtable to the proc private data instead of copying it into the per-file private data. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16ipv{4,6}/ping: simplify proc file creationChristoph Hellwig
Remove the pointless ping_seq_afinfo indirection and make the code look like most other protocols. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16ipv{4,6}/tcp: simplify procfs registrationChristoph Hellwig
Avoid most of the afinfo indirections and just call the proc helpers directly. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16ipv{4,6}/udp{,lite}: simplify proc registrationChristoph Hellwig
Remove a couple indirections to make the code look like most other protocols. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16proc: introduce proc_create_single{,_data}Christoph Hellwig
Variants of proc_create{,_data} that directly take a seq_file show callback and drastically reduces the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16proc: introduce proc_create_seq_privateChristoph Hellwig
Variant of proc_create_data that directly take a struct seq_operations argument + a private state size and drastically reduces the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16proc: introduce proc_create_seq{,_data}Christoph Hellwig
Variants of proc_create{,_data} that directly take a struct seq_operations argument and drastically reduces the boilerplate code in the callers. All trivial callers converted over. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-15bpf: sockmap, add hash map supportJohn Fastabend
Sockmap is currently backed by an array and enforces keys to be four bytes. This works well for many use cases and was originally modeled after devmap which also uses four bytes keys. However, this has become limiting in larger use cases where a hash would be more appropriate. For example users may want to use the 5-tuple of the socket as the lookup key. To support this add hash support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-15bpf: sockmap, refactor sockmap routines to work with hashmapJohn Fastabend
This patch only refactors the existing sockmap code. This will allow much of the psock initialization code path and bpf helper codes to work for both sockmap bpf map types that are backed by an array, the currently supported type, and the new hash backed bpf map type sockhash. Most the fallout comes from three changes, - Pushing bpf programs into an independent structure so we can use it from the htab struct in the next patch. - Generalizing helpers to use void *key instead of the hardcoded u32. - Instead of passing map/key through the metadata we now do the lookup inline. This avoids storing the key in the metadata which will be useful when keys can be longer than 4 bytes. We rename the sk pointers to sk_redir at this point as well to avoid any confusion between the current sk pointer and the redirect pointer sk_redir. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-14sctp: checkpatch fixupsMarcelo Ricardo Leitner
A collection of fixups from previous patches, left for later to not introduce unnecessary changes while moving code around. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: add asoc and packet to sctp_flush_ctxMarcelo Ricardo Leitner
Pre-compute these so the compiler won't reload them (due to no-strict-aliasing). Changes since v2: - Do not replace a return with a break in sctp_outq_flush_data Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: add sctp_flush_ctx, a context struct on outq_flush routinesMarcelo Ricardo Leitner
With this struct we avoid passing lots of variables around and taking care of updating the current transport/packet. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: rework switch cases in sctp_outq_flush_dataMarcelo Ricardo Leitner
Remove an inner one, which tended to be error prone due to the cascading and it can be replaced by a simple if (). Rework the outer one so that the actual flush code is not inside it. Now we first validate if we can or cannot send data, return if not, and then the flush code. Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: make use of gfp on retransmissionsMarcelo Ricardo Leitner
Retransmissions may be triggered when in user context, so lets make use of gfp. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: move transport flush code out of sctp_outq_flushMarcelo Ricardo Leitner
To the new sctp_outq_flush_transports. Comment on Nagle is outdated and removed. Nagle is performed earlier, while checking if the chunk fits the packet: if the outq length is not enough to fill the packet, it returns SCTP_XMIT_DELAY. So by when it gets to sctp_outq_flush_transports, it has to go through all enlisted transports. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: move flushing of data chunks out of sctp_outq_flushMarcelo Ricardo Leitner
To the new sctp_outq_flush_data. Again, smaller functions and with well defined objectives. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: move outq data rtx code out of sctp_outq_flushMarcelo Ricardo Leitner
This patch renames current sctp_outq_flush_rtx to __sctp_outq_flush_rtx and create a new sctp_outq_flush_rtx, with the code that was on sctp_outq_flush. Again, the idea is to have functions with small and defined objectives. Yes, there is an open-coded path selection in the now sctp_outq_flush_rtx. That is kept as is for now because it may be very different when we implement retransmission path selection algorithms for CMT-SCTP. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: move the flush of ctrl chunks into its own functionMarcelo Ricardo Leitner
Named sctp_outq_flush_ctrl and, with that, keep the contexts contained. One small fix embedded is the reset of one_packet at every iteration. This allows bundling of some control chunks in case they were preceeded by another control chunk that cannot be bundled. Other than this, it has the same behavior. Changes since v2: - Fixed panic reported by kbuild test robot if building with only up to this patch applied, due to bad parameter to sctp_outq_select_transport and by not initializing packet after calling sctp_outq_flush_ctrl. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: factor out sctp_outq_select_transportMarcelo Ricardo Leitner
We had two spots doing such complex operation and they were very close to each other, a bit more tailored to here or there. This patch unifies these under the same function, sctp_outq_select_transport, which knows how to handle control chunks and original transmissions (but not retransmissions). Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14sctp: add sctp_packet_singletonMarcelo Ricardo Leitner
Factor out the code for generating singletons. It's used only once, but helps to keep the context contained. The const variables are to ease the reading of subsequent calls in there. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14audit: use inline function to get audit contextRichard Guy Briggs
Recognizing that the audit context is an internal audit value, use an access function to retrieve the audit context pointer for the task rather than reaching directly into the task struct to get it. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: merge fuzz in auditsc.c and selinuxfs.c, checkpatch.pl fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>