summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-01Merge branch 'uaccess.__copy_to_user' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/__copy_to_user updates from Al Viro: "Getting rid of __copy_to_user() callers - stuff that doesn't fit into other series" * 'uaccess.__copy_to_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dlmfs: convert dlmfs_file_read() to copy_to_user() esas2r: don't bother with __copy_to_user()
2020-06-01Merge branch 'uaccess.__copy_from_user' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/__copy_from_user updates from Al Viro: "Getting rid of __copy_from_user() callers - patches that don't fit into other series" * 'uaccess.__copy_from_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pstore: switch to copy_from_user() firewire: switch ioctl_queue_iso to use of copy_from_user()
2020-06-01Merge branch 'uaccess.__put_user' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/__put-user updates from Al Viro: "Removal of __put_user() calls - misc patches that don't fit into any other series" * 'uaccess.__put_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pcm_native: result of put_user() needs to be checked scsi_ioctl.c: switch SCSI_IOCTL_GET_IDLUN to copy_to_user() compat sysinfo(2): don't bother with field-by-field copyout
2020-06-01Merge branch 'uaccess.readdir' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/readdir updates from Al Viro: "Finishing the conversion of readdir.c to unsafe_... API. This includes the uaccess_{read,write}_begin series by Christophe Leroy" * 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: readdir.c: get rid of the last __put_user(), drop now-useless access_ok() readdir.c: get compat_filldir() more or less in sync with filldir() switch readdir(2) to unsafe_copy_dirent_name() drm/i915/gem: Replace user_access_begin by user_write_access_begin uaccess: Selectively open read or write user access uaccess: Add user_read_access_begin/end and user_write_access_begin/end
2020-06-01Merge branch 'uaccess.access_ok' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/access_ok updates from Al Viro: "Removals of trivially pointless access_ok() calls. Note: the fiemap stuff was removed from the series, since they are duplicates with part of ext4 series carried in Ted's tree" * 'uaccess.access_ok' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vmci_host: get rid of pointless access_ok() hfi1: get rid of pointless access_ok() usb: get rid of pointless access_ok() calls lpfc_debugfs: get rid of pointless access_ok() efi_test: get rid of pointless access_ok() drm_read(): get rid of pointless access_ok() via-pmu: don't bother with access_ok() drivers/crypto/ccp/sev-dev.c: get rid of pointless access_ok() omapfb: get rid of pointless access_ok() calls amifb: get rid of pointless access_ok() calls drivers/fpga/dfl-afu-dma-region.c: get rid of pointless access_ok() drivers/fpga/dfl-fme-pr.c: get rid of pointless access_ok() cm4000_cs.c cmm_ioctl(): get rid of pointless access_ok() nvram: drop useless access_ok() n_hdlc_tty_read(): remove pointless access_ok() tomoyo_write_control(): get rid of pointless access_ok() btrfs_ioctl_send(): don't bother with access_ok() fat_dir_ioctl(): hadn't needed that access_ok() for more than a decade... dlmfs_file_write(): get rid of pointless access_ok()
2020-06-01Merge branch 'uaccess.csum' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess/csum updates from Al Viro: "Regularize the sitation with uaccess checksum primitives: - fold csum_partial_... into csum_and_copy_..._user() - on x86 collapse several access_ok()/stac()/clac() into user_access_begin()/user_access_end()" * 'uaccess.csum' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: default csum_and_copy_to_user(): don't bother with access_ok() take the dummy csum_and_copy_from_user() into net/checksum.h arm: switch to csum_and_copy_from_user() sh32: convert to csum_and_copy_from_user() m68k: convert to csum_and_copy_from_user() xtensa: switch to providing csum_and_copy_from_user() sparc: switch to providing csum_and_copy_from_user() parisc: turn csum_partial_copy_from_user() into csum_and_copy_from_user() alpha: turn csum_partial_copy_from_user() into csum_and_copy_from_user() ia64: turn csum_partial_copy_from_user() into csum_and_copy_from_user() ia64: csum_partial_copy_nocheck(): don't abuse csum_partial_copy_from_user() x86: switch 32bit csum_and_copy_to_user() to user_access_{begin,end}() x86: switch both 32bit and 64bit to providing csum_and_copy_from_user() x86_64: csum_..._copy_..._user(): switch to unsafe_..._user() get rid of csum_partial_copy_to_user()
2020-06-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-06-01 The following pull-request contains BPF updates for your *net-next* tree. We've added 55 non-merge commits during the last 1 day(s) which contain a total of 91 files changed, 4986 insertions(+), 463 deletions(-). The main changes are: 1) Add rx_queue_mapping to bpf_sock from Amritha. 2) Add BPF ring buffer, from Andrii. 3) Attach and run programs through devmap, from David. 4) Allow SO_BINDTODEVICE opt in bpf_setsockopt, from Ferenc. 5) link based flow_dissector, from Jakub. 6) Use tracing helpers for lsm programs, from Jiri. 7) Several sk_msg fixes and extensions, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()Jules Irenge
Sparse reports a warning at efx_ef10_try_update_nic_stats_vf() warning: context imbalance in efx_ef10_try_update_nic_stats_vf() - unexpected unlock The root cause is the missing annotation at efx_ef10_try_update_nic_stats_vf() Add the missing _must_hold(&efx->stats_lock) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01crypto/chtls: IPv6 support for inline TLSVinay Kumar Yadav
Extends support to IPv6 for Inline TLS server. Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> v1->v2: - cc'd tcp folks. v2->v3: - changed EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL() Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01Merge branch 'chelsio-crypto-fixes'David S. Miller
Ayush Sawal says: ==================== Fixing compilation warnings and errors Patch 1: Fixes the warnings seen when compiling using sparse tool. Patch 2: Fixes a cocci check error introduced after commit 567be3a5d227 ("crypto: chelsio - Use multiple txq/rxq per tfm to process the requests"). V1->V2 patch1: Avoid type casting by using get_unaligned_be32() and put_unaligned_be16/32() functions. patch2: Modified subject of the patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01Crypto/chcr: Fixes a coccinile check errorAyush Sawal
This fixes an error observed after running coccinile check. drivers/crypto/chelsio/chcr_algo.c:1462:5-8: Unneeded variable: "err". Return "0" on line 1480 This line is missed in the commit 567be3a5d227 ("crypto: chelsio - Use multiple txq/rxq per tfm to process the requests"). Fixes: 567be3a5d227 ("crypto: chelsio - Use multiple txq/rxq per tfm to process the requests"). V1->V2 -Modified subject. Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01Crypto/chcr: Fixes compilations warningsAyush Sawal
This patch fixes the compilation warnings displayed by sparse tool for chcr driver. V1->V2 Avoid type casting by using get_unaligned_be32() and put_unaligned_be16/32() functions. The key which comes from stack is an u8 byte stream so we store it in an unsigned char array(ablkctx->key). The function get_aes_decrypt_key() is a used to calculate the reverse round key for decryption, for this operation the key has to be divided into 4 bytes, so to extract 4 bytes from an u8 byte stream and store it in an u32 variable, get_aligned_be32() is used. Similarly for copying back the key from u32 variable to the original u8 key stream, put_aligned_be32() is used. Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01crypto/chcr: IPV6 code needs to be in CONFIG_IPV6Rohit Maheshwari
Error messages seen while building kernel with CONFIG_IPV6 disabled. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01cxgb4/chcr: Enable ktls settings at run timeRohit Maheshwari
Current design enables ktls setting from start, which is not efficient. Now the feature will be enabled when user demands TLS offload on any interface. v1->v2: - taking ULD module refcount till any single connection exists. - taking rtnl_lock() before clearing tls_devops. v2->v3: - cxgb4 is now registering to tlsdev_ops. - module refcount inc/dec in chcr. - refcount is only for connections. - removed new code from cxgb_set_feature(). v3->v4: - fixed warning message. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01ipv6: fix IPV6_ADDRFORM operation logicHangbin Liu
Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present. Previously the checking logic looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol != IPPROTO_TCP) break; After commit b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP was blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; break; else break; Then after commit 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") UDP/UDPLITE were blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; if (sk->sk_protocol == IPPROTO_TCP) do_some_check; if (sk->sk_protocol != IPPROTO_TCP) break; Fix it by using Eric's code and simply remove the break in TCP check, which looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; else break; Fixes: 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01Merge tag 'docs-5.8' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation updates from Jonathan Corbet: "A fair amount of stuff this time around, dominated by yet another massive set from Mauro toward the completion of the RST conversion. I *really* hope we are getting close to the end of this. Meanwhile, those patches reach pretty far afield to update document references around the tree; there should be no actual code changes there. There will be, alas, more of the usual trivial merge conflicts. Beyond that we have more translations, improvements to the sphinx scripting, a number of additions to the sysctl documentation, and lots of fixes" * tag 'docs-5.8' of git://git.lwn.net/linux: (130 commits) Documentation: fixes to the maintainer-entry-profile template zswap: docs/vm: Fix typo accept_threshold_percent in zswap.rst tracing: Fix events.rst section numbering docs: acpi: fix old http link and improve document format docs: filesystems: add info about efivars content Documentation: LSM: Correct the basic LSM description mailmap: change email for Ricardo Ribalda docs: sysctl/kernel: document unaligned controls Documentation: admin-guide: update bug-hunting.rst docs: sysctl/kernel: document ngroups_max nvdimm: fixes to maintainter-entry-profile Documentation/features: Correct RISC-V kprobes support entry Documentation/features: Refresh the arch support status files Revert "docs: sysctl/kernel: document ngroups_max" docs: move locking-specific documents to locking/ docs: move digsig docs to the security book docs: move the kref doc into the core-api book docs: add IRQ documentation at the core-api book docs: debugging-via-ohci1394.txt: add it to the core-api book docs: fix references for ipmi.rst file ...
2020-06-01Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: - remove a now unnecessary usage of the KERNEL_DS for sys_oabi_epoll_ctl() - update my email address in a number of drivers - decompressor EFI updates from Ard Biesheuvel - module unwind section handling updates - sparsemem Kconfig cleanups - make act_mm macro respect THREAD_SIZE * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8980/1: Allow either FLATMEM or SPARSEMEM on the multiplatform build ARM: 8979/1: Remove redundant ARCH_SPARSEMEM_DEFAULT setting ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE ARM: decompressor: run decompressor in place if loaded via UEFI ARM: decompressor: move GOT into .data for EFI enabled builds ARM: decompressor: defer loading of the contents of the LC0 structure ARM: decompressor: split off _edata and stack base into separate object ARM: decompressor: move headroom variable out of LC0 ARM: 8976/1: module: allow arch overrides for .init section names ARM: 8975/1: module: fix handling of unwind init sections ARM: 8974/1: use SPARSMEM_STATIC when SPARSEMEM is enabled ARM: 8971/1: replace the sole use of a symbol with its definition ARM: 8969/1: decompressor: simplify libfdt builds Update rmk's email address in various drivers ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()
2020-06-01tipc: Fix NULL pointer dereference in __tipc_sendstream()YueHaibing
tipc_sendstream() may send zero length packet, then tipc_msg_append() do not alloc skb, skb_peek_tail() will get NULL, msg_set_ack_required will trigger NULL pointer dereference. Reported-by: syzbot+8eac6d030e7807c21d32@syzkaller.appspotmail.com Fixes: 0a3e060f340d ("tipc: add test for Nagle algorithm effectiveness") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-01Merge branch 'Link-based-attach-to-netns'Alexei Starovoitov
Jakub Sitnicki says: ==================== One of the pieces of feedback from recent review of BPF hooks for socket lookup [0] was that new program types should use bpf_link-based attachment. This series introduces new bpf_link type for attaching to network namespace. All link operations are supported. Errors returned from ops follow cgroup example. Patch 4 description goes into error semantics. The major change in v2 is a switch away from RCU to mutex-only synchronization. Andrii pointed out that it is not needed, and it makes sense to keep locking straightforward. Also, there were a couple of bugs in update_prog and fill_info initial implementation, one picked up by kbuild. Those are now fixed. Tests have been extended to cover them. Full changelog below. Series is organized as so: Patches 1-3 prepare a space in struct net to keep state for attached BPF programs, and massage the code in flow_dissector to make it attach type agnostic, to finally move it under kernel/bpf/. Patch 4, the most important one, introduces new bpf_link link type for attaching to network namespace. Patch 5 unifies the update error (ENOLINK) between BPF cgroup and netns. Patches 6-8 make libbpf and bpftool aware of the new link type. Patches 9-12 Add and extend tests to check that link low- and high-level API for operating on links to netns works as intended. Thanks to Alexei, Andrii, Lorenz, Marek, and Stanislav for feedback. -jkbs [0] https://lore.kernel.org/bpf/20200511185218.1422406-1-jakub@cloudflare.com/ Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Lorenz Bauer <lmb@cloudflare.com> Cc: Marek Majkowski <marek@cloudflare.com> Cc: Stanislav Fomichev <sdf@google.com> v1 -> v2: - Switch to mutex-only synchronization. Don't rely on RCU grace period guarantee when accessing struct net from link release / update / fill_info, and when accessing bpf_link from pernet pre_exit callback. (Andrii) - Drop patch 1, no longer needed with mutex-only synchronization. - Don't leak uninitialized variable contents from fill_info callback when link is in defunct state. (kbuild) - Make fill_info treat the link as defunct (i.e. no attached netns) when struct net refcount is 0, but link has not been yet auto-detached. - Add missing BPF_LINK_TYPE define in bpf_types.h for new link type. - Fix link update_prog callback to update the prog that will run, and not just the link itself. - Return EEXIST on prog attach when link already exists, and on link create when prog is already attached directly. (Andrii) - Return EINVAL on prog detach when link is attached. (Andrii) - Fold __netns_bpf_link_attach into its only caller. (Stanislav) - Get rid of a wrapper around container_of() (Andrii) - Use rcu_dereference_protected instead of rcu_access_pointer on update-side. (Stanislav) - Make return-on-success from netns_bpf_link_create less confusing. (Andrii) - Adapt bpf_link for cgroup to return ENOLINK when updating a defunct link. (Andrii, Alexei) - Order new exported symbols in libbpf.map alphabetically (Andrii) - Keep libbpf's "failed to attach link" warning message clear as to what we failed to attach to (cgroup vs netns). (Andrii) - Extract helpers for printing link attach type. (bpftool, Andrii) - Switch flow_dissector tests to BPF skeleton and extend them to exercise link-based flow dissector attachment. (Andrii) - Harden flow dissector attachment tests with prog query checks after prog attach/detach, or link create/update/close. - Extend flow dissector tests to cover fill_info for defunct links. - Rebase onto recent bpf-next ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01selftests/bpf: Extend test_flow_dissector to cover link creationJakub Sitnicki
Extend the existing flow_dissector test case to run tests once using direct prog attachments, and then for the second time using indirect attachment via link. The intention is to exercises the newly added high-level API for attaching programs to network namespace with links (bpf_program__attach_netns). Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-13-jakub@cloudflare.com
2020-06-01selftests/bpf: Convert test_flow_dissector to use BPF skeletonJakub Sitnicki
Switch flow dissector test setup from custom BPF object loader to BPF skeleton to save boilerplate and prepare for testing higher-level API for attaching flow dissector with bpf_link. To avoid depending on program order in the BPF object when populating the flow dissector PROG_ARRAY map, change the program section names to contain the program index into the map. This follows the example set by tailcall tests. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-12-jakub@cloudflare.com
2020-06-01selftests/bpf, flow_dissector: Close TAP device FD after the testJakub Sitnicki
test_flow_dissector leaves a TAP device after it's finished, potentially interfering with other tests that will run after it. Fix it by closing the TAP descriptor on cleanup. Fixes: 0905beec9f52 ("selftests/bpf: run flow dissector tests in skb-less mode") Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-11-jakub@cloudflare.com
2020-06-01selftests/bpf: Add tests for attaching bpf_link to netnsJakub Sitnicki
Extend the existing test case for flow dissector attaching to cover: - link creation, - link updates, - link info querying, - mixing links with direct prog attachment. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-10-jakub@cloudflare.com
2020-06-01bpftool: Support link show for netns-attached linksJakub Sitnicki
Make `bpf link show` aware of new link type, that is links attached to netns. When listing netns-attached links, display netns inode number as its identifier and link attach type. Sample session: # readlink /proc/self/ns/net net:[4026532251] # bpftool prog show 357: flow_dissector tag a04f5eef06a7f555 gpl loaded_at 2020-05-30T16:53:51+0200 uid 0 xlated 16B jited 37B memlock 4096B 358: flow_dissector tag a04f5eef06a7f555 gpl loaded_at 2020-05-30T16:53:51+0200 uid 0 xlated 16B jited 37B memlock 4096B # bpftool link show 108: netns prog 357 netns_ino 4026532251 attach_type flow_dissector # bpftool link -jp show [{ "id": 108, "type": "netns", "prog_id": 357, "netns_ino": 4026532251, "attach_type": "flow_dissector" } ] (... after netns is gone ...) # bpftool link show 108: netns prog 357 netns_ino 0 attach_type flow_dissector # bpftool link -jp show [{ "id": 108, "type": "netns", "prog_id": 357, "netns_ino": 0, "attach_type": "flow_dissector" } ] Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-9-jakub@cloudflare.com
2020-06-01bpftool: Extract helpers for showing link attach typeJakub Sitnicki
Code for printing link attach_type is duplicated in a couple of places, and likely will be duplicated for future link types as well. Create helpers to prevent duplication. Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-8-jakub@cloudflare.com
2020-06-01libbpf: Add support for bpf_link-based netns attachmentJakub Sitnicki
Add bpf_program__attach_nets(), which uses LINK_CREATE subcommand to create an FD-based kernel bpf_link, for attach types tied to network namespace, that is BPF_FLOW_DISSECTOR for the moment. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-7-jakub@cloudflare.com
2020-06-01bpf, cgroup: Return ENOLINK for auto-detached links on updateJakub Sitnicki
Failure to update a bpf_link because it has been auto-detached by a dying cgroup currently results in EINVAL error, even though the arguments passed to bpf() syscall are not wrong. bpf_links attaching to netns in this case will return ENOLINK, which carries the message that the link is no longer attached to anything. Change cgroup bpf_links to do the same to keep the uAPI errors consistent. Fixes: 0c991ebc8c69 ("bpf: Implement bpf_prog replacement for an active bpf_cgroup_link") Suggested-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-6-jakub@cloudflare.com
2020-06-01bpf: Add link-based BPF program attachment to network namespaceJakub Sitnicki
Extend bpf() syscall subcommands that operate on bpf_link, that is LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO, to accept attach types tied to network namespaces (only flow dissector at the moment). Link-based and prog-based attachment can be used interchangeably, but only one can exist at a time. Attempts to attach a link when a prog is already attached directly, and the other way around, will be met with -EEXIST. Attempts to detach a program when link exists result in -EINVAL. Attachment of multiple links of same attach type to one netns is not supported with the intention to lift the restriction when a use-case presents itself. Because of that link create returns -E2BIG when trying to create another netns link, when one already exists. Link-based attachments to netns don't keep a netns alive by holding a ref to it. Instead links get auto-detached from netns when the latter is being destroyed, using a pernet pre_exit callback. When auto-detached, link lives in defunct state as long there are open FDs for it. -ENOLINK is returned if a user tries to update a defunct link. Because bpf_link to netns doesn't hold a ref to struct net, special care is taken when releasing, updating, or filling link info. The netns might be getting torn down when any of these link operations are in progress. That is why auto-detach and update/release/fill_info are synchronized by the same mutex. Also, link ops have to always check if auto-detach has not happened yet and if netns is still alive (refcnt > 0). Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-5-jakub@cloudflare.com
2020-06-01flow_dissector: Move out netns_bpf prog callbacksJakub Sitnicki
Move functions to manage BPF programs attached to netns that are not specific to flow dissector to a dedicated module named bpf/net_namespace.c. The set of functions will grow with the addition of bpf_link support for netns attached programs. This patch prepares ground by creating a place for it. This is a code move with no functional changes intended. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-4-jakub@cloudflare.com
2020-06-01net: Introduce netns_bpf for BPF programs attached to netnsJakub Sitnicki
In order to: (1) attach more than one BPF program type to netns, or (2) support attaching BPF programs to netns with bpf_link, or (3) support multi-prog attach points for netns we will need to keep more state per netns than a single pointer like we have now for BPF flow dissector program. Prepare for the above by extracting netns_bpf that is part of struct net, for storing all state related to BPF programs attached to netns. Turn flow dissector callbacks for querying/attaching/detaching a program into generic ones that operate on netns_bpf. Next patch will move the generic callbacks into their own module. This is similar to how it is organized for cgroup with cgroup_bpf. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20200531082846.2117903-3-jakub@cloudflare.com
2020-06-01flow_dissector: Pull locking up from prog attach callbackJakub Sitnicki
Split out the part of attach callback that happens with attach/detach lock acquired. This structures the prog attach callback in a way that opens up doors for moving the locking out of flow_dissector and into generic callbacks for attaching/detaching progs to netns in subsequent patches. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20200531082846.2117903-2-jakub@cloudflare.com
2020-06-01Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "A sizeable pile of arm64 updates for 5.8. Summary below, but the big two features are support for Branch Target Identification and Clang's Shadow Call stack. The latter is currently arm64-only, but the high-level parts are all in core code so it could easily be adopted by other architectures pending toolchain support Branch Target Identification (BTI): - Support for ARMv8.5-BTI in both user- and kernel-space. This allows branch targets to limit the types of branch from which they can be called and additionally prevents branching to arbitrary code, although kernel support requires a very recent toolchain. - Function annotation via SYM_FUNC_START() so that assembly functions are wrapped with the relevant "landing pad" instructions. - BPF and vDSO updates to use the new instructions. - Addition of a new HWCAP and exposure of BTI capability to userspace via ID register emulation, along with ELF loader support for the BTI feature in .note.gnu.property. - Non-critical fixes to CFI unwind annotations in the sigreturn trampoline. Shadow Call Stack (SCS): - Support for Clang's Shadow Call Stack feature, which reserves platform register x18 to point at a separate stack for each task that holds only return addresses. This protects function return control flow from buffer overruns on the main stack. - Save/restore of x18 across problematic boundaries (user-mode, hypervisor, EFI, suspend, etc). - Core support for SCS, should other architectures want to use it too. - SCS overflow checking on context-switch as part of the existing stack limit check if CONFIG_SCHED_STACK_END_CHECK=y. CPU feature detection: - Removed numerous "SANITY CHECK" errors when running on a system with mismatched AArch32 support at EL1. This is primarily a concern for KVM, which disabled support for 32-bit guests on such a system. - Addition of new ID registers and fields as the architecture has been extended. Perf and PMU drivers: - Minor fixes and cleanups to system PMU drivers. Hardware errata: - Unify KVM workarounds for VHE and nVHE configurations. - Sort vendor errata entries in Kconfig. Secure Monitor Call Calling Convention (SMCCC): - Update to the latest specification from Arm (v1.2). - Allow PSCI code to query the SMCCC version. Software Delegated Exception Interface (SDEI): - Unexport a bunch of unused symbols. - Minor fixes to handling of firmware data. Pointer authentication: - Add support for dumping the kernel PAC mask in vmcoreinfo so that the stack can be unwound by tools such as kdump. - Simplification of key initialisation during CPU bringup. BPF backend: - Improve immediate generation for logical and add/sub instructions. vDSO: - Minor fixes to the linker flags for consistency with other architectures and support for LLVM's unwinder. - Clean up logic to initialise and map the vDSO into userspace. ACPI: - Work around for an ambiguity in the IORT specification relating to the "num_ids" field. - Support _DMA method for all named components rather than only PCIe root complexes. - Minor other IORT-related fixes. Miscellaneous: - Initialise debug traps early for KGDB and fix KDB cacheflushing deadlock. - Minor tweaks to early boot state (documentation update, set TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections). - Refactoring and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits) KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h KVM: arm64: Check advertised Stage-2 page size capability arm64/cpufeature: Add get_arm64_ftr_reg_nowarn() ACPI/IORT: Remove the unused __get_pci_rid() arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register arm64/cpufeature: Add remaining feature bits in ID_PFR0 register arm64/cpufeature: Introduce ID_MMFR5 CPU register arm64/cpufeature: Introduce ID_DFR1 CPU register arm64/cpufeature: Introduce ID_PFR2 CPU register arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0 arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register arm64: mm: Add asid_gen_match() helper firmware: smccc: Fix missing prototype warning for arm_smccc_version_init arm64: vdso: Fix CFI directives in sigreturn trampoline arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction ...
2020-06-01Merge tag 'm68k-for-v5.8-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - several Mac fixes - defconfig updates - minor cleanups and fixes * tag 'm68k-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: tools: Replace zero-length array with flexible-array member m68k: Add missing __user annotation in get_user() m68k: mac: Avoid stuck ISM IOP interrupt on Quadra 900/950 m68k: mac: Remove misleading comment m68k: mac: Don't call via_flush_cache() on Mac IIfx m68k: defconfig: Update defconfigs for v5.7-rc1 m68k: amiga: config: Replace zero-length array with flexible-array member m68k: amiga: config: Mark expected switch fall-through
2020-06-01libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.cAndrii Nakryiko
On systems with recent enough glibc, reallocarray compat won't kick in, so reallocarray() itself has to come from stdlib.h include. But _GNU_SOURCE is necessary to enable it. So add it. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200601202601.2139477-1-andriin@fb.com
2020-06-01bpf: Use tracing helpers for lsm programsJiri Olsa
Currenty lsm uses bpf_tracing_func_proto helpers which do not include stack trace or perf event output. It's useful to have those for bpftrace lsm support [1]. Using tracing_prog_func_proto helpers for lsm programs. [1] https://github.com/iovisor/bpftrace/pull/1347 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200531154255.896551-1-jolsa@kernel.org
2020-06-01xdp: Rename convert_to_xdp_frame in xdp_convert_buff_to_frameLorenzo Bianconi
In order to use standard 'xdp' prefix, rename convert_to_xdp_frame utility routine in xdp_convert_buff_to_frame and replace all the occurrences Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org
2020-06-01xdp: Introduce xdp_convert_frame_to_buff utility routineLorenzo Bianconi
Introduce xdp_convert_frame_to_buff utility routine to initialize xdp_buff fields from xdp_frames ones. Rely on xdp_convert_frame_to_buff in veth xdp code. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/87acf133073c4b2d4cbb8097e8c2480c0a0fac32.1590698295.git.lorenzo@kernel.org
2020-06-01Merge branch 'bpf_setsockopt-SO_BINDTODEVICE'Alexei Starovoitov
Ferenc Fejes says: ==================== This option makes it possible to programatically bind sockets to netdevices. With the help of this option sockets of VRF unaware applications could be distributed between multiple VRFs with an eBPF program. This lets the applications benefit from multiple possible routes. v2: - splitting up the patch to three parts - lock_sk parameter for optional locking in sock_bindtoindex - Stanislav Fomichev - testing the SO_BINDTODEVICE option - Andrii Nakryiko ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01selftests/bpf: Add test for SO_BINDTODEVICE opt of bpf_setsockoptFerenc Fejes
This test intended to verify if SO_BINDTODEVICE option works in bpf_setsockopt. Because we already in the SOL_SOCKET level in this connect bpf prog its safe to verify the sanity in the beginning of the connect_v4_prog by calling the bind_to_device test helper. The testing environment already created by the test_sock_addr.sh script so this test assume that two netdevices already existing in the system: veth pair with names test_sock_addr1 and test_sock_addr2. The test will try to bind the socket to those devices first. Then the test assume there are no netdevice with "nonexistent_dev" name so the bpf_setsockopt will give use ENODEV error. At the end the test remove the device binding from the socket by binding it to an empty name. Signed-off-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/3f055b8e45c65639c5c73d0b4b6c589e60b86f15.1590871065.git.fejes@inf.elte.hu
2020-06-01bpf: Allow SO_BINDTODEVICE opt in bpf_setsockoptFerenc Fejes
Extending the supported sockopts in bpf_setsockopt with SO_BINDTODEVICE. We call sock_bindtoindex with parameter lock_sk = false in this context because we already owning the socket. Signed-off-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/4149e304867b8d5a606a305bc59e29b063e51f49.1590871065.git.fejes@inf.elte.hu
2020-06-01net: Make locking in sock_bindtoindex optionalFerenc Fejes
The sock_bindtoindex intended for kernel wide usage however it will lock the socket regardless of the context. This modification relax this behavior optionally: locking the socket will be optional by calling the sock_bindtoindex with lock_sk = true. The modification applied to all users of the sock_bindtoindex. Signed-off-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/bee6355da40d9e991b2f2d12b67d55ebb5f5b207.1590871065.git.fejes@inf.elte.hu
2020-06-01Merge tag 'x86-vdso-2020-06-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vdso updates from Ingo Molnar: "Clean up various aspects of the vDSO code, no change in functionality intended" * tag 'x86-vdso-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso/Makefile: Add vobjs32 x86/vdso/vdso2c: Convert iterators to unsigned x86/vdso/vdso2c: Correct error messages on file open
2020-06-01bpf: Change kvfree to kfree in generic_map_lookup_batch()Denis Efremov
buf_prevkey in generic_map_lookup_batch() is allocated with kmalloc(). It's safe to free it with kfree(). Fixes: cb4d03ab499d ("bpf: Add generic support for lookup batch op") Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200601162814.17426-1-efremov@linux.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01Merge branch 'fix-ktls-with-sk_skb_verdict'Alexei Starovoitov
John Fastabend says: ==================== If a socket is running a BPF_SK_SKB_SREAM_VERDICT program and KTLS is enabled the data stream may be broken if both TLS stream parser and BPF stream parser try to handle data. Fix this here by making KTLS stream parser run first to ensure TLS messages are received correctly and then calling the verdict program. This analogous to how we handle a similar conflict on the TX side. Note, this is a fix but it doesn't make sense to push this late to bpf tree so targeting bpf-next and keeping fixes tags. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01tools/bpf: sync bpf.hAlexei Starovoitov
Sync bpf.h into tool/include/uapi/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01bpf, selftests: Add test for ktls with skb bpf ingress policyJohn Fastabend
This adds a test for bpf ingress policy. To ensure data writes happen as expected with extra TLS headers we run these tests with data verification enabled by default. This will test receive packets have "PASS" stamped into the front of the payload. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159079363965.5745.3390806911628980210.stgit@john-Precision-5820-Tower Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01Merge branch 'xdp_devmap'Alexei Starovoitov
David Ahern says: ==================== Implementation of Daniel's proposal for allowing DEVMAP entries to be a device index, program fd pair. Programs are run after XDP_REDIRECT and have access to both Rx device and Tx device. v4 - moved struct bpf_devmap_val from uapi to devmap.c, named the union and dropped the prefix from the elements - Jesper - fixed 2 bugs in selftests v3 - renamed struct to bpf_devmap_val - used offsetofend to check for expected map size, modification of Toke's comment - check for explicit value sizes - adjusted switch statement in dev_map_run_prog per Andrii's comment - changed SEC shortcut to xdp_devmap - changed selftests to use skeleton and new map declaration v2 - moved dev_map_ext_val definition to uapi to formalize the API for devmap extensions; add bpf_ prefix to the prog_fd and prog_id entries - changed devmap code to handle struct in a way that it can support future extensions - fixed subject in libbpf patch v1 - fixed prog put on invalid program - Toke - changed write value from id to fd per Toke's comments about capabilities - add test cases ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01bpf: Fix running sk_skb program types with ktlsJohn Fastabend
KTLS uses a stream parser to collect TLS messages and send them to the upper layer tls receive handler. This ensures the tls receiver has a full TLS header to parse when it is run. However, when a socket has BPF_SK_SKB_STREAM_VERDICT program attached before KTLS is enabled we end up with two stream parsers running on the same socket. The result is both try to run on the same socket. First the KTLS stream parser runs and calls read_sock() which will tcp_read_sock which in turn calls tcp_rcv_skb(). This dequeues the skb from the sk_receive_queue. When this is done KTLS code then data_ready() callback which because we stacked KTLS on top of the bpf stream verdict program has been replaced with sk_psock_start_strp(). This will in turn kick the stream parser again and eventually do the same thing KTLS did above calling into tcp_rcv_skb() and dequeuing a skb from the sk_receive_queue. At this point the data stream is broke. Part of the stream was handled by the KTLS side some other bytes may have been handled by the BPF side. Generally this results in either missing data or more likely a "Bad Message" complaint from the kTLS receive handler as the BPF program steals some bytes meant to be in a TLS header and/or the TLS header length is no longer correct. We've already broke the idealized model where we can stack ULPs in any order with generic callbacks on the TX side to handle this. So in this patch we do the same thing but for RX side. We add a sk_psock_strp_enabled() helper so TLS can learn a BPF verdict program is running and add a tls_sw_has_ctx_rx() helper so BPF side can learn there is a TLS ULP on the socket. Then on BPF side we omit calling our stream parser to avoid breaking the data stream for the KTLS receiver. Then on the KTLS side we call BPF_SK_SKB_STREAM_VERDICT once the KTLS receiver is done with the packet but before it posts the msg to userspace. This gives us symmetry between the TX and RX halfs and IMO makes it usable again. On the TX side we process packets in this order BPF -> TLS -> TCP and on the receive side in the reverse order TCP -> TLS -> BPF. Discovered while testing OpenSSL 3.0 Alpha2.0 release. Fixes: d829e9c4112b5 ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159079361946.5745.605854335665044485.stgit@john-Precision-5820-Tower Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01selftest: Add tests for XDP programs in devmap entriesDavid Ahern
Add tests to verify ability to add an XDP program to a entry in a DEVMAP. Add negative tests to show DEVMAP programs can not be attached to devices as a normal XDP program, and accesses to egress_ifindex require BPF_XDP_DEVMAP attach type. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200529220716.75383-6-dsahern@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-06-01bpf: Refactor sockmap redirect code so its easy to reuseJohn Fastabend
We will need this block of code called from tls context shortly lets refactor the redirect logic so its easy to use. This also cleans up the switch stmt so we have fewer fallthrough cases. No logic changes are intended. Fixes: d829e9c4112b5 ("tls: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/159079360110.5745.7024009076049029819.stgit@john-Precision-5820-Tower Signed-off-by: Alexei Starovoitov <ast@kernel.org>