summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-03-29include: uapi: linux: add rpl sr header definitionAlexander Aring
This patch adds a uapi header for rpl struct definition. The segments data can be accessed over rpl_segaddr or rpl_segdata macros. In case of compri and compre is zero the segment data is not compressed and can be accessed by rpl_segaddr. In the other case the compressed data can be accessed by rpl_segdata and interpreted as byte array. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: add netlink-based PMPaolo Abeni
Expose a new netlink family to userspace to control the PM, setting: - list of local addresses to be signalled. - list of local addresses used to created subflows. - maximum number of add_addr option to react When the msk is fully established, the PM netlink attempts to announce the 'signal' list via the ADD_ADDR option. Since we currently lack the ADD_ADDR echo (and related event) only the first addr is sent. After exhausting the 'announce' list, the PM tries to create subflow for each addr in 'local' list, waiting for each connection to be completed before attempting the next one. Idea is to add an additional PM hook for ADD_ADDR echo, to allow the PM netlink announcing multiple addresses, in sequence. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: add and use MIB counter infrastructureFlorian Westphal
Exported via same /proc file as the Linux TCP MIB counters, so "netstat -s" or "nstat" will show them automatically. The MPTCP MIB counters are allocated in a distinct pcpu area in order to avoid bloating/wasting TCP pcpu memory. Counters are allocated once the first MPTCP socket is created in a network namespace and free'd on exit. If no sockets have been allocated, all-zero mptcp counters are shown. The MIB counter list is taken from the multipath-tcp.org kernel, but only a few counters have been picked up so far. The counter list can be increased at any time later on. v2 -> v3: - remove 'inline' in foo.c files (David S. Miller) Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: allow dumping subflow context to userspaceDavide Caratti
add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'. v2 -> v3: - uapi: use bit macros appropriate for userspace Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: Add handling of outgoing MP_JOIN requestsPeter Krystad
Subflow creation may be initiated by the path manager when the primary connection is fully established and a remote address has been received via ADD_ADDR. Create an in-kernel sock and use kernel_connect() to initiate connection. Passive sockets can't acquire the mptcp socket lock at subflow creation time, so an additional list protected by a new spinlock is used to track the MPJ subflows. Such list is spliced into conn_list tail every time the msk socket lock is acquired, so that it will not interfere with data flow on the original connection. Data flow and connection failover not addressed by this commit. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: Add handling of incoming MP_JOIN requestsPeter Krystad
Process the MP_JOIN option in a SYN packet with the same flow as MP_CAPABLE but when the third ACK is received add the subflow to the MPTCP socket subflow list instead of adding it to the TCP socket accept queue. The subflow is added at the end of the subflow list so it will not interfere with the existing subflows operation and no data is expected to be transmitted on it. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29mptcp: Add ADD_ADDR handlingPeter Krystad
Add handling for sending and receiving the ADD_ADDR, ADD_ADDR6, and RM_ADDR suboptions. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: Fix typo of SKB_SGO_CB_OFFSETCambda Zhu
The SKB_SGO_CB_OFFSET should be SKB_GSO_CB_OFFSET which means the offset of the GSO in skb cb. This patch fixes the typo. Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation") Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29net: macsec: add support for specifying offload upon link creationMark Starovoytov
This patch adds new netlink attribute to allow a user to (optionally) specify the desired offload mode immediately upon MACSec link creation. Separate iproute patch will be required to support this from user space. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Minor comment conflict in mac80211. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30bus: fsl-mc: add api to retrieve mc versionAndrei Botila
Add a new api that returns Management Complex firmware version and make the required structure public. The api's first user will be the caam driver for setting prediction resistance bits. Signed-off-by: Andrei Botila <andrei.botila@nxp.com> Acked-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Cc: Chris Healy <cphealy@gmail.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Horia Geantă <horia.geanta@nxp.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Iuliana Prodan <iuliana.prodan@nxp.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-imx@nxp.com Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-03-30netfilter: conntrack: add nf_ct_acct_add()wenxu
Add nf_ct_acct_add function to update the conntrack counter with packets and bytes. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-30netfilter: nf_tables: skip set types that do not support for expressionsPablo Neira Ayuso
The bitmap set does not support for expressions, skip it from the estimation step. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-30bpf: lsm: Implement attach, detach and executionKP Singh
JITed BPF programs are dynamically attached to the LSM hooks using BPF trampolines. The trampoline prologue generates code to handle conversion of the signature of the hook to the appropriate BPF context. The allocated trampoline programs are attached to the nop functions initialized as LSM hooks. BPF_PROG_TYPE_LSM programs must have a GPL compatible license and and need CAP_SYS_ADMIN (required for loading eBPF programs). Upon attachment: * A BPF fexit trampoline is used for LSM hooks with a void return type. * A BPF fmod_ret trampoline is used for LSM hooks which return an int. The attached programs can override the return value of the bpf LSM hook to indicate a MAC Policy decision. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-5-kpsingh@chromium.org
2020-03-30bpf: lsm: Provide attachment points for BPF LSM programsKP Singh
When CONFIG_BPF_LSM is enabled, nop functions, bpf_lsm_<hook_name>, are generated for each LSM hook. These functions are initialized as LSM hooks in a subsequent patch. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-4-kpsingh@chromium.org
2020-03-30security: Refactor declaration of LSM hooksKP Singh
The information about the different types of LSM hooks is scattered in two locations i.e. union security_list_options and struct security_hook_heads. Rather than duplicating this information even further for BPF_PROG_TYPE_LSM, define all the hooks with the LSM_HOOK macro in lsm_hook_defs.h which is then used to generate all the data structures required by the LSM framework. The LSM hooks are defined as: LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...) with <default_value> acccessible in security.c as: LSM_RET_DEFAULT(<hook_name>) Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-3-kpsingh@chromium.org
2020-03-30bpf: Introduce BPF_PROG_TYPE_LSMKP Singh
Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Thomas Garnier <thgarnie@google.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org
2020-03-30bpf, net: Fix build issue when net ns not configuredDaniel Borkmann
Fix a redefinition of 'net_gen_cookie' error that was overlooked when net ns is not configured. Fixes: f318903c0bf4 ("bpf: Add netns cookie and enable it for bpf cgroup hooks") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-03-29um: Implement time-travel=extJohannes Berg
This implements synchronized time-travel mode which - using a special application on a unix socket - lets multiple machines take part in a time-travelling simulation together. The protocol for the unix domain socket is defined in the new file include/uapi/linux/um_timetravel.h. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-03-29Merge tag 'irqchip-5.7' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Second batch of the GICv4.1 support saga - Level triggered interrupt support for the stm32 controller - Versatile-fpga chained interrupt fixes - DT support for cascaded VIC interrupt controller - RPi irqchip initialization fixes - Multi-instance support for the Xilinx interrupt controller - Multi-instance support for the PLIC interrupt controller - CPU hotplug support for the PLIC interrupt controller - Ingenic X1000 TCU support - Small fixes all over the shop (GICv3, GICv4, Xilinx, Atmel, sa1111) - Cleanups (setup_irq removal, zero-length array removal)
2020-03-29Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge vm fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/sparse: fix kernel crash with pfn_section_valid check mm: fork: fix kernel_stack memcg stats for various stack implementations hugetlb_cgroup: fix illegal access to memory drivers/base/memory.c: indicate all memory blocks as removable mm/swapfile.c: move inode_lock out of claim_swapfile
2020-03-29mm: fork: fix kernel_stack memcg stats for various stack implementationsRoman Gushchin
Depending on CONFIG_VMAP_STACK and the THREAD_SIZE / PAGE_SIZE ratio the space for task stacks can be allocated using __vmalloc_node_range(), alloc_pages_node() and kmem_cache_alloc_node(). In the first and the second cases page->mem_cgroup pointer is set, but in the third it's not: memcg membership of a slab page should be determined using the memcg_from_slab_page() function, which looks at page->slab_cache->memcg_params.memcg . In this case, using mod_memcg_page_state() (as in account_kernel_stack()) is incorrect: page->mem_cgroup pointer is NULL even for pages charged to a non-root memory cgroup. It can lead to kernel_stack per-memcg counters permanently showing 0 on some architectures (depending on the configuration). In order to fix it, let's introduce a mod_memcg_obj_state() helper, which takes a pointer to a kernel object as a first argument, uses mem_cgroup_from_obj() to get a RCU-protected memcg pointer and calls mod_memcg_state(). It allows to handle all possible configurations (CONFIG_VMAP_STACK and various THREAD_SIZE/PAGE_SIZE values) without spilling any memcg/kmem specifics into fork.c . Note: This is a special version of the patch created for stable backports. It contains code from the following two patches: - mm: memcg/slab: introduce mem_cgroup_from_obj() - mm: fork: fix kernel_stack memcg stats for various stack implementations [guro@fb.com: introduce mem_cgroup_from_obj()] Link: http://lkml.kernel.org/r/20200324004221.GA36662@carbon.dhcp.thefacebook.com Fixes: 4d96ba353075 ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages") Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200303233550.251375-1-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-29netfilter: nf_queue: place bridge physports into queue_entry structFlorian Westphal
The refcount is done via entry->skb, which does work fine. Major problem: When putting the refcount of the bridge ports, we must always put the references while the skb is still around. However, we will need to put the references after okfn() to avoid a possible 1 -> 0 -> 1 refcount transition, so we cannot use the skb pointer anymore. Place the physports in the queue entry structure instead to allow for refcounting changes in the next patch. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-29netfilter: nf_queue: make nf_queue_entry_release_refs staticFlorian Westphal
This is a preparation patch, no logical changes. Move free_entry into core and rename it to something more sensible. Will ease followup patches which will complicate the refcount handling. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-29KEYS: Don't write out to userspace while holding key semaphoreWaiman Long
A lockdep circular locking dependency report was seen when running a keyutils test: [12537.027242] ====================================================== [12537.059309] WARNING: possible circular locking dependency detected [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - - [12537.125253] ------------------------------------------------------ [12537.153189] keyctl/25598 is trying to acquire lock: [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0 [12537.208365] [12537.208365] but task is already holding lock: [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12537.270476] [12537.270476] which lock already depends on the new lock. [12537.270476] [12537.307209] [12537.307209] the existing dependency chain (in reverse order) is: [12537.340754] [12537.340754] -> #3 (&type->lock_class){++++}: [12537.367434] down_write+0x4d/0x110 [12537.385202] __key_link_begin+0x87/0x280 [12537.405232] request_key_and_link+0x483/0xf70 [12537.427221] request_key+0x3c/0x80 [12537.444839] dns_query+0x1db/0x5a5 [dns_resolver] [12537.468445] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.496731] cifs_reconnect+0xe04/0x2500 [cifs] [12537.519418] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.546263] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.573551] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.601045] kthread+0x30c/0x3d0 [12537.617906] ret_from_fork+0x3a/0x50 [12537.636225] [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}: [12537.664525] __mutex_lock+0x105/0x11f0 [12537.683734] request_key_and_link+0x35a/0xf70 [12537.705640] request_key+0x3c/0x80 [12537.723304] dns_query+0x1db/0x5a5 [dns_resolver] [12537.746773] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.775607] cifs_reconnect+0xe04/0x2500 [cifs] [12537.798322] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.823369] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.847262] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.873477] kthread+0x30c/0x3d0 [12537.890281] ret_from_fork+0x3a/0x50 [12537.908649] [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}: [12537.935225] __mutex_lock+0x105/0x11f0 [12537.954450] cifs_call_async+0x102/0x7f0 [cifs] [12537.977250] smb2_async_readv+0x6c3/0xc90 [cifs] [12538.000659] cifs_readpages+0x120a/0x1e50 [cifs] [12538.023920] read_pages+0xf5/0x560 [12538.041583] __do_page_cache_readahead+0x41d/0x4b0 [12538.067047] ondemand_readahead+0x44c/0xc10 [12538.092069] filemap_fault+0xec1/0x1830 [12538.111637] __do_fault+0x82/0x260 [12538.129216] do_fault+0x419/0xfb0 [12538.146390] __handle_mm_fault+0x862/0xdf0 [12538.167408] handle_mm_fault+0x154/0x550 [12538.187401] __do_page_fault+0x42f/0xa60 [12538.207395] do_page_fault+0x38/0x5e0 [12538.225777] page_fault+0x1e/0x30 [12538.243010] [12538.243010] -> #0 (&mm->mmap_sem){++++}: [12538.267875] lock_acquire+0x14c/0x420 [12538.286848] __might_fault+0x119/0x1b0 [12538.306006] keyring_read_iterator+0x7e/0x170 [12538.327936] assoc_array_subtree_iterate+0x97/0x280 [12538.352154] keyring_read+0xe9/0x110 [12538.370558] keyctl_read_key+0x1b9/0x220 [12538.391470] do_syscall_64+0xa5/0x4b0 [12538.410511] entry_SYSCALL_64_after_hwframe+0x6a/0xdf [12538.435535] [12538.435535] other info that might help us debug this: [12538.435535] [12538.472829] Chain exists of: [12538.472829] &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class [12538.472829] [12538.524820] Possible unsafe locking scenario: [12538.524820] [12538.551431] CPU0 CPU1 [12538.572654] ---- ---- [12538.595865] lock(&type->lock_class); [12538.613737] lock(root_key_user.cons_lock); [12538.644234] lock(&type->lock_class); [12538.672410] lock(&mm->mmap_sem); [12538.687758] [12538.687758] *** DEADLOCK *** [12538.687758] [12538.714455] 1 lock held by keyctl/25598: [12538.732097] #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12538.770573] [12538.770573] stack backtrace: [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [12538.881963] Call Trace: [12538.892897] dump_stack+0x9a/0xf0 [12538.907908] print_circular_bug.isra.25.cold.50+0x1bc/0x279 [12538.932891] ? save_trace+0xd6/0x250 [12538.948979] check_prev_add.constprop.32+0xc36/0x14f0 [12538.971643] ? keyring_compare_object+0x104/0x190 [12538.992738] ? check_usage+0x550/0x550 [12539.009845] ? sched_clock+0x5/0x10 [12539.025484] ? sched_clock_cpu+0x18/0x1e0 [12539.043555] __lock_acquire+0x1f12/0x38d0 [12539.061551] ? trace_hardirqs_on+0x10/0x10 [12539.080554] lock_acquire+0x14c/0x420 [12539.100330] ? __might_fault+0xc4/0x1b0 [12539.119079] __might_fault+0x119/0x1b0 [12539.135869] ? __might_fault+0xc4/0x1b0 [12539.153234] keyring_read_iterator+0x7e/0x170 [12539.172787] ? keyring_read+0x110/0x110 [12539.190059] assoc_array_subtree_iterate+0x97/0x280 [12539.211526] keyring_read+0xe9/0x110 [12539.227561] ? keyring_gc_check_iterator+0xc0/0xc0 [12539.249076] keyctl_read_key+0x1b9/0x220 [12539.266660] do_syscall_64+0xa5/0x4b0 [12539.283091] entry_SYSCALL_64_after_hwframe+0x6a/0xdf One way to prevent this deadlock scenario from happening is to not allow writing to userspace while holding the key semaphore. Instead, an internal buffer is allocated for getting the keys out from the read method first before copying them out to userspace without holding the lock. That requires taking out the __user modifier from all the relevant read methods as well as additional changes to not use any userspace write helpers. That is, 1) The put_user() call is replaced by a direct copy. 2) The copy_to_user() call is replaced by memcpy(). 3) All the fault handling code is removed. Compiling on a x86-64 system, the size of the rxrpc_read() function is reduced from 3795 bytes to 2384 bytes with this patch. Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-03-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix memory leak in vti6, from Torsten Hilbrich. 2) Fix double free in xfrm_policy_timer, from YueHaibing. 3) NL80211_ATTR_CHANNEL_WIDTH attribute is put with wrong type, from Johannes Berg. 4) Wrong allocation failure check in qlcnic driver, from Xu Wang. 5) Get ks8851-ml IO operations right, for real this time, from Marek Vasut. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (22 commits) r8169: fix PHY driver check on platforms w/o module softdeps net: ks8851-ml: Fix IO operations, again mlxsw: spectrum_mr: Fix list iteration in error path qlcnic: Fix bad kzalloc null test mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX mac80211: mark station unauthorized before key removal mac80211: Check port authorization in the ieee80211_tx_dequeue() case cfg80211: Do not warn on same channel at the end of CSA mac80211: drop data frames without key on encrypted links ieee80211: fix HE SPR size calculation nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type xfrm: policy: Fix doulbe free in xfrm_policy_timer bpf: Explicitly memset some bpf info structures declared on the stack bpf: Explicitly memset the bpf_attr structure bpf: Sanitize the bpf_struct_ops tcp-cc name vti6: Fix memory leak of skb if input policy check fails esp: remove the skb from the chain when it's enqueued in cryptd_wq ipv6: xfrm6_tunnel.c: Use built-in RCU list checking xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire xfrm: fix uctx len check in verify_sec_ctx_len ...
2020-03-28PCI: Add Zhaoxin Vendor IDRaymond Pang
Add Zhaoxin Vendor ID to pci_ids.h Link: https://lore.kernel.org/r/20200327091148.5190-2-RaymondPang-oc@zhaoxin.com Signed-off-by: Raymond Pang <RaymondPang-oc@zhaoxin.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-03-28xdp: Support specifying expected existing program when attaching XDPToke Høiland-Jørgensen
While it is currently possible for userspace to specify that an existing XDP program should not be replaced when attaching to an interface, there is no mechanism to safely replace a specific XDP program with another. This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_FD, which can be set along with IFLA_XDP_FD. If set, the kernel will check that the program currently loaded on the interface matches the expected one, and fail the operation if it does not. This corresponds to a 'cmpxchg' memory operation. Setting the new attribute with a negative value means that no program is expected to be attached, which corresponds to setting the UPDATE_IF_NOEXIST flag. A new companion flag, XDP_FLAGS_REPLACE, is also added to explicitly request checking of the EXPECTED_FD attribute. This is needed for userspace to discover whether the kernel supports the new attribute. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/158515700640.92963.3551295145441017022.stgit@toke.dk
2020-03-28iio: cros_ec: Report hwfifo_watermark_maxGwendal Grignou
Report the maximum amount of sample the EC can hold. This is not tunable, but can be useful for application to find out the maximum amount of time it can sleep when hwfifo_timeout is set to a large number. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28iio: cros_ec: Expose hwfifo_timeoutGwendal Grignou
Expose EC minimal interrupt period through buffer/hwfifo_timeout: - Maximal timeout is limited to 65s. - When timeout for all sensors is set to 0, EC will not send events, even if the sensor sampling rate is greater than 0. Rename frequency to sampling_frequency to match IIO ABI. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28iio: cros_ec: Remove pm functionGwendal Grignou
Since cros_ec_sensorhub is shutting down the FIFO when the device suspends, no need to slow down the EC sampling period rate. It was necesseary to do that before command CMD_FIFO_INT_ENABLE was introduced, but now all supported chromebooks have it. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28iio: cros_ec: Register to cros_ec_sensorhub when EC supports FIFOGwendal Grignou
When EC supports FIFO, each IIO device registers a callback, to put samples in the buffer when they arrives from the FIFO. When no FIFO, the user space app needs to call trigger_new, or better register a high precision timer. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28iio: expose iio_device_set_clockGwendal Grignou
Some IIO devices may want to override the default (realtime) to another clock source by default. It can beneficial when timestamps coming from the hardware or underlying drivers are already in that format. It can always be overridden by attribute current_timestamp_clock. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28iio: cros_ec: Move function description to .c fileGwendal Grignou
To prevent comment rot, move function description to cros_ec_sensors_core.c. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28platform/chrome: cros_ec_sensorhub: Add median filterGwendal Grignou
Events are timestamped in EC time space, their timestamps need to be converted in host time space. The assumption is the time delta between when the interrupt is sent by the EC and when it is receive by the host is a [small] constant. This is not always true, even with hard-wired interrupt. To mitigate worst offenders, add a median filter to weed out bigger than expected delays. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Lee Jones <lee.jones@linaro.org> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28platform/chrome: cros_ec_sensorhub: Add FIFO supportGwendal Grignou
cros_ec_sensorhub registers a listener and query motion sense FIFO, spread to iio sensors registers. To test, we can use libiio: iiod& iio_readdev -u ip:localhost -T 10000 -s 25 -b 16 cros-ec-gyro | od -x Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28platform/chrome: cros_ec_sensorhub: Add the number of sensors in sensorhubGwendal Grignou
To better manage resources, store the number of sensors reported by the EC. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-28Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Three more driver bugfixes, and two doc improvements fixing build warnings while we are here" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: pca-platform: Use platform_irq_get_optional i2c: st: fix missing struct parameter description i2c: nvidia-gpu: Handle timeout correctly in gpu_i2c_check_status() i2c: fix a doc warning i2c: hix5hd2: add missed clk_disable_unprepare in remove
2020-03-28PCI/AER: Rationalize error status register clearingKuppuswamy Sathyanarayanan
The AER interfaces to clear error status registers were a confusing mess: - pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors from the Uncorrectable Error Status register. - pci_aer_clear_fatal_status() cleared fatal errors from the Uncorrectable Error Status register. - pci_cleanup_aer_error_status_regs() cleared the Root Error Status register (for Root Ports), the Uncorrectable Error Status register, and the Correctable Error Status register. Rename them to make them consistent: From To ---------------------------------------- ------------------------------- pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status() pci_aer_clear_fatal_status() pci_aer_clear_fatal_status() pci_cleanup_aer_error_status_regs() pci_aer_clear_status() Since pci_cleanup_aer_error_status_regs() (renamed to pci_aer_clear_status()) is only used within drivers/pci/, move the declaration from <linux/aer.h> to drivers/pci/pci.h. [bhelgaas: commit log, add renames] Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-03-28PCI/DPC: Add Error Disconnect Recover (EDR) supportKuppuswamy Sathyanarayanan
Error Disconnect Recover (EDR) is a feature that allows ACPI firmware to notify OSPM that a device has been disconnected due to an error condition (ACPI v6.3, sec 5.6.6). OSPM advertises its support for EDR on PCI devices via _OSC (see [1], sec 4.5.1, table 4-4). The OSPM EDR notify handler should invalidate software state associated with disconnected devices and may attempt to recover them. OSPM communicates the status of recovery to the firmware via _OST (sec 6.3.5.2). For PCIe, firmware may use Downstream Port Containment (DPC) to support EDR. Per [1], sec 4.5.1, table 4-6, even if firmware has retained control of DPC, OSPM may read/write DPC control and status registers during the EDR notification processing window, i.e., from the time it receives an EDR notification until it clears the DPC Trigger Status. Note that per [1], sec 4.5.1 and 4.5.2.4, 1. If the OS supports EDR, it should advertise that to firmware by setting OSC_PCI_EDR_SUPPORT in _OSC Support. 2. If the OS sets OSC_PCI_EXPRESS_DPC_CONTROL in _OSC Control to request control of the DPC capability, it must also set OSC_PCI_EDR_SUPPORT in _OSC Support. Add an EDR notify handler to attempt recovery. [1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019, affecting PCI Firmware Specification, Rev. 3.2 https://members.pcisig.com/wg/PCI-SIG/document/12888 [bhelgaas: squash add/enable patches into one] Link: https://lore.kernel.org/r/90f91fe6d25c13f9d2255d2ce97ca15be307e1bb.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org>
2020-03-28PCI/DPC: Move DPC data into struct pci_devBjorn Helgaas
We only need 25 bits of data for DPC, so I don't think it's worth the complexity of allocating and keeping track of the struct dpc_dev separately from the pci_dev. Move that data into the struct pci_dev. Link: https://lore.kernel.org/r/98323eaa18080adbe5bb30846862f09f8722d4b3.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-03-28Merge tag 'nfs-rdma-for-5.7-1' of ↵Trond Myklebust
git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA Client Updates for Linux 5.7 New Features: - Allow one active connection and several zombie connections to prevent blocking if the remote server is unresponsive. Bugfixes and Cleanups: - Enhance MR-related trace points - Refactor connection set-up and disconnect functions - Make Protection Domains per-connection instead of per-transport - Merge struct rpcrdma_ia into rpcrdma_ep
2020-03-28fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_tThomas Gleixner
Bit spinlocks are problematic if PREEMPT_RT is enabled, because they disable preemption, which is undesired for latency reasons and breaks when regular spinlocks are taken within the bit_spinlock locked region because regular spinlocks are converted to 'sleeping spinlocks' on RT. PREEMPT_RT replaced the bit spinlocks with regular spinlocks to avoid this problem. The replacement was done conditionaly at compile time, but Christoph requested to do an unconditional conversion. Jan suggested to move the spinlock into a existing padding hole which avoids a size increase of struct buffer_head on production kernels. As a benefit the lock gains lockdep coverage. [ bigeasy: Remove the wrapper and use always spinlock_t and move it into the padding hole ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Link: https://lkml.kernel.org/r/20191118132824.rclhrbujqh4b4g4d@linutronix.de
2020-03-28Merge branch 'uaccess.futex' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into locking/core Pull uaccess futex cleanups for Al Viro: Consolidate access_ok() usage and the futex uaccess function zoo.
2020-03-28Merge branch 'next.uaccess-2' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into x86/cleanups Pull uaccess cleanups from Al Viro: Consolidate the user access areas and get rid of uaccess_try(), user_ex() and other warts.
2020-03-28cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()Thomas Gleixner
A recent change to freeze_secondary_cpus() which added an early abort if a wakeup is pending missed the fact that the function is also invoked for shutdown, reboot and kexec via disable_nonboot_cpus(). In case of disable_nonboot_cpus() the wakeup event needs to be ignored as the purpose is to terminate the currently running kernel. Add a 'suspend' argument which is only set when the freeze is in context of a suspend operation. If not set then an eventually pending wakeup event is ignored. Fixes: a66d955e910a ("cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending") Reported-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Pavankumar Kondeti <pkondeti@codeaurora.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/874kuaxdiz.fsf@nanos.tec.linutronix.de
2020-03-27generic arch_futex_atomic_op_inuser() doesn't need access_ok()Al Viro
uses get_user() and put_user() for memory accesses Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-27futex: arch_futex_atomic_op_inuser() calling conventions changeAl Viro
Move access_ok() in and pagefault_enable()/pagefault_disable() out. Mechanical conversion only - some instances don't really need a separate access_ok() at all (e.g. the ones only using get_user()/put_user(), or architectures where access_ok() is always true); we'll deal with that in followups. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-27bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor idDaniel Borkmann
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(), recvmsg() and bind-related hooks in order to retrieve the cgroup v2 context which can then be used as part of the key for BPF map lookups, for example. Given these hooks operate in process context 'current' is always valid and pointing to the app that is performing mentioned syscalls if it's subject to a v2 cgroup. Also with same motivation of commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper") enable retrieval of ancestor from current so the cgroup id can be used for policy lookups which can then forbid connect() / bind(), for example. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
2020-03-27bpf: Allow to retrieve cgroup v1 classid from v2 hooksDaniel Borkmann
Today, Kubernetes is still operating on cgroups v1, however, it is possible to retrieve the task's classid based on 'current' out of connect(), sendmsg(), recvmsg() and bind-related hooks for orchestrators which attach to the root cgroup v2 hook in a mixed env like in case of Cilium, for example, in order to then correlate certain pod traffic and use it as part of the key for BPF map lookups. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/555e1c69db7376c0947007b4951c260e1074efc3.1585323121.git.daniel@iogearbox.net