Age  Commit message  Author
2018-08-16  bpf, sockmap: fix leakage of smap_psock_map_entry  Daniel Borkmann
While working on sockmap I noticed that we do not always kfree the struct smap_psock_map_entry list elements which track psocks attached to maps. In the case of sock_hash_ctx_update_elem(), these map entries are allocated outside of __sock_map_ctx_update_elem() with their linkage to the socket hash table filled. In the case of sock array, the map entries are allocated inside of __sock_map_ctx_update_elem() and added with their linkage to the psock->maps. Both additions are done under psock->maps_lock.
Now, we drop these elements from their psock->maps list on a few occasions:
  i) in sock array via smap_list_map_remove() when an entry is either deleted from the map from user space, or updated via user space or BPF program where we drop the old socket at that map slot, or the sock array is freed via sock_map_free() and drops all its elements;
  ii) for sock hash via smap_list_hash_remove() on exactly the same occasions as just described for sock array;
  iii) in bpf_tcp_close() where we remove the elements from the list via psock_map_pop() and iterate over them, dropping them from either sock array or sock hash; and last but not least
  iv) once again in smap_gc_work() which is a callback for deferring the work once the psock refcount hit zero and thus the socket is being destroyed.
The problem is that the only case where we kfree() the list entry is case iv), which at that point should have an empty list in normal cases. So in cases i) to iii) we unlink the elements without freeing them, and they go out of our reach. Hence the fix is to properly kfree() them as well to stop the leakage. Given these are all handled under psock->maps_lock there is no need for deferred RCU freeing.
I later also ran with the kmemleak detector and it confirmed the finding: in the state before the fix the object goes unreferenced, while after the patch no kmemleak report related to BPF showed up.
[...]
unreferenced object 0xffff880378eadae0 (size 64):
  comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
    50 4d 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  PMu]............
  backtrace:
    [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
    [<0000000045dd6d3c>] bpf_sock_map_update+0x29/0x60
    [<00000000877723aa>] ___bpf_prog_run+0x1e1f/0x4960
    [<000000002ef89e83>] 0xffffffffffffffff
unreferenced object 0xffff880378ead240 (size 64):
  comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
    00 44 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  .Du]............
  backtrace:
    [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
    [<0000000030e37a3a>] sock_map_update_elem+0x125/0x240
    [<000000002e5ce36e>] map_update_elem+0x4eb/0x7b0
    [<00000000db453cc9>] __x64_sys_bpf+0x1f9/0x360
    [<0000000000763660>] do_syscall_64+0x9a/0x300
    [<00000000422a2bb2>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<000000002ef89e83>] 0xffffffffffffffff
[...]
Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close")
Fixes: 54fedb42c653 ("bpf: sockmap, fix smap_list_map_remove when psock is in many maps")
Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
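The leak pattern in the sockmap commit above can be illustrated with a small userspace sketch. This is not the kernel patch: struct map_entry, entry_add(), entry_remove() and the live_entries counter are hypothetical stand-ins, malloc()/free() take the place of the kernel allocator, and no locking is modelled. It only shows that a path which unlinks a tracking entry also has to free it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct smap_psock_map_entry. */
struct map_entry {
	struct map_entry *next;
	int slot;		/* map slot this entry tracks */
};

int live_entries;		/* allocation counter, for illustration only */

/* Allocate an entry for @slot and push it onto the list; returns the new head. */
struct map_entry *entry_add(struct map_entry *head, int slot)
{
	struct map_entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->slot = slot;
	e->next = head;
	live_entries++;
	return e;
}

/* Unlink every entry for @slot and free it; returns the new head. */
struct map_entry *entry_remove(struct map_entry *head, int slot)
{
	struct map_entry **pp = &head;

	while (*pp) {
		struct map_entry *e = *pp;

		if (e->slot == slot) {
			*pp = e->next;	/* unlink ... */
			free(e);	/* ... and free; omitting this is the leak */
			live_entries--;
		} else {
			pp = &e->next;
		}
	}
	return head;
}
```

If the free() call were dropped, the list would still look correct after removal, but live_entries would never return to zero, which is roughly what kmemleak reported.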
2018-08-16  tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach  Daniel Borkmann
I found that in BPF sockmap programs, once we either delete a socket from the map or update a map slot so that the old socket is purged from the map, these sockets can never get reattached into a map even though their related psock has been dropped entirely at that point. The reason is that tcp_cleanup_ulp() leaves the old icsk->icsk_ulp_ops intact, so that on the next tcp_set_ulp_id() the kernel returns -EEXIST, thinking there is still some active ULP attached. BPF sockmap is the only user that has this issue, as the other user, kTLS, only calls tcp_cleanup_ulp() from tcp_v4_destroy_sock(), whereas sockmap semantics allow dropping the socket from the map with all related psock state being cleaned up.
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
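A simplified model of the reattach failure described above, as a sketch only. struct ulp_ops, struct conn, ulp_attach() and ulp_cleanup() are hypothetical userspace stand-ins, not the kernel's types or functions, and resetting the pointer in cleanup is an assumption about how such a bug is typically addressed rather than a quote of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ulp_ops { const char *name; };
struct conn    { const struct ulp_ops *ulp_ops; };

/* Attach refuses with -EEXIST while an ops pointer is still set. */
int ulp_attach(struct conn *c, const struct ulp_ops *ops)
{
	assert(c != NULL && ops != NULL);
	if (c->ulp_ops)
		return -EEXIST;
	c->ulp_ops = ops;
	return 0;
}

/* Cleanup has to reset the pointer, otherwise the next attach fails. */
void ulp_cleanup(struct conn *c)
{
	assert(c != NULL);
	c->ulp_ops = NULL;
}
```

Without the assignment in ulp_cleanup(), a second ulp_attach() on the same connection would keep returning -EEXIST, which mirrors the symptom in the commit message.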
2018-08-16  tcp, ulp: add alias for all ulp modules  Daniel Borkmann
Let's not turn the TCP ULP lookup into an arbitrary module loader as we only intend to load ULP modules through this mechanism, not other unrelated kernel modules:
  [root@bar]# cat foo.c
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <linux/tcp.h>
  #include <linux/in.h>
  int main(void)
  {
          int sock = socket(PF_INET, SOCK_STREAM, 0);
          setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
          return 0;
  }
  [root@bar]# gcc foo.c -O2 -Wall
  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  sctp                 1077248  4
  libcrc32c              16384  3 nf_conntrack,nf_nat,sctp
  [root@bar]#
Fix it by adding a module alias to TCP ULP modules, so probing a module via request_module() will be limited to tcp-ulp-[name]. The existing modules like kTLS will load fine given the tcp-ulp-tls alias, but others will fail to load:
  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  [root@bar]#
Sockmap is not affected by this since it's either built-in or not.
Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
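The aliasing idea above can be sketched in userspace: instead of handing the user-supplied name to the module loader directly, the lookup is confined to a fixed "tcp-ulp-" namespace. ulp_module_alias() is a hypothetical helper written for this illustration; only the tcp-ulp-[name] scheme comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the restricted alias for a user-supplied ULP name.
 * Returns 0 on success, -1 if the result does not fit into @buf.
 */
int ulp_module_alias(char *buf, size_t len, const char *ulp_name)
{
	int n;

	assert(buf != NULL && ulp_name != NULL);
	n = snprintf(buf, len, "tcp-ulp-%s", ulp_name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With this scheme a request for "sctp" turns into a lookup of "tcp-ulp-sctp", which no module provides, while "tls" maps to "tcp-ulp-tls".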
2018-08-16  Merge branch 'linus/master' into rdma.git for-next  Jason Gunthorpe
rdma.git merge resolution for the 4.19 merge window
Conflicts:
  drivers/infiniband/core/rdma_core.c
    - Use the rdma code and revise with the new spelling for atomic_fetch_add_unless
  drivers/nvme/host/rdma.c
    - Replace max_sge with max_send_sge in new blk code
  drivers/nvme/target/rdma.c
    - Use the blk code and revise to use NULL for ib_post_recv when appropriate
    - Replace max_sge with max_recv_sge in new blk code
  net/rds/ib_send.c
    - Use the net code and revise to use NULL for ib_post_recv when appropriate
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16  Revert "net/smc: Replace ib_query_gid with rdma_get_gid_attr"  Jason Gunthorpe
This reverts commit ddb457c6993babbcdd41fca638b870d2a2fc3941. The include rdma/ib_cache.h is kept, and we have to add a memset to the compat wrapper to avoid compiler warnings in gcc-7.
This revert is done to avoid extensive merge conflicts with SMC changes in netdev during the 4.19 merge window.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16  bpf: fix a rcu usage warning in bpf_prog_array_copy_core()  Yonghong Song
Commit 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage") refactored the bpf_prog_array_copy_core() to accommodate new structure bpf_prog_array_item which contains bpf_prog array itself.
In the old code, we had
  perf_event_query_prog_array():
    mutex_lock(...)
    bpf_prog_array_copy_call():
      prog = rcu_dereference_check(array, 1)->progs
      bpf_prog_array_copy_core(prog, ...)
    mutex_unlock(...)
With the above commit, we had
  perf_event_query_prog_array():
    mutex_lock(...)
    bpf_prog_array_copy_call():
      bpf_prog_array_copy_core(array, ...):
        item = rcu_dereference(array)->items;
        ...
    mutex_unlock(...)
The new code will trigger a lockdep rcu checking warning. The fix is to change rcu_dereference() to rcu_dereference_check() to prevent such a warning.
Reported-by: syzbot+6e72317008eef84a216b@syzkaller.appspotmail.com
Fixes: 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage")
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-16  samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM  Jesper Dangaard Brouer
It is common XDP practice to unload/detach the XDP bpf program when the XDP sample program is Ctrl-C interrupted (SIGINT) or killed (SIGTERM). The samples/bpf programs xdp_redirect_cpu and xdp_rxq_info forgot to trap signal SIGTERM (which is the default signal used by the kill command). This was discovered by Red Hat QA, whose automated scripts depend on killing the XDP sample program after a timeout period.
Fixes: fad3917e361b ("samples/bpf: add cpumap sample program xdp_redirect_cpu")
Fixes: 0fca931a6f21 ("samples/bpf: program demonstrating access to xdp_rxq_info")
Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
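A minimal sketch of trapping both signals in a sample program, using only ISO C signal(). The names install_exit_handlers(), exit_requested() and check_sigterm_is_trapped() are hypothetical, and setting a flag for the main loop is one possible design; the actual samples may instead unload the program and exit directly from the handler.

```c
#include <assert.h>
#include <signal.h>

/* Set from the handler; a main loop would poll it and then detach the XDP program. */
static volatile sig_atomic_t stop_requested;

static void on_exit_signal(int sig)
{
	(void)sig;
	stop_requested = 1;
}

/* Trap both Ctrl-C (SIGINT) and a plain "kill" (SIGTERM). Returns 0 on success. */
int install_exit_handlers(void)
{
	if (signal(SIGINT, on_exit_signal) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, on_exit_signal) == SIG_ERR)
		return -1;
	return 0;
}

int exit_requested(void)
{
	return stop_requested != 0;
}

/* Deliver SIGTERM to ourselves and confirm that the handler caught it. */
void check_sigterm_is_trapped(void)
{
	raise(SIGTERM);
	assert(exit_requested());
}
```

If only SIGINT were registered, the raise(SIGTERM) call would terminate the process with the default action and the cleanup path would never run.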
2018-08-16  net/xdp: Fix suspicious RCU usage warning  Tariq Toukan
Fix the warning below by calling rhashtable_lookup_fast. Also, make some code movements for better quality and human readability.
[ 342.450870] WARNING: suspicious RCU usage
[ 342.455856] 4.18.0-rc2+ #17 Tainted: G O
[ 342.462210] -----------------------------
[ 342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
[ 342.476568]
[ 342.476568] other info that might help us debug this:
[ 342.476568]
[ 342.486978]
[ 342.486978] rcu_scheduler_active = 2, debug_locks = 1
[ 342.495211] 4 locks held by modprobe/3934:
[ 342.500265] #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core]
[ 342.511953] #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
[ 342.521109] #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core]
[ 342.531642] #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
[ 342.541206]
[ 342.541206] stack backtrace:
[ 342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G O 4.18.0-rc2+ #17
[ 342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
[ 342.565606] Call Trace:
[ 342.568861] dump_stack+0x78/0xb3
[ 342.573086] xdp_rxq_info_unreg+0x3f5/0x6b0
[ 342.578285] ? __call_rcu+0x220/0x300
[ 342.582911] mlx5e_free_rq+0x38/0xc0 [mlx5_core]
[ 342.588602] mlx5e_close_channel+0x20/0x120 [mlx5_core]
[ 342.594976] mlx5e_close_channels+0x26/0x40 [mlx5_core]
[ 342.601345] mlx5e_close_locked+0x44/0x50 [mlx5_core]
[ 342.607519] mlx5e_close+0x42/0x60 [mlx5_core]
[ 342.613005] __dev_close_many+0xb1/0x120
[ 342.617911] dev_close_many+0xa2/0x170
[ 342.622622] rollback_registered_many+0x148/0x460
[ 342.628401] ? __lock_acquire+0x48d/0x11b0
[ 342.633498] ? unregister_netdev+0xe/0x20
[ 342.638495] rollback_registered+0x56/0x90
[ 342.643588] unregister_netdevice_queue+0x7e/0x100
[ 342.649461] unregister_netdev+0x18/0x20
[ 342.654362] mlx5e_remove+0x2a/0x50 [mlx5_core]
[ 342.659944] mlx5_remove_device+0xe5/0x110 [mlx5_core]
[ 342.666208] mlx5_unregister_interface+0x39/0x90 [mlx5_core]
[ 342.673038] cleanup+0x5/0xbfc [mlx5_core]
[ 342.678094] __x64_sys_delete_module+0x16b/0x240
[ 342.683725] ? do_syscall_64+0x1c/0x210
[ 342.688476] do_syscall_64+0x5a/0x210
[ 342.693025] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-16  net/mlx5e: Delete unneeded function argument  Yuval Shaia
priv argument is not used by the function, delete it. Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  Documentation: networking: ti-cpsw: correct cbs parameters for Eth1 100Mb  Ivan Khoronzhuk
If cbs parameters calculated for 1000Mb are used on a 100Mb port without h/w offload (for cpsw offload it doesn't matter), CBS works incorrectly. According to the example and the testing board, the second port is a 100Mb interface. Replace the parameters with values recalculated for a 100Mb interface. This allows the same command to be used for the CBS software implementation on the board in the example.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  isdn: Disable IIOCDBGVAR  Kees Cook
It was possible to directly leak the kernel address where the isdn_dev structure pointer was stored. This is a kernel ASLR bypass for anyone with access to the ioctl. The code had been present since the beginning of git history, though this shouldn't ever be needed for normal operation, therefore remove it. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Karsten Keil <isdn@linux-pingi.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  net: dsa: add support for ksz9897 ethernet switch  Lad, Prabhakar
ksz9477 is a superset of the ksz9xx series; with this patch the driver works out of the box for the ksz9897 chip.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  veth: Free queues on link delete  Toshiaki Makita
David Ahern reported memory leak in veth. ======================================================================= $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8800354d5c00 (size 1024): comm "ip", pid 836, jiffies 4294722952 (age 25.904s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<(____ptrval____)>] kmemleak_alloc+0x70/0x94 [<(____ptrval____)>] slab_post_alloc_hook+0x42/0x52 [<(____ptrval____)>] __kmalloc+0x101/0x142 [<(____ptrval____)>] kmalloc_array.constprop.20+0x1e/0x26 [veth] [<(____ptrval____)>] veth_newlink+0x147/0x3ac [veth] ... unreferenced object 0xffff88002e009c00 (size 1024): comm "ip", pid 836, jiffies 4294722958 (age 25.898s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<(____ptrval____)>] kmemleak_alloc+0x70/0x94 [<(____ptrval____)>] slab_post_alloc_hook+0x42/0x52 [<(____ptrval____)>] __kmalloc+0x101/0x142 [<(____ptrval____)>] kmalloc_array.constprop.20+0x1e/0x26 [veth] [<(____ptrval____)>] veth_newlink+0x219/0x3ac [veth] ======================================================================= veth_rq allocated in veth_newlink() was not freed on dellink. We need to free up them after veth_close() so that any packets will not reference the queues afterwards. Thus free them in veth_dev_free() in the same way as freeing stats structure (vstats). Also move queues allocation to veth_dev_init() to be in line with stats allocation. Fixes: 638264dc90227 ("veth: Support per queue XDP ring") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  ila: make lockdep happy again  Cong Wang
Previously, alloc_ila_locks() and bucket_table_alloc() call spin_lock_init() separately, therefore they have two different lock names and lock class keys. However, after commit b893281715ab ("ila: Call library function alloc_bucket_locks") they both call helper alloc_bucket_spinlocks() which now only has one lock name and lock class key. This causes a few bogus lockdep warnings as reported by syzbot. Fix this by making alloc_bucket_locks() a macro and pass declaration name as lock name and a static lock class key inside the macro. Fixes: b893281715ab ("ila: Call library function alloc_bucket_locks") Reported-by: <syzbot+b66a5a554991a8ed027c@syzkaller.appspotmail.com> Cc: Tom Herbert <tom@quantonium.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  net: sched: act_ife: always release ife action on init error  Vlad Buslov
Action init API was changed to always take reference to action, even when overwriting existing action. Substitute conditional action release, which was executed only if action is newly created, with unconditional release in tcf_ife_init() error handling code to prevent double free or memory leak in case of overwrite. Fixes: 4e8ddd7f1758 ("net: sched: don't release reference on action overwrite") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  Merge tag 'v4.18' into rdma.git for-next  Jason Gunthorpe
Resolve merge conflicts from the -rc cycle against the rdma.git tree:
Conflicts:
  drivers/infiniband/core/uverbs_cmd.c
    - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
    - Merge removal of file->ucontext in for-next with new code in -rc
  drivers/infiniband/core/uverbs_main.c
    - for-next removed code from ib_uverbs_write() that was modified in for-rc
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16  dt-bindings: net: ravb: Add support for r8a774a1 SoC  Fabrizio Castro
Document RZ/G2M (R8A774A1) SoC bindings. Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  cls_matchall: fix tcf_unbind_filter missing  Hangbin Liu
Fix tcf_unbind_filter missing in cls_matchall as this will trigger WARN_ON() in cbq_destroy_class(). Fixes: fd62d9f5c575f ("net/sched: matchall: Fix configuration race") Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16  Merge branch 'next' into for-linus  Dmitry Torokhov
Prepare input updates for 4.19 merge window.
2018-08-16  drm/amdgpu: Use kvmalloc for allocating UVD/VCE/VCN BO backup memory  Michel Dänzer
The allocated size can be (at least?) as large as megabytes, and there's no need for it to be physically contiguous. May avoid spurious failures to initialize / suspend the corresponding block while there's memory pressure. Bugzilla: https://bugs.freedesktop.org/107432 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-16  Merge tag 'for-linus-4.19-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux  Linus Torvalds
Pull orangefs updates from Mike Marshall:
 "Orangefs: one cleanup and Souptick's vm_fault_t patch:
   - add new return type vm_fault_t (Souptick Joarder)
   - remove redundant pointer (Colin Ian King)"
* tag 'for-linus-4.19-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: remove redundant pointer orangefs_inode
  orangefs: Adding new return type vm_fault_t
2018-08-16  dm writecache: fix a crash due to reading past end of dirty_bitmap  Mikulas Patocka
wc->dirty_bitmap_size is in bytes so must multiply it by 8, not by BITS_PER_LONG, to get number of bitmap_bits. Fixes crash in find_next_bit() that was reported: https://bugzilla.kernel.org/show_bug.cgi?id=200819 Reported-by: edo.rus@gmail.com Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
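The arithmetic behind the dm writecache fix can be shown with a short sketch. The helper names are hypothetical and BITS_PER_LONG is redefined here for userspace; only the "multiply bytes by 8, not by BITS_PER_LONG" rule comes from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Number of valid bits in a bitmap whose size is given in bytes. */
size_t bitmap_bits_from_bytes(size_t bytes)
{
	assert(bytes <= (size_t)-1 / CHAR_BIT);	/* product must not overflow */
	return bytes * CHAR_BIT;		/* 8 bits per byte */
}

/* The mistaken computation, kept for comparison: too large by a factor of sizeof(long). */
size_t bitmap_bits_buggy(size_t bytes)
{
	return bytes * BITS_PER_LONG;
}
```

For a 16-byte bitmap the correct count is 128 bits. The mistaken version reports sizeof(long) times as many, so a bit search bounded by it walks past the end of the allocation.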
2018-08-16  netfilter: nft_dynset: allow dynamic updates of non-anonymous set  Pablo Neira Ayuso
This check is superfluous since it breaks valid configurations, remove it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: nft_tproxy: Fix missing-braces warning  Máté Eckl
This patch fixes a warning reported by the kbuild test robot (from linux-next tree):
   net/netfilter/nft_tproxy.c: In function 'nft_tproxy_eval_v6':
>> net/netfilter/nft_tproxy.c:85:9: warning: missing braces around initializer [-Wmissing-braces]
     struct in6_addr taddr = {0};
     ^
   net/netfilter/nft_tproxy.c:85:9: warning: (near initialization for 'taddr.in6_u') [-Wmissing-braces]
This warning is actually caused by a gcc bug already resolved in newer versions (kbuild used 4.9) so this kind of initialization is omitted and memset is used instead.
Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
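The memset approach mentioned above, sketched in userspace. struct addr6 is a hypothetical stand-in for struct in6_addr (a struct whose first member is a union), and addr6_clear()/addr6_is_zero() are illustrative helpers, not functions from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: the first member is a union. */
struct addr6 {
	union {
		unsigned char  u8[16];
		unsigned short u16[8];
		unsigned int   u32[4];
	} u;
};

/* Zero the address without an initializer list, so old gcc has no braces to complain about. */
void addr6_clear(struct addr6 *a)
{
	assert(a != NULL);
	memset(a, 0, sizeof(*a));
}

/* Returns nonzero if every byte of the address is zero. */
int addr6_is_zero(const struct addr6 *a)
{
	static const struct addr6 zero;	/* static storage is zero-initialized */

	return memcmp(a, &zero, sizeof(zero)) == 0;
}
```

Both `= {0}` and memset() produce an all-zero address here; the memset form simply sidesteps the -Wmissing-braces diagnostic on the affected compiler versions.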
2018-08-16  netfilter: uapi: fix linux/netfilter/nf_osf.h userspace compilation errors  Dmitry V. Levin
Move inclusion of <linux/ip.h> and <linux/tcp.h> from linux/netfilter/xt_osf.h to linux/netfilter/nf_osf.h to fix the following linux/netfilter/nf_osf.h userspace compilation errors:
  /usr/include/linux/netfilter/nf_osf.h:59:24: error: 'MAX_IPOPTLEN' undeclared here (not in a function)
    struct nf_osf_opt opt[MAX_IPOPTLEN];
  /usr/include/linux/netfilter/nf_osf.h:64:17: error: field 'ip' has incomplete type
    struct iphdr ip;
  /usr/include/linux/netfilter/nf_osf.h:65:18: error: field 'tcp' has incomplete type
    struct tcphdr tcp;
Fixes: bfb15f2a95cb ("netfilter: extract Passive OS fingerprint infrastructure from xt_osf")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: nft_ct: make l3 protocol field optional for timeout object  Harsha Sharma
If l3 protocol value is not specified for ct timeout object then use the value from nft_ctx protocol family. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: doc: Add nf_tables part in tproxy.txt  Máté Eckl
Recently, transparent proxy support has been added to nf_tables, so this document should be updated with the new information.
- Nft commands are added as alternatives to iptables ones.
- The link for a patched iptables is removed as it is already part of the mainline iptables implementation (and the link is dead).
- tcprdr is added as an example implementation of a transparent proxy.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Florian Westphal <fw@strlen.de>
Cc: KOVACS Krisztian <hidden@sch.bme.hu>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: x_tables: do not fail xt_alloc_table_info too easilly  Michal Hocko
eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_table_info()") has unintentionally fortified xt_alloc_table_info allocation when __GFP_NORETRY has been dropped from the vmalloc fallback. Later on there was a syzbot report that this can lead to OOM killer invocations when tables are too large and 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive") has been merged to restore the original behavior. Georgi Nikolov however noticed that he is not able to install his iptables anymore so this can be seen as a regression.
The primary argument for 0537250fdc6c was that this allocation path shouldn't really trigger the OOM killer and kill innocent tasks. On the other hand the interface requires root and as such should allow what the admin asks for. Root inside a namespace makes this more complicated because those might not be trusted in general. If they are not then such namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY and replace it by __GFP_ACCOUNT to enforce memcg constraints on it.
Fixes: 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
Reported-by: Georgi Nikolov <gnikolov@icdsoft.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: conntrack: fix removal of conntrack entries when l4tracker is removed  Florian Westphal
nf_ct_l4proto_unregister_one() leaves conntracks added by to-be-removed tracker behind, nf_ct_l4proto_unregister has to iterate for each protocol to be removed. v2: call nf_ct_iterate_destroy without holding nf_ct_proto_mutex. Fixes: 2c41f33c1b703 ("netfilter: move table iteration out of netns exit paths") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: nf_tables: don't prevent event handler from device cleanup on netns exit  Florian Westphal
When a network namespace exits, the nf_tables pernet_ops will remove all rules. However, there is one caveat: base chains that register ingress hooks will cause a use-after-free, since the device is already gone at that point. The device event handlers prevent this from happening: netns exit synthesizes unregister events for all devices. However, an improper fix for a race condition made the notifiers a no-op in case they get called from the netns exit path, so revert that part. This is safe now as the previous patch fixed nf_tables pernet ops and device notifier initialisation ordering.
Fixes: 0a2cf5ee432c2 ("netfilter: nf_tables: close race between netns exit and rmmod")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: nf_tables: fix register ordering  Florian Westphal
We must register nfnetlink ops last, as that exposes nf_tables to userspace. Without this, we could theoretically get nfnetlink request before net->nft state has been initialized. Fixes: 99633ab29b213 ("netfilter: nf_tables: complete net namespace support") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: fix memory leaks on netlink_dump_start error  Florian Westphal
Shaochun Chen points out we leak dumper filter state allocations stored in dump_control->data in case there is an error before netlink sets cb_running (after which ->done will be called at some point). In order to fix this, add .start functions and move allocations there. Same pattern as used in commit 90fd131afc565159c9e0ea742f082b337e10f8c6 ("netfilter: nf_tables: move dumper state allocation into ->start"). Reported-by: shaochun chen <cscnull@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16  netfilter: nft_set: fix allocation size overflow in privsize callback.  Taehee Yoo
In order to determine the allocation size of a set, ->privsize is invoked. At this point, both desc->size and the size of each data structure of the set are used. desc->size is the number of elements given by the user. desc->size is of type u32, so the upper limit on set elements is 4294967295. But the return type of ->privsize is also u32, hence an overflow can occur.
test commands:
   %nft add table ip filter
   %nft add set ip filter hash1 { type ipv4_addr \; size 4294967295 \; }
   %nft list ruleset
splat looks like:
[ 1239.202910] kasan: CONFIG_KASAN_INLINE enabled
[ 1239.208788] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 1239.217625] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 1239.219329] CPU: 0 PID: 1603 Comm: nft Not tainted 4.18.0-rc5+ #7
[ 1239.229091] RIP: 0010:nft_hash_walk+0x1d2/0x310 [nf_tables_set]
[ 1239.229091] Code: 84 d2 7f 10 4c 89 e7 89 44 24 38 e8 d8 5a 17 e0 8b 44 24 38 48 8d 7b 10 41 0f b6 0c 24 48 89 fa 48 89 fe 48 c1 ea 03 83 e6 07 <42> 0f b6 14 3a 40 38 f2 7f 1a 84 d2 74 16
[ 1239.229091] RSP: 0018:ffff8801118cf358 EFLAGS: 00010246
[ 1239.229091] RAX: 0000000000000000 RBX: 0000000000020400 RCX: 0000000000000001
[ 1239.229091] RDX: 0000000000004082 RSI: 0000000000000000 RDI: 0000000000020410
[ 1239.229091] RBP: ffff880114d5a988 R08: 0000000000007e94 R09: ffff880114dd8030
[ 1239.229091] R10: ffff880114d5a988 R11: ffffed00229bb006 R12: ffff8801118cf4d0
[ 1239.229091] R13: ffff8801118cf4d8 R14: 0000000000000000 R15: dffffc0000000000
[ 1239.229091] FS: 00007f5a8fe0b700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000
[ 1239.229091] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1239.229091] CR2: 00007f5a8ecc27b0 CR3: 000000010608e000 CR4: 00000000001006f0
[ 1239.229091] Call Trace:
[ 1239.229091] ? nft_hash_remove+0xf0/0xf0 [nf_tables_set]
[ 1239.229091] ? memset+0x1f/0x40
[ 1239.229091] ? __nla_reserve+0x9f/0xb0
[ 1239.229091] ? memcpy+0x34/0x50
[ 1239.229091] nf_tables_dump_set+0x9a1/0xda0 [nf_tables]
[ 1239.229091] ? __kmalloc_reserve.isra.29+0x2e/0xa0
[ 1239.229091] ? nft_chain_hash_obj+0x630/0x630 [nf_tables]
[ 1239.229091] ? nf_tables_commit+0x2c60/0x2c60 [nf_tables]
[ 1239.229091] netlink_dump+0x470/0xa20
[ 1239.229091] __netlink_dump_start+0x5ae/0x690
[ 1239.229091] nft_netlink_dump_start_rcu+0xd1/0x160 [nf_tables]
[ 1239.229091] nf_tables_getsetelem+0x2e5/0x4b0 [nf_tables]
[ 1239.229091] ? nft_get_set_elem+0x440/0x440 [nf_tables]
[ 1239.229091] ? nft_chain_hash_obj+0x630/0x630 [nf_tables]
[ 1239.229091] ? nf_tables_dump_obj_done+0x70/0x70 [nf_tables]
[ 1239.229091] ? nla_parse+0xab/0x230
[ 1239.229091] ? nft_get_set_elem+0x440/0x440 [nf_tables]
[ 1239.229091] nfnetlink_rcv_msg+0x7f0/0xab0 [nfnetlink]
[ 1239.229091] ? nfnetlink_bind+0x1d0/0x1d0 [nfnetlink]
[ 1239.229091] ? debug_show_all_locks+0x290/0x290
[ 1239.229091] ? sched_clock_cpu+0x132/0x170
[ 1239.229091] ? find_held_lock+0x39/0x1b0
[ 1239.229091] ? sched_clock_local+0x10d/0x130
[ 1239.229091] netlink_rcv_skb+0x211/0x320
[ 1239.229091] ? nfnetlink_bind+0x1d0/0x1d0 [nfnetlink]
[ 1239.229091] ? netlink_ack+0x7b0/0x7b0
[ 1239.229091] ? ns_capable_common+0x6e/0x110
[ 1239.229091] nfnetlink_rcv+0x2d1/0x310 [nfnetlink]
[ 1239.229091] ? nfnetlink_rcv_batch+0x10f0/0x10f0 [nfnetlink]
[ 1239.229091] ? netlink_deliver_tap+0x829/0x930
[ 1239.229091] ? lock_acquire+0x265/0x2e0
[ 1239.229091] netlink_unicast+0x406/0x520
[ 1239.509725] ? netlink_attachskb+0x5b0/0x5b0
[ 1239.509725] ? find_held_lock+0x39/0x1b0
[ 1239.509725] netlink_sendmsg+0x987/0xa20
[ 1239.509725] ? netlink_unicast+0x520/0x520
[ 1239.509725] ? _copy_from_user+0xa9/0xc0
[ 1239.509725] __sys_sendto+0x21a/0x2c0
[ 1239.509725] ? __ia32_sys_getpeername+0xa0/0xa0
[ 1239.509725] ? retint_kernel+0x10/0x10
[ 1239.509725] ? sched_clock_cpu+0x132/0x170
[ 1239.509725] ? find_held_lock+0x39/0x1b0
[ 1239.509725] ? lock_downgrade+0x540/0x540
[ 1239.509725] ? up_read+0x1c/0x100
[ 1239.509725] ? __do_page_fault+0x763/0x970
[ 1239.509725] ? retint_user+0x18/0x18
[ 1239.509725] __x64_sys_sendto+0x177/0x180
[ 1239.509725] do_syscall_64+0xaa/0x360
[ 1239.509725] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1239.509725] RIP: 0033:0x7f5a8f468e03
[ 1239.509725] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb d0 0f 1f 84 00 00 00 00 00 83 3d 49 c9 2b 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8
[ 1239.509725] RSP: 002b:00007ffd78d0b778 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 1239.509725] RAX: ffffffffffffffda RBX: 00007ffd78d0c890 RCX: 00007f5a8f468e03
[ 1239.509725] RDX: 0000000000000034 RSI: 00007ffd78d0b7e0 RDI: 0000000000000003
[ 1239.509725] RBP: 00007ffd78d0b7d0 R08: 00007f5a8f15c160 R09: 000000000000000c
[ 1239.509725] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd78d0b7e0
[ 1239.509725] R13: 0000000000000034 R14: 00007f5a8f9aff60 R15: 00005648040094b0
[ 1239.509725] Modules linked in: nf_tables_set nf_tables nfnetlink ip_tables x_tables
[ 1239.670713] ---[ end trace 39375adcda140f11 ]---
[ 1239.676016] RIP: 0010:nft_hash_walk+0x1d2/0x310 [nf_tables_set]
[ 1239.682834] Code: 84 d2 7f 10 4c 89 e7 89 44 24 38 e8 d8 5a 17 e0 8b 44 24 38 48 8d 7b 10 41 0f b6 0c 24 48 89 fa 48 89 fe 48 c1 ea 03 83 e6 07 <42> 0f b6 14 3a 40 38 f2 7f 1a 84 d2 74 16
[ 1239.705108] RSP: 0018:ffff8801118cf358 EFLAGS: 00010246
[ 1239.711115] RAX: 0000000000000000 RBX: 0000000000020400 RCX: 0000000000000001
[ 1239.719269] RDX: 0000000000004082 RSI: 0000000000000000 RDI: 0000000000020410
[ 1239.727401] RBP: ffff880114d5a988 R08: 0000000000007e94 R09: ffff880114dd8030
[ 1239.735530] R10: ffff880114d5a988 R11: ffffed00229bb006 R12: ffff8801118cf4d0
[ 1239.743658] R13: ffff8801118cf4d8 R14: 0000000000000000 R15: dffffc0000000000
[ 1239.751785] FS: 00007f5a8fe0b700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000
[ 1239.760993] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1239.767560] CR2: 00007f5a8ecc27b0 CR3: 000000010608e000 CR4: 00000000001006f0
[ 1239.775679] Kernel panic - not syncing: Fatal exception
[ 1239.776630] Kernel Offset: 0x1f000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1239.776630] Rebooting in 5 seconds..
Fixes: 20a69341f2d0 ("netfilter: nf_tables: add netlink set API")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16netfilter: ip6t_rpfilter: set F_IFACE for linklocal addressesFlorian Westphal
Roman reports that a DHCPv6 client no longer sees replies from the server due to an ip6tables -t raw -A PREROUTING -m rpfilter --invert -j DROP rule. We need to set the F_IFACE flag for link-local addresses, since they are scoped per-device. Fixes: 47b7e7f82802 ("netfilter: don't set F_IFACE on ipv6 fib lookups") Reported-by: Roman Mamedov <rm@romanrm.net> Tested-by: Roman Mamedov <rm@romanrm.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
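The idea of forcing an interface-bound lookup only for link-local sources can be sketched in userspace as follows. This is a minimal illustration, not the kernel patch: sketch_lookup_flags() and SKETCH_LOOKUP_F_IFACE are hypothetical names, and the flag value is made up for the example. Only inet_pton() and IN6_IS_ADDR_LINKLOCAL() are real POSIX interfaces.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define SKETCH_LOOKUP_F_IFACE 0x1

/*
 * Return the lookup flags a reverse-path check might use for a given
 * source address, or -1 if the text is not a valid IPv6 address.
 */
static int sketch_lookup_flags(const char *src_text)
{
	struct in6_addr src;
	int flags = 0;

	if (inet_pton(AF_INET6, src_text, &src) != 1)
		return -1;

	/* Link-local (fe80::/10) addresses are only meaningful per device. */
	if (IN6_IS_ADDR_LINKLOCAL(&src))
		flags |= SKETCH_LOOKUP_F_IFACE;

	return flags;
}
```

With this sketch, a source such as fe80::1 gets the interface flag, while a global address such as 2001:db8::1 does not.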
2018-08-16ipvs: don't show negative times in ip_vs_connMatteo Croce
Since commit 500462a9de65 ("timers: Switch to a non-cascading wheel"), a timer can run as much as 12.5% longer than the scheduled interval. IPVS has two handlers, /proc/net/ip_vs_conn and /proc/net/ip_vs_conn_sync, which show the remaining time before a connection expires. The default expire time for a connection is 60 seconds, and the expiration timer can fire as much as 4 seconds later than the scheduled time. The remaining time is calculated by subtracting jiffies from the scheduled expiration time, and it is shown as a huge number when the timer fires late, since both values are unsigned. This can confuse scripts and tools which rely on it, like ipvsadm: root@mcroce-redhat:~# while ipvsadm -lc |grep SYN_RECV; do sleep 1 ; done TCP 00:05 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 00:04 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 00:03 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 00:02 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 00:01 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 00:00 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:44 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:43 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:42 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:41 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:40 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 TCP 68719476:39 SYN_RECV [fc00:1::1]:55732 [fc00:1::2]:8000 [fc00:2000::1]:8000 Signed-off-by: Matteo Croce <mcroce@redhat.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
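The wraparound described above can be reproduced with plain unsigned arithmetic. The sketch below assumes 32-bit tick counters for readability (the kernel's jiffies is an unsigned long), and both function names are hypothetical; it only illustrates why a late timer yields a huge value and does not try to reproduce the exact ipvsadm figures.

```c
#include <stdint.h>

/* Buggy pattern: both operands are unsigned, so a late timer wraps around. */
static uint32_t sketch_remaining_ticks(uint32_t expires, uint32_t now)
{
	return expires - now;
}

/*
 * Reading the same difference as a signed value shows how late the timer
 * is. The conversion of an out-of-range value to int32_t is
 * implementation-defined in C, but wraps on common compilers.
 */
static int32_t sketch_signed_delta(uint32_t expires, uint32_t now)
{
	return (int32_t)(expires - now);
}
```

When now is 4000 ticks past expires, the unsigned result is close to 2^32, whereas the signed view is simply -4000.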
2018-08-16jiffies: add utility function to calculate delta in msMatteo Croce
Add a jiffies_delta_to_msecs() helper function that calculates the delta between two times in milliseconds and returns 0 if the delta is negative. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Matteo Croce <mcroce@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
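A userspace sketch of such a clamping helper might look like the following. It is an illustration under stated assumptions rather than the kernel implementation: SKETCH_HZ is a made-up tick rate, and sketch_jiffies_to_msecs() is a simplified stand-in for the kernel's jiffies_to_msecs().

```c
/* Hypothetical tick rate for this sketch; one tick is 4 ms. */
#define SKETCH_HZ 250L

/* Simplified stand-in for the kernel's jiffies_to_msecs(). */
static unsigned int sketch_jiffies_to_msecs(unsigned long j)
{
	return (unsigned int)(j * (unsigned long)(1000L / SKETCH_HZ));
}

/* Convert a signed tick delta to milliseconds, clamping negatives to 0. */
static unsigned int sketch_jiffies_delta_to_msecs(long delta)
{
	return sketch_jiffies_to_msecs(delta > 0 ? (unsigned long)delta : 0UL);
}
```

Taking the delta as a signed long is the key design choice: the caller subtracts first, and the helper decides that "already expired" means zero rather than a wrapped unsigned value.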
2018-08-16ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()Tan Hu
We came across an infinite loop in ipvs when using ipvs in a docker environment. When ipvs receives new packets and cannot find an ipvs connection, it will create a new connection; then, if the dest is unavailable (i.e. IP_VS_DEST_F_AVAILABLE is not set), the packet will be dropped silently. But if the dropped packet is the first packet of this connection, the connection control timer never has a chance to start and the ipvs connection cannot be released. This will lead to a memory leak, or an infinite loop in cleanup_net() when the net namespace is released, like this: ip_vs_conn_net_cleanup at ffffffffa0a9f31a [ip_vs] __ip_vs_cleanup at ffffffffa0a9f60a [ip_vs] ops_exit_list at ffffffff81567a49 cleanup_net at ffffffff81568b40 process_one_work at ffffffff810a851b worker_thread at ffffffff810a9356 kthread at ffffffff810b0b6f ret_from_fork at ffffffff81697a18 race condition: CPU1 CPU2 ip_vs_in() ip_vs_conn_new() ip_vs_del_dest() __ip_vs_unlink_dest() ~IP_VS_DEST_F_AVAILABLE cp->dest && !IP_VS_DEST_F_AVAILABLE __ip_vs_conn_put ... cleanup_net ---> infinite looping Fix this by checking whether the timer already started. Signed-off-by: Tan Hu <tan.hu@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-16Merge tag 'vfio-v4.19-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - mark switch fall-through cases (Gustavo A. R. Silva) - disable binding SR-IOV enabled PFs (Alex Williamson) * tag 'vfio-v4.19-rc1' of git://github.com/awilliam/linux-vfio: vfio-pci: Disable binding to PFs with SR-IOV enabled vfio: Mark expected switch fall-throughs
2018-08-16Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal management updates from Eduardo Valentin: - rework tsens driver to add support for tsens-v2 (Amit Kucheria) - rework armada thermal driver to use syscon and multichannel support (Miquel Raynal) - fixes to TI SoC, IMX, Exynos, RCar, and hwmon drivers * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: (34 commits) thermal: armada: fix copy-paste error in armada_thermal_probe() thermal: rcar_thermal: avoid NULL dereference in absence of IRQ resources thermal: samsung: Remove Exynos5440 clock handling left-overs thermal: tsens: Fix negative temperature reporting thermal: tsens: switch from of_iomap() to devm_ioremap_resource() thermal: tsens: Rename variable thermal: tsens: Add generic support for TSENS v2 IP thermal: tsens: Rename tsens-8996 to tsens-v2 for reuse thermal: tsens: Add support to split up register address space into two dt: thermal: tsens: Document the fallback DT property for v2 of TSENS IP thermal: tsens: Get rid of unused fields in structure thermal_hwmon: Pass the originating device down to hwmon_device_register_with_info thermal_hwmon: Sanitize attribute name passed to hwmon dt-bindings: thermal: armada: add reference to new bindings dt-bindings: cp110: add the thermal node in the syscon file dt-bindings: cp110: update documentation since DT de-duplication dt-bindings: ap806: add the thermal node in the syscon file dt-bindings: cp110: prepare the syscon file to list other syscons nodes dt-bindings: ap806: prepare the syscon file to list other syscons nodes dt-bindings: cp110: rename cp110 syscon file ...
2018-08-16Merge tag 'mailbox-v4.19' of ↵Linus Torvalds
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox updates from Jassi Brar: - xgene: potential null pointer fix - omap: switch to spdx license and use of_device_get_match_data() to match data - ti-msgmgr: cleanup and optimisation. New TI specific feature - secure proxy thread. - mediatek: add driver for CMDQ controller. - nxp: add driver for MU controller * tag 'mailbox-v4.19' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: Add support for i.MX messaging unit dt-bindings: mailbox: imx-mu: add generic MU channel support dt-bindings: arm: fsl: add mu binding doc mailbox: add MODULE_LICENSE() for mtk-cmdq-mailbox.c mailbox: mediatek: Add Mediatek CMDQ driver dt-bindings: soc: Add documentation for the MediaTek GCE unit mailbox: ti-msgmgr: Add support for Secure Proxy dt-bindings: mailbox: Add support for secure proxy threads mailbox: ti-msgmgr: Move the memory region name to descriptor mailbox: ti-msgmgr: Change message count mask to be descriptor based mailbox: ti-msgmgr: Allocate Rx channel resources only on request mailbox: ti-msgmgr: Get rid of unused structure members mailbox/omap: use of_device_get_match_data() to get match data mailbox/omap: switch to SPDX license identifier mailbox: xgene-slimpro: Fix potential NULL pointer dereference
2018-08-16Fix kexec forbidding kernels signed with keys in the secondary keyring to bootYannik Sembritzki
The split of .system_keyring into .builtin_trusted_keys and .secondary_trusted_keys broke kexec, thereby preventing kernels signed by keys which are now in the secondary keyring from being kexec'd. Fix this by passing VERIFY_USE_SECONDARY_KEYRING to verify_pefile_signature(). Fixes: d3bfe84129f6 ("certs: Add a secondary system keyring that can be added to dynamically") Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me> Signed-off-by: David Howells <dhowells@redhat.com> Cc: kexec@lists.infradead.org Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-16Replace magic for trusting the secondary keyring with #defineYannik Sembritzki
Replace the use of a magic number that indicates that verify_*_signature() should use the secondary keyring with a symbol. Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me> Signed-off-by: David Howells <dhowells@redhat.com> Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
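The pattern of replacing a magic pointer value with a named symbol can be sketched as below. Everything prefixed with sketch_ or SKETCH_ is hypothetical, and the sentinel value of 1 is an assumption made for illustration; the sketch shows only the general technique of a named, non-dereferenceable sentinel, not the kernel's verification code.

```c
#include <stddef.h>

struct key;	/* opaque type; never dereferenced in this sketch */

/* Assumed sentinel: a pointer value that can never be a real object. */
#define SKETCH_USE_SECONDARY_KEYRING ((struct key *)1UL)

enum sketch_keyring { SKETCH_BUILTIN, SKETCH_SECONDARY, SKETCH_CALLER };

/* Decide which keyring to trust from the caller-supplied pointer. */
static enum sketch_keyring sketch_pick_keyring(const struct key *trusted_keys)
{
	if (trusted_keys == NULL)
		return SKETCH_BUILTIN;		/* default: built-in keys only */
	if (trusted_keys == SKETCH_USE_SECONDARY_KEYRING)
		return SKETCH_SECONDARY;	/* named sentinel, not a bare 1 */
	return SKETCH_CALLER;			/* a real, caller-owned keyring */
}
```

Callers then pass the symbol instead of a cast literal, which makes call sites searchable and self-describing.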
2018-08-16Merge tag 'pci-v4.19-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: - Decode AER errors with names similar to "lspci" (Tyler Baicar) - Expose AER statistics in sysfs (Rajat Jain) - Clear AER status bits selectively based on the type of recovery (Oza Pawandeep) - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru Gagniuc) - Don't clear AER status bits if we're using the "Firmware-First" strategy where firmware owns the registers (Alexandru Gagniuc) - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy Shevchenko) - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas) - Defer DPC event handling to work queue (Keith Busch) - Use threaded IRQ for DPC bottom half (Keith Busch) - Print AER status while handling DPC events (Keith Busch) - Work around IDT switch ACS Source Validation erratum (James Puthukattukaran) - Emit diagnostics for all cases of PCIe Link downtraining (Links operating slower than they're capable of) (Alexandru Gagniuc) - Skip VFs when configuring Max Payload Size (Myron Stowe) - Reduce Root Port Max Payload Size if necessary when hot-adding a device below it (Myron Stowe) - Simplify SHPC existence/permission checks (Bjorn Helgaas) - Remove hotplug sample skeleton driver (Lukas Wunner) - Convert pciehp to threaded IRQ handling (Lukas Wunner) - Improve pciehp tolerance of missed events and initially unstable links (Lukas Wunner) - Clear spurious pciehp events on resume (Lukas Wunner) - Add pciehp runtime PM support, including for Thunderbolt controllers (Lukas Wunner) - Support interrupts from pciehp bridges in D3hot (Lukas Wunner) - Mark fall-through switch cases before enabling -Wimplicit-fallthrough (Gustavo A. R. 
Silva) - Move DMA-debug PCI init from arch code to PCI core (Christoph Hellwig) - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is supplied (Heiner Kallweit) - Unify PCI and DMA direction #defines (Shunyong Yang) - Add PCI_DEVICE_DATA() macro (Andy Shevchenko) - Check for VPD completion before checking for timeout (Bert Kenward) - Limit Netronome NFP5000 config space size to work around erratum (Jakub Kicinski) - Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit) - Document ACPI description of PCI host bridges (Bjorn Helgaas) - Add "pci=disable_acs_redir=" parameter to disable ACS redirection for peer-to-peer DMA support (we don't have the peer-to-peer support yet; this is just one piece) (Logan Gunthorpe) - Clean up devm_of_pci_get_host_bridge_resources() resource allocation (Jan Kiszka) - Fixup resizable BARs after suspend/resume (Christian König) - Make "pci=earlydump" generic (Sinan Kaya) - Fix ROM BAR access routines to stay in bounds and check for signature correctly (Rex Zhu) - Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer) - Expand documentation for pci_add_dma_alias() (Logan Gunthorpe) - To avoid bus errors, enable PASID only if entire path supports End-End TLP prefixes (Sinan Kaya) - Unify slot and bus reset functions and remove hotplug knowledge from callers (Sinan Kaya) - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to fix guest reboot issues (Alex Williamson) - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD Controller (Bjorn Helgaas) - Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt) - Remove Aardvark outbound window configuration (Evan Wang) - Fix Aardvark bridge window sizing issue (Zachary Zhang) - Convert Aardvark to use pci_host_probe() to reduce code duplication (Thomas Petazzoni) - Correct the Cadence cdns_pcie_writel() signature (Alan Douglas) - Add Cadence support for optional generic PHYs (Alan Douglas) - Add Cadence power management ops (Alan 
Douglas) - Remove redundant variable from Cadence driver (Colin Ian King) - Add Kirin MSI support (Xiaowei Song) - Drop unnecessary root_bus_nr setting from exynos, imx6, keystone, armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn Guo) - Move link notification settings from DesignWare core to individual drivers (Gustavo Pimentel) - Add endpoint library MSI-X interfaces (Gustavo Pimentel) - Correct signature of endpoint library IRQ interfaces (Gustavo Pimentel) - Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel) - Add endpoint library MSI-X test support (Gustavo Pimentel) - Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation (Jia-Ju Bai) - Add more devices to Broadcom PAXC quirk (Ray Jui) - Work around corrupted Broadcom PAXC config space to enable SMMU and GICv3 ITS (Ray Jui) - Disable MSI parsing to work around broken Broadcom PAXC logic in some devices (Ray Jui) - Hide unconfigured functions to work around a Broadcom PAXC defect (Ray Jui) - Lower iproc log level to reduce console output during boot (Ray Jui) - Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi) - Fix mobiveil missing include file (Lorenzo Pieralisi) - Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi) - Fix mvebu I/O space remapping issues (Thomas Petazzoni) - Use generic pci_host_bridge in mvebu instead of ARM-specific API (Thomas Petazzoni) - Whitelist VMD devices with fast interrupt handlers to avoid sharing vectors with slow handlers (Keith Busch) * tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits) PCI/AER: Don't clear AER bits if error handling is Firmware-First PCI: Limit config space size for Netronome NFP5000 PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips PCI/VPD: Check for VPD access completion before checking for timeout PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry PCI: Match Root Port's MPS to endpoint's MPSS as necessary PCI: Skip MPS logic 
for Virtual Functions (VFs) PCI: Add function 1 DMA alias quirk for Marvell 88SS9183 PCI: Check for PCIe Link downtraining PCI: Add ACS Redirect disable quirk for Intel Sunrise Point PCI: Add device-specific ACS Redirect disable infrastructure PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support PCI: Allow specifying devices using a base bus and path of devfns PCI: Make specifying PCI devices in kernel parameters reusable PCI: Hide ACS quirk declarations inside PCI core PCI: Delay after FLR of Intel DC P3700 NVMe PCI: Disable Samsung SM961/PM961 NVMe before FLR PCI: Export pcie_has_flr() PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers() ...
2018-08-15Merge branch 'next-integrity' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull integrity updates from James Morris: "This adds support for EVM signatures based on larger digests, contains a new audit record AUDIT_INTEGRITY_POLICY_RULE to differentiate the IMA policy rules from the IMA-audit messages, addresses two deadlocks due to either loading or searching for crypto algorithms, and cleans up the audit messages" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: EVM: fix return value check in evm_write_xattrs() integrity: prevent deadlock during digsig verification. evm: Allow non-SHA1 digital signatures evm: Don't deadlock if a crypto algorithm is unavailable integrity: silence warning when CONFIG_SECURITYFS is not enabled ima: Differentiate auditing policy rules from "audit" actions ima: Do not audit if CONFIG_INTEGRITY_AUDIT is not set ima: Use audit_log_format() rather than audit_log_string() ima: Call audit_log_string() rather than logging it untrusted
2018-08-15Merge branch 'next-tpm' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull TPM updates from James Morris: - Migrate away from PM runtime, as explicit cmdReady/goIdle transactions for every command are a spec requirement. PM runtime only adds a layer of complexity in our case. - tpm_tis drivers can now specify the hwrng quality. - TPM 2.0 code now uses tpm_buf for constructing messages. Jarkko thinks Tomas Winkler has done the same for TPM 1.2, and will start digging those changes from the patchwork in the near future. - Bug fixes and clean ups * 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: ima: Get rid of ima_used_chip and use ima_tpm_chip != NULL instead ima: Use tpm_default_chip() and call TPM functions with a tpm_chip tpm: replace TPM_TRANSMIT_RAW with TPM_TRANSMIT_NESTED tpm: Convert tpm_find_get_ops() to use tpm_default_chip() tpm: Implement tpm_default_chip() to find a TPM chip tpm: rename tpm_chip_find_get() to tpm_find_get_ops() tpm: Allow tpm_tis drivers to set hwrng quality. tpm: Return the actual size when receiving an unsupported command tpm: separate cmd_ready/go_idle from runtime_pm tpm/tpm_i2c_infineon: switch to i2c_lock_bus(..., I2C_LOCK_SEGMENT) tpm_tis_spi: Pass the SPI IRQ down to the driver tpm: migrate tpm2_get_random() to use struct tpm_buf tpm: migrate tpm2_get_tpm_pt() to use struct tpm_buf tpm: migrate tpm2_probe() to use struct tpm_buf tpm: migrate tpm2_shutdown() to use struct tpm_buf
2018-08-15Merge branch 'next-smack' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull smack updates from James Morris: "Minor fixes from Piotr Sawicki" * 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: Inform peer that IPv6 traffic has been blocked Smack: Check UDP-Lite and DCCP protocols during IPv6 handling Smack: Fix handling of IPv4 traffic received by PF_INET6 sockets
2018-08-15Merge tag 'jfs-4.19' of git://github.com/kleikamp/linux-shaggyLinus Torvalds
Pull jfs update from David Kleikamp: "Just one jfs patch for 4.19" * tag 'jfs-4.19' of git://github.com/kleikamp/linux-shaggy: jfs: use time64_t for otime
2018-08-15Merge tag 'gfs2-4.19.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - iomap support for buffered writes and for direct I/O - two patches that reduce the size of struct gfs2_inode - lots of fixes and cleanups * tag 'gfs2-4.19.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (25 commits) gfs2: eliminate update_rgrp_lvb_unlinked gfs2: Fix gfs2_testbit to use clone bitmaps gfs2: Get rid of gfs2_ea_strlen gfs2: cleanup: call gfs2_rgrp_ondisk2lvb from gfs2_rgrp_out gfs2: Special-case rindex for gfs2_grow GFS2: rgrp free blocks used incorrectly gfs2: remove redundant variable 'moved' gfs2: use iomap_readpage for blocksize == PAGE_SIZE gfs2: Use iomap for stuffed direct I/O reads gfs2: fallocate_chunk: Always initialize struct iomap GFS2: Fix recovery issues for spectators fs: gfs2: Adding new return type vm_fault_t gfs2: using posix_acl_xattr_size instead of posix_acl_to_xattr gfs2: Don't reject a supposedly full bitmap if we have blocks reserved gfs2: Eliminate redundant ip->i_rgd gfs2: Stop messing with ip->i_rgd in the rlist code gfs2: Remove gfs2_write_{begin,end} gfs2: iomap direct I/O support gfs2: gfs2_extent_length cleanup gfs2: iomap buffered write support ...
2018-08-15Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx, hisi_sas, smartpqi, megaraid_sas, arcmsr. In addition, with the continuing absence of Nic we have target updates for tcmu and target core (all with reviews and acks). The biggest observable change is going to be that we're (again) trying to switch to multiqueue as the default (a user can still override the setting on the kernel command line). Other major core stuff is the removal of the remaining Microchannel drivers, an update of the internal timers and some reworks of completion and result handling" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits) scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue scsi: ufs: remove unnecessary query(DM) UPIU trace scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done() scsi: aacraid: Spelling fix in comment scsi: mpt3sas: Fix calltrace observed while running IO & reset scsi: aic94xx: fix an error code in aic94xx_init() scsi: st: remove redundant pointer STbuffer scsi: qla2xxx: Update driver version to 10.00.00.08-k scsi: qla2xxx: Migrate NVME N2N handling into state machine scsi: qla2xxx: Save frame payload size from ICB scsi: qla2xxx: Fix stalled relogin scsi: qla2xxx: Fix race between switch cmd completion and timeout scsi: qla2xxx: Fix Management Server NPort handle reservation logic scsi: qla2xxx: Flush mailbox commands on chip reset scsi: qla2xxx: Fix unintended Logout scsi: qla2xxx: Fix session state stuck in Get Port DB scsi: qla2xxx: Fix redundant fc_rport registration scsi: qla2xxx: Silent erroneous message scsi: qla2xxx: Prevent sysfs access when chip is down scsi: qla2xxx: Add longer window for chip reset ...
2018-08-15Merge tag 'clk-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The new and exciting feature this time around is in the clk core. We've added duty cycle support to the clk API so that clk signal duty cycle ratios can be adjusted while taking into account things like clk dividers and clk tree hierarchy. So far only one SoC has implemented support for this, but I expect there will be more to come in the future. Outside of the core, we have the usual pile of clk driver updates and additions. The Amlogic meson driver got the most lines in the diffstat this time around because it added support for a whole bunch of hardware and duty cycle configuration. After that the Rockchip PX30, Qualcomm SDM845, and Renesas SoC drivers fill in a majority of the diff. We're left with the collection of non-critical fixes after that. Overall it looks pretty quiet this time. Core: - Clk duty cycle support - Proper CLK_SET_RATE_GATE support throughout the tree New Drivers: - Actions Semi Owl series S700 SoC clk driver - Qualcomm SDM845 display clock controller - i.MX6SX ocram_s clk support - Uniphier NAND, USB3 PHY, and SPI clk support - Qualcomm RPMh clk driver - i.MX7D mailbox clk support - Maxim 9485 Programmable Clock Generator - expose 32 kHz PLL on PXA SoCs - imx6sll GPIO clk gate support - Atmel at91 I2S audio clk support - SI544/SI514 clk on/off support - i.MX6UL GPIO clock gates in CCM CCGR - Renesas Crypto Engine clocks on R-Car H3 - Renesas clk support for the new RZ/N1D SoC - Allwinner A64 display engine clock support - support for Rockchip's PX30 SoC - Amlogic Meson axg PCIe and audio clocks - Amlogic Meson GEN CLK on gxbb, gxl and axg Updates: - remove an unused variable from Exynos4412 ISP driver - fix a thinko bug in SCMI clk division logic - add missing of_node_put()s in some i.MX clk drivers - Tegra SDMMC clk jitter improvements with high speed signaling modes - SPDX tagging for qcom and cs2000-cp drivers - stop leaking con ids in __clk_put() 
- fix a corner case in fixed factor clk probing where node is in DT but parent clk is registered much later - Marvell Armada 3700 clk_pm_cpu_get_parent() had an invalid return value - i.MX clk init arrays removed in place of CLK_IS_CRITICAL - convert to CLK_IS_CRITICAL for i.MX51/53 driver - fix Tegra BPMP driver oops when xlating a NULL clk - proper default configuration for vic03 and vde clks on Tegra124 - mark Tegra memory controller clks as critical - fix array bounds clamp in Tegra's emc determine_rate() op - Ingenic i2s bit update and allow UDC clk to gate - fix name of aspeed SDC clk define to have only one 'CLK' - fix i.MX6QDL video clk parent - critical clk markings for qcom SDM845 - fix Stratix10 mpu_free_clk and sdmmc_free_clk parents - mark Rockchip's pclk_rkpwm_pmu as critical clock, due to it supplying the pwm used to drive the logic supply of the rk3399 core" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (85 commits) clk: rockchip: Add pclk_rkpwm_pmu to PMU critical clocks in rk3399 clk: cs2000-cp: convert to SPDX identifiers clk: scmi: Fix the rounding of clock rate clk: qcom: Add display clock controller driver for SDM845 clk: mvebu: armada-37xx-periph: Remove unused var num_parents clk: samsung: Remove unused mout_user_aclk400_mcuisp_p4x12 variable clk: actions: Add S700 SoC clock support dt-bindings: clock: Add S700 support for Actions Semi Soc's clk: actions: Add missing REGMAP_MMIO dependency clk: uniphier: add clock frequency support for SPI clk: uniphier: add more USB3 PHY clocks clk: uniphier: add NAND 200MHz clock clk: tegra: make sdmmc2 and sdmmc4 as sdmmc clocks clk: tegra: Add sdmmc mux divider clock clk: tegra: Refactor fractional divider calculation clk: tegra: Fix includes required by fence_udelay() clk: imx6sll: fix missing of_node_put() clk: imx6ul: fix missing of_node_put() clk: imx: add ocram_s clock for i.mx6sx clk: mvebu: armada-37xx-periph: Fix wrong return value in get_parent ...