git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-08-02	IB/ipoib: Do not remove child devices from within the ndo_uninit	Jason Gunthorpe
	Switching to priv_destructor and needs_free_netdev created a subtle ordering problem in ipoib_remove_one. Now that unregister_netdev frees the netdev and priv we must ensure that the children are unregistered before trying to unregister the parent, or child unregister will use after free. The solution is to unregister the children, then parent, in the same batch all while holding the rtnl_lock. This closes all the races where a new child could have been added and ensures proper ordering. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	IB/ipoib: Get rid of the sysfs_mutex	Jason Gunthorpe
	This mutex was introduced to deal with the deadlock formed by calling unregister_netdev from within the sysfs callback of a netdev. Now that we have priv_destructor and needs_free_netdev we can switch to the more targeted solution of running the unregister from a work queue. This avoids the deadlock and gets rid of the mutex. The next patch in the series needs this mutex eliminated to create atomicity of unregisteration. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	RDMA/netdev: Use priv_destructor for netdev cleanup	Jason Gunthorpe
	Now that the unregister_netdev flow for IPoIB no longer relies on external code we can now introduce the use of priv_destructor and needs_free_netdev. The rdma_netdev flow is switched to use the netdev common priv_destructor instead of the special free_rdma_netdev and the IPOIB ULP adjusted: - priv_destructor needs to switch to point to the ULP's destructor which will then call the rdma_ndev's in the right order - We need to be careful around the error unwind of register_netdev as it sometimes calls priv_destructor on failure - ULPs need to use ndo_init/uninit to ensure proper ordering of failures around register_netdev Switching to priv_destructor is a necessary pre-requisite to using the rtnl new_link mechanism. The VNIC user for rdma_netdev should also be revised, but that is left for another patch. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Denis Drozdov <denisd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	IB/ipoib: Move init code to ndo_init	Jason Gunthorpe
	Now that we have a proper ndo_uninit, move code that naturally pairs with the ndo_uninit into ndo_init. This allows the netdev core to natually handle ordering. This fixes the situation where register_netdev can fail before calling ndo_init, in which case it wouldn't call ndo_uninit either. Also move a bunch of duplicated init code that is shared between child and parent for clarity. Now the child and parent register functions look very similar. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	IB/ipoib: Move all uninit code into ndo_uninit	Jason Gunthorpe
	Currently uninit is sometimes done twice in error flows, and is sprinkled a bit all over the place. Improve the clarity of the design by moving all uninit only into ndo_uinit. Some duplication is removed: - Sometimes IPOIB_STOP_NEIGH_GC was done before unregister, but this duplicates the process in ipoib_neigh_hash_init - Flushing priv->wq was sometimes done before unregister, but that duplicates what has been done in ndo_uninit Uniniting the IB event queue must remain before unregister_netdev as it requires the RTNL lock to be dropped, this is moved to a helper to make that flow really clear and remove some duplication in error flows. If register_netdev fails (and ndo_init is NULL) then it almost always calls ndo_uninit, which lets us remove all the extra code from the error unwinds. The next patch in the series will close the 'almost always' hole by pairing a proper ndo_init with ndo_uninit. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	IB/ipoib: Use cancel_delayed_work_sync for neigh-clean task	Erez Shitrit
	The neigh_reap_task is self restarting, but so long as we call cancel_delayed_work_sync() it will be guaranteed to not be running and never start again. Thus we don't need to have the racy IPOIB_STOP_NEIGH_GC bit, or the confusing mismatch of places sometimes calling flush_workqueue after the cancel. This fixes a situation where the GC work could have been left running in some rare situations. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	IB/ipoib: Get rid of IPOIB_FLAG_GOING_DOWN	Jason Gunthorpe
	This essentially duplicates the netdev's reg_state, so just use that directly. The reg_state is updated under the rntl_lock, and all places using GOING_DOWN already acquire the rtnl_lock so checking is safe. Since the only place we use GOING_DOWN is for the parent device this does not fix any bugs, but it is a step to tidy up the unregister flow so that after later patches the flow is uniform and sane. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-02	iw_cxgb4: Support FW write completion WR	Potnuri Bharat Teja
	To optimize NVME-oF READ IOPs, use a specialized WQE that combines the RDMA WRITE and SEND_INV WR chain submitted by the NVME-oF target driver. This reduces uP overhead per NVME-oF IO, and results in over 10% improvement in NVME-oF 4K READ IOPs. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	iw_cxgb4: RDMA write with immediate support	Potnuri Bharat Teja
	Adds iw_cxgb4 functionality to support RDMA_WRITE_WITH_IMMEDATE opcode. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	rdma/cxgb4: fix some info leaks	Dan Carpenter
	In c4iw_create_qp() there are several struct members which potentially aren't inintialized like uresp.rq_key. I've fixed this code before in in commit ae1fe07f3f42 ("RDMA/cxgb4: Fix stack info leak in c4iw_create_qp()") so this time I'm just going to take a big hammer approach and memset the whole struct to zero. Hopefully, it will stay fixed this time. In c4iw_create_srq() we don't clear uresp.reserved. Fixes: 6a0b6174d35a ("rdma/cxgb4: Add support for kernel mode SRQ's") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	RDMA/hns: Support flush cqe for hip08 in kernel space	Yixian Liu
	According to IB protocol, there are some cases that work requests must return the flush error completion status through the completion queue. Due to hardware limitation, the driver needs to assist the flush process. This patch adds the support of flush cqe for hip08 in the cases that needed, such as poll cqe, post send, post recv and aeqe handle. The patch also considered the compatibility between kernel and user space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-02	Merge tag 'media/v4.18-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - a deadlock regression at vsp1 driver - some Remote Controller fixes related to the new BPF filter logic added on it for Kernel 4.18. * tag 'media/v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: v4l: vsp1: Fix deadlock in VSPDL DRM pipelines media: rc: read out of bounds if bpf reports high protocol number media: bpf: ensure bpf program is freed on detach media: rc: be less noisy when driver misbehaves
2018-08-02	Merge tag 'arc-4.18-final' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "Another batch of fixes for ARC, this time mainly DMA API rework wreckage: - Fix software managed DMA wreckage after rework in 4.17 [Euginey] * missing cache flush * SMP_CACHE_BYTES vs cache_line_size - Fix allmodconfig build errors [Randy] - Maintainer update for Mellanox (EZChip) NPS platform" * tag 'arc-4.18-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: fix type warnings in arc/mm/cache.c arc: fix build errors in arc/include/asm/delay.h arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c arc: [plat-eznps] fix data type errors in platform headers ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc ARC: add SMP_CACHE_BYTES value validate ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size ARC: dma [non IOC]: fix arc_dma_sync_single_for_(device\|cpu) ARC: Add Ofer Levi as plat-eznps maintainer
2018-08-02	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "3 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails ipc/shm.c add ->pagesize function to shm_vm_ops memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
2018-08-02	media: usb: hackrf: Replace GFP_ATOMIC with GFP_KERNEL	Jia-Ju Bai
	hackrf_submit_urbs(), hackrf_alloc_stream_bufs() and hackrf_alloc_urbs() are never called in atomic context. They call usb_submit_urb(), usb_alloc_coherent() and usb_alloc_urb() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: usb: em28xx: Replace mdelay() with msleep() in em28xx_pre_card_setup()	Jia-Ju Bai
	em28xx_pre_card_setup() is never called in atomic context. It calls mdelay() to busily wait, which is not necessary. mdelay() can be replaced with msleep(). This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: usb: em28xx: Replace GFP_ATOMIC with GFP_KERNEL in em28xx_init_usb_xfer()	Jia-Ju Bai
	em28xx_init_usb_xfer() is never called in atomic context. It calls usb_submit_urb() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: davinci: vpif_display: Mix memory leak on probe error path	Anton Vasilyev
	If vpif_probe() fails on v4l2_device_register() then memory allocated at initialize_vpif() for global vpif_obj.dev[i] become unreleased. The patch adds deallocation of vpif_obj.dev[i] on the error path and removes duplicated check on platform_data presence. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: vimc: Remove redundant free	Anton Vasilyev
	Commit 4a29b7090749 ("[media] vimc: Subdevices as modules") removes vimc allocation from vimc_probe(), so corresponding deallocation on the error path tries to free static memory. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Acked-by: Helen Koike <helen.koike@collabora.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dvb-frontends: rtl2832_sdr: Replace GFP_ATOMIC with GFP_KERNEL	Jia-Ju Bai
	rtl2832_sdr_submit_urbs(), rtl2832_sdr_alloc_stream_bufs(), and rtl2832_sdr_alloc_urbs() are never called in atomic context. They call usb_submit_urb(), usb_alloc_coherent() and usb_alloc_urb() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails	Mike Rapoport
	The fix in commit 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") cleared the vma->vm_userfaultfd_ctx but kept userfaultfd flags in vma->vm_flags that were copied from the parent process VMA. As the result, there is an inconsistency between the values of vma->vm_userfaultfd_ctx.ctx and vma->vm_flags which triggers BUG_ON in userfaultfd_release(). Clearing the uffd flags from vma->vm_flags in case of UFFD_EVENT_FORK failure resolves the issue. Link: http://lkml.kernel.org/r/1532931975-25473-1-git-send-email-rppt@linux.vnet.ibm.com Fixes: 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reported-by: syzbot+121be635a7a35ddb7dcb@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02	ipc/shm.c add ->pagesize function to shm_vm_ops	Jane Chu
	Commit 05ea88608d4e ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") adds a new ->pagesize() function to hugetlb_vm_ops, intended to cover all hugetlbfs backed files. With System V shared memory model, if "huge page" is specified, the "shared memory" is backed by hugetlbfs files, but the mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma, result in below BUG: fs/hugetlbfs/inode.c 443 if (unlikely(page_mapped(page))) { 444 BUG_ON(truncate_op); resulting in hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated ------------[ cut here ]------------ kernel BUG at fs/hugetlbfs/inode.c:444! Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ... CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2 RIP: 0010:remove_inode_hugepages+0x3db/0x3e2 .... Call Trace: hugetlbfs_evict_inode+0x1e/0x3e evict+0xdb/0x1af iput+0x1a2/0x1f7 dentry_unlink_inode+0xc6/0xf0 __dentry_kill+0xd8/0x18d dput+0x1b5/0x1ed __fput+0x18b/0x216 ____fput+0xe/0x10 task_work_run+0x90/0xa7 exit_to_usermode_loop+0xdd/0x116 do_syscall_64+0x187/0x1ae entry_SYSCALL_64_after_hwframe+0x150/0x0 [jane.chu@oracle.com: relocate comment] Link: http://lkml.kernel.org/r/20180731044831.26036-1-jane.chu@oracle.com Link: http://lkml.kernel.org/r/20180727211727.5020-1-jane.chu@oracle.com Fixes: 05ea88608d4e13 ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") Signed-off-by: Jane Chu <jane.chu@oracle.com> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02	memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure	Kirill Tkhai
	In case of memcg_online_kmem() failure, memcg_cgroup::id remains hashed in mem_cgroup_idr even after memcg memory is freed. This leads to leak of ID in mem_cgroup_idr. This patch adds removal into mem_cgroup_css_alloc(), which fixes the problem. For better readability, it adds a generic helper which is used in mem_cgroup_alloc() and mem_cgroup_id_put_many() as well. Link: http://lkml.kernel.org/r/152354470916.22460.14397070748001974638.stgit@localhost.localdomain Fixes 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02	media: omap2: omapfb: fix bugon.cocci warnings	kbuild test robot
	drivers/video/fbdev/omap2/omapfb/dss/dss_features.c:895:2-5: WARNING: Use BUG_ON instead of if condition followed by BUG. Please make sure the condition has no side effects (see conditional BUG_ON definition in include/asm-generic/bug.h) Use BUG_ON instead of a if condition followed by BUG. Semantic patch information: This makes an effort to find cases where BUG() follows an if condition on an expression and replaces the if condition and BUG() with a BUG_ON having the conditional expression of the if statement as argument. Generated by: scripts/coccinelle/misc/bugon.cocci Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: omap2: omapfb: fix boolreturn.cocci warnings	kbuild test robot
	drivers/video/fbdev/omap2/omapfb/omapfb-main.c:290:9-10: WARNING: return of 0/1 in function 'cmp_var_to_colormode' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: omap2: omapfb: fix ifnullfree.cocci warnings	kbuild test robot
	drivers/video/fbdev/omap2/omapfb/dss/core.c:141:2-26: WARNING: NULL check before some freeing functions is not needed. NULL check before some freeing functions is not needed. Based on checkpatch warning "kfree(NULL) is safe this check is probably not required" and kfreeaddr.cocci by Julia Lawall. Generated by: scripts/coccinelle/free/ifnullfree.cocci Signed-off-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dvb-frontends: add Socionext MN88443x ISDB-S/T demodulator driver	Katsuhiro Suzuki
	This patch adds a frontend driver for the Socionext/Panasonic MN884434 and MN884433 ISDB-S/T demodulators. The maximum and minimum frequency of MN88443x comes from ISDB-S and ISDB-T so frequency range is the following: - ISDB-S (BS/CS110 IF frequency, Local freq 10.678GHz) - Min: BS-1: 1032MHz - Max: ND24: 2070MHz - ISDB-T - Min: ch13: 470MHz - Max: ch62: 770MHz Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dm1105: Limit number of cards to avoid buffer over read	Anton Vasilyev
	dm1105_probe() counts number of cards at dm1105_devcount, but missed bounds check before dereference a card array. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: helene: add I2C device probe function	Katsuhiro Suzuki
	This patch adds I2C probe function to use dvb_module_probe() with this driver. And also support multiple delivery systems at the same device. Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dw2102: Fix memleak on sequence of probes	Anton Vasilyev
	Each call to dw2102_probe() allocates memory by kmemdup for structures p1100, s660, p7500 and s421, but there is no their deallocation. dvb_usb_device_init() copies the corresponding structure into dvb_usb_device->props, so there is no use of original structure after dvb_usb_device_init(). The patch moves structures from global scope to local and adds their deallocation. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-03	Merge branch 'bpf-cgroup-local-storage'	Daniel Borkmann
	Roman Gushchin says: ==================== This patchset implements cgroup local storage for bpf programs. The main idea is to provide a fast accessible memory for storing various per-cgroup data, e.g. number of transmitted packets. Cgroup local storage looks as a special type of map for userspace, and is accessible using generic bpf maps API for reading and updating of the data. The (cgroup inode id, attachment type) pair is used as a map key. A user can't create new entries or destroy existing entries; it happens automatically when a user attaches/detaches a bpf program to a cgroup. From a bpf program's point of view, cgroup storage is accessible without lookup using the special get_local_storage() helper function. It takes a map fd as an argument. It always returns a valid pointer to the corresponding memory area. To implement such a lookup-free access a pointer to the cgroup storage is saved for an attachment of a bpf program to a cgroup, if required by the program. Before running the program, it's saved in a special global per-cpu variable, which is accessible from the get_local_storage() helper. This patchset implement only cgroup local storage, however the API is intentionally made extensible to support other local storage types further: e.g. thread local storage, socket local storage, etc. v7->v6: - fixed a use-after-free bug, caused by not clearing prog->aux->cgroup_storage pointer after releasing the map v6->v5: - fixed an error with returning -EINVAL instead of a pointer v5->v4: - fixed an issue in verifier (test that flags == 0 properly) - added a corresponding test - added a note about synchronization, sync docs to tools/uapi/... - switched the cgroup test to use XADD - added a check for attr->max_entries to be 0, and atter->max_flags to be sane - use bpf_uncharge_memlock() in bpf_uncharge_memlock() - rebased to bpf-next v4->v3: - fixed a leak in cgroup attachment code (discovered by Daniel) - cgroup storage map will be released if the corresponding bpf program failed to load by any reason - introduced bpf_uncharge_memlock() helper v3->v2: - fixed more build and sparse issues - rebased to bpf-next v2->v1: - fixed build issues - removed explicit rlimit calls in patch 14 - rebased to bpf-next ==================== Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	samples/bpf: extend test_cgrp2_attach2 test to use cgroup storage	Roman Gushchin
	The test_cgrp2_attach test covers bpf cgroup attachment code well, so let's re-use it for testing allocation/releasing of cgroup storage. The extension is pretty straightforward: the bpf program will use the cgroup storage to save the number of transmitted bytes. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	selftests/bpf: add a cgroup storage test	Roman Gushchin
	Implement a test to cover the cgroup storage functionality. The test implements a bpf program which drops every second packet by using the cgroup storage as a persistent storage. The test also use the userspace API to check the data in the cgroup storage, alter it, and check that the loaded and attached bpf program sees the update. Expected output: $ ./test_cgroup_storage test_cgroup_storage:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	selftests/bpf: add verifier cgroup storage tests	Roman Gushchin
	Add the following verifier tests to cover the cgroup storage functionality: 1) valid access to the cgroup storage 2) invalid access: use regular hashmap instead of cgroup storage map 3) invalid access: use invalid map fd 4) invalid access: try access memory after the cgroup storage 5) invalid access: try access memory before the cgroup storage 6) invalid access: call get_local_storage() with non-zero flags For tests 2)-6) check returned error strings. Expected output: $ ./test_verifier #0/u add+sub+mul OK #0/p add+sub+mul OK #1/u DIV32 by 0, zero check 1 OK ... #280/p valid cgroup storage access OK #281/p invalid cgroup storage access 1 OK #282/p invalid cgroup storage access 2 OK #283/p invalid per-cgroup storage access 3 OK #284/p invalid cgroup storage access 4 OK #285/p invalid cgroup storage access 5 OK ... #649/p pass modified ctx pointer to helper, 2 OK #650/p pass modified ctx pointer to helper, 3 OK Summary: 901 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf/test_run: support cgroup local storage	Roman Gushchin
	Allocate a temporary cgroup storage to use for bpf program test runs. Because the test program is not actually attached to a cgroup, the storage is allocated manually just for the execution of the bpf program. If the program is executed multiple times, the storage is not zeroed on each run, emulating multiple runs of the program, attached to a real cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpftool: add support for CGROUP_STORAGE maps	Roman Gushchin
	Add BPF_MAP_TYPE_CGROUP_STORAGE maps to the list of maps types which bpftool recognizes. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: sync bpf.h to tools/	Roman Gushchin
	Sync cgroup storage related changes: 1) new BPF_MAP_TYPE_CGROUP_STORAGE map type 2) struct bpf_cgroup_sotrage_key definition 3) get_local_storage() helper Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: introduce the bpf_get_local_storage() helper function	Roman Gushchin
	The bpf_get_local_storage() helper function is used to get a pointer to the bpf local storage from a bpf program. It takes a pointer to a storage map and flags as arguments. Right now it accepts only cgroup storage maps, and flags argument has to be 0. Further it can be extended to support other types of local storage: e.g. thread local storage etc. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: don't allow create maps of cgroup local storages	Roman Gushchin
	As there is one-to-one relation between a bpf program and cgroup local storage map, there is no sense in creating a map of cgroup local storage maps. Forbid it explicitly to avoid possible side effects. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf/verifier: introduce BPF_PTR_TO_MAP_VALUE	Roman Gushchin
	BPF_MAP_TYPE_CGROUP_STORAGE maps are special in a way that the access from the bpf program side is lookup-free. That means the result is guaranteed to be a valid pointer to the cgroup storage; no NULL-check is required. This patch introduces BPF_PTR_TO_MAP_VALUE return type, which is required to cause the verifier accept programs, which are not checking the map value pointer for being NULL. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: extend bpf_prog_array to store pointers to the cgroup storage	Roman Gushchin
	This patch converts bpf_prog_array from an array of prog pointers to the array of struct bpf_prog_array_item elements. This allows to save a cgroup storage pointer for each bpf program efficiently attached to a cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: allocate cgroup storage entries on attaching bpf programs	Roman Gushchin
	If a bpf program is using cgroup local storage, allocate a bpf_cgroup_storage structure automatically on attaching the program to a cgroup and save the pointer into the corresponding bpf_prog_list entry. Analogically, release the cgroup local storage on detaching of the bpf program. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: pass a pointer to a cgroup storage using pcpu variable	Roman Gushchin
	This commit introduces the bpf_cgroup_storage_set() helper, which will be used to pass a pointer to a cgroup storage to the bpf helper. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: introduce cgroup storage maps	Roman Gushchin
	This commit introduces BPF_MAP_TYPE_CGROUP_STORAGE maps: a special type of maps which are implementing the cgroup storage. >From the userspace point of view it's almost a generic hash map with the (cgroup inode id, attachment type) pair used as a key. The only difference is that some operations are restricted: 1) a user can't create new entries, 2) a user can't remove existing entries. The lookup from userspace is o(log(n)). Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-03	bpf: add ability to charge bpf maps memory dynamically	Roman Gushchin
	This commits extends existing bpf maps memory charging API to support dynamic charging/uncharging. This is required to account memory used by maps, if all entries are created dynamically after the map initialization. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-02	media: headers: fix linux/mod_devicetable.h inclusions	Arnd Bergmann
	A couple of drivers produced build errors after the mod_devicetable.h header was split out from the platform_device one, e.g. drivers/media/platform/davinci/vpbe_osd.c:42:40: error: array type has incomplete element type 'struct platform_device_id' drivers/media/platform/davinci/vpbe_venc.c:42:40: error: array type has incomplete element type 'struct platform_device_id' This adds the inclusion where needed. Fixes: ac3167257b9f ("headers: separate linux/mod_devicetable.h from linux/platform_device.h") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dvb_frontend: ensure that the step is ok for both FE and tuner	Mauro Carvalho Chehab
	The frequency step should take into account the tuner step, as, if tuner step is bigger than frontend step, the zigzag algorithm won't be doing the right thing, as it will be tuning multiple times at the same frequency. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	media: dvb: represent min/max/step/tolerance freqs in Hz	Mauro Carvalho Chehab
	Right now, satellite frontend drivers specify frequencies in kHz, while terrestrial/cable ones specify in Hz. That's confusing for developers. However, the main problem is that universal frontends capable of handling both satellite and non-satelite delivery systems are appearing. We end by needing to hack the drivers in order to support such hybrid frontends. So, convert everything to specify frontend frequencies in Hz. Tested-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-08-02	net/socket: remove duplicated init code	Matthieu Baerts
	This refactoring work has been started by David Howells in cdfbabfb2f0c (net: Work around lockdep limitation in sockets that use sockets) but the exact same day in 581319c58600 (net/socket: use per af lockdep classes for sk queues), Paolo Abeni added new classes. This reduces the amount of (nearly) duplicated code and eases the addition of new socket types. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02	MAINTAINERS: Replace Heikki as maintainer of Intel pinctrl	Andy Shevchenko
	Heikki has another priorities and no time to maintain Intel pinctrl driver. As we decided off line I'm going to replace him. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>