git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-01-15	net: ethernet: sun4i-emac: Fix an error handling path in emac_probe()	Christophe JAILLET
	A dma_request_chan() call is hidden in emac_configure_dma(). It must be released in the probe if an error occurs, as already done in the remove function. Add the corresponding dma_release_channel() call. Fixes: 47869e82c8b8 ("sun4i-emac.c: add dma support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-15	net: ethernet: mtk_eth_soc: fix error checking in mtk_mac_config()	Tom Rix
	Clang static analysis reports this problem mtk_eth_soc.c:394:7: warning: Branch condition evaluates to a garbage value if (err) ^~~ err is not initialized and only conditionally set. So intitialize err. Fixes: 7e538372694b ("net: ethernet: mediatek: Re-add support SGMII") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-15	net: mscc: ocelot: don't dereference NULL pointers with shared tc filters	Vladimir Oltean
	The following command sequence: tc qdisc del dev swp0 clsact tc qdisc add dev swp0 ingress_block 1 clsact tc qdisc add dev swp1 ingress_block 1 clsact tc filter add block 1 flower action drop tc qdisc del dev swp0 clsact produces the following NPD: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014 pc : vcap_entry_set+0x14/0x70 lr : ocelot_vcap_filter_del+0x198/0x234 Call trace: vcap_entry_set+0x14/0x70 ocelot_vcap_filter_del+0x198/0x234 ocelot_cls_flower_destroy+0x94/0xe4 felix_cls_flower_del+0x70/0x84 dsa_slave_setup_tc_block_cb+0x13c/0x60c dsa_slave_setup_tc_block_cb_ig+0x20/0x30 tc_setup_cb_reoffload+0x44/0x120 fl_reoffload+0x280/0x320 tcf_block_playback_offloads+0x6c/0x184 tcf_block_unbind+0x80/0xe0 tcf_block_setup+0x174/0x214 tcf_block_offload_cmd.isra.0+0x100/0x13c tcf_block_offload_unbind+0x5c/0xa0 __tcf_block_put+0x54/0x174 tcf_block_put_ext+0x5c/0x74 clsact_destroy+0x40/0x60 qdisc_destroy+0x4c/0x150 qdisc_put+0x70/0x90 qdisc_graft+0x3f0/0x4c0 tc_get_qdisc+0x1cc/0x364 rtnetlink_rcv_msg+0x124/0x340 The reason is that the driver isn't prepared to receive two tc filters with the same cookie. It unconditionally creates a new struct ocelot_vcap_filter for each tc filter, and it adds all filters with the same identifier (cookie) to the ocelot_vcap_block. The problem is here, in ocelot_vcap_filter_del(): /* Gets index of the filter / index = ocelot_vcap_block_get_filter_index(block, filter); if (index < 0) return index; / Delete filter / ocelot_vcap_block_remove_filter(ocelot, block, filter); / Move up all the blocks over the deleted filter / for (i = index; i < block->count; i++) { struct ocelot_vcap_filter tmp; tmp = ocelot_vcap_block_find_filter_by_index(block, i); vcap_entry_set(ocelot, i, tmp); } what will happen is ocelot_vcap_block_get_filter_index() will return the index (@index) of the first filter found with that cookie. This is _not_ the index of _this_ filter, but the other one with the same cookie, because ocelot_vcap_filter_equal() gets fooled. Then later, ocelot_vcap_block_remove_filter() is coded to remove all filters that are ocelot_vcap_filter_equal() with the passed @filter. So unexpectedly, both filters get deleted from the list. Then ocelot_vcap_filter_del() will attempt to move all the other filters up, again finding them by index (@i). The block count is 2, @index was 0, so it will attempt to move up filter @i=0 and @i=1. It assigns tmp = ocelot_vcap_block_find_filter_by_index(block, i), which is now a NULL pointer because ocelot_vcap_block_remove_filter() has removed more than one filter. As far as I can see, this problem has been there since the introduction of tc offload support, however I cannot test beyond the blamed commit due to hardware availability. In any case, any fix cannot be backported that far, due to lots of changes to the code base. Therefore, let's go for the correct solution, which is to not call ocelot_vcap_filter_add() and ocelot_vcap_filter_del(), unless the filter is actually unique and not shared. For the shared filters, we should just modify the ingress port mask and call ocelot_vcap_filter_replace(), a function introduced by commit 95706be13b9f ("net: mscc: ocelot: create a function that replaces an existing VCAP filter"). This way, block->rules will only contain filters with unique cookies, by design. Fixes: 07d985eef073 ("net: dsa: felix: Wire up the ocelot cls_flower methods") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-15	Merge branch 'next' into for-linus	Dmitry Torokhov
	Prepare input updates for 5.17 merge window.
2022-01-15	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk\|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...
2022-01-15	bitmap: unify find_bit operations	Yury Norov
	bitmap_for_each_{set,clear}_region() are similar to for_each_bit() macros in include/linux/find.h, but interface and implementation of them are different. This patch adds for_each_bitrange() macros and drops unused bitmap_*_region() API in sake of unification. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Dennis Zhou <dennis@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC
2022-01-15	Replace for_each__bit_from() with for_each__bit() where appropriate	Yury Norov
	A couple of kernel functions call for_each__bit_from() with start bit equal to 0. Replace them with for_each__bit(). No functional changes, but might improve on readability. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15	cpumask: replace cpumask_next_* with cpumask_first_* where appropriate	Yury Norov
	cpumask_first() is a more effective analogue of 'next' version if n == -1 (which means start == 0). This patch replaces 'next' with 'first' where things look trivial. There's no cpumask_first_zero() function, so create it. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15	all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate	Yury Norov
	find_first{,_zero}_bit is a more effective analogue of 'next' version if start == 0. This patch replaces 'next' with 'first' where things look trivial. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2022-01-15	zram: use ATTRIBUTE_GROUPS	Luis Chamberlain
	Embrace ATTRIBUTE_GROUPS to avoid boiler plate code. This should not introduce any functional changes. Link: https://lkml.kernel.org/r/20211028203600.2157356-1-mcgrof@kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: compound devmap support	Joao Martins
	Use the newly added compound devmap facility which maps the assigned dax ranges as compound pages at a page size of @align. dax devices are created with a fixed @align (huge page size) which is enforced through as well at mmap() of the device. Faults, consequently happen too at the specified @align specified at the creation, and those don't change throughout dax device lifetime. MCEs unmap a whole dax huge page, as well as splits occurring at the configured page size. Performance measured by gup_test improves considerably for unpin_user_pages() and altmap with NVDIMMs: $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms [altmap] (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms [altmap with -m 127004] (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms .. as well as unpin_user_page_range_dirty_lock() being just as effective as THP/hugetlb[0] pages. [0] https://lore.kernel.org/linux-mm/20210212130843.13865-5-joao.m.martins@oracle.com/ Link: https://lkml.kernel.org/r/20211202204422.26777-12-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: remove pfn from __dev_dax_{pte,pmd,pud}_fault()	Joao Martins
	After moving the page mapping to be set prior to pte insertion, the pfn in dev_dax_huge_fault() no longer is necessary. Remove it, as well as the @pfn argument passed to the internal fault handler helpers. [akpm@linux-foundation.org: fix CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=n build] Link: https://lkml.kernel.org/r/20211202204422.26777-11-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Suggested-by: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: set mapping prior to vmf_insert_pfn{,_pmd,pud}()	Joao Martins
	Normally, the @page mapping is set prior to inserting the page into a page table entry. Make device-dax adhere to the same ordering, rather than setting mapping after the PTE is inserted. The address_space never changes and it is always associated with the same inode and underlying pages. So, the page mapping is set once but cleared when the struct pages are removed/freed (i.e. after {devm_}memunmap_pages()). Link: https://lkml.kernel.org/r/20211202204422.26777-10-joao.m.martins@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: factor out page mapping initialization	Joao Martins
	Move initialization of page->mapping into a separate helper. This is in preparation to move the mapping set to be prior to inserting the page table entry and also for tidying up compound page handling into one helper. Link: https://lkml.kernel.org/r/20211202204422.26777-9-joao.m.martins@oracle.com Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: ensure dev_dax->pgmap is valid for dynamic devices	Joao Martins
	Right now, only static dax regions have a valid @pgmap pointer in its struct dev_dax. Dynamic dax case however, do not. In preparation for device-dax compound devmap support, make sure that dev_dax pgmap field is set after it has been allocated and initialized. dynamic dax device have the @pgmap is allocated at probe() and it's managed by devm (contrast to static dax region which a pgmap is provided and dax core kfrees it). So in addition to ensure a valid @pgmap, clear the pgmap when the dynamic dax device is released to avoid the same pgmap ranges to be re-requested across multiple region device reconfigs. Add a static_dev_dax() and use that helper in dev_dax_probe() to ensure the initialization differences between dynamic and static regions are more explicit. While at it, consolidate the ranges initialization when we allocate the @pgmap for the dynamic dax region case. Also take the opportunity to document the differences between static and dynamic da regions. Link: https://lkml.kernel.org/r/20211202204422.26777-8-joao.m.martins@oracle.com Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: use struct_size()	Joao Martins
	Use the struct_size() helper for the size of a struct with variable array member at the end, rather than manually calculating it. Link: https://lkml.kernel.org/r/20211202204422.26777-7-joao.m.martins@oracle.com Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	device-dax: use ALIGN() for determining pgoff	Joao Martins
	Rather than calculating @pgoff manually, switch to ALIGN() instead. Link: https://lkml.kernel.org/r/20211202204422.26777-6-joao.m.martins@oracle.com Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	mm: kmemleak: alloc gray object for reserved region with direct map	Calvin Zhang
	Reserved regions with direct mapping may contain references to other regions. CMA region with fixed location is reserved without creating kmemleak_object for it. So add them as gray kmemleak objects. Link: https://lkml.kernel.org/r/20211123090641.3654006-1-calvinzhang.cool@gmail.com Signed-off-by: Calvin Zhang <calvinzhang.cool@gmail.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15	RDMA/siw: make use of the helper function kthread_run_on_cpu()	Cai Huoqing
	Replace kthread_create/kthread_bind/wake_up_process() with kthread_run_on_cpu() to simplify the code. Link: https://lkml.kernel.org/r/20211022025711.3673-3-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Cc: Bernard Metzler <bmt@zurich.ibm.com> Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Doug Ledford <dledford@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-14	vdpa/mlx5: Fix tracking of current number of VQs	Eli Cohen
	Modify the code such that ndev->cur_num_vqs better reflects the actual number of data virtqueues. The value can be accurately realized after features have been negotiated. This is to prevent possible failures when modifying the RQT object if the cur_num_vqs bears invalid value. No issue was actually encountered but this also makes the code more readable. Fixes: c5a5cd3d3217 ("vdpa/mlx5: Support configuring max data virtqueue") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/mlx5: Fix is_index_valid() to refer to features	Eli Cohen
	Make sure the decision whether an index received through a callback is valid or not consults the negotiated features. The motivation for this was due to a case encountered where I shut down the VM. After the reset operation was called features were already clear, I got get_vq_state() call which caused out array bounds access since is_index_valid() reported the index value. So this is more of not hit a bug since the call shouldn't have been made first place. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa: Protect vdpa reset with cf_mutex	Eli Cohen
	Call reset using the wrapper function vdpa_reset() to make sure the operation is serialized with cf_mutex. This comes to protect from the following possible scenario: vhost_vdpa_set_status() could call the reset op. Since the call is not protected by cf_mutex, a netlink thread calling vdpa_dev_config_fill could get passed the VIRTIO_CONFIG_S_FEATURES_OK check in vdpa_dev_config_fill() and end up reporting wrong features. Fixes: 5f6e85953d8f ("vdpa: Read device configuration only if FEATURES_OK") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-3-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa: Avoid taking cf_mutex lock on get status	Eli Cohen
	Avoid the wrapper holding cf_mutex since it is not protecting anything. To avoid confusion and unnecessary overhead incurred by it, remove. Fixes: f489f27bc0ab ("vdpa: Sync calls set/get config/status with cf_mutex") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/vdpa_sim_net: Report max device capabilities	Eli Cohen
	Configure max supported virtqueues features on the management device. This info can be retrieved using: $ vdpa mgmtdev show vdpasim_net: supported_classes net max_supported_vqs 2 dev_features MAC ANY_LAYOUT VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-15-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa: Use BIT_ULL for bit operations	Eli Cohen
	All masks in this file are 64 bits. Change BIT to BIT_ULL. Other occurences use (1 << val) which yields a 32 bit value. Change them to use BIT_ULL too. Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-14-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/vdpa_sim: Configure max supported virtqueues	Eli Cohen
	Configure max supported virtqueues on the management device. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-13-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/mlx5: Report max device capabilities	Eli Cohen
	Configure max supported virtqueues and features on the management device. This info can be retrieved using: $ vdpa mgmtdev show auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-12-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
2022-01-14	vdpa: Support reporting max device capabilities	Eli Cohen
	Add max_supported_vqs and supported_features fields to struct vdpa_mgmt_dev. Upstream drivers need to feel these values according to the device capabilities. These values are reported back in a netlink message when showing management devices. Examples: $ auxiliary/mlx5_core.sf.1: supported_classes net max_supported_vqs 257 dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ MQ \ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j mgmtdev show {"mgmtdev":{"auxiliary/mlx5_core.sf.1":{"supported_classes":["net"], \ "max_supported_vqs":257,"dev_features":["CSUM","GUEST_CSUM","MTU", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR", \ "VERSION_1","ACCESS_PLATFORM"]}}} $ vdpa -jp mgmtdev show { "mgmtdev": { "auxiliary/mlx5_core.sf.1": { "supported_classes": [ "net" ], "max_supported_vqs": 257, "dev_features": ["CSUM","GUEST_CSUM","MTU","HOST_TSO4", \ "HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM"] } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-11-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com>
2022-01-14	vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()	Eli Cohen
	Restore ndev->cur_num_vqs to the original value in case change_num_qps() fails. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-10-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: Add support for returning device configuration information	Eli Cohen
	Add netlink attribute to store the negotiated features. This can be used by userspace to get the current state of the vdpa instance. Examples: $ vdpa dev config show vdpa-a vdpa-a: mac 00:00:00:00:88:88 link up link_announce false max_vq_pairs 16 mtu 1500 negotiated_features CSUM GUEST_CSUM MTU MAC HOST_TSO4 HOST_TSO6 STATUS \ CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM $ vdpa -j dev config show vdpa-a {"config":{"vdpa-a":{"mac":"00:00:00:00:88:88","link ":"up","link_announce":false, \ "max_vq_pairs":16,"mtu":1500,"negotiated_features":["CSUM","GUEST_CSUM","MTU","MAC", \ "HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ","CTRL_MAC_ADDR","VERSION_1", \ "ACCESS_PLATFORM"]}}} $ vdpa -jp dev config show vdpa-a { "config": { "vdpa-a": { "mac": "00:00:00:00:88:88", "link ": "up", "link_announce ": false, "max_vq_pairs": 16, "mtu": 1500, "negotiated_features": [ "CSUM","GUEST_CSUM","MTU","MAC","HOST_TSO4","HOST_TSO6","STATUS","CTRL_VQ","MQ", \ "CTRL_MAC_ADDR","VERSION_1","ACCESS_PLATFORM" ] } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-9-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/mlx5: Support configuring max data virtqueue	Eli Cohen
	Check whether the max number of data virtqueue pairs was provided when a adding a new device and verify the new value does not exceed device capabilities. In addition, change the arrays holding virtqueue and callback contexts to be dynamically allocated. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-8-elic@nvidia.com Includes fixup: vdpa/mlx5: fix error handling in mlx5_vdpa_dev_add() Clang build fails with mlx5_vnet.c:2574:6: error: variable 'mvdev' is used uninitialized whenever 'if' condition is true if (!ndev->vqs \|\| !ndev->event_cbs) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ mlx5_vnet.c:2660:14: note: uninitialized use occurs here put_device(&mvdev->vdev.dev); ^~~~~ This because mvdev is set after trying to allocate ndev->vqs,event_cbs. So move the allocation to after mvdev is set but before the arrays are used in init_mvqs() Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20220107211352.3940570-1-trix@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Includes fixup: vdpa/mlx5: fix endian-ness for max vqs sparse warnings: (new ones prefixed by >>) >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast to restricted __le16 >> drivers/vdpa/mlx5/net/mlx5_vnet.c:1247:23: sparse: sparse: cast from restricted __virtio16 > 1247 num = le16_to_cpu(ndev->config.max_virtqueue_pairs); Address this using the appropriate wrapper. Cc: "Eli Cohen" <elic@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>
2022-01-14	vdpa/mlx5: Fix config_attr_mask assignment	Eli Cohen
	Fix VDPA_ATTR_DEV_NET_CFG_MACADDR assignment to be explicit 64 bit assignment. No issue was seen since the value is well below 64 bit max value. Nevertheless it needs to be fixed. Fixes: a007d940040c ("vdpa/mlx5: Support configuration of MAC") Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-7-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: Allow to configure max data virtqueues	Eli Cohen
	Add netlink support to configure the max virtqueue pairs for a device. At least one pair is required. The maximum is dictated by the device. Example: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 max_vqp 4 Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: Read device configuration only if FEATURES_OK	Eli Cohen
	Avoid reading device configuration during feature negotiation. Read device status and verify that VIRTIO_CONFIG_S_FEATURES_OK is set. Protect the entire operation, including configuration read with cf_mutex to ensure integrity of the results. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa: Sync calls set/get config/status with cf_mutex	Eli Cohen
	Add wrappers to get/set status and protect these operations with cf_mutex to serialize these operations with respect to get/set config operations. Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa/mlx5: Distribute RX virtqueues in RQT object	Eli Cohen
	Distribute the available rx virtqueues amongst the available RQT entries. RQTs require to have a power of two entries. When creating or modifying the RQT, use the lowest number of power of two entries that is not less than the number of rx virtqueues. Distribute them in the available entries such that some virtqueus may be referenced twice. This allows to configure any number of virtqueue pairs when multiqueue is used. Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-3-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: Provide interface to read driver features	Eli Cohen
	Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: clean up get_config_size ret value handling	Laura Abbott
	The return type of get_config_size is size_t so it makes sense to change the type of the variable holding its result. That said, this already got taken care of (differently, and arguably not as well) by commit 3ed21c1451a1 ("vdpa: check that offsets are within bounds"). The added 'c->off > size' test in that commit will be done as an unsigned comparison on 32-bit (safe due to not being signed). On a 64-bit platform, it will be done as a signed comparison, but in that case the comparison will be done in 64-bit, and 'c->off' being an u32 it will be valid thanks to the extended range (ie both values will be positive in 64 bits). So this was a real bug, but it was already addressed and marked for stable. Signed-off-by: Laura Abbott <labbott@kernel.org> Reported-by: Luo Likang <luolikang@nsfocus.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	virtio_ring: mark ring unused on error	Michael S. Tsirkin
	A recently added error path does not mark ring unused when exiting on OOM, which will lead to BUG on the next entry in debug builds. TODO: refactor code so we have START_USE and END_USE in the same function. Fixes: fc6d70f40b3d ("virtio_ring: check desc == NULL when using indirect with packed") Cc: "Xuan Zhuo" <xuanzhuo@linux.alibaba.com> Cc: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vhost/test: fix memory leak of vhost virtqueues	Xianting Tian
	We need free the vqs in .release(), which are allocated in .open(). Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Link: https://lore.kernel.org/r/20211228030924.3468439-1-xianting.tian@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	vdpa/mlx5: Fix wrong configuration of virtio_version_1_0	Eli Cohen
	Remove overriding of virtio_version_1_0 which forced the virtqueue object to version 1. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20211230142024.142979-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-01-14	virtio/virtio_pci_legacy_dev: ensure the correct return value	Peng Hao
	When pci_iomap return NULL, the return value is zero. Signed-off-by: Peng Hao <flyingpeng@tencent.com> Link: https://lore.kernel.org/r/20211222112014.87394-1-flyingpeng@tencent.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-01-14	virtio/virtio_mem: handle a possible NULL as a memcpy parameter	Peng Hao
	There is a check for vm->sbm.sb_states before, and it should check it here as well. Signed-off-by: Peng Hao <flyingpeng@tencent.com> Link: https://lore.kernel.org/r/20211222011225.40573-1-flyingpeng@tencent.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug") Cc: stable@vger.kernel.org # v5.8+
2022-01-14	virtio: fix a typo in function "vp_modern_remove" comments.	Dapeng Mi
	Function name "vp_modern_remove" in comments is written to "vp_modern_probe" incorrectly. Change it. Signed-off-by: Dapeng Mi <dapeng1.mi@intel.com> Link: https://lore.kernel.org/r/20211210073546.700783-1-dapeng1.mi@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-01-14	virtio-pci: fix the confusing error message	王贇
	The error message on the failure of pfn check should tell virtio-pci rather than virtio-mmio, just fix it. Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/ae5e154e-ac59-f0fa-a7c7-091a2201f581@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	firmware: qemu_fw_cfg: remove sysfs entries explicitly	Johan Hovold
	Explicitly remove the file entries from sysfs before dropping the final reference for symmetry reasons and for consistency with the rest of the driver. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-5-johan@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	firmware: qemu_fw_cfg: fix sysfs information leak	Johan Hovold
	Make sure to always NUL-terminate file names retrieved from the firmware to avoid accessing data beyond the entry slab buffer and exposing it through sysfs in case the firmware data is corrupt. Fixes: 75f3e8e47f38 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: stable@vger.kernel.org # 4.6 Cc: Gabriel Somlo <somlo@cmu.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-4-johan@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	firmware: qemu_fw_cfg: fix kobject leak in probe error path	Johan Hovold
	An initialised kobject must be freed using kobject_put() to avoid leaking associated resources (e.g. the object name). Commit fe3c60684377 ("firmware: Fix a reference count leak.") "fixed" the leak in the first error path of the file registration helper but left the second one unchanged. This "fix" would however result in a NULL pointer dereference due to the release function also removing the never added entry from the fw_cfg_entry_cache list. This has now been addressed. Fix the remaining kobject leak by restoring the common error path and adding the missing kobject_put(). Fixes: 75f3e8e47f38 ("firmware: introduce sysfs driver for QEMU's fw_cfg device") Cc: stable@vger.kernel.org # 4.6 Cc: Gabriel Somlo <somlo@cmu.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-3-johan@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entries	Johan Hovold
	Commit fe3c60684377 ("firmware: Fix a reference count leak.") "fixed" a kobject leak in the file registration helper by properly calling kobject_put() for the entry in case registration of the object fails (e.g. due to a name collision). This would however result in a NULL pointer dereference when the release function tries to remove the never added entry from the fw_cfg_entry_cache list. Fix this by moving the list-removal out of the release function. Note that the offending commit was one of the benign looking umn.edu fixes which was reviewed but not reverted. [1][2] [1] https://lore.kernel.org/r/202105051005.49BFABCE@keescook [2] https://lore.kernel.org/all/YIg7ZOZvS3a8LjSv@kroah.com Fixes: fe3c60684377 ("firmware: Fix a reference count leak.") Cc: stable@vger.kernel.org # 5.8 Cc: Qiushi Wu <wu000273@umn.edu> Cc: Kees Cook <keescook@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211201132528.30025-2-johan@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-14	vdpa: Avoid duplicate call to vp_vdpa get_status	Eugenio Pérez
	It has no sense to call get_status twice, since we already have a variable for that. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20211104195833.2089796-1-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>