summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-04-23netfs: Add tracepointsDavid Howells
Add three tracepoints to track the activity of the read helpers: (1) netfs/netfs_read This logs entry to the read helpers and also expansion of the range in a readahead request. (2) netfs/netfs_rreq This logs the progress of netfs_read_request objects which track read requests. A read request may be a compound of multiple subrequests. (3) netfs/netfs_sreq This logs the progress of netfs_read_subrequest objects, which track the contributions from various sources to a read request. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/161118138060.1232039.5353374588021776217.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161033468.2537118.14021843889844001905.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340395843.1303470.7355519662919639648.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539538693.286939.10171713520419106334.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653796447.2770958.1870655382450862155.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789078003.6155.17814844411672989942.stgit@warthog.procyon.org.uk/ # v6
2021-04-23netfs: Provide readahead and readpage netfs helpersDavid Howells
Add a pair of helper functions: (*) netfs_readahead() (*) netfs_readpage() to do the work of handling a readahead or a readpage, where the page(s) that form part of the request may be split between the local cache, the server or just require clearing, and may be single pages and transparent huge pages. This is all handled within the helper. Note that while both will read from the cache if there is data present, only netfs_readahead() will expand the request beyond what it was asked to do, and only netfs_readahead() will write back to the cache. netfs_readpage(), on the other hand, is synchronous and only fetches the page (which might be a THP) it is asked for. The netfs gives the helper parameters from the VM, the cache cookie it wants to use (or NULL) and a table of operations (only one of which is mandatory): (*) expand_readahead() [optional] Called to allow the netfs to request an expansion of a readahead request to meet its own alignment requirements. This is done by changing rreq->start and rreq->len. (*) clamp_length() [optional] Called to allow the netfs to cut down a subrequest to meet its own boundary requirements. If it does this, the helper will generate additional subrequests until the full request is satisfied. (*) is_still_valid() [optional] Called to find out if the data just read from the cache has been invalidated and must be reread from the server. (*) issue_op() [required] Called to ask the netfs to issue a read to the server. The subrequest describes the read. The read request holds information about the file being accessed. The netfs can cache information in rreq->netfs_priv. Upon completion, the netfs should set the error, transferred and can also set FSCACHE_SREQ_CLEAR_TAIL and then call fscache_subreq_terminated(). (*) done() [optional] Called after the pages have been unlocked. The read request is still pinning the file and mapping and may still be pinning pages with PG_fscache. rreq->error indicates any error that has been accumulated. (*) cleanup() [optional] Called when the helper is disposing of a finished read request. This allows the netfs to clear rreq->netfs_priv. Netfs support is enabled with CONFIG_NETFS_SUPPORT=y. It will be built even if CONFIG_FSCACHE=n and in this case much of it should be optimised away, allowing the filesystem to use it even when caching is disabled. Changes: v5: - Comment why netfs_readahead() is putting pages[2]. - Use page_file_mapping() rather than page->mapping[2]. - Use page_index() rather than page->index[2]. - Use set_page_fscache()[3] rather then SetPageFsCache() as this takes an appropriate ref too[4]. v4: - Folded in a kerneldoc comment fix. - Folded in a fix for the error handling in the case that ENOMEM occurs. - Added flag to netfs_subreq_terminated() to indicate that the caller may have been running async and stuff that might sleep needs punting to a workqueue (can't use in_softirq()[1]). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20210216084230.GA23669@lst.de/ [1] Link: https://lore.kernel.org/r/20210321014202.GF3420@casper.infradead.org/ [2] Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk/ [3] Link: https://lore.kernel.org/r/CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com/ [4] Link: https://lore.kernel.org/r/160588497406.3465195.18003475695899726222.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118136849.1232039.8923686136144228724.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161032290.2537118.13400578415247339173.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340394873.1303470.6237319335883242536.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539537375.286939.16642940088716990995.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653795430.2770958.4947584573720000554.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789076581.6155.6745849361504760209.stgit@warthog.procyon.org.uk/ # v6
2021-04-23netfs, mm: Add set/end/wait_on_page_fscache() aliasesDavid Howells
Add set/end/wait_on_page_fscache() as aliases of set/end/wait_page_private_2(). These allow a page to marked with PG_fscache, the flag to be removed and waiters woken and waiting for the flag to be cleared. A ref on the page is also taken and dropped. [Linus suggested putting the fscache-themed functions into the caching-specific headers rather than pagemap.h[1]] Changes: v5: - Mirror the changes to the core routines[2]. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Linus Torvalds <torvalds@linux-foundation.org> cc: Matthew Wilcox <willy@infradead.org> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/1330473.1612974547@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/CAHk-=wjgA-74ddehziVk=XAEMTKswPu1Yw4uaro1R3ibs27ztw@mail.gmail.com/ [1] Link: https://lore.kernel.org/r/161340393568.1303470.4997526899111310530.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539536093.286939.5076448803512118764.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/2499407.1616505440@warthog.procyon.org.uk/ [2] Link: https://lore.kernel.org/r/161653793873.2770958.12157243390965814502.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789075327.6155.7432127924219092385.stgit@warthog.procyon.org.uk/ # v6
2021-04-23netfs, mm: Move PG_fscache helper funcs to linux/netfs.hDavid Howells
Move the PG_fscache related helper funcs (such as SetPageFsCache()) to linux/netfs.h rather than linux/fscache.h as the intention is to move to a model where they're used by the network filesystem and the helper library, but not by fscache/cachefiles itself. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/161340392347.1303470.18065131603507621762.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539534516.286939.6265142985563005000.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653792959.2770958.5386546945273988117.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789073997.6155.18442271115255650614.stgit@warthog.procyon.org.uk/ # v6
2021-04-23mm: Implement readahead_control pageset expansionDavid Howells
Provide a function, readahead_expand(), that expands the set of pages specified by a readahead_control object to encompass a revised area with a proposed size and length. The proposed area must include all of the old area and may be expanded yet more by this function so that the edges align on (transparent huge) page boundaries as allocated. The expansion will be cut short if a page already exists in either of the areas being expanded into. Note that any expansion made in such a case is not rolled back. This will be used by fscache so that reads can be expanded to cache granule boundaries, thereby allowing whole granules to be stored in the cache, but there are other potential users also. Changes: v6: - Fold in a patch from Matthew Wilcox to tell the ondemand readahead algorithm about the expansion so that the next readahead starts at the right place[2]. v4: - Moved the declaration of readahead_expand() to a better place[1]. Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Christoph Hellwig <hch@lst.de> cc: Mike Marshall <hubcap@omnibond.com> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20210217161358.GM2858050@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/20210407201857.3582797-4-willy@infradead.org/ [2] Link: https://lore.kernel.org/r/159974633888.2094769.8326206446358128373.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588479816.3465195.553952688795241765.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118131787.1232039.4863969952441067985.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161028670.2537118.13831420617039766044.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340389201.1303470.14353807284546854878.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539530488.286939.18085961677838089157.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653789422.2770958.2108046612147345000.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789069829.6155.4295672417565512161.stgit@warthog.procyon.org.uk/ # v6
2021-04-23fs: Document file_ra_stateMatthew Wilcox (Oracle)
Turn the comments into kernel-doc and improve the wording slightly. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> Link: https://lore.kernel.org/r/20210407201857.3582797-3-willy@infradead.org/ Link: https://lore.kernel.org/r/161789068619.6155.1397999970593531574.stgit@warthog.procyon.org.uk/ # v6
2021-04-23mm/filemap: Pass the file_ra_state in the ractlMatthew Wilcox (Oracle)
For readahead_expand(), we need to modify the file ra_state, so pass it down by adding it to the ractl. We have to do this because it's not always the same as f_ra in the struct file that is already being passed. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> Link: https://lore.kernel.org/r/20210407201857.3582797-2-willy@infradead.org/ Link: https://lore.kernel.org/r/161789067431.6155.8063840447229665720.stgit@warthog.procyon.org.uk/ # v6
2021-04-23mm: Add set/end/wait functions for PG_private_2David Howells
Add three functions to manipulate PG_private_2: (*) set_page_private_2() - Set the flag and take an appropriate reference on the flagged page. (*) end_page_private_2() - Clear the flag, drop the reference and wake up any waiters, somewhat analogously with end_page_writeback(). (*) wait_on_page_private_2() - Wait for the flag to be cleared. Wrappers will need to be placed in the netfs lib header in the patch that adds that. [This implements a suggestion by Linus[1] to not mix the terminology of PG_private_2 and PG_fscache in the mm core function] Changes: v7: - Use compound_head() in all the functions to make them THP safe[6]. v5: - Add set and end functions, calling the end function end rather than unlock[3]. - Keep a ref on the page when PG_private_2 is set[4][5]. v4: - Remove extern from the declaration[2]. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Christoph Hellwig <hch@lst.de> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/1330473.1612974547@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/CAHk-=wjgA-74ddehziVk=XAEMTKswPu1Yw4uaro1R3ibs27ztw@mail.gmail.com/ [1] Link: https://lore.kernel.org/r/20210216102659.GA27714@lst.de/ [2] Link: https://lore.kernel.org/r/161340387944.1303470.7944159520278177652.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539528910.286939.1252328699383291173.stgit@warthog.procyon.org.uk # v4 Link: https://lore.kernel.org/r/20210321105309.GG3420@casper.infradead.org [3] Link: https://lore.kernel.org/r/CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com/ [4] Link: https://lore.kernel.org/r/CAHk-=wjSGsRj7xwhSMQ6dAQiz53xA39pOG+XA_WeTgwBBu4uqg@mail.gmail.com/ [5] Link: https://lore.kernel.org/r/20210408145057.GN2531743@casper.infradead.org/ [6] Link: https://lore.kernel.org/r/161653788200.2770958.9517755716374927208.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789066013.6155.9816857201817288382.stgit@warthog.procyon.org.uk/ # v6
2021-04-23iov_iter: Add ITER_XARRAYDavid Howells
Add an iterator, ITER_XARRAY, that walks through a set of pages attached to an xarray, starting at a given page and offset and walking for the specified amount of bytes. The iterator supports transparent huge pages. The iterate_xarray() macro calls the helper function with rcu_access() helped. I think that this is only a problem for iov_iter_for_each_range() - and that returns an error for ITER_XARRAY (also, this function does not appear to be called). The caller must guarantee that the pages are all present and they must be locked using PG_locked, PG_writeback or PG_fscache to prevent them from going away or being migrated whilst they're being accessed. This is useful for copying data from socket buffers to inodes in network filesystems and for transferring data between those inodes and the cache using direct I/O. Whilst it is true that ITER_BVEC could be used instead, that would require a bio_vec array to be allocated to refer to all the pages - which should be redundant if inode->i_pages also points to all these pages. Note that older versions of this patch implemented an ITER_MAPPING instead, which was almost the same. Changes: v7: - Rename iter_xarray_copy_pages() to iter_xarray_populate_pages()[1]. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Alexander Viro <viro@zeniv.linux.org.uk> cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Christoph Hellwig <hch@lst.de> cc: linux-mm@kvack.org cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/3577430.1579705075@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/158861205740.340223.16592990225607814022.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/159465785214.1376674.6062549291411362531.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/160588477334.3465195.3608963255682568730.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161118129703.1232039.17141248432017826976.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161026313.2537118.14676007075365418649.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340386671.1303470.10752208972482479840.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539527815.286939.14607323792547049341.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653786033.2770958.14154191921867463240.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789064740.6155.11932541175173658065.stgit@warthog.procyon.org.uk/ # v6 Link: https://lore.kernel.org/r/27c369a8f42bb8a617672b2dc0126a5c6df5a050.camel@kernel.org [1]
2021-04-23xen: Remove support for PV ACPI cpu/memory hotplugBoris Ostrovsky
Commit 76fc253723ad ("xen/acpi-stub: Disable it b/c the acpi_processor_add is no longer called.") declared as BROKEN support for Xen ACPI stub (which is required for xen-acpi-{cpu|memory}-hotplug) and suggested that this is temporary and will be soon fixed. This was in March 2013. Further, commit cfafae940381 ("xen: rename dom0_op to platform_op") renamed an interface used by memory hotplug code without updating that code (as it was BROKEN and therefore not compiled). This was in November 2015 and has gone unnoticed for over 5 year. It is now clear that this code is of no interest to anyone and therefore should be removed. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/1618336344-3162-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-04-23mmc: mmc_spi: Make of_mmc_spi.c resource provider agnosticAndy Shevchenko
In order to use the same driver on non-OF platforms, make of_mmc_spi.c resource provider agnostic. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210419112459.25241-6-andriy.shevchenko@linux.intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-04-23mmc: core: Convert mmc_of_parse_voltage() to use device property APIAndy Shevchenko
mmc_of_parse() for a few years has been using device property API. Convert mmc_of_parse_voltage() as well. At the same time switch users to new API. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210419112459.25241-2-andriy.shevchenko@linux.intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-04-23signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architecturesMarco Elver
The alignment of a structure is that of its largest member. On architectures like 32-bit Arm (but not e.g. 32-bit x86) 64-bit integers will require 64-bit alignment and not its natural word size. This means that there is no portable way to add 64-bit integers to siginfo_t on 32-bit architectures without breaking the ABI, because siginfo_t does not yet (and therefore likely never will) contain 64-bit fields on 32-bit architectures. Adding a 64-bit integer could change the alignment of the union after the 3 initial int si_signo, si_errno, si_code, thus introducing 4 bytes of padding shifting the entire union, which would break the ABI. One alternative would be to use the __packed attribute, however, it is non-standard C. Given siginfo_t has definitions outside the Linux kernel in various standard libraries that can be compiled with any number of different compilers (not just those we rely on), using non-standard attributes on siginfo_t should be avoided to ensure portability. In the case of the si_perf field, word size is sufficient since there is no exact requirement on size, given the data it contains is user-defined via perf_event_attr::sig_data. On 32-bit architectures, any excess bits of perf_event_attr::sig_data will therefore be truncated when copying into si_perf. Since si_perf is intended to disambiguate events (e.g. encoding relevant information if there are more events of the same type), 32 bits should provide enough entropy to do so on 32-bit architectures. For 64-bit architectures, no change is intended. Fixes: fb6cc127e0b6 ("signal: Introduce TRAP_PERF si_code and si_perf to siginfo") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Reported-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lkml.kernel.org/r/20210422191823.79012-1-elver@google.com
2021-04-22net: stmmac: Add HW descriptor prefetch setting for DWMAC Core 5.20 onwardsMohammad Athari Bin Ismail
DWMAC Core 5.20 onwards supports HW descriptor prefetching. Additionally, it also depends on platform specific RTL configuration. This capability could be enabled by setting DMA_Mode bit-19 (DCHE). So, to enable this cability, platform must set plat->dma_cfg->dche = true and the DWMAC core version must be 5.20 onwards. Else, this capability wouldn`t be configured Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22landlock: Enable user space to infer supported featuresMickaël Salaün
Add a new flag LANDLOCK_CREATE_RULESET_VERSION to landlock_create_ruleset(2). This enables to retreive a Landlock ABI version that is useful to efficiently follow a best-effort security approach. Indeed, it would be a missed opportunity to abort the whole sandbox building, because some features are unavailable, instead of protecting users as much as possible with the subset of features provided by the running kernel. This new flag enables user space to identify the minimum set of Landlock features supported by the running kernel without relying on a filesystem interface (e.g. /proc/version, which might be inaccessible) nor testing multiple syscall argument combinations (i.e. syscall bisection). New Landlock features will be documented and tied to a minimum version number (greater than 1). The current version will be incremented for each new kernel release supporting new Landlock features. User space libraries can leverage this information to seamlessly restrict processes as much as possible while being compatible with newer APIs. This is a much more lighter approach than the previous landlock_get_features(2): the complexity is pushed to user space libraries. This flag meets similar needs as securityfs versions: selinux/policyvers, apparmor/features/*/version* and tomoyo/version. Supporting this flag now will be convenient for backward compatibility. Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-14-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Add syscall implementationsMickaël Salaün
These 3 system calls are designed to be used by unprivileged processes to sandbox themselves: * landlock_create_ruleset(2): Creates a ruleset and returns its file descriptor. * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a ruleset, identified by the dedicated file descriptor. * landlock_restrict_self(2): Enforces a ruleset on the calling thread and its future children (similar to seccomp). This syscall has the same usage restrictions as seccomp(2): the caller must have the no_new_privs attribute set or have CAP_SYS_ADMIN in the current user namespace. All these syscalls have a "flags" argument (not currently used) to enable extensibility. Here are the motivations for these new syscalls: * A sandboxed process may not have access to file systems, including /dev, /sys or /proc, but it should still be able to add more restrictions to itself. * Neither prctl(2) nor seccomp(2) (which was used in a previous version) fit well with the current definition of a Landlock security policy. All passed structs (attributes) are checked at build time to ensure that they don't contain holes and that they are aligned the same way for each architecture. See the user and kernel documentation for more details (provided by a following commit): * Documentation/userspace-api/landlock.rst * Documentation/security/landlock.rst Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Link: https://lore.kernel.org/r/20210422154123.13086-9-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22arch: Wire up Landlock syscallsMickaël Salaün
Wire up the following system calls for all architectures: * landlock_create_ruleset(2) * landlock_add_rule(2) * landlock_restrict_self(2) Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22fs,security: Add sb_delete hookMickaël Salaün
The sb_delete security hook is called when shutting down a superblock, which may be useful to release kernel objects tied to the superblock's lifetime (e.g. inodes). This new hook is needed by Landlock to release (ephemerally) tagged struct inodes. This comes from the unprivileged nature of Landlock described in the next commit. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: James Morris <jmorris@namei.org> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-7-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22landlock: Support filesystem access-controlMickaël Salaün
Using Landlock objects and ruleset, it is possible to tag inodes according to a process's domain. To enable an unprivileged process to express a file hierarchy, it first needs to open a directory (or a file) and pass this file descriptor to the kernel through landlock_add_rule(2). When checking if a file access request is allowed, we walk from the requested dentry to the real root, following the different mount layers. The access to each "tagged" inodes are collected according to their rule layer level, and ANDed to create access to the requested file hierarchy. This makes possible to identify a lot of files without tagging every inodes nor modifying the filesystem, while still following the view and understanding the user has from the filesystem. Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not keep the same struct inodes for the same inodes whereas these inodes are in use. This commit adds a minimal set of supported filesystem access-control which doesn't enable to restrict all file-related actions. This is the result of multiple discussions to minimize the code of Landlock to ease review. Thanks to the Landlock design, extending this access-control without breaking user space will not be a problem. Moreover, seccomp filters can be used to restrict the use of syscall families which may not be currently handled by Landlock. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Kees Cook <keescook@chromium.org> Cc: Richard Weinberger <richard@nod.at> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22LSM: Infrastructure management of the superblockCasey Schaufler
Move management of the superblock->sb_security blob out of the individual security modules and into the security infrastructure. Instead of allocating the blobs from within the modules, the modules tell the infrastructure how much space is required, and the space is allocated there. Cc: John Johansen <john.johansen@canonical.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22Merge branch 'kvm-sev-cgroup' into HEADPaolo Bonzini
2021-04-22ice: Enable RSS configure for AVFQi Zhang
Currently, RSS hash input is not available to AVF by ethtool, it is set by the PF directly. Add the RSS configure support for AVF through new virtchnl message, and define the capability flag VIRTCHNL_VF_OFFLOAD_ADV_RSS_PF to query this new RSS offload support. Signed-off-by: Jia Guo <jia.guo@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Tested-by: Bo Chen <BoX.C.Chen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22ice: Advertise virtchnl UDP segmentation offload capabilityBrett Creeley
As the hardware is capable of supporting UDP segmentation offload, add a capability bit to virtchnl.h to communicate this and have the driver advertise its support. Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22ice: Allow ignoring opcodes on specific VFMichal Swiatkowski
Declare bitmap of allowed commands on VF. Initialize default opcodes list that should be always supported. Declare array of supported opcodes for each caps used in virtchnl code. Change allowed bitmap by setting or clearing corresponding bit to allowlist (bit set) or denylist (bit clear). Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22irqdomain: Get rid of irq_create_strict_mappings()Marc Zyngier
No user of this helper is left, remove it. Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-22irqchip/gic-v4.1: Disable vSGI upon (GIC CPUIF < v4.1) detectionLorenzo Pieralisi
GIC CPU interfaces versions predating GIC v4.1 were not built to accommodate vINTID within the vSGI range; as reported in the GIC specifications (8.2 "Changes to the CPU interface"), it is CONSTRAINED UNPREDICTABLE to deliver a vSGI to a PE with ID_AA64PFR0_EL1.GIC < b0011. Check the GIC CPUIF version by reading the SYS_ID_AA64_PFR0_EL1. Disable vSGIs if a CPUIF version < 4.1 is detected to prevent using vSGIs on systems where they may misbehave. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210317100719.3331-2-lorenzo.pieralisi@arm.com
2021-04-22RDMA/nldev: Add QP numbers to SRQ informationNeta Ostrovsky
Add QP numbers that are associated with the SRQ to the SRQ information. The QPs are displayed in a range form. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC lqpn 125-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141-156 pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC lqpn 157-172 pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC lqpn 329-344 pdn 4 pid 3586 comm ibv_srq_pingpon $ rdma res show srq lqpn 126-141 dev ibp8s0f0 srqn 4 type BASIC lqpn 126-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141 pdn 10 pid 3584 comm ibv_srq_pingpon $ rdma res show srq lqpn 127 dev ibp8s0f0 srqn 4 type BASIC lqpn 127 pdn 9 pid 3581 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/79a4bd4caec2248fd9583cccc26786af8e4414fc.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/nldev: Return SRQ informationNeta Ostrovsky
Extend the RDMA nldev return a SRQ information, like SRQ number, SRQ type, PD number, CQ number and process ID that created that SRQ. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC pdn 4 pid 3586 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/322f9210b95812799190dd4a0fb92f3a3bba0333.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/restrack: Add support to get resource tracking for SRQNeta Ostrovsky
In order to track SRQ resources, a new restrack object is initialized and added to the resource tracking database. Link: https://lore.kernel.org/r/0db71c409f24f2f6b019bf8797a8fed96fe7079c.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22RDMA/nldev: Return context informationNeta Ostrovsky
Extend the RDMA nldev return a context information, like ctx number and process ID that created that context. This functionality is helpful to find orphan contexts that are not closed for some reason. Sample output: $ rdma res show ctx dev ibp8s0f0 ctxn 0 pid 980 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 1 pid 981 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 2 pid 992 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong $ rdma res show ctx dev ibp8s0f1 dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong Link: https://lore.kernel.org/r/5c956acfeac4e9d532988575f3da7d64cb449374.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-22Merge branch 'kvm-arm64/kill_oprofile_dependency' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-22perf: Get rid of oprofile leftoversMarc Zyngier
perf_pmu_name() and perf_num_counters() are unused. Drop them. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210414134409.1266357-6-maz@kernel.org
2021-04-22KVM: arm64: Divorce the perf code from oprofile helpersMarc Zyngier
KVM/arm64 is the sole user of perf_num_counters(), and really could do without it. Stop using the obsolete API by relying on the existing probing code. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210414134409.1266357-2-maz@kernel.org
2021-04-22thermal/core: Remove thermal_notify_frameworkThara Gopinath
thermal_notify_framework just updates for a single trip point where as thermal_zone_device_update does other bookkeeping like updating the temperature of the thermal zone and setting the next trip point. The only driver that was using thermal_notify_framework was updated in the previous patch to use thermal_zone_device_update instead. Since there are no users for thermal_notify_framework remove it. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210122023406.3500424-3-thara.gopinath@linaro.org
2021-04-22serial: do not restore interrupt state in sysrq helperJohan Hovold
The uart_unlock_and_check_sysrq() helper can be used to defer processing of sysrq until the interrupt handler has released the port lock and is about to return. Since commit 81e2073c175b ("genirq: Disable interrupts for force threaded handlers") interrupt handlers that are not explicitly requested as threaded are always called with interrupts disabled and there is no need to save the interrupt state when taking the port lock. Instead of adding another sysrq helper for when the interrupt state has not needlessly been saved, drop the state parameter from uart_unlock_and_check_sysrq() and update its callers to no longer explicitly disable interrupts in their interrupt handlers. Cc: Joel Stanley <joel@jms.id.au> Cc: Andrew Jeffery <andrew@aj.id.au> Cc: Andy Gross <agross@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20210416140557.25177-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-22Merge tag 'usb-serial-5.13-rc1' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-next Johan writes: USB-serial updates for 5.13-rc1 Here are the USB-serial updates for 5.13-rc1, including: - better type detection for pl2303 - support for more line speeds for pl2303 (TA/TB) - fixed CSIZE handling for the new xr driver - core support for multi-interface functions - TIOCGSERIAL and TIOCSSERIAL fixes - generic TIOCSSERIAL support (e.g. for closing_wait) - fixed return value for unsupported ioctls - support for gpio valid masks in cp210x - drain-delay fixes and improvements - support for multi-port devices for xr - generalisation of the xr driver to support three new device classes (XR21B142X, XR21B1411 and XR2280X) Included are also various clean ups. All have been in linux-next with no reported issues. * tag 'usb-serial-5.13-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: (72 commits) USB: cdc-acm: add more Maxlinear/Exar models to ignore list USB: serial: xr: add copyright notice USB: serial: xr: reset FIFOs on open USB: serial: xr: add support for XR22801, XR22802, XR22804 USB: serial: xr: add support for XR21B1411 USB: serial: xr: add support for XR21B1421, XR21B1422 and XR21B1424 USB: serial: xr: add type abstraction USB: serial: xr: drop type prefix from shared defines USB: serial: xr: move pin configuration to probe USB: serial: xr: rename GPIO-pin defines USB: serial: xr: rename GPIO-mode defines USB: serial: xr: add support for XR21V1412 and XR21V1414 USB: serial: ti_usb_3410_5052: clean up termios CSIZE handling USB: serial: ti_usb_3410_5052: use kernel types consistently USB: serial: ti_usb_3410_5052: add port-command helpers USB: serial: ti_usb_3410_5052: clean up vendor-request helpers USB: serial: ti_usb_3410_5052: drop unnecessary packed attributes USB: serial: io_ti: drop unnecessary packed attributes USB: serial: io_ti: use kernel types consistently USB: serial: io_ti: add read-port-command helper ...
2021-04-22firmware: xilinx: Add pinctrl supportSai Krishna Potthuri
Adding pinctrl support to query platform specific information (pins) from firmware. Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/1619080202-31924-2-git-send-email-lakshmi.sai.krishna.potthuri@xilinx.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-22devm-helpers: Fix devm_delayed_work_autocancel() kerneldocMatti Vaittinen
The kerneldoc for devm_delayed_work_autocancel() contains invalid parameter description. Fix the parameter description. And while at it - make it more obvous that this function operates on delayed_work. That helps differentiating with resource-managed INIT_WORK description (which should follow in near future) Fixes: 0341ce544394 ("workqueue: Add resource managed version of delayed work init") Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Link: https://lore.kernel.org/r/db3a8b4b8899fdf109a0cc760807de12d3b4f09b.1619028482.git.matti.vaittinen@fi.rohmeurope.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-21scsi: blk-mq: Fix build warning when making htmldocsMing Lei
Fixes the following warning when running 'make htmldocs': include/linux/blk-mq.h:395: warning: Function parameter or member 'set_rq_budget_token' not described in 'blk_mq_ops' include/linux/blk-mq.h:395: warning: Function parameter or member 'get_rq_budget_token' not described in 'blk_mq_ops' [mkp: added warning messages] Link: https://lore.kernel.org/r/20210421154526.1954174-1-ming.lei@redhat.com Fixes: d022d18c045f ("scsi: blk-mq: Add callbacks for storing & retrieving budget token") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-22pinctrl: Add PIN_CONFIG_MODE_PWM to enum pin_config_paramAndy Shevchenko
It seems that we will have more and more pin controllers that support PWM function on the (selected) pins. Due to it being a part of pin controller IP the idea is to have some code that will switch the mode and attach the corresponding driver, for example, via using it as a library. Meanwhile, put a corresponding item to the pin_config_param enumerator. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210412140741.39946-3-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-22pinctrl: Introduce MODE group in enum pin_config_paramAndy Shevchenko
Better to have a MODE group of settings to keep them together when ordered alphabetically. Hence, rename PIN_CONFIG_LOW_POWER_MODE to PIN_CONFIG_MODE_LOW_POWER. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210412140741.39946-2-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-22pinctrl: Keep enum pin_config_param ordered by nameAndy Shevchenko
It seems the ordering is by name. Keep it that way. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210412140741.39946-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-22dt-bindings: pinctrl: Add binding for ZynqMP pinctrl driverSai Krishna Potthuri
Adding documentation and dt-bindings file which contains MIO pin configuration defines for Xilinx ZynqMP pinctrl driver. Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/1618485193-5403-3-git-send-email-lakshmi.sai.krishna.potthuri@xilinx.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-22dt-bindings: pinctrl: mt8195: add pinctrl file and binding documentZhiyong Tao
1. This patch adds pinctrl file for mt8195. 2. This patch adds mt8195 compatible node in binding document. Signed-off-by: Zhiyong Tao <zhiyong.tao@mediatek.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210413055702.27535-2-zhiyong.tao@mediatek.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-04-21KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA commandBrijesh Singh
The command is used for copying the incoming buffer into the SEV guest memory space. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <c5d0e3e719db7bb37ea85d79ed4db52e9da06257.1618498113.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: SVM: Add support for KVM_SEV_RECEIVE_START commandBrijesh Singh
The command is used to create the encryption context for an incoming SEV guest. The encryption context can be later used by the hypervisor to import the incoming data into the SEV guest memory space. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <c7400111ed7458eee01007c4d8d57cdf2cbb0fc2.1618498113.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: SVM: Add support for KVM_SEV_SEND_CANCEL commandSteve Rutherford
After completion of SEND_START, but before SEND_FINISH, the source VMM can issue the SEND_CANCEL command to stop a migration. This is necessary so that a cancelled migration can restart with a new target later. Reviewed-by: Nathan Tempelman <natet@google.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Steve Rutherford <srutherford@google.com> Message-Id: <20210412194408.2458827-1-srutherford@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: SVM: Add KVM_SEND_UPDATE_DATA commandBrijesh Singh
The command is used for encrypting the guest memory region using the encryption context created with KVM_SEV_SEND_START. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by : Steve Rutherford <srutherford@google.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <d6a6ea740b0c668b30905ae31eac5ad7da048bb3.1618498113.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: SVM: Add KVM_SEV SEND_START commandBrijesh Singh
The command is used to create an outgoing SEV guest encryption context. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <2f1686d0164e0f1b3d6a41d620408393e0a48376.1618498113.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: Boost vCPU candidate in user mode which is delivering interruptWanpeng Li
Both lock holder vCPU and IPI receiver that has halted are condidate for boost. However, the PLE handler was originally designed to deal with the lock holder preemption problem. The Intel PLE occurs when the spinlock waiter is in kernel mode. This assumption doesn't hold for IPI receiver, they can be in either kernel or user mode. the vCPU candidate in user mode will not be boosted even if they should respond to IPIs. Some benchmarks like pbzip2, swaptions etc do the TLB shootdown in kernel mode and most of the time they are running in user mode. It can lead to a large number of continuous PLE events because the IPI sender causes PLE events repeatedly until the receiver is scheduled while the receiver is not candidate for a boost. This patch boosts the vCPU candidiate in user mode which is delivery interrupt. We can observe the speed of pbzip2 improves 10% in 96 vCPUs VM in over-subscribe scenario (The host machine is 2 socket, 48 cores, 96 HTs Intel CLX box). There is no performance regression for other benchmarks like Unixbench spawn (most of the time contend read/write lock in kernel mode), ebizzy (most of the time contend read/write sem and TLB shoodtdown in kernel mode). Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1618542490-14756-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>