summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-10-18mm: don't include <linux/blkdev.h> in <linux/backing-dev.h>Christoph Hellwig
Move inode_to_bdi out of line to avoid having to include blkdev.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18mm: don't include <linux/blk-cgroup.h> in <linux/backing-dev.h>Christoph Hellwig
There is no need to pull blk-cgroup.h and thus blkdev.h in here, so break the include chain. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18mm: don't include <linux/blk-cgroup.h> in <linux/writeback.h>Christoph Hellwig
blk-cgroup.h pulls in blkdev.h and thus pretty much all the block headers. Break this dependency chain by turning wbc_blkcg_css into a macro and dropping the blk-cgroup.h include. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18firmware: arm_ffa: Add support for MEM_LENDMarc Bonnici
As part of the FF-A spec, an endpoint is allowed to transfer access of, or lend, a memory region to one or more borrowers. Extend the existing memory sharing implementation to support FF-A MEM_LEND functionality and expose this to other kernel drivers. Note that upon a successful MEM_LEND request the caller must ensure that the memory region specified is not accessed until a successful MEM_RECALIM call has been made. On systems with a hypervisor present this will been enforced, however on systems without a hypervisor the responsibility falls to the calling kernel driver to prevent access. Link: https://lore.kernel.org/r/20211015165742.2513065-1-marc.bonnici@arm.com Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Marc Bonnici <marc.bonnici@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2021-10-18net: sched: Remove Qdisc::running sequence counterAhmed S. Darwish
The Qdisc::running sequence counter has two uses: 1. Reliably reading qdisc's tc statistics while the qdisc is running (a seqcount read/retry loop at gnet_stats_add_basic()). 2. As a flag, indicating whether the qdisc in question is running (without any retry loops). For the first usage, the Qdisc::running sequence counter write section, qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what is actually needed: the raw qdisc's bstats update. A u64_stats sync point was thus introduced (in previous commits) inside the bstats structure itself. A local u64_stats write section is then started and stopped for the bstats updates. Use that u64_stats sync point mechanism for the bstats read/retry loop at gnet_stats_add_basic(). For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag, accessed with atomic bitops, is sufficient. Using a bit flag instead of a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads to the SMP barriers implicitly added through raw_read_seqcount() and write_seqcount_begin/end() getting removed. All call sites have been surveyed though, and no required ordering was identified. Now that the qdisc->running sequence counter is no longer used, remove it. Note, using u64_stats implies no sequence counter protection for 64-bit architectures. This can lead to the qdisc tc statistics "packets" vs. "bytes" values getting out of sync on rare occasions. The individual values will still be valid. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data typesAhmed S. Darwish
The only factor differentiating per-CPU bstats data type (struct gnet_stats_basic_cpu) from the packed non-per-CPU one (struct gnet_stats_basic_packed) was a u64_stats sync point inside the former. The two data types are now equivalent: earlier commits added a u64_stats sync point to the latter. Combine both data types into "struct gnet_stats_basic_sync". This eliminates redundancy and simplifies the bstats read/write APIs. Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit architectures, u64_stats sync points do not use sequence counter protection. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18net: sched: Protect Qdisc::bstats with u64_statsAhmed S. Darwish
The not-per-CPU variant of qdisc tc (traffic control) statistics, Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running sequence counter. This sequence counter is used for reliably protecting bstats reads from parallel writes. Meanwhile, the seqcount's write section covers a much wider area than bstats update: qdisc_run_begin() => qdisc_run_end(). That read/write section asymmetry can lead to needless retries of the read section. To prepare for removing the Qdisc::running sequence counter altogether, introduce a u64_stats sync point inside bstats instead. Modify _bstats_update() to start/end the bstats u64_stats write section. For bisectability, and finer commits granularity, the bstats read section is still protected with a Qdisc::running read/retry loop and qdisc_run_begin/end() still starts/ends that seqcount write section. Once all call sites are modified to use _bstats_update(), the Qdisc::running seqcount will be removed and bstats read/retry loop will be modified to utilize the internal u64_stats sync point. Note, using u64_stats implies no sequence counter protection for 64-bit architectures. This can lead to the statistics "packets" vs. "bytes" values getting out of sync on rare occasions. The individual values will still be valid. [bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.] Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18u64_stats: Introduce u64_stats_set()Ahmed S. Darwish
Allow to directly set a u64_stats_t value which is used to provide an init function which sets it directly to zero intead of memset() the value. Add u64_stats_set() to the u64_stats API. [bigeasy: commit message. ] Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18gen_stats: Move remaining users to gnet_stats_add_queue().Sebastian Andrzej Siewior
The gnet_stats_queue::qlen member is only used in the SMP-case. qdisc_qstats_qlen_backlog() needs to add qdisc_qlen() to qstats.qlen to have the same value as that provided by qdisc_qlen_sum(). gnet_stats_copy_queue() needs to overwritte the resulting qstats.qlen field whith the caller submitted qlen value. It might be differ from the submitted value. Let both functions use gnet_stats_add_queue() and remove unused __gnet_stats_copy_queue(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18gen_stats: Add gnet_stats_add_queue().Sebastian Andrzej Siewior
This function will replace __gnet_stats_copy_queue(). It reads all arguments and adds them into the passed gnet_stats_queue argument. In contrast to __gnet_stats_copy_queue() it also copies the qlen member. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18gen_stats: Add instead Set the value in __gnet_stats_copy_basic().Sebastian Andrzej Siewior
__gnet_stats_copy_basic() always assigns the value to the bstats argument overwriting the previous value. The later added per-CPU version always accumulated the values in the returning gnet_stats_basic_packed argument. Based on review there are five users of that function as of today: - est_fetch_counters(), ___gnet_stats_copy_basic() memsets() bstats to zero, single invocation. - mq_dump(), mqprio_dump(), mqprio_dump_class_stats() memsets() bstats to zero, multiple invocation but does not use the function due to !qdisc_is_percpu_stats(). Add the values in __gnet_stats_copy_basic() instead overwriting. Rename the function to gnet_stats_add_basic() to make it more obvious. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18mm/writeback: Add folio_write_oneMatthew Wilcox (Oracle)
Transform write_one_page() into folio_write_one() and add a compatibility wrapper. Also move the declaration to pagemap.h as this is page cache functionality that doesn't need to be used by the rest of the kernel. Saves 58 bytes of kernel text. While folio_write_one() is 101 bytes smaller than write_one_page(), the inlined call to page_folio() expands each caller. There are fewer than ten callers so it doesn't seem worth putting a wrapper in the core. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com>
2021-10-18mm/filemap: Add FGP_STABLEMatthew Wilcox (Oracle)
Allow filemap_get_folio() to wait for writeback to complete (if the filesystem wants that behaviour). This is the folio equivalent of grab_cache_page_write_begin(), which is moved into the folio-compat file as a reminder to migrate all the code using it. This paves the way for getting rid of AOP_FLAG_NOFS once grab_cache_page_write_begin() is removed. Kernel grows by 11 bytes. filemap_get_folio() grows by 33 bytes but grab_cache_page_write_begin() shrinks by 22 bytes to make up for it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/filemap: Add filemap_get_folioMatthew Wilcox (Oracle)
filemap_get_folio() is a replacement for find_get_page(). Turn pagecache_get_page() into a wrapper around __filemap_get_folio(). Remove find_lock_head() as this use case is now covered by filemap_get_folio(). Reduces overall kernel size by 209 bytes. __filemap_get_folio() is 316 bytes shorter than pagecache_get_page() was, but the new pagecache_get_page() wrapper is 99 bytes. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/filemap: Add filemap_add_folio()Matthew Wilcox (Oracle)
Convert __add_to_page_cache_locked() into __filemap_add_folio(). Add an assertion to it that (for !hugetlbfs), the folio is naturally aligned within the file. Move the prototype from mm.h to pagemap.h. Convert add_to_page_cache_lru() into filemap_add_folio(). Add a compatibility wrapper for unconverted callers. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/filemap: Add filemap_alloc_folioMatthew Wilcox (Oracle)
Reimplement __page_cache_alloc as a wrapper around filemap_alloc_folio to allow filesystems to be converted at our leisure. Increases kernel text size by 133 bytes, mostly in cachefiles_read_backing_file(). pagecache_get_page() shrinks by 32 bytes, though. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/page_alloc: Add folio allocation functionsMatthew Wilcox (Oracle)
The __folio_alloc(), __folio_alloc_node() and folio_alloc() functions are mostly for type safety, but they also ensure that the page allocator allocates a compound page and initialises the deferred list if the page is large enough to have one. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/lru: Add folio_add_lru()Matthew Wilcox (Oracle)
Reimplement lru_cache_add() as a wrapper around folio_add_lru(). Saves 159 bytes of kernel text due to removing calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/lru: Convert __pagevec_lru_add_fn to take a folioMatthew Wilcox (Oracle)
This saves five calls to compound_head(), totalling 60 bytes of text. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/workingset: Convert workingset_refault() to take a folioMatthew Wilcox (Oracle)
This nets us 178 bytes of savings from removing calls to compound_head. The three callers all grow a little, but each of them will be converted to use folios soon, so that's fine. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/filemap: Add readahead_folio()Matthew Wilcox (Oracle)
The pointers stored in the page cache are folios, by definition. This change comes with a behaviour change -- callers of readahead_folio() are no longer required to put the page reference themselves. This matches how readpage works, rather than matching how readpages used to work. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/filemap: Add folio_mkwrite_check_truncate()Matthew Wilcox (Oracle)
This is the folio equivalent of page_mkwrite_check_truncate(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com>
2021-10-18mm/filemap: Add i_blocks_per_folio()Matthew Wilcox (Oracle)
Reimplement i_blocks_per_page() as a wrapper around i_blocks_per_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_redirty_for_writepage()Matthew Wilcox (Oracle)
Reimplement redirty_page_for_writepage() as a wrapper around folio_redirty_for_writepage(). Account the number of pages in the folio, add kernel-doc and move the prototype to writeback.h. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_account_redirty()Matthew Wilcox (Oracle)
Account the number of pages in the folio that we're redirtying. Turn account_page_dirty() into a wrapper around it. Also turn the comment on folio_account_redirty() into kernel-doc and edit it slightly so it makes sense to its potential callers. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_clear_dirty_for_io()Matthew Wilcox (Oracle)
Transform clear_page_dirty_for_io() into folio_clear_dirty_for_io() and add a compatibility wrapper. Also move the declaration to pagemap.h as this is page cache functionality that doesn't need to be used by the rest of the kernel. Increases the size of the kernel by 79 bytes. While we remove a few calls to compound_head(), we add a call to folio_nr_pages() to get the stats correct for the eventual support of multi-page folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_cancel_dirty()Matthew Wilcox (Oracle)
Turn __cancel_dirty_page() into __folio_cancel_dirty() and add wrappers. Move the prototypes into pagemap.h since this is page cache functionality. Saves 44 bytes of kernel text in total; 33 bytes from __folio_cancel_dirty and 11 from two callers of cancel_dirty_page(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_account_cleaned()Matthew Wilcox (Oracle)
Get the statistics right; compound pages were being accounted as a single page. This didn't matter before now as no filesystem which supported compound pages did writeback. Also move the declaration to pagemap.h since this is part of the page cache. Add a wrapper for account_page_cleaned(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add filemap_dirty_folio()Matthew Wilcox (Oracle)
Reimplement __set_page_dirty_nobuffers() as a wrapper around filemap_dirty_folio(). Eventually folio_mark_dirty() will pass the folio's mapping to the address space's ->dirty_folio() operation, so add the parameter to filemap_dirty_folio() now. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Convert tracing writeback_page_template to foliosMatthew Wilcox (Oracle)
Rename writeback_dirty_page() to writeback_dirty_folio() and wait_on_page_writeback() to folio_wait_writeback(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add __folio_mark_dirty()Matthew Wilcox (Oracle)
Turn __set_page_dirty() into a wrapper around __folio_mark_dirty(). Convert account_page_dirtied() into folio_account_dirtied() and account the number of pages in the folio to support multi-page folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_mark_dirty()Matthew Wilcox (Oracle)
Reimplement set_page_dirty() as a wrapper around folio_mark_dirty(). There is no change to filesystems as they were already being called with the compound_head of the page being marked dirty. We avoid several calls to compound_head(), both statically (through using folio_test_dirty() instead of PageDirty() and dynamically by calling folio_mapping() instead of page_mapping(). Also return bool instead of int to show the range of values actually returned, and add kernel-doc. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add folio_start_writeback()Matthew Wilcox (Oracle)
Rename set_page_writeback() to folio_start_writeback() to match folio_end_writeback(). Do not bother with wrappers that return void; callers are perfectly capable of ignoring return values. Add wrappers for set_page_writeback(), set_page_writeback_keepwrite() and test_set_page_writeback() for compatibililty with existing filesystems. The main advantage of this patch is getting the statistics right, although it does eliminate a couple of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/writeback: Add __folio_end_writeback()Matthew Wilcox (Oracle)
test_clear_page_writeback() is actually an mm-internal function, although it's named as if it's a pagecache function. Move it to mm/internal.h, rename it to __folio_end_writeback() and change the return type to bool. The conversion from page to folio is mostly about accounting the number of pages being written back, although it does eliminate a couple of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18flex_proportions: Allow N events instead of 1Matthew Wilcox (Oracle)
When batching events (such as writing back N pages in a single I/O), it is better to do one flex_proportion operation instead of N. There is only one caller of __fprop_inc_percpu_max(), and it's the one we're going to change in the next patch, so rename it instead of adding a compatibility wrapper. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz>
2021-10-18mm/writeback: Rename __add_wb_stat() to wb_stat_mod()Matthew Wilcox (Oracle)
Make this look like the newly renamed vmstat functions. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/migrate: Add folio_migrate_copy()Matthew Wilcox (Oracle)
This is the folio equivalent of migrate_page_copy(), which is retained as a wrapper for filesystems which are not yet converted to folios. Also convert copy_huge_page() to folio_copy(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/migrate: Add folio_migrate_flags()Matthew Wilcox (Oracle)
Turn migrate_page_states() into a wrapper around folio_migrate_flags(). Also convert two functions only called from folio_migrate_flags() to be folio-based. ksm_migrate_page() becomes folio_migrate_ksm() and copy_page_owner() becomes folio_copy_owner(). folio_migrate_flags() alone shrinks by two thirds -- 1967 bytes down to 642 bytes. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/migrate: Add folio_migrate_mapping()Matthew Wilcox (Oracle)
Reimplement migrate_page_move_mapping() as a wrapper around folio_migrate_mapping(). Saves 193 bytes of kernel text. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/rmap: Add folio_mkclean()Matthew Wilcox (Oracle)
Transform page_mkclean() into folio_mkclean() and add a page_mkclean() wrapper around folio_mkclean(). folio_mkclean is 15 bytes smaller than page_mkclean, but the kernel is enlarged by 33 bytes due to inlining page_folio() into each caller. This will go away once the callers are converted to use folio_mkclean(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/swap: Add folio_mark_accessed()Matthew Wilcox (Oracle)
Convert mark_page_accessed() to folio_mark_accessed(). It already operated on the entire compound page, but now we can avoid calling compound_head quite so many times. Shrinks the function from 424 bytes to 295 bytes (shrinking by 129 bytes). The compatibility wrapper is 30 bytes, plus the 8 bytes for the exported symbol means the kernel shrinks by 91 bytes. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm/swap: Add folio_activate()Matthew Wilcox (Oracle)
This replaces activate_page() and eliminates lots of calls to compound_head(). Saves net 118 bytes of kernel text. There are still some redundant calls to page_folio() here which will be removed when pagevec_lru_move_fn() is converted to use folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm: Add folio_young and folio_idleMatthew Wilcox (Oracle)
Idle page tracking is handled through page_ext on 32-bit architectures. Add folio equivalents for 32-bit and move all the page compatibility parts to common code. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com>
2021-10-18mm: Add arch_make_folio_accessible()Matthew Wilcox (Oracle)
As a default implementation, call arch_make_page_accessible n times. If an architecture can do better, it can override this. Also move the default implementation of arch_make_page_accessible() from gfp.h to mm.h. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm: Add kmap_local_folio()Matthew Wilcox (Oracle)
This allows us to map a portion of a folio. Callers can only expect to access up to the next page boundary. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18mm: Add flush_dcache_folio()Matthew Wilcox (Oracle)
This is a default implementation which calls flush_dcache_page() on each page in the folio. If architectures can do better, they should implement their own version of it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18ALSA: memalloc: Convert x86 SG-buffer handling with non-contiguous typeTakashi Iwai
We've had an x86-specific SG-buffer handling code, but now it can be merged gracefully with the standard non-contiguous DMA pages. After the migration, SNDRV_DMA_TYPE_DMA_SG becomes identical with SNDRV_DMA_TYPE_NONCONTIG on x86, while others still fall back to SNDRV_DMA_TYPE_DEV. The remaining problem is about the SG-buffer with WC pages: the DMA core stuff on x86 doesn't treat it well, so we still need some special handling to manipulate the page attribute manually. The mmap handler for SNDRV_DMA_TYPE_DEV_SG_WC still returns -ENOENT intentionally for the fallback to the default handler. Link: https://lore.kernel.org/r/20211017074859.24112-4-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-18ALSA: memalloc: Support for non-coherent page allocationTakashi Iwai
Following to the addition of non-contiguous pages, this patch adds the new contiguous non-coherent page allocation to the standard memalloc helper. Like the previous non-contig type, this non-coherent type is also directional and requires the explicit sync, too. Hence the driver using this type of buffer may need to set SNDRV_PCM_INFO_EXPLICIT_SYNC flag to the PCM hardware.info as well, unless it's set up in the managed mode. Link: https://lore.kernel.org/r/20211017074859.24112-3-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-18ALSA: memalloc: Support for non-contiguous page allocationTakashi Iwai
This patch adds the support for allocation of non-contiguous DMA pages in the common memalloc helper. It's another SG-buffer type, but unlike the existing one, this is directional and requires the explicit sync / invalidation of dirty pages on non-coherent architectures. For this enhancement, the following points are changed: - snd_dma_device stores the DMA direction. - snd_dma_device stores need_sync flag indicating whether the explicit sync is required or not. - A new variant of helper functions, snd_dma_alloc_dir_pages() and *_all() are introduced; the old snd_dma_alloc_pages() and *_all() kept as just wrappers with DMA_BIDIRECTIONAL. - A new helper snd_dma_buffer_sync() is introduced; this gets called in the appropriate places. - A new allocation type, SNDRV_DMA_TYPE_NONCONTIG, is introduced. When the driver allocates pages with this new type, and it may require the SNDRV_PCM_INFO_EXPLICIT_SYNC flag set to the PCM hardware.info for taking the full control of PCM applptr and hwptr changes (that implies disabling the mmap of control/status data). When the buffer allocation is managed by snd_pcm_set_managed_buffer(), this flag is automatically set depending on the result of dma_need_sync() internally. Otherwise, if the buffer is managed manually, the driver has to set the flag explicitly, too. The explicit sync between CPU and device for non-coherent memory is performed at the points before and after read/write transfer as well as the applptr/hwptr syncptr ioctl. In the case of mmap mode, user-space is supposed to call the syncptr ioctl with the hwptr flag to update and fetch the status at first; this corresponds to CPU-sync. Then user-space advances the applptr via syncptr ioctl again with applptr flag, and this corresponds to the device sync with flushing. Other than the DMA direction and the explicit sync, the usage of this new buffer type is almost equivalent with the existing SNDRV_DMA_TYPE_DEV_SG; you can get the page and the address via snd_sgbuf_get_page() and snd_sgbuf_get_addr(), also calculate the continuous pages via snd_sgbuf_get_chunk_size(). For those SG-page handling, the non-contig type shares the same ops with the vmalloc handler. As we do always vmap the SG pages at first, the actual address can be deduced from the vmapped address easily without iterating the SG-list. Link: https://lore.kernel.org/r/20211017074859.24112-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-10-18iommu/vt-d: Avoid duplicate removing in __domain_mapping()Longpeng(Mike)
The __domain_mapping() always removes the pages in the range from 'iov_pfn' to 'end_pfn', but the 'end_pfn' is always the last pfn of the range that the caller wants to map. This would introduce too many duplicated removing and leads the map operation take too long, for example: Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x7e617ff iov_pfn: 0x140000, end_pfn: 0x7e617ff iov_pfn: 0x180000, end_pfn: 0x7e617ff iov_pfn: 0x1c0000, end_pfn: 0x7e617ff iov_pfn: 0x200000, end_pfn: 0x7e617ff ... it takes about 50ms in total. We can reduce the cost by recalculate the 'end_pfn' and limit it to the boundary of the end of this pte page. Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x13ffff iov_pfn: 0x140000, end_pfn: 0x17ffff iov_pfn: 0x180000, end_pfn: 0x1bffff iov_pfn: 0x1c0000, end_pfn: 0x1fffff iov_pfn: 0x200000, end_pfn: 0x23ffff ... it only need 9ms now. This also removes a meaningless BUG_ON() in __domain_mapping(). Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Tested-by: Liujunjie <liujunjie23@huawei.com> Link: https://lore.kernel.org/r/20211008000433.1115-1-longpeng2@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>