Age | Commit message (Collapse) | Author |
|
With the latest mm-unstable, setting the iaa_crypto sync_mode to 'async'
causes crypto testmgr.c test_acomp() failure and dmesg call traces, and
zswap being unable to use 'deflate-iaa' as a compressor:
echo async > /sys/bus/dsa/drivers/crypto/sync_mode
[ 255.271030] zswap: compressor deflate-iaa not available
[ 369.960673] INFO: task cryptomgr_test:4889 blocked for more than 122 seconds.
[ 369.970127] Not tainted 6.13.0-rc1-mm-unstable-12-16-2024+ #324
[ 369.977411] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 369.986246] task:cryptomgr_test state:D stack:0 pid:4889 tgid:4889 ppid:2 flags:0x00004000
[ 369.986253] Call Trace:
[ 369.986256] <TASK>
[ 369.986260] __schedule+0x45c/0xfa0
[ 369.986273] schedule+0x2e/0xb0
[ 369.986277] schedule_timeout+0xe7/0x100
[ 369.986284] ? __prepare_to_swait+0x4e/0x70
[ 369.986290] wait_for_completion+0x8d/0x120
[ 369.986293] test_acomp+0x284/0x670
[ 369.986305] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986312] alg_test_comp+0x263/0x440
[ 369.986315] ? sched_balance_newidle+0x259/0x430
[ 369.986320] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986323] alg_test.part.27+0x103/0x410
[ 369.986326] ? __schedule+0x464/0xfa0
[ 369.986330] ? __pfx_cryptomgr_test+0x10/0x10
[ 369.986333] cryptomgr_test+0x20/0x40
[ 369.986336] kthread+0xda/0x110
[ 369.986344] ? __pfx_kthread+0x10/0x10
[ 369.986346] ret_from_fork+0x2d/0x40
[ 369.986355] ? __pfx_kthread+0x10/0x10
[ 369.986358] ret_from_fork_asm+0x1a/0x30
[ 369.986365] </TASK>
This happens because the only async polling without interrupts that
iaa_crypto currently implements is with the 'sync' mode. With 'async',
iaa_crypto calls to compress/decompress submit the descriptor and return
-EINPROGRESS, without any mechanism in the driver to poll for
completions. Hence callers such as test_acomp() in crypto/testmgr.c or
zswap, that wrap the calls to crypto_acomp_compress() and
crypto_acomp_decompress() in synchronous wrappers, will block
indefinitely. Even before zswap can notice this problem, the crypto
testmgr.c's test_acomp() will fail and prevent registration of
"deflate-iaa" as a valid crypto acomp algorithm, thereby disallowing the
use of "deflate-iaa" as a zswap compress (zswap will fall-back to the
default compressor in this case).
To fix this issue, this patch modifies the iaa_crypto sync_mode set
function to treat 'async' equivalent to 'sync', so that the correct and
only supported driver async polling without interrupts implementation is
enabled, and zswap can use 'deflate-iaa' as the compressor.
Hence, with this patch, this is what will happen:
echo async > /sys/bus/dsa/drivers/crypto/sync_mode
cat /sys/bus/dsa/drivers/crypto/sync_mode
sync
There are no crypto/testmgr.c test_acomp() errors, no call traces and zswap
can use 'deflate-iaa' without any errors. The iaa_crypto documentation has
also been updated to mention this caveat with 'async' and what to expect
with this fix.
True iaa_crypto async polling without interrupts is enabled in patch
"crypto: iaa - Implement batch_compress(), batch_decompress() API in
iaa_crypto." [1] which is under review as part of the "zswap IAA compress
batching" patch-series [2]. Until this is merged, we would appreciate it if
this current patch can be considered for a hotfix.
[1]: https://patchwork.kernel.org/project/linux-mm/patch/20241221063119.29140-5-kanchana.p.sridhar@intel.com/
[2]: https://patchwork.kernel.org/project/linux-mm/list/?series=920084
Fixes: 09646c98d ("crypto: iaa - Add irq support for the crypto async interface")
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
|
|
These knobs offer more fine-grained control to userspace than needed and
directly expose/influence kernel implementation; remove them.
For disabling same_filled handling, there is no logical reason to refuse
storing same-filled pages more efficiently and opt for compression.
Scanning pages for patterns may be an argument, but the page contents will
be read into the CPU cache anyway during compression. Also, removing the
same_filled handling code does not move the needle significantly in terms
of performance anyway [1].
For disabling non_same_filled handling, it was added when the compressed
pages in zswap were not being properly charged to memcgs, as workloads
could escape the accounting with compression [2]. This is no longer the
case after commit f4840ccfca25 ("zswap: memcg accounting"), and using
zswap without compression does not make much sense.
[1]https://lore.kernel.org/lkml/CAJD7tkaySFP2hBQw4pnZHJJwe3bMdjJ1t9VC2VJd=khn1_TXvA@mail.gmail.com/
[2]https://lore.kernel.org/lkml/19d5cdee-2868-41bd-83d5-6da75d72e940@maciej.szmigiero.name/
[yosryahmed@google.com: remove same_filled_pages from docs]
Link: https://lkml.kernel.org/r/ZhxFVggdyvCo79jc@google.com
Link: https://lkml.kernel.org/r/20240413022407.785696-5-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This cleans up the following issues I ran into when trying to use the
scripts and commands in the iaa-crypto.rst document.
- Fix incorrect arguments being passed to accel-config
config-wq.
- Replace --device_name with --driver-name.
- Replace --driver_name with --driver-name.
- Replace --size with --wq-size.
- Add missing --priority argument.
- Add missing accel-config config-engine command after the
config-wq commands.
- Fix wq name passed to accel-config config-wq.
- Add rmmod/modprobe of iaa_crypto to script that disables,
then enables all devices and workqueues to avoid enable-wq
failing with -EEXIST when trying to register to compression
algorithm.
- Fix device name in cases where iaa was used instead of iax.
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-crypto@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Currently, the wq_stats output also includes the global stats, while
the individual global stats are also available as separate debugfs
files. Since these are all read-only, there's really no reason to
have them as separate files, especially since we already display them
as global stats in the wq_stats. It makes more sense to just add a
separate global_stats file to display those, and remove them from the
wq_stats, as well as removing the individual stats files.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Because the IAA Compression Accelerator requires significant user
setup in order to be used properly, this adds documentation on the
iaa_crypto driver including setup, usage, and examples.
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|