Age  Commit message  Author
2022-01-10Merge tag 'linux-kselftest-next-5.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest update from Shuah Khan:
 "Fixes to build errors, false negatives, and several code cleanups, including the ARRAY_SIZE cleanup that removes 25+ duplicate ARRAY_SIZE defines from individual tests"

* tag 'linux-kselftest-next-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vm: remove ARRAY_SIZE define from individual tests
  selftests/timens: remove ARRAY_SIZE define from individual tests
  selftests/sparc64: remove ARRAY_SIZE define from adi-test
  selftests/seccomp: remove ARRAY_SIZE define from seccomp_benchmark
  selftests/rseq: remove ARRAY_SIZE define from individual tests
  selftests/net: remove ARRAY_SIZE define from individual tests
  selftests/landlock: remove ARRAY_SIZE define from common.h
  selftests/ir: remove ARRAY_SIZE define from ir_loopback.c
  selftests/core: remove ARRAY_SIZE define from close_range_test.c
  selftests/cgroup: remove ARRAY_SIZE define from cgroup_util.h
  selftests/arm64: remove ARRAY_SIZE define from vec-syscfg.c
  tools: fix ARRAY_SIZE defines in tools and selftests hdrs
  selftests: cgroup: build error multiple outpt files
  selftests/move_mount_set_group remove unneeded conversion to bool
  selftests/mount: remove unneeded conversion to bool
  selftests: harness: avoid false negatives if test has no ASSERTs
  selftests/ftrace: make kprobe profile testcase description unique
  selftests: clone3: clone3: add case CLONE3_ARGS_NO_TEST
  selftests: timers: Remove unneeded semicolon
  kselftests: timers:Remove unneeded semicolon
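For readers unfamiliar with the macro being deduplicated, here is a minimal sketch of the ARRAY_SIZE idiom. It assumes the plain sizeof form; the in-tree definition additionally rejects pointer arguments at compile time, which is left out here. The table and helper names are hypothetical.

```c
#include <stddef.h>

/* Simplified form of the idiom; the shared in-tree header adds a
 * compile-time check that the argument really is an array. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

/* Hypothetical test table, for illustration only. */
static const int test_sizes[] = { 1, 2, 4, 8, 16 };

/* Sum every entry without hard-coding the element count. */
static int sum_test_sizes(void)
{
	int sum = 0;

	for (size_t i = 0; i < ARRAY_SIZE(test_sizes); i++)
		sum += test_sizes[i];
	return sum;
}
```

Keeping one shared definition means a test that grows its table never has to touch a private copy of the macro.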
2022-01-10Merge tag 'slab-for-5.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:
- Separate struct slab from struct page - an offshoot of the page folio work. Struct page fields used by slab allocators are moved from struct page to a new struct slab, that uses the same physical storage. Similar to struct folio, it always is a head page. This brings better type safety, separation of large kmalloc allocations from true slabs, and cleanup of related objcg code.
- A SLAB_MERGE_DEFAULT config optimization.

* tag 'slab-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (33 commits)
  mm/slob: Remove unnecessary page_mapcount_reset() function call
  bootmem: Use page->index instead of page->freelist
  zsmalloc: Stop using slab fields in struct page
  mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled
  mm/slub: Simplify struct slab slabs field definition
  mm/sl*b: Differentiate struct slab fields by sl*b implementations
  mm/kfence: Convert kfence_guarded_alloc() to struct slab
  mm/kasan: Convert to struct folio and struct slab
  mm/slob: Convert SLOB to use struct slab and struct folio
  mm/memcg: Convert slab objcgs from struct page to struct slab
  mm: Convert struct page to struct slab in functions used by other subsystems
  mm/slab: Finish struct page to struct slab conversion
  mm/slab: Convert most struct page to struct slab by spatch
  mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab
  mm/slub: Finish struct page to struct slab conversion
  mm/slub: Convert most struct page to struct slab by spatch
  mm/slub: Convert pfmemalloc_match() to take a struct slab
  mm/slub: Convert __free_slab() to use struct slab
  mm/slub: Convert alloc_slab_page() to return a struct slab
  mm/slub: Convert print_page_info() to print_slab_info()
  ...
2022-01-10Merge branch 'random-5.17-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "These are a bit more numerous than usual for the RNG, due to folks resubmitting patches that had been pending prior and generally renewed interest. There are a few categories of patches in here:

1) Dominik Brodowski and I traded a series back and forth for some weeks that fixed numerous issues related to seeds being provided at extremely early boot by the firmware, before other parts of the kernel or of the RNG have been initialized, both fixing some crashes and addressing correctness around early boot randomness. One of these is marked for stable.

2) I replaced the RNG's usage of SHA-1 with BLAKE2s in the entropy extractor, and made the construction a bit safer and more standard. This was sort of a long overdue low hanging fruit, as we were supposed to have phased out SHA-1 usage quite some time ago (even if all we needed here was non-invertibility). Along the way it also made extraction 131% faster. This required a bit of Kconfig and symbol plumbing to make things work well with the crypto libraries, which is one of the reasons why I'm sending you this pull early in the cycle.

3) I got rid of a truly superfluous call to RDRAND in the hot path, which resulted in a whopping 370% increase in performance.

4) Sebastian Andrzej Siewior sent some patches regarding PREEMPT_RT, the full series of which wasn't ready yet, but the first two preparatory cleanups were good on their own. One of them touches files in kernel/irq/, which is the other reason why I'm sending you this pull early in the cycle.
5) Other assorted correctness fixes from Eric Biggers, Jann Horn, Mark Brown, Dominik Brodowski, and myself" * 'random-5.17-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: don't reset crng_init_cnt on urandom_read() random: avoid superfluous call to RDRAND in CRNG extraction random: early initialization of ChaCha constants random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs random: harmonize "crng init done" messages random: mix bootloader randomness into pool random: do not throw away excess input to crng_fast_load random: do not re-init if crng_reseed completes before primary init random: fix crash on multiple early calls to add_bootloader_randomness() random: do not sign extend bytes for rotation when mixing random: use BLAKE2s instead of SHA1 in extraction lib/crypto: blake2s: include as built-in random: fix data race on crng init time random: fix data race on crng_node_pool irq: remove unused flags argument from __handle_irq_event_percpu() random: remove unused irq_flags argument from add_interrupt_randomness() random: document add_hwgenerator_randomness() with other input functions MAINTAINERS: add git tree for random.c
2022-01-10Merge tag 'seccomp-v5.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The core seccomp code hasn't changed for this cycle, but the selftests were improved while helping to debug the recent signal handling refactoring work Eric did. Summary: - Improve seccomp selftests in support of signal handler refactoring (Kees Cook)" * tag 'seccomp-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Report event mismatches more clearly selftests/seccomp: Stop USER_NOTIF test if kcmp() fails
2022-01-10Merge tag 'pstore-v5.17-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore update from Kees Cook: - Add boot param for early ftrace recording in pstore (Uwe Kleine-König) * tag 'pstore-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ftrace: Allow immediate recording
2022-01-10Merge tag 'edac_updates_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:
- Add support for version 3 of the Synopsys DDR controller to synopsys_edac
- Add support for DDR5 and new models 0x10-0x1f and 0x50-0x5f of AMD family 0x19 CPUs to amd64_edac
- The usual set of fixes and cleanups

* tag 'edac_updates_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Add support for family 19h, models 50h-5fh
  EDAC/sb_edac: Remove redundant initialization of variable rc
  RAS/CEC: Remove a repeated 'an' in a comment
  EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
  EDAC: Add RDDR5 and LRDDR5 memory types
  EDAC/sifive: Fix non-kernel-doc comment
  dt-bindings: memory: Add entry for version 3.80a
  EDAC/synopsys: Enable the driver on Intel's N5X platform
  EDAC/synopsys: Add support for version 3 of the Synopsys EDAC DDR
  EDAC/synopsys: Use the quirk for version instead of ddr version
2022-01-10Merge tag 'ras_core_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: "A relatively big amount of movements in RAS-land this time around: - First part of a series to move the AMD address translation code from arch/x86/ to amd64_edac as that is its only user anyway - Some MCE error injection improvements to the AMD side - Reorganization of the #MC handler code and the facilities it calls to make it noinstr-safe - Add support for new AMD MCA bank types and non-uniform banks layout - The usual set of cleanups and fixes" * tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/mce: Reduce number of machine checks taken during recovery x86/mce/inject: Avoid out-of-bounds write when setting flags x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types x86/mce: Check regs before accessing it x86/mce: Mark mce_start() noinstr x86/mce: Mark mce_timed_out() noinstr x86/mce: Move the tainting outside of the noinstr region x86/mce: Mark mce_read_aux() noinstr x86/mce: Mark mce_end() noinstr x86/mce: Mark mce_panic() noinstr x86/mce: Prevent severity computation from being instrumented x86/mce: Allow instrumentation during task work queueing x86/mce: Remove noinstr annotation from mce_setup() x86/mce: Use mce_rdmsrl() in severity checking code x86/mce: Remove function-local cpus variables x86/mce: Do not use memset to clear the banks bitmaps x86/mce/inject: Set the valid bit in MCA_STATUS before error injection x86/mce/inject: Check if a bank is populated before injecting x86/mce: Get rid of cpu_missing ...
2022-01-10Merge tag 'core_entry_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull thread_info flag accessor helper updates from Borislav Petkov:
 "Add a set of thread_info.flags accessors which snapshot it before accessing it in order to prevent any potential data races, and convert all users to those new accessors"

* tag 'core_entry_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  powerpc: Snapshot thread flags
  powerpc: Avoid discarding flags in system_call_exception()
  openrisc: Snapshot thread flags
  microblaze: Snapshot thread flags
  arm64: Snapshot thread flags
  ARM: Snapshot thread flags
  alpha: Snapshot thread flags
  sched: Snapshot thread flags
  entry: Snapshot thread flags
  x86: Snapshot thread flags
  thread_info: Add helpers to snapshot thread flags
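The snapshot idea can be illustrated in userspace as follows. This is a sketch only: the volatile-cast macro stands in for the kernel's READ_ONCE(), and the structure, flag names, and helper are hypothetical rather than the accessors added by the series.

```c
/* Userspace stand-in for a single, non-torn read of a shared word
 * (an assumption for this sketch, not the kernel macro itself). */
#define READ_ONCE_UL(x) (*(const volatile unsigned long *)&(x))

/* Hypothetical flag bits. */
#define TIF_SIGPENDING_SKETCH    (1UL << 0)
#define TIF_NEED_RESCHED_SKETCH  (1UL << 1)

struct thread_info_sketch {
	unsigned long flags;	/* may be modified concurrently */
};

/* Take one snapshot, then test it as often as needed. Both checks see
 * the same value even if another context changes ti->flags meanwhile. */
static int pending_work(struct thread_info_sketch *ti)
{
	unsigned long snapshot = READ_ONCE_UL(ti->flags);

	return (snapshot & TIF_SIGPENDING_SKETCH) ||
	       (snapshot & TIF_NEED_RESCHED_SKETCH);
}
```

Without the snapshot, two separate plain loads of `ti->flags` could observe different values within one decision.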
2022-01-10Merge tag 'core_core_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull notifier fix from Borislav Petkov: "Return an error when a notifier callback has been registered already" * tag 'core_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: notifier: Return an error when a callback has already been registered
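A rough userspace sketch of what "return an error when already registered" looks like for a singly linked chain. The types and function are hypothetical, the choice of -EEXIST as the error code is an assumption of this sketch, and the real chain is also kept in priority order, which is omitted here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified notifier block. */
struct notifier_sketch {
	int (*call)(unsigned long event);
	struct notifier_sketch *next;
};

/* Append n to the chain unless it is already linked in.
 * Returns 0 on success, -EEXIST on a duplicate registration. */
static int chain_register(struct notifier_sketch **head,
			  struct notifier_sketch *n)
{
	struct notifier_sketch **p;

	for (p = head; *p; p = &(*p)->next)
		if (*p == n)
			return -EEXIST;	/* refuse to link the same block twice */

	n->next = NULL;
	*p = n;
	return 0;
}
```

Linking the same block twice would otherwise create a cycle in the list, so reporting the duplicate to the caller is the safer behaviour.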
2022-01-10perf annotate: Avoid TUI crash when navigating in the annotation of ↵Dario Petrillo
recursive functions

In 'perf report', entering a recursive function from inside of itself (either directly or indirectly through some other function) results in calling symbol__annotate2() multiple times, and freeing the whole disassembly when exiting from the innermost instance. The first issue causes the function's disassembly to be duplicated, and the latter a heap use-after-free (and crash) when trying to access the disassembly again.

I reproduced the bug on perf 5.11.22 (Ubuntu 20.04.3 LTS) and 5.16.rc8 with the following testcase (compile with gcc recursive.c -o recursive).

To reproduce:
- perf record ./recursive
- perf report
- enter fibonacci and annotate it
- move the cursor on one of the "callq fibonacci" instructions and press enter
  - at this point there will be two copies of the function in the disassembly
- go back by pressing q, and perf will crash

    #include <stdio.h>

    int fibonacci(int n)
    {
        if(n <= 2) return 1;
        return fibonacci(n-1) + fibonacci(n-2);
    }

    int main()
    {
        printf("%d\n", fibonacci(40));
    }

This patch addresses the issue by annotating a function and freeing the associated memory on exit only if no annotation is already present, so that a recursive function is only annotated on entry.

Signed-off-by: Dario Petrillo <dario.pk1@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Link: http://lore.kernel.org/lkml/20220109234441.325106-1-dario.pk1@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf powerpc: Update global/local variants for p_stage_cycAthira Rajeev
Update the arch_support_sort_key() function in powerpc to enable presenting local and global variants of sort key 'p_stage_cyc'. Update the "se_header" strings for these in arch_perf_header_entry() along with instruction latency. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20211203022038.48240-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf sort: Include global and local variants for p_stage_cyc sort keyAthira Rajeev
Sort key 'p_stage_cyc' is used to present the latency cycles spent in pipeline stages. perf has a local 'p_stage_cyc' sort key to display this info, but no global variant is available for it. The local variant shows latency in a single sample, whereas the global value is useful to present the total latency (sum of latencies) in the hist entry. It represents the latency number multiplied by the number of samples.

Add global ('p_stage_cyc') and local ('local_p_stage_cyc') variants for this sort key. Use 'local_p_stage_cyc' as the default option for "mem" sort mode. Also add this to the list of dynamic sort keys and make "dynamic_headers" and "arch_specific_sort_keys" static.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211203022038.48240-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
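The difference between the two variants can be sketched with a toy accumulator. The structure and helpers below are hypothetical and are not perf's own hist entry code; the sketch assumes the local value is the per-sample figure obtained by dividing the accumulated total by the sample count.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-entry statistics. */
struct hist_entry_sketch {
	uint64_t nr_samples;	/* samples merged into this entry */
	uint64_t total_cyc;	/* sum of p_stage_cyc over those samples */
};

static void add_sample(struct hist_entry_sketch *he, uint64_t cyc)
{
	he->nr_samples++;
	he->total_cyc += cyc;
}

/* Global variant: total latency accumulated in the entry. */
static uint64_t global_p_stage_cyc(const struct hist_entry_sketch *he)
{
	return he->total_cyc;
}

/* Local variant: latency of a single (average) sample. */
static uint64_t local_p_stage_cyc(const struct hist_entry_sketch *he)
{
	return he->nr_samples ? he->total_cyc / he->nr_samples : 0;
}
```

With three samples of 10, 20 and 30 cycles, the global value is 60 and the local value is 20.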
2022-01-10Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10ext4: don't use the orphan list when migrating an inodeTheodore Ts'o
We probably want to remove the indirect block to extents migration feature after a deprecation window, but until then, let's fix a potential data loss problem caused by the fact that we put the tmp_inode on the orphan list. In the unlikely case where we crash and do a journal recovery, the data blocks belonging to the inode being migrated are also represented in the tmp_inode on the orphan list --- and so its data blocks will get marked unallocated, and available for reuse.

Instead, stop putting the tmp_inode on the orphan list. So in the case where we crash while migrating the inode, we'll leak an inode, which is not a disaster. It will be easily fixed the next time we run fsck, and it's better than potentially having blocks getting claimed by two different files, and losing data as a result.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
2022-01-10ext4: use BUG_ON instead of if condition followed by BUGxu xin
BUG_ON would be better. This issue was detected with the help of Coccinelle. Reported-by: Zeal robot <zealci@zte.com.cn> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Link: https://lore.kernel.org/r/20211228073252.580296-1-xu.xin16@zte.com.cn Signed-off-by: Theodore Ts'o <tytso@mit.edu>
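The transformation Coccinelle flags here can be illustrated in userspace. The macros below are stand-ins that abort the process, not the kernel's definitions, and the two functions are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel macros (assumption of this sketch). */
#define BUG() do { \
	fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
	abort(); \
} while (0)
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Before: an if statement whose only job is to call BUG(). */
static int next_depth_before(int depth)
{
	if (depth < 0)
		BUG();
	return depth + 1;
}

/* After: the same check folded into a single BUG_ON(). */
static int next_depth_after(int depth)
{
	BUG_ON(depth < 0);
	return depth + 1;
}
```

Both functions behave identically; the second form is simply shorter and states the invariant on one line.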
2022-01-10ext4: fix a copy and paste typoDan Carpenter
This was obviously supposed to be an ext4 struct, not xfs. GCC doesn't care either way so it doesn't affect the build or runtime. Fixes: cebe85d570cf ("ext4: switch to the new mount api") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20211215114309.GB14552@kili Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: set csum seed in tmp inode while migrating to extentsLuís Henriques
When migrating to extents, the temporary inode will have its own checksum seed. This means that, when swapping the inodes data, the inode checksums will be incorrect. This can be fixed by recalculating the extents checksums again, or simply by copying the seed into the temporary inode.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Link: https://lore.kernel.org/r/20211214175058.19511-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-01-10ext4: remove unnecessary 'offset' assignmentluo penghao
Although it is in the loop, offset is reassigned at the beginning of the while loop. And after the loop, the value will not be used The clang_analyzer complains as follows: fs/ext4/dir.c:306:3 warning: Value stored to 'offset' is never read Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211208075307.404703-1-luo.penghao@zte.com.cn Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: remove redundant o_start statementluo penghao
The if will goto out of the loop, and until the end of the function execution, o_start will not be used again. The clang_analyzer complains as follows: fs/ext4/move_extent.c:635:5 warning: Value stored to 'o_start' is never read Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211208075157.404535-1-luo.penghao@zte.com.cn Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: drop an always true checkAdam Borowski
EXT_FIRST_INDEX(ptr) is ptr+12, which can't possibly be null; gcc-12 warns about this. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20211115172020.57853-1-kilobyte@angband.pl Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: remove unused assignmentsluo penghao
The eh assignment in these two places is meaningless, because the function will goto to merge, which will not use eh. The clang_analyzer complains as follows: fs/ext4/extents.c:1988:4 warning: fs/ext4/extents.c:2016:4 warning: Value stored to 'eh' is never read Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211104064007.2919-1-luo.penghao@zte.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: remove redundant statementluo penghao
The local variable assignment at the end of the function is meaningless. The clang_analyzer complains as follows: fs/ext4/fast_commit.c:779:2 warning: Value stored to 'dst' is never read Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211104063406.2747-1-luo.penghao@zte.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: remove useless resetting io_end_size in mpage_process_page()Nghia Le
The command "make clang-analyzer" detects dead stores in mpage_process_page() function. Do not reset io_end_size to 0 in the current paths, as the function exits on those paths without further using io_end_size. Signed-off-by: Nghia Le <nghialm78@gmail.com> Link: https://lore.kernel.org/r/20211025221803.3326-1-nghialm78@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: allow to change s_last_trim_minblks via sysfsLukas Czerner
Ext4 has an optimization mechanism for batched discard (FITRIM) that should help speed up subsequent calls of FITRIM ioctl by skipping the groups that were previously trimmed. However, because FITRIM allows setting the minimum size of an extent to trim, ext4 stores the last minimum extent size and only avoids trimming the group if it was previously trimmed with minimum extent size equal to, or smaller than the current call.

There is currently no way to bypass the optimization without umount/mount cycle. This becomes a problem when the file system is live migrated to a different storage, because the optimization will prevent possibly useful discard calls to the storage.

Fix it by exporting the s_last_trim_minblks via sysfs interface which will allow us to set the minimum size to the number of blocks larger than subsequent FITRIM call, effectively bypassing the optimization.

By setting the s_last_trim_minblks to ULONG_MAX the optimization will be effectively cleared regardless of the previous state, or file system configuration.

For example:
getconf ULONG_MAX > /sys/fs/ext4/dm-1/last_trim_minblks

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Laurent GUERBY <laurent@guerby.net>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20211103145122.17338-2-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
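The skip rule described in this message can be sketched as a small predicate. The structure and function names are hypothetical; only the comparison follows the description (skip when the group was already trimmed with a minimum extent size equal to or smaller than the current request).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical per-group state. */
struct group_sketch {
	bool was_trimmed;	/* group trimmed since its last free */
};

/* A group may be skipped only if it was already trimmed with a minimum
 * extent size that is equal to or smaller than the current request. */
static bool can_skip_group(const struct group_sketch *grp,
			   unsigned long last_trim_minblks,
			   unsigned long minblks)
{
	return grp->was_trimmed && last_trim_minblks <= minblks;
}
```

Writing ULONG_MAX into last_trim_minblks makes the comparison fail for any ordinary request, so every group is trimmed again on the next FITRIM call.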
2022-01-10ext4: change s_last_trim_minblks type to unsigned longLukas Czerner
There is no good reason for the s_last_trim_minblks to be atomic. There is no data integrity needed and there is no real danger in setting and reading it in a racy manner. Change it to be unsigned long, the same type as s_clusters_per_group which is the maximum that's allowed. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Suggested-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20211103145122.17338-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: implement support for get/set fs labelLukas Czerner
Implement support for FS_IOC_GETFSLABEL and FS_IOC_SETFSLABEL ioctls for online reading and setting of file system label.

ext4_ioctl_getlabel() is simple, just get the label from the primary superblock. This might not be the first sb on the file system if 'sb=' mount option is used.

In ext4_ioctl_setlabel() we update what ext4 currently views as a primary superblock and then proceed to update backup superblocks. There are two caveats:
- the primary superblock might not be the first superblock and so it might not be the one used by userspace tools if read directly off the disk.
- because the primary superblock might not be the first superblock we potentially have to update it as part of backup superblock update. However the first sb location is a bit more complicated than the rest so we have to account for that.

The superblock modification is created generic enough so the infrastructure can be used for other potential superblock modification operations, such as changing UUID.

Tested with generic/492 with various configurations. I also checked the behavior with 'sb=' mount options, including very large file systems with and without sparse_super/sparse_super2.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20211213135618.43303-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
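From userspace, reading the label through this ioctl might look like the sketch below. It assumes a Linux system whose <linux/fs.h> provides FS_IOC_GETFSLABEL and FSLABEL_MAX; the helper name and the idea of passing a mount point path are illustrative choices, not part of the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: read the label of the file system containing
 * `path` into `label`. Returns 0 on success, -1 on failure with errno
 * set (for example ENOTTY if the file system lacks the ioctl). */
static int read_fs_label(const char *path, char label[FSLABEL_MAX])
{
	int fd = open(path, O_RDONLY);
	int ret, saved_errno;

	if (fd < 0)
		return -1;

	memset(label, 0, FSLABEL_MAX);
	ret = ioctl(fd, FS_IOC_GETFSLABEL, label);
	saved_errno = errno;	/* close() may overwrite errno */
	close(fd);
	errno = saved_errno;

	return ret < 0 ? -1 : 0;
}
```

A caller would declare `char label[FSLABEL_MAX];`, pass the mount point of an ext4 file system, and print the buffer on success.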
2022-01-10ext4: only set EXT4_MOUNT_QUOTA when journalled quota file is specifiedLukas Czerner
Only set EXT4_MOUNT_QUOTA when journalled quota file is specified, otherwise simply disabling specific quota type (usrjquota=) will also set the EXT4_MOUNT_QUOTA super block option. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Fixes: e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()") Link: https://lore.kernel.org/r/20220104143518.134465-2-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: don't use kfree() on rcu protected pointer sbi->s_qf_namesLukas Czerner
During the ext4 mount API rework, commit e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()") introduced a bug where we would kfree(sbi->s_qf_names[i]) before assigning the new quota name in ext4_apply_quota_options().

This is wrong because we're using kfree() on an RCU pointer that could be simultaneously accessed from ext4_show_quota_options() during remount. Fix it by using rcu_replace_pointer() to replace the old qname with the new one and then kfree_rcu() the old quota name.

Also use get_qf_name() instead of sbi->s_qf_names in strcmp() to silence the sparse warning.

Fixes: e6e268cb6822 ("ext4: move quota configuration out of handle_mount_opt()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220104143518.134465-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: avoid trim error on fs with small groupsJan Kara
A user reported FITRIM ioctl failing for him on ext4 on some devices without apparent reason. After some debugging we've found out that these devices (being LVM volumes) report rather large discard granularity of 42MB and the filesystem had 1k blocksize and thus group size of 8MB. Because ext4 FITRIM implementation puts discard granularity into minlen, ext4_trim_fs() declared the trim request as invalid. However just silently doing nothing seems to be a more appropriate reaction to such combination of parameters since user did not specify anything wrong. CC: Lukas Czerner <lczerner@redhat.com> Fixes: 5c2ed62fd447 ("ext4: Adjust minlen with discard_granularity in the FITRIM ioctl") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211112152202.26614-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
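The arithmetic behind the report, and the changed reaction, can be sketched as follows. The function is hypothetical and works in units of file system blocks: with a 1k block size, a 42MB discard granularity is 43008 blocks while an 8MB group is 8192 blocks, so no extent in any group can ever satisfy the minimum length.

```c
/* Hypothetical helper. Returns 1 if trimming should proceed and 0 if
 * there is simply nothing to do. Before the fix described above, the
 * oversized-minlen case was rejected as an invalid request instead. */
static int trim_should_proceed(unsigned long minlen_blocks,
			       unsigned long blocks_per_group)
{
	if (minlen_blocks > blocks_per_group)
		return 0;	/* no extent can be that long: quietly do nothing */
	return 1;
}
```

Here `trim_should_proceed(43008, 8192)` reports "nothing to do", which matches the behaviour the message argues for, since the user did not specify anything wrong.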
2022-01-10ext4: fix an use-after-free issue about data=journal writeback modeZhang Yi
Our syzkaller reported a use-after-free issue when accessing the freed buffer_head on the writeback page in __ext4_journalled_writepage(). The problem is that if a truncate races with the data=journalled writeback procedure, the writeback length can become zero and bget_one() refuses to take the buffer_head's refcount; the truncate procedure then releases the buffer once we drop the page lock, and finally the last ext4_walk_page_buffers() triggers the use-after-free.

sync                               truncate
ext4_sync_file()
 file_write_and_wait_range()
                                   ext4_setattr(0)
                                    inode->i_size = 0
  ext4_writepage()
   len = 0
   __ext4_journalled_writepage()
    page_bufs = page_buffers(page)
    ext4_walk_page_buffers(bget_one) <- does not get refcount
                                    do_invalidatepage()
                                      free_buffer_head()
    ext4_walk_page_buffers(page_bufs) <- trigger use-after-free

After commit bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()"), we have already handled the racing case, so bget_one() and bput_one() are not needed. So this patch simply removes these hunks and rechecks the i_size to make it safe.

Fixes: bdf96838aea6 ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: fix null-ptr-deref in '__ext4_journal_ensure_credits'Ye Bin
We got issue as follows when run syzkaller test: [ 1901.130043] EXT4-fs error (device vda): ext4_remount:5624: comm syz-executor.5: Abort forced by user [ 1901.130901] Aborting journal on device vda-8. [ 1901.131437] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.16: Detected aborted journal [ 1901.131566] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.11: Detected aborted journal [ 1901.132586] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.18: Detected aborted journal [ 1901.132751] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.9: Detected aborted journal [ 1901.136149] EXT4-fs error (device vda) in ext4_reserve_inode_write:6035: Journal has aborted [ 1901.136837] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-fuzzer: Detected aborted journal [ 1901.136915] ================================================================== [ 1901.138175] BUG: KASAN: null-ptr-deref in __ext4_journal_ensure_credits+0x74/0x140 [ext4] [ 1901.138343] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.13: Detected aborted journal [ 1901.138398] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.1: Detected aborted journal [ 1901.138808] Read of size 8 at addr 0000000000000000 by task syz-executor.17/968 [ 1901.138817] [ 1901.138852] EXT4-fs error (device vda): ext4_journal_check_start:61: comm syz-executor.30: Detected aborted journal [ 1901.144779] CPU: 1 PID: 968 Comm: syz-executor.17 Not tainted 4.19.90-vhulk2111.1.0.h893.eulerosv2r10.aarch64+ #1 [ 1901.146479] Hardware name: linux,dummy-virt (DT) [ 1901.147317] Call trace: [ 1901.147552] dump_backtrace+0x0/0x2d8 [ 1901.147898] show_stack+0x28/0x38 [ 1901.148215] dump_stack+0xec/0x15c [ 1901.148746] kasan_report+0x108/0x338 [ 1901.149207] __asan_load8+0x58/0xb0 [ 1901.149753] __ext4_journal_ensure_credits+0x74/0x140 [ext4] [ 1901.150579] 
ext4_xattr_delete_inode+0xe4/0x700 [ext4] [ 1901.151316] ext4_evict_inode+0x524/0xba8 [ext4] [ 1901.151985] evict+0x1a4/0x378 [ 1901.152353] iput+0x310/0x428 [ 1901.152733] do_unlinkat+0x260/0x428 [ 1901.153056] __arm64_sys_unlinkat+0x6c/0xc0 [ 1901.153455] el0_svc_common+0xc8/0x320 [ 1901.153799] el0_svc_handler+0xf8/0x160 [ 1901.154265] el0_svc+0x10/0x218 [ 1901.154682] ==================================================================

This issue may happen like this:

Process1                                    Process2
ext4_evict_inode
  ext4_journal_start
   ext4_truncate
     ext4_ind_truncate
       ext4_free_branches
         ext4_ind_truncate_ensure_credits
           ext4_journal_ensure_credits_fn
             ext4_journal_restart
               handle->h_transaction = NULL;
                                            mount -o remount,abort /mnt
                                            -> trigger JBD abort
               start_this_handle -> will return failed
  ext4_xattr_delete_inode
    ext4_journal_ensure_credits
      ext4_journal_ensure_credits_fn
        __ext4_journal_ensure_credits
          jbd2_handle_buffer_credits
            journal = handle->h_transaction->t_journal; ->null-ptr-deref

Now, the indirect truncate process does not handle the error. To solve this issue, simply adding a check for an aborted handle in '__ext4_journal_ensure_credits' may be enough, and I also think this is necessary.

Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20211224100341.3299128-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
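The kind of guard the message proposes can be sketched in userspace with toy types. Everything below is hypothetical: the structures only mimic the handle/transaction/journal chain from the trace, and returning -EROFS for an aborted handle is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the journal objects named in the trace. */
struct journal_sketch { int id; };
struct transaction_sketch { struct journal_sketch *t_journal; };
struct handle_sketch {
	struct transaction_sketch *h_transaction;  /* NULL after a failed restart */
	int h_aborted;
};

static int handle_is_aborted(const struct handle_sketch *h)
{
	return h->h_aborted || !h->h_transaction;
}

/* Check the handle before following h_transaction, so an aborted handle
 * produces an error return rather than a NULL dereference. */
static int ensure_credits_sketch(struct handle_sketch *h,
				 struct journal_sketch **journal)
{
	if (handle_is_aborted(h))
		return -EROFS;

	*journal = h->h_transaction->t_journal;
	return 0;
}
```

The check costs one comparison on the normal path and turns the crash in the trace into an error the caller can propagate.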
2022-01-10ext4: initialize err_blk before calling __ext4_get_inode_locHarshad Shirwadkar
It is not guaranteed that __ext4_get_inode_loc will definitely set err_blk pointer when it returns EIO. To avoid using uninitialized variables, let's first set err_blk to 0. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211201163421.2631661-1-harshads@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
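The pattern being fixed can be shown with a toy callee that sometimes returns -EIO without writing its output parameter. All names here are hypothetical; the point is only that the caller initializes the variable before the call.

```c
#include <errno.h>

typedef unsigned long long blk_sketch_t;

/* Hypothetical callee: always fails with -EIO, but only the late
 * failure path reports which block was involved. */
static int get_inode_loc_sketch(int fail_early, blk_sketch_t *err_blk)
{
	if (fail_early)
		return -EIO;	/* *err_blk is left untouched on this path */

	*err_blk = 1234;
	return -EIO;
}

/* Caller: initialize err_blk first, so the early-failure path reports
 * 0 rather than whatever happened to be on the stack. */
static blk_sketch_t report_error_block(int fail_early)
{
	blk_sketch_t err_blk = 0;

	if (get_inode_loc_sketch(fail_early, &err_blk) == -EIO)
		return err_blk;
	return 0;
}
```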
2022-01-10ext4: fix a possible ABBA deadlock due to busy PAChunguang Xu
On an older kernel (3.10) we found that, when disk space is insufficient, the system may hit an ABBA deadlock. The problem still seems to exist in the latest kernel, so try to fix it here. The deadlock arises as follows: task A holds the PA and waits for the jbd2 transaction to finish, the jbd2 transaction waits for the completion of task B's IO (plug_list), and task B waits for task A to release the PA so that it can finish the discard, which indirectly forms an ABBA deadlock. The related calltrace is as follows:

Task A
vfs_write
ext4_mb_new_blocks()
ext4_mb_mark_diskspace_used()       JBD2
jbd2_journal_get_write_access()  -> jbd2_journal_commit_transaction()
->schedule()                        filemap_fdatawait()
|                                              |
| Task B                                       |
| do_unlinkat()                                |
| ext4_evict_inode()                           |
| jbd2_journal_begin_ordered_truncate()        |
| filemap_fdatawrite_range()                   |
| ext4_mb_new_blocks()                         |
-ext4_mb_discard_group_preallocations() <-----

Fix this by removing the internal retry on a busy PA from ext4_mb_discard_group_preallocations() and doing a limited number of retries inside ext4_mb_discard_preallocations() instead. This circumvents the problem above and also has some advantages:

1. Since the PA is in a busy state, if other groups have free PAs, keeping the current PA may help to reduce fragmentation.
2. We continue to traverse forward instead of waiting for the current group's PA to be released. In most scenarios, the PA discard time can be reduced.

However, when free space is low and only a few groups have space, the repeated traversal of the groups may increase CPU overhead. On balance, I feel that the overall benefit outweighs the cost.

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/1637630277-23496-1-git-send-email-brookxu.cn@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-01-10ext4: replace snprintf in show functions with sysfs_emitQing Wang
coccicheck complains about the use of snprintf() in sysfs show functions. Fix the coccicheck warning: WARNING: use scnprintf or sprintf. Using sysfs_emit instead of scnprintf or sprintf makes more sense. Signed-off-by: Qing Wang <wangqing@vivo.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/1634095731-4528-1-git-send-email-wangqing@vivo.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: make sure to reset inode lockdep class when quota enabling failsJan Kara
When we succeed in enabling some quota type but fail to enable another one with quota feature, we correctly disable all enabled quota types. However we forget to reset i_data_sem lockdep class. When the inode gets freed and reused, it will inherit this lockdep class (i_data_sem is initialized only when a slab is created) and thus eventually lockdep barfs about possible deadlocks. Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: make sure quota gets properly shutdown on errorJan Kara
When we hit an error while enabling quotas and setting inode flags, we do not properly shut down the quota subsystem despite returning an error from the Q_QUOTAON quotactl. This can lead to some odd situations, like the kernel using a quota file while it is still writeable for userspace. Make sure we properly clean up the quota subsystem in case of error. Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-01-10ext4: Fix BUG_ON in ext4_bread when write quota dataYe Bin
We got the following issue when running syzkaller:
[ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user
[ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only
[ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0'
[ 167.983601] ------------[ cut here ]------------
[ 167.984245] kernel BUG at fs/ext4/inode.c:847!
[ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123
[ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504
[ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244
[ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282
[ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000
[ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66
[ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09
[ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8
[ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001
[ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000
[ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0
[ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 168.001899] Call Trace:
[ 168.002235] <TASK>
[ 168.007167] ext4_bread+0xd/0x53
[ 168.007612] ext4_quota_write+0x20c/0x5c0
[ 168.010457] write_blk+0x100/0x220
[ 168.010944] remove_free_dqentry+0x1c6/0x440
[ 168.011525] free_dqentry.isra.0+0x565/0x830
[ 168.012133] remove_tree+0x318/0x6d0
[ 168.014744] remove_tree+0x1eb/0x6d0
[ 168.017346] remove_tree+0x1eb/0x6d0
[ 168.019969] remove_tree+0x1eb/0x6d0
[ 168.022128] qtree_release_dquot+0x291/0x340
[ 168.023297] v2_release_dquot+0xce/0x120
[ 168.023847] dquot_release+0x197/0x3e0
[ 168.024358] ext4_release_dquot+0x22a/0x2d0
[ 168.024932] dqput.part.0+0x1c9/0x900
[ 168.025430] __dquot_drop+0x120/0x190
[ 168.025942] ext4_clear_inode+0x86/0x220
[ 168.026472] ext4_evict_inode+0x9e8/0xa22
[ 168.028200] evict+0x29e/0x4f0
[ 168.028625] dispose_list+0x102/0x1f0
[ 168.029148] evict_inodes+0x2c1/0x3e0
[ 168.030188] generic_shutdown_super+0xa4/0x3b0
[ 168.030817] kill_block_super+0x95/0xd0
[ 168.031360] deactivate_locked_super+0x85/0xd0
[ 168.031977] cleanup_mnt+0x2bc/0x480
[ 168.033062] task_work_run+0xd1/0x170
[ 168.033565] do_exit+0xa4f/0x2b50
[ 168.037155] do_group_exit+0xef/0x2d0
[ 168.037666] __x64_sys_exit_group+0x3a/0x50
[ 168.038237] do_syscall_64+0x3b/0x90
[ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae

To reproduce this problem, the following conditions need to be met:
1. Ext4 filesystem with no journal;
2. Filesystem image with incorrect quota data;
3. Filesystem abort forced by user;
4. umount of the filesystem.

As in ext4_quota_write:
...
	if (EXT4_SB(sb)->s_journal && !handle) {
		ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)"
			" cancelled because transaction is not started",
			(unsigned long long)off, (unsigned long long)len);
		return -EIO;
	}
...

We only check whether handle is NULL when the filesystem has a journal. The handle needs to be checked for NULL even when the filesystem has no journal.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
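The difference between the two checks can be sketched in user space. This is a hedged illustration only: struct sb_info and both function names are hypothetical, and each function models just the guard at the top of the quota write path, not the write itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-superblock info; not the ext4 layout. */
struct sb_info { void *s_journal; };

/* Guard as quoted in the message: NULL handles pass when there is no journal. */
static int quota_write_check_old(const struct sb_info *sbi, const void *handle)
{
    if (sbi->s_journal && !handle)
        return -EIO;
    return 0;
}

/* Guard as the message proposes: a NULL handle is rejected in every case. */
static int quota_write_check_new(const struct sb_info *sbi, const void *handle)
{
    (void)sbi; /* journal presence no longer matters for this check */
    if (!handle)
        return -EIO;
    return 0;
}
```

In the no-journal case the old guard returns 0 for a NULL handle, so the write continues into the block-read path where the assertion fires. The new guard returns -EIO first.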
2022-01-10ext4: destroy ext4_fc_dentry_cachep kmemcache on module removalSebastian Andrzej Siewior
The kmemcache for ext4_fc_dentry_cachep remains registered after module removal. Destroy ext4_fc_dentry_cachep kmemcache on module removal. Fixes: aa75f4d3daaeb ("ext4: main fast-commit commit path") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211110134640.lyku5vklvdndw6uk@linutronix.de Link: https://lore.kernel.org/r/YbiK3JetFFl08bd7@linutronix.de Link: https://lore.kernel.org/r/20211223164436.2628390-1-bigeasy@linutronix.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2022-01-10ext4: fast commit may miss tracking unwritten range during ftruncateXin Yin
If FALLOC_FL_KEEP_SIZE is used to allocate an unwritten range at the end of a file, inode->i_size will not include the unwritten range. When ftruncate is then called with fast commit enabled, it will fail to track the unwritten range. Change to track the full range during ftruncate. Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211223032337.5198-3-yinxin.x@bytedance.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2022-01-10ext4: use ext4_ext_remove_space() for fast commit replay delete rangeXin Yin
For now, we use ext4_punch_hole() during the fast commit replay delete range procedure. But it is affected by inode->i_size, which may not be correct during the fast commit replay procedure. The following test will fail:

-create & write foo (len 1000K)
-falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K)
-create & fsync bar
-falloc FALLOC_FL_PUNCH_HOLE foo (range 300K-500K)
-fsync foo
-crash before a full commit

After the fast_commit replay procedure, the range 400K-500K will not be removed, because in this case inode->i_size is 0 when ext4_punch_hole() is called, so it just returns without doing anything.

Change to use ext4_ext_remove_space() instead of ext4_punch_hole() to remove the blocks of the inode directly.

Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211223032337.5198-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
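The i_size dependency can be modeled with a few lines of user-space C. This is a simplified sketch, not the ext4 code: bytes_removed() is a hypothetical helper that only computes how much of a requested range a size-clamped removal would cover, compared with a removal that ignores the recorded file size.

```c
/*
 * Hypothetical model: number of bytes a delete-range request removes.
 * With clamp_to_i_size set, the request is limited to [0, i_size), which
 * models a punch-hole style helper. Without it, the range is removed as
 * given, which models operating on the extents directly.
 */
static long long bytes_removed(long long i_size, long long off,
                               long long len, int clamp_to_i_size)
{
    if (!clamp_to_i_size)
        return len;
    if (off >= i_size)
        return 0;              /* nothing to do beyond i_size */
    if (off + len > i_size)
        len = i_size - off;    /* trim the tail past i_size */
    return len;
}
```

For the test above, a replay that sees an i_size of 0 and clamps the 300K-500K request removes nothing, while the unclamped variant covers the whole 200K range.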
2022-01-10ext4: fix fast commit may miss tracking range for FALLOC_FL_ZERO_RANGEXin Yin
When falloc is called with FALLOC_FL_ZERO_RANGE to convert an already initialized range to unwritten, fast commit will not track the range for this change if the range is aligned to the block size. Also track the range for unwritten ranges in ext4_map_blocks(). Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20211221022839.374606-1-yinxin.x@bytedance.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2022-01-10Merge tag 'x86_vdso_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vdso updates from Borislav Petkov: "Remove -nostdlib compiler flag now that the vDSO uses the linker instead of the compiler driver to link files" * tag 'x86_vdso_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/purgatory: Remove -nostdlib compiler flag x86/vdso: Remove -nostdlib compiler flag
2022-01-10Merge tag 'x86_build_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build fix from Borislav Petkov: "A fix for cross-compiling the compressed stub on arm64 with clang" * tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/compressed: Move CLANG_FLAGS to beginning of KBUILD_CFLAGS
2022-01-10Merge tag 'x86_cpu_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Enable the short string copies for CPUs which support them, in copy_user_enhanced_fast_string() - Avoid writing MSR_CSTAR on Intel due to TDX guests raising a #VE trap * tag 'x86_cpu_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/lib: Add fast-short-rep-movs check to copy_user_enhanced_fast_string() x86/cpu: Don't write CSTAR MSR on Intel CPUs
2022-01-10Merge tag 'x86_cleanups_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "The mandatory set of random minor cleanups all over tip" * tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/events/amd/iommu: Remove redundant assignment to variable shift x86/boot/string: Add missing function prototypes x86/fpu: Remove duplicate copy_fpstate_to_sigframe() prototype x86/uaccess: Move variable into switch case statement
2022-01-10Merge tag 'x86_misc_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: "The pile which we cannot find the proper topic for so we stick it in x86/misc: - Add support for decoding instructions which do MMIO accesses in order to use it in SEV and TDX guests - An include fix and reorg to allow for removing set_fs in UML later" * tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mtrr: Remove the mtrr_bp_init() stub x86/sev-es: Use insn_decode_mmio() for MMIO implementation x86/insn-eval: Introduce insn_decode_mmio() x86/insn-eval: Introduce insn_get_modrm_reg_ptr() x86/insn-eval: Handle insn_get_opcode() failure
2022-01-10Merge branch 'workqueue/for-5.16-fixes' into workqueue/for-5.17Tejun Heo
for-5.16-fixes contains two subtle race conditions which were introduced by scheduler side code cleanups. The branch didn't get pushed out, so merge into for-5.17.
2022-01-10Merge tag 'x86_mm_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Borislav Petkov: - Flush *all* mappings from the TLB after switching to the trampoline pagetable to prevent any stale entries' presence - Flush global mappings from the TLB, in addition to the CR3-write, after switching off of the trampoline_pgd during boot to clear the identity mappings - Prevent instrumentation issues resulting from the above changes * tag 'x86_mm_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Prevent early boot triple-faults with instrumentation x86/mm: Include spinlock_t definition in pgtable. x86/mm: Flush global TLB when switching to trampoline page-table x86/mm/64: Flush global TLB on boot and AP bringup x86/realmode: Add comment for Global bit usage in trampoline_pgd x86/mm: Add missing <asm/cpufeatures.h> dependency to <asm/page_64.h>
2022-01-10Merge tag 'x86_sgx_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SGX updates from Borislav Petkov: - Add support for handling hw errors in SGX pages: poisoning, recovering from poison memory and error injection into SGX pages - A bunch of changes to the SGX selftests to simplify and allow of SGX features testing without the need of a whole SGX software stack - Add a sysfs attribute which is supposed to show the amount of SGX memory in a NUMA node, similar to what /proc/meminfo is to normal memory - The usual bunch of fixes and cleanups too * tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/sgx: Fix NULL pointer dereference on non-SGX systems selftests/sgx: Fix corrupted cpuid macro invocation x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node x86/sgx: Fix minor documentation issues selftests/sgx: Add test for multiple TCS entry selftests/sgx: Enable multiple thread support selftests/sgx: Add page permission and exception test selftests/sgx: Rename test properties in preparation for more enclave tests selftests/sgx: Provide per-op parameter structs for the test enclave selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed selftests/sgx: Move setup_test_encl() to each TEST_F() selftests/sgx: Encpsulate the test enclave creation selftests/sgx: Dump segments and /proc/self/maps only on failure selftests/sgx: Create a heap for the test enclave selftests/sgx: Make data measurement for an enclave segment optional selftests/sgx: Assign source for each segment selftests/sgx: Fix a benign linker warning x86/sgx: Add check for SGX pages to ghes_do_memory_failure() x86/sgx: Add hook to error injection address validation x86/sgx: Hook arch_memory_failure() into mainline code ...
2022-01-10Merge tag 'x86_cache_for_v5.17_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control fixlet from Borislav Petkov: "A minor code cleanup removing a redundant assignment" * tag 'x86_cache_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Remove redundant assignment to variable chunks