summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-06-09KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptrMarc Zyngier
Sparse complains that __hyp_this_cpu_ptr() returns something that is flagged noderef and not in the correct address space (both being the result of the __percpu annotation). Pretend that __hyp_this_cpu_ptr() knows what it is doing by forcefully casting the pointer with __kernel __force. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09KVM: arm64: Handle PtrAuth traps earlyMarc Zyngier
The current way we deal with PtrAuth is a bit heavy handed: - We forcefully save the host's keys on each vcpu_load() - Handling the PtrAuth trap forces us to go all the way back to the exit handling code to just set the HCR bits Overall, this is pretty cumbersome. A better approach would be to handle it the same way we deal with the FPSIMD registers: - On vcpu_load() disable PtrAuth for the guest - On first use, save the host's keys, enable PtrAuth in the guest Crucially, this can happen as a fixup, which is done very early on exit. We can then reenter the guest immediately without leaving the hypervisor role. Another thing is that it simplify the rest of the host handling: exiting all the way to the host means that the only possible outcome for this trap is to inject an UNDEF. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09KVM: x86: Unexport x86_fpu_cache and make it staticSean Christopherson
Make x86_fpu_cache static now that FPU allocation and destruction is handled entirely by common x86 code. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200608180218.20946-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-09KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4KSean Christopherson
Explicitly set the VA width to 48 bits for the x86_64-only PXXV48_4K VM mode instead of asserting the guest VA width is 48 bits. The fact that KVM supports 5-level paging is irrelevant unless the selftests opt-in to 5-level paging by setting CR4.LA57 for the guest. The overzealous assert prevents running the selftests on a kernel with 5-level paging enabled. Incorporate LA57 into the assert instead of removing the assert entirely as a sanity check of KVM's CPUID output. Fixes: 567a9f1e9deb ("KVM: selftests: Introduce VM_MODE_PXXV48_4K") Reported-by: Sergio Perez Gonzalez <sergio.perez.gonzalez@intel.com> Cc: Adriana Cervantes Jimenez <adriana.cervantes.jimenez@intel.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200528021530.28091-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-09KVM: arm64: Save the host's PtrAuth keys in non-preemptible contextMarc Zyngier
When using the PtrAuth feature in a guest, we need to save the host's keys before allowing the guest to program them. For that, we dump them in a per-CPU data structure (the so called host context). But both call sites that do this are in preemptible context, which may end up in disaster should the vcpu thread get preempted before reentering the guest. Instead, save the keys eagerly on each vcpu_load(). This has an increased overhead, but is at least safe. Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-06-09x86_64: Fix jiffies ODR violationBob Haarman
'jiffies' and 'jiffies_64' are meant to alias (two different symbols that share the same address). Most architectures make the symbols alias to the same address via a linker script assignment in their arch/<arch>/kernel/vmlinux.lds.S: jiffies = jiffies_64; which is effectively a definition of jiffies. jiffies and jiffies_64 are both forward declared for all architectures in include/linux/jiffies.h. jiffies_64 is defined in kernel/time/timer.c. x86_64 was peculiar in that it wasn't doing the above linker script assignment, but rather was: 1. defining jiffies in arch/x86/kernel/time.c instead via the linker script. 2. overriding the symbol jiffies_64 from kernel/time/timer.c in arch/x86/kernel/vmlinux.lds.s via 'jiffies_64 = jiffies;'. As Fangrui notes: In LLD, symbol assignments in linker scripts override definitions in object files. GNU ld appears to have the same behavior. It would probably make sense for LLD to error "duplicate symbol" but GNU ld is unlikely to adopt for compatibility reasons. This results in an ODR violation (UB), which seems to have survived thus far. Where it becomes harmful is when; 1. -fno-semantic-interposition is used: As Fangrui notes: Clang after LLVM commit 5b22bcc2b70d ("[X86][ELF] Prefer to lower MC_GlobalAddress operands to .Lfoo$local") defaults to -fno-semantic-interposition similar semantics which help -fpic/-fPIC code avoid GOT/PLT when the referenced symbol is defined within the same translation unit. Unlike GCC -fno-semantic-interposition, Clang emits such relocations referencing local symbols for non-pic code as well. This causes references to jiffies to refer to '.Ljiffies$local' when jiffies is defined in the same translation unit. Likewise, references to jiffies_64 become references to '.Ljiffies_64$local' in translation units that define jiffies_64. Because these differ from the names used in the linker script, they will not be rewritten to alias one another. 2. Full LTO Full LTO effectively treats all source files as one translation unit, causing these local references to be produced everywhere. When the linker processes the linker script, there are no longer any references to jiffies_64' anywhere to replace with 'jiffies'. And thus '.Ljiffies$local' and '.Ljiffies_64$local' no longer alias at all. In the process of porting patches enabling Full LTO from arm64 to x86_64, spooky bugs have been observed where the kernel appeared to boot, but init doesn't get scheduled. Avoid the ODR violation by matching other architectures and define jiffies only by linker script. For -fno-semantic-interposition + Full LTO, there is no longer a global definition of jiffies for the compiler to produce a local symbol which the linker script won't ensure aliases to jiffies_64. Fixes: 40747ffa5aa8 ("asmlinkage: Make jiffies visible") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Alistair Delva <adelva@google.com> Debugged-by: Nick Desaulniers <ndesaulniers@google.com> Debugged-by: Sami Tolvanen <samitolvanen@google.com> Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Bob Haarman <inglorion@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # build+boot on Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linux/issues/852 Link: https://lkml.kernel.org/r/20200602193100.229287-1-inglorion@google.com
2020-06-09x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.Anthony Steinhauser
Currently, it is possible to enable indirect branch speculation even after it was force-disabled using the PR_SPEC_FORCE_DISABLE option. Moreover, the PR_GET_SPECULATION_CTRL command gives afterwards an incorrect result (force-disabled when it is in fact enabled). This also is inconsistent vs. STIBP and the documention which cleary states that PR_SPEC_FORCE_DISABLE cannot be undone. Fix this by actually enforcing force-disabled indirect branch speculation. PR_SPEC_ENABLE called after PR_SPEC_FORCE_DISABLE now fails with -EPERM as described in the documentation. Fixes: 9137bb27e60e ("x86/speculation: Add prctl() control for indirect branch speculation") Signed-off-by: Anthony Steinhauser <asteinhauser@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2020-06-09x86/speculation: Prevent rogue cross-process SSBD shutdownAnthony Steinhauser
On context switch the change of TIF_SSBD and TIF_SPEC_IB are evaluated to adjust the mitigations accordingly. This is optimized to avoid the expensive MSR write if not needed. This optimization is buggy and allows an attacker to shutdown the SSBD protection of a victim process. The update logic reads the cached base value for the speculation control MSR which has neither the SSBD nor the STIBP bit set. It then OR's the SSBD bit only when TIF_SSBD is different and requests the MSR update. That means if TIF_SSBD of the previous and next task are the same, then the base value is not updated, even if TIF_SSBD is set. The MSR write is not requested. Subsequently if the TIF_STIBP bit differs then the STIBP bit is updated in the base value and the MSR is written with a wrong SSBD value. This was introduced when the per task/process conditional STIPB switching was added on top of the existing SSBD switching. It is exploitable if the attacker creates a process which enforces SSBD and has the contrary value of STIBP than the victim process (i.e. if the victim process enforces STIBP, the attacker process must not enforce it; if the victim process does not enforce STIBP, the attacker process must enforce it) and schedule it on the same core as the victim process. If the victim runs after the attacker the victim becomes vulnerable to Spectre V4. To fix this, update the MSR value independent of the TIF_SSBD difference and dependent on the SSBD mitigation method available. This ensures that a subsequent STIPB initiated MSR write has the correct state of SSBD. [ tglx: Handle X86_FEATURE_VIRT_SSBD & X86_FEATURE_VIRT_SSBD correctly and massaged changelog ] Fixes: 5bfbe3ad5840 ("x86/speculation: Prepare for per task indirect branch speculation control") Signed-off-by: Anthony Steinhauser <asteinhauser@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2020-06-09x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.Anthony Steinhauser
When STIBP is unavailable or enhanced IBRS is available, Linux force-disables the IBPB mitigation of Spectre-BTB even when simultaneous multithreading is disabled. While attempts to enable IBPB using prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, ...) fail with EPERM, the seccomp syscall (or its prctl(PR_SET_SECCOMP, ...) equivalent) which are used e.g. by Chromium or OpenSSH succeed with no errors but the application remains silently vulnerable to cross-process Spectre v2 attacks (classical BTB poisoning). At the same time the SYSFS reporting (/sys/devices/system/cpu/vulnerabilities/spectre_v2) displays that IBPB is conditionally enabled when in fact it is unconditionally disabled. STIBP is useful only when SMT is enabled. When SMT is disabled and STIBP is unavailable, it makes no sense to force-disable also IBPB, because IBPB protects against cross-process Spectre-BTB attacks regardless of the SMT state. At the same time since missing STIBP was only observed on AMD CPUs, AMD does not recommend using STIBP, but recommends using IBPB, so disabling IBPB because of missing STIBP goes directly against AMD's advice: https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf Similarly, enhanced IBRS is designed to protect cross-core BTB poisoning and BTB-poisoning attacks from user space against kernel (and BTB-poisoning attacks from guest against hypervisor), it is not designed to prevent cross-process (or cross-VM) BTB poisoning between processes (or VMs) running on the same core. Therefore, even with enhanced IBRS it is necessary to flush the BTB during context-switches, so there is no reason to force disable IBPB when enhanced IBRS is available. Enable the prctl control of IBPB even when STIBP is unavailable or enhanced IBRS is available. Fixes: 7cc765a67d8e ("x86/speculation: Enable prctl mode for spectre_v2_user") Signed-off-by: Anthony Steinhauser <asteinhauser@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2020-06-09KVM: arm64: Stop save/restoring ACTLR_EL1James Morse
KVM sets HCR_EL2.TACR via HCR_GUEST_FLAGS. This means ACTLR* accesses from the guest are always trapped, and always return the value in the sys_regs array. The guest can't change the value of these registers, so we are save restoring the reset value, which came from the host. Stop save/restoring this register. Keep the storage for this register in sys_regs[] as this is how the value is exposed to user-space, removing it would break migration. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200529150656.7339-4-james.morse@arm.com
2020-06-09KVM: arm64: Add emulation for 32bit guests accessing ACTLR2James Morse
ACTLR_EL1 is a 64bit register while the 32bit ACTLR is obviously 32bit. For 32bit software, the extra bits are accessible via ACTLR2... which KVM doesn't emulate. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200529150656.7339-3-james.morse@arm.com
2020-06-09KVM: arm64: Stop writing aarch32's CSSELR into ACTLRJames Morse
aarch32 has pairs of registers to access the high and low parts of 64bit registers. KVM has a union of 64bit sys_regs[] and 32bit copro[]. The 32bit accessors read the high or low part of the 64bit sys_reg[] value through the union. Both sys_reg_descs[] and cp15_regs[] list access_csselr() as the accessor for CSSELR{,_EL1}. access_csselr() is only aware of the 64bit sys_regs[], and expects r->reg to be 'CSSELR_EL1' in the enum, index 2 of the 64bit array. cp15_regs[] uses the 32bit copro[] alias of sys_regs[]. Here CSSELR is c0_CSSELR which is the same location in sys_reg[]. r->reg is 'c0_CSSELR', index 4 in the 32bit array. access_csselr() uses the 32bit r->reg value to access the 64bit array, so reads and write the wrong value. sys_regs[4], is ACTLR_EL1, which is subsequently save/restored when we enter the guest. ACTLR_EL1 is supposed to be read-only for the guest. This register only affects execution at EL1, and the host's value is restored before we return to host EL1. Convert the 32bit register index back to the 64bit version. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200529150656.7339-2-james.morse@arm.com
2020-06-09exfat: Fix potential use after free in exfat_load_upcase_table()Dan Carpenter
This code calls brelse(bh) and then dereferences "bh" on the next line resulting in a possible use after free. The brelse() should just be moved down a line. Fixes: b676fdbcf4c8 ("exfat: standardize checksum calculation") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: fix range validation error in alloc and free clusterhyeongseok.kim
There is check error in range condition that can never be entered even with invalid input. Replace incorrent checking code with already existing valid checker. Signed-off-by: hyeongseok.kim <hyeongseok@gmail.com> Acked-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: fix incorrect update of stream entry in __exfat_truncate()Namjae Jeon
At truncate, there is a problem of incorrect updating in the file entry pointer instead of stream entry. This will cause the problem of overwriting the time field of the file entry to new_size. Fix it to update stream entry. Fixes: 98d917047e8b ("exfat: add file operations") Cc: stable@vger.kernel.org # v5.7 Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: fix memory leak in exfat_parse_param()Al Viro
butt3rflyh4ck reported memory leak found by syzkaller. A param->string held by exfat_mount_options. BUG: memory leak unreferenced object 0xffff88801972e090 (size 8): comm "syz-executor.2", pid 16298, jiffies 4295172466 (age 14.060s) hex dump (first 8 bytes): 6b 6f 69 38 2d 75 00 00 koi8-u.. backtrace: [<000000005bfe35d6>] kstrdup+0x36/0x70 mm/util.c:60 [<0000000018ed3277>] exfat_parse_param+0x160/0x5e0 fs/exfat/super.c:276 [<000000007680462b>] vfs_parse_fs_param+0x2b4/0x610 fs/fs_context.c:147 [<0000000097c027f2>] vfs_parse_fs_string+0xe6/0x150 fs/fs_context.c:191 [<00000000371bf78f>] generic_parse_monolithic+0x16f/0x1f0 fs/fs_context.c:231 [<000000005ce5eb1b>] do_new_mount fs/namespace.c:2812 [inline] [<000000005ce5eb1b>] do_mount+0x12bb/0x1b30 fs/namespace.c:3141 [<00000000b642040c>] __do_sys_mount fs/namespace.c:3350 [inline] [<00000000b642040c>] __se_sys_mount fs/namespace.c:3327 [inline] [<00000000b642040c>] __x64_sys_mount+0x18f/0x230 fs/namespace.c:3327 [<000000003b024e98>] do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 [<00000000ce2b698c>] entry_SYSCALL_64_after_hwframe+0x49/0xb3 exfat_free() should call exfat_free_iocharset(), to prevent a leak in case we fail after parsing iocharset= but before calling get_tree_bdev(). Additionally, there's no point copying param->string in exfat_parse_param() - just steal it, leaving NULL in param->string. That's independent from the leak or fix thereof - it's simply avoiding an extra copy. Fixes: 719c1e182916 ("exfat: add super block operations") Cc: stable@vger.kernel.org # v5.7 Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: remove unnecessary reassignment of p_uniname->name_lenNamjae Jeon
kbuild test robot reported : fs/exfat/nls.c:531:22: warning: Variable 'p_uniname->name_len' is reassigned a value before the old one has been used. The reassignment of p_uniname->name_len is not needed and remove it. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: standardize checksum calculationTetsuhiro Kohada
To clarify that it is a 16-bit checksum, the parts related to the 16-bit checksum are renamed and change type to u16. Furthermore, replace checksum calculation in exfat_load_upcase_table() with exfat_calc_checksum32(). Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: add boot region verificationTetsuhiro Kohada
Add Boot-Regions verification specified in exFAT specification. Note that the checksum type is strongly related to the raw structure, so the'u32 'type is used to clarify the number of bits. Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: separate the boot sector analysisTetsuhiro Kohada
Separate the boot sector analysis to read_boot_sector(). And add a check for the fs_name field. Furthermore, add a strict consistency check, because overlapping areas can cause serious corruption. Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: redefine PBR as boot_sectorTetsuhiro Kohada
Aggregate PBR related definitions and redefine as "boot_sector" to comply with the exFAT specification. And, rename variable names including 'pbr'. Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: optimize dir-cacheTetsuhiro Kohada
Optimize directory access based on exfat_entry_set_cache. - Hold bh instead of copied d-entry. - Modify bh->data directly instead of the copied d-entry. - Write back the retained bh instead of rescanning the d-entry-set. And - Remove unused cache related definitions. Signed-off-by: Tetsuhiro Kohada <kohada.tetsuhiro@dc.mitsubishielectric.co.jp> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: replace 'time_ms' with 'time_cs'Tetsuhiro Kohada
Replace time_ms with time_cs in the file directory entry structure and related functions. The unit of create_time_ms/modify_time_ms in File Directory Entry are not 'milli-second', but 'centi-second'. The exfat specification uses the term '10ms', but instead use 'cs' as in msdos_fs.h. Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: remove the assignment of 0 to bool variableJason Yan
There is no need to init 'sync' in exfat_set_vol_flags(). This also fixes the following coccicheck warning: fs/exfat/super.c:104:6-10: WARNING: Assignment of 0/1 to bool variable Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: Remove unused functions exfat_high_surrogate() and exfat_low_surrogate()Pali Rohár
After applying previous two patches, these functions are not used anymore. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: Simplify exfat_utf8_d_hash() for code points above U+FFFFPali Rohár
Function partial_name_hash() takes long type value into which can be stored one Unicode code point. Therefore conversion from UTF-32 to UTF-16 is not needed. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: Improve wording of EXFAT_DEFAULT_IOCHARSET config optionGeert Uytterhoeven
- Use consistent capitalization for "exFAT". - Fix grammar, - Split long sentence. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: Use a more common logging styleJoe Perches
Remove the direct use of KERN_<LEVEL> in functions by creating separate exfat_<level> macros. Miscellanea: o Remove several unnecessary terminating newlines in formats o Realign arguments and fit to 80 columns where appropriate Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-09exfat: Simplify exfat_utf8_d_cmp() for code points above U+FFFFPali Rohár
If two Unicode code points represented in UTF-16 are different then also their UTF-32 representation must be different. Therefore conversion from UTF-32 to UTF-16 is not needed. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-06-08cifs: Add get_security_type_str function to return sec type.Kenneth D'souza
This code is more organized and robust. Signed-off-by: Kenneth D'souza <kdsouza@redhat.com> Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2020-06-08iomap: Fix unsharing of an extent >2GB on a 32-bit machineMatthew Wilcox (Oracle)
Widen the type used for counting the number of bytes unshared. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-08xfs: Add the missed xfs_perag_put() for xfs_ifree_cluster()Chuhong Yuan
xfs_ifree_cluster() calls xfs_perag_get() at the beginning, but forgets to call xfs_perag_put() in one failed path. Add the missed function call to fix it. Fixes: ce92464c180b ("xfs: make xfs_trans_get_buf return an error code") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-06-08f2fs: attach IO flags to the missing casesJaegeuk Kim
This adds more IOs to attach flags. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08f2fs: add node_io_flag for bio flags likewise data_io_flagJaegeuk Kim
This patch adds another way to attach bio flags to node writes. Description: Give a way to attach REQ_META|FUA to node writes given temperature-based bits. Now the bits indicate: * REQ_META | REQ_FUA | * 5 | 4 | 3 | 2 | 1 | 0 | * Cold | Warm | Hot | Cold | Warm | Hot | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08f2fs: remove unused parameter of f2fs_put_rpages_mapping()Chao Yu
Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08f2fs: handle readonly filesystem in f2fs_ioc_shutdown()Chao Yu
If mountpoint is readonly, we should allow shutdowning filesystem successfully, this fixes issue found by generic/599 testcase of xfstest. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08f2fs: avoid utf8_strncasecmp() with unstable nameEric Biggers
If the dentry name passed to ->d_compare() fits in dentry::d_iname, then it may be concurrently modified by a rename. This can cause undefined behavior (possibly out-of-bounds memory accesses or crashes) in utf8_strncasecmp(), since fs/unicode/ isn't written to handle strings that may be concurrently modified. Fix this by first copying the filename to a stack buffer if needed. This way we get a stable snapshot of the filename. Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups") Cc: <stable@vger.kernel.org> # v5.4+ Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Daniel Rosenberg <drosen@google.com> Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08f2fs: don't return vmalloc() memory from f2fs_kmalloc()Eric Biggers
kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either kmalloc'ed or vmalloc'ed memory. But the f2fs wrappers, f2fs_kmalloc() and f2fs_kvmalloc(), both return both kinds of memory. It's redundant to have two functions that do the same thing, and also breaking the standard naming convention is causing bugs since people assume it's safe to kfree() memory allocated by f2fs_kmalloc(). See e.g. the various allocations in fs/f2fs/compress.c. Fix this by making f2fs_kmalloc() just use kmalloc(). And to avoid re-introducing the allocation failures that the vmalloc fallback was intended to fix, convert the largest allocations to use f2fs_kvmalloc(). Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-08selftests/net: in timestamping, strncpy needs to preserve null bytetannerlove
If user passed an interface option longer than 15 characters, then device.ifr_name and hwtstamp.ifr_name became non-null-terminated strings. The compiler warned about this: timestamping.c:353:2: warning: ‘strncpy’ specified bound 16 equals \ destination size [-Wstringop-truncation] 353 | strncpy(device.ifr_name, interface, sizeof(device.ifr_name)); Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08Merge tag 'rxrpc-fixes-20200605' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fix hang due to missing notification Here's a fix for AF_RXRPC. Occasionally calls hang because there are circumstances in which rxrpc generate a notification when a call is completed - primarily because initial packet transmission failed and the call was killed off and an error returned. But the AFS filesystem driver doesn't check this under all circumstances, expecting failure to be delivered by asynchronous notification. There are two patches: the first moves the problematic bits out-of-line and the second contains the fix. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08mptcp: bugfix for RM_ADDR option parsingGeliang Tang
In MPTCPOPT_RM_ADDR option parsing, the pointer "ptr" pointed to the "Subtype" octet, the pointer "ptr+1" pointed to the "Address ID" octet: +-------+-------+---------------+ |Subtype|(resvd)| Address ID | +-------+-------+---------------+ | | ptr ptr+1 We should set mp_opt->rm_id to the value of "ptr+1", not "ptr". This patch will fix this bug. Fixes: 3df523ab582c ("mptcp: Add ADD_ADDR handling") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08net-zerocopy: use vm_insert_pages() for tcp rcv zerocopyArjun Roy
Use vm_insert_pages() for tcp receive zerocopy. Spin lock cycles (as reported by perf) drop from a couple of percentage points to a fraction of a percent. This results in a roughly 6% increase in efficiency, measured roughly as zerocopy receive count divided by CPU utilization. The intention of this patchset is to reduce atomic ops for tcp zerocopy receives, which normally hits the same spinlock multiple times consecutively. [akpm@linux-foundation.org: suppress gcc-7.2.0 warning] Link: http://lkml.kernel.org/r/20200128025958.43490-3-arjunroy.kdev@gmail.com Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08net/tls(TLS_SW): Add selftest for 'chunked' sendfile testPooja Trivedi
This selftest tests for cases where sendfile's 'count' parameter is provided with a size greater than the intended file size. Motivation: When sendfile is provided with 'count' parameter value that is greater than the size of the file, kTLS example fails to send the file correctly. Last chunk of the file is not sent, and the data integrity is compromised. The reason is that the last chunk has MSG_MORE flag set because of which it gets added to pending records, but is not pushed. Note that if user space were to send SSL_shutdown control message, pending records would get flushed and the issue would not happen. So a shutdown control message following sendfile can mask the issue. Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com> Signed-off-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com> Signed-off-by: Josh Tway <josh.tway@stackpath.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08Merge tag 'mac80211-for-davem-2020-06-08' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a small update: * fix the deadlock on rfkill/wireless removal that a few people reported * fix an uninitialized variable * update wiki URLs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-08i2c: npcm7xx: Fix a couple of error codes in probeDan Carpenter
The code here is accidentally returning IS_ERR() which is 1 but it should be returning negative error codes with PTR_ERR(). Fixes: 56a1485b102e ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-06-08Merge tag 'rproc-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc Pull remoteproc updates from Bjorn Andersson: "This introduces device managed versions of functions used to register remoteproc devices, add support for remoteproc driver specific resource control, enables remoteproc drivers to specify ELF class and machine for coredumps. It integrates pm_runtime in the core for keeping resources active while the remote is booted and holds a wake source while recoverying a remote processor after a firmware crash. It refactors the remoteproc device's allocation path to simplify the logic, fix a few cleanup bugs and to not clone const strings onto the heap. Debugfs code is simplifies using the DEFINE_SHOW_ATTRIBUTE and a zero-length array is replaced with flexible-array. A new remoteproc driver for the JZ47xx VPU is introduced, the Qualcomm SM8250 gains support for audio, compute and sensor remoteprocs and the Qualcomm SC7180 modem support is cleaned up and improved. The Qualcomm glink subsystem-restart driver is merged into the main glink driver, the Qualcomm sysmon driver is extended to properly notify remote processors about all other remote processors' state transitions" * tag 'rproc-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: (43 commits) remoteproc: Fix an error code in devm_rproc_alloc() MAINTAINERS: Add myself as reviewer for Ingenic rproc driver remoteproc: ingenic: Added remoteproc driver remoteproc: Add support for runtime PM dt-bindings: Document JZ47xx VPU auxiliary processor remoteproc: wcss: Fix arguments passed to qcom_add_glink_subdev() remoteproc: Fix and restore the parenting hierarchy for vdev remoteproc: Fall back to using parent memory pool if no dedicated available remoteproc: Replace zero-length array with flexible-array remoteproc: wcss: add support for rpmsg communication remoteproc: core: Prevent system suspend during remoteproc recovery remoteproc: qcom_q6v5_mss: Remove unused q6v5_da_to_va function remoteproc: qcom_q6v5_mss: map/unmap mpss segments before/after use remoteproc: qcom_q6v5_mss: Drop accesses to MPSS PERPH register space dt-bindings: remoteproc: qcom: Replace halt-nav with spare-regs remoteproc: qcom: pas: Add SM8250 PAS remoteprocs dt-bindings: remoteproc: qcom: pas: Add SM8250 remoteprocs remoteproc: qcom_q6v5_mss: Extract mba/mpss from memory-region dt-bindings: remoteproc: qcom: Use memory-region to reference memory remoteproc: qcom: pas: Add SC7180 Modem support ...
2020-06-08Merge tag 'rpmsg-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc Pull rpmsg updates from Bjorn Andersson: "This replaces a zero-length array with flexible-array and fixes a typo in a comment in the rpmsg core" * tag 'rpmsg-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc: rpmsg: Replace zero-length array with flexible-array rpmsg: fix a comment typo for rpmsg_device_match()
2020-06-08Merge tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "The highlights are: - OSD/MDS latency and caps cache metrics infrastructure for the filesytem (Xiubo Li). Currently available through debugfs and will be periodically sent to the MDS in the future. - support for replica reads (balanced and localized reads) for rbd and the filesystem (myself). The default remains to always read from primary, users can opt-in with the new crush_location and read_from_replica options. Note that reading from replica is safe for general use only since Octopus. - support for RADOS allocation hint flags (myself). Currently used by rbd to propagate the compressible/incompressible hint given with the new compression_hint map option and ready for passing on more advanced hints, e.g. based on fadvise() from the filesystem. - support for efficient cross-quota-realm renames (Luis Henriques) - assorted cap handling improvements and cleanups, particularly untangling some of the locking (Jeff Layton)" * tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client: (29 commits) rbd: compression_hint option libceph: support for alloc hint flags libceph: read_from_replica option libceph: support for balanced and localized reads libceph: crush_location infrastructure libceph: decode CRUSH device/bucket types and names libceph: add non-asserting rbtree insertion helper ceph: skip checking caps when session reconnecting and releasing reqs ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock ceph: don't return -ESTALE if there's still an open file libceph, rbd: replace zero-length array with flexible-array ceph: allow rename operation under different quota realms ceph: normalize 'delta' parameter usage in check_quota_exceeded ceph: ceph_kick_flushing_caps needs the s_mutex ceph: request expedited service on session's last cap flush ceph: convert mdsc->cap_dirty to a per-session list ceph: reset i_requested_max_size if file write is not wanted ceph: throw a warning if we destroy session with mutex still locked ceph: fix potential race in ceph_check_caps ceph: document what protects i_dirty_item and i_flushing_item ...
2020-06-08io_wq: add per-wq work handler instead of per workPavel Begunkov
io_uring is the only user of io-wq, and now it uses only io-wq callback for all its requests, namely io_wq_submit_work(). Instead of storing work->runner callback in each instance of io_wq_work, keep it in io-wq itself. pros: - reduces io_wq_work size - more robust -- ->func won't be invalidated with mem{cpy,set}(req) - helps other work Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-08io_uring: don't arm a timeout through work.funcPavel Begunkov
Remove io_link_work_cb() -- the last custom work.func. Not the prettiest thing, but works. Instead of queueing a linked timeout in io_link_work_cb() mark a request with REQ_F_QUEUE_TIMEOUT and do enqueueing based on the flag in io_wq_submit_work(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>