summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-05-28sctp: check assoc before SCTP_ADDR_{MADE_PRIM, ADDED} eventJonas Falkevik
Make sure SCTP_ADDR_{MADE_PRIM,ADDED} are sent only for associations that have been established. These events are described in rfc6458#section-6.1 SCTP_PEER_ADDR_CHANGE: This tag indicates that an address that is part of an existing association has experienced a change of state (e.g., a failure or return to service of the reachability of an endpoint via a specific transport address). Signed-off-by: Jonas Falkevik <jonas.falkevik@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Just a few random driver fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics - add a second working PNP_ID for Lenovo T470s Input: applespi - replace zero-length array with flexible-array Input: axp20x-pek - always register interrupt handlers Input: lm8333 - update contact email Input: synaptics-rmi4 - fix error return code in rmi_driver_probe() Input: synaptics-rmi4 - really fix attn_data use-after-free Input: i8042 - add ThinkPad S230u to i8042 reset list Revert "Input: i8042 - add ThinkPad S230u to i8042 nomux list" Input: dlink-dir685-touchkeys - fix a typo in driver name Input: xpad - add custom init packet for Xbox One S controllers Input: evdev - call input_flush_device() on release(), not flush() Input: i8042 - add ThinkPad S230u to i8042 nomux list Input: usbtouchscreen - add support for BonXeon TP Input: cros_ec_keyb - use cros_ec_cmd_xfer_status helper Input: mms114 - fix handling of mms345l Input: elants_i2c - support palm detection
2020-05-28x86/ioperm: Prevent a memory leak when fork failsJay Lang
In the copy_process() routine called by _do_fork(), failure to allocate a PID (or further along in the function) will trigger an invocation to exit_thread(). This is done to clean up from an earlier call to copy_thread_tls(). Naturally, the child task is passed into exit_thread(), however during the process, io_bitmap_exit() nullifies the parent's io_bitmap rather than the child's. As copy_thread_tls() has been called ahead of the failure, the reference count on the calling thread's io_bitmap is incremented as we would expect. However, io_bitmap_exit() doesn't accept any arguments, and thus assumes it should trash the current thread's io_bitmap reference rather than the child's. This is pretty sneaky in practice, because in all instances but this one, exit_thread() is called with respect to the current task and everything works out. A determined attacker can issue an appropriate ioctl (i.e. KDENABIO) to get a bitmap allocated, and force a clone3() syscall to fail by passing in a zeroed clone_args structure. The kernel handles the erroneous struct and the buggy code path is followed, and even though the parent's reference to the io_bitmap is trashed, the child still holds a reference and thus the structure will never be freed. Fix this by tweaking io_bitmap_exit() and its subroutines to accept a task_struct argument which to operate on. Fixes: ea5f1cd7ab49 ("x86/ioperm: Remove bitmap if all permissions dropped") Signed-off-by: Jay Lang <jaytlang@mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable#@vger.kernel.org Link: https://lkml.kernel.org/r/20200524162742.253727-1-jaytlang@mit.edu
2020-05-28Merge tag 'csky-for-linus-5.7-rc8' of git://github.com/c-sky/csky-linuxLinus Torvalds
Pull csky fixes from Guo Ren: "Another four fixes for csky: - fix req_syscall debug - fix abiv2 syscall_trace - fix preempt enable - clean up regs usage in entry.S" * tag 'csky-for-linus-5.7-rc8' of git://github.com/c-sky/csky-linux: csky: Fixup CONFIG_DEBUG_RSEQ csky: Coding convention in entry.S csky: Fixup abiv2 syscall_trace break a4 & a5 csky: Fixup CONFIG_PREEMPT panic
2020-05-28Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"Jens Axboe
This reverts commit c58c1f83436b501d45d4050fd1296d71a9760bcb. io_uring does do the right thing for this case, and we're still returning -EAGAIN to userspace for the cases we don't support. Revert this change to avoid doing endless spins of resubmits. Cc: stable@vger.kernel.org # v5.6 Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-28x86/spinlock: Remove obsolete ticket spinlock macros and typesWaiman Long
Even though the x86 ticket spinlock code has been removed with cfd8983f03c7 ("x86, locking/spinlocks: Remove ticket (spin)lock implementation") a while ago, there are still some ticket spinlock specific macros and types left in the asm/spinlock_types.h header file that are no longer used. Remove those as well to avoid confusion. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200526122014.25241-1-longman@redhat.com
2020-05-28include/asm-generic/topology.h: guard cpumask_of_node() macro argumentArnd Bergmann
drivers/hwmon/amd_energy.c:195:15: error: invalid operands to binary expression ('void' and 'int') (channel - data->nr_cpus)); ~~~~~~~~~^~~~~~~~~~~~~~~~~ include/asm-generic/topology.h:51:42: note: expanded from macro 'cpumask_of_node' #define cpumask_of_node(node) ((void)node, cpu_online_mask) ^~~~ include/linux/cpumask.h:618:72: note: expanded from macro 'cpumask_first_and' #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p)) ^~~~~ Fixes: f0b848ce6fe9 ("cpumask: Introduce cpumask_of_{node,pcibus} to replace {node,pcibus}_to_cpumask") Fixes: 8abee9566b7e ("hwmon: Add amd_energy driver to report energy counters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20200527134623.930247-1-arnd@arndb.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info()Alexander Potapenko
KMSAN reported uninitialized data being written to disk when dumping core. As a result, several kilobytes of kmalloc memory may be written to the core file and then read by a non-privileged user. Reported-by: sam <sunhaoyl@outlook.com> Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200419100848.63472-1-glider@google.com Link: https://github.com/google/kmsan/issues/76 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()Konstantin Khlebnikov
Replace superfluous VM_BUG_ON() with comment about correct usage. Technically reverts commit 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()"), but context lines have changed. Function isolate_migratepages_block() runs some checks out of lru_lock when choose pages for migration. After checking PageLRU() it checks extra page references by comparing page_count() and page_mapcount(). Between these two checks page could be removed from lru, freed and taken by slab. As a result this race triggers VM_BUG_ON(PageSlab()) in page_mapcount(). Race window is tiny. For certain workload this happens around once a year. page:ffffea0105ca9380 count:1 mapcount:0 mapping:ffff88ff7712c180 index:0x0 compound_mapcount: 0 flags: 0x500000000008100(slab|head) raw: 0500000000008100 dead000000000100 dead000000000200 ffff88ff7712c180 raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(PageSlab(page)) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:628! invalid opcode: 0000 [#1] SMP NOPTI CPU: 77 PID: 504 Comm: kcompactd1 Tainted: G W 4.19.109-27 #1 Hardware name: Yandex T175-N41-Y3N/MY81-EX0-Y3N, BIOS R05 06/20/2019 RIP: 0010:isolate_migratepages_block+0x986/0x9b0 The code in isolate_migratepages_block() was added in commit 119d6d59dcc0 ("mm, compaction: avoid isolating pinned pages") before adding VM_BUG_ON into page_mapcount(). This race has been predicted in 2015 by Vlastimil Babka (see link below). [akpm@linux-foundation.org: comment tweaks, per Hugh] Fixes: 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/159032779896.957378.7852761411265662220.stgit@buzz Link: https://lore.kernel.org/lkml/557710E1.6060103@suse.cz/ Link: https://lore.kernel.org/linux-mm/158937872515.474360.5066096871639561424.stgit@buzz/T/ (v1) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm,thp: stop leaking unreleased file pagesHugh Dickins
When collapse_file() calls try_to_release_page(), it has already isolated the page: so if releasing buffers happens to fail (as it sometimes does), remember to putback_lru_page(): otherwise that page is left unreclaimable and unfreeable, and the file extent uncollapsible. Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@surriel.com> Cc: <stable@vger.kernel.org> [5.4+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005231837500.1766@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm/z3fold: silence kmemleak false positives of slotsQian Cai
Kmemleak reported many leaks while under memory pressue in, slots = alloc_slots(pool, gfp); which is referenced by "zhdr" in init_z3fold_page(), zhdr->slots = slots; However, "zhdr" could be gone without freeing slots as the later will be freed separately when the last "handle" off of "handles" array is freed. It will be within "slots" which is always aligned. unreferenced object 0xc000000fdadc1040 (size 104): comm "oom04", pid 140476, jiffies 4295359280 (age 3454.970s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: z3fold_zpool_malloc+0x7b0/0xe10 alloc_slots at mm/z3fold.c:214 (inlined by) init_z3fold_page at mm/z3fold.c:412 (inlined by) z3fold_alloc at mm/z3fold.c:1161 (inlined by) z3fold_zpool_malloc at mm/z3fold.c:1735 zpool_malloc+0x34/0x50 zswap_frontswap_store+0x60c/0xda0 zswap_frontswap_store at mm/zswap.c:1093 __frontswap_store+0x128/0x330 swap_writepage+0x58/0x110 pageout+0x16c/0xa40 shrink_page_list+0x1ac8/0x25c0 shrink_inactive_list+0x270/0x730 shrink_lruvec+0x444/0xf30 shrink_node+0x2a4/0x9c0 do_try_to_free_pages+0x158/0x640 try_to_free_pages+0x1bc/0x5f0 __alloc_pages_slowpath.constprop.60+0x4dc/0x15a0 __alloc_pages_nodemask+0x520/0x650 alloc_pages_vma+0xc0/0x420 handle_mm_fault+0x1174/0x1bf0 Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vitaly Wool <vitaly.wool@konsulko.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: http://lkml.kernel.org/r/20200522220052.2225-1-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28x86/dma: Fix max PFN arithmetic overflow on 32 bit systemsAlexander Dahl
The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is 4 294 967 296 or 0x100000000 which is no problem on 64 bit systems. The patch does not change the later overall result of 0x100000 for MAX_DMA32_PFN (after it has been shifted by PAGE_SHIFT). The new calculation yields the same result, but does not require 64 bit arithmetic. On 32 bit systems the old calculation suffers from an arithmetic overflow in that intermediate term in braces: 4UL aka unsigned long int is 4 byte wide and an arithmetic overflow happens (the 0x100000000 does not fit in 4 bytes), the in braces result is truncated to zero, the following right shift does not alter that, so MAX_DMA32_PFN evaluates to 0 on 32 bit systems. That wrong value is a problem in a comparision against MAX_DMA32_PFN in the init code for swiotlb in pci_swiotlb_detect_4gb() to decide if swiotlb should be active. That comparison yields the opposite result, when compiling on 32 bit systems. This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too") when that MAX_DMA32_PFN was first made visible to x86_32 (and which landed in v3.0). In practice this wasn't a problem, unless CONFIG_SWIOTLB is active on x86-32. However if one has set CONFIG_IOMMU_INTEL, since c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page is used") there's a dependency on CONFIG_SWIOTLB, which was not necessarily active before. That landed in v5.4, where we noticed it in the fli4l Linux distribution. We have CONFIG_IOMMU_INTEL active on both 32 and 64 bit kernel configs there (I could not find out why, so let's just say historical reasons). The effect is at boot time 64 MiB (default size) were allocated for bounce buffers now, which is a noticeable amount of memory on small systems like pcengines ALIX 2D3 with 256 MiB memory, which are still frequently used as home routers. We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4 (LTS) in fli4l and got that kernel messages for example: Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018 … Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem) … PCI-DMA: Using software bounce buffering for IO (SWIOTLB) software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB) The initial analysis and the suggested fix was done by user 'sourcejedi' at stackoverflow and explicitly marked as GPLv2 for inclusion in the Linux kernel: https://unix.stackexchange.com/a/520525/50007 The new calculation, which does not suffer from that overflow, is the same as for arch/mips now as suggested by Robin Murphy. The fix was tested by fli4l users on round about two dozen different systems, including both 32 and 64 bit archs, bare metal and virtualized machines. [ bp: Massage commit message. ] Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too") Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alexander Dahl <post@lespocky.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Link: https://unix.stackexchange.com/q/520065/50007 Link: https://web.nettworks.org/bugs/browse/FFL-2560 Link: https://lkml.kernel.org/r/20200526175749.20742-1-post@lespocky.de
2020-05-28bonding: Fix reference count leak in bond_sysfs_slave_add.Qiushi Wu
kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit "b8eb718348b8" fixed a similar problem. Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Uninitialized when used in __nf_conntrack_update(), from Nathan Chancellor. 2) Comparison of unsigned expression in nf_confirm_cthelper(). 3) Remove 'const' type qualifier with no effect. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge branch 'for-next/scs' into for-next/coreWill Deacon
Support for Clang's Shadow Call Stack in the kernel (Sami Tolvanen and Will Deacon) * for-next/scs: arm64: entry-ftrace.S: Update comment to indicate that x18 is live scs: Move DEFINE_SCS macro into core code scs: Remove references to asm/scs.h from core code scs: Move scs_overflow_check() out of architecture code arm64: scs: Use 'scs_sp' register alias for x18 scs: Move accounting into alloc/free functions arm64: scs: Store absolute SCS stack pointer value in thread_info efi/libstub: Disable Shadow Call Stack arm64: scs: Add shadow stacks for SDEI arm64: Implement Shadow Call Stack arm64: Disable SCS for hypervisor code arm64: vdso: Disable Shadow Call Stack arm64: efi: Restore register x18 if it was corrupted arm64: Preserve register x18 when CPU is suspended arm64: Reserve register x18 from general allocation with SCS scs: Disable when function graph tracing is enabled scs: Add support for stack usage debugging scs: Add page accounting for shadow call stack allocations scs: Add support for Clang's Shadow Call Stack (SCS)
2020-05-28Merge branch 'for-next/kvm/errata' into for-next/coreWill Deacon
KVM CPU errata rework (Andrew Scull and Marc Zyngier) * for-next/kvm/errata: KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h arm64: Unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE}
2020-05-28Merge branch 'for-next/bti' into for-next/coreWill Deacon
Support for Branch Target Identification (BTI) in user and kernel (Mark Brown and others) * for-next/bti: (39 commits) arm64: vdso: Fix CFI directives in sigreturn trampoline arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction arm64: bti: Fix support for userspace only BTI arm64: kconfig: Update and comment GCC version check for kernel BTI arm64: vdso: Map the vDSO text with guarded pages when built for BTI arm64: vdso: Force the vDSO to be linked as BTI when built for BTI arm64: vdso: Annotate for BTI arm64: asm: Provide a mechanism for generating ELF note for BTI arm64: bti: Provide Kconfig for kernel mode BTI arm64: mm: Mark executable text as guarded pages arm64: bpf: Annotate JITed code for BTI arm64: Set GP bit in kernel page tables to enable BTI for the kernel arm64: asm: Override SYM_FUNC_START when building the kernel with BTI arm64: bti: Support building kernel C code using BTI arm64: Document why we enable PAC support for leaf functions arm64: insn: Report PAC and BTI instructions as skippable arm64: insn: Don't assume unrecognized HINTs are skippable arm64: insn: Provide a better name for aarch64_insn_is_nop() arm64: insn: Add constants for new HINT instruction decode arm64: Disable old style assembly annotations ...
2020-05-28Merge branches 'for-next/acpi', 'for-next/bpf', 'for-next/cpufeature', ↵Will Deacon
'for-next/docs', 'for-next/kconfig', 'for-next/misc', 'for-next/perf', 'for-next/ptr-auth', 'for-next/sdei', 'for-next/smccc' and 'for-next/vdso' into for-next/core ACPI and IORT updates (Lorenzo Pieralisi) * for-next/acpi: ACPI/IORT: Remove the unused __get_pci_rid() ACPI/IORT: Fix PMCG node single ID mapping handling ACPI: IORT: Add comments for not calling acpi_put_table() ACPI: GTDT: Put GTDT table after parsing ACPI: IORT: Add extra message "applying workaround" for off-by-1 issue ACPI/IORT: work around num_ids ambiguity Revert "ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()" ACPI/IORT: take _DMA methods into account for named components BPF JIT optimisations for immediate value generation (Luke Nelson) * for-next/bpf: bpf, arm64: Optimize ADD,SUB,JMP BPF_K using arm64 add/sub immediates bpf, arm64: Optimize AND,OR,XOR,JSET BPF_K using arm64 logical immediates arm64: insn: Fix two bugs in encoding 32-bit logical immediates Addition of new CPU ID register fields and removal of some benign sanity checks (Anshuman Khandual and others) * for-next/cpufeature: (27 commits) KVM: arm64: Check advertised Stage-2 page size capability arm64/cpufeature: Add get_arm64_ftr_reg_nowarn() arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register arm64/cpufeature: Add remaining feature bits in ID_PFR0 register arm64/cpufeature: Introduce ID_MMFR5 CPU register arm64/cpufeature: Introduce ID_DFR1 CPU register arm64/cpufeature: Introduce ID_PFR2 CPU register arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0 arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register arm64/cpufeature: Drop open encodings while extracting parange arm64/cpufeature: Validate hypervisor capabilities during CPU hotplug arm64: cpufeature: Group indexed system register definitions by name arm64: cpufeature: Extend comment to describe absence of field info arm64: drop duplicate definitions of ID_AA64MMFR0_TGRAN constants arm64: cpufeature: Add an overview comment for the cpufeature framework ... Minor documentation tweaks for silicon errata and booting requirements (Rob Herring and Will Deacon) * for-next/docs: arm64: silicon-errata.rst: Sort the Cortex-A55 entries arm64: docs: Mandate that the I-cache doesn't hold stale kernel text Minor Kconfig cleanups (Geert Uytterhoeven) * for-next/kconfig: arm64: cpufeature: Add "or" to mitigations for multiple errata arm64: Sort vendor-specific errata Miscellaneous updates (Ard Biesheuvel and others) * for-next/misc: arm64: mm: Add asid_gen_match() helper arm64: stacktrace: Factor out some common code into on_stack() arm64: Call debug_traps_init() from trap_init() to help early kgdb arm64: cacheflush: Fix KGDB trap detection arm64/cpuinfo: Move device_initcall() near cpuinfo_regs_init() arm64: kexec_file: print appropriate variable arm: mm: use __pfn_to_section() to get mem_section arm64: Reorder the macro arguments in the copy routines efi/libstub/arm64: align PE/COFF sections to segment alignment KVM: arm64: Drop PTE_S2_MEMATTR_MASK arm64/kernel: Fix range on invalidating dcache for boot page tables arm64: set TEXT_OFFSET to 0x0 in preparation for removing it entirely arm64: lib: Consistently enable crc32 extension arm64/mm: Use phys_to_page() to access pgtable memory arm64: smp: Make cpus_stuck_in_kernel static arm64: entry: remove unneeded semicolon in el1_sync_handler() arm64/kernel: vmlinux.lds: drop redundant discard/keep macros arm64: drop GZFLAGS definition and export arm64: kexec_file: Avoid temp buffer for RNG seed arm64: rename stext to primary_entry Perf PMU driver updates (Tang Bin and others) * for-next/perf: pmu/smmuv3: Clear IRQ affinity hint on device removal drivers/perf: hisi: Permit modular builds of HiSilicon uncore drivers drivers/perf: hisi: Fix typo in events attribute array drivers/perf: arm_spe_pmu: Avoid duplicate printouts drivers/perf: arm_dsu_pmu: Avoid duplicate printouts Pointer authentication updates and support for vmcoreinfo (Amit Daniel Kachhap and Mark Rutland) * for-next/ptr-auth: Documentation/vmcoreinfo: Add documentation for 'KERNELPACMASK' arm64/crash_core: Export KERNELPACMASK in vmcoreinfo arm64: simplify ptrauth initialization arm64: remove ptrauth_keys_install_kernel sync arg SDEI cleanup and non-critical fixes (James Morse and others) * for-next/sdei: firmware: arm_sdei: Document the motivation behind these set_fs() calls firmware: arm_sdei: remove unused interfaces firmware: arm_sdei: Put the SDEI table after using it firmware: arm_sdei: Drop check for /firmware/ node and always register driver SMCCC updates and refactoring (Sudeep Holla) * for-next/smccc: firmware: smccc: Fix missing prototype warning for arm_smccc_version_init firmware: smccc: Add function to fetch SMCCC version firmware: smccc: Refactor SMCCC specific bits into separate file firmware: smccc: Drop smccc_version enum and use ARM_SMCCC_VERSION_1_x instead firmware: smccc: Add the definition for SMCCCv1.2 version/error codes firmware: smccc: Update link to latest SMCCC specification firmware: smccc: Add HAVE_ARM_SMCCC_DISCOVERY to identify SMCCC v1.1 and above vDSO cleanup and non-critical fixes (Mark Rutland and Vincenzo Frascino) * for-next/vdso: arm64: vdso: Add --eh-frame-hdr to ldflags arm64: vdso: use consistent 'map' nomenclature arm64: vdso: use consistent 'abi' nomenclature arm64: vdso: simplify arch_vdso_type ifdeffery arm64: vdso: remove aarch32_vdso_pages[] arm64: vdso: Add '-Bsymbolic' to ldflags
2020-05-28x86/mm: Drop deprecated DISCONTIGMEM support for 32-bitMike Rapoport
The DISCONTIGMEM support was marked as deprecated in v5.2 and since there were no complaints about it for almost 5 releases it can be completely removed. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20200223094322.15206-1-rppt@kernel.org
2020-05-28KVM: arm64: Move __load_guest_stage2 to kvm_mmu.hMarc Zyngier
Having __load_guest_stage2 in kvm_hyp.h is quickly going to trigger a circular include problem. In order to avoid this, let's move it to kvm_mmu.h, where it will be a better fit anyway. In the process, drop the __hyp_text annotation, which doesn't help as the function is marked as __always_inline. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-05-28KVM: arm64: Check advertised Stage-2 page size capabilityMarc Zyngier
With ARMv8.5-GTG, the hardware (or more likely a hypervisor) can advertise the supported Stage-2 page sizes. Let's check this at boot time. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-05-28x86/Kconfig: Update config and kernel doc for MPK feature on AMDBabu Moger
AMD's next generation of EPYC processors support the MPK (Memory Protection Keys) feature. Update the dependency and documentation. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/159068199556.26992.17733929401377275140.stgit@naples-babu.amd.com
2020-05-28powerpc/bpf: Enable bpf_probe_read{, str}() on powerpc againPetr Mladek
The commit 0ebeea8ca8a4d1d453a ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") caused that bpf_probe_read{, str}() functions were not longer available on architectures where the same logical address might have different content in kernel and user memory mapping. These architectures should use probe_read_{user,kernel}_str helpers. For backward compatibility, the problematic functions are still available on architectures where the user and kernel address spaces are not overlapping. This is defined CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. At the moment, these backward compatible functions are enabled only on x86_64, arm, and arm64. Let's do it also on powerpc that has the non overlapping address space as well. Fixes: 0ebeea8ca8a4 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/20200527122844.19524-1-pmladek@suse.com
2020-05-28hwmon: Add Baikal-T1 PVT sensor driverSerge Semin
Baikal-T1 SoC provides an embedded process, voltage and temperature sensor to monitor an internal SoC environment (chip temperature, supply voltage and process monitor) and on time detect critical situations, which may cause the system instability and even damages. The IP-block is based on the Analog Bits PVT sensor, but is equipped with a dedicated control wrapper, which provides a MMIO registers-based access to the sensor core functionality (APB3-bus based) and exposes an additional functions like thresholds/data ready interrupts, its status and masks, measurements timeout. All of these is used to create a hwmon driver being added to the kernel by this commit. The driver implements support for the hardware monitoring capabilities of Baikal-T1 process, voltage and temperature sensors. PVT IP-core consists of one temperature and four voltage sensors, each of which is implemented as a dedicated hwmon channel config. The driver can optionally provide the hwmon alarms for each sensor the PVT controller supports. The alarms functionality is made compile-time configurable due to the hardware interface implementation peculiarity, which is connected with an ability to convert data from only one sensor at a time. Additional limitation is that the controller performs the thresholds checking synchronously with the data conversion procedure. Due to these limitations in order to have the hwmon alarms automatically detected the driver code must switch from one sensor to another, read converted data and manually check the threshold status bits. Depending on the measurements timeout settings this design may cause additional burden on the system performance. By default if the alarms kernel config is disabled the data conversion is performed by the driver on demand when read operation is requested via corresponding _input-file. Co-developed-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru> Signed-off-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28hwmon: Add notification supportGuenter Roeck
For hwmon drivers using the hwmon_device_register_with_info() API, it is desirable to have a generic notification mechanism available. This mechanism can be used to notify userspace as well as the thermal subsystem if the driver experiences any events, such as warning or critical alarms. Implement hwmon_notify_event() to provide this mechanism. The function generates a sysfs event and a udev event. If the device is registered with the thermal subsystem and the event is associated with a temperature sensor, also notify the thermal subsystem that a thermal event occurred. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Cc: Maxim Kaurkin <Maxim.Kaurkin@baikalelectronics.ru> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28dt-bindings: hwmon: Add Baikal-T1 PVT sensor bindingSerge Semin
Baikal-T1 SoC is equipped with an embedded process, voltage and temperature sensor to monitor the chip internal environment like temperature, supply voltage and transistors performance. This bindings describes the external Baikal-T1 PVT control interfaces like MMIO registers space, interrupt request number and clocks source. These are then used by the corresponding hwmon device driver to implement the sysfs files-based access to the sensors functionality. Co-developed-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru> Signed-off-by: Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-mips@vger.kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-28Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7Jens Axboe
Pull NVMe poll fix from Christoph. * 'nvme-5.7' of git://git.infradead.org/nvme: nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
2020-05-28block: fix a warning when blkdev.h is included for !CONFIG_BLOCK buildsChristoph Hellwig
disk_start_io_acct and disk_end_io_acct need at least a struct gendisk forward declaration, but for weird historic reasons much of blkdev.h is stubbed out for CONFIG_BLOCK=n. Fix this by stubbing more out for now, but eventually this header will need a massive cleanup. Fixes: 956d510ee78 ("block: add disk/bio-based accounting helpers") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-28Merge series "add ecspi ERR009165 for i.mx6/7 soc family" from Robin Gong ↵Mark Brown
<yibin.gong@nxp.com>: There is ecspi ERR009165 on i.mx6/7 soc family, which cause FIFO transfer to be send twice in DMA mode. Please get more information from: https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf. The workaround is adding new sdma ram script which works in XCH mode as PIO inside sdma instead of SMC mode, meanwhile, 'TX_THRESHOLD' should be 0. The issue should be exist on all legacy i.mx6/7 soc family before i.mx6ul. NXP fix this design issue from i.mx6ul, so newer chips including i.mx6ul/ 6ull/6sll do not need this workaroud anymore. All other i.mx6/7/8 chips still need this workaroud. This patch set add new 'fsl,imx6ul-ecspi' for ecspi driver and 'ecspi_fixed' in sdma driver to choose if need errata or not. The first two reverted patches should be the same issue, though, it seems 'fixed' by changing to other shp script. Hope Sean or Sascha could have the chance to test this patch set if could fix their issues. Besides, enable sdma support for i.mx8mm/8mq and fix ecspi1 not work on i.mx8mm because the event id is zero. PS: Please get sdma firmware from below linux-firmware and copy it to your local rootfs /lib/firmware/imx/sdma. https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/imx/sdma v2: 1.Add commit log for reverted patches. 2.Add comment for 'ecspi_fixed' in sdma driver. 3.Add 'fsl,imx6sll-ecspi' compatible instead of 'fsl,imx6ul-ecspi' rather than remove. v3: 1.Confirm with design team make sure ERR009165 fixed on i.mx6ul/i.mx6ull /i.mx6sll, not fixed on i.mx8m/8mm and other i.mx6/7 legacy chips. Correct dts related dts patch in v2. 2.Clean eratta information in binding doc and new 'tx_glitch_fixed' flag in spi-imx driver to state ERR009165 fixed or not. 3.Enlarge burst size to fifo size for tx since tx_wml set to 0 in the errata workaroud, thus improve performance as possible. v4: 1.Add Ack tag from Mark and Vinod 2.Remove checking 'event_id1' zero as 'event_id0'. v5: 1.Add the last patch for compatible with the current uart driver which using rom script, so both uart ram script and rom script supported in latest firmware, by default uart rom script used. UART driver will be broken without this patch. v6: 1.Resend after rebase the latest next branch. 2.Remove below No.13~No.15 patches of v5 because they were mergered. ARM: dts: imx6ul: add dma support on ecspi ARM: dts: imx6sll: correct sdma compatible arm64: defconfig: Enable SDMA on i.mx8mq/8mm 3.Revert "dmaengine: imx-sdma: fix context cache" since 'context_loaded' removed. v7: 1.Put the last patch 13/13 'Revert "dmaengine: imx-sdma: fix context cache"' to the ahead of 03/13 'Revert "dmaengine: imx-sdma: refine to load context only once" so that no building waring during comes out during bisect. 2.Address Sascha's comments, including eliminating any i.mx6sx in this series, adding new 'is_imx6ul_ecspi()' instead imx in imx51 and taking care SMC bit for PIO. 3.Add back missing 'Reviewed-by' tag on 08/15(v5):09/13(v7) 'spi: imx: add new i.mx6ul compatible name in binding doc' v8: 1.remove 0003-Revert-dmaengine-imx-sdma-fix-context-cache.patch and merge it into 04/13 of v7 2.add 0005-spi-imx-fallback-to-PIO-if-dma-setup-failure.patch for no any ecspi function broken even if sdma firmware not updated. 3.merge 'tx.dst_maxburst' changes in the two continous patches into one patch to avoid confusion. 4.fix typo 'duplicated'. Robin Gong (13): Revert "ARM: dts: imx6q: Use correct SDMA script for SPI5 core" Revert "ARM: dts: imx6: Use correct SDMA script for SPI cores" Revert "dmaengine: imx-sdma: refine to load context only once" dmaengine: imx-sdma: remove duplicated sdma_load_context spi: imx: fallback to PIO if dma setup failure dmaengine: imx-sdma: add mcu_2_ecspi script spi: imx: fix ERR009165 spi: imx: remove ERR009165 workaround on i.mx6ul spi: imx: add new i.mx6ul compatible name in binding doc dmaengine: imx-sdma: remove ERR009165 on i.mx6ul dma: imx-sdma: add i.mx6ul compatible name dmaengine: imx-sdma: fix ecspi1 rx dma not work on i.mx8mm dmaengine: imx-sdma: add uart rom script .../devicetree/bindings/dma/fsl-imx-sdma.txt | 1 + .../devicetree/bindings/spi/fsl-imx-cspi.txt | 1 + arch/arm/boot/dts/imx6q.dtsi | 2 +- arch/arm/boot/dts/imx6qdl.dtsi | 8 +- drivers/dma/imx-sdma.c | 67 ++++++++++------ drivers/spi/spi-imx.c | 92 +++++++++++++++++++--- include/linux/platform_data/dma-imx-sdma.h | 8 +- 7 files changed, 135 insertions(+), 44 deletions(-) -- 2.7.4 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
2020-05-28spi: tegra20-sflash: Fix runtime PM imbalance on errorDinghao Liu
pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error code. Thus a pairing decrement is needed on the error handling path to keep the counter balanced. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20200523124758.28604-1-dinghao.liu@zju.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28spi: tegra20-slink: Fix runtime PM imbalance on errorDinghao Liu
pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error code. Thus a pairing decrement is needed on the error handling path to keep the counter balanced. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20200523122909.25247-1-dinghao.liu@zju.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28spi: tegra114: Fix runtime PM imbalance on errorDinghao Liu
pm_runtime_get_sync() increments the runtime PM usage counter even when it returns an error code. Thus a pairing decrement is needed on the error handling path to keep the counter balanced. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20200523125704.30300-1-dinghao.liu@zju.edu.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28spi: imx: fallback to PIO if dma setup failureRobin Gong
Fallback to PIO in case dma setup failed. For example, sdma firmware not updated but ERR009165 workaroud added in kernel. Signed-off-by: Robin Gong <yibin.gong@nxp.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/r/1590006865-20900-6-git-send-email-yibin.gong@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-05-28efi/x86: Don't blow away existing initrdArvind Sankar
Commit 987053a30016 ("efi/x86: Move command-line initrd loading to efi_main") moved the command-line initrd loading into efi_main(), with a check to ensure that it was attempted only if the EFI stub was booted via efi_pe_entry rather than the EFI handover entry. However, in the case where it was booted via handover entry, and thus an initrd may have already been loaded by the bootloader, it then wrote 0 for the initrd address and size, removing any existing initrd. Fix this by checking if size is positive before setting the fields in the bootparams structure. Fixes: 987053a30016 ("efi/x86: Move command-line initrd loading to efi_main") Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lkml.kernel.org/r/20200527232602.21596-1-nivedita@alum.mit.edu
2020-05-28btrfs: fix space_info bytes_may_use underflow during space cache writeoutFilipe Manana
We always preallocate a data extent for writing a free space cache, which causes writeback to always try the nocow path first, since the free space inode has the prealloc bit set in its flags. However if the block group that contains the data extent for the space cache has been turned to RO mode due to a running scrub or balance for example, we have to fallback to the cow path. In that case once a new data extent is allocated we end up calling btrfs_add_reserved_bytes(), which decrements the counter named bytes_may_use from the data space_info object with the expection that this counter was previously incremented with the same amount (the size of the data extent). However when we started writeout of the space cache at cache_save_setup(), we incremented the value of the bytes_may_use counter through a call to btrfs_check_data_free_space() and then decremented it through a call to btrfs_prealloc_file_range_trans() immediately after. So when starting the writeback if we fallback to cow mode we have to increment the counter bytes_may_use of the data space_info again to compensate for the extent allocation done by the cow path. When this issue happens we are incorrectly decrementing the bytes_may_use counter and when its current value is smaller then the amount we try to subtract we end up with the following warning: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 657 at fs/btrfs/space-info.h:115 btrfs_add_reserved_bytes+0x3d6/0x4e0 [btrfs] Modules linked in: btrfs blake2b_generic xor raid6_pq libcrc32c (...) CPU: 3 PID: 657 Comm: kworker/u8:7 Tainted: G W 5.6.0-rc7-btrfs-next-58 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-btrfs-1591) RIP: 0010:btrfs_add_reserved_bytes+0x3d6/0x4e0 [btrfs] Code: ff ff 48 (...) RSP: 0000:ffffa41608f13660 EFLAGS: 00010287 RAX: 0000000000001000 RBX: ffff9615b93ae400 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff9615b96ab410 RBP: fffffffffffee000 R08: 0000000000000001 R09: 0000000000000000 R10: ffff961585e62a40 R11: 0000000000000000 R12: ffff9615b96ab400 R13: ffff9615a1a2a000 R14: 0000000000012000 R15: ffff9615b93ae400 FS: 0000000000000000(0000) GS:ffff9615bb200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055cbbc2ae178 CR3: 0000000115794006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: find_free_extent+0x4a0/0x16c0 [btrfs] btrfs_reserve_extent+0x91/0x180 [btrfs] cow_file_range+0x12d/0x490 [btrfs] btrfs_run_delalloc_range+0x9f/0x6d0 [btrfs] ? find_lock_delalloc_range+0x221/0x250 [btrfs] writepage_delalloc+0xe8/0x150 [btrfs] __extent_writepage+0xe8/0x4c0 [btrfs] extent_write_cache_pages+0x237/0x530 [btrfs] extent_writepages+0x44/0xa0 [btrfs] do_writepages+0x23/0x80 __writeback_single_inode+0x59/0x700 writeback_sb_inodes+0x267/0x5f0 __writeback_inodes_wb+0x87/0xe0 wb_writeback+0x382/0x590 ? wb_workfn+0x4a2/0x6c0 wb_workfn+0x4a2/0x6c0 process_one_work+0x26d/0x6a0 worker_thread+0x4f/0x3e0 ? process_one_work+0x6a0/0x6a0 kthread+0x103/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace bd7c03622e0b0a52 ]--- ------------[ cut here ]------------ So fix this by incrementing the bytes_may_use counter of the data space_info when we fallback to the cow path. If the cow path is successful the counter is decremented after extent allocation (by btrfs_add_reserved_bytes()), if it fails it ends up being decremented as well when clearing the delalloc range (extent_clear_unlock_delalloc()). This could be triggered sporadically by the test case btrfs/061 from fstests. Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: fix space_info bytes_may_use underflow after nocow buffered writeFilipe Manana
When doing a buffered write we always try to reserve data space for it, even when the file has the NOCOW bit set or the write falls into a file range covered by a prealloc extent. This is done both because it is expensive to check if we can do a nocow write (checking if an extent is shared through reflinks or if there's a hole in the range for example), and because when writeback starts we might actually need to fallback to COW mode (for example the block group containing the target extents was turned into RO mode due to a scrub or balance). When we are unable to reserve data space we check if we can do a nocow write, and if we can, we proceed with dirtying the pages and setting up the range for delalloc. In this case the bytes_may_use counter of the data space_info object is not incremented, unlike in the case where we are able to reserve data space (done through btrfs_check_data_free_space() which calls btrfs_alloc_data_chunk_ondemand()). Later when running delalloc we attempt to start writeback in nocow mode but we might revert back to cow mode, for example because in the meanwhile a block group was turned into RO mode by a scrub or relocation. The cow path after successfully allocating an extent ends up calling btrfs_add_reserved_bytes(), which expects the bytes_may_use counter of the data space_info object to have been incremented before - but we did not do it when the buffered write started, since there was not enough available data space. So btrfs_add_reserved_bytes() ends up decrementing the bytes_may_use counter anyway, and when the counter's current value is smaller then the size of the allocated extent we get a stack trace like the following: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 20138 at fs/btrfs/space-info.h:115 btrfs_add_reserved_bytes+0x3d6/0x4e0 [btrfs] Modules linked in: btrfs blake2b_generic xor raid6_pq libcrc32c (...) CPU: 0 PID: 20138 Comm: kworker/u8:15 Not tainted 5.6.0-rc7-btrfs-next-58 #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-btrfs-1754) RIP: 0010:btrfs_add_reserved_bytes+0x3d6/0x4e0 [btrfs] Code: ff ff 48 (...) RSP: 0018:ffffbda18a4b3568 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff9ca076f5d800 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff9ca068470410 RBP: fffffffffffff000 R08: 0000000000000001 R09: 0000000000000000 R10: ffff9ca079d58040 R11: 0000000000000000 R12: ffff9ca068470400 R13: ffff9ca0408b2000 R14: 0000000000001000 R15: ffff9ca076f5d800 FS: 0000000000000000(0000) GS:ffff9ca07a600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005605dbfe7048 CR3: 0000000138570006 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: find_free_extent+0x4a0/0x16c0 [btrfs] btrfs_reserve_extent+0x91/0x180 [btrfs] cow_file_range+0x12d/0x490 [btrfs] run_delalloc_nocow+0x341/0xa40 [btrfs] btrfs_run_delalloc_range+0x1ea/0x6d0 [btrfs] ? find_lock_delalloc_range+0x221/0x250 [btrfs] writepage_delalloc+0xe8/0x150 [btrfs] __extent_writepage+0xe8/0x4c0 [btrfs] extent_write_cache_pages+0x237/0x530 [btrfs] ? btrfs_wq_submit_bio+0x9f/0xc0 [btrfs] extent_writepages+0x44/0xa0 [btrfs] do_writepages+0x23/0x80 __writeback_single_inode+0x59/0x700 writeback_sb_inodes+0x267/0x5f0 __writeback_inodes_wb+0x87/0xe0 wb_writeback+0x382/0x590 ? wb_workfn+0x4a2/0x6c0 wb_workfn+0x4a2/0x6c0 process_one_work+0x26d/0x6a0 worker_thread+0x4f/0x3e0 ? process_one_work+0x6a0/0x6a0 kthread+0x103/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffff94ebdedf>] copy_process+0x74f/0x2020 softirqs last enabled at (0): [<ffffffff94ebdedf>] copy_process+0x74f/0x2020 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace f9f6ef8ec4cd8ec9 ]--- So to fix this, when falling back into cow mode check if space was not reserved, by testing for the bit EXTENT_NORESERVE in the respective file range, and if not, increment the bytes_may_use counter for the data space_info object. Also clear the EXTENT_NORESERVE bit from the range, so that if the cow path fails it decrements the bytes_may_use counter when clearing the delalloc range (through the btrfs_clear_delalloc_extent() callback). Fixes: 7ee9e4405f264e ("Btrfs: check if we can nocow if we don't have data space") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: fix wrong file range cleanup after an error filling dealloc rangeFilipe Manana
If an error happens while running dellaloc in COW mode for a range, we can end up calling extent_clear_unlock_delalloc() for a range that goes beyond our range's end offset by 1 byte, which affects 1 extra page. This results in clearing bits and doing page operations (such as a page unlock) outside our target range. Fix that by calling extent_clear_unlock_delalloc() with an inclusive end offset, instead of an exclusive end offset, at cow_file_range(). Fixes: a315e68f6e8b30 ("Btrfs: fix invalid attempt to free reserved space on failure to cow range") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: remove redundant local variable in read_block_for_searchNikolay Borisov
The local 'b' variable is only used to directly read values from passed extent buffer. So eliminate it and directly use the input parameter. Furthermore this shrinks the size of the following functions: ./scripts/bloat-o-meter ctree.orig fs/btrfs/ctree.o add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-73 (-73) Function old new delta read_block_for_search.isra 876 871 -5 push_node_left 1112 1044 -68 Total: Before=50348, After=50275, chg -0.14% Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: open code key_searchNikolay Borisov
This function wraps the optimisation implemented by d7396f07358a ("Btrfs: optimize key searches in btrfs_search_slot") however this optimisation is really used in only one place - btrfs_search_slot. Just open code the optimisation and also add a comment explaining how it works since it's not clear just by looking at the code - the key point here is it depends on an internal invariant that BTRFS' btree provides, namely intermediate pointers always contain the key at slot0 at the child node. So in the case of exact match we can safely assume that the given key will always be in slot 0 on lower levels. Furthermore this results in a reduction of btrfs_search_slot's size: ./scripts/bloat-o-meter ctree.orig fs/btrfs/ctree.o add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-75 (-75) Function old new delta btrfs_search_slot 2783 2708 -75 Total: Before=50423, After=50348, chg -0.15% Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: split btrfs_direct_IO to read and write partChristoph Hellwig
The read and write versions don't have anything in common except for the call to iomap_dio_rw. So split this function, and merge each half into its only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: remove BTRFS_INODE_READDIO_NEED_LOCKGoldwyn Rodrigues
Since we now perform direct reads using i_rwsem, we can remove this inode flag used to co-ordinate unlocked reads. The truncate call takes i_rwsem. This means it is correctly synchronized with concurrent direct reads. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28fs: remove dio_end_io()Goldwyn Rodrigues
Since we removed the last user of dio_end_io(), remove the helper function dio_end_io(). Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28btrfs: switch to iomap_dio_rw() for dioGoldwyn Rodrigues
Switch from __blockdev_direct_IO() to iomap_dio_rw(). Rename btrfs_get_blocks_direct() to btrfs_dio_iomap_begin() and use it as iomap_begin() for iomap direct I/O functions. This function allocates and locks all the blocks required for the I/O. btrfs_submit_direct() is used as the submit_io() hook for direct I/O ops. Since we need direct I/O reads to go through iomap_dio_rw(), we change file_operations.read_iter() to a btrfs_file_read_iter() which calls btrfs_direct_IO() for direct reads and falls back to generic_file_buffered_read() for incomplete reads and buffered reads. We don't need address_space.direct_IO() anymore so set it to noop. Similarly, we don't need flags used in __blockdev_direct_IO(). iomap is capable of direct I/O reads from a hole, so we don't need to return -ENOENT. BTRFS direct I/O is now done under i_rwsem, shared in case of reads and exclusive in case of writes. This guards against simultaneous truncates. Use iomap->iomap_end() to check for failed or incomplete direct I/O: - for writes, call __endio_write_update_ordered() - for reads, unlock extents btrfs_dio_data is now hooked in iomap->private and not current->journal_info. It carries the reservation variable and the amount of data submitted, so we can calculate the amount of data to call __endio_write_update_ordered in case of an error. This patch removes last use of struct buffer_head from btrfs. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-05-28ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()Qiushi Wu
kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit "b8eb718348b8" fixed a similar problem. Fixes: 158c998ea44b ("ACPI / CPPC: add sysfs support to compute delivered performance") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-28ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()Qiushi Wu
kobject_init_and_add() takes reference even when it fails. Thus, when kobject_init_and_add() returns an error, kobject_put() must be called to properly clean up the kobject. Fixes: 3f8055c35836 ("ACPI / hotplug: Introduce user space interface for hotplug profiles") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-28arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()Nobuhiro Iwamatsu
If boot_secondary() was successful, and cpu_online() was an error in __cpu_up(), -EIO was returned, but 0 is returned by commit d22b115cbfbb7 ("arm64/kernel: Simplify __cpu_up() by bailing out early"). Therefore, bringup_wait_for_ap() causes the primary core to wait for a long time, which may cause boot failure. This commit sets -EIO to return code under the same conditions. Fixes: d22b115cbfbb ("arm64/kernel: Simplify __cpu_up() by bailing out early") Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Tested-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp> Acked-by: Will Deacon <will@kernel.org> Cc: Gavin Shan <gshan@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200527233457.2531118-1-nobuhiro1.iwamatsu@toshiba.co.jp [catalin.marinas@arm.com: return -EIO at the end of the function] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-05-28mmc: sdio: Fix macro name for Marvell device with ID 0x9134Pali Rohár
Marvell SDIO device ID 0x9134 is used in SDIO Common CIS (Card Information Structure) and not in SDIO wlan function (with ID 1). SDIO Common CIS is accessed by function ID 0. So change this misleading macro name to SDIO_DEVICE_ID_MARVELL_8887_F0 as it does not refer to wlan function. It refers to function 0. Wlan module on this SDIO card is available at function ID 1 and is identified by different SDIO device ID 0x9135. Kernel quirks for SDIO devices are matched against device ID from SDIO Common CIS. Therefore device ID used in quirk is correct, just has misleading name. Signed-off-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20200522144412.19712-2-pali@kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-28Merge branch 'fixes' into nextUlf Hansson
2020-05-28m68k: coldfire/clk.c: move m5441x specific codeAngelo Dureghello
Moving specific m5441x clk-related code in more appropriate location, since breaking compilation for other targets. Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com> Link: https://lore.kernel.org/r/20200525102324.2723438-1-angelo.dureghello@timesys.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-05-28mmc: sdhci-of-esdhc: exit HS400 properly before setting any speed modeYangbo Lu
The eSDHC HS400 timing requires many specific registers setting, unlike other speed modes which need to set only host controller 2 register. When driver needs to downgrade HS400 mode to other speed mode, the controller have to exit HS400 timing properly first. This patch is to support the procedure of HS400 exiting at the beginning of esdhc_set_uhs_signaling. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20200522031256.856-1-yangbo.lu@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>