linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-04-03	dax: introduce CONFIG_DAX_DRIVER	Dan Williams
	In support of allowing device-mapper to compile out idle/dead code when there are no dax providers in the system, introduce the DAX_DRIVER symbol. This is selected by all leaf drivers that device-mapper might be layered on top. This allows device-mapper to conditionally 'select DAX' only when a provider is present. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-04-03	fs, dax: use page->mapping to warn if truncate collides with a busy page	Dan Williams
	Catch cases where extent unmap operations encounter pages that are pinned / busy. Typically this is pinned pages that are under active dma. This warning is a canary for potential data corruption as truncated blocks could be allocated to a new file while the device is still performing i/o. Here is an example of a collision that this implementation catches: WARNING: CPU: 2 PID: 1286 at fs/dax.c:343 dax_disassociate_entry+0x55/0x80 [..] Call Trace: __dax_invalidate_mapping_entry+0x6c/0xf0 dax_delete_mapping_entry+0xf/0x20 truncate_exceptional_pvec_entries.part.12+0x1af/0x200 truncate_inode_pages_range+0x268/0x970 ? tlb_gather_mmu+0x10/0x20 ? up_write+0x1c/0x40 ? unmap_mapping_range+0x73/0x140 xfs_free_file_space+0x1b6/0x5b0 [xfs] ? xfs_file_fallocate+0x7f/0x320 [xfs] ? down_write_nested+0x40/0x70 ? xfs_ilock+0x21d/0x2f0 [xfs] xfs_file_fallocate+0x162/0x320 [xfs] ? rcu_read_lock_sched_held+0x3f/0x70 ? rcu_sync_lockdep_assert+0x2a/0x50 ? __sb_start_write+0xd0/0x1b0 ? vfs_fallocate+0x20c/0x270 vfs_fallocate+0x154/0x270 SyS_fallocate+0x43/0x80 entry_SYSCALL_64_fastpath+0x1f/0x96 Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-04-03	ext2, dax: introduce ext2_dax_aops	Dan Williams
	In preparation for the dax implementation to start associating dax pages to inodes via page->mapping, we need to provide a 'struct address_space_operations' instance for dax. Otherwise, direct-I/O triggers incorrect page cache assumptions and warnings. Reviewed-by: Jan Kara <jack@suse.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-04-03	powerpc/powernv: Fix SMT4 forcing idle code	Nicholas Piggin
	The PSSCR value is not stored to PACA_REQ_PSSCR if the CPU does not have the XER[SO] bug. Fix this by storing up-front, outside the workaround code. The initial test is not required because it is a slow path. The workaround is made to depend on CONFIG_KVM_BOOK3S_HV_POSSIBLE, to match pnv_power9_force_smt4_catch() where it is used. Drop the comment on pnv_power9_force_smt4_catch() as it's no longer true. Fixes: 7672691a08c8 ("powerpc/powernv: Provide a way to force a core into SMT4 mode") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	selftests/powerpc: Fix copyloops build since Power4 assembler change	Michael Ellerman
	The recent commit 15a3204d24a3 ("powerpc/64s: Set assembler machine type to POWER4") set the machine type in our ASFLAGS when building the kernel, and removed some ".machine power4" directives from various asm files. This broke the selftests build on old toolchains (that don't assume Power4), because we build the kernel source files into the selftests using different ASFLAGS. The fix is simply to add -mpower4 to the selftest ASFLAGS as well. Fixes: 15a3204d24a3 ("powerpc/64s: Set assembler machine type to POWER4") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	powerpc/pseries: Restore default security feature flags on setup	Mauricio Faria de Oliveira
	After migration the security feature flags might have changed (e.g., destination system with unpatched firmware), but some flags are not set/clear again in init_cpu_char_feature_flags() because it assumes the security flags to be the defaults. Additionally, if the H_GET_CPU_CHARACTERISTICS hypercall fails then init_cpu_char_feature_flags() does not run again, which potentially might leave the system in an insecure or sub-optimal configuration. So, just restore the security feature flags to the defaults assumed by init_cpu_char_feature_flags() so it can set/clear them correctly, and to ensure safe settings are in place in case the hypercall fail. Fixes: f636c14790ea ("powerpc/pseries: Set or clear security feature flags") Depends-on: 19887d6a28e2 ("powerpc: Move default security feature flags") Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	powerpc: Move default security feature flags	Mauricio Faria de Oliveira
	This moves the definition of the default security feature flags (i.e., enabled by default) closer to the security feature flags. This can be used to restore current flags to the default flags. Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	powerpc: Don't write to DABR on >= Power8 if DAWR is disabled	Nicholas Piggin
	flush_thread() calls __set_breakpoint() via set_debug_reg_defaults() without checking ppc_breakpoint_available(). On Power8 or later CPUs which have the DAWR feature disabled that will cause a write to the DABR which is incorrect as those CPUs don't have a DABR. Fix it two ways, by checking ppc_breakpoint_available() in set_debug_reg_defaults(), and also by reworking __set_breakpoint() to only write to DABR on Power7 or earlier. Fixes: 9654153158d3 ("powerpc: Disable DAWR in the base POWER9 CPU features") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Rework the logic in __set_breakpoint()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	KVM: PPC: Book3S HV: Fix ppc_breakpoint_available compile error	Nicholas Piggin
	arch/powerpc/kvm/book3s_hv.c: In function ‘kvmppc_h_set_mode’: arch/powerpc/kvm/book3s_hv.c:745:8: error: implicit declaration of function ‘ppc_breakpoint_available’ if (!ppc_breakpoint_available()) ^~~~~~~~~~~~~~~~~~~~~~~~ Fixes: 398e712c007f ("KVM: PPC: Book3S HV: Return error from h_set_mode(SET_DAWR) on POWER9") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	powerpc: Fix oops due to bad access of lppaca on bare metal	Aneesh Kumar K.V
	Commit 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized") removed allocation of lppaca on bare metal platforms. But with CONFIG_PPC_SPLPAR enabled, we still access the lppaca on bare metal in some code paths. Fix this but adding runtime checks for SPLPAR (shared processor LPAR). Fixes: 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-03	misc: pci_endpoint_test: Handle 64-bit BARs properly	Niklas Cassel
	A 64-bit BAR consists of a BAR pair, where the second BAR has the upper bits, so we cannot simply call pci_ioremap_bar() on every single BAR index. The second BAR in a BAR pair will not have the IORESOURCE_MEM resource flag set. Only call ioremap on BARs that have the IORESOURCE_MEM resource flag set. pci 0000:01:00.0: BAR 4: assigned [mem 0xc0300000-0xc031ffff 64bit] pci 0000:01:00.0: BAR 2: assigned [mem 0xc0320000-0xc03203ff 64bit] pci 0000:01:00.0: BAR 0: assigned [mem 0xc0320400-0xc03204ff 64bit] pci-endpoint-test 0000:01:00.0: can't ioremap BAR 1: [??? 0x00000000 flags 0x0] pci-endpoint-test 0000:01:00.0: failed to read BAR1 pci-endpoint-test 0000:01:00.0: can't ioremap BAR 3: [??? 0x00000000 flags 0x0] pci-endpoint-test 0000:01:00.0: failed to read BAR3 pci-endpoint-test 0000:01:00.0: can't ioremap BAR 5: [??? 0x00000000 flags 0x0] pci-endpoint-test 0000:01:00.0: failed to read BAR5 Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: designware-ep: Make dw_pcie_ep_reset_bar() handle 64-bit BARs properly	Niklas Cassel
	Since a 64-bit BAR consists of a BAR pair, we need to write to both BARs in the BAR pair to clear the BAR properly. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: Make sure that BAR_5 does not have 64-bit flag set when clearing	Niklas Cassel
	Since a 64-bit BAR consists of a BAR pair, and since there is no BAR after BAR_5, BAR_5 cannot be 64-bits wide. This sanity check is done in pci_epc_clear_bar(), so that we don't need to do this sanity check in all epc->ops->clear_bar() implementations. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct ↵	Niklas Cassel
	epf_bar Make epc->ops->clear_bar()/pci_epc_clear_bar() take struct epf_bar. This is needed so that epc->ops->clear_bar() can clear the BAR pair, if the BAR is 64-bits wide. This also makes it possible for pci_epc_clear_bar() to sanity check the flags. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2018-04-03	PCI: endpoint: Handle 64-bit BARs properly	Niklas Cassel
	If a 64-bit BAR was set-up, we need to skip a BAR, since a 64-bit BAR consists of a BAR pair. We need to check what BAR width the epc->ops->set_bar() specific implementation actually did set-up, since some drivers, like the Cadence EP controller, sometimes sets up a 64-bit BAR, even though a 32-bit BAR was requested. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: cadence: Set PCI_BASE_ADDRESS_MEM_TYPE_64 if a 64-bit BAR was set-up	Niklas Cassel
	cdns_pcie_ep_set_bar() does some round-up of the BAR size, which means that a 64-bit BAR can be set-up, even when the flag PCI_BASE_ADDRESS_MEM_TYPE_64 isn't set. If a 64-bit BAR was set-up, set the flag PCI_BASE_ADDRESS_MEM_TYPE_64, so that the calling function can know what BAR width that was actually set-up. I'm not sure why cdns_pcie_ep_set_bar() doesn't obey the flag PCI_BASE_ADDRESS_MEM_TYPE_64, but I leave this for the MAINTAINER to fix, since there might be a reason why this flag is ignored. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Alan Douglas <adouglas@cadence.com>
2018-04-03	PCI: designware-ep: Make dw_pcie_ep_set_bar() handle 64-bit BARs properly	Niklas Cassel
	Since a 64-bit BAR consists of a BAR pair, we need to write to both BARs in the BAR pair to setup the BAR properly. Link: https://lkml.kernel.org/r/20180328115018.31921-7-niklas.cassel@axis.com Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> [lorenzo.pieralisi@arm.com: updated code according to review] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
2018-04-03	PCI: endpoint: Setting a BAR size > 4 GB is invalid if 64-bit flag is not set	Niklas Cassel
	Setting a BAR size > 4 GB is invalid if PCI_BASE_ADDRESS_MEM_TYPE_64 flag is not set. This sanity check is done in pci_epc_set_bar(), so that we don't need to do this sanity check in all epc->ops->set_bar() implementations. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: Setting 64-bit/prefetch bit is invalid when IO is set	Niklas Cassel
	If flag PCI_BASE_ADDRESS_SPACE_IO is set, also having any PCI_BASE_ADDRESS_MEM_* bit set is invalid. This sanity check is done in pci_epc_set_bar(), so that we don't need to do this sanity check in all epc->ops->set_bar() implementations. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: Setting BAR_5 to 64-bits wide is invalid	Niklas Cassel
	Since a 64-bit BAR consists of a BAR pair, and since there is no BAR after BAR_5, BAR_5 cannot be 64-bits wide. This sanity check is done in pci_epc_set_bar(), so that we don't need to do this sanity check in all epc->ops->set_bar() implementations. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: Simplify epc->ops->set_bar()/pci_epc_set_bar()	Niklas Cassel
	Add barno and flags to struct epf_bar. That way we can simplify epc->ops->set_bar()/pci_epc_set_bar() by passing a struct *epf_bar instead of a whole lot of arguments. This is needed so that epc->ops->set_bar() implementations can modify BAR flags. Will be utilized in a succeeding patch. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	PCI: endpoint: BAR width should not depend on sizeof dma_addr_t	Niklas Cassel
	If a BAR supports 64-bit width or not depends on the hardware, and should thus not depend on sizeof(dma_addr_t). If a certain hardware doesn't support 64-bit BARs, its epc->ops->set_bar() implementation should return -EINVAL when PCI_BASE_ADDRESS_MEM_TYPE_64 is set. We can't change pci_epc_set_bar() to only set PCI_BASE_ADDRESS_MEM_TYPE_64 based on size, since if the user, for some reason, wants to configure a BAR with a 64-bit width, even though the BAR size is less than 4 GB, he should be able to do that. However, since pci-epf-test is simply a test and not an API, we can set PCI_BASE_ADDRESS_MEM_TYPE_64 in pci-epf-test itself only based on size. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-04-03	ALSA: pcm: Fix UAF at PCM release via PCM timer access	Takashi Iwai
	The PCM runtime object is created and freed dynamically at PCM stream open / close time. This is tracked via substream->runtime, and it's cleared at snd_pcm_detach_substream(). The runtime object assignment is protected by PCM open_mutex, so for all PCM operations, it's safely handled. However, each PCM substream provides also an ALSA timer interface, and user-space can access to this while closing a PCM substream. This may eventually lead to a UAF, as snd_pcm_timer_resolution() tries to access the runtime while clearing it in other side. Fortunately, it's the only concurrent access from the PCM timer, and it merely reads runtime->timer_resolution field. So, we can avoid the race by reordering kfree() and wrapping the substream->runtime clearance with the corresponding timer lock. Reported-by: syzbot+8e62ff4e07aa2ce87826@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-04-02	Merge branch 'syscalls-next' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux Pull removal of in-kernel calls to syscalls from Dominik Brodowski: "System calls are interaction points between userspace and the kernel. Therefore, system call functions such as sys_xyzzy() or compat_sys_xyzzy() should only be called from userspace via the syscall table, but not from elsewhere in the kernel. At least on 64-bit x86, it will likely be a hard requirement from v4.17 onwards to not call system call functions in the kernel: It is better to use use a different calling convention for system calls there, where struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands processing over to the actual syscall function. This means that only those parameters which are actually needed for a specific syscall are passed on during syscall entry, instead of filling in six CPU registers with random user space content all the time (which may cause serious trouble down the call chain). Those x86-specific patches will be pushed through the x86 tree in the near future. Moreover, rules on how data may be accessed may differ between kernel data and user data. This is another reason why calling sys_xyzzy() is generally a bad idea, and -- at most -- acceptable in arch-specific code. This patchset removes all in-kernel calls to syscall functions in the kernel with the exception of arch/. On top of this, it cleans up the three places where many syscalls are referenced or prototyped, namely kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h" * 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits) bpf: whitelist all syscalls for error injection kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions kernel/sys_ni: sort cond_syscall() entries syscalls/x86: auto-create compat_sys_() prototypes syscalls: sort syscall prototypes in include/linux/compat.h net: remove compat_sys_() prototypes from net/compat.h syscalls: sort syscall prototypes in include/linux/syscalls.h kexec: move sys_kexec_load() prototype to syscalls.h x86/sigreturn: use SYSCALL_DEFINE0 x86: fix sys_sigreturn() return type to be long, not unsigned long x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm() mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff() mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64() fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare() ...
2018-04-02	bitmap: fix memset optimization on big-endian systems	Omar Sandoval
	Commit 2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") introduced an optimization to bitmap_{set,clear}() which uses memset() when the start and length are constants aligned to a byte. This is wrong on big-endian systems; our bitmaps are arrays of unsigned long, so bit n is not at byte n / 8 in memory. This was caught by the Btrfs selftests, but the bitmap selftests also fail when run on a big-endian machine. We can still use memset if the start and length are aligned to an unsigned long, so do that on big-endian. The same problem applies to the memcmp in bitmap_equal(), so fix it there, too. Fixes: 2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") Fixes: 2c6deb01525a ("bitmap: use memcmp optimisation in more situations") Cc: stable@kernel.org Reported-by: "Erhard F." <erhard_f@mailbox.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-02	RISC-V: Fixes to module loading	Palmer Dabbelt
	This cleans up the module support that was commited earlier to work with what's actually emitted from our GCC port as it lands upstream. Most of the work here is adding new relocations to the kernel. There's some limitations on module loading imposed by the kernel: * The kernel doesn't support linker relaxation, which is necessary to support R_RISCV_ALIGN. In order to get reliable module building you're going to need to a GCC that supports the new '-mno-relax', which IIRC isn't going to be out until 8.1.0. It's somewhat unlikely that R_RISCV_ALIGN will appear in a module even without '-mno-relax' support, so issues shouldn't be common. * There is no large code model for RISC-V, which means modules must be loaded within a 32-bit signed offset of the kernel. We don't currently have any mechanism for ensuring this memory remains free or moving pages around, so issues here might be common. I fixed a singcle merge conflict in arch/riscv/kernel/Makefile.
2018-04-02	RISC-V: Assorted memory model fixes	Palmer Dabbelt
	These fixes fall into three categories * The definiton of __smp_{store_release,load_acquire}, which allow us to emit a full fence when unnecessary. * Fixes to avoid relying on the behavior of ".aqrl" atomics, as those are specified in the currently released RISC-V memory model draft in a way that makes them useless for Linux. This might change in the future, but now the code matches the memory model spec as it's written so at least we're getting closer to something sane. The actual fix is to delete the RISC-V specific atomics and drop back to generic versions that use the new fences from above. Cleanups to our atomic macros, which are mostly non-functional changes. Unfortunately I haven't given these as thorough of a testing as I probably should have, but I've poked through the code and they seem generally OK.
2018-04-02	RISC-V: Add dynamic ftrace support for RISC-V platforms	Palmer Dabbelt
	This patch set includes the building blocks of dynamic ftrace features for RISC-V machines. I'm afraid I'm not very familiar with ftrace, but the code looks OK to me. It's been used to track down a performance problem in our SPI driver and appears to work acceptably, but we haven't given it a whole lot of banging yet so there might still be some bugs lurking around somewhere.
2018-04-02	Merge tag 'arch-removal' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02	f2fs: make assignment of t->dentry_bitmap more readable	Yunlong Song
	In make_dentry_ptr_block, it is confused with "&" for t->dentry_bitmap but without "&" for t->dentry, so delete "&" to make code more readable. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-02	f2fs: truncate preallocated blocks in error case	Jaegeuk Kim
	If write is failed, we must deallocate the blocks that we couldn't write. Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-02	RISC-V: Add definition of relocation types	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Enable module support in defconfig	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support SUB32 relocation type in kernel module	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support ADD32 relocation type in kernel module	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support ALIGN relocation type in kernel module	Zong Li
	Just fail on align type. Kernel modules loader didn't do relax like linker, it is difficult to remove or migrate the code, but the remnant nop instructions harm the performaace of module. We expect the building module with the no-relax option. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support RVC_BRANCH/JUMP relocation type in kernel modulewq	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support HI20/LO12_I/LO12_S relocation type in kernel module	Zong Li
	HI20 and LO12_I/LO12_S relocate the absolute address, the range of offset must in 32-bit. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support CALL relocation type in kernel module	Zong Li
	Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Support GOT_HI20/CALL_PLT relocation type in kernel module	Zong Li
	For CALL_PLT, emit the plt entry only when offset is more than 32-bit. For PCREL_LO12, it uses the location of corresponding HI20 to get the address of external symbol. It should check the HI20 type is the PCREL_HI20 or GOT_HI20, because sometime the location will have two or more relocation types. For example: 0: 00000797 auipc a5,0x0 0: R_RISCV_ALIGN ABS 0: R_RISCV_GOT_HI20 SYMBOL 4: 0007b783 ld a5,0(a5) # 0 <SYMBOL> 4: R_RISCV_PCREL_LO12_I .L0 4: R_RISCV_RELAX ABS Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Add section of GOT.PLT for kernel module	Zong Li
	Separate the function symbol address from .plt to .got.plt section. The original plt entry has trampoline code with symbol address, there is a 32-bit padding bwtween jar instruction and symbol address. Extract the symbol address to .got.plt to reduce the module size. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	RISC-V: Add sections of PLT and GOT for kernel module	Zong Li
	The address of external symbols will locate more than 32-bit offset in 64-bit kernel with sv39 or sv48 virtual addressing. Module loader emits the GOT and PLT entries for data symbols and function symbols respectively. The PLT entry is a trampoline code for jumping to the 64-bit real address. The GOT entry is just the data symbol address. Signed-off-by: Zong Li <zong@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/atomic: Strengthen implementations with fences	Andrea Parri
	Atomics present the same issue with locking: release and acquire variants need to be strengthened to meet the constraints defined by the Linux-kernel memory consistency model [1]. Atomics present a further issue: implementations of atomics such as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs, which do not give full-ordering with .aqrl; for example, current implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test below to end up with the state indicated in the "exists" clause. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the atomics operations by replacing .rl and .aq with the use of ("lightweigth") fences, and by replacing .aqrl LR/SC pairs in sequences such as: 0: lr.w.aqrl %0, %addr bne %0, %old, 1f ... sc.w.aqrl %1, %new, %addr bnez %1, 0b 1: with sequences of the form: 0: lr.w %0, %addr bne %0, %old, 1f ... sc.w.rl %1, %new, %addr /* SC-release / bnez %1, 0b fence rw, rw / "full" fence / 1: following Daniel's suggestion. These modifications were validated with simulation of the RISC-V memory consistency model. C lr-sc-aqrl-pair-vs-full-barrier {} P0(int x, int y, atomic_t u) { int r0; int r1; WRITE_ONCE(x, 1); r0 = atomic_cmpxchg(u, 0, 1); r1 = READ_ONCE(y); } P1(int x, int y, atomic_t v) { int r0; int r1; WRITE_ONCE(y, 1); r0 = atomic_cmpxchg(v, 0, 1); r1 = READ_ONCE(*x); } exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Suggested-by: Daniel Lustig <dlustig@nvidia.com> Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-riscv@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/spinlock: Strengthen implementations with fences	Andrea Parri
	Current implementations map locking operations using .rl and .aq annotations. However, this mapping is unsound w.r.t. the kernel memory consistency model (LKMM) [1]: Referring to the "unlock-lock-read-ordering" test reported below, Daniel wrote: "I think an RCpc interpretation of .aq and .rl would in fact allow the two normal loads in P1 to be reordered [...] The intuition would be that the amoswap.w.aq can forward from the amoswap.w.rl while that's still in the store buffer, and then the lw x3,0(x4) can also perform while the amoswap.w.rl is still in the store buffer, all before the l1 x1,0(x2) executes. That's not forbidden unless the amoswaps are RCsc, unless I'm missing something. Likewise even if the unlock()/lock() is between two stores. A control dependency might originate from the load part of the amoswap.w.aq, but there still would have to be something to ensure that this load part in fact performs after the store part of the amoswap.w.rl performs globally, and that's not automatic under RCpc." Simulation of the RISC-V memory consistency model confirmed this expectation. In order to "synchronize" LKMM and RISC-V's implementation, this commit strengthens the implementations of the locking operations by replacing .rl and .aq with the use of ("lightweigth") fences, resp., "fence rw, w" and "fence r , rw". C unlock-lock-read-ordering {} /* s initially owned by P1 / P0(int x, int y) { WRITE_ONCE(x, 1); smp_wmb(); WRITE_ONCE(y, 1); } P1(int x, int y, spinlock_t s) { int r0; int r1; r0 = READ_ONCE(y); spin_unlock(s); spin_lock(s); r1 = READ_ONCE(x); } exists (1:r0=1 /\ 1:r1=0) [1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2 https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM https://marc.info/?l=linux-kernel&m=151633436614259&w=2 Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Daniel Lustig <dlustig@nvidia.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jade Alglave <j.alglave@ucl.ac.uk> Cc: Luc Maranget <luc.maranget@inria.fr> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Akira Yokosawa <akiyks@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-riscv@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/barrier: Define __smp_{store_release,load_acquire}	Andrea Parri
	Introduce __smp_{store_release,load_acquire}, and rely on the generic definitions for smp_{store_release,load_acquire}. This avoids the use of full ("rw,rw") fences on SMP. Signed-off-by: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support	Alan Kao
	In walk_stackframe, the pc now receives the address from calling ftrace_graph_ret_addr instead of manual calculation. Note that the original calculation, pc = frame->ra - 4 is buggy when the instruction at the return address happened to be a compressed inst. But since it is not a critical part of ftrace, it is ignored for now to ease the review process. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support	Alan Kao
	Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support	Alan Kao
	Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add dynamic function graph tracer support	Alan Kao
	Once the function_graph tracer is enabled, a filtered function has the following call sequence: * ftracer_caller ==> on/off by ftrace_make_call/ftrace_make_nop * ftrace_graph_caller * ftrace_graph_call ==> on/off by ftrace_en/disable_ftrace_graph_caller * prepare_ftrace_return Considering the following DYNAMIC_FTRACE_WITH_REGS feature, it would be more extendable to have a ftrace_graph_caller function, instead of calling prepare_ftrace_return directly in ftrace_caller. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-04-02	riscv/ftrace: Add dynamic function tracer support	Alan Kao
	We now have dynamic ftrace with the following added items: * ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c) The two functions turn each recorded call site of filtered functions into a call to ftrace_caller or nops * ftracce_update_ftrace_func (in kernel/ftrace.c) turns the nops at ftrace_call into a call to a generic entry for function tracers. * ftrace_caller (in kernel/mcount-dyn.S) The entry where each _mcount call sites calls to once they are filtered to be traced. Also, this patch fixes the semantic problems in mcount.S, which will be treated as only a reference implementation once we have the dynamic ftrace. Cc: Greentime Hu <greentime@andestech.com> Signed-off-by: Alan Kao <alankao@andestech.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>