summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-07-29ovl: clear nlink on rmdirMiklos Szeredi
To make delete notification work on fa/inotify. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: disallow overlayfs as upperdirMiklos Szeredi
This does not work and does not make sense. So instead of fixing it (probably not hard) just disallow. Reported-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-07-29ovl: fix warningMiklos Szeredi
There's a superfluous newline in the warning message in ovl_d_real(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: remove duplicated include from super.cWei Yongjun
Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: append MAY_READ when diluting write checksVivek Goyal
Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on lower/. This is done as files on lower will never be written and will be copied up. But to copy up a file, mounter should have MAY_READ permission otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is reset. Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned True (context mounts) but when he tried to actually write to file, it failed as mounter did not have permission on lower file. [SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this won't trigger a copy-up. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: dilute permission checks on lower only if not special fileVivek Goyal
Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from mask as lower/ will never be written and file will be copied up. But this is not true for special files. These files are not copied up and are opened in place. So don't dilute the checks for these types of files. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: fix POSIX ACL settingMiklos Szeredi
Setting POSIX ACL needs special handling: 1) Some permission checks are done by ->setxattr() which now uses mounter's creds ("ovl: do operations on underlying file system in mounter's context"). These permission checks need to be done with current cred as well. 2) Setting ACL can fail for various reasons. We do not need to copy up in these cases. In the mean time switch to using generic_setxattr. [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr() doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is disabled, and I could not come up with an obvious way to do it. This instead avoids the link error by defining two sets of ACL operations and letting the compiler drop one of the two at compile time depending on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code, also leading to smaller code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: share inode for hard linkMiklos Szeredi
Inode attributes are copied up to overlay inode (uid, gid, mode, atime, mtime, ctime) so generic code using these fields works correcty. If a hard link is created in overlayfs separate inodes are allocated for each link. If chmod/chown/etc. is performed on one of the links then the inode belonging to the other ones won't be updated. This patch attempts to fix this by sharing inodes for hard links. Use inode hash (with real inode pointer as a key) to make sure overlay inodes are shared for hard links on upper. Hard links on lower are still split (which is not user observable until the copy-up happens, see Documentation/filesystems/overlayfs.txt under "Non-standard behavior"). The inode is only inserted in the hash if it is non-directoy and upper. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: store real inode pointer in ->i_privateMiklos Szeredi
To get from overlay inode to real inode we currently use 'struct ovl_entry', which has lifetime connected to overlay dentry. This is okay, since each overlay dentry had a new overlay inode allocated. Following patch will break that assumption, so need to leave out ovl_entry. This patch stores the real inode directly in i_private, with the lowest bit used to indicate whether the inode is upper or lower. Lifetime rules remain, using ovl_inode_real() must only be done while caller holds ref on overlay dentry (and hence on real dentry), or within RCU protected regions. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: permission: return ECHILD instead of ENOENTMiklos Szeredi
The error is due to RCU and is temporary. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: update atime on upperMiklos Szeredi
Fix atime update logic in overlayfs. This patch adds an i_op->update_time() handler to overlayfs inodes. This forwards atime updates to the upper layer only. No atime updates are done on lower layers. Remove implicit atime updates to underlying files and directories with O_NOATIME. Remove explicit atime update in ovl_readlink(). Clear atime related mnt flags from cloned upper mount. This means atime updates are controlled purely by overlayfs mount options. Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: fix sgid on directoryMiklos Szeredi
When creating directory in workdir, the group/sgid inheritance from the parent dir was omitted completely. Fix this by calling inode_init_owner() on overlay inode and using the resulting uid/gid/mode to create the file. Unfortunately the sgid bit can be stripped off due to umask, so need to reset the mode in this case in workdir before moving the directory in place. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: simplify permission checkingMiklos Szeredi
The fact that we always do permission checking on the overlay inode and clear MAY_WRITE for checking access to the lower inode allows cruft to be removed from ovl_permission(). 1) "default_permissions" option effectively did generic_permission() on the overlay inode with i_mode, i_uid and i_gid updated from underlying filesystem. This is what we do by default now. It did the update using vfs_getattr() but that's only needed if the underlying filesystem can change (which is not allowed). We may later introduce a "paranoia_mode" that verifies that mode/uid/gid are not changed. 2) splitting out the IS_RDONLY() check from inode_permission() also becomes unnecessary once we remove the MAY_WRITE from the lower inode check. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: do not require mounter to have MAY_WRITE on lowerVivek Goyal
Now we have two levels of checks in ovl_permission(). overlay inode is checked with the creds of task while underlying inode is checked with the creds of mounter. Looks like mounter does not have to have WRITE access to files on lower/. So remove the MAY_WRITE from access mask for checks on underlying lower inode. This means task should still have the MAY_WRITE permission on lower inode and mounter is not required to have MAY_WRITE. It also solves the problem of read only NFS mounts being used as lower. If __inode_permission(lower_inode, MAY_WRITE) is called on read only NFS, it fails. By resetting MAY_WRITE, check succeeds and case of read only NFS shold work with overlay without having to specify any special mount options (default permission). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: do operations on underlying file system in mounter's contextVivek Goyal
Given we are now doing checks both on overlay inode as well underlying inode, we should be able to do checks and operations on underlying file system using mounter's context. So modify all operations to do checks/operations on underlying dentry/inode in the context of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: modify ovl_permission() to do checks on two inodesVivek Goyal
Right now ovl_permission() calls __inode_permission(realinode), to do permission checks on real inode and no checks are done on overlay inode. Modify it to do checks both on overlay inode as well as underlying inode. Checks on overlay inode will be done with the creds of calling task while checks on underlying inode will be done with the creds of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: define ->get_acl() for overlay inodesVivek Goyal
Now we are planning to do DAC permission checks on overlay inode itself. And to make it work, we will need to make sure we can get acls from underlying inode. So define ->get_acl() for overlay inodes and this in turn calls into underlying filesystem to get acls, if any. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: move some common code in a functionVivek Goyal
ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some common code which can be moved into a separate function. No functionality change. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: store ovl_entry in inode->i_private for all inodesAndreas Gruenbacher
Previously this was only done for directory inodes. Doing so for all inodes makes for a nice cleanup in ovl_permission at zero cost. Inodes are not shared for hard links on the overlay, so this works fine. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: use generic_delete_inodeMiklos Szeredi
No point in keeping overlay inodes around since they will never be reused. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29ovl: check mounter creds on underlying lookupMiklos Szeredi
The hash salting changes meant that we can no longer reuse the hash in the overlay dentry to look up the underlying dentry. Instead of lookup_hash(), use lookup_one_len_unlocked() and swith to mounter's creds (like we do for all other operations later in the series). Now the lookup_hash() export introduced in 4.6 by 3c9fe8cdff1b ("vfs: add lookup_hash() helper") is unused and can possibly be removed; its usefulness negated by the hash salting and the idea that mounter's creds should be used on operations on underlying filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 8387ff2577eb ("vfs: make the string hashes salt the hash")
2016-07-29avr32: off by one in at32_init_pio()Dan Carpenter
The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be >=. Fixes: 5f97f7f9400d ('[PATCH] avr32 architecture') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2016-07-29avr32: fixup code style in unistd.h and syscall_table.SHans-Christian Noren Egtvedt
This patch swaps the mix of tabs and space for alignment of comment after code to use spaces only. Also document why recvmmsg was defined twice in the syscall_table.S table, but only once in unistd.h. In short, wired in the table by generic arch patch, but forgotten in unistd.h (review slip).
2016-07-29avr32: wire up preadv2 and pwritev2 syscallsHans-Christian Noren Egtvedt
This patch wires up the new preadv2 and pwritev2 syscall on AVR32. On AVR32, all parameters beyond the 5th are passed on the stack. System calls don't use the stack -- they borrow a callee-saved register instead. This means that syscalls that take 6 parameters must be called through a stub that pushes the last parameter on the stack. Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
2016-07-29arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFOJames Hogan
AT_VECTOR_SIZE_ARCH should be defined with the maximum number of NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined for arm64 at all even though ARCH_DLINFO will contain one NEW_AUX_ENT for the VDSO address. This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for AT_BASE_PLATFORM which arm64 doesn't use, but lets define it now and add the comment above ARCH_DLINFO as found in several other architectures to remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to date. Fixes: f668cd1673aa ("arm64: ELF definitions") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-29arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinuxArd Biesheuvel
The linker routines that we rely on to produce a relocatable PIE binary treat it as a shared ELF object in some ways, i.e., it emits symbol based R_AARCH64_ABS64 relocations into the final binary since doing so would be appropriate when linking a shared library that is subject to symbol preemption. (This means that an executable can override certain symbols that are exported by a shared library it is linked with, and that the shared library *must* update all its internal references as well, and point them to the version provided by the executable.) Symbol preemption does not occur for OS hosted PIE executables, let alone for vmlinux, and so we would prefer to get rid of these symbol based relocations. This would allow us to simplify the relocation routines, and to strip the .dynsym, .dynstr and .hash sections from the binary. (Note that these are tiny, and are placed in the .init segment, but they clutter up the vmlinux binary.) Note that these R_AARCH64_ABS64 relocations are only emitted for absolute references to symbols defined in the linker script, all other relocatable quantities are covered by anonymous R_AARCH64_RELATIVE relocations that simply list the offsets to all 64-bit values in the binary that need to be fixed up based on the offset between the link time and run time addresses. Fortunately, GNU ld has a -Bsymbolic option, which is intended for shared libraries to allow them to ignore symbol preemption, and unconditionally bind all internal symbol references to its own definitions. So set it for our PIE binary as well, and get rid of the asoociated sections and the relocation code that processes them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: fixed conflict with __dynsym_offset linker script entry] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-29arm64: vmlinux.lds: make __rela_offset and __dynsym_offset ABSOLUTEArd Biesheuvel
Due to the untyped KIMAGE_VADDR constant, the linker may not notice that the __rela_offset and __dynsym_offset expressions are absolute values (i.e., are not subject to relocation). This does not matter for KASLR, but it does confuse kallsyms in relative mode, since it uses the lowest non-absolute symbol address as the anchor point, and expects all other symbol addresses to be within 4 GB of it. Fix this by qualifying these expressions as ABSOLUTE() explicitly. Fixes: 0cd3defe0af4 ("arm64: kernel: perform relocation processing from ID map") Cc: <stable@vger.kernel.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-29mmc: rtsx_pci: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar
The workqueue "workq" provides support for sd/mmc async request, which makes next request do dma_map_sg() while previous request transferring data. The workqueue has a single workitem(&host->work) and hence doesn't require ordering. Also, it is not being used on a memory reclaim path. Hence, the singlethreaded workqueue has been replaced with the use of system_wq. System workqueues have been able to handle high level of concurrency for a long time now and hence it's not required to have a singlethreaded workqueue just to gain concurrency. Unlike a dedicated per-cpu workqueue created with create_singlethread_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantee unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Work item has been flushed in rtsx_pci_sdmmc_drv_remove() to ensure that there are no pending tasks while disconnecting the driver. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29mmc: rtsx_pci: Enable MMC_CAP_ERASE to allow erase/discard/trim requestsUlf Hansson
Cc: Micky Ching <micky_ching@realsil.com.cn> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Mauro Santos <registo.mailling@gmail.com>
2016-07-29mmc: rtsx_pci: Use the provided busy timeout from the mmc coreUlf Hansson
The rtsx_pci driver is using a fixed 3s timeout for R1B responses, which in some cases isn't suffient. For example, erase/discard requests may require longer timeouts. Instead of always using a fixed timeout, let's use the per request calculated busy timeout from the mmc core. Cc: Micky Ching <micky_ching@realsil.com.cn> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Mauro Santos <registo.mailling@gmail.com>
2016-07-29mmc: sdhci-pltfm: Drop define for SDHCI_PLTFM_PMOPSUlf Hansson
Due to previous changes this define has no longer a purpose. Instead move the sdhci-pltfm drivers over to use the exported struct sdhci_pltfm_pmops. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29mmc: sdhci-pltfm: Convert to use the SET_SYSTEM_SLEEP_PM_OPSUlf Hansson
Move the system PM callbacks within #ifdef CONFIG_PM_SLEEP as to avoid them being build when not used. This also allows us to use the SET_SYSTEM_SLEEP_PM_OPS macro which simplifies the code. Within this context it also makes sense to move the declaration of the struct sdhci_pltfm_pmops, outside the #ifdef CONFIG_PM as the SET_SYSTEM_SLEEP_PM_OPS deals with this. This further simplifies the code. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29mmc: sdhci-pltfm: Make sdhci_pltfm_suspend|resume() staticUlf Hansson
There are no users left of these exported APIs, so let's make them static. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-07-29mmc: sdhci-esdhc-imx: Use common sdhci_suspend|resume_host()Ulf Hansson
To prepare to make the sdhci_pltfm_suspend|resume() static functions, move sdhci-esdhc-imx over to use the sdhci_suspend|resume_host(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
2016-07-29mmc: sdhci-esdhc-imx: Assign system PM ops within #ifdef CONFIG_PM_SLEEPUlf Hansson
The system PM callbacks isn't used unless CONFIG_PM_SLEEP is set, thus it triggers a compiler warning about unused functions. Avoid this by changing from CONFIG_PM to CONFIG_PM_SLEEP. Reported-by: Arnd Bergmann <arnd@arndb.de> Fixes: b70d0b3b5b29 ("mmc: sdhci-esdhc-imx: add esdhc specific suspend resume callback") Cc: Dong Aisheng <aisheng.dong@nxp.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
2016-07-29MIPS: c-r4k: Use SMP calls for CM indexed cache opsJames Hogan
The MIPS Coherence Manager (CM) can propagate address-based ("hit") cache operations to other cores in the coherent system, alleviating software of the need to use SMP calls, however indexed cache operations are not propagated by hardware since doing so makes no sense for separate caches. Update r4k_op_needs_ipi() to report that only hit cache operations are globalized by the CM, requiring indexed cache operations to be globalized by software via an SMP call. r4k_on_each_cpu() previously had a special case for CONFIG_MIPS_MT_SMP, intended to avoid the SMP calls when the only other CPUs in the system were other VPEs in the same core, and hence sharing the same caches. This was changed by commit cccf34e9411c ("MIPS: c-r4k: Fix cache flushing for MT cores") to apparently handle multi-core multi-VPE systems, but it focussed mainly on hit cache ops, so the SMP calls were still disabled entirely for CM systems. This doesn't normally cause problems, but tests can be written to hit these corner cases by using multiple threads, or changing task affinities to force the process to migrate cores. For example the failure of mprotect RW->RX to globally sync icaches (via flush_cache_range) can be detected by modifying and mprotecting a code page on one core, and migrating to a different core to execute from it. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13807/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Avoid small flush_icache_range SMP callsJames Hogan
Avoid SMP calls for flushing small icache ranges. On non-CM platforms, and CM platforms too after we make r4k_on_each_cpu() take the cache op type into account, it will be called on multiple CPUs due to the possibility that local_r4k_flush_icache_range_ipi() could do non-globalized indexed cache ops. This rougly copies the range size check out into r4k_flush_icache_range(), which can disallow indexed cache ops and allow r4k_on_each_cpu() to skip the SMP call. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13805/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Local flush_icache_range cache op overrideJames Hogan
Allow the permitted cache op types used by local_r4k_flush_icache_range_ipi() to be overridden by the SMP caller. This will allow SMP calls to be avoided under certain circumstances, falling back to a single CPU performing globalized hit cache ops only. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13803/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Split r4k_flush_kernel_vmap_range()James Hogan
Split the operation of r4k_flush_kernel_vmap_range() into separate SMP callbacks for the indexed cache flush and hit cache flush cases, since the logic to determine which to use can be determined by the initiating CPU prior to doing any SMP calls. This will help when we change r4k_on_each_cpu() to distinguish indexed and hit cache ops in a later patch, preventing globalized hit cache ops being performed redundantly on multiple CPUs. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13806/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Exclude sibling CPUs in SMP callsJames Hogan
When performing SMP calls to foreign cores, exclude sibling CPUs from the provided map, as we already handle the local core on the current CPU. This prevents an SMP call from for example core 0, VPE 1 to VPE 0 on the same core. In the process the cpu_foreign_map cpumask is turned into an array of cpumasks, so that each CPU has its own version of it which excludes sibling CPUs. r4k_op_needs_ipi() is also updated to reflect that cache management SMP calls are not needed when all CPUs are siblings (i.e. there are no foreign CPUs according to the new cpu_foreign_map[] semantics which exclude siblings). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: Felix Fietkau <nbd@nbd.name> Cc: Jayachandran C. <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13801/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Fix valid ASID optimisationJames Hogan
Several cache operations are optimised to return early from the SMP call handler if the memory map in question has no valid ASID on the current CPU, or any online CPU in the case of MIPS_MT_SMP. The idea is that if a memory map has never been used on a CPU it shouldn't have cache lines in need of flushing. However this doesn't cover all cases when ASIDs for other CPUs need to be checked: - Offline VPEs may have recently been online and brought lines into the (shared) cache, so they should also be checked, rather than only online CPUs. - SMP systems with a Coherence Manager (CM), but with MT disabled still have globalized hit cache ops, but don't use SMP calls, so all present CPUs should be taken into account. - R6 systems have a different multithreading implementation, so MIPS_MT_SMP won't be set, but as above may still have a CM which globalizes hit cache ops. Additionally for non-globalized cache operations where an SMP call to a single VPE in each foreign core is used, it is not necessary to check every CPU in the system, only sibling CPUs sharing the same first level cache. Fix this by making has_valid_asid() take a cache op type argument like r4k_on_each_cpu(), so it can determine whether r4k_on_each_cpu() will have done SMP calls to other cores. It can then determine which set of CPUs to check the ASIDs of based on that, excluding foreign CPUs if an SMP call will have been performed. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13804/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Add r4k_on_each_cpu cache op type argJames Hogan
The r4k_on_each_cpu() function calls the specified cache flush helper on other CPUs if deemed necessary due to the cache ops not being globalized by hardware. However this really depends on the cache op addressing type, as the MIPS Coherence Manager (CM) if present will globalize "hit" cache ops (addressed by virtual address), but not "index" cache ops (addressed by cache index). This results in index cache ops only being performed on a single CPU when CM is present. Most (but not all) of the functions called by r4k_on_each_cpu() perform cache operations exclusively with a single cache op type, so add a type argument and modify the callers to pass in some combination of R4K_HIT (global kernel virtual addressing or user virtual addressing conditional upon matching active_mm) and R4K_INDEX (index into cache). This will allow r4k_on_each_cpu() to later distinguish these cases and decide whether to perform an SMP call based on it. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13798/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Avoid dcache flush for sigtrampsJames Hogan
Avoid the dcache and scache flush in local_r4k_flush_cache_sigtramp() if the icache fills straight from the dcache. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13802/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Fix sigtramp SMP call to use kmapJames Hogan
Fix r4k_flush_cache_sigtramp() and local_r4k_flush_cache_sigtramp() to flush the delay slot emulation trampoline cacheline through a kmap rather than directly when the active_mm doesn't match that of the task initiating the flush, a bit like local_r4k_flush_cache_page() does. This would fix a corner case on SMP systems without hardware globalized hit cache ops, where a migration to another CPU after the flush, where that CPU did not have the same mm active at the time of the flush, could result in stale icache content being executed instead of the trampoline, e.g. from a previous delay slot emulation with a similar stack pointer. This case was artificially triggered by replacing the icache flush with a full indexed flush (not globalized on CM systems) and forcing the SMP call to take place, with a test program that alternated two FPU delay slots with a parent process repeatedly changing scheduler affinity. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13797/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: c-r4k: Fix protected_writeback_scache_line for EVAJames Hogan
The protected_writeback_scache_line() function is used by local_r4k_flush_cache_sigtramp() to flush an FPU delay slot emulation trampoline on the userland stack from the caches so it is visible to subsequent instruction fetches. Commit de8974e3f76c ("MIPS: asm: r4kcache: Add EVA cache flushing functions") updated some protected_ cache flush functions to use EVA CACHEE instructions via protected_cachee_op(), and commit 83fd43449baa ("MIPS: r4kcache: Add EVA case for protected_writeback_dcache_line") did the same thing for protected_writeback_dcache_line(), but protected_writeback_scache_line() never got updated. Lets fix that now to flush the right user address from the secondary cache rather than some arbitrary kernel unmapped address. This issue was spotted through code inspection, and it seems unlikely to be possible to hit this in practice. It theoretically affect EVA kernels on EVA capable cores with an L2 cache, where the icache fetches straight from RAM (cpu_icache_snoops_remote_store == 0), running a hard float userland with FPU disabled (nofpu). That both Malta and Boston platforms override cpu_icache_snoops_remote_store to 1 suggests that all MIPS cores fetch instructions into icache straight from L2 rather than RAM. Fixes: de8974e3f76c ("MIPS: asm: r4kcache: Add EVA cache flushing functions") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13800/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: SMP: Drop stop_this_cpu() cpu_foreign_map hackJames Hogan
Commit cccf34e9411c ("MIPS: c-r4k: Fix cache flushing for MT cores") added the cpu_foreign_map cpumask containing a single VPE from each online core, and recalculated it when secondary CPUs are brought up. stop_this_cpu() was also updated to recalculate cpu_foreign_map, but with an additional hack before marking the CPU as offline to copy cpu_online_mask into cpu_foreign_map and perform an SMP memory barrier. This appears to have been intended to prevent cache management IPIs being missed when the VPE representing the core in cpu_foreign_map is taken offline while other VPEs remain online. Unfortunately there is nothing in this hack to prevent r4k_on_each_cpu() from reading the old cpu_foreign_map, and smp_call_function_many() from reading that new cpu_online_mask with the core's representative VPE marked offline. It then wouldn't send an IPI to any online VPEs of that core. stop_this_cpu() is only actually called in panic and system shutdown / halt / reboot situations, in which case all CPUs are going down and we don't really need to care about cache management, so drop this hack. Note that the __cpu_disable() case for CPU hotplug is handled in the previous commit, and no synchronisation is needed there due to the use of stop_machine() which prevents hotplug from taking place while any CPU has disabled preemption (as r4k_on_each_cpu() does). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13796/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: SMP: Update cpu_foreign_map on CPU disableJames Hogan
When a CPU is disabled via CPU hotplug, cpu_foreign_map is not updated. This could result in cache management SMP calls being sent to offline CPUs instead of online siblings in the same core. Add a call to calculate_cpu_foreign_map() in the various MIPS cpu disable callbacks after set_cpu_online(). All cases are updated for consistency and to keep cpu_foreign_map strictly up to date, not just those which may support hardware multithreading. Fixes: cccf34e9411c ("MIPS: c-r4k: Fix cache flushing for MT cores") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: David Daney <david.daney@cavium.com> Cc: Kevin Cernekee <cernekee@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Hongliang Tao <taohl@lemote.com> Cc: Hua Yan <yanh@lemote.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13799/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-29MIPS: SMP: Clear ASID without confusing has_valid_asid()James Hogan
The SMP flush_tlb_*() functions may clear the memory map's ASIDs for other CPUs if the mm has only a single user (the current CPU) in order to avoid SMP calls. However this makes it appear to has_valid_asid(), which is used by various cache flush functions, as if the CPUs have never run in the mm, and therefore can't have cached any of its memory. For flush_tlb_mm() this doesn't sound unreasonable. flush_tlb_range() corresponds to flush_cache_range() which does do full indexed cache flushes, but only on the icache if the specified mapping is executable, otherwise it doesn't guarantee that there are no cache contents left for the mm. flush_tlb_page() corresponds to flush_cache_page(), which will perform address based cache ops on the specified page only, and also only touches the icache if the page is executable. It does not guarantee that there are no cache contents left for the mm. For example, this affects flush_cache_range() which uses the has_valid_asid() optimisation. It is required to flush the icache when mappings are made executable (e.g. using mprotect) so they are immediately usable. If some code is changed to non executable in order to be modified then it will not be flushed from the icache during that time, but the ASID on other CPUs may still be cleared for TLB flushing. When the code is changed back to executable, flush_cache_range() will assume the code hasn't run on those other CPUs due to the zero ASID, and won't invalidate the icache on them. This is fixed by clearing the other CPUs ASIDs to 1 instead of 0 for the above two flush_tlb_*() functions when the corresponding cache flushes are likely to be incomplete (non executable range flush, or any page flush). This ASID appears valid to has_valid_asid(), but still triggers ASID regeneration due to the upper ASID version bits being 0, which is less than the minimum ASID version of 1 and so always treated as stale. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/13795/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-07-28sparc64 mm: Fix base TSB sizing when hugetlb pages are usedMike Kravetz
do_sparc64_fault() calculates both the base and huge page RSS sizes and uses this information in calls to tsb_grow(). The calculation for base page TSB size is not correct if the task uses hugetlb pages. hugetlb pages are not accounted for in RSS, therefore the call to get_mm_rss(mm) does not include hugetlb pages. However, the number of pages based on huge_pte_count (which does include hugetlb pages) is subtracted from this value. This will result in an artificially small and often negative RSS calculation. The base TSB size is then often set to max_tsb_size as the passed RSS is unsigned, so a negative value looks really big. THP pages are also accounted for in huge_pte_count, and THP pages are accounted for in RSS so the calculation in do_sparc64_fault() is correct if a task only uses THP pages. A single huge_pte_count is not sufficient for TSB sizing if both hugetlb and THP pages can be used. Instead of a single counter, use two: one for hugetlb and one for THP. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-29drm: aux ->transfer() can return 0, deal with itVille Syrjälä
Restore the correct behaviour (as in check msg.reply) when aux ->transfer() returns 0. It got removed in commit 82922da39190 ("drm/dp_helper: Retry aux transactions on all errors") Now I can actually dump the "entire" DPCD on a Dell UP2314Q with ddrescue. It has some offsets in the DPCD that can't be read for some resaon, all you get is defers. Previously ddrescue would just give up at the first unredable offset on account of read() returning 0 means EOF. Here's the ddrescue log for the interested: 0x00000000 0x00001400 + 0x00001400 0x00000030 - 0x00001430 0x000001D0 + 0x00001600 0x00000030 - 0x00001630 0x0001F9D0 + 0x00021000 0x00000001 - 0x00021001 0x000DEFFF + Cc: Lyude <cpaul@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch Cc: stable@vger.kernel.org Fixes: 82922da39190 ("drm/dp_helper: Retry aux transactions on all errors") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>