summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-05-05NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSIONTrond Myklebust
If the server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION because we are trunking, then RECLAIM_COMPLETE must handle that by calling nfs4_schedule_session_recovery() and then retrying. Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2017-05-05tcp: randomize timestamps on syncookiesEric Dumazet
Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-05fbdev: sti: don't select CONFIG_VTArnd Bergmann
While working on another build error, I ran into several variations of this dependency loop: subsection "Kconfig recursive dependency limitations" drivers/input/Kconfig:8: symbol INPUT is selected by VT For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/tty/Kconfig:12: symbol VT is selected by FB_STI For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/video/fbdev/Kconfig:677: symbol FB_STI depends on FB For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/video/fbdev/Kconfig:5: symbol FB is selected by DRM_KMS_FB_HELPER For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/Kconfig:72: symbol DRM_KMS_FB_HELPER is selected by DRM_KMS_CMA_HELPER For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/Kconfig:137: symbol DRM_KMS_CMA_HELPER is selected by DRM_HDLCD For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/arm/Kconfig:6: symbol DRM_HDLCD depends on OF For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/of/Kconfig:4: symbol OF is selected by X86_INTEL_CE For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:523: symbol X86_INTEL_CE depends on X86_IO_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:1011: symbol X86_IO_APIC depends on X86_LOCAL_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:1005: symbol X86_LOCAL_APIC depends on X86_UP_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:980: symbol X86_UP_APIC depends on PCI_MSI For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/pci/Kconfig:11: symbol PCI_MSI is selected by AMD_IOMMU For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/iommu/Kconfig:106: symbol AMD_IOMMU depends on IOMMU_SUPPORT For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/iommu/Kconfig:5: symbol IOMMU_SUPPORT is selected by DRM_ETNAVIV For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/etnaviv/Kconfig:2: symbol DRM_ETNAVIV depends on THERMAL For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/thermal/Kconfig:5: symbol THERMAL is selected by ACPI_VIDEO For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/acpi/Kconfig:183: symbol ACPI_VIDEO is selected by INPUT This doesn't currently show up as I fixed the 'THERMAL' part of it, but I noticed that the FB_STI dependency should not be there but was introduced by slightly incorrect bug-fix patch that tried to fix a link error. Instead of selecting 'VT' to make us enter the drivers/video/console directory at compile-time, it's sufficient to build the drivers/video/console/sticore.c file by adding its directory to when CONFIG_FB_STI is enabled. Alternatively, we could move the sticore code to another directory that is always built when we have at STI_CONSOLE or FB_STI enabled. Fixes: 17085a934592 ("parisc: stifb: should depend on STI_CONSOLE") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2017-05-05bridge: netlink: account for IFLA_BRPORT_{B, M}CAST_FLOOD size and policyTobias Klauser
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size() when they were added. Do so now and also add the corresponding policy entries: Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Mike Manning <mmanning@brocade.com> Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag") Fixes: 99f906e9ad7b ("bridge: add per-port broadcast flood flag") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-05CIFS: add misssing SFM mapping for doublequoteBjörn Jacke
SFM is mapping doublequote to 0xF020 Without this patch creating files with doublequote fails to Windows/Mac Signed-off-by: Bjoern Jacke <bjacke@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> CC: stable <stable@vger.kernel.org>
2017-05-05Merge branch 'backup-thermal-shutdown' into nextZhang Rui
2017-05-05Merge branches 'thermal-core' and 'thermal-intel' into nextZhang Rui
2017-05-05arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUSCatalin Marinas
While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit 44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable() have been broken by passing a physically contiguous vm_struct with an invalid pages pointer through the common iommu API. Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA, this patch simply reuses the existing swiotlb logic for mmap and get_sgtable. Note that the current implementation of get_sgtable (both swiotlb and iommu) is broken if dma_declare_coherent_memory() is used since such memory does not have a corresponding struct page. To be addressed in a subsequent patch. Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") Reported-by: Andrzej Hajda <a.hajda@samsung.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-05-05befs: make export work with cold dcacheFabian Frederick
based on commit b3b42c0deaa1 ("fs/affs: make export work with cold dcache") This adds get_parent function so that nfs client can still work after cache drop (Tested on NFS v4 with echo 3 > /proc/sys/vm/drop_caches) Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
2017-05-05ovl: update documentation w.r.t. constant inode numbersAmir Goldstein
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: persistent inode numbers for upper hardlinksAmir Goldstein
An upper type non directory dentry that is a copy up target should have a reference to its lower copy up origin. There are three ways for an upper type dentry to be instantiated: 1. A lower type dentry that is being copied up 2. An entry that is found in upper dir by ovl_lookup() 3. A negative dentry is hardlinked to an upper type dentry In the first case, the lower reference is set before copy up. In the second case, the lower reference is found by ovl_lookup(). In the last case of hardlinked upper dentry, it is not easy to update the lower reference of the negative dentry. Instead, drop the newly hardlinked negative dentry from dcache and let the next access call ovl_lookup() to find its lower reference. This makes sure that the inode number reported by stat(2) after the hardlink is created is the same inode number that will be reported by stat(2) after mount cycle, which is the inode number of the lower copy up origin of the hardlink source. NOTE that this does not fix breaking of lower hardlinks on copy up, but only fixes the case of lower nlink == 1, whose upper copy up inode is hardlinked in upper dir. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: merge getattr for dir and nondirMiklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: constant st_ino/st_dev across copy upAmir Goldstein
When all layers are on the same underlying filesystem, let stat(2) return st_dev/st_ino values of the copy up origin inode if it is known. This results in constant st_ino/st_dev representation of files in an overlay mount before and after copy up. When the underlying filesystem support NFS exportfs, the result is also persistent st_ino/st_dev representation before and after mount cycle. Lower hardlinks are broken on copy up to different upper files, so we cannot use the lower origin st_ino for those different files, even for the same fs case. When all overlay layers are on the same fs, use overlay st_dev for non-dirs to get the correct result from du -x. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: persistent inode number for directoriesAmir Goldstein
stat(2) on overlay directories reports the overlay temp inode number, which is constant across copy up, but is not persistent. When all layers are on the same fs, report the copy up origin inode number for directories. This inode number is persistent, unique across the overlay mount and constant across copy up. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: set the ORIGIN type flagAmir Goldstein
For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE. Define a new type flag OVL_TYPE_ORIGIN to indicate that an entry holds a reference to its lower copy up origin. For directory entries ORIGIN := MERGE && UPPER. For non-dir entries ORIGIN means that a lower type dentry has been recently copied up or that we were able to find the copy up origin from overlay.origin xattr. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: lookup non-dir copy-up-origin by file handleAmir Goldstein
If overlay.origin xattr is found on a non-dir upper inode try to get lower dentry by calling exportfs_decode_fh(). On failure to lookup by file handle to lower layer, do not lookup the copy up origin by name, because the lower found by name could be another file in case the upper file was renamed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: use an auxiliary var for overlay root entryAmir Goldstein
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: store file handle of lower inode on copy upAmir Goldstein
Sometimes it is interesting to know if an upper file is pure upper or a copy up target, and if it is a copy up target, it may be interesting to find the copy up origin. This will be used to preserve lower inode numbers across copy up. Store the lower inode file handle in upper inode extended attribute overlay.origin on copy up to use it later for these cases. Store the lower filesystem uuid along side the file handle, so we can validate that we are looking for the origin file in the original fs. If lower fs does not support NFS export ops store a zero sized xattr so we can always use the overlay.origin xattr to distinguish between a copy up and a pure upper inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05ovl: check if all layers are on the same fsAmir Goldstein
Some features can only work when all layers are on the same fs. Test this condition during mount time, so features can check them later. Add helper ovl_same_sb() to return the common super block in case all layers are on the same fs. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-05-05xen/x86: Do not call xen_init_time_ops() until shared_info is initializedBoris Ostrovsky
Routines that are set by xen_init_time_ops() use shared_info's pvclock_vcpu_time_info area. This area is not properly available until shared_info is mapped in xen_setup_shared_info(). This became especially problematic due to commit dd759d93f4dd ("x86/timers: Add simple udelay calibration") where we end up reading tsc_to_system_mul from xen_dummy_shared_info (i.e. getting zero value) and then trying to divide by it in pvclock_tsc_khz(). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-05x86/xen: fix xsave capability settingJuergen Gross
Commit 690b7f10b4f9f ("x86/xen: use capabilities instead of fake cpuid values for xsave") introduced a regression as it tried to make use of the fixup feature before it being available. Fall back to the old variant testing via cpuid(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-05kvm: nVMX: Don't validate disabled secondary controlsJim Mattson
According to the SDM, if the "activate secondary controls" primary processor-based VM-execution control is 0, no checks are performed on the secondary processor-based VM-execution controls. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-05thermal: core: Add a back up thermal shutdown mechanismKeerthy
orderly_poweroff is triggered when a graceful shutdown of system is desired. This may be used in many critical states of the kernel such as when subsystems detects conditions such as critical temperature conditions. However, in certain conditions in system boot up sequences like those in the middle of driver probes being initiated, userspace will be unable to power off the system in a clean manner and leaves the system in a critical state. In cases like these, the /sbin/poweroff will return success (having forked off to attempt powering off the system. However, the system overall will fail to completely poweroff (since other modules will be probed) and the system is still functional with no userspace (since that would have shut itself off). However, there is no clean way of detecting such failure of userspace powering off the system. In such scenarios, it is necessary for a backup workqueue to be able to force a shutdown of the system when orderly shutdown is not successful after a configurable time period. Reported-by: Nishanth Menon <nm@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05thermal: core: Allow orderly_poweroff to be called only onceKeerthy
thermal_zone_device_check --> thermal_zone_device_update --> handle_thermal_trip --> handle_critical_trips --> orderly_poweroff The above sequence happens every 250/500 mS based on the configuration. The orderly_poweroff function is getting called every 250/500 mS. With a full fledged file system it takes at least 5-10 Seconds to power off gracefully. In that period due to the thermal_zone_device_check triggering periodically the thermal work queues bombard with orderly_poweroff calls multiple times eventually leading to failures in gracefully powering off the system. Make sure that orderly_poweroff is called only once. Signed-off-by: Keerthy <j-keerthy@ti.com> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05Thermal: Intel SoC DTS: Change interrupt request behaviorBrian Bian
The interrupt request call in Intel SoC DTS driver may fail if there is no underlying BIOS support. However, the user space thermal daemon can still use the thermal zones created by the SoC DTS driver in polling mode, therefore, instead of bailing out on interrupt request failures, it is better just to log a warning message and continue the init process. Signed-off-by: Brian Bian <brian.bian@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2017-05-05trace: thermal: add another parameter 'power' to the tracing functionLukasz Luba
This patch adds another parameter to the trace function: trace_thermal_power_devfreq_get_power(). In case when we call directly driver's code for the real power, we do not have static/dynamic_power values. Instead we get total power in the '*power' value. The 'static_power' and 'dynamic_power' are set to 0. Therefore, we have to trace that '*power' value in this scenario. CC: Steven Rostedt <rostedt@goodmis.org> CC: Ingo Molnar <mingo@redhat.com> CC: Zhang Rui <rui.zhang@intel.com> CC: Eduardo Valentin <edubezval@gmail.com> Acked-by: Javi Merino <javi.merino@kernel.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
2017-05-05thermal: devfreq_cooling: add new interface for direct power readLukasz Luba
This patch introduces a new interface for device drivers connected to devfreq_cooling in the thermal framework: get_real_power(). Some devices have more sophisticated methods (like power counters) to approximate the actual power that they use. In the previous implementation we had a pre-calculated power table which was then scaled by 'utilization' ('busy_time' and 'total_time' taken from devfreq 'last_status'). With this new interface the driver can provide more precise data regarding actual power to the thermal governor every time the power budget is calculated. We then use this value and calculate the real resource utilization scaling factor. Reviewed-by: Chris Diamand <chris.diamand@arm.com> Acked-by: Javi Merino <javi.merino@kernel.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
2017-05-05thermal: devfreq_cooling: refactor code and add get_voltage functionLukasz Luba
Move the code which gets the voltage for a given frequency. This code will be resused in few places. Acked-by: Javi Merino <javi.merino@kernel.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
2017-05-05initramfs: Always do fput() and load modules after rootfs populateStafford Horne
In OpenRISC we do not have a bootloader passed initrd, but the built in initramfs does contain the /init and other binaries, including modules. The previous commit 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available, this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times. Fixes: 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") Cc: stable@vger.kernel.org Cc: Lokesh Vutla <lokeshvutla@ti.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stafford Horne <shorne@gmail.com>
2017-05-04Merge branch 'for-4.12/dax' into libnvdimm-for-nextDan Williams
2017-05-05x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang ↵Matthias Kaehlcke
incompatibility The constraint "rm" allows the compiler to put mix_const into memory. When the input operand is a memory location then MUL needs an operand size suffix, since Clang can't infer the multiplication width from the operand. Add and use the _ASM_MUL macro which determines the operand size and resolves to the NUL instruction with the corresponding suffix. This fixes the following error when building with clang: CC arch/x86/lib/kaslr.o /tmp/kaslr-dfe1ad.s: Assembler messages: /tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Cc: Grant Grundler <grundler@chromium.org> Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Davidson <md@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170501224741.133938-1-mka@chromium.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-05powerpc/64e: Don't place the stack beyond TASK_SIZEScott Wood
Commit f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") increased the task size on book3s, and introduced a mechanism to dynamically control whether a task uses these larger addresses. While the change to the task size itself was ifdef-protected to only apply on book3s, the change to STACK_TOP_USER64 was not. On book3e, this had the effect of trying to use addresses up to 128TiB for the stack despite a 64TiB task size limit -- which broke 64-bit userspace producing the following errors: Starting init: /sbin/init exists but couldn't execute it (error -14) Starting init: /bin/sh exists but couldn't execute it (error -14) Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Scott Wood <oss@buserror.net>
2017-05-05x86/mm: Fix boot crash caused by incorrect loop count calculation in ↵Baoquan He
sync_global_pgds() Jeff Moyer reported that on his system with two memory regions 0~64G and 1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR will make the system hang intermittently during boot. While adding 'nokaslr' won't. The back trace is: Oops: 0000 [#1] SMP RIP: memcpy_erms() [ .... ] Call Trace: pmem_rw_page() bdev_read_page() do_mpage_readpage() mpage_readpages() blkdev_readpages() __do_page_cache_readahead() force_page_cache_readahead() page_cache_sync_readahead() generic_file_read_iter() blkdev_read_iter() __vfs_read() vfs_read() SyS_read() entry_SYSCALL_64_fastpath() This crash happens because the for loop count calculation in sync_global_pgds() is not correct. When a mapping area crosses PGD entries, we should calculate the starting address of region which next PGD covers and assign it to next for loop count, but not add PGDIR_SIZE directly. The old code works right only if the mapping area is an exact multiple of PGDIR_SIZE, otherwize the end region could be skipped so that it can't be synchronized to all other processes from kernel PGD init_mm.pgd. In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it makes this area be mapped inside one PGD entry. With KASLR enabled, this area could cross two PGD entries, then the next PGD entry won't be synced to all other processes. That is why we saw empty PGD. Fix it. Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jinbum Park <jinb.park7@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-05Merge branch 'linus' into x86/urgent, to pick up dependent commitsIngo Molnar
We are going to fix a bug introduced by a more recent commit, so refresh the tree. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-05stackprotector: Increase the per-task stack canary's random range from 32 ↵Daniel Micay
bits to 64 bits on 64-bit platforms The stack canary is an 'unsigned long' and should be fully initialized to random data rather than only 32 bits of random data. Signed-off-by: Daniel Micay <danielmicay@gmail.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arjan van Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-05x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic()Josh Poimboeuf
Andrey Konovalov reported the following warning while fuzzing the kernel with syzkaller: WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0 The unwinder dump revealed that RBP had a bad value when an interrupt occurred in csum_partial_copy_generic(). That function saves RBP on the stack and then overwrites it, using it as a scratch register. That's problematic because it breaks stack traces if an interrupt occurs in the middle of the function. Replace the usage of RBP with another callee-saved register (R15) so stack traces are no longer affected. Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: David S . Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: linux-sctp@vger.kernel.org Cc: netdev <netdev@vger.kernel.org> Cc: syzkaller <syzkaller@googlegroups.com> Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-04tcmu: fix module removal due to stuck threadMike Christie
We need to do a kthread_should_stop to check when kthread_stop has been called. This was a regression added in b6df4b79a5514a9c6c53533436704129ef45bf76 tcmu: Add global data block pool support so not sure if you wanted to merge it in with that patch or what. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04target: Don't force session reset if queue_depth does not changeNicholas Bellinger
Keeping in the idempotent nature of target_core_fabric_configfs.c, if a queue_depth value is set and it's the same as the existing value, don't attempt to force session reinstatement. Reported-by: Raghu Krishnamurthy <rk@datera.io> Cc: Raghu Krishnamurthy <rk@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatementNicholas Bellinger
While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs us blocked followed by a iscsi login driven session reinstatement. Note this patch depends on commit d36ad77f702 to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo <ghg@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04target: Fix compare_and_write_callback handling for non GOOD statusNicholas Bellinger
Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit 9b2792c3da1, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari <wgb@datera.io> Cc: Bill Borsari <wgb@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04libnvdimm, pfn: fix 'npfns' vs section alignmentDan Williams
Fix failures to create namespaces due to the vmem_altmap not advertising enough free space to store the memmap. WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0 [..] Call Trace: dump_stack+0x63/0x83 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xde/0xf0 devm_memremap_pages+0x244/0x440 pmem_attach_disk+0x37e/0x490 [nd_pmem] nd_pmem_probe+0x7e/0xa0 [nd_pmem] nvdimm_bus_probe+0x71/0x120 [libnvdimm] driver_probe_device+0x2bb/0x460 bind_store+0x114/0x160 drv_attr_store+0x25/0x30 In commit 658922e57b84 "libnvdimm, pfn: fix memmap reservation sizing" we arranged for the capacity to be allocated, but failed to also update the 'npfns' parameter. This leads to cases where there is enough capacity reserved to hold all the allocated sections, but vmemmap_populate_hugepages() still encounters -ENOMEM from altmap_alloc_block_buf(). This fix is a stop-gap until we can teach the core memory hotplug implementation to permit sub-section hotplug. Cc: <stable@vger.kernel.org> Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing") Reported-by: Anisha Allada <anisha.allada@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-05-04Merge tag 'char-misc-4.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of new char/misc driver drivers and features for 4.12-rc1. There's lots of new drivers added this time around, new firmware drivers from Google, more auxdisplay drivers, extcon drivers, fpga drivers, and a bunch of other driver updates. Nothing major, except if you happen to have the hardware for these drivers, and then you will be happy :) All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits) firmware: google memconsole: Fix return value check in platform_memconsole_init() firmware: Google VPD: Fix return value check in vpd_platform_init() goldfish_pipe: fix build warning about using too much stack. goldfish_pipe: An implementation of more parallel pipe fpga fr br: update supported version numbers fpga: region: release FPGA region reference in error path fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe() mei: drop the TODO from samples firmware: Google VPD sysfs driver firmware: Google VPD: import lib_vpd source files misc: lkdtm: Add volatile to intentional NULL pointer reference eeprom: idt_89hpesx: Add OF device ID table misc: ds1682: Add OF device ID table misc: tsl2550: Add OF device ID table w1: Remove unneeded use of assert() and remove w1_log.h w1: Use kernel common min() implementation uio_mf624: Align memory regions to page size and set correct offsets uio_mf624: Refactor memory info initialization uio: Allow handling of non page-aligned memory regions hangcheck-timer: Fix typo in comment ...
2017-05-05drm: Document code of conductDaniel Vetter
freedesktop.org has adopted a formal&enforced code of conduct: https://www.fooishbar.org/blog/fdo-contributor-covenant/ https://www.freedesktop.org/wiki/CodeOfConduct/ Besides formalizing things a bit more I don't think this changes anything for us, we've already peer-enforced respectful and constructive interactions since a long time. But it's good to document things properly. v2: Drop confusing note from commit message and clarify the grammer (Chris, Alex and others). Cc: Daniel Stone <daniels@collabora.com> Cc: Keith Packard <keithp@keithp.com> Cc: tfheen@err.no Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Archit Taneja <architt@codeaurora.org> Reviewed-by: Martin Peres <martin.peres@free.fr> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Vincent Abriou <vincent.abriou@st.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Brian Starkey <brian.starkey@arm.com> Acked-by: Rob Clark <robdclark@gmail.com> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Gustavo Padovan <gustavo.padovan@collabora.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Keith Packard <keithp@keithp.com> Acked-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Acked-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-05Merge tag 'drm/tegra/for-4.12-rc1' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v4.12-rc1 This contains various fixes to the host1x driver as well as a plug for a leak of kernel pointers to userspace. A fairly big addition this time around is the Video Image Composer (VIC) support that can be used to accelerate some 2D and image compositing operations. Furthermore the driver now supports FB modifiers, so we no longer rely on a custom IOCTL to set those. Finally this contains a few preparatory patches for Tegra186 support which unfortunately didn't quite make it this time, but will hopefully be ready for v4.13. * tag 'drm/tegra/for-4.12-rc1' of git://anongit.freedesktop.org/tegra/linux: gpu: host1x: Fix host1x driver shutdown gpu: host1x: Support module reset gpu: host1x: Sort includes alphabetically drm/tegra: Add VIC support dt-bindings: Add bindings for the Tegra VIC drm/tegra: Add falcon helper library drm/tegra: Add Tegra DRM allocation API drm/tegra: Add tiling FB modifiers drm/tegra: Don't leak kernel pointer to userspace drm/tegra: Protect IOMMU operations by mutex drm/tegra: Enable IOVA API when IOMMU support is enabled gpu: host1x: Add IOMMU support gpu: host1x: Fix potential out-of-bounds access iommu/iova: Fix compile error with CONFIG_IOMMU_IOVA=m iommu: Add dummy implementations for !IOMMU_IOVA MAINTAINERS: Add related headers to IOMMU section iommu/iova: Consolidate code for adding new node to iovad domain rbtree
2017-05-04Merge tag 'driver-core-4.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Very tiny pull request for 4.12-rc1 for the driver core this time around. There are some documentation fixes, an eventpoll.h fixup to make it easier for the libc developers to take our header files directly, and some very minor driver core fixes and changes. All have been in linux-next for a very long time with no reported issues" * tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "kref: double kref_put() in my_data_handler()" driver core: don't initialize 'parent' in device_add() drivers: base: dma-mapping: use nth_page helper Documentation/ABI: add information about cpu_capacity debugfs: set no_llseek in DEFINE_DEBUGFS_ATTRIBUTE eventpoll.h: add missing epoll event masks eventpoll.h: fix epoll event masks
2017-05-04Merge tag 'usb-4.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB updates from Greg KH: "Here is the big USB patchset for 4.12-rc1. Lots of good stuff here, after many many many attempts, the kernel finally has a working typeC interface, many thanks to Heikki and Guenter and others who have taken the time to get this merged. It wasn't an easy path for them at all. There's also a staging driver that uses this new api, which is why it's coming in through this tree. Along with that, there's the usual huge number of changes for gadget drivers, xhci, and other stuff. Johan also finally refactored pretty much every driver that was looking at USB endpoints to do it in a common way, which will help prevent any "badly-formed" devices from causing problems in drivers. That too wasn't a simple task. All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (263 commits) staging: typec: Fairchild FUSB302 Type-c chip driver staging: typec: Type-C Port Controller Interface driver (tcpci) staging: typec: USB Type-C Port Manager (tcpm) usb: host: xhci: remove #ifdef around PM functions usb: musb: don't mark of_dev_auxdata as initdata usb: misc: legousbtower: Fix buffers on stack USB: Revert "cdc-wdm: fix "out-of-sync" due to missing notifications" usb: Make sure usb/phy/of gets built-in USB: storage: e-mail update in drivers/usb/storage/unusual_devs.h usb: host: xhci: print correct command ring address usb: host: xhci: delete sp_dma_buffers for scratchpad usb: host: xhci: using correct specification chapter reference for DCBAAP xhci: switch to pci_alloc_irq_vectors usb: host: xhci-plat: set resume_quirk() for R-Car controllers usb: host: xhci-plat: add resume_quirk() usb: host: xhci-plat: enable clk in resume timing usb: host: plat: Enable xHCI plat runtime PM USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit USB: serial: constify static arrays usb: fix some references for /proc/bus/usb ...
2017-05-04btrfs: fix the gfp_mask for the reada_zones radix treeChris Mason
Commits cc8385b59e17 and 7ef70b4d9987a7 added preallocation for the reada radix trees and also switched them over to GFP_KERNEL for the default gfp mask. Since we're doing radix tree insertions under spinlocks, we need to make sure the mask doesn't allow sleeping. This fix keeps the radix preallocation but switches back to the original gfp_mask. Reported-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2017-05-04Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()David Howells
stat/lstat/fstatat need to pass AT_NO_AUTOMOUNT to vfs_statx() as the pre-statx code didn't set LOOKUP_AUTOMOUNT, even though fstatat() accepted the AT_NO_AUTOMOUNT flag. Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available") Reported-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ian Kent <raven@themaw.net> cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-04rxe: expose num_possible_cpus() cnum_comp_vectorsSagi Grimberg
They're completely logical, so don't impose an artificial limitation. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-04IB/rxe: Update caller's CRC for RXE_MEM_TYPE_DMA memory typeLeon Romanovsky
Callers of rxe_mem_copy() provide pointer to store updated CRC value. That pointer was supposed to be updated, but the commit cee2688e3cd6 ("IB/rxe: Offload CRC calculation when possible") mistakenly removed that assignment for RXE_MEM_TYPE_DMA memory type. The code worked because there are no actual callers with RXE_MEM_TYPE_DMA, who are interested in returned value of crcp. The one caller in read_reply(), who uses the returned crcp didn't set RXE_MEM_TYPE_DMA as mem->type. Fixes: cee2688e3cd6 ("IB/rxe: Offload CRC calculation when possible") Reported-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>