summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-09-08mm: introduce vma_is_anonymous(vma) helperOleg Nesterov
special_mapping_fault() is absolutely broken. It seems it was always wrong, but this didn't matter until vdso/vvar started to use more than one page. And after this change vma_is_anonymous() becomes really trivial, it simply checks vm_ops == NULL. However, I do think the helper makes sense. There are a lot of ->vm_ops != NULL checks, the helper makes the caller's code more understandable (self-documented) and this is more grep-friendly. This patch (of 3): Preparation. Add the new simple helper, vma_is_anonymous(vma), and change handle_pte_fault() to use it. It will have more users. The name is not accurate, say a hpet_mmap()'ed vma is not anonymous. Perhaps it should be named vma_has_fault() instead. But it matches the logic in mmap.c/memory.c (see next changes). "True" just means that a page fault will use do_anonymous_page(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08selftests/userfaultfd: fix compiler warnings on 32-bitGeert Uytterhoeven
On 32-bit: userfaultfd.c: In function 'locking_thread': userfaultfd.c:152: warning: left shift count >= width of type userfaultfd.c: In function 'uffd_poll_thread': userfaultfd.c:295: warning: cast to pointer from integer of different size userfaultfd.c: In function 'uffd_read_thread': userfaultfd.c:332: warning: cast to pointer from integer of different size Fix the shift warning by splitting the shift in two parts, and the integer/pointer warnigns by adding intermediate casts to "unsigned long". Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08cgroup: fix seq_show_option merge with legacy_nameKees Cook
When seq_show_option (commit a068acf2ee77: "fs: create and use seq_show_option for escaping") was merged, it did not correctly collide with cgroup's addition of legacy_name (commit 3e1d2eed39d8: "cgroup: introduce cgroup_subsys->legacy_name") changes. This fixes the reported name. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08Merge tag 'libnvdimm-for-4.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "This update has successfully completed a 0day-kbuild run and has appeared in a linux-next release. The changes outside of the typical drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and the introduction of ZONE_DEVICE + devm_memremap_pages(). Summary: - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic mechanism for adding device-driver-discovered memory regions to the kernel's direct map. This facility is used by the pmem driver to enable pfn_to_page() operations on the page frames returned by DAX ('direct_access' in 'struct block_device_operations'). For now, the 'memmap' allocation for these "device" pages comes from "System RAM". Support for allocating the memmap from device memory will arrive in a later kernel. - Introduce memremap() to replace usages of ioremap_cache() and ioremap_wt(). memremap() drops the __iomem annotation for these mappings to memory that do not have i/o side effects. The replacement of ioremap_cache() with memremap() is limited to the pmem driver to ease merging the api change in v4.3. Completion of the conversion is targeted for v4.4. - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem driver, update the VFS DAX implementation and PMEM api to provide persistence guarantees for kernel operations on a DAX mapping. - Convert the ACPI NFIT 'BLK' driver to map the block apertures as cacheable to improve performance. - Miscellaneous updates and fixes to libnvdimm including support for issuing "address range scrub" commands, clarifying the optimal 'sector size' of pmem devices, a clarification of the usage of the ACPI '_STA' (status) property for DIMM devices, and other minor fixes" * tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits) libnvdimm, pmem: direct map legacy pmem by default libnvdimm, pmem: 'struct page' for pmem libnvdimm, pfn: 'struct page' provider infrastructure x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB add devm_memremap_pages mm: ZONE_DEVICE for "device memory" mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h dax: drop size parameter to ->direct_access() nd_blk: change aperture mapping from WC to WB nvdimm: change to use generic kvfree() pmem, dax: have direct_access use __pmem annotation dax: update I/O path to do proper PMEM flushing pmem: add copy_from_iter_pmem() and clear_pmem() pmem, x86: clean up conditional pmem includes pmem: remove layer when calling arch_has_wmb_pmem() pmem, x86: move x86 PMEM API to new pmem.h header libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option pmem: switch to devm_ allocations devres: add devm_memremap libnvdimm, btt: write and validate parent_uuid ...
2015-09-08Merge branch 'misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull misc kbuild updates from Michal Marek: - deb-pkg: + module signing fix + dtb files are added to the package + do not require `hostname -f` to work during build + make deb-pkg generates a source package, bindeb-pkg has been added to only generate the binary package - rpm-pkg packages /lib/modules as well - new coccinelle patch and updates to existing ones - new stackusage & stackdelta script to collect and compare stack usage info (using gcc's -fstack-usage) - make tags understands trace_*_rcuidle() macros - .gitignore updates, misc cleanups * 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (27 commits) deb-pkg: add source package package/Makefile: move source tar creation to a function scripts: add stackdelta script kbuild: remove *.su files generated by -fstack-usage .gitignore: add *.su pattern scripts: add stackusage script kbuild: avoid listing /lib/modules in kernel spec file fallback to hostname in scripts/package/builddeb coccinelle: api: extend spatch for dropping unnecessary owner deb-pkg: simplify directory creation scripts/tags.sh: Include trace_*_rcuidle() in tags scripts/package/Makefile: rpmbuild is needed for rpm targets Kbuild: Add ID files to .gitignore gitignore: Add MIPS vmlinux.32 to the list coccinelle: simple_return: Add a blank line coccinelle: irqf_oneshot.cocci: Improve the generated commit log coccinelle: api: add vma_pages.cocci scripts/coccinelle/misc/irqf_oneshot.cocci: Fix grammar scripts/coccinelle/misc/semicolon.cocci: Use imperative mood coccinelle: simple_open: Use imperative mood ...
2015-09-08Merge branch 'kconfig' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kconfig updates from Michal Marek: - kconfig warns about junk characters in Kconfig files - merge_config.sh error handling - small cleanup * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: merge_config.sh: exit on missing input files kconfig: Regenerate shipped zconf.{hash,lex}.c files kconfig: warn of unhandled characters in Kconfig commands kconfig: Delete unnecessary checks before the function call "sym_calc_value"
2015-09-08Merge branch 'kbuild' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull core kbuild updates from Michal Marek: - modpost portability fix - linker script fix - genksyms segfault fix - fixdep cleanup - fix for clang detection * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kbuild: Fix clang detection kbuild: fixdep: drop meaningless hash table initialization kbuild: fixdep: optimize code slightly genksyms: Regenerate parser genksyms: Duplicate function pointer type definitions segfault kbuild: Fix .text.unlikely placement Avoid conflict with host definitions when cross-compiling
2015-09-08Merge tag 'trace-v4.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing update from Steven Rostedt: "Mostly this is just clean ups and micro optimizations. The changes with more meat are: - Allowing the trace event filters to filter on CPU number and process ids - Two new markers for trace output latency were added (10 and 100 msec latencies) - Have tracing_thresh filter function profiling time I also worked on modifying the ring buffer code for some future work, and moved the adding of the timestamp around. One of my changes caused a regression, and since other changes were built on top of it and already tested, I had to operate a revert of that change. Instead of rebasing, this change set has the code that caused a regression as well as the code to revert that change without touching the other changes that were made on top of it" * tag 'trace-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Revert "ring-buffer: Get timestamp after event is allocated" tracing: Don't make assumptions about length of string on task rename tracing: Allow triggers to filter for CPU ids and process names ftrace: Format MCOUNT_ADDR address as type unsigned long tracing: Introduce two additional marks for delay ftrace: Fix function_graph duration spacing with 7-digits ftrace: add tracing_thresh to function profile tracing: Clean up stack tracing and fix fentry updates ring-buffer: Reorganize function locations ring-buffer: Make sure event has enough room for extend and padding ring-buffer: Get timestamp after event is allocated ring-buffer: Move the adding of the extended timestamp out of line ring-buffer: Add event descriptor to simplify passing data ftrace: correct the counter increment for trace_buffer data tracing: Fix for non-continuous cpu ids tracing: Prefer kcalloc over kzalloc with multiply
2015-09-08device property: Don't overwrite addr when failing in device_get_mac_addressJulien Grall
The function device_get_mac_address is trying different property names in order to get the mac address. To check the return value, the variable addr (which contain the buffer pass by the caller) will be re-used. This means that if the previous property is not found, the next property will be read using a NULL buffer. Therefore it's only possible to retrieve the mac if node contains a property "mac-address". Fix it by using a temporary buffer for the return value. This has been introduced by commit 4c96b7dc0d393f12c17e0d81db15aa4a820a6ab3 "Add a matching set of device_ functions for determining mac/phy" Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-08Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds
Pull audit update from Paul Moore: "This is one of the larger audit patchsets in recent history, consisting of eight patches and almost 400 lines of changes. The bulk of the patchset is the new "audit by executable" functionality which allows admins to set an audit watch based on the executable on disk. Prior to this, admins could only track an application by PID, which has some obvious limitations. Beyond the new functionality we also have some refcnt fixes and a few minor cleanups" * 'upstream' of git://git.infradead.org/users/pcmoore/audit: fixup: audit: implement audit by executable audit: implement audit by executable audit: clean simple fsnotify implementation audit: use macros for unset inode and device values audit: make audit_del_rule() more robust audit: fix uninitialized variable in audit_add_rule() audit: eliminate unnecessary extra layer of watch parent references audit: eliminate unnecessary extra layer of watch references
2015-09-08usbnet: Fix a race between usbnet_stop() and the BHEugene Shatokhin
The race may happen when a device (e.g. YOTA 4G LTE Modem) is unplugged while the system is downloading a large file from the Net. Hardware breakpoints and Kprobes with delays were used to confirm that the race does actually happen. The race is on skb_queue ('next' pointer) between usbnet_stop() and rx_complete(), which, in turn, calls usbnet_bh(). Here is a part of the call stack with the code where the changes to the queue happen. The line numbers are for the kernel 4.1.0: *0 __skb_unlink (skbuff.h:1517) prev->next = next; *1 defer_bh (usbnet.c:430) spin_lock_irqsave(&list->lock, flags); old_state = entry->state; entry->state = state; __skb_unlink(skb, list); spin_unlock(&list->lock); spin_lock(&dev->done.lock); __skb_queue_tail(&dev->done, skb); if (dev->done.qlen == 1) tasklet_schedule(&dev->bh); spin_unlock_irqrestore(&dev->done.lock, flags); *2 rx_complete (usbnet.c:640) state = defer_bh(dev, skb, &dev->rxq, state); At the same time, the following code repeatedly checks if the queue is empty and reads these values concurrently with the above changes: *0 usbnet_terminate_urbs (usbnet.c:765) /* maybe wait for deletions to finish. */ while (!skb_queue_empty(&dev->rxq) && !skb_queue_empty(&dev->txq) && !skb_queue_empty(&dev->done)) { schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS)); set_current_state(TASK_UNINTERRUPTIBLE); netif_dbg(dev, ifdown, dev->net, "waited for %d urb completions\n", temp); } *1 usbnet_stop (usbnet.c:806) if (!(info->flags & FLAG_AVOID_UNLINK_URBS)) usbnet_terminate_urbs(dev); As a result, it is possible, for example, that the skb is removed from dev->rxq by __skb_unlink() before the check "!skb_queue_empty(&dev->rxq)" in usbnet_terminate_urbs() is made. It is also possible in this case that the skb is added to dev->done queue after "!skb_queue_empty(&dev->done)" is checked. So usbnet_terminate_urbs() may stop waiting and return while dev->done queue still has an item. Locking in defer_bh() and usbnet_terminate_urbs() was revisited to avoid this race. Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru> Reviewed-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-08libceph: use keepalive2 to verify the mon session is aliveYan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08rbd: plug rbd_dev->header.object_prefix memory leakIlya Dryomov
Need to free object_prefix when rbd_dev_v2_snap_context() fails, but only if this is the first time we are reading in the header. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2015-09-08rbd: fix double free on rbd_dev->header_nameIlya Dryomov
If rbd_dev_image_probe() in rbd_dev_probe_parent() fails, header_name is freed twice: once in rbd_dev_probe_parent() and then in its caller rbd_dev_image_probe() (rbd_dev_image_probe() is called recursively to handle parent images). rbd_dev_probe_parent() is responsible for probing the parent, so it shouldn't muck with clone's fields. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2015-09-08libceph: set 'exists' flag for newly up osdYan, Zheng
Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08ceph: cleanup use of ceph_msg_getJianpeng Ma
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: no need to get parent inode in ceph_openJianpeng Ma
parent inode is needed in creating new inode case. For ceph_open, the target inode already exists. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: remove the useless judgementJianpeng Ma
err != 0 is already handled. So skip this. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: remove redundant test of head->safe and silence static analysis warningsBrad Hubbard
Signed-off-by: Brad Hubbard <bhubbard@redhat.com> Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: fix queuing inode to mdsdir's snaprealmYan, Zheng
During MDS failovers, MClientSnap message may cause kclient to move some inodes from root directory's snaprealm to mdsdir's snaprealm and queue snapshots for these inodes. For a FS has never created any snapshot, both root directory's snaprealm and mdsdir's snaprealm share the same snapshot contexts (both are ceph_empty_snapc). This confuses ceph_put_wrbuffer_cap_refs(), make it unable to distinguish snapshot buffers from head buffers. The fix is do not use ceph_empty_snapc as snaprealm's cached context. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08libceph: rename con_work() to ceph_con_workfn()Ilya Dryomov
Even though it's static, con_work(), being a work func, shows up in various stacktraces a lot. Prefix it with ceph_. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08libceph: Avoid holding the zero page on ceph_msgr_slab_init errorsBenoît Canet
ceph_msgr_slab_init may fail due to a temporary ENOMEM. Delay a bit the initialization of zero_page in ceph_msgr_init and reorder its cleanup in _ceph_msgr_exit so it's done in reverse order of setup. BUG_ON() will not suffer to be postponed in case it is triggered. Signed-off-by: Benoît Canet <benoit.canet@nodalink.com> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08libceph: remove the unused macro AES_KEY_SIZENicholas Krause
This removes the no longer used macro AES_KEY_SIZE as no functions use this macro anymore and thus this macro can be removed due it no longer being required. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-09-08ceph: invalidate dirty pages after forced umountYan, Zheng
After forced umount, ceph_writepages_start() skips flushing dirty pages. To make sure inode's reference count get dropped to zero, we need to invalidate dirty pages. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08ceph: EIO all operations after forced umountYan, Zheng
This patch makes try_get_cap_refs() and __do_request() check if the file system was forced umount, and return -EIO if it was. This patch also adds a helper function to drops dirty caps and wakes up blocking operation. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-09-08Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - PKCS#7 support added to support signed kexec, also utilized for module signing. See comments in 3f1e1bea. ** NOTE: this requires linking against the OpenSSL library, which must be installed, e.g. the openssl-devel on Fedora ** - Smack - add IPv6 host labeling; ignore labels on kernel threads - support smack labeling mounts which use binary mount data - SELinux: - add ioctl whitelisting (see http://kernsec.org/files/lss2015/vanderstoep.pdf) - fix mprotect PROT_EXEC regression caused by mm change - Seccomp: - add ptrace options for suspend/resume" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (57 commits) PKCS#7: Add OIDs for sha224, sha284 and sha512 hash algos and use them Documentation/Changes: Now need OpenSSL devel packages for module signing scripts: add extract-cert and sign-file to .gitignore modsign: Handle signing key in source tree modsign: Use if_changed rule for extracting cert from module signing key Move certificate handling to its own directory sign-file: Fix warning about BIO_reset() return value PKCS#7: Add MODULE_LICENSE() to test module Smack - Fix build error with bringup unconfigured sign-file: Document dependency on OpenSSL devel libraries PKCS#7: Appropriately restrict authenticated attributes and content type KEYS: Add a name for PKEY_ID_PKCS7 PKCS#7: Improve and export the X.509 ASN.1 time object decoder modsign: Use extract-cert to process CONFIG_SYSTEM_TRUSTED_KEYS extract-cert: Cope with multiple X.509 certificates in a single file sign-file: Generate CMS message as signature instead of PKCS#7 PKCS#7: Support CMS messages also [RFC5652] X.509: Change recorded SKID & AKID to not include Subject or Issuer PKCS#7: Check content type and versions MAINTAINERS: The keyrings mailing list has moved ...
2015-09-08Merge branch 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds
Pull NMI backtrace update from Russell King: "These changes convert the x86 NMI handling to be a library implementation which other architectures can make use of. Thomas Gleixner has reviewed and tested these changes, and wishes me to send these rather than taking them through the tip tree. The final patch in the set adds an initial implementation using this infrastructure to ARM, even though it doesn't send the IPI at "NMI" level. Patches are in progress to add the ARM equivalent of NMI, but we still need the IRQ-level fallback for systems where the "NMI" isn't available due to secure firmware denying access to it" * 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: add basic support for on-demand backtrace of other CPUs nmi: x86: convert to generic nmi handler nmi: create generic NMI backtrace implementation
2015-09-08Merge tag 'for-linus-4.3-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Xen features and fixes for 4.3: - Convert xen-blkfront to the multiqueue API - [arm] Support binding event channels to different VCPUs. - [x86] Support > 512 GiB in a PV guests (off by default as such a guest cannot be migrated with the current toolstack). - [x86] PMU support for PV dom0 (limited support for using perf with Xen and other guests)" * tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits) xen: switch extra memory accounting to use pfns xen: limit memory to architectural maximum xen: avoid another early crash of memory limited dom0 xen: avoid early crash of memory limited dom0 arm/xen: Remove helpers which are PV specific xen/x86: Don't try to set PCE bit in CR4 xen/PMU: PMU emulation code xen/PMU: Intercept PMU-related MSR and APIC accesses xen/PMU: Describe vendor-specific PMU registers xen/PMU: Initialization code for Xen PMU xen/PMU: Sysfs interface for setting Xen PMU mode xen: xensyms support xen: remove no longer needed p2m.h xen: allow more than 512 GB of RAM for 64 bit pv-domains xen: move p2m list if conflicting with e820 map xen: add explicit memblock_reserve() calls for special pages mm: provide early_memremap_ro to establish read-only mapping xen: check for initrd conflicting with e820 map xen: check pre-allocated page tables for conflict with memory map xen: check for kernel memory conflicting with memory layout ...
2015-09-08Merge branch 'irq-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more irq updates from Thomas Gleixner: "The second part of irq related updates: - Provide EOImode for GIC[V3] irq chips, which is a prerequisite for direct interrupt handling in [KVM] guests" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/GIC: Fix EOImode setting for non-DT/ACPI systems irqchip/GIC: Don't deactivate interrupts forwarded to a guest irqchip/GIC: Convert to EOImode == 1 irqchip/GICv3: Don't deactivate interrupts forwarded to a guest irqchip/GICv3: Convert to EOImode == 1
2015-09-08Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68k/colfire fixes from Greg Ungerer: "Only a couple of patches this time. One migrating the clock driver code to the new set-state interface. The other cleaning up to use the PFN_DOWN macro" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k/coldfire: use PFN_DOWN macro m68k/coldfire/pit: Migrate to new 'set-state' interface
2015-09-08Merge tag 'ecryptfs-4.3-rc1-stale-dcache' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs fixes from Tyler Hicks: "Invalidate stale eCryptfs dcache entries caused by unlinked lower inodes" * tag 'ecryptfs-4.3-rc1-stale-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: eCryptfs: Delete a check before the function call "key_put" eCryptfs: Invalidate dcache entries when lower i_nlink is zero
2015-09-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fix from Herbert Xu: "This fixes a memory corruption bug in ghash-clmulni-intel due to insufficient memory allocation" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ghash-clmulni: specify context size for ghash async algorithm
2015-09-08userfaultfd: selftest: update userfaultfd x86 32bit syscall numberAndrea Arcangeli
It changed as result of other syscalls, and while the system call list itself was correctly updated, the selftest program was not. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfnJulien Grall
The variable xen_store_mfn is effectively storing a GFN and not an MFN. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen/privcmd: Further s/MFN/GFN/ clean-upJulien Grall
The privcmd code is mixing the usage of GFN and MFN within the same functions which make the code difficult to understand when you only work with auto-translated guests. The privcmd driver is only dealing with GFN so replace all the mention of MFN into GFN. The ioctl structure used to map foreign change has been left unchanged given that the userspace is using it. Nonetheless, add a comment to explain the expected value within the "mfn" field. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08hvc/xen: Further s/MFN/GFN clean-upJulien Grall
HVM_PARAM_CONSOLE_PFN is used to retrieved the console PFN for HVM guest. It returns a PFN (aka GFN) and not a MFN. Furthermore, use directly virt_to_gfn for both PV and HVM domain rather than doing a special case for each of the them. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08video/xen-fbfront: Further s/MFN/GFN clean-upJulien Grall
The PV driver xen-fbfront is only dealing with GFN and not MFN. Rename all the occurence of MFN to GFN. Also take the opportunity to replace to usage of pfn_to_gfn by xen_page_to_gfn. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfnJulien Grall
All the caller of xen_tmem_{get,put}_page have a struct page * in hand and call pfn_to_gfn for the only benefits of these 2 functions. Rather than passing the pfn in parameter, pass directly the page and use directly xen_page_to_gfn. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: Use correctly the Xen memory terminologiesJulien Grall
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant, I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior. For instance, with pfn_to_mfn, we expect to get an MFN based on the name. Although, if we look at the implementation on x86, it's returning a GFN. For clarity and avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some reference of pfn_to_mfn which may be used by all kind of guests No changes as been made in the hypercall field, even though they may be invalid, in order to keep the same as the defintion in xen repo. Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a name to close to the KVM function gfn_to_page. Take also the opportunity to simplify simple construction such as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up will come in follow-up patches. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08arm/xen: implement correctly pfn_to_mfnJulien Grall
After the commit introducing convertion between DMA and guest addresses, all the callers of pfn_to_mfn are expecting to get a GFN (Guest Frame Number). On ARM, all the guests are auto-translated so the GFN is equal to the Linux PFN (Pseudo-physical Frame Number). The current implementation may return an MFN if the caller is passing a PFN associated to a mapped foreign grant. In pratice, I haven't seen the problem on running guest but we should fix it for the sake of correctness. Correct the implementation by always returning the pfn passed in parameter. A follow-up patch will take care to rename pfn_to_mfn to a suitable name. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: Make clear that swiotlb and biomerge are dealing with DMA addressJulien Grall
The swiotlb is required when programming a DMA address on ARM when a device is not protected by an IOMMU. In this case, the DMA address should always be equal to the machine address. For DOM0 memory, Xen ensure it by have an identity mapping between the guest address and host address. However, when mapping a foreign grant reference, the 1:1 model doesn't work. For ARM guest, most of the callers of pfn_to_mfn expects to get a GFN (Guest Frame Number), i.e a PFN (Page Frame Number) from the Linux point of view given that all ARM guest are auto-translated. Even though the name pfn_to_mfn is misleading, we need to ensure that those caller get a GFN and not by mistake a MFN. In pratical, I haven't seen error related to this but we should fix it for the sake of correctness. In order to fix the implementation of pfn_to_mfn on ARM in a follow-up patch, we have to introduce new helpers to return the DMA from a PFN and the invert. On x86, the new helpers will be an alias of pfn_to_mfn and mfn_to_pfn. The helpers will be used in swiotlb and xen_biovec_phys_mergeable. This is necessary in the latter because we have to ensure that the biovec code will not try to merge a biovec using foreign page and another using Linux memory. Lastly, the helper mfn_to_local_pfn has been renamed to bfn_to_local_pfn given that the only usage was in swiotlb. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08parisc: Use platform_device_register_simple("rtc-generic")Helge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2015-09-08parisc: Drop CONFIG_SMP around update_cr16_clocksource()Helge Deller
No need to use CONFIG_SMP around update_cr16_clocksource(). It checks for num_online_cpus() beeing greater than 1, which is always 1 in UP builds. Signed-off-by: Helge Deller <deller@gmx.de>
2015-09-08xen: switch extra memory accounting to use pfnsJuergen Gross
Instead of using physical addresses for accounting of extra memory areas available for ballooning switch to pfns as this is much less error prone regarding partial pages. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: limit memory to architectural maximumJuergen Gross
When a pv-domain (including dom0) is started it tries to size it's p2m list according to the maximum possible memory amount it ever can achieve. Limit the initial maximum memory size to the architectural limit of the hardware in order to avoid overflows during remapping of memory. This problem will occur when dom0 is started with an initial memory size being a multiple of 1GB, but without specifying it's maximum memory size. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: avoid another early crash of memory limited dom0Juergen Gross
Commit b1c9f169047b ("xen: split counting of extra memory pages...") introduced an error when dom0 was started with limited memory occurring only on some hardware. The problem arises in case dom0 is started with initial memory and maximum memory being the same. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. If all of this is true and the E820 map of the machine is sparse (some areas are not covered) then the machine might crash early in the boot process. An example E820 map triggering the problem looks like this: [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cf7fafff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cf7fb000-0x00000000cf95ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cf960000-0x00000000cfb62fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfb63000-0x00000000cfd14fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cfd15000-0x00000000cfd61fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfd62000-0x00000000cfd6cfff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000cfd6d000-0x00000000cfd6ffff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfd70000-0x00000000cfd70fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cfd71000-0x00000000cfea8fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfea9000-0x00000000cfeb9fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfeba000-0x00000000cfecafff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfecb000-0x00000000cfecbfff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfecc000-0x00000000cfedbfff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfedc000-0x00000000cfedcfff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfedd000-0x00000000cfeddfff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfede000-0x00000000cfee3fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfee4000-0x00000000cfef6fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000cfef7000-0x00000000cfefffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed40000-0x00000000fed44fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed61000-0x00000000fed70fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000020effffff] usable In this case the area a0000-dffff isn't present in the map. This will confuse the memory setup of the domain when remapping the memory from such holes to populated areas. To avoid the problem the accounting of to be remapped memory has to count such holes in the E820 map as well. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08xen: avoid early crash of memory limited dom0Juergen Gross
Commit b1c9f169047b ("xen: split counting of extra memory pages...") introduced an error when dom0 was started with limited memory. The problem arises in case dom0 is started with initial memory and maximum memory being the same and exactly a multiple of 1 GB. The kernel must be configured without CONFIG_XEN_BALLOON_MEMORY_HOTPLUG for the problem to happen. In this case it will crash very early during boot due to the virtual mapped p2m list not being large enough to be able to remap any memory: (XEN) Freed 304kB init memory. mapping kernel into physical memory about to get started... (XEN) traps.c:459:d0v0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000] (XEN) domain_crash_sync called from entry.S: fault at ffff82d080229a93 create_bounce_frame+0x12b/0x13a (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.5.2-pre x86_64 debug=n Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff81d120cb>] (XEN) RFLAGS: 0000000000000206 EM: 1 CONTEXT: pv guest (d0v0) (XEN) rax: ffffffff81db2000 rbx: 000000004d000000 rcx: 0000000000000000 (XEN) rdx: 000000004d000000 rsi: 0000000000063000 rdi: 000000004d063000 (XEN) rbp: ffffffff81c03d78 rsp: ffffffff81c03d28 r8: 0000000000023000 (XEN) r9: 00000001040ff000 r10: 0000000000007ff0 r11: 0000000000000000 (XEN) r12: 0000000000063000 r13: 000000000004d000 r14: 0000000000000063 (XEN) r15: 0000000000000063 cr0: 0000000080050033 cr4: 00000000000006f0 (XEN) cr3: 0000000105c0f000 cr2: ffffc90000268000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033 (XEN) Guest stack trace from rsp=ffffffff81c03d28: (XEN) 0000000000000000 0000000000000000 ffffffff81d120cb 000000010000e030 (XEN) 0000000000010006 ffffffff81c03d68 000000000000e02b ffffffffffffffff (XEN) 0000000000000063 000000000004d063 ffffffff81c03de8 ffffffff81d130a7 (XEN) ffffffff81c03de8 000000000004d000 00000001040ff000 0000000000105db1 (XEN) 00000001040ff001 000000000004d062 ffff8800092d6ff8 0000000002027000 (XEN) ffff8800094d8340 ffff8800092d6ff8 00003ffffffff000 ffff8800092d7ff8 (XEN) ffffffff81c03e48 ffffffff81d13c43 ffff8800094d8000 ffff8800094d9000 (XEN) 0000000000000000 ffff8800092d6000 00000000092d6000 000000004cfbf000 (XEN) 00000000092d6000 00000000052d5442 0000000000000000 0000000000000000 (XEN) ffffffff81c03ed8 ffffffff81d185c1 0000000000000000 0000000000000000 (XEN) ffffffff81c03e78 ffffffff810f8ca4 ffffffff81c03ed8 ffffffff8171a15d (XEN) 0000000000000010 ffffffff81c03ee8 0000000000000000 0000000000000000 (XEN) ffffffff81f0e402 ffffffffffffffff ffffffff81dae900 0000000000000000 (XEN) 0000000000000000 0000000000000000 ffffffff81c03f28 ffffffff81d0cf0f (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff81db82e0 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) ffffffff81c03f38 ffffffff81d0c603 ffffffff81c03ff8 ffffffff81d11c86 (XEN) 0300000100000032 0000000000000005 0000000000000020 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) Domain 0 crashed: rebooting machine in 5 seconds. This can be avoided by allocating aneough space for the p2m to cover the maximum memory of dom0 plus the identity mapped holes required for PCI space, BIOS etc. Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08parisc: Use double word condition in 64bit CAS operationJohn David Anglin
The attached change fixes the condition used in the "sub" instruction. A double word comparison is needed. This fixes the 64-bit LWS CAS operation on 64-bit kernels. I can now enable 64-bit atomic support in GCC. Cc: <stable@vger.kernel.org> Signed-off-by: John David Anglin <dave.anglin> Signed-off-by: Helge Deller <deller@gmx.de>
2015-09-08parisc: Filter out spurious interrupts in PA-RISC irq handlerHelge Deller
When detecting a serial port on newer PA-RISC machines (with iosapic) we have a long way to go to find the right IRQ line, registering it, then registering the serial port and the irq handler for the serial port. During this phase spurious interrupts for the serial port may happen which then crashes the kernel because the action handler might not have been set up yet. So, basically it's a race condition between the serial port hardware and the CPU which sets up the necessary fields in the irq sructs. The main reason for this race is, that we unmask the serial port irqs too early without having set up everything properly before (which isn't easily possible because we need the IRQ number to register the serial ports). This patch is a work-around for this problem. It adds checks to the CPU irq handler to verify if the IRQ action field has been initialized already. If not, we just skip this interrupt (which isn't critical for a serial port at bootup). The real fix would probably involve rewriting all PA-RISC specific IRQ code (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of the irq chips and proper irq enabling along this line. This bug has been in the PA-RISC port since the beginning, but the crashes happened very rarely with currently used hardware. But on the latest machine which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900, 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel crashed at every boot because of this race. So, without this patch the machine would currently be unuseable. For the record, here is the flow logic: 1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq(). 2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq. 3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq 4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!) 5. serial_init_chip() then registers the 8250 port. Problems: - In step 4 the CPU irq shouldn't have been registered yet, but after step 5 - If serial irq happens between 4 and 5 have finished, the kernel will crash Signed-off-by: Helge Deller <deller@gmx.de>
2015-09-08parisc: Additionally check for in_atomic() in page fault handlerHelge Deller
Craig Estey noticed that we didn't checked for in_atomic() in our page fault handler like other architectures. This commit adds this check by using faulthandler_disabled() which includes a check for pagefault_disabled() and in_atomic(). Reported-by: Craig Estey <cae370@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>