summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-07-02mm/swap: fix release_pages() when releasing devmap pagesIra Weiny
release_pages() is an optimized version of a loop around put_page(). Unfortunately for devmap pages the logic is not entirely correct in release_pages(). This is because device pages can be more than type MEMORY_DEVICE_PUBLIC. There are in fact 4 types, private, public, FS DAX, and PCI P2PDMA. Some of these have specific needs to "put" the page while others do not. This logic to handle any special needs is contained in put_devmap_managed_page(). Therefore all devmap pages should be processed by this function where we can contain the correct logic for a page put. Handle all device type pages within release_pages() by calling put_devmap_managed_page() on all devmap pages. If put_devmap_managed_page() returns true the page has been put and we continue with the next page. A false return of put_devmap_managed_page() means the page did not require special processing and should fall to "normal" processing. This was found via code inspection while determining if release_pages() and the new put_user_pages() could be interchangeable.[1] [1] https://lkml.kernel.org/r/20190523172852.GA27175@iweiny-DESK2.sc.intel.com Link: https://lkml.kernel.org/r/20190605214922.17684-1-ira.weiny@intel.com Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-27mm/hmm: Fix error flows in hmm_invalidate_range_startJason Gunthorpe
If the trylock on the hmm->mirrors_sem fails the function will return without decrementing the notifiers that were previously incremented. Since the caller will not call invalidate_range_end() on EAGAIN this will result in notifiers becoming permanently incremented and deadlock. If the sync_cpu_device_pagetables() required blocking the function will not return EAGAIN even though the device continues to touch the pages. This is a violation of the mmu notifier contract. Switch, and rename, the ranges_lock to a spin lock so we can reliably obtain it without blocking during error unwind. The error unwind is necessary since the notifiers count must be held incremented across the call to sync_cpu_device_pagetables() as we cannot allow the range to become marked valid by a parallel invalidate_start/end() pair while doing sync_cpu_device_pagetables(). Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-24mm/hmm: Remove confusing comment and logic from hmm_releaseJason Gunthorpe
hmm_release() is called exactly once per hmm. ops->release() cannot accidentally trigger any action that would recurse back onto hmm->mirrors_sem. This fixes a use after-free race of the form: CPU0 CPU1 hmm_release() up_write(&hmm->mirrors_sem); hmm_mirror_unregister(mirror) down_write(&hmm->mirrors_sem); up_write(&hmm->mirrors_sem); kfree(mirror) mirror->ops->release(mirror) The only user we have today for ops->release is an empty function, so this is unambiguously safe. As a consequence of plugging this race drivers are not allowed to register/unregister mirrors from within a release op. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-24mm/hmm: Poison hmm_range during unregisterJason Gunthorpe
Trying to misuse a range outside its lifetime is a kernel bug. Use poison bytes to help detect this condition. Double unregister will reliably crash. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-24mm/hmm: Remove racy protection against double-unregistrationJason Gunthorpe
No other register/unregister kernel API attempts to provide this kind of protection as it is inherently racy, so just drop it. Callers should provide their own protection, and it appears nouveau already does. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18mm/hmm: Use lockdep instead of commentsJason Gunthorpe
So we can check locking at runtime. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18mm/hmm: Hold on to the mmget for the lifetime of the rangeJason Gunthorpe
Range functions like hmm_range_snapshot() and hmm_range_fault() call find_vma, which requires hodling the mmget() and the mmap_sem for the mm. Make this simpler for the callers by holding the mmget() inside the range for the lifetime of the range. Other functions that accept a range should only be called if the range is registered. This has the side effect of directly preventing hmm_release() from happening while a range is registered. That means range->dead cannot be false during the lifetime of the range, so remove dead and hmm_mirror_mm_is_alive() entirely. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18mm/hmm: Do not use list*_rcu() for hmm->rangesJason Gunthorpe
This list is always read and written while holding hmm->lock so there is no need for the confusing _rcu annotations. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <iweiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18mm/hmm: Remove duplicate condition test before wait_event_timeoutJason Gunthorpe
The wait_event_timeout macro already tests the condition as its first action, so there is no reason to open code another version of this, all that does is skip the might_sleep() debugging in common cases, which is not helpful. Further, based on prior patches, we can now simplify the required condition test: - If range is valid memory then so is range->hmm - If hmm_release() has run then range->valid is set to false at the same time as dead, so no reason to check both. - A valid hmm has a valid hmm->mm. Allowing the return value of wait_event_timeout() (along with its internal barriers) to compute the result of the function. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-18mm/hmm: Simplify hmm_get_or_create and make it reliableJason Gunthorpe
As coded this function can false-fail in various racy situations. Make it reliable and simpler by running under the write side of the mmap_sem and avoiding the false-failing compare/exchange pattern. Due to the mmap_sem this no longer has to avoid racing with a 2nd parallel hmm_get_or_create(). Unfortunately this still has to use the page_table_lock as the non-sleeping lock protecting mm->hmm, since the contexts where we free the hmm are incompatible with mmap_sem. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-10mm/hmm: Hold a mmgrab from hmm to mmJason Gunthorpe
So long as a struct hmm pointer exists, so should the struct mm it is linked too. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it once the hmm refcount goes to zero. Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL mm->hmm delete the hmm_hmm_destroy(). Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-10mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_registerJason Gunthorpe
Ralph observes that hmm_range_register() can only be called by a driver while a mirror is registered. Make this clear in the API by passing in the mirror structure as a parameter. This also simplifies understanding the lifetime model for struct hmm, as the hmm pointer must be valid as part of a registered mirror so all we need in hmm_register_range() is a simple kref_get. Suggested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-07mm/hmm: fix use after free with struct hmm in the mmu notifiersJason Gunthorpe
mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier system will continue to reference hmm->mn until the srcu grace period expires. Resulting in use after free races like this: CPU0 CPU1 __mmu_notifier_invalidate_range_start() srcu_read_lock hlist_for_each () // mn == hmm->mn hmm_mirror_unregister() hmm_put() hmm_free() mmu_notifier_unregister_no_release() hlist_del_init_rcu(hmm-mn->list) mn->ops->invalidate_range_start(mn, range); mm_get_hmm() mm->hmm = NULL; kfree(hmm) mutex_lock(&hmm->lock); Use SRCU to kfree the hmm memory so that the notifiers can rely on hmm existing. Get the now-safe hmm struct through container_of and directly check kref_get_unless_zero to lock it against free. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-06mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blockingKuehling, Felix
Don't set this flag by default in hmm_vma_do_fault. It is set conditionally just a few lines below. Setting it unconditionally can lead to handle_mm_fault doing a non-blocking fault, returning -EBUSY and unlocking mmap_sem unexpectedly. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-06mm/hmm: support automatic NUMA balancingPhilip Yang
While the page is migrating by NUMA balancing, HMM failed to detect this condition and still return the old page. Application will use the new page migrated, but driver pass the old page physical address to GPU, this crash the application later. Use pte_protnone(pte) to return this condition and then hmm_vma_do_fault will allocate new page. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-06mm/hmm: clean up some coding style and commentsRalph Campbell
There are no functional changes, just some coding style clean ups and minor comment changes. Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-06mm/hmm: update HMM documentationRalph Campbell
Update the HMM documentation to reflect the latest API and make a few minor wording changes. Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-06mm/hmm.c: suppress compilation warnings when CONFIG_HUGETLB_PAGE is not setJason Gunthorpe
gcc reports that several variables are defined but not used. For the first hunk CONFIG_HUGETLB_PAGE the entire if block is already protected by pud_huge() which is forced to 0. None of the stuff under the ifdef causes compilation problems as it is already stubbed out in the header files. For the second hunk the dummy huge_page_shift macro doesn't touch the argument, so just inline the argument. Link: http://lkml.kernel.org/r/20190522195151.GA23955@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2019-06-02Linux 5.2-rc3v5.2-rc3Linus Torvalds
2019-06-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two fixes: a quirk for KVM guests running on certain AMD CPUs, and a KASAN related build fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor x86/boot: Provide KASAN compatible aliases for string routines
2019-06-02Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "On the kernel side there's a bunch of ring-buffer ordering fixes for a reproducible bug, plus a PEBS constraints regression fix. Plus tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools headers UAPI: Sync kvm.h headers with the kernel sources perf record: Fix s390 missing module symbol and warning for non-root users perf machine: Read also the end of the kernel perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms perf session: Add missing swap ops for namespace events perf namespace: Protect reading thread's namespace tools headers UAPI: Sync drm/drm.h with the kernel tools headers UAPI: Sync drm/i915_drm.h with the kernel tools headers UAPI: Sync linux/fs.h with the kernel tools headers UAPI: Sync linux/sched.h with the kernel tools arch x86: Sync asm/cpufeatures.h with the with the kernel tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel perf data: Fix 'strncat may truncate' build failure with recent gcc perf/ring-buffer: Use regular variables for nesting perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data perf/ring_buffer: Add ordering to rb->nest increment perf/ring_buffer: Fix exposing a temporarily decreased data_head perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
2019-06-02Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Two EFI fixes: a quirk for weird systabs, plus add more robust error handling in the old 1:1 mapping code" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Allow the number of EFI configuration tables entries to be zero efi/x86/Add missing error handling to old_memmap 1:1 mapping code
2019-06-02Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull stacktrace fix from Ingo Molnar: "Fix a stack_trace_save_tsk_reliable() regression" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stacktrace: Unbreak stack_trace_save_tsk_reliable()
2019-06-02Merge tag 'spdx-5.2-rc3-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull SPDX fixes from Greg KH: "Here are just two small patches, that fix up some found SPDX identifier issues. The first patch fixes an error in a previous SPDX fixup patch, that causes build errors when doing 'make clean' on the tree (the fact that almost no one noticed it reflects the fact that kernel developers don't like doing that option very often...) The second patch fixes up a number of places in the tree where people mistyped the string "SPDX-License-Identifier". Given that people can not even type their own name all the time without mistakes, this was bound to happen, and odds are, we will have to add some type of check for this to checkpatch.pl to catch this happening in the future. Both of these have passed testing by 0-day" * tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: treewide: fix typos of SPDX-License-Identifier crypto: ux500 - fix license comment syntax error
2019-06-02Merge tag 'powerpc-5.2-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A minor fix to our IMC PMU code to print a less confusing error message when the driver can't initialise properly. A fix for a bug where a user requesting an unsupported branch sampling filter can corrupt PMU state, preventing the PMU from counting properly. And finally a fix for a bug in our support for kexec_file_load(), which prevented loading a kernel and initramfs. Most versions of kexec don't yet use kexec_file_load(). Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi Bangoria, Thiago Jung Bauermann" * tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load() powerpc/perf: Fix MMCRA corruption by bhrb_filter powerpc/powernv: Return for invalid IMC domain
2019-06-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Fixes for PPC and s390" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Restore SPRG3 in kvmhv_p9_guest_entry() KVM: PPC: Book3S HV: Fix lockdep warning when entering guest on POWER9 KVM: PPC: Book3S HV: XIVE: Fix page offset when clearing ESB pages KVM: PPC: Book3S HV: XIVE: Take the srcu read lock when accessing memslots KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts KVM: PPC: Book3S HV: XIVE: Introduce a new mutex for the XIVE device KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID kvm: fix compile on s390 part 2
2019-06-02Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A memleak fix for the core, two driver bugfixes, as well as fixing missing file patterns to MAINTAINERS" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: add I2C DT bindings to ARM platforms MAINTAINERS: add DT bindings to i2c drivers i2c: synquacer: fix synquacer_i2c_doxfer() return value i2c: mlxcpld: Fix wrong initialization order in probe i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr
2019-06-02Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal SoC fix from Eduardo Valentin: "A single revert, detected to cause issues on the tsens driver" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"
2019-06-02Merge tag 'led-fixes-for-5.2-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fix from Jacek Anaszewski: "Fix for a recent change in LED core, that didn't take into account the possibility of calling led_blink_setup() from atomic context" * tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: avoid flush_work in atomic context
2019-06-02Merge tag 'for-linus-20190601' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - A set of patches fixing code comments / kerneldoc (Bart) - Don't allow loop file change for exclusive open (Jan) - Fix revalidate of hidden genhd (Jan) - Init queue failure memory free fix (Jes) - Improve rq limits failure print (John) - Fixup for queue removal/addition (Ming) - Missed error progagation for io_uring buffer registration (Pavel) * tag 'for-linus-20190601' of git://git.kernel.dk/linux-block: block: print offending values when cloned rq limits are exceeded blk-mq: Document the blk_mq_hw_queue_to_node() arguments blk-mq: Fix spelling in a source code comment block: Fix bsg_setup_queue() kernel-doc header block: Fix rq_qos_wait() kernel-doc header block: Fix blk_mq_*_map_queues() kernel-doc headers block: Fix throtl_pending_timer_fn() kernel-doc header block: Convert blk_invalidate_devt() header into a non-kernel-doc header block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header blk-mq: Fix memory leak in error handling block: don't protect generic_make_request_checks with blk_queue_enter block: move blk_exit_queue into __blk_release_queue block: Don't revalidate bdev of hidden gendisk loop: Don't change loop device under exclusive opener io_uring: Fix __io_uring_register() false success
2019-06-02Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Six minor fixes to device drivers and one to the multipath alua handler. The most extensive fix is the zfcp port remove prevention one, but it's impact is only s390" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: delete sas port if expander discover failed scsi: libsas: only clear phy->in_shutdown after shutdown event done scsi: scsi_dh_alua: Fix possible null-ptr-deref scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()
2019-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Various fixes and followups" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, compaction: make sure we isolate a valid PFN include/linux/generic-radix-tree.h: fix kerneldoc comment kernel/signal.c: trace_signal_deliver when signal_group_exit drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used spdxcheck.py: fix directory structures kasan: initialize tag to 0xff in __kasan_kmalloc z3fold: fix sheduling while atomic scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set mm/gup: continue VM_FAULT_RETRY processing even for pre-faults ocfs2: fix error path kobject memory leak memcg: make it work on sparse non-0-node systems mm, memcg: consider subtrees in memory.events prctl_set_mm: downgrade mmap_sem to read lock prctl_set_mm: refactor checks from validate_prctl_map kernel/fork.c: make max_threads symbol static arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK mm/vmalloc.c: fix typo in comment lib/sort.c: fix kernel-doc notation warnings mm: fix Documentation/vm/hmm.rst Sphinx warnings
2019-06-01mm, compaction: make sure we isolate a valid PFNSuzuki K Poulose
When we have holes in a normal memory zone, we could endup having cached_migrate_pfns which may not necessarily be valid, under heavy memory pressure with swapping enabled ( via __reset_isolation_suitable(), triggered by kswapd). Later if we fail to find a page via fast_isolate_freepages(), we may end up using the migrate_pfn we started the search with, as valid page. This could lead to accessing NULL pointer derefernces like below, due to an invalid mem_section pointer. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [47/1825] Mem abort info: ESR = 0x96000004 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9 [0000000000000008] pgd=0000000000000000 Internal error: Oops: 96000004 [#1] SMP ... CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 510-rc1+ #6 Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018 pstate: 60000005 (nZCv daif -PAN -UAO) pc : set_pfnblock_flags_mask+0x58/0xe8 lr : compaction_alloc+0x300/0x950 [...] Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5) Call trace: set_pfnblock_flags_mask+0x58/0xe8 compaction_alloc+0x300/0x950 migrate_pages+0x1a4/0xbb0 compact_zone+0x750/0xde8 compact_zone_order+0xd8/0x118 try_to_compact_pages+0xb4/0x290 __alloc_pages_direct_compact+0x84/0x1e0 __alloc_pages_nodemask+0x5e0/0xe18 alloc_pages_vma+0x1cc/0x210 do_huge_pmd_anonymous_page+0x108/0x7c8 __handle_mm_fault+0xdd4/0x1190 handle_mm_fault+0x114/0x1c0 __get_user_pages+0x198/0x3c0 get_user_pages_unlocked+0xb4/0x1d8 __gfn_to_pfn_memslot+0x12c/0x3b8 gfn_to_pfn_prot+0x4c/0x60 kvm_handle_guest_abort+0x4b0/0xcd8 handle_exit+0x140/0x1b8 kvm_arch_vcpu_ioctl_run+0x260/0x768 kvm_vcpu_ioctl+0x490/0x898 do_vfs_ioctl+0xc4/0x898 ksys_ioctl+0x8c/0xa0 __arm64_sys_ioctl+0x28/0x38 el0_svc_common+0x74/0x118 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Code: f8607840 f100001f 8b011401 9a801020 (f9400400) ---[ end trace af6a35219325a9b6 ]--- The issue was reported on an arm64 server with 128GB with holes in the zone (e.g, [32GB@4GB, 96GB@544GB]), with a swap device enabled, while running 100 KVM guest instances. This patch fixes the issue by ensuring that the page belongs to a valid PFN when we fallback to using the lower limit of the scan range upon failure in fast_isolate_freepages(). Link: http://lkml.kernel.org/r/1558711908-15688-1-git-send-email-suzuki.poulose@arm.com Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target") Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Qian Cai <cai@lca.pw> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01include/linux/generic-radix-tree.h: fix kerneldoc commentJonathan Corbet
The DOC comment block section in include/linux/generic-radix-tree.h contained a spurious colon, causing this warning in the documentation build: include/linux/generic-radix-tree.h:1: warning: no structured comments found Remove the colon and make the docs build happy. Link: http://lkml.kernel.org/r/20190524141933.74ae9050@lwn.net Signed-off-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01kernel/signal.c: trace_signal_deliver when signal_group_exitZhenliang Wei
In the fixes commit, removing SIGKILL from each thread signal mask and executing "goto fatal" directly will skip the call to "trace_signal_deliver". At this point, the delivery tracking of the SIGKILL signal will be inaccurate. Therefore, we need to add trace_signal_deliver before "goto fatal" after executing sigdelset. Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info. Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT") Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com> Reviewed-by: Christian Brauner <christian@brauner.io> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Ivan Delalande <colona@arista.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not usedQian Cai
Commit cf04eee8bf0e ("iommu/vt-d: Include ACPI devices in iommu=pt") added for_each_active_iommu() in iommu_prepare_static_identity_mapping() but never used the each element, i.e, "drhd->iommu". drivers/iommu/intel-iommu.c: In function 'iommu_prepare_static_identity_mapping': drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but not used [-Wunused-but-set-variable] struct intel_iommu *iommu; Fixed the warning by appending a compiler attribute __maybe_unused for it. Link: http://lkml.kernel.org/r/20190523013314.2732-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01spdxcheck.py: fix directory structuresVincenzo Frascino
The LICENSE directory has recently changed structure and this makes spdxcheck fails as per below: FAIL: "Blob or Tree named 'other' not found" Traceback (most recent call last): File "scripts/spdxcheck.py", line 240, in <module> spdx = read_spdxdata(repo) File "scripts/spdxcheck.py", line 41, in read_spdxdata for el in lictree[d].traverse(): [...] KeyError: "Blob or Tree named 'other' not found" Fix the script to restore the correctness on checkpatch License checking. References: 62be257e986d ("LICENSES: Rename other to deprecated") References: 8ea8814fcdcb ("LICENSES: Clearly mark dual license only licenses") Link: http://lkml.kernel.org/r/20190523084755.56739-1-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Joe Perches <joe@perches.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Cline <jcline@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01kasan: initialize tag to 0xff in __kasan_kmallocNathan Chancellor
When building with -Wuninitialized and CONFIG_KASAN_SW_TAGS unset, Clang warns: mm/kasan/common.c:484:40: warning: variable 'tag' is uninitialized when used here [-Wuninitialized] kasan_unpoison_shadow(set_tag(object, tag), size); ^~~ set_tag ignores tag in this configuration but clang doesn't realize it at this point in its pipeline, as it points to arch_kasan_set_tag as being the point where it is used, which will later be expanded to (void *)(object) without a use of tag. Initialize tag to 0xff, as it removes this warning and doesn't change the meaning of the code. Link: https://github.com/ClangBuiltLinux/linux/issues/465 Link: http://lkml.kernel.org/r/20190502163057.6603-1-natechancellor@gmail.com Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01z3fold: fix sheduling while atomicVitaly Wool
kmem_cache_alloc() may be called from z3fold_alloc() in atomic context, so we need to pass correct gfp flags to avoid "scheduling while atomic" bug. Link: http://lkml.kernel.org/r/20190523153245.119dfeed55927e8755250ddd@gmail.com Fixes: 7c2b8baa61fe5 ("mm/z3fold.c: add structure for buddy handles") Signed-off-by: Vitaly Wool <vitaly.vul@sony.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not setFabiano Rosas
CLK_GET_RATE_NOCACHE depends on CONFIG_COMMON_CLK. Importing constants.py when CONFIG_COMMON_CLK is not defined causes: (gdb) lx-symbols (...) File "scripts/gdb/linux/proc.py", line 15, in <module> from linux import constants File "scripts/gdb/linux/constants.py", line 2, in <module> LX_CLK_GET_RATE_NOCACHE = gdb.parse_and_eval("CLK_GET_RATE_NOCACHE") gdb.error: No symbol "CLK_GET_RATE_NOCACHE" in current context. Link: http://lkml.kernel.org/r/20190523195313.24701-1-farosas@linux.ibm.com Fixes: e7e6f462c1be ("scripts/gdb: print cached rate in lx-clk-summary") Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Leonard Crestez <leonard.crestez@nxp.com> Cc: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01mm/gup: continue VM_FAULT_RETRY processing even for pre-faultsMike Rapoport
When get_user_pages*() is called with pages = NULL, the processing of VM_FAULT_RETRY terminates early without actually retrying to fault-in all the pages. If the pages in the requested range belong to a VMA that has userfaultfd registered, handle_userfault() returns VM_FAULT_RETRY *after* user space has populated the page, but for the gup pre-fault case there's no actual retry and the caller will get no pages although they are present. This issue was uncovered when running post-copy memory restore in CRIU after d9c9ce34ed5c ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails"). After this change, the copying of FPU state to the sigframe switched from copy_to_user() variants which caused a real page fault to get_user_pages() with pages parameter set to NULL. In post-copy mode of CRIU, the destination memory is managed with userfaultfd and lack of the retry for pre-fault case in get_user_pages() causes a crash of the restored process. Making the pre-fault behavior of get_user_pages() the same as the "normal" one fixes the issue. Link: http://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com Fixes: d9c9ce34ed5c ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Andrei Vagin <avagin@gmail.com> [https://travis-ci.org/avagin/linux/builds/533184940] Tested-by: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Borislav Petkov <bp@suse.de> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01ocfs2: fix error path kobject memory leakTobin C. Harding
If a call to kobject_init_and_add() fails we should call kobject_put() otherwise we leak memory. Add call to kobject_put() in the error path of call to kobject_init_and_add(). Please note, this has the side effect that the release method is called if kobject_init_and_add() fails. Link: http://lkml.kernel.org/r/20190513033458.2824-1-tobin@kernel.org Signed-off-by: Tobin C. Harding <tobin@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01memcg: make it work on sparse non-0-node systemsJiri Slaby
We have a single node system with node 0 disabled: Scanning NUMA topology in Northbridge 24 Number of physical nodes 2 Skipping disabled node 0 Node 1 MemBase 0000000000000000 Limit 00000000fbff0000 NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff] This causes crashes in memcg when system boots: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] ... RIP: 0010:list_lru_add+0x94/0x170 ... Call Trace: d_lru_add+0x44/0x50 dput.part.34+0xfc/0x110 __fput+0x108/0x230 task_work_run+0x9f/0xc0 exit_to_usermode_loop+0xf5/0x100 It is reproducible as far as 4.12. I did not try older kernels. You have to have a new enough systemd, e.g. 241 (the reason is unknown -- was not investigated). Cannot be reproduced with systemd 234. The system crashes because the size of lru array is never updated in memcg_update_all_list_lrus and the reads are past the zero-sized array, causing dereferences of random memory. The root cause are list_lru_memcg_aware checks in the list_lru code. The test in list_lru_memcg_aware is broken: it assumes node 0 is always present, but it is not true on some systems as can be seen above. So fix this by avoiding checks on node 0. Remember the memcg-awareness by a bool flag in struct list_lru. Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01mm, memcg: consider subtrees in memory.eventsChris Down
memory.stat and other files already consider subtrees in their output, and we should too in order to not present an inconsistent interface. The current situation is fairly confusing, because people interacting with cgroups expect hierarchical behaviour in the vein of memory.stat, cgroup.events, and other files. For example, this causes confusion when debugging reclaim events under low, as currently these always read "0" at non-leaf memcg nodes, which frequently causes people to misdiagnose breach behaviour. The same confusion applies to other counters in this file when debugging issues. Aggregation is done at write time instead of at read-time since these counters aren't hot (unlike memory.stat which is per-page, so it does it at read time), and it makes sense to bundle this with the file notifications. After this patch, events are propagated up the hierarchy: [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 0 oom 0 oom_kill 0 [root@ktst ~]# systemd-run -p MemoryMax=1 true Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 7 oom 1 oom_kill 1 As this is a change in behaviour, this can be reverted to the old behaviour by mounting with the `memory_localevents' flag set. However, we use the new behaviour by default as there's a lack of evidence that there are any current users of memory.events that would find this change undesirable. akpm: this is a behaviour change, so Cc:stable. THis is so that forthcoming distros which use cgroup v2 are more likely to pick up the revised behaviour. Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.name Signed-off-by: Chris Down <chris@chrisdown.name> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01prctl_set_mm: downgrade mmap_sem to read lockMichal Koutný
The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap semaphore taken.") added synchronization of reading argument/environment boundaries under mmap_sem. Later commit 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct") avoided the coarse use of mmap_sem in similar situations. But there still remained two places that (mis)use mmap_sem. get_cmdline should also use arg_lock instead of mmap_sem when it reads the boundaries. The second place that should use arg_lock is in prctl_set_mm. By protecting the boundaries fields with the arg_lock, we can downgrade mmap_sem to reader lock (analogous to what we already do in prctl_set_mm_map). [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/20190502125203.24014-3-mkoutny@suse.com Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct") Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01prctl_set_mm: refactor checks from validate_prctl_mapMichal Koutný
Despite comment of validate_prctl_map claims there are no capability checks, it is not completely true since commit 4d28df6152aa ("prctl: Allow local CAP_SYS_ADMIN changing exe_file"). Extract the check out of the function and make the function perform purely arithmetic checks. This patch should not change any behavior, it is mere refactoring for following patch. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/20190502125203.24014-2-mkoutny@suse.com Signed-off-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Mateusz Guzik <mguzik@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01kernel/fork.c: make max_threads symbol staticKefeng Wang
Fix build warning, kernel/fork.c:125:5: warning: symbol 'max_threads' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20190516015118.140561-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changesSebastian Andrzej Siewior
include/linux/cpumask.h: In function 'cpumask_parse': include/linux/cpumask.h:636:21: error: implicit declaration of function 'strchrnul'; did you mean 'strchr'? [-Werror=implicit-function-declaration] Because arch/arm/boot/compressed/decompress.c does #define _LINUX_STRING_H_ preventing linux/string.h from providing strchrnul. It also #includes asm/string.h, which for arm has a declaration of strchr(), explaining why this didn't use to fail. Link: http://lkml.kernel.org/r/20190528115346.f5a7kn3hdnuf5rts@linutronix.de Fixes: 3713a4e1fdb8d ("include/linux/cpumask.h: fix double string traverse in cpumask_parse") Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yury Norov <ynorov@marvell.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAKDavid Rientjes
CONFIG_DEBUG_SLAB_LEAK has been removed, so remove it from defconfig. Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1905201015460.96074@chino.kir.corp.google.com Fixes: 7878c231dae0 ("slab: remove /proc/slab_allocators") Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01mm/vmalloc.c: fix typo in commentAndrew Morton
Reported-by: Nicholas Joll <najoll@posteo.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>