summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-06-14tracing: Fix out-of-range read in trace_stack_print()Eiichi Tsukata
Puts range check before dereferencing the pointer. Reproducer: # echo stacktrace > trace_options # echo 1 > events/enable # cat trace > /dev/null KASAN report: ================================================================== BUG: KASAN: use-after-free in trace_stack_print+0x26b/0x2c0 Read of size 8 at addr ffff888069d20000 by task cat/1953 CPU: 0 PID: 1953 Comm: cat Not tainted 5.2.0-rc3+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 Call Trace: dump_stack+0x8a/0xce print_address_description+0x60/0x224 ? trace_stack_print+0x26b/0x2c0 ? trace_stack_print+0x26b/0x2c0 __kasan_report.cold+0x1a/0x3e ? trace_stack_print+0x26b/0x2c0 kasan_report+0xe/0x20 trace_stack_print+0x26b/0x2c0 print_trace_line+0x6ea/0x14d0 ? tracing_buffers_read+0x700/0x700 ? trace_find_next_entry_inc+0x158/0x1d0 s_show+0xea/0x310 seq_read+0xaa7/0x10e0 ? seq_escape+0x230/0x230 __vfs_read+0x7c/0x100 vfs_read+0x16c/0x3a0 ksys_read+0x121/0x240 ? kernel_write+0x110/0x110 ? perf_trace_sys_enter+0x8a0/0x8a0 ? syscall_slow_exit_work+0xa9/0x410 do_syscall_64+0xb7/0x390 ? prepare_exit_to_usermode+0x165/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f867681f910 Code: b6 fe ff ff 48 8d 3d 0f be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d f9 2d 2c 00 00 75 10 b8 00 00 00 00 04 RSP: 002b:00007ffdabf23488 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f867681f910 RDX: 0000000000020000 RSI: 00007f8676cde000 RDI: 0000000000000003 RBP: 00007f8676cde000 R08: ffffffffffffffff R09: 0000000000000000 R10: 0000000000000871 R11: 0000000000000246 R12: 00007f8676cde000 R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000000ec0 Allocated by task 1214: save_stack+0x1b/0x80 __kasan_kmalloc.constprop.0+0xc2/0xd0 kmem_cache_alloc+0xaf/0x1a0 getname_flags+0xd2/0x5b0 do_sys_open+0x277/0x5a0 do_syscall_64+0xb7/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 1214: save_stack+0x1b/0x80 __kasan_slab_free+0x12c/0x170 kmem_cache_free+0x8a/0x1c0 putname+0xe1/0x120 do_sys_open+0x2c5/0x5a0 do_syscall_64+0xb7/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff888069d20000 which belongs to the cache names_cache of size 4096 The buggy address is located 0 bytes inside of 4096-byte region [ffff888069d20000, ffff888069d21000) The buggy address belongs to the page: page:ffffea0001a74800 refcount:1 mapcount:0 mapping:ffff88806ccd1380 index:0x0 compound_mapcount: 0 flags: 0x100000000010200(slab|head) raw: 0100000000010200 dead000000000100 dead000000000200 ffff88806ccd1380 raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888069d1ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888069d1ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888069d20000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888069d20080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888069d20100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Link: http://lkml.kernel.org/r/20190610040016.5598-1-devel@etsukata.com Fixes: 4285f2fcef80 ("tracing: Remove the ULONG_MAX stack trace hackery") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-06-14gfs2: Fix rounding error in gfs2_iomap_page_prepareAndreas Gruenbacher
The pos and len arguments to the iomap page_prepare callback are not block aligned, so we need to take that into account when computing the number of blocks. Fixes: d0a22a4b03b8 ("gfs2: Fix iomap write page reclaim deadlock") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-14Merge tag 'mac80211-for-davem-2019-06-14' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Various fixes, all over: * a few memory leaks * fixes for management frame protection security and A2/A3 confusion (affecting TDLS as well) * build fix for certificates * etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14net: phylink: further mac_config documentation improvementsRussell King - ARM Linux admin
While reviewing the DPAA2 work, it has become apparent that we need better documentation about which members of the phylink link state structure are valid in the mac_config call. Improve this documentation. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Here are some arm64 fixes for -rc5. The only non-trivial change (in terms of the diffstat) is fixing our SVE ptrace API for big-endian machines, but the majority of this is actually the addition of much-needed comments and updates to the documentation to try to avoid this mess biting us again in future. There are still a couple of small things on the horizon, but nothing major at this point. Summary: - Fix broken SVE ptrace API when running in a big-endian configuration - Fix performance regression due to off-by-one in TLBI range checking - Fix build regression when using Clang" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/sve: Fix missing SVE/FPSIMD endianness conversions arm64: tlbflush: Ensure start/end of address range are aligned to stride arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS
2019-06-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/devm_memremap_pages: fix final page put race PCI/P2PDMA: track pgmap references per resource, not globally lib/genalloc: introduce chunk owners PCI/P2PDMA: fix the gen_pool_add_virt() failure path mm/devm_memremap_pages: introduce devm_memunmap_pages drivers/base/devres: introduce devm_release_action() mm/vmscan.c: fix trying to reclaim unevictable LRU page coredump: fix race condition between collapse_huge_page() and core dumping mm/mlock.c: change count_mm_mlocked_page_nr return type mm: mmu_gather: remove __tlb_reset_range() for force flush fs/ocfs2: fix race in ocfs2_dentry_attach_lock() mm/vmscan.c: fix recent_rotated history mm/mlock.c: mlockall error for flag MCL_ONFAULT scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node mm: memcontrol: don't batch updates of local VM stats and events
2019-06-14Merge branch 'drm-fixes-5.2' of git://people.freedesktop.org/~agd5f/linux ↵Daniel Vetter
into drm-fixes Fixes for 5.2: - Extend previous vce fix for resume to uvd and vcn - Fix bounds checking in ras debugfs interface - Fix a regression on SI using amdgpu Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613021856.3307-1-alexander.deucher@amd.com
2019-06-14Merge tag 'iommu-fixes-v5.2-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - three fixes for Intel VT-d to fix a potential dead-lock, a formatting fix and a bit setting fix - one fix for the ARM-SMMU to make it work on some platforms with sub-optimal SMMU emulation * tag 'iommu-fixes-v5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/arm-smmu: Avoid constant zero in TLBI writes iommu/vt-d: Set the right field for Page Walk Snoop iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock iommu: Add missing new line for dma type
2019-06-14Merge tag 'gpio-v5.2-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "A single fix for the PCA953x driver affecting some fringe variants of the chip" * tag 'gpio-v5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: hack to fix 24 bit gpio expanders
2019-06-14nfc: Ensure presence of required attributes in the deactivate_target handlerYoung Xiao
Check that the NFC_ATTR_TARGET_INDEX attributes (in addition to NFC_ATTR_DEVICE_INDEX) are provided by the netlink client prior to accessing them. This prevents potential unhandled NULL pointer dereference exceptions which can be triggered by malicious user-mode programs, if they omit one or both of these attributes. Signed-off-by: Young Xiao <92siuyang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14Merge tag 'sound-5.2-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "It might feel like deja vu to receive a bulk of changes at rc5, and it happens again; we've got a collection of fixes for ASoC. Most of fixes are targeted for the newly merged SOF (Sound Open Firmware) stuff and the relevant fixes for Intel platforms. Other than that, there are a few regression fixes for the recent ASoC core changes and HD-audio quirk, as well as a couple of FireWire fixes and for other ASoC codecs" * tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (54 commits) Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops" ALSA: ice1712: Check correct return value to snd_i2c_sendbytes (EWS/DMX 6Fire) ALSA: oxfw: allow PCM capture for Stanton SCS.1m ALSA: firewire-motu: fix destruction of data for isochronous resources ASoC: Intel: sst: fix kmalloc call with wrong flags ASoC: core: Fix deadlock in snd_soc_instantiate_card() SoC: rt274: Fix internal jack assignment in set_jack callback ALSA: hdac: fix memory release for SST and SOF drivers ASoC: SOF: Intel: hda: use the defined ppcap functions ASoC: core: move DAI pre-links initiation to snd_soc_instantiate_card ASoC: Intel: cht_bsw_rt5672: fix kernel oops with platform_name override ASoC: Intel: cht_bsw_nau8824: fix kernel oops with platform_name override ASoC: Intel: bytcht_es8316: fix kernel oops with platform_name override ASoC: Intel: cht_bsw_max98090: fix kernel oops with platform_name override ASoC: sun4i-i2s: Add offset to RX channel select ASoC: sun4i-i2s: Fix sun8i tx channel offset mask ASoC: max98090: remove 24-bit format support if RJ is 0 ASoC: da7219: Fix build error without CONFIG_I2C ASoC: SOF: Intel: hda: Fix COMPILE_TEST build error ASoC: SOF: fix DSP oops definitions in FW ABI ...
2019-06-14btrfs: start readahead also in seed devicesNaohiro Aota
Currently, btrfs does not consult seed devices to start readahead. As a result, if readahead zone is added to the seed devices, btrfs_reada_wait() indefinitely wait for the reada_ctl to finish. You can reproduce the hung by modifying btrfs/163 to have larger initial file size (e.g. xfs_io pwrite 4M instead of current 256K). Fixes: 7414a03fbf9e ("btrfs: initial readahead code and prototypes") Cc: stable@vger.kernel.org # 3.2+: ce7791ffee1e: Btrfs: fix race between readahead and device replace/removal Cc: stable@vger.kernel.org # 3.2+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-06-14x86/kasan: Fix boot with 5-level paging and KASANAndrey Ryabinin
Since commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") kernel doesn't boot with KASAN on 5-level paging machines. The bug is actually in early_p4d_offset() and introduced by commit 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging") early_p4d_offset() tries to convert pgd_val(*pgd) value to a physical address. This doesn't make sense because pgd_val() already contains the physical address. It did work prior to commit d52888aa2753 because the result of "__pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK" was the same as "pgd_val(*pgd) & PTE_PFN_MASK". __pa_nodebug() just set some high bits which were masked out by applying PTE_PFN_MASK. After the change of the PAGE_OFFSET offset in commit d52888aa2753 __pa_nodebug(pgd_val(*pgd)) started to return a value with more high bits set and PTE_PFN_MASK wasn't enough to mask out all of them. So it returns a wrong not even canonical address and crashes on the attempt to dereference it. Switch back to pgd_val() & PTE_PFN_MASK to cure the issue. Fixes: 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging") Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: kasan-dev@googlegroups.com Cc: stable@vger.kernel.org Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190614143149.2227-1-aryabinin@virtuozzo.com
2019-06-14cfg80211: report measurement start TSF correctlyAvraham Stern
Instead of reporting the AP's TSF, host time was reported. Fix it. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14cfg80211: fix memory leak of wiphy device nameEric Biggers
In wiphy_new_nm(), if an error occurs after dev_set_name() and device_initialize() have already been called, it's necessary to call put_device() (via wiphy_free()) to avoid a memory leak. Reported-by: syzbot+7fddca22578bc67c3fe4@syzkaller.appspotmail.com Fixes: 1f87f7d3a3b4 ("cfg80211: add rfkill support") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14cfg80211: util: fix bit count off by oneMordechay Goodstein
The bits of Rx MCS Map in VHT capability were enumerated with index transform - index i -> (i + 1) bit => nss i. BUG! while it should be - index i -> (i + 1) bit => (i + 1) nss. The bug was exposed in commit a53b2a0b1245 ("iwlwifi: mvm: implement VHT extended NSS support in rs.c"), where iwlwifi started using the function. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Fixes: b0aa75f0b1b2 ("ieee80211: add new VHT capability fields/parsing") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14mac80211: do not start any work during reconfigure flowNaftali Goldstein
It is not a good idea to try to perform any work (e.g. send an auth frame) during reconfigure flow. Prevent this from happening, and at the end of the reconfigure flow requeue all the works. Signed-off-by: Naftali Goldstein <naftali.goldstein@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14cfg80211: use BIT_ULL in cfg80211_parse_mbssid_data()Luca Coelho
The seen_indices variable is u64 and in other parts of the code we assume mbssid_index_ie[2] can be up to 45, so we should use the 64-bit versions of BIT, namely, BIT_ULL(). Reported-by: Dan Carpented <dan.carpenter@oracle.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14mac80211: only warn once on chanctx_conf being NULLYibo Zhao
In multiple SSID cases, it takes time to prepare every AP interface to be ready in initializing phase. If a sta already knows everything it needs to join one of the APs and sends authentication to the AP which is not fully prepared at this point of time, AP's channel context could be NULL. As a result, warning message occurs. Even worse, if the AP is under attack via tools such as MDK3 and massive authentication requests are received in a very short time, console will be hung due to kernel warning messages. WARN_ON_ONCE() could be a better way for indicating warning messages without duplicate messages to flood the console. Johannes: We still need to address the underlying problem, but we don't really have a good handle on it yet. Suppress the worst side-effects for now. Signed-off-by: Zhi Chen <zhichen@codeaurora.org> Signed-off-by: Yibo Zhao <yiboz@codeaurora.org> [johannes: add note, change subject] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14mac80211: drop robust management frames from unknown TAJohannes Berg
When receiving a robust management frame, drop it if we don't have rx->sta since then we don't have a security association and thus couldn't possibly validate the frame. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-14gpu: ipu-v3: image-convert: Fix image downsize coefficientsSteve Longerbeam
The output of the IC downsizer unit in both dimensions must be <= 1024 before being passed to the IC resizer unit. This was causing corrupted images when: input_dim > 1024, and input_dim / 2 < output_dim < input_dim Some broken examples were 1920x1080 -> 1024x768 and 1920x1080 -> 1280x1080. Fixes: 70b9b6b3bcb21 ("gpu: ipu-v3: image-convert: calculate per-tile resize coefficients") Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-06-14gpu: ipu-v3: image-convert: Fix input bytesperline for packed formatsSteve Longerbeam
The input bytesperline calculation for packed pixel formats was incorrect. The min/max clamping values must be multiplied by the packed bits-per-pixel. This was causing corrupted converted images when the input format was RGB4 (probably also other input packed formats). Fixes: d966e23d61a2c ("gpu: ipu-v3: image-convert: fix bytesperline adjustment") Reported-by: Harsha Manjula Mallikarjun <Harsha.ManjulaMallikarjun@in.bosch.com> Suggested-by: Harsha Manjula Mallikarjun <Harsha.ManjulaMallikarjun@in.bosch.com> Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-06-14gpu: ipu-v3: image-convert: Fix input bytesperline width/height alignSteve Longerbeam
The output width and height alignment values were being used in the input bytesperline calculation. Fix by separating local vars w_align and h_align into w_align_in, h_align_in, w_align_out, and h_align_out. Fixes: d966e23d61a2c ("gpu: ipu-v3: image-convert: fix bytesperline adjustment") Signed-off-by: Steve Longerbeam <slongerbeam@gmail.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-06-14thunderbolt: Implement CIO reset correctly for Titan RidgeMika Westerberg
When starting ICM firmware on Apple systems we need to perform CIO reset as part of the flow. However, it turns out that the reset register has changed to another location in Titan Ridge. Fix this by introducing ->cio_reset() callback with corresponding implementations for Alpine and Titan Ridge. Fixes: c4630d6ae6e3 ("thunderbolt: Start firmware on Titan Ridge Apple systems") Reported-by: Peter Bowen <pzb@amazon.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-14ARM: davinci: da8xx: specify dma_coherent_mask for lcdcBartosz Golaszewski
The lcdc device is missing the dma_coherent_mask definition causing the following warning on da850-evm: da8xx_lcdc da8xx_lcdc.0: found Sharp_LK043T1DG01 panel ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/dma/mapping.c:247 dma_alloc_attrs+0xc8/0x110 Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.2.0-rc3-00077-g16d72dd4891f #18 Hardware name: DaVinci DA850/OMAP-L138/AM18x EVM [<c000fce8>] (unwind_backtrace) from [<c000d900>] (show_stack+0x10/0x14) [<c000d900>] (show_stack) from [<c001a4f8>] (__warn+0xec/0x114) [<c001a4f8>] (__warn) from [<c001a634>] (warn_slowpath_null+0x3c/0x48) [<c001a634>] (warn_slowpath_null) from [<c0065860>] (dma_alloc_attrs+0xc8/0x110) [<c0065860>] (dma_alloc_attrs) from [<c02820f8>] (fb_probe+0x228/0x5a8) [<c02820f8>] (fb_probe) from [<c02d3e9c>] (platform_drv_probe+0x48/0x9c) [<c02d3e9c>] (platform_drv_probe) from [<c02d221c>] (really_probe+0x1d8/0x2d4) [<c02d221c>] (really_probe) from [<c02d2474>] (driver_probe_device+0x5c/0x168) [<c02d2474>] (driver_probe_device) from [<c02d2728>] (device_driver_attach+0x58/0x60) [<c02d2728>] (device_driver_attach) from [<c02d27b0>] (__driver_attach+0x80/0xbc) [<c02d27b0>] (__driver_attach) from [<c02d047c>] (bus_for_each_dev+0x64/0xb4) [<c02d047c>] (bus_for_each_dev) from [<c02d1590>] (bus_add_driver+0xe4/0x1d8) [<c02d1590>] (bus_add_driver) from [<c02d301c>] (driver_register+0x78/0x10c) [<c02d301c>] (driver_register) from [<c000a5c0>] (do_one_initcall+0x48/0x1bc) [<c000a5c0>] (do_one_initcall) from [<c05cae6c>] (kernel_init_freeable+0x10c/0x1d8) [<c05cae6c>] (kernel_init_freeable) from [<c048a000>] (kernel_init+0x8/0xf4) [<c048a000>] (kernel_init) from [<c00090e0>] (ret_from_fork+0x14/0x34) Exception stack(0xc6837fb0 to 0xc6837ff8) 7fa0: 00000000 00000000 00000000 00000000 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 8a8073511be81dd2 ]--- Add a 32-bit mask to the platform device's definition. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2019-06-14ARM: davinci: da850-evm: call regulator_has_full_constraints()Bartosz Golaszewski
The BB expander at 0x21 i2c bus 1 fails to probe on da850-evm because the board doesn't set has_full_constraints to true in the regulator API. Call regulator_has_full_constraints() at the end of board registration just like we do in da850-lcdk and da830-evm. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2019-06-14timekeeping: Repair ktime_get_coarse*() granularityThomas Gleixner
Jason reported that the coarse ktime based time getters advance only once per second and not once per tick as advertised. The code reads only the monotonic base time, which advances once per second. The nanoseconds are accumulated on every tick in xtime_nsec up to a second and the regular time getters take this nanoseconds offset into account, but the ktime_get_coarse*() implementation fails to do so. Add the accumulated xtime_nsec value to the monotonic base time to get the proper per tick advancing coarse tinme. Fixes: b9ff604cff11 ("timekeeping: Add ktime_get_coarse_with_offset") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: Sultan Alsawaf <sultan@kerneltoast.com> Cc: Waiman Long <longman@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906132136280.1791@nanos.tec.linutronix.de
2019-06-14Merge tag 'drm-misc-fixes-2019-06-13' of ↵Daniel Vetter
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Sean writes: meson: A few G12A fixes across the driver (Neil) quirks: A couple quirks for GPD devices (Hans) gem_shmem: Use writecombine when vmapping non-dmabuf BOs (Boris) panfrost: A couple tweaks to requiring devfreq (Neil & Ezequiel) edid: Ensure we return the override mode when ddc probe fails (Jani) Cc: Hans de Goede <hdegoede@redhat.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Ezequiel Garcia <ezequiel@collabora.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20190613143946.GA24233@art_vandelay
2019-06-14Revert "ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops"Hui Wang
This reverts commit 9cb40eb184c4220d244a532bd940c6345ad9dbd9. This patch introduces noise and headphone playback issue after rebooting or suspending/resuming. Let us revert it. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=203831 Fixes: 9cb40eb184c4 ("ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops") Cc: <stable@vger.kernel.org> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-06-13mm/devm_memremap_pages: fix final page put raceDan Williams
Logan noticed that devm_memremap_pages_release() kills the percpu_ref drops all the page references that were acquired at init and then immediately proceeds to unplug, arch_remove_memory(), the backing pages for the pagemap. If for some reason device shutdown actually collides with a busy / elevated-ref-count page then arch_remove_memory() should be deferred until after that reference is dropped. As it stands the "wait for last page ref drop" happens *after* devm_memremap_pages_release() returns, which is obviously too late and can lead to crashes. Fix this situation by assigning the responsibility to wait for the percpu_ref to go idle to devm_memremap_pages() with a new ->cleanup() callback. Implement the new cleanup callback for all devm_memremap_pages() users: pmem, devdax, hmm, and p2pdma. Link: http://lkml.kernel.org/r/155727339156.292046.5432007428235387859.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: 41e94a851304 ("add devm_memremap_pages") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13PCI/P2PDMA: track pgmap references per resource, not globallyDan Williams
In preparation for fixing a race between devm_memremap_pages_release() and the final put of a page from the device-page-map, allocate a percpu-ref per p2pdma resource mapping. Link: http://lkml.kernel.org/r/155727338646.292046.9922678317501435597.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13lib/genalloc: introduce chunk ownersDan Williams
The p2pdma facility enables a provider to publish a pool of dma addresses for a consumer to allocate. A genpool is used internally by p2pdma to collect dma resources, 'chunks', to be handed out to consumers. Whenever a consumer allocates a resource it needs to pin the 'struct dev_pagemap' instance that backs the chunk selected by pci_alloc_p2pmem(). Currently that reference is taken globally on the entire provider device. That sets up a lifetime mismatch whereby the p2pdma core needs to maintain hacks to make sure the percpu_ref is not released twice. This lifetime mismatch also stands in the way of a fix to devm_memremap_pages() whereby devm_memremap_pages_release() must wait for the percpu_ref ->release() callback to complete before it can proceed to teardown pages. So, towards fixing this situation, introduce the ability to store a 'chunk owner' at gen_pool_add() time, and a facility to retrieve the owner at gen_pool_{alloc,free}() time. For p2pdma this will be used to store and recall individual dev_pagemap reference counter instances per-chunk. Link: http://lkml.kernel.org/r/155727338118.292046.13407378933221579644.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13PCI/P2PDMA: fix the gen_pool_add_virt() failure pathDan Williams
The pci_p2pdma_add_resource() implementation immediately frees the pgmap if gen_pool_add_virt() fails. However, that means that when @dev triggers a devres release devm_memremap_pages_release() will crash trying to access the freed @pgmap. Use the new devm_memunmap_pages() to manually free the mapping in the error path. Link: http://lkml.kernel.org/r/155727337603.292046.13101332703665246702.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Fixes: 52916982af48 ("PCI/P2PDMA: Support peer-to-peer memory") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/devm_memremap_pages: introduce devm_memunmap_pagesDan Williams
Use the new devm_release_action() facility to allow devm_memremap_pages_release() to be manually triggered. Link: http://lkml.kernel.org/r/155727337088.292046.5774214552136776763.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13drivers/base/devres: introduce devm_release_action()Dan Williams
Patch series "mm/devm_memremap_pages: Fix page release race", v2. Logan audited the devm_memremap_pages() shutdown path and noticed that it was possible to proceed to arch_remove_memory() before all potential page references have been reaped. Introduce a new ->cleanup() callback to do the work of waiting for any straggling page references and then perform the percpu_ref_exit() in devm_memremap_pages_release() context. For p2pdma this involves some deeper reworks to reference count resources on a per-instance basis rather than a per pci-device basis. A modified genalloc api is introduced to convey a driver-private pointer through gen_pool_{alloc,free}() interfaces. Also, a devm_memunmap_pages() api is introduced since p2pdma does not auto-release resources on a setup failure. The dax and pmem changes pass the nvdimm unit tests, and the p2pdma changes should now pass testing with the pci_p2pdma_release() fix. Jrme, how does this look for HMM? This patch (of 6): The devm_add_action() facility allows a resource allocation routine to add custom devm semantics. One such user is devm_memremap_pages(). There is now a need to manually trigger devm_memremap_pages_release(). Introduce devm_release_action() so the release action can be triggered via a new devm_memunmap_pages() api in a follow-on change. Link: http://lkml.kernel.org/r/155727336530.292046.2926860263201336366.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/vmscan.c: fix trying to reclaim unevictable LRU pageMinchan Kim
There was the below bug report from Wu Fangsuo. On the CMA allocation path, isolate_migratepages_range() could isolate unevictable LRU pages and reclaim_clean_page_from_list() can try to reclaim them if they are clean file-backed pages. page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24 flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053 raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80 page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page)) page->mem_cgroup:ffffffc09bf3ee80 ------------[ cut here ]------------ kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3 Hardware name: ASR AQUILAC EVB (DT) task: ffffffc00a54cd00 task.stack: ffffffc009b00000 PC is at shrink_page_list+0x1998/0x3240 LR is at shrink_page_list+0x1998/0x3240 pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045 sp : ffffffc009b05940 .. shrink_page_list+0x1998/0x3240 reclaim_clean_pages_from_list+0x3c0/0x4f0 alloc_contig_range+0x3bc/0x650 cma_alloc+0x214/0x668 ion_cma_allocate+0x98/0x1d8 ion_alloc+0x200/0x7e0 ion_ioctl+0x18c/0x378 do_vfs_ioctl+0x17c/0x1780 SyS_ioctl+0xac/0xc0 Wu found it's due to commit ad6b67041a45 ("mm: remove SWAP_MLOCK in ttu"). Before that, unevictable pages go to cull_mlocked so that we can't reach the VM_BUG_ON_PAGE line. To fix the issue, this patch filters out unevictable LRU pages from the reclaim_clean_pages_from_list in CMA. Link: http://lkml.kernel.org/r/20190524071114.74202-1-minchan@kernel.org Fixes: ad6b67041a45 ("mm: remove SWAP_MLOCK in ttu") Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Wu Fangsuo <fangsuowu@asrmicro.com> Debugged-by: Wu Fangsuo <fangsuowu@asrmicro.com> Tested-by: Wu Fangsuo <fangsuowu@asrmicro.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Pankaj Suryawanshi <pankaj.suryawanshi@einfochips.com> Cc: <stable@vger.kernel.org> [4.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13coredump: fix race condition between collapse_huge_page() and core dumpingAndrea Arcangeli
When fixing the race conditions between the coredump and the mmap_sem holders outside the context of the process, we focused on mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping"), but those aren't the only cases where the mmap_sem can be taken outside of the context of the process as Michal Hocko noticed while backporting that commit to older -stable kernels. If mmgrab() is called in the context of the process, but then the mm_count reference is transferred outside the context of the process, that can also be a problem if the mmap_sem has to be taken for writing through that mm_count reference. khugepaged registration calls mmgrab() in the context of the process, but the mmap_sem for writing is taken later in the context of the khugepaged kernel thread. collapse_huge_page() after taking the mmap_sem for writing doesn't modify any vma, so it's not obvious that it could cause a problem to the coredump, but it happens to modify the pmd in a way that breaks an invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page() needs the mmap_sem for writing just to block concurrent page faults that call pmd_trans_huge_lock(). Specifically the invariant that "!pmd_trans_huge()" cannot become a "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs. The coredump will call __get_user_pages() without mmap_sem for reading, which eventually can invoke a lockless page fault which will need a functional pmd_trans_huge_lock(). So collapse_huge_page() needs to use mmget_still_valid() to check it's not running concurrently with the coredump... as long as the coredump can invoke page faults without holding the mmap_sem for reading. This has "Fixes: khugepaged" to facilitate backporting, but in my view it's more a bug in the coredump code that will eventually have to be rewritten to stop invoking page faults without the mmap_sem for reading. So the long term plan is still to drop all mmget_still_valid(). Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Peter Xu <peterx@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/mlock.c: change count_mm_mlocked_page_nr return typeswkhack
On a 64-bit machine the value of "vma->vm_end - vma->vm_start" may be negative when using 32 bit ints and the "count >> PAGE_SHIFT"'s result will be wrong. So change the local variable and return value to unsigned long to fix the problem. Link: http://lkml.kernel.org/r/20190513023701.83056-1-swkhack@gmail.com Fixes: 0cf2f6f6dc60 ("mm: mlock: check against vma for actual mlock() size") Signed-off-by: swkhack <swkhack@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm: mmu_gather: remove __tlb_reset_range() for force flushYang Shi
A few new fields were added to mmu_gather to make TLB flush smarter for huge page by telling what level of page table is changed. __tlb_reset_range() is used to reset all these page table state to unchanged, which is called by TLB flush for parallel mapping changes for the same range under non-exclusive lock (i.e. read mmap_sem). Before commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap"), the syscalls (e.g. MADV_DONTNEED, MADV_FREE) which may update PTEs in parallel don't remove page tables. But, the forementioned commit may do munmap() under read mmap_sem and free page tables. This may result in program hang on aarch64 reported by Jan Stancek. The problem could be reproduced by his test program with slightly modified below. ---8<--- static int map_size = 4096; static int num_iter = 500; static long threads_total; static void *distant_area; void *map_write_unmap(void *ptr) { int *fd = ptr; unsigned char *map_address; int i, j = 0; for (i = 0; i < num_iter; i++) { map_address = mmap(distant_area, (size_t) map_size, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (map_address == MAP_FAILED) { perror("mmap"); exit(1); } for (j = 0; j < map_size; j++) map_address[j] = 'b'; if (munmap(map_address, map_size) == -1) { perror("munmap"); exit(1); } } return NULL; } void *dummy(void *ptr) { return NULL; } int main(void) { pthread_t thid[2]; /* hint for mmap in map_write_unmap() */ distant_area = mmap(0, DISTANT_MMAP_SIZE, PROT_WRITE | PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); munmap(distant_area, (size_t)DISTANT_MMAP_SIZE); distant_area += DISTANT_MMAP_SIZE / 2; while (1) { pthread_create(&thid[0], NULL, map_write_unmap, NULL); pthread_create(&thid[1], NULL, dummy, NULL); pthread_join(thid[0], NULL); pthread_join(thid[1], NULL); } } ---8<--- The program may bring in parallel execution like below: t1 t2 munmap(map_address) downgrade_write(&mm->mmap_sem); unmap_region() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); free_pgtables() tlb->freed_tables = 1 tlb->cleared_pmds = 1 pthread_exit() madvise(thread_stack, 8M, MADV_DONTNEED) zap_page_range() tlb_gather_mmu() inc_tlb_flush_pending(tlb->mm); tlb_finish_mmu() if (mm_tlb_flush_nested(tlb->mm)) __tlb_reset_range() __tlb_reset_range() would reset freed_tables and cleared_* bits, but this may cause inconsistency for munmap() which do free page tables. Then it may result in some architectures, e.g. aarch64, may not flush TLB completely as expected to have stale TLB entries remained. Use fullmm flush since it yields much better performance on aarch64 and non-fullmm doesn't yields significant difference on x86. The original proposed fix came from Jan Stancek who mainly debugged this issue, I just wrapped up everything together. Jan's testing results: v5.2-rc2-24-gbec7550cca10 -------------------------- mean stddev real 37.382 2.780 user 1.420 0.078 sys 54.658 1.855 v5.2-rc2-24-gbec7550cca10 + "mm: mmu_gather: remove __tlb_reset_range() for force flush" ---------------------------------------------------------------------------------------_ mean stddev real 37.119 2.105 user 1.548 0.087 sys 55.698 1.357 [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1558322252-113575-1-git-send-email-yang.shi@linux.alibaba.com Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Jan Stancek <jstancek@redhat.com> Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Suggested-by: Will Deacon <will.deacon@arm.com> Tested-by: Will Deacon <will.deacon@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> [4.20+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13fs/ocfs2: fix race in ocfs2_dentry_attach_lock()Wengang Wang
ocfs2_dentry_attach_lock() can be executed in parallel threads against the same dentry. Make that race safe. The race is like this: thread A thread B (A1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias, so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl1 ..... (B1) enter ocfs2_dentry_attach_lock, seeing dentry->d_fsdata is NULL, and no alias found by ocfs2_find_local_alias so kmalloc a new ocfs2_dentry_lock structure to local variable "dl", dl2. ...... (A2) set dentry->d_fsdata with dl1, call ocfs2_dentry_lock() and increase dl1->dl_lockres.l_ro_holders to 1 on success. ...... (B2) set dentry->d_fsdata with dl2 call ocfs2_dentry_lock() and increase dl2->dl_lockres.l_ro_holders to 1 on success. ...... (A3) call ocfs2_dentry_unlock() and decrease dl2->dl_lockres.l_ro_holders to 0 on success. .... (B3) call ocfs2_dentry_unlock(), decreasing dl2->dl_lockres.l_ro_holders, but see it's zero now, panic Link: http://lkml.kernel.org/r/20190529174636.22364-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reported-by: Daniel Sobe <daniel.sobe@nxp.com> Tested-by: Daniel Sobe <daniel.sobe@nxp.com> Reviewed-by: Changwei Ge <gechangwei@live.cn> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/vmscan.c: fix recent_rotated historyKirill Tkhai
Johannes pointed out that after commit 886cf1901db9 ("mm: move recent_rotated pages calculation to shrink_inactive_list()") we lost all zone_reclaim_stat::recent_rotated history. This fixes it. Link: http://lkml.kernel.org/r/155905972210.26456.11178359431724024112.stgit@localhost.localdomain Fixes: 886cf1901db9 ("mm: move recent_rotated pages calculation to shrink_inactive_list()") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/mlock.c: mlockall error for flag MCL_ONFAULTPotyra, Stefan
If mlockall() is called with only MCL_ONFAULT as flag, it removes any previously applied lockings and does nothing else. This behavior is counter-intuitive and doesn't match the Linux man page. For mlockall(): EINVAL Unknown flags were specified or MCL_ONFAULT was specified without either MCL_FUTURE or MCL_CURRENT. Consequently, return the error EINVAL, if only MCL_ONFAULT is passed. That way, applications will at least detect that they are calling mlockall() incorrectly. Link: http://lkml.kernel.org/r/20190527075333.GA6339@er01809n.ebgroup.elektrobit.com Fixes: b0f205c2a308 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage") Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILEManuel Traut
At least for ARM64 kernels compiled with the crosstoolchain from Debian/stretch or with the toolchain from kernel.org the line number is not decoded correctly by 'decode_stacktrace.sh': $ echo "[ 136.513051] f1+0x0/0xc [kcrash]" | \ CROSS_COMPILE=/opt/gcc-8.1.0-nolibc/aarch64-linux/bin/aarch64-linux- \ ./scripts/decode_stacktrace.sh /scratch/linux-arm64/vmlinux \ /scratch/linux-arm64 \ /nfs/debian/lib/modules/4.20.0-devel [ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:68) kcrash If addr2line from the toolchain is used the decoded line number is correct: [ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:57) kcrash Link: http://lkml.kernel.org/r/20190527083425.3763-1-manut@linutronix.de Signed-off-by: Manuel Traut <manut@linutronix.de> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm/list_lru.c: fix memory leak in __memcg_init_list_lru_nodeShakeel Butt
Syzbot reported following memory leak: ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79 BUG: memory leak unreferenced object 0xffff888114f26040 (size 32): comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s) hex dump (first 32 bytes): 40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff @`......@`...... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: slab_post_alloc_hook mm/slab.h:439 [inline] slab_alloc mm/slab.c:3326 [inline] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 kmalloc include/linux/slab.h:547 [inline] __memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352 memcg_init_list_lru_node mm/list_lru.c:375 [inline] memcg_init_list_lru mm/list_lru.c:459 [inline] __list_lru_init+0x193/0x2a0 mm/list_lru.c:626 alloc_super+0x2e0/0x310 fs/super.c:269 sget_userns+0x94/0x2a0 fs/super.c:609 sget+0x8d/0xb0 fs/super.c:660 mount_nodev+0x31/0xb0 fs/super.c:1387 fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236 legacy_get_tree+0x27/0x80 fs/fs_context.c:661 vfs_get_tree+0x2e/0x120 fs/super.c:1476 do_new_mount fs/namespace.c:2790 [inline] do_mount+0x932/0xc50 fs/namespace.c:3110 ksys_mount+0xab/0x120 fs/namespace.c:3319 __do_sys_mount fs/namespace.c:3333 [inline] __se_sys_mount fs/namespace.c:3330 [inline] __x64_sys_mount+0x26/0x30 fs/namespace.c:3330 do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This is a simple off by one bug on the error path. Link: http://lkml.kernel.org/r/20190528043202.99980-1-shakeelb@google.com Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists") Reported-by: syzbot+f90a420dfe2b1b03cb2c@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-13mm: memcontrol: don't batch updates of local VM stats and eventsJohannes Weiner
The kernel test robot noticed a 26% will-it-scale pagefault regression from commit 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty"). This appears to be caused by bouncing the additional cachelines from the new hierarchical statistics counters. We can fix this by getting rid of the batched local counters instead. Originally, there were *only* group-local counters, and they were fully maintained per cpu. A reader of a stats file high up in the cgroup tree would have to walk the entire subtree and collect each level's per-cpu counters to get the recursive view. This was prohibitively expensive, and so we switched to per-cpu batched updates of the local counters during a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting"), reducing the complexity from nr_subgroups * nr_cpus to nr_subgroups. With growing machines and cgroup trees, the tree walk itself became too expensive for monitoring top-level groups, and this is when the culprit patch added hierarchy counters on each cgroup level. When the per-cpu batch size would be reached, both the local and the hierarchy counters would get batch-updated from the per-cpu delta simultaneously. This makes local and hierarchical counter reads blazingly fast, but it unfortunately makes the write-side too cache line intense. Since local counter reads were never a problem - we only centralized them to accelerate the hierarchy walk - and use of the local counters are becoming rarer due to replacement with hierarchical views (ongoing rework in the page reclaim and workingset code), we can make those local counters unbatched per-cpu counters again. The scheme will then be as such: when a memcg statistic changes, the writer will: - update the local counter (per-cpu) - update the batch counter (per-cpu). If the batch is full: - spill the batch into the group's atomic_t - spill the batch into all ancestors' atomic_ts - empty out the batch counter (per-cpu) when a local memcg counter is read, the reader will: - collect the local counter from all cpus when a hiearchy memcg counter is read, the reader will: - read the atomic_t We might be able to simplify this further and make the recursive counters unbatched per-cpu counters as well (batch upward propagation, but leave per-cpu collection to the readers), but that will require a more in-depth analysis and testing of all the callsites. Deal with the immediate regression for now. Link: http://lkml.kernel.org/r/20190521151647.GB2870@cmpxchg.org Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: kernel test robot <rong.a.chen@intel.com> Tested-by: kernel test robot <rong.a.chen@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-14PCI: PM: Skip devices in D0 for suspend-to-idleRafael J. Wysocki
Commit d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue") attempted to avoid a problem with devices whose drivers want them to stay in D0 over suspend-to-idle and resume, but it did not go as far as it should with that. Namely, first of all, the power state of a PCI bridge with a downstream device in D0 must be D0 (based on the PCI PM spec r1.2, sec 6, table 6-1, if the bridge is not in D0, there can be no PCI transactions on its secondary bus), but that is not actively enforced during system-wide PM transitions, so use the skip_bus_pm flag introduced by commit d491f2b75237 for that. Second, the configuration of devices left in D0 (whatever the reason) during suspend-to-idle need not be changed and attempting to put them into D0 again by force is pointless, so explicitly avoid doing that. Fixes: d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue") Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-06-13Merge branch 'bpf-ppc-div-fix'Daniel Borkmann
Naveen N. Rao says: ==================== The first patch updates DIV64 overflow tests to properly detect error conditions. The second patch fixes powerpc64 JIT to generate the proper unsigned division instruction for BPF_ALU64. ==================== Acked-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13powerpc/bpf: use unsigned division instruction for 64-bit operationsNaveen N. Rao
BPF_ALU64 div/mod operations are currently using signed division, unlike BPF_ALU32 operations. Fix the same. DIV64 and MOD64 overflow tests pass with this fix. Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13bpf: fix div64 overflow tests to properly detect errorsNaveen N. Rao
If the result of the division is LLONG_MIN, current tests do not detect the error since the return value is truncated to a 32-bit value and ends up being 0. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-13bpf: sync BPF_FIB_LOOKUP flag changes with BPF uapiMartynas Pumputis
Sync the changes to the flags made in "bpf: simplify definition of BPF_FIB_LOOKUP related flags" with the BPF UAPI headers. Doing in a separate commit to ease syncing of github/libbpf. Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>