linux.git - Linus' kernel tree

Age	Commit message	Author
2021-08-20	Merge tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Regularly scheduled fixes. The ttm one solves a problem of GPU drivers failing to load if debugfs is off in Kconfig, otherwise the i915 and mediatek, and amdgpu fixes all fairly normal. Nouveau has a couple of display fixes, but it has a fix for a longstanding race condition in its memory manager code, and the fix mostly removes some code that wasn't working properly and has no userspace users. This fix makes the diffstat kinda larger but in a good (negative line-count) way. core: - fix drm_wait_vblank uapi copying bug ttm: - fix debugfs init when debugfs is off amdgpu: - vega10 SMU workload fix - DCN VM fix - DCN 3.01 watermark fix amdkfd: - SVM fix nouveau: - ampere display fixes - remove MM misfeature to fix a longstanding race condition i915: - tweaked display workaround for all PCHs - eDP MSO pipe sanity for ADL-P fix - remove unused symbol export mediatek: - AAL output size setting - Delete component in remove function" * tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Use DCN30 watermark calc for DCN301 drm/i915/dp: remove superfluous EXPORT_SYMBOL() drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P drm/i915: Tweaked Wa_14010685332 for all PCHs drm/nouveau: rip out nvkm_client.super drm/nouveau: block a bunch of classes from userspace drm/nouveau/fifo/nv50-: rip out dma channels drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences drm/nouveau/disp: power down unused DP links during init drm/nouveau: recognise GA107 drm: Copy drm_wait_vblank to user before returning drm/amd/display: Ensure DCN save after VM setup drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure drm/amd/pm: change the workload type for some cards Revert "drm/amd/pm: fix workload mismatch on vega10" drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails drm/mediatek: Add component_del in OVL and COLOR remove function drm/mediatek: Add AAL output size configuration
2021-08-20	Merge tag 'pci-v5.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci	Linus Torvalds
	Pull PCI fixes from Bjorn Helgaas: - Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer (Rahul Tanwar) - Add Jim Quinlan et al as Broadcom STB PCIe maintainers (Jim Quinlan) - Increase D3hot-to-D0 delay for AMD Renoir/Cezanne XHCI (Marcin Bachry) - Correct iomem_get_mapping() usage for legacy_mem sysfs (Krzysztof Wilczyński) * tag 'pci-v5.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/sysfs: Use correct variable for the legacy_mem sysfs object PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI MAINTAINERS: Add Jim Quinlan et al as Broadcom STB PCIe maintainers MAINTAINERS: Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer
2021-08-20	Merge tag 'mmc-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc	Linus Torvalds
	Pull MMC host fixes from Ulf Hansson: - dw_mmc: Fix hang on data CRC error - mmci: Fix voltage switch procedure for the stm32 variant - sdhci-iproc: Fix some clock issues for BCM2711 - sdhci-msm: Fixup software timeout value * tag 'mmc-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711 mmc: sdhci-iproc: Cap min clock frequency on BCM2711 mmc: sdhci-msm: Update the software timeout value for sdhc mmc: mmci: stm32: Check when the voltage switch procedure should be done mmc: dw_mmc: Fix hang on data CRC error
2021-08-20	Merge tag 'sound-5.14-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound	Linus Torvalds
	Pull more sound fixes from Takashi Iwai: "This is a quick follow up for 5.14: a fix for a very recently introduced regression on ASoC Intel Atom driver, and another trivial HD-audio quirk for HP laptops" * tag 'sound-5.14-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: intel: atom: Fix breakage for PCM buffer address setup ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8
2021-08-20	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux	Linus Torvalds
	Pull arm64 fixes from Will Deacon: - Fix cleaning of vDSO directories - Ensure CNTHCTL_EL2 is fully initialised when booting at EL2 * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: initialize all of CNTHCTL_EL2 arm64: clean vdso & vdso32 files
2021-08-20	Merge branch 'acpi-pm'	Rafael J. Wysocki
	* acpi-pm: ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
2021-08-20	Merge tag 'iommu-fixes-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu	Linus Torvalds
	Pull iommu fixes from Joerg Roedel: - Fix for a potential NULL-ptr dereference in IOMMU core code - Two resource leak fixes - Cache flush fix in the Intel VT-d driver * tag 'iommu-fixes-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry() iommu/vt-d: Fix PASID reference leak iommu: Check if group is NULL before remove device iommu/dma: Fix leak in non-contiguous API
2021-08-20	Merge branch 'pm-opp'	Rafael J. Wysocki
	* pm-opp: opp: Drop empty-table checks from _put functions opp: remove WARN when no valid OPPs remain
2021-08-20	RDMA/rxe: Zero out index member of struct rxe_queue	Xiao Yang
	1) New index member of struct rxe_queue was introduced but not zeroed so the initial value of index may be random. 2) The current index is not masked off to index_mask. In this case producer_addr() and consumer_addr() will get an invalid address by the random index and then accessing the invalid address triggers the following panic: "BUG: unable to handle page fault for address: ffff9ae2c07a1414" Fix the issue by using kzalloc() to zero out index member. Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space") Link: https://lore.kernel.org/r/20210820111509.172500-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
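A minimal userspace sketch of the idea in the commit above, not the rxe code itself: `calloc()` stands in for `kzalloc()` so the private index starts at zero. The struct and the `demo_*` helpers are hypothetical names introduced for illustration. The masking step in `demo_producer_addr()` goes beyond the commit's fix, which only zeroes the member, and is shown as one way to keep the computed address in bounds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model of a ring queue; not the rxe structures. */
struct demo_queue {
    uint32_t index;       /* private slot index, must start at 0 */
    uint32_t index_mask;  /* num_slots - 1, num_slots is a power of two */
    size_t elem_size;
    unsigned char *data;
};

struct demo_queue *demo_queue_init(uint32_t num_slots, size_t elem_size)
{
    struct demo_queue *q;

    /* The mask trick below only works for power-of-two slot counts. */
    assert(num_slots != 0 && (num_slots & (num_slots - 1)) == 0);

    /* calloc() plays the role of kzalloc(): every member starts at zero. */
    q = calloc(1, sizeof(*q));
    if (!q)
        return NULL;

    q->data = calloc(num_slots, elem_size);
    if (!q->data) {
        free(q);
        return NULL;
    }
    q->index_mask = num_slots - 1;
    q->elem_size = elem_size;
    return q;
}

/* Masking the index keeps the computed address inside the buffer. */
unsigned char *demo_producer_addr(const struct demo_queue *q)
{
    return q->data + (size_t)(q->index & q->index_mask) * q->elem_size;
}

void demo_queue_free(struct demo_queue *q)
{
    if (q) {
        free(q->data);
        free(q);
    }
}
```

With a zeroed structure, the first call to `demo_producer_addr()` returns the start of the buffer rather than an address derived from leftover heap contents.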
2021-08-20	hugetlb: don't pass page cache pages to restore_reserve_on_error	Mike Kravetz
	syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1]. This BUG triggers if the HPageRestoreReserve flag is set on a page in the page cache. It should never be set, as the routine huge_add_to_page_cache explicitly clears the flag after adding a page to the cache. The only code other than huge page allocation which sets the flag is restore_reserve_on_error. It will potentially set the flag in rare out of memory conditions. syzbot was injecting errors to cause memory allocation errors which exercised this specific path. The code in restore_reserve_on_error is doing the right thing. However, there are instances where pages in the page cache were being passed to restore_reserve_on_error. This is incorrect, as once a page goes into the cache reservation information will not be modified for the page until it is removed from the cache. Error paths do not remove pages from the cache, so even in the case of error, the page will remain in the cache and no reservation adjustment is needed. Modify routines that potentially call restore_reserve_on_error with a page cache page to no longer do so. Note on fixes tag: Prior to commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality") the routine would not process page cache pages because the HPageRestoreReserve flag is not set on such pages. Therefore, this issue could not be triggered. The code added by commit 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality") is needed and correct. It exposed incorrect calls to restore_reserve_on_error which is the root cause addressed by this commit. 
[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/ Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: <syzbot+67654e51e54455f1c585@syzkaller.appspotmail.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE	Marco Elver
	Originally the addr != NULL check was meant to take care of the case where __kfence_pool == NULL (KFENCE is disabled). However, this does not work for addresses where addr > 0 && addr < KFENCE_POOL_SIZE. This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or any other faulting access with addr < KFENCE_POOL_SIZE. While the kernel would likely crash, the stack traces and report might be confusing due to double faults upon KFENCE's attempt to unprotect such an address. Fix it by just checking that __kfence_pool != NULL instead. Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure") Signed-off-by: Marco Elver <elver@google.com> Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Acked-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> [5.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
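The difference between the two checks can be modeled in userspace. This is a sketch under stated assumptions: `DEMO_POOL_SIZE`, `demo_pool` and the `demo_*` functions are hypothetical stand-ins for `KFENCE_POOL_SIZE`, `__kfence_pool` and `is_kfence_address()`, and addresses are plain integers so that no invalid pointer is ever formed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_POOL_SIZE 0x4000u   /* hypothetical stand-in for KFENCE_POOL_SIZE */

static uintptr_t demo_pool;      /* 0 models __kfence_pool == NULL (disabled) */

void demo_set_pool(uintptr_t base)
{
    /* The pool must not wrap around the end of the address space. */
    assert(base == 0 || base <= UINTPTR_MAX - DEMO_POOL_SIZE);
    demo_pool = base;
}

/* Old-style check: rejects only addr == 0 when the pool is disabled. */
bool demo_is_pool_addr_old(uintptr_t addr)
{
    return (addr - demo_pool) < DEMO_POOL_SIZE && addr != 0;
}

/* Fixed check: nothing is a pool address while the pool is disabled. */
bool demo_is_pool_addr_new(uintptr_t addr)
{
    return (addr - demo_pool) < DEMO_POOL_SIZE && demo_pool != 0;
}
```

With the pool disabled, a small non-zero address such as `0x10` passes the old check because the unsigned subtraction yields a value below the pool size, while the fixed check rejects it.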
2021-08-20	mm: vmscan: fix missing psi annotation for node_reclaim()	Johannes Weiner
	In a debugging session the other day, Rik noticed that node_reclaim() was missing memstall annotations. This means we'll miss pressure and lost productivity resulting from reclaim on an overloaded local NUMA node when vm.zone_reclaim_mode is enabled. There haven't been any reports, but that's likely because vm.zone_reclaim_mode hasn't been a commonly used feature recently, and the intersection between such setups and psi users is probably nil. But secondary memory such as CXL-connected DIMMS, persistent memory etc, and the page demotion patches that handle them (https://lore.kernel.org/lkml/20210401183216.443C4443@viggo.jf.intel.com/) could soon make this a more common codepath again. Link: https://lkml.kernel.org/r/20210818152457.35846-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Rik van Riel <riel@surriel.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	mm/hwpoison: retry with shake_page() for unhandlable pages	Naoya Horiguchi
	HWPoisonHandlable() sometimes returns false for typical user pages due to races with average memory events like transfers over LRU lists. This causes failures in hwpoison handling. There's retry code for such a case but does not work because the retry loop reaches the retry limit too quickly before the page settles down to handlable state. Let get_any_page() call shake_page() to fix it. [naoya.horiguchi@nec.com: get_any_page(): return -EIO when retry limit reached] Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev Fixes: 25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation") Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reported-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> [5.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim	Johannes Weiner
	We've noticed occasional OOM killing when memory.low settings are in effect for cgroups. This is unexpected and undesirable as memory.low is supposed to express non-OOMing memory priorities between cgroups. The reason for this is proportional memory.low reclaim. When cgroups are below their memory.low threshold, reclaim passes them over in the first round, and then retries if it couldn't find pages anywhere else. But when cgroups are slightly above their memory.low setting, page scan force is scaled down and diminished in proportion to the overage, to the point where it can cause reclaim to fail as well - only in that case we currently don't retry, and instead trigger OOM. To fix this, hook proportional reclaim into the same retry logic we have in place for when cgroups are skipped entirely. This way if reclaim fails and some cgroups were scanned with diminished pressure, we'll try another full-force cycle before giving up and OOMing. [akpm@linux-foundation.org: coding-style fixes] Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Leon Yang <lnyng@fb.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> [5.4+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	MAINTAINERS: update ClangBuiltLinux IRC chat	Nathan Chancellor
	Everyone has moved from Freenode to Libera, so update the channel entry in MAINTAINERS. Link: https://github.com/ClangBuiltLinux/linux/issues/1402 Link: https://lkml.kernel.org/r/20210818022339.3863058-1-nathan@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names	Mike Rapoport
	printk("%pGg") outputs these two flags as a hexadecimal number, rather than as a string, e.g.: GFP_KERNEL|0x1800000 Fix this by adding missing names of __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON flags to __def_gfpflag_names. Link: https://lkml.kernel.org/r/20210816133502.590-1-rppt@kernel.org Fixes: 013bb59dbb7c ("arm64: mte: handle tags zeroing at page allocation time") Fixes: c275c5c6d50a ("kasan: disable freed user page poisoning with HW tags") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
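The symptom described above is what a name-table based flag printer does with bits it has no name for. The following is a simplified userspace model, not the kernel's `%pGg` implementation; the flag values, `struct demo_flag_name` and `demo_format_flags()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bits, unrelated to the real GFP values. */
#define DEMO_FLAG_A 0x1ul
#define DEMO_FLAG_B 0x2ul
#define DEMO_FLAG_C 0x4ul

struct demo_flag_name {
    unsigned long mask;   /* assumed non-zero */
    const char *name;
};

/* Print known flags by name; any leftover bits fall back to hex. */
void demo_format_flags(unsigned long flags, const struct demo_flag_name *names,
                       size_t count, char *buf, size_t len)
{
    size_t used = 0;

    assert(len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < count && used < len; i++) {
        if ((flags & names[i].mask) != names[i].mask)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "|" : "", names[i].name);
        if (n < 0)
            return;
        used += (size_t)n;
        flags &= ~names[i].mask;   /* this bit is accounted for */
    }

    /* Bits without a table entry are printed as a raw number. */
    if (flags && used < len)
        snprintf(buf + used, len - used, "%s%#lx", used ? "|" : "", flags);
}
```

If the table lacks an entry for `DEMO_FLAG_C`, formatting `DEMO_FLAG_A | DEMO_FLAG_C` produces `A|0x4`; adding the missing entry turns that into `A|C`, which mirrors the effect of extending the name table.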
2021-08-20	mm/page_alloc: don't corrupt pcppage_migratetype	Doug Berger
	When placing pages on a pcp list, migratetype values over MIGRATE_PCPTYPES get added to the MIGRATE_MOVABLE pcp list. However, the actual migratetype is preserved in the page and should not be changed to MIGRATE_MOVABLE or the page may end up on the wrong free_list. The impact is that HIGHATOMIC or CMA pages getting bulk freed from the PCP lists could potentially end up on the wrong buddy list. There are various consequences but minimally NR_FREE_CMA_PAGES accounting could get screwed up. [mgorman@techsingularity.net: changelog update] Link: https://lkml.kernel.org/r/20210811182917.2607994-1-opendmb@gmail.com Fixes: df1acc856923 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
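A small model of the rule stated above: the list a page is queued on may be remapped, but the type recorded in the page must stay untouched. The enum, `struct demo_page`, `struct demo_pcp` and the `demo_*` functions are hypothetical, and the `>=` threshold is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical migrate types; only the first three have their own pcp list. */
enum demo_migratetype {
    DEMO_UNMOVABLE,
    DEMO_MOVABLE,
    DEMO_RECLAIMABLE,
    DEMO_PCPTYPES,                   /* number of pcp lists */
    DEMO_HIGHATOMIC = DEMO_PCPTYPES,
    DEMO_CMA,
    DEMO_TYPES
};

struct demo_page {
    int migratetype;                 /* the page's real type, kept as-is */
};

struct demo_pcp {
    int count[DEMO_PCPTYPES];        /* pages queued per pcp list */
};

/* Pick the pcp list without modifying the type stored in the page. */
int demo_pcp_list_index(const struct demo_page *page)
{
    int list = page->migratetype;

    assert(list >= 0 && list < DEMO_TYPES);
    if (list >= DEMO_PCPTYPES)
        list = DEMO_MOVABLE;         /* only the list choice is remapped */
    return list;
}

void demo_free_to_pcp(struct demo_pcp *pcp, const struct demo_page *page)
{
    pcp->count[demo_pcp_list_index(page)]++;
}
```

Because the remapping lives in a local variable, a later bulk free can still read the original type from the page and return it to the matching free list.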
2021-08-20	Revert "mm: swap: check if swap backing device is congested or not"	Yang Shi
	Due to the change about how block layer detects congestion the justification of commit 8fd2e0b505d1 ("mm: swap: check if swap backing device is congested or not") doesn't stand anymore, so the commit could be just reverted in order to solve the race reported by commit 2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff"). The fix was reverted by the previous patch. Link: https://lkml.kernel.org/r/20210810202936.2672-3-shy828301@gmail.com Signed-off-by: Yang Shi <shy828301@gmail.com> Suggested-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	Revert "mm/shmem: fix shmem_swapin() race with swapoff"	Yang Shi
	Due to the change about how block layer detects congestion the justification of commit 8fd2e0b505d1 ("mm: swap: check if swap backing device is congested or not") doesn't stand anymore, so the commit could be just reverted in order to solve the race reported by commit 2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff"), so the fix commit could be just reverted as well. And that fix is also kind of buggy as discussed by [1] and [2]. [1] https://lore.kernel.org/linux-mm/24187e5e-069-9f3f-cefe-39ac70783753@google.com/ [2] https://lore.kernel.org/linux-mm/e82380b9-3ad4-4a52-be50-6d45c7f2b5da@google.com/ Link: https://lkml.kernel.org/r/20210810202936.2672-2-shy828301@gmail.com Signed-off-by: Yang Shi <shy828301@gmail.com> Suggested-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-20	RDMA/efa: Free IRQ vectors on error flow	Gal Pressman
	Make sure to free the IRQ vectors in case the allocation doesn't return the expected number of IRQs. Fixes: b7f5e880f377 ("RDMA/efa: Add the efa module") Link: https://lore.kernel.org/r/20210811151131.39138-2-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-20	SUNRPC: Add documentation for the fail_sunrpc/ directory	Chuck Lever
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20	SUNRPC: Server-side disconnect injection	Chuck Lever
	Disconnect injection stress-tests the ability for both client and server implementations to behave resiliently in the face of network instability. A file called /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect enables administrators to turn off server-side disconnect injection while allowing other types of sunrpc errors to be injected. The default setting is that server-side disconnect injection is enabled (ignore=false). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20	SUNRPC: Move client-side disconnect injection	Chuck Lever
	Disconnect injection stress-tests the ability for both client and server implementations to behave resiliently in the face of network instability. Convert the existing client-side disconnect injection infrastructure to use the kernel's generic error injection facility. The generic facility has a richer set of injection criteria. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20	SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory	Chuck Lever
	This directory will contain a set of administrative controls for enabling error injection for kernel RPC consumers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-20	drm/amdgpu: Cancel delayed work when GFXOFF is disabled	Michel Dänzer
	schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3 Acked-by: Christian König <christian.koenig@amd.com> # v3 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20	drm/amdgpu: use the preferred pin domain after the check	Christian König
	For some reason we run into a use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-20	riscv: Fix a number of free'd resources in init_resources()	Petr Pavlu
	Function init_resources() allocates a boot memory block to hold an array of resources which it adds to iomem_resource. The array is filled in from its end and the function then attempts to free any unused memory at the beginning. The problem is that size of the unused memory is incorrectly calculated and this can result in releasing memory which is in use by active resources. Their data then gets corrupted later when the memory is reused by a different part of the system. Fix the size of the released memory to correctly match the number of unused resource entries. Fixes: ffe0e5261268 ("RISC-V: Improve init_resources()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Nick Kossifidis <mick@ics.forth.gr> Tested-by: Sunil V L <sunilvl@ventanamicro.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
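The size calculation at issue can be illustrated with a table that is filled from its last slot backwards. This is a sketch, not the riscv code: `struct demo_resource` and both `demo_*` functions are hypothetical, and the point is only that the unused prefix covers slots `0..idx`, that is `idx + 1` entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resource entry. */
struct demo_resource {
    unsigned long start;
    unsigned long end;
};

/*
 * Fill the table from its last slot backwards. Returns the index of the
 * highest unused slot, or -1 if every slot was used.
 */
int demo_fill_from_end(struct demo_resource *table, int capacity,
                       const struct demo_resource *src, int count)
{
    int idx = capacity - 1;

    assert(count >= 0 && count <= capacity);
    for (int i = 0; i < count; i++)
        table[idx--] = src[i];
    return idx;
}

/* Slots 0..idx are unused, so the releasable prefix holds idx + 1 entries. */
size_t demo_unused_bytes(int last_unused_idx)
{
    assert(last_unused_idx >= -1);
    return (size_t)(last_unused_idx + 1) * sizeof(struct demo_resource);
}
```

Releasing more than `demo_unused_bytes()` from the start of the table would overlap entries that are still in use, which is the kind of corruption the commit describes.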
2021-08-20	power: supply: core: Fix parsing of battery chemistry/technology	Dmitry Osipenko
	The power_supply_get_battery_info() fails if device-chemistry property is missing in a device-tree because error variable is propagated to the final return of the function, fix it. Fixes: 4eef766b7d4d ("power: supply: core: Parse battery chemistry/technology") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
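The bug pattern described above, an optional lookup whose error code leaks into the final return value, can be sketched as follows. Everything here is a hypothetical userspace model (`struct demo_node`, `demo_read_string()`, `demo_get_battery_info()`), not the power-supply core.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum demo_technology { DEMO_TECH_UNKNOWN, DEMO_TECH_LION };

/* Hypothetical device node: a NULL chemistry models a missing property. */
struct demo_node {
    const char *chemistry;
};

struct demo_battery_info {
    enum demo_technology technology;
};

/* Hypothetical property reader: fails when the property is absent. */
static int demo_read_string(const char *prop, const char **out)
{
    if (!prop)
        return -EINVAL;
    *out = prop;
    return 0;
}

int demo_get_battery_info(const struct demo_node *node,
                          struct demo_battery_info *info)
{
    const char *value;

    assert(node && info);
    info->technology = DEMO_TECH_UNKNOWN;

    /*
     * The property is optional: its lookup result is tested in place and
     * never stored in a variable that reaches the final return.
     */
    if (demo_read_string(node->chemistry, &value) == 0 &&
        strcmp(value, "lithium-ion") == 0)
        info->technology = DEMO_TECH_LION;

    return 0;
}
```

A node without the property then yields success with an unknown technology instead of an error.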
2021-08-20	e1000e: Do not take care about recovery NVM checksum	Sasha Neftin
	On new platforms, the NVM is read-only. Attempting to update the NVM is causing a lockup to occur. Do not attempt to write to the NVM on platforms where it's not supported. Emit an error message when the NVM checksum is invalid. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213667 Fixes: fb776f5d57ee ("e1000e: Add support for Tiger Lake") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Suggested-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20	e1000e: Fix the max snoop/no-snoop latency for 10M	Sasha Neftin
	We should decode the latency and the max_latency before comparing them directly. The latency should be presented as lat_enc = scale x value: lat_enc_d = (lat_enc & 0x3ff) x (1U << (5*((max_ltr_enc & 0x1c00) >> 10))) Fixes: cf8fb73c23aa ("e1000e: add support for LTR on I217/I218") Suggested-by: Yee Li <seven.yi.lee@gmail.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
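To see why the encoded values cannot be compared directly, here is a small decoder built from the masks quoted in the commit message (a 10-bit value in bits 0-9, a 3-bit scale in bits 10-12, each scale step multiplying by 2^5). The macro and function names are hypothetical, and the sketch assumes each encoded word is decoded with its own scale field.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LTR_VALUE_MASK  0x03ffu  /* bits 0-9: value */
#define DEMO_LTR_SCALE_MASK  0x1c00u  /* bits 10-12: scale */
#define DEMO_LTR_SCALE_SHIFT 10

/* Decode an encoded latency into a plain number: value * 2^(5 * scale). */
uint64_t demo_ltr_decode(uint16_t enc)
{
    uint64_t value = enc & DEMO_LTR_VALUE_MASK;
    unsigned int scale = (enc & DEMO_LTR_SCALE_MASK) >> DEMO_LTR_SCALE_SHIFT;

    assert(scale <= 7);              /* the 3-bit field guarantees this */
    return value << (5 * scale);     /* 64-bit, so scale 7 cannot overflow */
}

/* Compare two encoded latencies by their decoded magnitude. */
int demo_ltr_exceeds(uint16_t lat_enc, uint16_t max_enc)
{
    return demo_ltr_decode(lat_enc) > demo_ltr_decode(max_enc);
}
```

For example, `0x0401` (value 1, scale 1) decodes to 32 while `0x03ff` (value 1023, scale 0) decodes to 1023, so a raw integer comparison would order them the wrong way round.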
2021-08-20	igc: Use num_tx_queues when iterating over tx_ring queue	Toshiki Nishioka
	Use num_tx_queues rather than the IGC_MAX_TX_QUEUES fixed number 4 when iterating over tx_ring queue since instantiated queue count could be less than 4 where on-line cpu count is less than 4. Fixes: ec50a9d437f0 ("igc: Add support for taprio offloading") Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Tested-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20	igc: fix page fault when thunderbolt is unplugged	Aaron Ma
	After a Thunderbolt dock with an i225 is unplugged, the pciehp interrupt is triggered and the remove callback reads/writes an MMIO address that is already disconnected, which causes a page fault and hangs the system. Check the PCI state to remove the device safely. Trace: BUG: unable to handle page fault for address: 000000000000b604 Oops: 0000 [#1] SMP NOPTI RIP: 0010:igc_rd32+0x1c/0x90 [igc] Call Trace: igc_ptp_suspend+0x6c/0xa0 [igc] igc_ptp_stop+0x12/0x50 [igc] igc_remove+0x7f/0x1c0 [igc] pci_device_remove+0x3e/0xb0 __device_release_driver+0x181/0x240 Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Fixes: b03c49cde61f ("igc: Save PTP time before a reset") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-20	gfs2: Remove redundant check from gfs2_glock_dq	Bob Peterson
	In function gfs2_glock_dq, it checks to see if this is the fast path. Before this patch, it checked both "find_first_holder(gl) == NULL" and list_empty(&gl->gl_holders), which is redundant. If gl_holders is empty then find_first_holder must return NULL. This patch removes the redundancy. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Delay withdraw from atomic context	Bob Peterson
	Before this patch, if function __gfs2_ail_flush detected an error syncing the ail list, it called gfs2_ail_error which called gfs2_withdraw. Since __gfs2_ail_flush deals with a specific glock, we shouldn't withdraw immediately because the withdraw code (signal_our_withdraw) uses glocks in its processing. This patch changes the call from gfs2_withdraw to gfs2_withdraw_delayed which defers the withdraw until a more appropriate context, such as the logd daemon, discovers the intent to withdraw. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Don't call dlm after protocol is unmounted	Bob Peterson
	In the gfs2 withdraw sequence, the dlm protocol is unmounted with a call to lm_unmount. After a withdraw, users are allowed to unmount the withdrawn file system. But at that point we may still have glocks left over that we need to free via unmount's call to gfs2_gl_hash_clear. These glocks may have never been completed because of whatever problem caused the withdraw (IO errors or whatever). Before this patch, function gdlm_put_lock would still try to call into dlm to unlock these leftover glocks, which resulted in dlm returning -EINVAL because the lock space was abandoned. These glocks were never freed because there was no mechanism after that to free them. This patch adds a check to gdlm_put_lock to see if the locking protocol was inactive (DFL_UNMOUNT flag) and if so, free the glock and not make the invalid call into dlm. I could have combined this "if" with the one that follows, related to leftover glock LVBs, but I felt the code was more readable with its own if clause. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: don't stop reads while withdraw in progress	Bob Peterson
	When gfs2 withdraws a file system, it calls signal_our_withdraw which triggers another node to replay the withdrawing node's journal. Then it waits until it knows the journal has been replayed. Part of this wait is to repeatedly call check_journal_clean which calls gfs2_jdesc_check, which checks to see if the journal is sane. As part of its sanity checks it needs to re-read its journal's metadata. But with today's code, any attempt to re-read the metadata results in -EIO because of a check for the file system withdraw in function gfs2_meta_wait. This patch adds an additional check for SDF_WITHDRAW_IN_PROG, to tell if the read is done while the withdraw is in progress. In that case we allow the metadata read to not be rejected. Therefore the metadata check is done properly, so the withdraw sequence can finish normally. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Mark journal inodes as "don't cache"	Bob Peterson
	Before this patch, journal inodes were considered regular inodes, which meant that instead of evicting them, function iput_final would just put them on the lru for later processing. If the file system withdrew for whatever reason, the withdraw would never be seen until the inode was evicted, which could be indefinitely. This patch marks all journal inodes as "don't cache" which means function iput_final will evict them immediately, allowing us to properly recover the journal on other cluster nodes. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: nit: gfs2_drop_inode shouldn't return bool	Bob Peterson
	Today, gfs2_drop_inode can return "false" for an int value. I'm sure this was just an oversight. Change to int value. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Eliminate vestigial HIF_FIRST	Bob Peterson
	Holder flag HIF_FIRST is no longer used or needed, so remove it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Make recovery error more readable	Bob Peterson
	Before this patch, withdraws could cause an error that looked like: Journal recovery skipped for 0 until next mount. This patch changes it to a more readable: Journal recovery skipped for jid 0 until next mount. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	gfs2: Don't release and reacquire local statfs bh	Bob Peterson
	Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to read in the local statfs buffer_head at mount time and keep it around until unmount time. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	Merge branch irq/misc-5.15 into irq/irqchip-next	Marc Zyngier
	* irq/misc-5.15: Various irqchip fixes: - Fix edge interrupt support on loongson systems - Advertise lack of wake-up logic on mtk-sysirq - Fix mask tracking on the Apple AIC - Correct priority reading of arm64 pseudo-NMI when SCR_EL3.FIQ==0. Commits: irqchip/gic-v3: Fix priority comparison when non-secure priorities are used irqchip/apple-aic: Fix irq_disable from within irq handlers Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-20	irqchip/gic-v3: Fix priority comparison when non-secure priorities are used	Chen-Yu Tsai
	When non-secure priorities are used, compared to the raw priority set, the value read back from RPR is also right-shifted by one and the highest bit set. Add a macro to do the modifications to the raw priority when doing the comparison against the RPR value. This corrects the pseudo-NMI behavior when non-secure priorities in the GIC are used. Tested on 5.10 with the "IPI as pseudo-NMI" series [1] applied on MT8195. [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/ Fixes: 336780590990 ("irqchip/gic-v3: Support pseudo-NMIs when SCR_EL3.FIQ == 0") Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> [maz: Added comment contributed by Alex] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210811171505.1502090-1-wenst@chromium.org
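The adjustment described above (right shift by one, then set the highest bit) can be sketched in plain C. This is a minimal userspace illustration, not the macro added by the patch: the helper name `rpr_view_of_priority` and the boolean parameter are hypothetical, and an 8-bit priority field is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel macro): returns the value the
 * commit message says RPR reads back for a raw priority.
 * Assumption: priorities are 8 bits wide, so the highest bit is 0x80.
 */
static uint32_t rpr_view_of_priority(uint32_t raw_prio, bool nonsecure_priorities)
{
	if (nonsecure_priorities)
		return 0x80u | (raw_prio >> 1); /* shift right by one, set top bit */
	return raw_prio;                        /* otherwise compare the raw value */
}

/* Usage: compare a value read from RPR against the adjusted priority. */
static void check_rpr_view(void)
{
	assert(rpr_view_of_priority(0xa0, false) == 0xa0);
	assert(rpr_view_of_priority(0xa0, true) == 0xd0);
}
```

Comparing the RPR value against the raw priority directly would miss the match whenever non-secure priorities are in use, which is the mismatch the commit corrects.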
2021-08-20	gfs2: init system threads before freeze lock	Bob Peterson
	Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem if the system needs to withdraw due to IO errors early in the mount sequence, for example, while initializing the system statfs inode: 1. An IO error causes the statfs glock to not sync properly after recovery, and leaves items on the ail list. 2. The leftover items on the ail list cause its do_xmote call to fail, which makes it want to withdraw. But since the glock code cannot withdraw (because the withdraw sequence uses glocks) it relies upon the logd daemon to initiate the withdraw. 3. The withdraw can never be performed by the logd daemon because all this takes place before the logd daemon is started. This patch moves function init_threads from super.c to ops_fstype.c and it changes gfs2_fill_super to start its threads before holding the freeze lock, and if there's an error, stop its threads after releasing it. This allows the logd to run unblocked by the freeze lock. Thus, the logd daemon can perform its withdraw sequence properly. Fixes: 96b1454f2e8e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20	net: usb: pegasus: fixes of set_register(s) return value evaluation;	Petko Manolov
	- restore the behavior in enable_net_traffic() to avoid regressions - Jakub Kicinski; - hurried up and removed redundant assignment in pegasus_open() before yet another checker complains; Fixes: 8a160e2e9aeb ("net: usb: pegasus: Check the return value of get_geristers() and friends;") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Petko Manolov <petko.manolov@konsulko.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-20	net: qrtr: fix another OOB Read in qrtr_endpoint_post	Xiaolong Huang
	This check was incomplete; it did not consider the case where size is 0: if (len != ALIGN(size, 4) + hdrlen) goto err; If size from qrtr_hdr is 0, the result of ALIGN(size, 4) is 0. In the case of len == hdrlen and size == 0 in the header, this check won't fail, and if (cb->type == QRTR_TYPE_NEW_SERVER) { /* Remote node endpoint can bridge other distant nodes */ const struct qrtr_ctrl_pkt *pkt = data + hdrlen; qrtr_node_assign(node, le32_to_cpu(pkt->server.node)); } will also read out of bounds from data, which is an allocated block of hdrlen bytes. Fixes: 194ccc88297a ("net: qrtr: Support decoding incoming v2 packets") Fixes: ad9d24c9429e ("net: qrtr: fix OOB Read in qrtr_endpoint_post") Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
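The gap in the length check can be reproduced outside the kernel. The sketch below is an illustration under assumptions: `ALIGN4` stands in for the kernel's `ALIGN(size, 4)`, both function names are hypothetical, and the extra `size != 0` guard is one plausible way to close the hole rather than a quote of the actual diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for ALIGN(x, 4): round x up to a multiple of 4. */
#define ALIGN4(x) ((((size_t)(x)) + 3u) & ~(size_t)3u)

/* The check quoted in the commit message: passes when size == 0 and len == hdrlen. */
static bool length_check_before(size_t len, size_t size, size_t hdrlen)
{
	return len == ALIGN4(size) + hdrlen;
}

/* Assumed fix for illustration: additionally reject an empty payload. */
static bool length_check_with_size_guard(size_t len, size_t size, size_t hdrlen)
{
	return size != 0 && len == ALIGN4(size) + hdrlen;
}

/* Usage: a header-only packet slips through the first check only. */
static void check_length_checks(void)
{
	assert(length_check_before(32, 0, 32));
	assert(!length_check_with_size_guard(32, 0, 32));
}
```

With a header-only packet accepted, any later access at `data + hdrlen` points past the end of the buffer, which is the out-of-bounds read the commit describes.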
2021-08-20	irqchip/apple-aic: Fix irq_disable from within irq handlers	Sven Peter
	When disable_irq_nosync for an interrupt is called from within its interrupt handler, this interrupt is only marked as disabled with the intention to mask it when it triggers again. The AIC hardware however automatically masks the interrupt when it is read. aic_irq_eoi then unmasks it again if it's not disabled and not masked. This results in a state mismatch between the hardware state and the state kept in irq_data: The hardware interrupt is masked but IRQD_IRQ_MASKED is not set. Any further calls to unmask_irq will directly return and the interrupt can never be enabled again. Fix this by keeping the hardware and irq_data state in sync by unmasking in aic_irq_eoi if and only if the irq_data state also assumes the interrupt to be unmasked. Fixes: 76cde2639411 ("irqchip/apple-aic: Add support for the Apple Interrupt Controller") Signed-off-by: Sven Peter <sven@svenpeter.dev> Acked-by: Hector Martin <marcan@marcan.st> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210812100942.17206-1-sven@svenpeter.dev
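The state mismatch can be traced with a small userspace model. Everything here is a hypothetical simplification: `struct irq_model` and the `model_*` functions are invented for illustration and only mirror the behaviour the commit message describes, not the driver's real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line; not the driver's data structures. */
struct irq_model {
	bool hw_masked;   /* mask state in the modelled AIC hardware */
	bool sw_masked;   /* stands in for IRQD_IRQ_MASKED in irq_data */
	bool sw_disabled; /* stands in for the disabled flag in irq_data */
};

/* Reading the event masks the interrupt in hardware automatically. */
static void model_hw_read(struct irq_model *m) { m->hw_masked = true; }

/* Lazy disable from the handler: mark as disabled, leave the mask alone. */
static void model_disable_nosync(struct irq_model *m) { m->sw_disabled = true; }

/* EOI as described before the fix: unmask only if not disabled and not masked. */
static void model_eoi_before(struct irq_model *m)
{
	if (!m->sw_disabled && !m->sw_masked)
		m->hw_masked = false;
}

/* EOI as described after the fix: unmask iff software state says unmasked. */
static void model_eoi_after(struct irq_model *m)
{
	if (!m->sw_masked)
		m->hw_masked = false;
}

/* Enable: the unmask step returns early if software thinks it is unmasked. */
static void model_enable(struct irq_model *m)
{
	m->sw_disabled = false;
	if (!m->sw_masked)
		return;
	m->sw_masked = false;
	m->hw_masked = false;
}

/* Runs read -> disable_nosync -> EOI -> enable; true if hardware stays masked. */
static bool line_stuck_masked(bool use_fixed_eoi)
{
	struct irq_model m = { false, false, false };

	model_hw_read(&m);
	model_disable_nosync(&m);
	if (use_fixed_eoi)
		model_eoi_after(&m);
	else
		model_eoi_before(&m);
	model_enable(&m);
	return m.hw_masked;
}

/* Usage: the old EOI leaves the line stuck, the fixed EOI does not. */
static void check_model(void)
{
	assert(line_stuck_masked(false));
	assert(!line_stuck_masked(true));
}
```

In the model, the old EOI leaves the hardware masked while the software flag says unmasked, so the later enable does nothing and the line never fires again.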
2021-08-20	powerpc/64s: Fix scv implicit soft-mask table for relocated kernels	Nicholas Piggin
	The implicit soft-mask table addresses get relocated if they use a relative symbol like a label. This is right for code that runs relocated but not for unrelocated. The scv interrupt vectors run unrelocated, so absolute addresses are required for their soft-mask table entry. This fixes crashing with relocated kernels, usually an asynchronous interrupt hitting in the scv handler, then hitting the trap that checks whether r1 is in userspace. Fixes: 325678fd0522 ("powerpc/64s: add a table of implicit soft-masked addresses") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210820103431.1701240-1-npiggin@gmail.com
2021-08-20	spi: stm32: fix excluded_middle.cocci warnings	kernel test robot
	drivers/spi/spi-stm32.c:915:23-25: WARNING !A || A && B is equivalent to !A || B Condition !A || A && B is equivalent to !A || B. Generated by: scripts/coccinelle/misc/excluded_middle.cocci Fixes: 7ceb0b8a3ced ("spi: stm32: finalize message either on dma callback or EOT") CC: Alain Volmat <alain.volmat@foss.st.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Alain Volmat <alain.volmat@foss.st.com> Link: https://lore.kernel.org/r/20210713191004.GA14729@5eb5c2cbef84 Signed-off-by: Mark Brown <broonie@kernel.org>
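The equivalence reported by the Coccinelle script can be confirmed by checking all four input combinations. The function names below are hypothetical and unrelated to the spi-stm32 driver; this is only a truth-table check of the boolean identity.

```c
#include <assert.h>
#include <stdbool.h>

/* The flagged shape of the condition: !A || (A && B). */
static bool original_form(bool a, bool b)
{
	return !a || (a && b);
}

/* The simplified shape suggested by the warning: !A || B. */
static bool simplified_form(bool a, bool b)
{
	return !a || b;
}

/* Returns true if both forms give the same result for every (A, B) pair. */
static bool forms_agree_everywhere(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (original_form(a, b) != simplified_form(a, b))
				return false;
	return true;
}

/* Usage: the exhaustive check should hold. */
static void check_forms(void)
{
	assert(forms_agree_everywhere());
}
```

The inner `A &&` is redundant because the right-hand side of `||` is only evaluated when `!A` is false, that is, when A is already true.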
2021-08-20	locking/semaphore: Add might_sleep() to down_*() family	Xiaoming Ni
	A semaphore is a sleeping lock. Add might_sleep() to the down_*() family (with the exception of down_trylock()) to detect sleeping in atomic context. Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210809021215.19991-1-nixiaoming@huawei.com