linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2021-09-09	Merge tag 'dmaengine-5.15-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New drivers/devices - Support for Renesas RZ/G2L dma controller - New driver for AMD PTDMA controller Updates: - Big pile of idxd updates - Updates for Altera driver, stm32-dma, dw etc" * tag 'dmaengine-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (83 commits) dmaengine: sh: fix some NULL dereferences dmaengine: sh: Fix unused initialization of pointer lmdesc MAINTAINERS: Fix AMD PTDMA DRIVER entry dmaengine: ptdma: remove PT_OFFSET to avoid redefnition dmaengine: ptdma: Add debugfs entries for PTDMA dmaengine: ptdma: register PTDMA controller as a DMA resource dmaengine: ptdma: Initial driver for the AMD PTDMA dmaengine: fsl-dpaa2-qdma: Fix spelling mistake "faile" -> "failed" dmaengine: idxd: remove interrupt disable for dev_lock dmaengine: idxd: remove interrupt disable for cmd_lock dmaengine: idxd: fix setting up priv mode for dwq dmaengine: xilinx_dma: Set DMA mask for coherent APIs dmaengine: ti: k3-psil-j721e: Add entry for CSI2RX dmaengine: sh: Add DMAC driver for RZ/G2L SoC dmaengine: Extend the dma_slave_width for 128 bytes dt-bindings: dma: Document RZ/G2L bindings dmaengine: ioat: depends on !UML dmaengine: idxd: set descriptor allocation size to threshold for swq dmaengine: idxd: make submit failure path consistent on desc freeing dmaengine: idxd: remove interrupt flag for completion list spinlock ...
2021-09-09	arm64: mm: limit linear region to 51 bits for KVM in nVHE mode	Ard Biesheuvel
	KVM in nVHE mode divides up its VA space into two equal halves, and picks the half that does not conflict with the HYP ID map to map its linear region. This worked fine when the kernel's linear map itself was guaranteed to cover precisely as many bits of VA space, but this was changed by commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations"). The result is that, depending on the placement of the ID map, kernel-VA to hyp-VA translations may produce addresses that either conflict with other HYP mappings (including the ID map itself) or generate addresses outside of the 52-bit addressable range, neither of which is likely to lead to anything useful. Given that 52-bit capable cores are guaranteed to implement VHE, this only affects configurations such as pKVM where we opt into non-VHE mode even if the hardware is VHE capable. So just for these configurations, let's limit the kernel linear map to 51 bits and work around the problem. Fixes: f4693c2716b3 ("arm64: mm: extend linear region for 52-bit VA configurations") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210826165613.60774-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-09	fs/ntfs3: Show uid/gid always in show_options()	Kari Argillander
	Show options should show option according documentation when some value is not default or when ever coder wants. Uid/gid are problematic because it is hard to know which are defaults. In file system there is many different implementation for this problem. Some file systems show uid/gid when they are different than root, some when user has set them and some show them always. There is also problem that what if root uid/gid change. This code just choose to show them always. This way we do not need to think this any more. Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Rename mount option no_acs_rules > (no)acsrules	Kari Argillander
	Rename mount option no_acs_rules to (no)acsrules. This allow us to use possibility to mount with options noaclrules or aclrules. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Add iocharset= mount option as alias for nls=	Kari Argillander
	Other fs drivers are using iocharset= mount option for specifying charset. So add it also for ntfs3 and mark old nls= mount option as deprecated. Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Make mount option nohidden more universal	Kari Argillander
	If we call Opt_nohidden with just keyword hidden, then we can use hidden/nohidden when mounting. We already use this method for almoust all other parameters so it is just logical that this will use same method. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Init spi more in init_fs_context than fill_super	Kari Argillander
	init_fs_context() is meant to initialize s_fs_info (spi). Move spi initializing code there which we can initialize before fill_super(). Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Use new api for mounting	Kari Argillander
	We have now new mount api as described in Documentation/filesystems. We should use it as it gives us some benefits which are desribed here lore.kernel.org/linux-fsdevel/159646178122.1784947.11705396571718464082.stgit@warthog.procyon.org.uk/ Nls loading is changed a to load with string. This did make code also little cleaner. Also try to use fsparam_flag_no as much as possible. This is just nice little touch and is not mandatory but it should not make any harm. It is just convenient that we can use example acl/noacl mount options. Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Convert mount options to pointer in sbi	Kari Argillander
	Use pointer to mount options. We want to do this because we will use new mount api which will benefit that we have spi and mount options in different allocations. When we remount we do not have to make whole new spi it is enough that we will allocate just mount options. Please note that we can do example remount lot cleaner but things will change in next patch so this should be just functional. Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Remove unnecesarry remount flag handling	Kari Argillander
	Remove unnecesarry remount flag handling. This does not do anything for this driver. We have already set SB_NODIRATIME when we fill super. Also noatime should be set from mount option. Now for some reson we try to set it when remounting. Lazytime part looks like it is copied from f2fs and there is own mount parameter for it. That is why they use it. We do not set lazytime anywhere in our code. So basically this just blocks lazytime when remounting. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	fs/ntfs3: Remove unnecesarry mount option noatime	Kari Argillander
	Remove unnecesarry mount option noatime because this will be handled by VFS. Our option parser will never get opt like this. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Kari Argillander <kari.argillander@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09	io_uring: fail links of cancelled timeouts	Pavel Begunkov
	When we cancel a timeout we should mark it with REQ_F_FAIL, so linked requests are cancelled as well, but not queued for further execution. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fff625b44eeced3a5cae79f60e6acf3fbdf8f990.1631192135.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-09	MAINTAINERS: fix update references to stm32 audio bindings	Arnaud Pouliquen
	The 00d38fd8d2524 ("MAINTAINERS: update references to stm32 audio bindings") commit update the bindings reference, by removing bindings/sound/st,stm32-adfsdm.txt, to set the new reference to bindings/iio/adc/st,stm32-.yaml. This leads to "get_maintainer finds" the match for the Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.yaml, but also to the IIO bindings Documentation/devicetree/bindings/iio/adc/st,stm32-adc.yaml And The commit fixes only a part of the problem: Documentation/devicetree/bindings/sound/st,stm32-.txt file have been also moved to yaml. Update references to include all stm32 audio bindings file and exclude the st,stm32-adc.yaml bindings file. cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Fixes: 0d38fd8d2524 ("MAINTAINERS: update references to stm32 audio bindings") Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com> Link: https://lore.kernel.org/r/20210909145449.24388-1-arnaud.pouliquen@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-09	ext4: enforce buffer head state assertion in ext4_da_map_blocks	Eric Whitney
	Remove the code that re-initializes a buffer head with an invalid block number and BH_New and BH_Delay bits when a matching delayed and unwritten block has been found in the extent status cache. Replace it with assertions that verify the buffer head already has this state correctly set. The current code masked an inline data truncation bug that left stale entries in the extent status cache. With this change, generic/130 can be used to reproduce and detect that bug. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210819144927.25163-3-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-09-09	ext4: remove extent cache entries when truncating inline data	Eric Whitney
	Conditionally remove all cached extents belonging to an inode when truncating its inline data. It's only necessary to attempt to remove cached extents when a conversion from inline to extent storage has been initiated (!EXT4_STATE_MAY_INLINE_DATA). This avoids unnecessary es lock overhead in the more common inline case. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210819144927.25163-2-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-09-09	Merge branch 'delalloc-buffer-write' into dev	Theodore Ts'o
	Fix a bug in how we update i_disksize, and the error path in inline_data_end. Finally, drop an unnecessary creation of a journal handle which was only needed for inline data, which can give us a large performance gain in delayed allocation writes. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-09-09	thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used	Matthias Kaehlcke
	adc_tm5_register_tzd() registers the thermal zone sensors for all channels of the thermal monitor. If the registration of one channel fails the function skips the processing of the remaining channels and returns an error, which results in _probe() being aborted. One of the reasons the registration could fail is that none of the thermal zones is using the channel/sensor, which hardly is a critical error (if it is an error at all). If this case is detected emit a warning and continue with processing the remaining channels. Fixes: ca66dca5eda6 ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor") Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reported-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210823134726.1.I1dd23ddf77e5b3568625d80d6827653af071ce19@changeid
2021-09-09	thermal/drivers/intel: Allow processing of HWP interrupt	Srinivas Pandruvada
	Add a weak function to process HWP (Hardware P-states) notifications and move updating HWP_STATUS MSR to this function. This allows HWP interrupts to be processed by the intel_pstate driver in HWP mode by overriding the implementation. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210820024006.2347720-1-srinivas.pandruvada@linux.intel.com
2021-09-09	spi: tegra20-slink: Declare runtime suspend and resume functions conditionally	Guenter Roeck
	The following build error is seen with CONFIG_PM=n. drivers/spi/spi-tegra20-slink.c:1188:12: error: 'tegra_slink_runtime_suspend' defined but not used drivers/spi/spi-tegra20-slink.c:1200:12: error: 'tegra_slink_runtime_resume' defined but not used Declare the functions only if PM is enabled. While at it, remove the unnecessary forward declarations. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20210907045358.2138282-1-linux@roeck-us.net Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-09	ASoC: mediatek: add required config dependency	Trevor Wu
	Because SND_SOC_MT8195 depends on COMPILE_TEST, it's possible to build MT8195 driver in many different config combinations. Add all dependent config for SND_SOC_MT8195 in case some errors happen when COMPILE_TEST is enabled. Signed-off-by: Trevor Wu <trevor.wu@mediatek.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210909065533.2114-1-trevor.wu@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-09	ASoC: Intel: sof_sdw: tag SoundWire BEs as non-atomic	Pierre-Louis Bossart
	The SoundWire BEs make use of 'stream' functions for .prepare and .trigger. These functions will in turn force a Bank Switch, which implies a wait operation. Mark SoundWire BEs as nonatomic for consistency, but keep all other types of BEs as is. The initialization of .nonatomic is done outside of the create_sdw_dailink helper to avoid adding more parameters to deal with a single exception to the rule that BEs are atomic. Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Bard Liao <bard.liao@intel.com> Link: https://lore.kernel.org/r/20210907184436.33152-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-09-09	io-wq: fix memory leak in create_io_worker()	Qiang.zhang
	BUG: memory leak unreferenced object 0xffff888126fcd6c0 (size 192): comm "syz-executor.1", pid 11934, jiffies 4294983026 (age 15.690s) backtrace: [<ffffffff81632c91>] kmalloc_node include/linux/slab.h:609 [inline] [<ffffffff81632c91>] kzalloc_node include/linux/slab.h:732 [inline] [<ffffffff81632c91>] create_io_worker+0x41/0x1e0 fs/io-wq.c:739 [<ffffffff8163311e>] io_wqe_create_worker fs/io-wq.c:267 [inline] [<ffffffff8163311e>] io_wqe_enqueue+0x1fe/0x330 fs/io-wq.c:866 [<ffffffff81620b64>] io_queue_async_work+0xc4/0x200 fs/io_uring.c:1473 [<ffffffff8162c59c>] __io_queue_sqe+0x34c/0x510 fs/io_uring.c:6933 [<ffffffff8162c7ab>] io_req_task_submit+0x4b/0xa0 fs/io_uring.c:2233 [<ffffffff8162cb48>] io_async_task_func+0x108/0x1c0 fs/io_uring.c:5462 [<ffffffff816259e3>] tctx_task_work+0x1b3/0x3a0 fs/io_uring.c:2158 [<ffffffff81269b43>] task_work_run+0x73/0xb0 kernel/task_work.c:164 [<ffffffff812dcdd1>] tracehook_notify_signal include/linux/tracehook.h:212 [inline] [<ffffffff812dcdd1>] handle_signal_work kernel/entry/common.c:146 [inline] [<ffffffff812dcdd1>] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] [<ffffffff812dcdd1>] exit_to_user_mode_prepare+0x151/0x180 kernel/entry/common.c:209 [<ffffffff843ff25d>] __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] [<ffffffff843ff25d>] syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:302 [<ffffffff843fa4a2>] do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86 [<ffffffff84600068>] entry_SYSCALL_64_after_hwframe+0x44/0xae when create_io_thread() return error, and not retry, the worker object need to be freed. Reported-by: syzbot+65454c239241d3d647da@syzkaller.appspotmail.com Signed-off-by: Qiang.zhang <qiang.zhang@windriver.com> Link: https://lore.kernel.org/r/20210909115822.181188-1-qiang.zhang@windriver.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-09	iommu: Clarify default domain Kconfig	Robin Murphy
	Although strictly it is the AMD and Intel drivers which have an existing expectation of lazy behaviour by default, it ends up being rather unintuitive to describe this literally in Kconfig. Express it instead as an architecture dependency, to clarify that it is a valid config-time decision. The end result is the same since virtio-iommu doesn't support lazy mode and thus falls back to strict at runtime regardless. The per-architecture disparity is a matter of historical expectations: the AMD and Intel drivers have been lazy by default since 2008, and changing that gets noticed by people asking where their I/O throughput has gone. Conversely, Arm-based systems with their wider assortment of IOMMU drivers mostly only support strict mode anyway; only the Arm SMMU drivers have later grown support for passthrough and lazy mode, for users who wanted to explicitly trade off isolation for performance. These days, reducing the default level of isolation in a way which may go unnoticed by users who expect otherwise hardly seems worth risking for the sake of one line of Kconfig, so here's where we are. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/69a0c6f17b000b54b8333ee42b3124c1d5a869e2.1631105737.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09	iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()	Fenghua Yu
	pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); * DEADLOCK * 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09	iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()	Fenghua Yu
	The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09	iommu/amd: Remove iommu_init_ga()	Suravee Suthikulpanit
	Since the function has been simplified and only call iommu_init_ga_log(), remove the function and replace with iommu_init_ga_log() instead. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20210820202957.187572-4-suravee.suthikulpanit@amd.com Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09	iommu/amd: Relocate GAMSup check to early_enable_iommus	Wei Huang
	Currently, iommu_init_ga() checks and disables IOMMU VAPIC support (i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set. However it forgets to clear IRQ_POSTING_CAP from the previously set amd_iommu_irq_ops.capability. This triggers an invalid page fault bug during guest VM warm reboot if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is incorrectly set, and crash the system with the following kernel trace. BUG: unable to handle page fault for address: 0000000000400dd8 RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc Call Trace: svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd] ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm] kvm_request_apicv_update+0x10c/0x150 [kvm] svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd] svm_enable_irq_window+0x26/0xa0 [kvm_amd] vcpu_enter_guest+0xbbe/0x1560 [kvm] ? avic_vcpu_load+0xd5/0x120 [kvm_amd] ? kvm_arch_vcpu_load+0x76/0x240 [kvm] ? svm_get_segment_base+0xa/0x10 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm] kvm_vcpu_ioctl+0x22a/0x5d0 [kvm] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes by moving the initializing of AMD IOMMU interrupt remapping mode (amd_iommu_guest_ir) earlier before setting up the amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag. [joro: Squashed the two patches and limited check_features_on_all_iommus() to CONFIG_IRQ_REMAP to fix a compile warning.] Signed-off-by: Wei Huang <wei.huang2@amd.com> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09	parisc: Mark sched_clock unstable only if clocks are not syncronized	Helge Deller
	We check at runtime if the cr16 clocks are stable across CPUs. Only mark the sched_clock unstable by calling clear_sched_clock_stable() if we know that we run on a system which isn't syncronized across CPUs. Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Move pci_dev_is_behind_card_dino to where it is used	Guenter Roeck
	parisc build test images fail to compile with the following error. drivers/parisc/dino.c:160:12: error: 'pci_dev_is_behind_card_dino' defined but not used Move the function just ahead of its only caller to avoid the error. Fixes: 5fa1659105fa ("parisc: Disable HP HSC-PCI Cards to prevent kernel crash") Cc: Helge Deller <deller@gmx.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Reduce sigreturn trampoline to 3 instructions	Helge Deller
	We can move the INSN_LDI_R20 instruction into the branch delay slot. Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Check user signal stack trampoline is inside TASK_SIZE	Helge Deller
	Add some additional checks to ensure the signal stack is inside userspace bounds. Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Drop useless debug info and comments from signal.c	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Drop strnlen_user() in favour of generic version	Helge Deller
	As suggested by Arnd Bergmann, drop the parisc version of strnlen_user() and switch to the generic version. Suggested-by: Arnd Bergmann <arnd@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	parisc: Add missing FORCE prerequisite in Makefile	Helge Deller
	Signed-off-by: Helge Deller <deller@gmx.de>
2021-09-09	net: ni65: Avoid typecast of pointer to u32	Guenter Roeck
	Building alpha:allmodconfig results in the following error. drivers/net/ethernet/amd/ni65.c: In function 'ni65_stop_start': drivers/net/ethernet/amd/ni65.c:751:37: error: cast from pointer to integer of different size buffer[i] = (u32) isa_bus_to_virt(tmdp->u.buffer); 'buffer[]' is declared as unsigned long, so replace the typecast to u32 with a typecast to unsigned long to fix the problem. Cc: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	Merge branch 'sfx-xdp-fallback-tx-queues'	David S. Miller
	Íñigo Huguet says: ==================== sfc: fallback for lack of xdp tx queues If there are not enough hardware resources to allocate one tx queue per CPU for XDP, XDP_TX and XDP_REDIRECT actions were unavailable, and using them resulted each time with the packet being drop and this message in the logs: XDP TX failed (-22) These patches implement 2 fallback solutions for 2 different situations that might happen: 1. There are not enough free resources for all the tx queues, but there are some free resources available 2. There are not enough free resources at all for tx queues. Both solutions are based in sharing tx queues, using __netif_tx_lock for synchronization. In the second case, as there are not XDP TX queues to share, network stack queues are used instead, but since we're taking __netif_tx_lock, concurrent access to the queues is correctly protected. The solution for this second case might affect performance both of XDP traffic and normal traffice due to lock contention if both are used intensively. That's why I call it a "last resort" fallback: it's not a desirable situation, but at least we have XDP TX working. Some tests has shown good results and indicate that the non-fallback case is not being damaged by this changes. They are also promising for the fallback cases. This is the test: 1. From another machine, send high amount of packets with pktgen, script samples/pktgen/pktgen_sample04_many_flows.sh 2. In the tested machine, run samples/bpf/xdp_rxq_info with arguments "-a XDP_TX --swapmac" and see the results 3. In the tested machine, run also pktgen_sample04 to create high TX normal traffic, and see how xdp_rxq_info results vary Note that this test doesn't check the worst situations for the fallback solutions because XDP_TX will only be executed from the same CPUs that are processed by sfc, and not from every CPU in the system, so the performance drop due to the highest locking contention doesn't happen. I'd like to test that, as well, but I don't have access right now to a proper environment. Test results: Without doing TX: Before changes: ~2,900,000 pps After changes, 1 queues/core: ~2,900,000 pps After changes, 2 queues/core: ~2,900,000 pps After changes, 8 queues/core: ~2,900,000 pps After changes, borrowing from network stack: ~2,900,000 pps With multiflow TX at the same time: Before changes: ~1,700,000 - 2,900,000 pps After changes, 1 queues/core: ~1,700,000 - 2,900,000 pps After changes, 2 queues/core: ~1,700,000 pps After changes, 8 queues/core: ~1,700,000 pps After changes, borrowing from network stack: 1,150,000 pps Sporadic "XDP TX failed (-5)" warnings are shown when running xdp program and pktgen simultaneously. This was expected because XDP doesn't have any buffering system if the NIC is under very high pressure. Thousands of these warnings are shown in the case of borrowing net stack queues. As I said before, this was also expected. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	sfc: last resort fallback for lack of xdp tx queues	Íñigo Huguet
	Previous patch addressed the situation of having some free resources for xdp tx but not enough for one tx queue per CPU. This patch address the worst case of not having resources at all for xdp tx. Instead of using queues dedicated to xdp, normal queues used by network stack are shared for both cases, using __netif_tx_lock for synchronization. Also queue stop/restart must be considered in the xdp path to avoid freezing the queue. This is not the ideal situation we might want to be, and a performance penalty is expected both for normal and xdp traffic, but at least XDP will work in all possible situations (with a warning in the logs), improving a bit the pain of not knowing in what situations we can use it and in what situations we cannot. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	sfc: fallback for lack of xdp tx queues	Íñigo Huguet
	If there are not enough resources to allocate one TX queue per core for XDP TX it was completely disabled. This patch implements a fallback solution for sharing the available queues using __netif_tx_lock for synchronization. In the normal case that there is one TX queue per CPU, no locking is done, as it was before. With this fallback solution, XDP TX will work in much more cases that were failing, specially in machines with many CPUs. It's hard for XDP users to know what features are supported across different NICs and configurations, so they will benefit on having wider support. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP	Joakim Zhang
	Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid build warning with !CONFIG_PM_SLEEP: >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function] 796 \| static int stmmac_pltfr_noirq_resume(struct device dev) \| ^~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function] 775 \| static int stmmac_pltfr_noirq_suspend(struct device dev) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Fixes: 276aae377206 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	net: xfrm: fix shift-out-of-bounds in xfrm_get_default	Pavel Skripkin
	Syzbot hit shift-out-of-bounds in xfrm_get_default. The problem was in missing validation check for user data. up->dirmask comes from user-space, so we need to check if this value is less than XFRM_USERPOLICY_DIRMASK_MAX to avoid shift-out-of-bounds bugs. Fixes: 2d151d39073a ("xfrm: Add possibility to set the default to block if we have no policy") Reported-and-tested-by: syzbot+b2be9dd8ca6f6c73ee2d@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-09-09	net/l2tp: Fix reference count leak in l2tp_udp_recv_core	Xiyu Yang
	The reference count leak issue may take place in an error handling path. If both conditions of tunnel->version == L2TP_HDR_VER_3 and the return value of l2tp_v3_ensure_opt_in_linear is nonzero, the function would directly jump to label invalid, without decrementing the reference count of the l2tp_session object session increased earlier by l2tp_tunnel_get_session(). This may result in refcount leaks. Fix this issue by decrease the reference count before jumping to the label invalid. Fixes: 4522a70db7aa ("l2tp: fix reading optional fields of L2TPv3") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	net/af_unix: fix a data-race in unix_dgram_poll	Eric Dumazet
	syzbot reported another data-race in af_unix [1] Lets change __skb_insert() to use WRITE_ONCE() when changing skb head qlen. Also, change unix_dgram_poll() to use lockless version of unix_recvq_full() It is verry possible we can switch all/most unix_recvq_full() to the lockless version, this will be done in a future kernel version. [1] HEAD commit: 8596e589b787732c8346f0482919e83cc9362db1 BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0: __skb_insert include/linux/skbuff.h:1938 [inline] __skb_queue_before include/linux/skbuff.h:2043 [inline] __skb_queue_tail include/linux/skbuff.h:2076 [inline] skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264 unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg net/socket.c:723 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392 ___sys_sendmsg net/socket.c:2446 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532 __do_sys_sendmmsg net/socket.c:2561 [inline] __se_sys_sendmmsg net/socket.c:2558 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1: skb_queue_len include/linux/skbuff.h:1869 [inline] unix_recvq_full net/unix/af_unix.c:194 [inline] unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777 sock_poll+0x23e/0x260 net/socket.c:1288 vfs_poll include/linux/poll.h:90 [inline] ep_item_poll fs/eventpoll.c:846 [inline] ep_send_events fs/eventpoll.c:1683 [inline] ep_poll fs/eventpoll.c:1798 [inline] do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226 __do_sys_epoll_wait fs/eventpoll.c:2238 [inline] __se_sys_epoll_wait fs/eventpoll.c:2233 [inline] __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000001b -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G W 5.14.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 86b18aaa2b5b ("skbuff: fix a data race in skb_queue_len()") Cc: Qian Cai <cai@lca.pw> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	net: macb: fix use after free on rmmod	Tong Zhang
	plat_dev->dev->platform_data is released by platform_device_unregister(), use of pclk and hclk is a use-after-free. Since device unregister won't need a clk device we adjust the function call sequence to fix this issue. [ 31.261225] BUG: KASAN: use-after-free in macb_remove+0x77/0xc6 [macb_pci] [ 31.275563] Freed by task 306: [ 30.276782] platform_device_release+0x25/0x80 Suggested-by: Nicolas Ferre <Nicolas.Ferre@microchip.com> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	ibmvnic: check failover_pending in login response	Sukadev Bhattiprolu
	If a failover occurs before a login response is received, the login response buffer maybe undefined. Check that there was no failover before accessing the login response buffer. Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	vhost_net: fix OoB on sendmsg() failure.	Paolo Abeni
	If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp' and 'done_idx' indexes are left unchanged. If such failure happens when batched_xdp == VHOST_NET_BATCH, the next call to vhost_net_build_xdp() will access and write memory outside the xdp buffers area. Since sendmsg() can only error with EBADFD, this change addresses the issue explicitly freeing the XDP buffers batch on error. Fixes: 0a0be13b8fe2 ("vhost_net: batch submitting XDP buffers to underlayer sockets") Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09	sched: Prevent balance_push() on remote runqueues	Thomas Gleixner
	sched_setscheduler() and rt_mutex_setprio() invoke the run-queue balance callback after changing priorities or the scheduling class of a task. The run-queue for which the callback is invoked can be local or remote. That's not a problem for the regular rq::push_work which is serialized with a busy flag in the run-queue struct, but for the balance_push() work which is only valid to be invoked on the outgoing CPU that's wrong. It not only triggers the debug warning, but also leaves the per CPU variable push_work unprotected, which can result in double enqueues on the stop machine list. Remove the warning and validate that the function is invoked on the outgoing CPU. Fixes: ae7927023243 ("sched: Optimize finish_lock_switch()") Reported-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87zgt1hdw7.ffs@tglx
2021-09-09	sched/idle: Make the idle timer expire in hard interrupt context	Sebastian Andrzej Siewior
	The intel powerclamp driver will setup a per-CPU worker with RT priority. The worker will then invoke play_idle() in which it remains in the idle poll loop until it is stopped by the timer it started earlier. That timer needs to expire in hard interrupt context on PREEMPT_RT. Otherwise the timer will expire in ksoftirqd as a SOFT timer but that task won't be scheduled on the CPU because its priority is lower than the priority of the worker which is in the idle loop. Always expire the idle timer in hard interrupt context. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210906113034.jgfxrjdvxnjqgtmc@linutronix.de
2021-09-09	locking/rtmutex: Fix ww_mutex deadlock check	Peter Zijlstra
	Dan reported that rt_mutex_adjust_prio_chain() can be called with .orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters") unconditionally dereferences it. Since both call-sites that have .orig_waiter == NULL don't care for the return value, simply disable the deadlock squash by adding the NULL check. Notably, both callers use the deadlock condition as a termination condition for the iteration; once detected, it is sure that (de)boosting is done. Arguably step [3] would be a more natural termination point, but it's dubious whether adding a third deadlock detection state would improve the code. Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/YS9La56fHMiCCo75@hirez.programming.kicks-ass.net
2021-09-09	rtc: rx8010: select REGMAP_I2C	Yu-Tung Chang
	The rtc-rx8010 uses the I2C regmap but doesn't select it in Kconfig so depending on the configuration the build may fail. Fix it. Signed-off-by: Yu-Tung Chang <mtwget@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210830052532.40356-1-mtwget@gmail.com
2021-09-08	Input: analog - always use ktime functions	Guenter Roeck
	m68k, mips, s390, and sparc allmodconfig images fail to build with the following error. drivers/input/joystick/analog.c:160:2: error: #warning Precise timer not defined for this architecture. Remove architecture specific time handling code and always use ktime functions to determine time deltas. Also remove the now useless use_ktime kernel parameter. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20210907123734.21520-1-linux@roeck-us.net Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>