summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-07-26iommu/io-pgtable: Introduce map_pages() as a page table opIsaac J. Manjarres
Mapping memory into io-pgtables follows the same semantics that unmapping memory used to follow (i.e. a buffer will be mapped one page block per call to the io-pgtable code). This means that it can be optimized in the same way that unmapping memory was, so add a map_pages() callback to the io-pgtable ops structure, so that a range of pages of the same size can be mapped within the same call. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-4-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26iommu: Add an unmap_pages() op for IOMMU driversIsaac J. Manjarres
Add a callback for IOMMU drivers to provide a path for the IOMMU framework to call into an IOMMU driver, which can call into the io-pgtable code, to unmap a virtually contiguous range of pages of the same size. For IOMMU drivers that do not specify an unmap_pages() callback, the existing logic of unmapping memory one page block at a time will be used. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-3-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26iommu/io-pgtable: Introduce unmap_pages() as a page table opIsaac J. Manjarres
The io-pgtable code expects to operate on a single block or granule of memory that is supported by the IOMMU hardware when unmapping memory. This means that when a large buffer that consists of multiple such blocks is unmapped, the io-pgtable code will walk the page tables to the correct level to unmap each block, even for blocks that are virtually contiguous and at the same level, which can incur an overhead in performance. Introduce the unmap_pages() page table op to express to the io-pgtable code that it should unmap a number of blocks of the same size, instead of a single block. Doing so allows multiple blocks to be unmapped in one call to the io-pgtable code, reducing the number of page table walks, and indirect calls. Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/1623850736-389584-2-git-send-email-quic_c_gdjako@quicinc.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-07-26printk: Move the printk() kerneldoc comment to its new homeJonathan Corbet
Commit 337015573718 ("printk: Userspace format indexing support") turned printk() into a macro, but left the kerneldoc comment for it with the (now) _printk() function, resulting in this docs-build warning: kernel/printk/printk.c:1: warning: 'printk' not found Move the kerneldoc comment back next to the (now) macro it's meant to describe and have the docs build find it there. Fixes: 337015573718b161 ("printk: Userspace format indexing support") Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87o8aqt7qn.fsf@meer.lwn.net
2021-07-26Merge v5.14-rc3 into usb-nextGreg Kroah-Hartman
We need the fixes in here, and this resolves a merge issue with drivers/usb/dwc3/gadget.c Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-26drm: document drm_property_enum.value for bitfieldsSimon Ser
When a property has the type DRM_MODE_PROP_BITMASK, the value field stores a bitshift, not a bitmask, which can be surprising. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Leandro Ribeiro <leandro.ribeiro@collabora.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/NUZTPTKKZtAlDhxIXFB1qrUqWBYKapkBxCnb1S1bc3g@cp3-web-033.plabs.ch
2021-07-25fscrypt: add fscrypt_symlink_getattr() for computing st_sizeEric Biggers
Add a helper function fscrypt_symlink_getattr() which will be called from the various filesystems' ->getattr() methods to read and decrypt the target of encrypted symlinks in order to report the correct st_size. Detailed explanation: As required by POSIX and as documented in various man pages, st_size for a symlink is supposed to be the length of the symlink target. Unfortunately, st_size has always been wrong for encrypted symlinks because st_size is populated from i_size from disk, which intentionally contains the length of the encrypted symlink target. That's slightly greater than the length of the decrypted symlink target (which is the symlink target that userspace usually sees), and usually won't match the length of the no-key encoded symlink target either. This hadn't been fixed yet because reporting the correct st_size would require reading the symlink target from disk and decrypting or encoding it, which historically has been considered too heavyweight to do in ->getattr(). Also historically, the wrong st_size had only broken a test (LTP lstat03) and there were no known complaints from real users. (This is probably because the st_size of symlinks isn't used too often, and when it is, typically it's for a hint for what buffer size to pass to readlink() -- which a slightly-too-large size still works for.) However, a couple things have changed now. First, there have recently been complaints about the current behavior from real users: - Breakage in rpmbuild: https://github.com/rpm-software-management/rpm/issues/1682 https://github.com/google/fscrypt/issues/305 - Breakage in toybox cpio: https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html - Breakage in libgit2: https://issuetracker.google.com/issues/189629152 (on Android public issue tracker, requires login) Second, we now cache decrypted symlink targets in ->i_link. Therefore, taking the performance hit of reading and decrypting the symlink target in ->getattr() wouldn't be as big a deal as it used to be, since usually it will just save having to do the same thing later. Also note that eCryptfs ended up having to read and decrypt symlink targets in ->getattr() as well, to fix this same issue; see commit 3a60a1686f0d ("eCryptfs: Decrypt symlink target for stat size"). So, let's just bite the bullet, and read and decrypt the symlink target in ->getattr() in order to report the correct st_size. Add a function fscrypt_symlink_getattr() which the filesystems will call to do this. (Alternatively, we could store the decrypted size of symlinks on-disk. But there isn't a great place to do so, and encryption is meant to hide the original size to some extent; that property would be lost.) Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-07-26Backmerge tag 'v5.14-rc3' into drm-nextDave Airlie
Linux 5.14-rc3 Daniel said we should pull the nouveau fix from fixes in here, probably a good plan. Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-25sctp: send pmtu probe only if packet loss in Search Complete stateXin Long
This patch is to introduce last_rtx_chunks into sctp_transport to detect if there's any packet retransmission/loss happened by checking against asoc's rtx_data_chunks in sctp_transport_pl_send(). If there is, namely, transport->last_rtx_chunks != asoc->rtx_data_chunks, the pmtu probe will be sent out. Otherwise, increment the pl.raise_count and return when it's in Search Complete state. With this patch, if in Search Complete state, which is a long period, it doesn't need to keep probing the current pmtu unless there's data packet loss. This will save quite some traffic. v1->v2: - add the missing Fixes tag. Fixes: 0dac127c0557 ("sctp: do black hole detection in search complete state") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25sctp: improve the code for pmtu probe send and recv updateXin Long
This patch does 3 things: - make sctp_transport_pl_send() and sctp_transport_pl_recv() return bool type to decide if more probe is needed to send. - pr_debug() only when probe is really needed to send. - count pl.raise_count in sctp_transport_pl_send() instead of sctp_transport_pl_recv(), and it's only incremented for the 1st probe for the same size. These are preparations for the next patch to make probes happen only when there's packet loss in Search Complete state. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25can: flexcan: add platform data headerAngelo Dureghello
Add platform data header for flexcan. Link: https://lore.kernel.org/r/20210702094841.327679-1-angelo@kernel-space.org Signed-off-by: Angelo Dureghello <angelo@kernel-space.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-07-25can: bittiming: fix documentation for struct can_tdcMarc Kleine-Budde
This patch fixes a typo in the documentation for struct can_tdc::tdcv. The number "0" refers to automatic mode not the letter "O". Further two grammar errors in the documentation for struct can_tdc are fixed. First grammar error: add a missing third person 's'. Second grammar error: replace "such as" by "such that". The intent is to give a condition, not an example. Fixes: 289ea9e4ae59 ("can: add new CAN FD bittiming parameters: Transmitter Delay Compensation (TDC)") Link: https://lore.kernel.org/r/20210616095922.2430415-1-mkl@pengutronix.de Link: https://lore.kernel.org/r/20210616124057.60723-1-mailhol.vincent@wanadoo.fr Co-developed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-07-25can: rx-offload: can_rx_offload_threaded_irq_finish(): add new function to ↵Marc Kleine-Budde
be called from threaded interrupt After reading all CAN frames from the controller in the IRQ handler and storing them into a skb_queue, the driver calls napi_schedule(). In the napi poll function the skb from the skb_queue are then pushed into the networking stack. However if napi_schedule() is called from a threaded IRQ handler this triggers the following error: | NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!! To avoid this, create a new rx-offload function (can_rx_offload_threaded_irq_finish()) with a call to local_bh_disable()/local_bh_enable() around the napi_schedule() call. Convert all drivers that call can_rx_offload_irq_finish() from threaded IRQ context to can_rx_offload_threaded_irq_finish(). Link: https://lore.kernel.org/r/20210724204745.736053-4-mkl@pengutronix.de Suggested-by: Daniel Glöckner <dg@emlix.com> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-07-25can: rx-offload: can_rx_offload_irq_finish(): directly call napi_schedule()Marc Kleine-Budde
Instead of calling can_rx_offload_schedule() call napi_schedule() directly. As this was the last use of can_rx_offload_schedule() remove this helper function. Link: https://lore.kernel.org/r/20210724204745.736053-3-mkl@pengutronix.de Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-07-25can: rx-offload: add skb queue for use during ISRMarc Kleine-Budde
Adding a skb to the skb_queue in rx-offload requires to take a lock. This commit avoids this by adding an unlocked skb queue that is appended at the end of the ISR. Having one lock at the end of the ISR should be OK as the HW is empty, not about to overflow. Link: https://lore.kernel.org/r/20210724204745.736053-2-mkl@pengutronix.de Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Co-developed-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-07-25IB/mlx5: Rename is_apu_thread_cq function to is_apu_cqTal Gilboa
is_apu_thread_cq() used to detect CQs which are attached to APU threads. This was extended to support other elements as well, so the function was renamed to is_apu_cq(). c_eqn_or_apu_element was extended from 8 bits to 32 bits, which wan't reflected when the APU support was first introduced. Acked-by: Michael S. Tsirkin <mst@redhat.com> # vdpa Signed-off-by: Tal Gilboa <talgi@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-07-25nfc: constify nfc_digital_opsKrzysztof Kozlowski
Neither the core nor the drivers modify the passed pointer to struct nfc_digital_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify nfc_hci_opsKrzysztof Kozlowski
Neither the core nor the drivers modify the passed pointer to struct nfc_hci_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify nfc_opsKrzysztof Kozlowski
Neither the core nor the drivers modify the passed pointer to struct nfc_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify pointer to nfc_vendor_cmdKrzysztof Kozlowski
Neither the core nor the drivers modify the passed pointer to struct nfc_vendor_cmd, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify nci_driver_ops (prop_ops and core_ops)Krzysztof Kozlowski
Neither the core nor the drivers modify the passed pointer to struct nci_driver_ops (consisting of function pointers), so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify nci_opsKrzysztof Kozlowski
The struct nci_ops is modified by NFC core in only one case: nci_allocate_device() receives too many proprietary commands (prop_ops) to configure. This is a build time known constrain, so a graceful handling of such case is not necessary. Instead, fail the nci_allocate_device() and add BUILD_BUG_ON() to places which set these. This allows to constify the struct nci_ops (consisting of function pointers) for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25nfc: constify payload argument in nci_send_cmd()Krzysztof Kozlowski
The nci_send_cmd() payload argument is passed directly to skb_put_data() which already accepts a pointer to const, so make it const as well for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-24ACPI: fix NULL pointer dereferenceLinus Torvalds
Commit 71f642833284 ("ACPI: utils: Fix reference counting in for_each_acpi_dev_match()") started doing "acpi_dev_put()" on a pointer that was possibly NULL. That fails miserably, because that helper inline function is not set up to handle that case. Just make acpi_dev_put() silently accept a NULL pointer, rather than calling down to put_device() with an invalid offset off that NULL pointer. Link: https://lore.kernel.org/lkml/a607c149-6bf6-0fd0-0e31-100378504da2@kernel.dk/ Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Tested-by: Daniel Scally <djrscally@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-24Merge tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request (Christoph): - tracing fix (Keith Busch) - fix multipath head refcounting (Hannes Reinecke) - Write Zeroes vs PI fix (me) - drop a bogus WARN_ON (Zhihao Cheng) - Increase max blk-cgroup policy size, now that mq-deadline uses it too (Oleksandr) * tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block: nvme: set the PRACT bit when using Write Zeroes with T10 PI nvme: fix nvme_setup_command metadata trace event nvme: fix refcounting imbalance when all paths are down nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING block: increase BLKCG_MAX_POLS
2021-07-23memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regionsMike Rapoport
Commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") didn't take into account that when there is movable_node parameter in the kernel command line, for_each_mem_range() would skip ranges marked with MEMBLOCK_HOTPLUG. The page table setup code in POWER uses for_each_mem_range() to create the linear mapping of the physical memory and since the regions marked as MEMORY_HOTPLUG are skipped, they never make it to the linear map. A later access to the memory in those ranges will fail: BUG: Unable to handle kernel data access on write at 0xc000000400000000 Faulting instruction address: 0xc00000000008a3c0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7 NIP: c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040 REGS: c000000008a57770 TRAP: 0300 Not tainted (5.13.0) MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 84222202 XER: 20040000 CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0 GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000 GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200 GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300 GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000 GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0 GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680 NIP clear_user_page+0x50/0x80 LR __handle_mm_fault+0xc88/0x1910 Call Trace: __handle_mm_fault+0xc44/0x1910 (unreliable) handle_mm_fault+0x130/0x2a0 __get_user_pages+0x248/0x610 __get_user_pages_remote+0x12c/0x3e0 get_arg_page+0x54/0xf0 copy_string_kernel+0x11c/0x210 kernel_execve+0x16c/0x220 call_usermodehelper_exec_async+0x1b0/0x2f0 ret_from_kernel_thread+0x5c/0x70 Instruction dump: 79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050 7d4903a6 60000000 60000000 60000000 <7c001fec> 7c091fec 7c081fec 7c051fec ---[ end trace 490b8c67e6075e09 ]--- Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the traversal fixes this issue. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100 Link: https://lkml.kernel.org/r/20210712071132.20902-1-rppt@kernel.org Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> [5.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23mm: use kmap_local_page in memzero_pageChristoph Hellwig
The commit message introducing the global memzero_page explicitly mentions switching to kmap_local_page in the commit log but doesn't actually do that. Link: https://lkml.kernel.org/r/20210713055231.137602-3-hch@lst.de Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()Christoph Hellwig
memcpy_to_page and memzero_page can write to arbitrary pages, which could be in the page cache or in high memory, so call flush_kernel_dcache_pages to flush the dcache. This is a problem when using these helpers on dcache challeneged architectures. Right now there are just a few users, chances are no one used the PC floppy driver, the aha1542 driver for an ISA SCSI HBA, and a few advanced and optional btrfs and ext4 features on those platforms yet since the conversion. Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core") Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23swiotlb: Convert io_default_tlb_mem to static allocationWill Deacon
Since commit 69031f500865 ("swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used"), 'struct device' may hold a copy of the global 'io_default_tlb_mem' pointer if the device is using swiotlb for DMA. A subsequent call to swiotlb_exit() will therefore leave dangling pointers behind in these device structures, resulting in KASAN splats such as: | BUG: KASAN: use-after-free in __iommu_dma_unmap_swiotlb+0x64/0xb0 | Read of size 8 at addr ffff8881d7830000 by task swapper/0/0 | | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3-debug #1 | Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020 | Call Trace: | <IRQ> | dump_stack+0x9c/0xcf | print_address_description.constprop.0+0x18/0x130 | kasan_report.cold+0x7f/0x111 | __iommu_dma_unmap_swiotlb+0x64/0xb0 | nvme_pci_complete_rq+0x73/0x130 | blk_complete_reqs+0x6f/0x80 | __do_softirq+0xfc/0x3be Convert 'io_default_tlb_mem' to a static structure, so that the per-device pointers remain valid after swiotlb_exit() has been invoked. All users are updated to reference the static structure directly, using the 'nslabs' field to determine whether swiotlb has been initialised. The 'slots' array is still allocated dynamically and referenced via a pointer rather than a flexible array member. Cc: Claire Chang <tientzu@chromium.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Fixes: 69031f500865 ("swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used") Reported-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Claire Chang <tientzu@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
2021-07-23bpf: tcp: Support bpf_(get|set)sockopt in bpf tcp iterMartin KaFai Lau
This patch allows bpf tcp iter to call bpf_(get|set)sockopt. To allow a specific bpf iter (tcp here) to call a set of helpers, get_func_proto function pointer is added to bpf_iter_reg. The bpf iter is a tracing prog which currently requires CAP_PERFMON or CAP_SYS_ADMIN, so this patch does not impose other capability checks for bpf_(get|set)sockopt. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210701200619.1036715-1-kafai@fb.com
2021-07-23tcp: seq_file: Replace listening_hash with lhash2Martin KaFai Lau
This patch moves the tcp seq_file iteration on listeners from the port only listening_hash to the port+addr lhash2. When iterating from the bpf iter, the next patch will need to lock the socket such that the bpf iter can call setsockopt (e.g. to change the TCP_CONGESTION). To avoid locking the bucket and then locking the sock, the bpf iter will first batch some sockets from the same bucket and then unlock the bucket. If the bucket size is small (which usually is), it is easier to batch the whole bucket such that it is less likely to miss a setsockopt on a socket due to changes in the bucket. However, the port only listening_hash could have many listeners hashed to a bucket (e.g. many individual VIP(s):443 and also multiple by the number of SO_REUSEPORT). We have seen bucket size in tens of thousands range. Also, the chance of having changes in some popular port buckets (e.g. 443) is also high. The port+addr lhash2 was introduced to solve this large listener bucket issue. Also, the listening_hash usage has already been replaced with lhash2 in the fast path inet[6]_lookup_listener(). This patch follows the same direction on moving to lhash2 and iterates the lhash2 instead of listening_hash. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210701200606.1035783-1-kafai@fb.com
2021-07-23bpf: tcp: seq_file: Remove bpf_seq_afinfo from tcp_iter_stateMartin KaFai Lau
A following patch will create a separate struct to store extra bpf_iter state and it will embed the existing tcp_iter_state like this: struct bpf_tcp_iter_state { struct tcp_iter_state state; /* More bpf_iter specific states here ... */ } As a prep work, this patch removes the "struct tcp_seq_afinfo *bpf_seq_afinfo" where its purpose is to tell if it is iterating from bpf_iter instead of proc fs. Currently, if "*bpf_seq_afinfo" is not NULL, it is iterating from bpf_iter. The kernel should not filter by the addr family and leave this filtering decision to the bpf prog. Instead of adding a "*bpf_seq_afinfo" pointer, this patch uses the "seq->op == &bpf_iter_tcp_seq_ops" test to tell if it is iterating from the bpf iter. The bpf_iter_(init|fini)_tcp() is left here to prepare for the change of a following patch. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210701200554.1034982-1-kafai@fb.com
2021-07-23drm/gem: Provide drm_gem_fb_{begin,end}_cpu_access() helpersThomas Zimmermann
Implement helpers drm_gem_fb_begin_cpu_access() and _end_cpu_access(), which call the rsp dma-buf functions for all GEM BOs of the given framebuffer. Calls to dma_buf_end_cpu_access() can return an error code on failure, while drm_gem_fb_end_cpu_access() does not. The latter runs during DRM's atomic commit or during cleanup. Both cases don't allow for errors, so leave out the return value. v2: * fix typo in docs (Daniel) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210716140801.1215-2-tzimmermann@suse.de
2021-07-23signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistencyEric W. Biederman
It helps to know which part of the siginfo structure the siginfo_layout value is talking about. v1: https://lkml.kernel.org/r/m18s4zs7nu.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-9-ebiederm@xmission.com Link: https://lkml.kernel.org/r/87zgumw8cc.fsf_-_@disp2133 Acked-by: Marco Elver <elver@google.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-07-23signal: Verify the alignment and size of siginfo_tEric W. Biederman
Update the static assertions about siginfo_t to also describe it's alignment and size. While investigating if it was possible to add a 64bit field into siginfo_t[1] it became apparent that the alignment of siginfo_t is as much a part of the ABI as the size of the structure. If the alignment changes siginfo_t when embedded in another structure can move to a different offset. Which is not acceptable from an ABI structure. So document that fact and add static assertions to notify developers if they change change the alignment by accident. [1] https://lkml.kernel.org/r/YJEZdhe6JGFNYlum@elver.google.com Acked-by: Marco Elver <elver@google.com> v1: https://lkml.kernel.org/r/20210505141101.11519-4-ebiederm@xmission.co Link: https://lkml.kernel.org/r/875yxaxmyl.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-23signal: Remove the generic __ARCH_SI_TRAPNO supportEric W. Biederman
Now that __ARCH_SI_TRAPNO is no longer set by any architecture remove all of the code it enabled from the kernel. On alpha and sparc a more explict approach of using send_sig_fault_trapno or force_sig_fault_trapno in the very limited circumstances where si_trapno was set to a non-zero value. The generic support that is being removed always set si_trapno on all fault signals. With only SIGILL ILL_ILLTRAP on sparc and SIGFPE and SIGTRAP TRAP_UNK on alpla providing si_trapno values asking all senders of fault signals to provide an si_trapno value does not make sense. Making si_trapno an ordinary extension of the fault siginfo layout has enabled the architecture generic implementation of SIGTRAP TRAP_PERF, and enables other faulting signals to grow architecture generic senders as well. v1: https://lkml.kernel.org/r/m18s4zs7nu.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-8-ebiederm@xmission.com Link: https://lkml.kernel.org/r/87bl73xx6x.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-23signal/alpha: si_trapno is only used with SIGFPE and SIGTRAP TRAP_UNKEric W. Biederman
While reviewing the signal handlers on alpha it became clear that si_trapno is only set to a non-zero value when sending SIGFPE and when sending SITGRAP with si_code TRAP_UNK. Add send_sig_fault_trapno and send SIGTRAP TRAP_UNK, and SIGFPE with it. Remove the define of __ARCH_SI_TRAPNO and remove the always zero si_trapno parameter from send_sig_fault and force_sig_fault. v1: https://lkml.kernel.org/r/m1eeers7q7.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-7-ebiederm@xmission.com Link: https://lkml.kernel.org/r/87h7gvxx7l.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-23signal/sparc: si_trapno is only used with SIGILL ILL_ILLTRPEric W. Biederman
While reviewing the signal handlers on sparc it became clear that si_trapno is only set to a non-zero value when sending SIGILL with si_code ILL_ILLTRP. Add force_sig_fault_trapno and send SIGILL ILL_ILLTRP with it. Remove the define of __ARCH_SI_TRAPNO and remove the always zero si_trapno parameter from send_sig_fault and force_sig_fault. v1: https://lkml.kernel.org/r/m1eeers7q7.fsf_-_@fess.ebiederm.org v2: https://lkml.kernel.org/r/20210505141101.11519-7-ebiederm@xmission.com Link: https://lkml.kernel.org/r/87mtqnxx89.fsf_-_@disp2133 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-07-23Merge tag 'acpi-5.14-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a recently broken Kconfig dependency and ACPI device reference counting in an iterator macro. Specifics: - Fix recently broken Kconfig dependency for the ACPI table override via built-in initrd (Robert Richter) - Fix ACPI device reference counting in the for_each_acpi_dev_match() helper macro to avoid use-after-free (Andy Shevchenko)" * tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: utils: Fix reference counting in for_each_acpi_dev_match() ACPI: Kconfig: Fix table override from built-in initrd
2021-07-23Merge tag 'sound-5.14-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, mostly covering device-specific regressions and bugs over ASoC, HD-audio and USB-audio, while the ALSA PCM core received a few additional fixes for the possible (new and old) regressions" * tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits) ALSA: usb-audio: Add registration quirk for JBL Quantum headsets ALSA: hda/hdmi: Add quirk to force pin connectivity on NUC10 ALSA: pcm: Fix mmap without buffer preallocation ALSA: pcm: Fix mmap capability check ALSA: hda: intel-dsp-cfg: add missing ElkhartLake PCI ID ASoC: ti: j721e-evm: Check for not initialized parent_clk_id ASoC: ti: j721e-evm: Fix unbalanced domain activity tracking during startup ALSA: hda/realtek: Fix pop noise and 2 Front Mic issues on a machine ALSA: hdmi: Expose all pins on MSI MS-7C94 board ALSA: sb: Fix potential ABBA deadlock in CSP driver ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspend ASoC: amd: reverse stop sequence for stoneyridge platform ASoC: soc-pcm: add a flag to reverse the stop sequence ASoC: codecs: wcd938x: setup irq during component bind ASoC: dt-bindings: renesas: rsnd: Fix incorrect 'port' regex schema ALSA: usb-audio: Add missing proc text entry for BESPOKEN type ASoC: codecs: wcd938x: make sdw dependency explicit in Kconfig ASoC: SOF: Intel: Update ADL descriptor to use ACPI power states ASoC: rt5631: Fix regcache sync errors on resume ALSA: pcm: Call substream ack() method upon compat mmap commit ...
2021-07-23net: dsa: add support for bridge TX forwarding offloadVladimir Oltean
For a DSA switch, to offload the forwarding process of a bridge device means to send the packets coming from the software bridge as data plane packets. This is contrary to everything that DSA has done so far, because the current taggers only know to send control packets (ones that target a specific destination port), whereas data plane packets are supposed to be forwarded according to the FDB lookup, much like packets ingressing on any regular ingress port. If the FDB lookup process returns multiple destination ports (flooding, multicast), then replication is also handled by the switch hardware - the bridge only sends a single packet and avoids the skb_clone(). DSA keeps for each bridge port a zero-based index (the number of the bridge). Multiple ports performing TX forwarding offload to the same bridge have the same dp->bridge_num value, and ports not offloading the TX data plane of a bridge have dp->bridge_num = -1. The tagger can check if the packet that is being transmitted on has skb->offload_fwd_mark = true or not. If it does, it can be sure that the packet belongs to the data plane of a bridge, further information about which can be obtained based on dp->bridge_dev and dp->bridge_num. It can then compose a DSA tag for injecting a data plane packet into that bridge number. For the switch driver side, we offer two new dsa_switch_ops methods, called .port_bridge_fwd_offload_{add,del}, which are modeled after .port_bridge_{join,leave}. These methods are provided in case the driver needs to configure the hardware to treat packets coming from that bridge software interface as data plane packets. The switchdev <-> bridge interaction happens during the netdev_master_upper_dev_link() call, so to switch drivers, the effect is that the .port_bridge_fwd_offload_add() method is called immediately after .port_bridge_join(). If the bridge number exceeds the number of bridges for which the switch driver can offload the TX data plane (and this includes the case where the driver can offload none), DSA falls back to simply returning tx_fwd_offload = false in the switchdev_bridge_port_offload() call. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23net: dsa: track the number of switches in a treeVladimir Oltean
In preparation of supporting data plane forwarding on behalf of a software bridge, some drivers might need to view bridges as virtual switches behind the CPU port in a cross-chip topology. Give them some help and let them know how many physical switches there are in the tree, so that they can count the virtual switches starting from that number on. Note that the first dsa_switch_ops method where this information is reliably available is .setup(). This is because of how DSA works: in a tree with 3 switches, each calling dsa_register_switch(), the first 2 will advance until dsa_tree_setup() -> dsa_tree_setup_routing_table() and exit with error code 0 because the topology is not complete. Since probing is parallel at this point, one switch does not know about the existence of the other. Then the third switch comes, and for it, dsa_tree_setup_routing_table() returns complete = true. This switch goes ahead and calls dsa_tree_setup_switches() for everybody else, calling their .setup() methods too. This acts as the synchronization point. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23net: bridge: switchdev: allow the TX data plane forwarding to be offloadedTobias Waldekranz
Allow switchdevs to forward frames from the CPU in accordance with the bridge configuration in the same way as is done between bridge ports. This means that the bridge will only send a single skb towards one of the ports under the switchdev's control, and expects the driver to deliver the packet to all eligible ports in its domain. Primarily this improves the performance of multicast flows with multiple subscribers, as it allows the hardware to perform the frame replication. The basic flow between the driver and the bridge is as follows: - When joining a bridge port, the switchdev driver calls switchdev_bridge_port_offload() with tx_fwd_offload = true. - The bridge sends offloadable skbs to one of the ports under the switchdev's control using skb->offload_fwd_mark = true. - The switchdev driver checks the skb->offload_fwd_mark field and lets its FDB lookup select the destination port mask for this packet. v1->v2: - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long - introduce a static key "br_switchdev_fwd_offload_used" to minimize the impact of the newly introduced feature on all the setups which don't have hardware that can make use of it - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache line access - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in __br_forward() - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge is being used - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP v2->v3: - replace the solution based on .ndo_dfwd_add_station with a solution based on switchdev_bridge_port_offload - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD v3->v4: rebase v4->v5: - make sure the static key is decremented on bridge port unoffload - more function and variable renaming and comments for them: br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload fwd_accel to tx_fwd_offload Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Conflicts are simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23drm/fourcc: Add modifier definitions for Arm Fixed Rate CompressionNormunds Rieksts
Arm Fixed Rate Compression (AFRC) is a proprietary fixed rate image compression protocol and format. It is designed to provide guaranteed bandwidth and memory footprint reductions in graphics and media use-cases. This patch aims to add modifier definitions for describing AFRC. Signed-off-by: Normunds Rieksts <normunds.rieksts@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210701170709.39922-1-normunds.rieksts@arm.com
2021-07-23drm/amdgpu: add cyan_skillfish asic typeTao Zhou
Add cyan_skillfish asic family. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-23ima: Add digest and digest_len params to the functions to measure a bufferRoberto Sassu
This patch performs the final modification necessary to pass the buffer measurement to callers, so that they provide a functionality similar to ima_file_hash(). It adds the 'digest' and 'digest_len' parameters to ima_measure_critical_data() and process_buffer_measurement(). These functions calculate the digest even if there is no suitable rule in the IMA policy and, in this case, they simply return 1 before generating a new measurement entry. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-07-23ima: Return int in the functions to measure a bufferRoberto Sassu
ima_measure_critical_data() and process_buffer_measurement() currently don't return a result as, unlike appraisal-related functions, the result is not used by callers to deny an operation. Measurement-related functions instead rely on the audit subsystem to notify the system administrator when an error occurs. However, ima_measure_critical_data() and process_buffer_measurement() are a special case, as these are the only functions that can return a buffer measurement (for files, there is ima_file_hash()). In a subsequent patch, they will be modified to return the calculated digest. In preparation to return the result of the digest calculation, this patch modifies the return type from void to int, and returns 0 if the buffer has been successfully measured, a negative value otherwise. Given that the result of the measurement is still not necessary, this patch does not modify the behavior of existing callers by processing the returned value. For those, the return value is ignored. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Acked-by: Paul Moore <paul@paul-moore.com> (for the SELinux bits) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-07-23ima: Introduce ima_get_current_hash_algo()Roberto Sassu
Buffer measurements, unlike file measurements, are not accessible after the measurement is done, as buffers are not suitable for use with the integrity_iint_cache structure (there is no index, for files it is the inode number). In the subsequent patches, the measurement (digest) will be returned directly by the functions that perform the buffer measurement, ima_measure_critical_data() and process_buffer_measurement(). A caller of those functions also needs to know the algorithm used to calculate the digest. Instead of adding the algorithm as a new parameter to the functions, this patch provides it separately with the new function ima_get_current_hash_algo(). Since the hash algorithm does not change after the IMA setup phase, there is no risk of races (obtaining a digest calculated with a different algorithm than the one returned). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> [zohar@linux.ibm.com: annotate ima_hash_algo as __ro_after_init] Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-07-23net: socket: rework compat_ifreq_ioctl()Arnd Bergmann
compat_ifreq_ioctl() is one of the last users of copy_in_user() and compat_alloc_user_space(), as it attempts to convert the 'struct ifreq' arguments from 32-bit to 64-bit format as used by dev_ioctl() and a couple of socket family specific interpretations. The current implementation works correctly when calling dev_ioctl(), inet_ioctl(), ieee802154_sock_ioctl(), atalk_ioctl(), qrtr_ioctl() and packet_ioctl(). The ioctl handlers for x25, netrom, rose and x25 do not interpret the arguments and only block the corresponding commands, so they do not care. For af_inet6 and af_decnet however, the compat conversion is slightly incorrect, as it will copy more data than the native handler accesses, both of them use a structure that is shorter than ifreq. Replace the copy_in_user() conversion with a pair of accessor functions to read and write the ifreq data in place with the correct length where needed, while leaving the other ones to copy the (already compatible) structures directly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>