summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2022-11-16Merge tag 'drm-misc-next-2022-11-10-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.2: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic-helper: Add begin_fb_access and end_fb_access hooks - fb-helper: Rework to move fb emulation into helpers - scheduler: rework entity flush, kill and fini - ttm: Optimize pool allocations Driver Changes: - amdgpu: scheduler rework - hdlcd: Switch to DRM-managed resources - ingenic: Fix registration error path - lcdif: FIFO threshold tuning - meson: Fix return type of cvbs' mode_valid - ofdrm: multiple fixes (kconfig, types, endianness) - sun4i: A100 and D1 support - panel: - New Panel: Jadard JD9365DA-H3 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20221110083612.g63eaocoaa554soh@houat
2022-11-15drm/amdgpu: revert "implement tdr advanced mode"Christian König
This reverts commit e6c6338f393b74ac0b303d567bb918b44ae7ad75. This feature basically re-submits one job after another to figure out which one was the one causing a hang. This is obviously incompatible with gang-submit which requires that multiple jobs run at the same time. It's also absolutely not helpful to crash the hardware multiple times if a clean recovery is desired. For testing and debugging environments we should rather disable recovery alltogether to be able to inspect the state with a hw debugger. Additional to that the sw implementation is clearly buggy and causes reference count issues for the hardware fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15Merge tag 'br-v6.2d' of git://linuxtv.org/hverkuil/media_tree into media_stageMauro Carvalho Chehab
Tag branch * tag 'br-v6.2d' of git://linuxtv.org/hverkuil/media_tree: (35 commits) media: saa7164: remove variable cnt atomisp: fix potential NULL pointer dereferences radio-terratec: Remove variable p media: platform: s5p-mfc: Fix spelling mistake "mmaping" -> "mmapping" media: platform: mtk-mdp3: remove unused VIDEO_MEDIATEK_VPU config media: vivid: remove redundant assignment to variable checksum media: cedrus: h264: Optimize mv col buffer allocation media: cedrus: h265: Associate mv col buffers with buffer media: mediatek: vcodec: fix h264 cavlc bitstream fail media: cedrus: hevc: Fix offset adjustments media: imx-jpeg: Fix Coverity issue in probe media: v4l2-ioctl.c: Unify YCbCr/YUV terms in format descriptions media: atomisp: Fix spelling mistake "mis-match" -> "mismatch" media: c8sectpfe: Add missed header(s) media: adv748x: afe: Select input port when initializing AFE media: vimc: Update device configuration in the documentation media: adv748x: Remove dead function declaration media: mxl5005s: Make array RegAddr static const media: atomisp: Fix spelling mistake "modee" -> "mode" media: meson/vdec: always init coef_node_start ...
2022-11-15Merge tag 'br-v6.2e' of git://linuxtv.org/hverkuil/media_tree into media_stageMauro Carvalho Chehab
Tag branch * tag 'br-v6.2e' of git://linuxtv.org/hverkuil/media_tree: (29 commits) media: davinci/vpbe: Fix a typo ("defualt_mode") media: sun6i-csi: Remove unnecessary print function dev_err() media: Documentation: Drop deprecated bytesused == 0 media: platform: exynos4-is: fix return value check in fimc_md_probe() media: dvb-core: remove variable n, turn for-loop to while-loop media: vivid: fix compose size exceed boundary media: rkisp1: make const arrays ae_wnd_num and hist_wnd_num static media: dvb-core: Fix UAF due to refcount races at releasing media: rkvdec: Add required padding media: aspeed: Extend debug message media: aspeed: Support aspeed mode to reduce compressed data media: Documentation: aspeed-video: Add user documentation for the aspeed-video driver media: v4l2-ctrls: Reserve controls for ASPEED media: v4l: Add definition for the Aspeed JPEG format staging: media: tegra-video: fix device_node use after free staging: media: tegra-video: fix chan->mipi value on error media: cedrus: initialize controls a bit later media: cedrus: prefer untiled capture format media: cedrus: Remove cedrus_codec enum media: cedrus: set codec ops immediately ...
2022-11-15nvme: implement the DEAC bit for the Write Zeroes commandChristoph Hellwig
While the specification allows devices to either deallocate data or to actually write zeroes on any Write Zeroes command, many SSDs only do the sensible thing and deallocate data when the DEAC bit is specific. Set it when it is supported and the caller doesn't explicitly opt out of deallocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-15nvme: fine-granular CAP_SYS_ADMIN for nvme io commandsKanchan Joshi
Currently both io and admin commands are kept under a coarse-granular CAP_SYS_ADMIN check, disregarding file mode completely. $ ls -l /dev/ng* crw-rw-rw- 1 root root 242, 0 Sep 9 19:20 /dev/ng0n1 crw------- 1 root root 242, 1 Sep 9 19:20 /dev/ng0n2 In the example above, ng0n1 appears as if it may allow unprivileged read/write operation but it does not and behaves same as ng0n2. This patch implements a shift from CAP_SYS_ADMIN to more fine-granular control for io-commands. If CAP_SYS_ADMIN is present, nothing else is checked as before. Otherwise, following rules are in place - any admin-cmd is not allowed - vendor-specific and fabric commmand are not allowed - io-commands that can write are allowed if matching FMODE_WRITE permission is present - io-commands that read are allowed Add a helper nvme_cmd_allowed that implements above policy. Change all the callers of CAP_SYS_ADMIN to go through nvme_cmd_allowed for any decision making. Since file open mode is counted for any approval/denial, change at various places to keep file-mode information handy. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-11-15drm/connector: Add pixel clock to cmdline modeMaxime Ripard
We'll need to get the pixel clock to generate proper display modes for all the current named modes. Let's add it to struct drm_cmdline_mode and fill it when parsing the named mode. Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Tested-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com> Link: https://lore.kernel.org/r/20220728-rpi-analog-tv-properties-v9-12-24b168e5bcd5@cerno.tech Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-11-14Merge tag 'vfio-v6.1-rc6' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fixes from Alex Williamson: - Fixes for potential container registration leak for drivers not implementing a close callback, duplicate container de-registrations, and a regression in support for bus reset on last device close from a device set (Anthony DeRossi) * tag 'vfio-v6.1-rc6' of https://github.com/awilliam/linux-vfio: vfio/pci: Check the device set open count on reset vfio: Export the device set open count vfio: Fix container device registration life cycle
2022-11-14i2c: core: Introduce i2c_client_get_device_id helper functionAngel Iglesias
Introduces new helper function to aid in .probe_new() refactors. In order to use existing i2c_get_device_id() on the probe callback, the device match table needs to be accessible in that function, which would require bigger refactors in some drivers using the deprecated .probe callback. This issue was discussed in more detail in the IIO mailing list. Link: https://lore.kernel.org/all/20221023132302.911644-11-u.kleine-koenig@pengutronix.de/ Suggested-by: Nuno Sá <noname.nuno@gmail.com> Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Suggested-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Angel Iglesias <ang.iglesiasg@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-11-14ACPI: Implement a generic FFH Opregion handlerSudeep Holla
This registers the FFH OpRegion handler before ACPI tables are loaded. The platform support for the same is checked via Platform-Wide OSPM Capabilities(OSC) before registering the OpRegion handler. It relies on the special context data passed to offset and the length. However the interpretation of the values is platform/architecture specific. This generic handler just passed all the information to the platform/architecture specific callback. It also implements the default callbacks which return as not supported. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14memregion: Add cpu_cache_invalidate_memregion() interfaceDavidlohr Bueso
With CXL security features, and CXL dynamic provisioning, global CPU cache flushing nvdimm requirements are no longer specific to that subsystem, even beyond the scope of security_ops. CXL will need such semantics for features not necessarily limited to persistent memory. The functionality this is enabling is to be able to instantaneously secure erase potentially terabytes of memory at once and the kernel needs to be sure that none of the data from before the erase is still present in the cache. It is also used when unlocking a memory device where speculative reads and firmware accesses could have cached poison from before the device was unlocked. Lastly this facility is used when mapping new devices, or new capacity into an established physical address range. I.e. when the driver switches DeviceA mapping AddressX to DeviceB mapping AddressX then any cached data from DeviceA:AddressX needs to be invalidated. This capability is typically only used once per-boot (for unlock), or once per bare metal provisioning event (secure erase), like when handing off the system to another tenant or decommissioning a device. It may also be used for dynamic CXL region provisioning. Users must first call cpu_cache_has_invalidate_memregion() to know whether this functionality is available on the architecture. On x86 this respects the constraints of when wbinvd() is tolerable. It is already the case that wbinvd() is problematic to allow in VMs due its global performance impact and KVM, for example, has been known to just trap and ignore the call. With confidential computing guest execution of wbinvd() may even trigger an exception. Given guests should not be messing with the bare metal address map via CXL configuration changes cpu_cache_has_invalidate_memregion() returns false in VMs. While this global cache invalidation facility, is exported to modules, since NVDIMM and CXL support can be built as a module, it is not for general use. The intent is that this facility is not available outside of specific "device-memory" use cases. To make that expectation as clear as possible the API is scoped to a new "DEVMEM" module namespace that only the NVDIMM and CXL subsystems are expected to import. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14cxl/doe: Request exclusive DOE accessIra Weiny
The PCIE Data Object Exchange (DOE) mailbox is a protocol run over configuration cycles. It assumes one initiator at a time. While the kernel has control of the mailbox user space writes could interfere with the kernel access. Mark DOE mailbox config space exclusive when iterated by the CXL driver. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14PCI: Allow drivers to request exclusive config regionsIra Weiny
PCI config space access from user space has traditionally been unrestricted with writes being an understood risk for device operation. Unfortunately, device breakage or odd behavior from config writes lacks indicators that can leave driver writers confused when evaluating failures. This is especially true with the new PCIe Data Object Exchange (DOE) mailbox protocol where backdoor shenanigans from user space through things such as vendor defined protocols may affect device operation without complete breakage. A prior proposal restricted read and writes completely.[1] Greg and Bjorn pointed out that proposal is flawed for a couple of reasons. First, lspci should always be allowed and should not interfere with any device operation. Second, setpci is a valuable tool that is sometimes necessary and it should not be completely restricted.[2] Finally methods exist for full lock of device access if required. Even though access should not be restricted it would be nice for driver writers to be able to flag critical parts of the config space such that interference from user space can be detected. Introduce pci_request_config_region_exclusive() to mark exclusive config regions. Such regions trigger a warning and kernel taint if accessed via user space. Create pci_warn_once() to restrict the user from spamming the log. [1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/ [2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/ Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14lib/raid6: drop RAID6_USE_EMPTY_ZERO_PAGEGiulio Benetti
RAID6_USE_EMPTY_ZERO_PAGE is unused and hardcoded to 0, so let's drop it. Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-11-14Merge tag 'at91-fixes-6.1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes AT91 fixes for 6.1 It contains: - signal name fix for a pin on SAMA7G5 - memory self-refresh fix for SAMA7G5 by avoid soft resetting AC DLL which can introduce glitches in RAM controller and lead to unexpected behavior - led support fix for lan966x-pcb8291 board by enabling sgpio node * tag 'at91-fixes-6.1' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: at91: pm: avoid soft resetting AC DLL ARM: dts: lan966x: Enable sgpio on pcb8291 ARM: dts: at91: sama7g5: fix signal name of pin PB2 Link: https://lore.kernel.org/r/20221110115411.180876-1-claudiu.beznea@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-13Merge tag 'efi-fixes-for-v6.1-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Force the use of SetVirtualAddressMap() on Ampera Altra arm64 machines, which crash in SetTime() if no virtual remapping is used This is the first time we've added an SMBIOS based quirk on arm64, but fortunately, we can just call a EFI protocol to grab the type #1 SMBIOS record when running in the stub, so we don't need all the machinery we have in the kernel proper to parse SMBIOS data. - Drop a spurious warning on misaligned runtime regions when using 16k or 64k pages on arm64 * tag 'efi-fixes-for-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Fix handling of misaligned runtime regions and drop warning arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
2022-11-12xattr: use rbtree for simple_xattrsChristian Brauner
A while ago Vasily reported that it is possible to set a large number of xattrs on inodes of filesystems that make use of the simple xattr infrastructure. This includes all kernfs-based filesystems that support xattrs (e.g., cgroupfs and tmpfs). Both cgroupfs and tmpfs can be mounted by unprivileged users in unprivileged containers and root in an unprivileged container can set an unrestricted number of security.* xattrs and privileged users can also set unlimited trusted.* xattrs. As there are apparently users that have a fairly large number of xattrs we should scale a bit better. Other xattrs such as user.* are restricted for kernfs-based instances to a fairly limited number. Using a simple linked list protected by a spinlock used for set, get, and list operations doesn't scale well if users use a lot of xattrs even if it's not a crazy number. There's no need to bring in the big guns like rhashtables or rw semaphores for this. An rbtree with a rwlock, or limited rcu semanics and seqlock is enough. It scales within the constraints we are working in. By far the most common operation is getting an xattr. Setting xattrs should be a moderately rare operation. And listxattr() often only happens when copying xattrs between files or together with the contents to a new file. Holding a lock across listxattr() is unproblematic because it doesn't list the values of xattrs. It can only be used to list the names of all xattrs set on a file. And the number of xattr names that can be listed with listxattr() is limited to XATTR_LIST_MAX aka 65536 bytes. If a larger buffer is passed then vfs_listxattr() caps it to XATTR_LIST_MAX and if more xattr names are found it will return -E2BIG. In short, the maximum amount of memory that can be retrieved via listxattr() is limited. Of course, the API is broken as documented on xattr(7) already. In the future we might want to address this but for now this is the world we live in and have lived for a long time. But it does indeed mean that once an application goes over XATTR_LIST_MAX limit of xattrs set on an inode it isn't possible to copy the file and include its xattrs in the copy unless the caller knows all xattrs or limits the copy of the xattrs to important ones it knows by name (At least for tmpfs, and kernfs-based filesystems. Other filesystems might provide ways of achieving this.). Bonus of this port to rbtree+rwlock is that we shrink the memory consumption for users of the simple xattr infrastructure. Also add proper kernel documentation to all the functions. A big thanks to Paul for his comments. Cc: Vasily Averin <vvs@openvz.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-11-12Merge tag 'asoc-fix-v6.2-rc4' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.1 A relatively large collection of fixes and new platform quirks here, they're all fairly minor though - the widest possible impact is the fix to the use of prefixes on regulator names which would have broken any device that integrates regulators with DAPM and was used in a system where it had a name prefix assigning to it.
2022-11-11Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Andrii Nakryiko says: ==================== bpf 2022-11-11 We've added 11 non-merge commits during the last 8 day(s) which contain a total of 11 files changed, 83 insertions(+), 74 deletions(-). The main changes are: 1) Fix strncpy_from_kernel_nofault() to prevent out-of-bounds writes, from Alban Crequy. 2) Fix for bpf_prog_test_run_skb() to prevent wrong alignment, from Baisong Zhong. 3) Switch BPF_DISPATCHER to static_call() instead of ftrace infra, with a small build fix on top, from Peter Zijlstra and Nathan Chancellor. 4) Fix memory leak in BPF verifier in some error cases, from Wang Yufen. 5) 32-bit compilation error fixes for BPF selftests, from Pu Lehui and Yang Jihong. 6) Ensure even distribution of per-CPU free list elements, from Xu Kuohai. 7) Fix copy_map_value() to track special zeroed out areas properly, from Xu Kuohai. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix offset calculation error in __copy_map_value and zero_map_value bpf: Initialize same number of free nodes for each pcpu_freelist selftests: bpf: Add a test when bpf_probe_read_kernel_str() returns EFAULT maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault() selftests/bpf: Fix test_progs compilation failure in 32-bit arch selftests/bpf: Fix casting error when cross-compiling test_verifier for 32-bit platforms bpf: Fix memory leaks in __check_func_call bpf: Add explicit cast to 'void *' for __BPF_DISPATCHER_UPDATE() bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace) bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop") bpf, test_run: Fix alignment problem in bpf_prog_test_run_skb() ==================== Link: https://lore.kernel.org/r/20221111231624.938829-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-11Merge tag 'mm-hotfixes-stable-2022-11-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "22 hotfixes. Eight are cc:stable and the remainder address issues which were introduced post-6.0 or which aren't considered serious enough to justify a -stable backport" * tag 'mm-hotfixes-stable-2022-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) docs: kmsan: fix formatting of "Example report" mm/damon/dbgfs: check if rm_contexts input is for a real context maple_tree: don't set a new maximum on the node when not reusing nodes maple_tree: fix depth tracking in maple_state arch/x86/mm/hugetlbpage.c: pud_huge() returns 0 when using 2-level paging fs: fix leaked psi pressure state nilfs2: fix use-after-free bug of ns_writer on remount x86/traps: avoid KMSAN bugs originating from handle_bug() kmsan: make sure PREEMPT_RT is off Kconfig.debug: ensure early check for KMSAN in CONFIG_KMSAN_WARN x86/uaccess: instrument copy_from_user_nmi() kmsan: core: kmsan_in_runtime() should return true in NMI context mm: hugetlb_vmemmap: include missing linux/moduleparam.h mm/shmem: use page_mapping() to detect page cache for uffd continue mm/memremap.c: map FS_DAX device memory as decrypted Partly revert "mm/thp: carry over dirty bit when thp splits on pmd" nilfs2: fix deadlock in nilfs_count_free_blocks() mm/mmap: fix memory leak in mmap_region() hugetlbfs: don't delete error page from pagecache maple_tree: reorganize testing to restore module testing ...
2022-11-11Merge tag 'io_uring-6.1-2022-11-11' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Nothing major, just a few minor tweaks: - Tweak for the TCP zero-copy io_uring self test (Pavel) - Rather than use our internal cached value of number of CQ events available, use what the user can see (Dylan) - Fix a typo in a comment, added in this release (me) - Don't allow wrapping while adding provided buffers (me) - Fix a double poll race, and add a lockdep assertion for it too (Pavel)" * tag 'io_uring-6.1-2022-11-11' of git://git.kernel.dk/linux: io_uring/poll: lockdep annote io_poll_req_insert_locked io_uring/poll: fix double poll req->flags races io_uring: check for rollover of buffer ID when providing buffers io_uring: calculate CQEs from the user visible value io_uring: fix typo in io_uring.h comment selftests/net: don't tests batched TCP io_uring zc
2022-11-11bpf: Fix offset calculation error in __copy_map_value and zero_map_valueXu Kuohai
Function __copy_map_value and zero_map_value miscalculated copy offset, resulting in possible copy of unwanted data to user or kernel. Fix it. Fixes: cc48755808c6 ("bpf: Add zero_map_value to zero map value with special fields") Fixes: 4d7d7f69f4b1 ("bpf: Adapt copy_map_value for multiple offset case") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20221111125620.754855-1-xukuohai@huaweicloud.com
2022-11-11Merge tag 'hardening-v6.1-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening fix from Kees Cook: - Fix !SMP placement of '.data..decrypted' section (Nathan Chancellor) * tag 'hardening-v6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: vmlinux.lds.h: Fix placement of '.data..decrypted' section
2022-11-11Merge tag 'hyperv-fixes-signed-20221110' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Fix TSC MSR write for root partition (Anirudh Rayabharam) - Fix definition of vector in pci-hyperv driver (Dexuan Cui) - A few other misc patches * tag 'hyperv-fixes-signed-20221110' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: PCI: hv: Fix the definition of vector in hv_compose_msi_msg() MAINTAINERS: remove sthemmin x86/hyperv: fix invalid writes to MSRs during root partition kexec clocksource/drivers/hyperv: add data structure for reference TSC MSR Drivers: hv: fix repeated words in comments x86/hyperv: Remove BUG_ON() for kmap_local_page()
2022-11-11Merge tag 'dmaengine-fix-6.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Misc minor driver fixes and a big pile of at_hdmac driver fixes. More work on this driver is done and sitting in next: - Pile of at_hdmac driver rework which fixes many long standing issues for this driver. - couple of stm32 driver fixes for clearing structure and race fix - idxd fixes for RO device state and batch size - ti driver mem leak fix - apple fix for grabbing channels in xlate - resource leak fix in mv xor" * tag 'dmaengine-fix-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (24 commits) dmaengine: at_hdmac: Check return code of dma_async_device_register dmaengine: at_hdmac: Fix impossible condition dmaengine: at_hdmac: Don't allow CPU to reorder channel enable dmaengine: at_hdmac: Fix completion of unissued descriptor in case of errors dmaengine: at_hdmac: Fix descriptor handling when issuing it to hardware dmaengine: at_hdmac: Fix concurrency over the active list dmaengine: at_hdmac: Free the memset buf without holding the chan lock dmaengine: at_hdmac: Fix concurrency over descriptor dmaengine: at_hdmac: Fix concurrency problems by removing atc_complete_all() dmaengine: at_hdmac: Protect atchan->status with the channel lock dmaengine: at_hdmac: Do not call the complete callback on device_terminate_all dmaengine: at_hdmac: Fix premature completion of desc in issue_pending dmaengine: at_hdmac: Start transfer for cyclic channels in issue_pending dmaengine: at_hdmac: Don't start transactions at tx_submit level dmaengine: at_hdmac: Fix at_lli struct definition dmaengine: stm32-dma: fix potential race between pause and resume dmaengine: ti: k3-udma-glue: fix memory leak when register device fail dmaengine: mv_xor_v2: Fix a resource leak in mv_xor_v2_remove() dmaengine: apple-admac: Fix grabbing of channels in of_xlate dmaengine: idxd: fix RO device state error after been disabled/reset ...
2022-11-11ASoC: Set BQ parameters for some Dell modelsMark Brown
There are some Dell SKUs that need to set the parameters of the crossover filter (biquad). Each amplifier connects to one tweeter speaker and one woofer speaker. We should control HPF/LPF to output the proper frequency for the different speakers. If the codec driver got the BQ parameters from the device property, it will apply these parameters to the hardware.
2022-11-11Merge tag 'drm-fixes-2022-11-11' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly pull request for graphics, mostly amdgpu and i915, with a couple of fixes for vc4 and panfrost, panel quirks and a kconfig change for rcar-du. Nothing seems to be too strange at this stage. amdgpu: - Fix s/r in amdgpu_vram_mgr_new - SMU 13.0.4 update - GPUVM TLB race fix - DCN 3.1.4 fixes - DCN 3.2.x fixes - Vega10 fan fix - BACO fix for Beige Goby board - PSR fix - GPU VM PT locking fixes amdkfd: - CRIU fixes vc4: - HDMI fixes to vc4. panfrost: - Make panfrost's uapi header compile with C++. - Handle 1 gb boundary correctly in panfrost mmu code. panel: - Add rotation quirks for 2 panels. rcar-du: - DSI Kconfig fix i915: - Fix sg_table handling in map_dma_buf - Send PSR update also on invalidate - Do not set cache_dirty for DGFX - Restore userptr probe_range behaviour" * tag 'drm-fixes-2022-11-11' of git://anongit.freedesktop.org/drm/drm: (29 commits) drm/amd/display: only fill dirty rectangles when PSR is enabled drm/amdgpu: disable BACO on special BEIGE_GOBY card drm/amdgpu: Drop eviction lock when allocating PT BO drm/amdgpu: Unlock bo_list_mutex after error handling Revert "drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10 properly"" drm/amd/display: Enforce minimum prefetch time for low memclk on DCN32 drm/amd/display: Fix gpio port mapping issue drm/amd/display: Fix reg timeout in enc314_enable_fifo drm/amd/display: Fix FCLK deviation and tool compile issues drm/amd/display: Zeromem mypipe heap struct before using it drm/amd/display: Update SR watermarks for DCN314 drm/amdgpu: workaround for TLB seq race drm/amdkfd: Fix error handling in criu_checkpoint drm/amdkfd: Fix error handling in kfd_criu_restore_events drm/amd/pm: update SMU IP v13.0.4 msg interface header drm: rcar-du: Fix Kconfig dependency between RCAR_DU and RCAR_MIPI_DSI drm/panfrost: Split io-pgtable requests properly drm/amdgpu: Fix the lpfn checking condition in drm buddy drm: panel-orientation-quirks: Add quirk for Acer Switch V 10 (SW5-017) drm: panel-orientation-quirks: Add quirk for Nanote UMPC-01 ...
2022-11-11sbitmap: Use single per-bitmap counting to wake up queued tagsGabriel Krisman Bertazi
sbitmap suffers from code complexity, as demonstrated by recent fixes, and eventual lost wake ups on nested I/O completion. The later happens, from what I understand, due to the non-atomic nature of the updates to wait_cnt, which needs to be subtracted and eventually reset when equal to zero. This two step process can eventually miss an update when a nested completion happens to interrupt the CPU in between the wait_cnt updates. This is very hard to fix, as shown by the recent changes to this code. The code complexity arises mostly from the corner cases to avoid missed wakes in this scenario. In addition, the handling of wake_batch recalculation plus the synchronization with sbq_queue_wake_up is non-trivial. This patchset implements the idea originally proposed by Jan [1], which removes the need for the two-step updates of wait_cnt. This is done by tracking the number of completions and wakeups in always increasing, per-bitmap counters. Instead of having to reset the wait_cnt when it reaches zero, we simply keep counting, and attempt to wake up N threads in a single wait queue whenever there is enough space for a batch. Waking up less than batch_wake shouldn't be a problem, because we haven't changed the conditions for wake up, and the existing batch calculation guarantees at least enough remaining completions to wake up a batch for each queue at any time. Performance-wise, one should expect very similar performance to the original algorithm for the case where there is no queueing. In both the old algorithm and this implementation, the first thing is to check ws_active, which bails out if there is no queueing to be managed. In the new code, we took care to avoid accounting completions and wakeups when there is no queueing, to not pay the cost of atomic operations unnecessarily, since it doesn't skew the numbers. For more interesting cases, where there is queueing, we need to take into account the cross-communication of the atomic operations. I've been benchmarking by running parallel fio jobs against a single hctx nullb in different hardware queue depth scenarios, and verifying both IOPS and queueing. Each experiment was repeated 5 times on a 20-CPU box, with 20 parallel jobs. fio was issuing fixed-size randwrites with qd=64 against nullb, varying only the hardware queue length per test. queue size 2 4 8 16 32 64 6.1-rc2 1681.1K (1.6K) 2633.0K (12.7K) 6940.8K (16.3K) 8172.3K (617.5K) 8391.7K (367.1K) 8606.1K (351.2K) patched 1721.8K (15.1K) 3016.7K (3.8K) 7543.0K (89.4K) 8132.5K (303.4K) 8324.2K (230.6K) 8401.8K (284.7K) The following is a similar experiment, ran against a nullb with a single bitmap shared by 20 hctx spread across 2 NUMA nodes. This has 40 parallel fio jobs operating on the same device queue size 2 4 8 16 32 64 6.1-rc2 1081.0K (2.3K) 957.2K (1.5K) 1699.1K (5.7K) 6178.2K (124.6K) 12227.9K (37.7K) 13286.6K (92.9K) patched 1081.8K (2.8K) 1316.5K (5.4K) 2364.4K (1.8K) 6151.4K (20.0K) 11893.6K (17.5K) 12385.6K (18.4K) It has also survived blktests and a 12h-stress run against nullb. I also ran the code against nvme and a scsi SSD, and I didn't observe performance regression in those. If there are other tests you think I should run, please let me know and I will follow up with results. [1] https://lore.kernel.org/all/aef9de29-e9f5-259a-f8be-12d1b734e72@google.com/ Cc: Hugh Dickins <hughd@google.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Liu Song <liusong@linux.alibaba.com> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Link: https://lore.kernel.org/r/20221105231055.25953-1-krisman@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-11soc/tegra: fuse: Use platform info with SoC revisionKartik
Tegra pre-silicon platforms do not have chip revisions. This makes the revision SoC attribute meaningless on these platforms. Instead, populate the revision SoC attribute with a combination of the platform name and the chip revision for silicon platforms, and simply with the platform name on pre-silicon platforms. Signed-off-by: Kartik <kkartik@nvidia.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-11-11ata: libata-sff: kill unused ata_sff_busy_sleep()Sergey Shtylyov
Nobody seems to call ata_sff_busy_sleep(), so we can get rid of it... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-11-10Merge tag 'net-6.1-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wifi, can and bpf. Current release - new code bugs: - can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet Previous releases - regressions: - bpf, sockmap: fix the sk->sk_forward_alloc warning - wifi: mac80211: fix general-protection-fault in ieee80211_subif_start_xmit() - can: af_can: fix NULL pointer dereference in can_rx_register() - can: dev: fix skb drop check, avoid o-o-b access - nfnetlink: fix potential dead lock in nfnetlink_rcv_msg() Previous releases - always broken: - bpf: fix wrong reg type conversion in release_reference() - gso: fix panic on frag_list with mixed head alloc types - wifi: brcmfmac: fix buffer overflow in brcmf_fweh_event_worker() - wifi: mac80211: set TWT Information Frame Disabled bit as 1 - eth: macsec offload related fixes, make sure to clear the keys from memory - tun: fix memory leaks in the use of napi_get_frags - tun: call napi_schedule_prep() to ensure we own a napi - tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent - ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to network - tipc: fix a msg->req tlv length check - sctp: clear out_curr if all frag chunks of current msg are pruned, avoid list corruption - mctp: fix an error handling path in mctp_init(), avoid leaks" * tag 'net-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits) eth: sp7021: drop free_netdev() from spl2sw_init_netdev() MAINTAINERS: Move Vivien to CREDITS net: macvlan: fix memory leaks of macvlan_common_newlink ethernet: tundra: free irq when alloc ring failed in tsi108_open() net: mv643xx_eth: disable napi when init rxq or txq failed in mv643xx_eth_open() ethernet: s2io: disable napi when start nic failed in s2io_card_up() net: atlantic: macsec: clear encryption keys from the stack net: phy: mscc: macsec: clear encryption keys when freeing a flow stmmac: dwmac-loongson: fix missing of_node_put() while module exiting stmmac: dwmac-loongson: fix missing pci_disable_device() in loongson_dwmac_probe() stmmac: dwmac-loongson: fix missing pci_disable_msi() while module exiting cxgb4vf: shut down the adapter when t4vf_update_port_info() failed in cxgb4vf_open() mctp: Fix an error handling path in mctp_init() stmmac: intel: Update PCH PTP clock rate from 200MHz to 204.8MHz net: cxgb3_main: disable napi when bind qsets failed in cxgb_up() net: cpsw: disable napi in cpsw_ndo_open() iavf: Fix VF driver counting VLAN 0 filters ice: Fix spurious interrupt during removal of trusted VF net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions net/mlx5e: E-Switch, Fix comparing termination table instance ...
2022-11-10arm64: efi: Force the use of SetVirtualAddressMap() on Altra machinesArd Biesheuvel
Ampere Altra machines are reported to misbehave when the SetTime() EFI runtime service is called after ExitBootServices() but before calling SetVirtualAddressMap(). Given that the latter is horrid, pointless and explicitly documented as optional by the EFI spec, we no longer invoke it at boot if the configured size of the VA space guarantees that the EFI runtime memory regions can remain mapped 1:1 like they are at boot time. On Ampere Altra machines, this results in SetTime() calls issued by the rtc-efi driver triggering synchronous exceptions during boot. We can now recover from those without bringing down the system entirely, due to commit 23715a26c8d81291 ("arm64: efi: Recover from synchronous exceptions occurring in firmware"). However, it would be better to avoid the issue entirely, given that the firmware appears to remain in a funny state after this. So attempt to identify these machines based on the 'family' field in the type #1 SMBIOS record, and call SetVirtualAddressMap() unconditionally in that case. Tested-by: Alexandru Elisei <alexandru.elisei@gmail.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-11Merge tag 'drm-misc-fixes-2022-11-09' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v6.1-rc5: - HDMI fixes to vc4. - Make panfrost's uapi header compile with C++. - Add rotation quirks for 2 panels. - Fix s/r in amdgpu_vram_mgr_new - Handle 1 gb boundary correctly in panfrost mmu code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e02de501-4b85-28a0-3f6e-751ca13f5f9d@linux.intel.com
2022-11-10vfio: Export the device set open countAnthony DeRossi
The open count of a device set is the sum of the open counts of all devices in the set. Drivers can use this value to determine whether shared resources are in use without tracking them manually or accessing the private open_count in vfio_device. Signed-off-by: Anthony DeRossi <ajderossi@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20221110014027.28780-3-ajderossi@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10ASoC: SOF: ipc4-topology: Add widget queue supportMark Brown
Merge series from Peter Ujfalusi <peter.ujfalusi@linux.intel.com>: with SOF topology2 for IPC4, widgets might have mutliple queues they can be connected. The queues to use between components are descibed in the topology file. This series adds widget queue support (specify which pin to connect) for ipc4-topology with topology2. Note: currently queue 0 of a widget is used as hardwired default.
2022-11-10mm: kasan: Extend kasan_metadata_size() to also cover in-object sizeFeng Tang
When kasan is enabled for slab/slub, it may save kasan' free_meta data in the former part of slab object data area in slab object's free path, which works fine. There is ongoing effort to extend slub's debug function which will redzone the latter part of kmalloc object area, and when both of the debug are enabled, there is possible conflict, especially when the kmalloc object has small size, as caught by 0Day bot [1]. To solve it, slub code needs to know the in-object kasan's meta data size. Currently, there is existing kasan_metadata_size() which returns the kasan's metadata size inside slub's metadata area, so extend it to also cover the in-object meta size by adding a boolean flag 'in_object'. There is no functional change to existing code logic. [1]. https://lore.kernel.org/lkml/YuYm3dWwpZwH58Hu@xsang-OptiPlex-9020/ Reported-by: kernel test robot <oliver.sang@intel.com> Suggested-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-11-10memory: omap-gpmc: fix coverity issue "Control flow issues"Benedikt Niedermayr
Assign a big positive integer instead of an negative integer to an u32 variable. Also remove the check for ">= 0" which doesn't make sense for unsigned integers. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1527139 ("Control flow issues") Fixes: 89aed3cd5cb9 ("memory: omap-gpmc: wait pin additions") Signed-off-by: Benedikt Niedermayr <benedikt.niedermayr@siemens.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20221109102454.174320-1-benedikt.niedermayr@siemens.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-11-10pinctrl: Put space between type and data in compound literalAndy Shevchenko
It's slightly better to read when compound literal data and type are separated by a space. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Link: https://lore.kernel.org/r/20221109152356.39868-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-11-10dt-bindings: pinctrl: Correct the header guard of mt6795-pinfunc.hWei Li
Rename the header guard of mt6795-pinfunc.h from __DTS_MT8173_PINFUNC_H to __DTS_MT6795_PINFUNC_H what corresponding with the file name. Fixes: 81557a71564a ("dt-bindings: pinctrl: Add MediaTek MT6795 pinctrl bindings") Signed-off-by: Wei Li <liwei391@huawei.com> Link: https://lore.kernel.org/r/20221108094529.3597920-1-liwei391@huawei.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-11-09PM: domains: Store the next hrtimer wakeup in genpdMaulik Shah
The arch timer cannot wake up the Qualcomm Technologies, Inc. (QTI) SoCs from the deeper CPUidle states. To be able to wakeup from these deeper states, another always-on timer needs to be programmed through the so called CONTROL_TCS. As the RSC is part of CPU subsystem and the corresponding APSS RSC device is attached to the cluster PM domain (through genpd), it holds the responsibility to program the always-on timer, before entering any of these deeper CPUidle states. However, programming the timer requires information about the next hrtimer wakeup for the cluster PM domain, which is currently only known by genpd. Therefore, let's share this data through a new genpd helper function, dev_pm_genpd_get_next_hrtimer(). Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> [Ulf: Reworked the code and updated the commit message] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # SM8450 Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221018152837.619426-5-ulf.hansson@linaro.org
2022-11-10soc/tegra: pmc: Add I/O pad table for Tegra234Petlozu Pravareshwar
Add I/O pad table for Tegra234 to allow configuring DPD mode and switching the pins to 1.8V or 3.3V as needed. On Tegra234, DPD registers are reorganized such that there is a DPD_REQ register and a DPD_STATUS register per pad group. Update the PMC driver accordingly. While at it, use the generated tables from tegra-pinmux-scripts to make the formatting of these tables more consistent. Signed-off-by: Petlozu Pravareshwar <petlozup@nvidia.com> [treding@nvidia.com: generate tables from tegra-pinmux-scripts] Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-11-09drm/amdgpu: Set MTYPE in PTE based on BO flagsFelix Kuehling
The same BO may need different MTYPEs and SNOOP flags in PTEs depending on its current location relative to the mapping GPU. Setting MTYPEs from clients ahead of time is not practical for coherent memory sharing. Instead determine the correct MTYPE for the desired coherence model and current BO location when updating the page tables. To maintain backwards compatibility with MTYPE-selection in AMDGPU_VA_OP_MAP, the coherence-model-based MTYPE selection is only applied if it chooses an MTYPE other than MTYPE_NC (the default). Add two AMDGPU_GEM_CREATE_... flags to indicate the coherence model. The default if no flag is specified is non-coherent (i.e. coarse-grained coherent at dispatch boundaries). Update amdgpu_amdkfd_gpuvm.c to use this new method to choose the correct MTYPE depending on the current memory location. v2: * check that bo is not NULL (e.g. PRT mappings) * Fix missing ~ bitmask in gmc_v11_0.c v3: * squash in "drm/amdgpu: Inherit coherence flags on dmabuf import" Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-09Merge tag 'slab-for-6.1-rc4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: "Most are small fixups as described below. The !CONFIG_TRACING fix is a bit bigger and would normally be done in the next merge window as part of upcoming hardening changes. But we realized it can make the kmalloc waste tracking introduced in this window inaccurate, so decided to go with it now. Summary: - Remove !CONFIG_TRACING kmalloc() wrappers intended to save a function call, due to incompatilibity with recently introduced wasted space tracking and planned hardening changes. - A tracing parameter regression fix, by Kees Cook. - Two kernel-doc warning fixups, by Lukas Bulwahn and myself * tag 'slab-for-6.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm, slab: remove duplicate kernel-doc comment for ksize() mm/slab_common: Restore passing "caller" for tracing mm/slab: remove !CONFIG_TRACING variants of kmalloc_[node_]trace() mm/slab_common: repair kernel-doc for __ksize()
2022-11-09block: add check when merging zone device pagesLogan Gunthorpe
Consecutive zone device pages should not be merged into the same sgl or bvec segment with other types of pages or if they belong to different pgmaps. Otherwise getting the pgmap of a given segment is not possible without scanning the entire segment. This helper returns true either if both pages are not zone device pages or both pages are zone device pages with the same pgmap. Add a helper to determine if zone device pages are mergeable and use this helper in page_is_mergeable(). Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221021174116.7200-5-logang@deltatee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-09iov_iter: introduce iov_iter_get_pages_[alloc_]flags()Logan Gunthorpe
Add iov_iter_get_pages_flags() and iov_iter_get_pages_alloc_flags() which take a flags argument that is passed to get_user_pages_fast(). This is so that FOLL_PCI_P2PDMA can be passed when appropriate. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221021174116.7200-4-logang@deltatee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-09mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pagesLogan Gunthorpe
GUP Callers that expect PCI P2PDMA pages can now set FOLL_PCI_P2PDMA to allow obtaining P2PDMA pages. If GUP is called without the flag and a P2PDMA page is found, it will return an error in try_grab_page() or try_grab_folio(). The check is safe to do before taking the reference to the page in both cases seeing the page should be protected by either the appropriate ptl or mmap_lock; or the gup fast guarantees preventing TLB flushes. try_grab_folio() has one call site that WARNs on failure and cannot actually deal with the failure of this function (it seems it will get into an infinite loop). Expand the comment there to document a couple more conditions on why it will not fail. FOLL_PCI_P2PDMA cannot be set if FOLL_LONGTERM is set. This is to copy fsdax until pgmap refcounts are fixed (see the link below for more information). Link: https://lkml.kernel.org/r/Yy4Ot5MoOhsgYLTQ@ziepe.ca Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221021174116.7200-3-logang@deltatee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-09mm: allow multiple error returns in try_grab_page()Logan Gunthorpe
In order to add checks for P2PDMA memory into try_grab_page(), expand the error return from a bool to an int/error code. Update all the callsites handle change in usage. Also remove the WARN_ON_ONCE() call at the callsites seeing there already is a WARN_ON_ONCE() inside the function if it fails. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221021174116.7200-2-logang@deltatee.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-09firmware/nvram: bcm47xx: support init from IO memoryRafał Miłecki
Provide NVMEM content to the NVRAM driver from a simple memory resource. This is necessary to use NVRAM in a memory- mapped flash device. Patch taken from OpenWrts development tree. This patch makes it possible to use memory-mapped NVRAM on the D-Link DWL-8610AP and the D-Link DIR-890L. Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: linux-mips@vger.kernel.org Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Rafał Miłecki <rafal@milecki.pl> [Added an export for modules potentially using the init symbol] Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20221103082529.359084-1-linus.walleij@linaro.org Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2022-11-09scs: add support for dynamic shadow call stacksArd Biesheuvel
In order to allow arches to use code patching to conditionally emit the shadow stack pushes and pops, rather than always taking the performance hit even on CPUs that implement alternatives such as stack pointer authentication on arm64, add a Kconfig symbol that can be set by the arch to omit the SCS codegen itself, without otherwise affecting how support code for SCS and compiler options (for register reservation, for instance) are emitted. Also, add a static key and some plumbing to omit the allocation of shadow call stack for dynamic SCS configurations if SCS is disabled at runtime. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20221027155908.1940624-3-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09arm64: unwind: add asynchronous unwind tables to kernel and modulesArd Biesheuvel
Enable asynchronous unwind table generation for both the core kernel as well as modules, and emit the resulting .eh_frame sections as init code so we can use the unwind directives for code patching at boot or module load time. This will be used by dynamic shadow call stack support, which will rely on code patching rather than compiler codegen to emit the shadow call stack push and pop instructions. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>