summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2022-07-20genirq/generic_chip: Export irq_unmap_generic_chipJianmin Lv
Some irq controllers have to re-implement a private version for irq_generic_chip_ops, because they have a different xlate to translate hwirq. Export irq_unmap_generic_chip to allow reusing in drivers. Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1658314292-35346-5-git-send-email-lvjianmin@loongson.cn
2022-07-20ACPI: irq: Allow acpi_gsi_to_irq() to have an arch-specific fallbackMarc Zyngier
It appears that the generic version of acpi_gsi_to_irq() doesn't fallback to establishing a mapping if there is no pre-existing one while the x86 version does. While arm64 seems unaffected by it, LoongArch is relying on the x86 behaviour. In an effort to prevent new architectures from reinventing the proverbial wheel, provide an optional callback that the arch code can set to restore the x86 behaviour. Hopefully we can eventually get rid of this in the future once the expected behaviour has been clarified. Reported-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Tested-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1658314292-35346-4-git-send-email-lvjianmin@loongson.cn
2022-07-20APCI: irq: Add support for multiple GSI domainsMarc Zyngier
In an unfortunate departure from the ACPI spec, the LoongArch architecture split its GSI space across multiple interrupt controllers. In order to be able to reuse the core code and prevent architectures from reinventing an already square wheel, offer the arch code the ability to register a dispatcher function that will return the domain fwnode for a given GSI. The ARM GIC drivers are updated to support this (with a single domain, as intended). Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Tested-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1658314292-35346-3-git-send-email-lvjianmin@loongson.cn
2022-07-20tools: Fixed MIPS builds due to struct flock re-definitionFlorian Fainelli
Building perf for MIPS failed after 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") with the following error: CC /home/fainelli/work/buildroot/output/bmips/build/linux-custom/tools/perf/trace/beauty/fcntl.o In file included from ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:77, from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../include/uapi/asm-generic/fcntl.h:188:8: error: redefinition of 'struct flock' struct flock { ^~~~~ In file included from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:63:8: note: originally defined here struct flock { ^~~~~ This is due to the local copy under tools/include/uapi/asm-generic/fcntl.h including the toolchain's kernel headers which already define 'struct flock' and define HAVE_ARCH_STRUCT_FLOCK to future inclusions make a decision as to whether re-defining 'struct flock' is appropriate or not. Make sure what do not re-define 'struct flock' when HAVE_ARCH_STRUCT_FLOCK is already defined. Fixes: 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [arnd: sync with include/uapi/asm-generic/fcntl.h as well] Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-20arm64: enable THP_SWAP for arm64Barry Song
THP_SWAP has been proven to improve the swap throughput significantly on x86_64 according to commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after swapped out"). As long as arm64 uses 4K page size, it is quite similar with x86_64 by having 2MB PMD THP. THP_SWAP is architecture-independent, thus, enabling it on arm64 will benefit arm64 as well. A corner case is that MTE has an assumption that only base pages can be swapped. We won't enable THP_SWAP for ARM64 hardware with MTE support until MTE is reworked to coexist with THP_SWAP. A micro-benchmark is written to measure thp swapout throughput as below, unsigned long long tv_to_ms(struct timeval tv) { return tv.tv_sec * 1000 + tv.tv_usec / 1000; } main() { struct timeval tv_b, tv_e;; #define SIZE 400*1024*1024 volatile void *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (!p) { perror("fail to get memory"); exit(-1); } madvise(p, SIZE, MADV_HUGEPAGE); memset(p, 0x11, SIZE); /* write to get mem */ gettimeofday(&tv_b, NULL); madvise(p, SIZE, MADV_PAGEOUT); gettimeofday(&tv_e, NULL); printf("swp out bandwidth: %ld bytes/ms\n", SIZE/(tv_to_ms(tv_e) - tv_to_ms(tv_b))); } Testing is done on rk3568 64bit Quad Core Cortex-A55 platform - ROCK 3A. thp swp throughput w/o patch: 2734bytes/ms (mean of 10 tests) thp swp throughput w/ patch: 3331bytes/ms (mean of 10 tests) Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Steven Price <steven.price@arm.com> Cc: Yang Shi <shy828301@gmail.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Barry Song <v-songbaohua@oppo.com> Link: https://lore.kernel.org/r/20220720093737.133375-1-21cnbao@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2022-07-20Merge tag 'linux-can-next-for-5.20-20220720' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== this is a pull request of 29 patches for net-next/master. The first 6 patches target the slcan driver. Dan Carpenter contributes a hardening patch, followed by 5 cleanup patches. Biju Das contributes 5 patches to prepare the sja1000 driver to support the Renesas RZ/N1 SJA1000 CAN controller. Dario Binacchi's patch for the slcan driver fixes a sleep with held spin lock. Another patch by Dario Binacchi fixes a wrong comment in the c_can driver. Pavel Pisa updates the CTU CAN FD IP core registers. Stephane Grosjean contributes 3 patches to the peak_usb driver for cleanups and support of a new MCU. The last 12 patches are by Vincent Mailhol, they fix and improve the txerr and rxerr reporting in all CAN drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.Kuniyuki Iwashima
While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20udp: Fix a data-race around sysctl_udp_l3mdev_accept.Kuniyuki Iwashima
While reading sysctl_udp_l3mdev_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20ip: Fix data-races around sysctl_ip_prot_sock.Kuniyuki Iwashima
sysctl_ip_prot_sock is accessed concurrently, and there is always a chance of data-race. So, all readers and writers need some basic protection to avoid load/store-tearing. Fixes: 4548b683b781 ("Introduce a sysctl that modifies the value of PROT_SOCK.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20KVM: s390: resetting the Topology-Change-ReportPierre Morel
During a subsystem reset the Topology-Change-Report is cleared. Let's give userland the possibility to clear the MTCR in the case of a subsystem reset. To migrate the MTCR, we give userland the possibility to query the MTCR state. We indicate KVM support for the CPU topology facility with a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20220714194334.127812-1-pmorel@linux.ibm.com> Link: https://lore.kernel.org/all/20220714194334.127812-1-pmorel@linux.ibm.com/ [frankja@linux.ibm.com: Simple conflict resolution in Documentation/virt/kvm/api.rst] Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-07-20can: error: add definitions for the different CAN error thresholdsVincent Mailhol
Currently, drivers are using magic numbers to derive the CAN error states from the error counter. Add three macro declarations to remediate this. For reference, the error-active, error-passive and bus-off are defined in ISO 11898, section 12.1.4.2 "Error counting". Although ISO 11898 does not define error-warning state, this extra value is also commonly used and is thus also added. Link: https://lore.kernel.org/all/20220719143550.3681-13-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20can: add CAN_ERR_CNT flag to notify availability of error counterVincent Mailhol
Add a dedicated flag in uapi/linux/can/error.h to notify the userland that fields data[6] and data[7] of the CAN error frame were respectively populated with the tx and rx error counters. For all driver tree-wide, set up this flags whenever needed. Link: https://lore.kernel.org/all/20220719143550.3681-12-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-20can: error: specify the values of data[5..7] of CAN error framesVincent Mailhol
Currently, data[5..7] of struct can_frame, when used as a CAN error frame, are defined as being "controller specific". Device specific behaviours are problematic because it prevents someone from writing code which is portable between devices. As a matter of fact, data[5] is never used, data[6] is always used to report TX error counter and data[7] is always used to report RX error counter. can-utils also relies on this. This patch updates the comment in the uapi header to specify that data[5] is reserved (and thus should not be used) and that data[6..7] are used for error counters. Fixes: 0d66548a10cb ("[CAN]: Add PF_CAN core module") Link: https://lore.kernel.org/all/20220719143550.3681-11-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-19net/sched: remove qdisc_root_lock() helperDavide Caratti
the last caller has been removed with commit 96f5e66e8a79 ("mac80211: fix aggregation for hardware with ampdu queues"), so it's safe to remove this function. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/703d549e3088367651d92a059743f1be848d74b7.1658133689.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19driver-core: Introduce BIN_ATTR_ADMIN_{RO,RW}Ira Weiny
Many binary attributes need to limit access to CAP_SYS_ADMIN only; ie many binary attributes specify is_visible with 0400 or 0600. Make setting the permissions of such attributes more explicit by defining BIN_ATTR_ADMIN_{RO,RW}. Cc: Bjorn Helgaas <bhelgaas@google.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Suggested-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220719205249.566684-6-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19PCI/DOE: Add DOE mailbox support functionsJonathan Cameron
Introduced in a PCIe r6.0, sec 6.30, DOE provides a config space based mailbox with standard protocol discovery. Each mailbox is accessed through a DOE Extended Capability. Each DOE mailbox must support the DOE discovery protocol in addition to any number of additional protocols. Define core PCIe functionality to manage a single PCIe DOE mailbox at a defined config space offset. Functionality includes iterating, creating, query of supported protocol, and task submission. Destruction of the mailboxes is device managed. Cc: "Li, Ming" <ming4.li@intel.com> Cc: Bjorn Helgaas <helgaas@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Co-developed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220719205249.566684-4-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19PCI: Add vendor ID for the PCI SIGJonathan Cameron
This ID is used in DOE headers to identify protocols that are defined within the PCI Express Base Specification, PCIe r6.0, sec 6.30.1.1 table 6-32. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20220719205249.566684-2-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19Merge branch 'io_uring-zerocopy-send' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux Pavel Begunkov says: ==================== io_uring zerocopy send The patchset implements io_uring zerocopy send. It works with both registered and normal buffers, mixing is allowed but not recommended. Apart from usual request completions, just as with MSG_ZEROCOPY, io_uring separately notifies the userspace when buffers are freed and can be reused (see API design below), which is delivered into io_uring's Completion Queue. Those "buffer-free" notifications are not necessarily per request, but the userspace has control over it and should explicitly attaching a number of requests to a single notification. The series also adds some internal optimisations when used with registered buffers like removing page referencing. From the kernel networking perspective there are two main changes. The first one is passing ubuf_info into the network layer from io_uring (inside of an in kernel struct msghdr). This allows extra optimisations, e.g. ubuf_info caching on the io_uring side, but also helps to avoid cross-referencing and synchronisation problems. The second part is an optional optimisation removing page referencing for requests with registered buffers. Benchmarking UDP with an optimised version of the selftest (see [1]), which sends a bunch of requests, waits for completions and repeats. "+ flush" column posts one additional "buffer-free" notification per request, and just "zc" doesn't post buffer notifications at all. NIC (requests / second): IO size | non-zc | zc | zc + flush 4000 | 495134 | 606420 (+22%) | 558971 (+12%) 1500 | 551808 | 577116 (+4.5%) | 565803 (+2.5%) 1000 | 584677 | 592088 (+1.2%) | 560885 (-4%) 600 | 596292 | 598550 (+0.4%) | 555366 (-6.7%) dummy (requests / second): IO size | non-zc | zc | zc + flush 8000 | 1299916 | 2396600 (+84%) | 2224219 (+71%) 4000 | 1869230 | 2344146 (+25%) | 2170069 (+16%) 1200 | 2071617 | 2361960 (+14%) | 2203052 (+6%) 600 | 2106794 | 2381527 (+13%) | 2195295 (+4%) Previously it also brought a massive performance speedup compared to the msg_zerocopy tool (see [3]), which is probably not super interesting. There is also an additional bunch of refcounting optimisations that was omitted from the series for simplicity and as they don't change the picture drastically, they will be sent as follow up, as well as flushing optimisations closing the performance gap b/w two last columns. For TCP on localhost (with hacks enabling localhost zerocopy) and including additional overhead for receive: IO size | non-zc | zc 1200 | 4174 | 4148 4096 | 7597 | 11228 Using a real NIC 1200 bytes, zc is worse than non-zc ~5-10%, maybe the omitted optimisations will somewhat help, should look better for 4000, but couldn't test properly because of setup problems. Links: liburing (benchmark + tests): [1] https://github.com/isilence/liburing/tree/zc_v4 kernel repo: [2] https://github.com/isilence/linux/tree/zc_v4 RFC v1: [3] https://lore.kernel.org/io-uring/cover.1638282789.git.asml.silence@gmail.com/ RFC v2: https://lore.kernel.org/io-uring/cover.1640029579.git.asml.silence@gmail.com/ Net patches based: git@github.com:isilence/linux.git zc_v4-net-base or https://github.com/isilence/linux/tree/zc_v4-net-base API design overview: The series introduces an io_uring concept of notifactors. From the userspace perspective it's an entity to which it can bind one or more requests and then requesting to flush it. Flushing a notifier makes it impossible to attach new requests to it, and instructs the notifier to post a completion once all requests attached to it are completed and the kernel doesn't need the buffers anymore. Notifications are stored in notification slots, which should be registered as an array in io_uring. Each slot stores only one notifier at any particular moment. Flushing removes it from the slot and the slot automatically replaces it with a new notifier. All operations with notifiers are done by specifying an index of a slot it's currently in. When registering a notification the userspace specifies a u64 tag for each slot, which will be copied in notification completion entries as cqe::user_data. cqe::res is 0 and cqe::flags is equal to wrap around u32 sequence number counting notifiers of a slot. ==================== Link: https://lore.kernel.org/r/cover.1657643355.git.asml.silence@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19net: introduce __skb_fill_page_desc_noaccPavel Begunkov
Managed pages contain pinned userspace pages and controlled by upper layers, there is no need in tracking skb->pfmemalloc for them. Introduce a helper for filling frags but ignoring page tracking, it'll be needed later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19net: introduce managed frags infrastructurePavel Begunkov
Some users like io_uring can do page pinning more efficiently, so we want a way to delegate referencing to other subsystems. For that add a new flag called SKBFL_MANAGED_FRAG_REFS. When set, skb doesn't hold page references and upper layers are responsivle to managing page lifetime. It's allowed to convert skbs from managed to normal by calling skb_zcopy_downgrade_managed(). The function will take all needed page references and clear the flag. It's needed, for instance, to avoid mixing managed modes. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19net: Allow custom iter handler in msghdrDavid Ahern
Add support for custom iov_iter handling to msghdr. The idea is that in-kernel subsystems want control over how an SG is split. Signed-off-by: David Ahern <dsahern@kernel.org> [pavel: move callback into msghdr] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19skbuff: carry external ubuf_info in msghdrPavel Begunkov
Make possible for network in-kernel callers like io_uring to pass in a custom ubuf_info by setting it in a new field of struct msghdr. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19net/mlx5: Expose ts_cqe_metadata_size2wqe_counterAya Levin
Add capability field which indicates the mask for wqe_counter which connects between loopback CQE and the original WQE. With this connection the driver can identify lost of the loopback CQE and reply PTP synchronization with timestamp given in the original CQE. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19rcu: tiny: Record kvfree_call_rcu() call stack for KASANJohannes Berg
When running KASAN with Tiny RCU (e.g. under ARCH=um, where a working KASAN patch is now available), we don't get any information on the original kfree_rcu() (or similar) caller when a problem is reported, as Tiny RCU doesn't record this. Add the recording, which required pulling kvfree_call_rcu() out of line for the KASAN case since the recording function (kasan_record_aux_stack_noalloc) is neither exported, nor can we include kasan.h into rcutiny.h. without KASAN, the patch has no size impact (ARCH=um kernel): text data bss dec hex filename 6151515 4423154 33148520 43723189 29b29b5 linux 6151515 4423154 33148520 43723189 29b29b5 linux + patch with KASAN, the impact on my build was minimal: text data bss dec hex filename 13915539 7388050 33282304 54585893 340ea25 linux 13911266 7392114 33282304 54585684 340e954 linux + patch -4273 +4064 +-0 -209 Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-07-19RDMA/mlx5: Expose steering anchor to userspaceMark Bloch
Expose a steering anchor per priority to allow users to re-inject packets back into default NIC pipeline for additional processing. MLX5_IB_METHOD_STEERING_ANCHOR_CREATE returns a flow table ID which a user can use to re-inject packets at a specific priority. A FTE (flow table entry) can be created and the flow table ID used as a destination. When a packet is taken into a RDMA-controlled steering domain (like software steering) there may be a need to insert the packet back into the default NIC pipeline. This exposes a flow table ID to the user that can be used as a destination in a flow table entry. With this new method priorities that are exposed to users via MLX5_IB_METHOD_FLOW_MATCHER_CREATE can be reached from a non-zero UID. As user-created flow tables (via RDMA DEVX) are created with a non-zero UID thus it's impossible to point to a NIC core flow table (core driver flow tables are created with UID value of zero) from userspace. Create flow tables that are exposed to users with the shared UID, this allows users to point to default NIC flow tables. Steering loops are prevented at FW level as FW enforces that no flow table at level X can point to a table at level lower than X. Link: https://lore.kernel.org/all/20220703205407.110890-6-saeed@kernel.org/ Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-19Merge branch 'mlx5-next' into wip/leon-for-nextLeon Romanovsky
Mark Bloch Says: ================ Expose steering anchor Expose a steering anchor per priority to allow users to re-inject packets back into default NIC pipeline for additional processing. MLX5_IB_METHOD_STEERING_ANCHOR_CREATE returns a flow table ID which a user can use to re-inject packets at a specific priority. A FTE (flow table entry) can be created and the flow table ID used as a destination. When a packet is taken into a RDMA-controlled steering domain (like software steering) there may be a need to insert the packet back into the default NIC pipeline. This exposes a flow table ID to the user that can be used as a destination in a flow table entry. With this new method priorities that are exposed to users via MLX5_IB_METHOD_FLOW_MATCHER_CREATE can be reached from a non-zero UID. As user-created flow tables (via RDMA DEVX) are created with a non-zero UID thus it's impossible to point to a NIC core flow table (core driver flow tables are created with UID value of zero) from userspace. Create flow tables that are exposed to users with the shared UID, this allows users to point to default NIC flow tables. Steering loops are prevented at FW level as FW enforces that no flow table at level X can point to a table at level lower than X. ================ Link: https://lore.kernel.org/all/20220703205407.110890-1-saeed@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-19bpf: fix bpf_skb_pull_data documentationJoanne Koong
Fix documentation for bpf_skb_pull_data() helper for when len == 0. Fixes: fa15601ab31e ("bpf: add documentation for eBPF helpers (33-41)") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220715193800.3940070-1-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19bpf: Don't redirect packets with invalid pkt_lenZhengchao Shao
Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any skbs, that is, the flow->head is null. The root cause, as the [2] says, is because that bpf_prog_test_run_skb() run a bpf prog which redirects empty skbs. So we should determine whether the length of the packet modified by bpf prog or others like bpf_prog_test is valid before forwarding it directly. LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5 LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220715115559.139691-1-shaozhengchao@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19scsi: qla2xxx: tracing: Use the new __vstring() helperSteven Rostedt (Google)
Instead of open coding a __dynamic_array() with a fixed length (which defeats the purpose of the dynamic array in the first place). Use the new __vstring() helper that will use a va_list and only write enough of the string into the ring buffer that is needed. Link: https://lkml.kernel.org/r/20220705224750.896553364@goodmis.org Cc: Bart Van Assche <bvanassche@acm.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-19scsi: iscsi: tracing: Use the new __vstring() helperSteven Rostedt (Google)
Instead of open coding a __dynamic_array() with a fixed length (which defeats the purpose of the dynamic array in the first place). Use the new __vstring() helper that will use a va_list and only write enough of the string into the ring buffer that is needed. Link: https://lkml.kernel.org/r/20220705224750.715763972@goodmis.org Cc: Fred Herard <fred.herard@oracle.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-19Merge branch 'nuvoton/newsoc' into arm/newsocArnd Bergmann
Merge the new SoC support from Tomer Maimon: "This patchset adds initial support for the Nuvoton Arbel NPCM8XX Board Management controller (BMC) SoC family. The Nuvoton Arbel NPCM8XX SoC is a fourth-generation BMC. The NPCM8XX computing subsystem comprises a quadcore ARM Cortex A35 ARM-V8 architecture. This patchset adds minimal architecture and drivers such as: Clocksource, Clock, Reset, and WD. Some of the Arbel NPCM8XX peripherals are based on Poleg NPCM7XX. This patchset was tested on the Arbel NPCM8XX evaluation board." I'm leaving out the clk controller driver, which is still under review. * nuvoton/newsoc: arm64: defconfig: Add Nuvoton NPCM family support arm64: dts: nuvoton: Add initial NPCM845 EVB device tree arm64: dts: nuvoton: Add initial NPCM8XX device tree arm64: npcm: Add support for Nuvoton NPCM8XX BMC SoC dt-bindings: arm: npcm: Add nuvoton,npcm845 GCR compatible string dt-bindings: arm: npcm: Add nuvoton,npcm845 compatible string dt-bindings: arm: npcm: Add maintainer reset: npcm: Add NPCM8XX support dt-bindings: reset: npcm: Add support for NPCM8XX reset: npcm: using syscon instead of device data ARM: dts: nuvoton: add reset syscon property dt-bindings: reset: npcm: add GCR syscon property dt-binding: clk: npcm845: Add binding for Nuvoton NPCM8XX Clock dt-bindings: watchdog: npcm: Add npcm845 compatible string dt-bindings: timer: npcm: Add npcm845 compatible string
2022-07-19dt-binding: clk: npcm845: Add binding for Nuvoton NPCM8XX ClockTomer Maimon
Add binding for the Arbel BMC NPCM8XX Clock controller. Signed-off-by: Tomer Maimon <tmaimon77@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-19fs: add mode_strip_sgid() helperYang Xu
Add a dedicated helper to handle the setgid bit when creating a new file in a setgid directory. This is a preparatory patch for moving setgid stripping into the vfs. The patch contains no functional changes. Currently the setgid stripping logic is open-coded directly in inode_init_owner() and the individual filesystems are responsible for handling setgid inheritance. Since this has proven to be brittle as evidenced by old issues we uncovered over the last months (see [1] to [3] below) we will try to move this logic into the vfs. Link: e014f37db1a2 ("xfs: use setattr_copy to set vfs inode attributes") [1] Link: 01ea173e103e ("xfs: fix up non-directory creation in SGID directories") [2] Link: fd84bfdddd16 ("ceph: fix up non-directory creation in SGID directories") [3] Link: https://lore.kernel.org/r/1657779088-2242-1-git-send-email-xuyang2018.jy@fujitsu.com Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-and-Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Yang Xu <xuyang2018.jy@fujitsu.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-07-19KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean statsOliver Upton
commit 1b870fa5573e ("kvm: stats: tell userspace which values are boolean") added a new stat unit (boolean) but failed to raise KVM_STATS_UNIT_MAX. Fix by pointing UNIT_MAX at the new max value of UNIT_BOOLEAN. Fixes: 1b870fa5573e ("kvm: stats: tell userspace which values are boolean") Reported-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20220719125229.2934273-1-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19Revert "platform/chrome: Add Type-C mux set command definitions"Greg Kroah-Hartman
This reverts commit 28a6ed8e39f77f6ac613ec9b7461aa75e85fa79a. The chrome platform driver changes need to come in through the platform tree due to some api changes that showed up there that cause build errors in linux-next Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/20220719160821.5e68e30b@oak.ozlabs.ibm.com Cc: Prashant Malani <pmalani@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-19Merge branch irq/renesas-irqc into irq/irqchip-nextMarc Zyngier
* irq/renesas-irqc: : . : New Renesas RZ/G2L IRQC driver from Lad Prabhakar, equipped with : its companion GPIO driver. : . dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/V2L SoC gpio: thunderx: Don't directly include asm-generic/msi.h pinctrl: renesas: pinctrl-rzg2l: Add IRQ domain to handle GPIO interrupt dt-bindings: pinctrl: renesas,rzg2l-pinctrl: Document the properties to handle GPIO IRQ gpio: gpiolib: Allow free() callback to be overridden irqchip: Add RZ/G2L IA55 Interrupt Controller driver dt-bindings: interrupt-controller: Add Renesas RZ/G2L Interrupt Controller gpio: Remove dynamic allocation from populate_parent_alloc_arg() Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-07-19amt: use workqueue for gateway side message handlingTaehee Yoo
There are some synchronization issues(amt->status, amt->req_cnt, etc) if the interface is in gateway mode because gateway message handlers are processed concurrently. This applies a work queue for processing these messages instead of expanding the locking context. So, the purposes of this patch are to fix exist race conditions and to make gateway to be able to validate a gateway status more correctly. When the AMT gateway interface is created, it tries to establish to relay. The establishment step looks stateless, but it should be managed well. In order to handle messages in the gateway, it saves the current status(i.e. AMT_STATUS_XXX). This patch makes gateway code to be worked with a single thread. Now, all messages except the multicast are triggered(received or delay expired), and these messages will be stored in the event queue(amt->events). Then, the single worker processes stored messages asynchronously one by one. The multicast data message type will be still processed immediately. Now, amt->lock is only needed to access the event queue(amt->events) if an interface is the gateway mode. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-19mfd: mt6397: Add basic support for MT6331+MT6332 PMICAngeloGioacchino Del Regno
Add support for the MT6331 PMIC with MT6332 Companion PMIC, found in MT6795 Helio X10 smartphone platforms. This combo has support for multiple devices but, for a start, only the following have been implemented: - Regulators (two instances, one in MT6331, one in MT6332) - RTC (MT6331) - Keys (MT6331) - Interrupts (MT6331 also dispatches MT6332's interrupts) There's more to be implemented, especially for MT6332, which will come at a later stage. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220627123954.64299-1-angelogioacchino.delregno@collabora.com
2022-07-19mfd: ipaq-micro: Fix spelling mistake of "receive{d}"Zhang Jiaming
Change 'receieved' to 'received' and 'recieve' to 'receive'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220623073333.5675-1-jiaming@nfschina.com
2022-07-19mfd: tc6393xb: Make disable callback return voidUwe Kleine-König
All implementations return 0, so simplify accordingly. This is a preparation for making platform remove callbacks return void. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220619082655.53728-1-u.kleine-koenig@pengutronix.de
2022-07-19mfd: twl: Remove platform data supportUwe Kleine-König
There is no in-tree machine that provides a struct twl4030_platform_data since commit e92fc4f04a34 ("ARM: OMAP2+: Drop legacy board file for LDP"). So assume dev_get_platdata() returns NULL in twl_probe() and simplify accordingly. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220614152148.252820-1-u.kleine-koenig@pengutronix.de
2022-07-19mfd: cros_ec: Add SCP Core-1 as a new CrOS EC MCUTinghan Shen
MT8195 System Companion Processors(SCP) is a dual-core RISC-V MCU. Add a new CrOS feature ID to represent the SCP's 2nd core. The 1st core is referred to as 'core 0', and the 2nd core is referred to as 'core 1'. Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220601112201.15510-16-tinghan.shen@mediatek.com
2022-07-19mfd: mt6358-irq: Add MT6357 PMIC supportFabien Parent
Add MT6357 PMIC IRQ support. Signed-off-by: Fabien Parent <fparent@baylibre.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220531124959.202787-6-fparent@baylibre.com
2022-07-19mfd: mt6397-core: Add MT6357 PMIC supportFabien Parent
Adds support for PMIC keys, Regulator, and RTC for the MT6357 PMIC. Signed-off-by: Fabien Parent <fparent@baylibre.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220531124959.202787-5-fparent@baylibre.com
2022-07-19mfd: tc6387xb: Drop disable callback that is never calledUwe Kleine-König
The driver never calls the disable callback, so drop the member from the platform struct and all callbacks from the actual platform datas. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220530192430.2108217-4-u.kleine-koenig@pengutronix.de
2022-07-19mfd: t7l66xb: Drop platform disable callbackUwe Kleine-König
None of the in-tree instantiations of struct t7l66xb_platform_data provides a disable callback. So better don't dereference this function pointer unconditionally. As there is no user, drop it completely instead of calling it conditional. This is a preparation for making platform remove callbacks return void. Fixes: 1f192015ca5b ("mfd: driver for the T7L66XB TMIO SoC") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220530192430.2108217-3-u.kleine-koenig@pengutronix.de
2022-07-19mfd: max77714: Update Luca Ceresoli's e-mail addressLuca Ceresoli
My Bootlin address is preferred from now on. Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://lore.kernel.org/r/20220603155727.1232061-4-luca@lucaceresoli.net
2022-07-19Merge branches 'ib-mfd-acpi-for-rafael-5.20', ↵Lee Jones
'ib-mfd-edac-i2c-leds-pinctrl-platform-watchdog-5.20' and 'ib-mfd-soc-bcm-5.20' into ibs-for-mfd-merged
2022-07-19scsi: sd: allow max_sectors be capped at DMA optimal size limitJohn Garry
Streaming DMA mappings may be considerably slower when mappings go through an IOMMU and the total mapping length is somewhat long. This is because the IOMMU IOVA code allocates and free an IOVA for each mapping, which may affect performance. New member Scsi_Host.opt_sectors is added, which is the optimal host max_sectors, and use this value to cap the request queue max_sectors when set. It could be considered to have request queues io_opt value initially set at Scsi_Host.opt_sectors in __scsi_init_queue(), but that is not really the purpose of io_opt. Finally, even though Scsi_Host.opt_sectors value should never be greater than the request queue max_hw_sectors value, continue to limit to this value for safety. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-07-19dt-bindings: gpio: add pull-disable flagNuno Sá
This extends the flags that can be used in GPIO specifiers to indicate that no bias is intended in the pin. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>