summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2022-03-10io_uring: add support for IORING_OP_MSG_RING commandJens Axboe
This adds support for IORING_OP_MSG_RING, which allows an SQE to signal another ring. That allows either waking up someone waiting on the ring, or even passing a 64-bit value via the user_data field in the CQE. sqe->fd must contain the fd of a ring that should receive the CQE. sqe->off will be propagated to the cqe->user_data on the target ring, and sqe->len will be propagated to cqe->res. The results CQE will have IORING_CQE_F_MSG set in its flags, to indicate that this CQE was generated from a messaging request rather than a SQE issued locally on that ring. This effectively allows passing a 64-bit and a 32-bit quantify between the two rings. This request type has the following request specific error cases: - -EBADFD. Set if the sqe->fd doesn't point to a file descriptor that is of the io_uring type. - -EOVERFLOW. Set if we were not able to deliver a request to the target ring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-10io_uring: add support for registering ring file descriptorsJens Axboe
Lots of workloads use multiple threads, in which case the file table is shared between them. This makes getting and putting the ring file descriptor for each io_uring_enter(2) system call more expensive, as it involves an atomic get and put for each call. Similarly to how we allow registering normal file descriptors to avoid this overhead, add support for an io_uring_register(2) API that allows to register the ring fds themselves: 1) IORING_REGISTER_RING_FDS - takes an array of io_uring_rsrc_update structs, and registers them with the task. 2) IORING_UNREGISTER_RING_FDS - takes an array of io_uring_src_update structs, and unregisters them. When a ring fd is registered, it is internally represented by an offset. This offset is returned to the application, and the application then uses this offset and sets IORING_ENTER_REGISTERED_RING for the io_uring_enter(2) system call. This works just like using a registered file descriptor, rather than a real one, in an SQE, where IOSQE_FIXED_FILE gets set to tell io_uring that we're using an internal offset/descriptor rather than a real file descriptor. In initial testing, this provides a nice bump in performance for threaded applications in real world cases where the batch count (eg number of requests submitted per io_uring_enter(2) invocation) is low. In a microbenchmark, submitting NOP requests, we see the following increases in performance: Requests per syscall Baseline Registered Increase ---------------------------------------------------------------- 1 ~7030K ~8080K +15% 2 ~13120K ~14800K +13% 4 ~22740K ~25300K +11% Co-developed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-10io-uring: Make tracepoints consistent.Stefan Roesch
This makes the io-uring tracepoints consistent. Where it makes sense the tracepoints start with the following four fields: - context (ring) - request - user_data - opcode. Signed-off-by: Stefan Roesch <shr@fb.com> Link: https://lore.kernel.org/r/20220214180430.70572-3-shr@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-10io_uring: remove trace for eventfdUsama Arif
The information on whether eventfd is registered is not very useful and would result in the tracepoint being enclosed in an rcu_readlock in a later patch that tries to avoid ring quiesce for registering eventfd. Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Usama Arif <usama.arif@bytedance.com> Link: https://lore.kernel.org/r/20220204145117.1186568-2-usama.arif@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-10Clean ups and preparation for IPC abstraction in the SOF driverMark Brown
Merge series from Ranjani Sridharan <ranjani.sridharan@linux.intel.com>: In preparation for adding support for the new IPC version that has been introduced in the SOF firmware, this patch set includes some clean ups and necessary modifications to commonly used functions that will be re-used across different IPC-specific code.
2022-03-10Merge branch irq/aic-pmu into irq/irqchip-nextMarc Zyngier
* irq/aic-pmu: : . : Prefix branch for the M1 PMU support, adding the required : irqchip changes. Shared with the arm64 tree. : . irqchip/apple-aic: Fix cpumask allocation for FIQs irqchip/apple-aic: Move PMU-specific registers to their own include file arm64: dts: apple: Add t8303 PMU nodes arm64: dts: apple: Add t8103 PMU interrupt affinities irqchip/apple-aic: Wire PMU interrupts irqchip/apple-aic: Parse FIQ affinities from device-tree dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts dt-bindings: arm-pmu: Document Apple PMU compatible strings Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-03-10can: isotp: set default value for N_As to 50 micro secondsOliver Hartkopp
The N_As value describes the time a CAN frame needs on the wire when transmitted by the CAN controller. Even very short CAN FD frames need arround 100 usecs (bitrate 1Mbit/s, data bitrate 8Mbit/s). Having N_As to be zero (the former default) leads to 'no CAN frame separation' when STmin is set to zero by the receiving node. This 'burst mode' should not be enabled by default as it could potentially dump a high number of CAN frames into the netdev queue from the soft hrtimer context. This does not affect the system stability but is just not nice and cooperative. With this N_As/frame_txtime value the 'burst mode' is disabled by default. As user space applications usually do not set the frame_txtime element of struct can_isotp_options the new in-kernel default is very likely overwritten with zero when the sockopt() CAN_ISOTP_OPTS is invoked. To make sure that a N_As value of zero is only set intentional the value '0' is now interpreted as 'do not change the current value'. When a frame_txtime of zero is required for testing purposes this CAN_ISOTP_FRAME_TXTIME_ZERO u32 value has to be set in frame_txtime. Link: https://lore.kernel.org/all/20220309120416.83514-2-socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-03-10dma-mapping: benchmark: extract a common header file for map_benchmark ↵Tian Tao
definition kernel/dma/map_benchmark.c and selftests/dma/dma_map_benchmark.c have duplicate map_benchmark definitions, which tends to lead to inconsistent changes to map_benchmark on both sides, extract a common header file to avoid this problem. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Acked-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-03-09Merge tag 'xsa396-5.17-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Several Linux PV device frontends are using the grant table interfaces for removing access rights of the backends in ways being subject to race conditions, resulting in potential data leaks, data corruption by malicious backends, and denial of service triggered by malicious backends: - blkfront, netfront, scsifront and the gntalloc driver are testing whether a grant reference is still in use. If this is not the case, they assume that a following removal of the granted access will always succeed, which is not true in case the backend has mapped the granted page between those two operations. As a result the backend can keep access to the memory page of the guest no matter how the page will be used after the frontend I/O has finished. The xenbus driver has a similar problem, as it doesn't check the success of removing the granted access of a shared ring buffer. - blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p, kbdfront, and pvcalls are using a functionality to delay freeing a grant reference until it is no longer in use, but the freeing of the related data page is not synchronized with dropping the granted access. As a result the backend can keep access to the memory page even after it has been freed and then re-used for a different purpose. - netfront will fail a BUG_ON() assertion if it fails to revoke access in the rx path. This will result in a Denial of Service (DoS) situation of the guest which can be triggered by the backend" * tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/netfront: react properly to failing gnttab_end_foreign_access_ref() xen/gnttab: fix gnttab_end_foreign_access() without page specified xen/pvcalls: use alloc/free_pages_exact() xen/9p: use alloc/free_pages_exact() xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done() xen: remove gnttab_query_foreign_access() xen/gntalloc: don't use gnttab_query_foreign_access() xen/scsifront: don't use gnttab_query_foreign_access() for mapped status xen/netfront: don't use gnttab_query_foreign_access() for mapped status xen/blkfront: don't use gnttab_query_foreign_access() for mapped status xen/grant-table: add gnttab_try_end_foreign_access() xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
2022-03-09tcp: adjust TSO packet sizes based on min_rttEric Dumazet
Back when tcp_tso_autosize() and TCP pacing were introduced, our focus was really to reduce burst sizes for long distance flows. The simple heuristic of using sk_pacing_rate/1024 has worked well, but can lead to too small packets for hosts in the same rack/cluster, when thousands of flows compete for the bottleneck. Neal Cardwell had the idea of making the TSO burst size a function of both sk_pacing_rate and tcp_min_rtt() Indeed, for local flows, sending bigger bursts is better to reduce cpu costs, as occasional losses can be repaired quite fast. This patch is based on Neal Cardwell implementation done more than two years ago. bbr is adjusting max_pacing_rate based on measured bandwidth, while cubic would over estimate max_pacing_rate. /proc/sys/net/ipv4/tcp_tso_rtt_log can be used to tune or disable this new feature, in logarithmic steps. Tested: 100Gbit NIC, two hosts in the same rack, 4K MTU. 600 flows rate-limited to 20000000 bytes per second. Before patch: (TSO sizes would be limited to 20000000/1024/4096 -> 4 segments per TSO) ~# echo 0 >/proc/sys/net/ipv4/tcp_tso_rtt_log ~# nstat -n;perf stat ./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered" 96005 Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000': 65,945.29 msec task-clock # 2.845 CPUs utilized 1,314,632 context-switches # 19935.279 M/sec 5,292 cpu-migrations # 80.249 M/sec 940,641 page-faults # 14264.023 M/sec 201,117,030,926 cycles # 3049769.216 GHz (83.45%) 17,699,435,405 stalled-cycles-frontend # 8.80% frontend cycles idle (83.48%) 136,584,015,071 stalled-cycles-backend # 67.91% backend cycles idle (83.44%) 53,809,530,436 instructions # 0.27 insn per cycle # 2.54 stalled cycles per insn (83.36%) 9,062,315,523 branches # 137422329.563 M/sec (83.22%) 153,008,621 branch-misses # 1.69% of all branches (83.32%) 23.182970846 seconds time elapsed TcpInSegs 15648792 0.0 TcpOutSegs 58659110 0.0 # Average of 3.7 4K segments per TSO packet TcpExtTCPDelivered 58654791 0.0 TcpExtTCPDeliveredCE 19 0.0 After patch: ~# echo 9 >/proc/sys/net/ipv4/tcp_tso_rtt_log ~# nstat -n;perf stat ./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered" 96046 Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000': 48,982.58 msec task-clock # 2.104 CPUs utilized 186,014 context-switches # 3797.599 M/sec 3,109 cpu-migrations # 63.472 M/sec 941,180 page-faults # 19214.814 M/sec 153,459,763,868 cycles # 3132982.807 GHz (83.56%) 12,069,861,356 stalled-cycles-frontend # 7.87% frontend cycles idle (83.32%) 120,485,917,953 stalled-cycles-backend # 78.51% backend cycles idle (83.24%) 36,803,672,106 instructions # 0.24 insn per cycle # 3.27 stalled cycles per insn (83.18%) 5,947,266,275 branches # 121417383.427 M/sec (83.64%) 87,984,616 branch-misses # 1.48% of all branches (83.43%) 23.281200256 seconds time elapsed TcpInSegs 1434706 0.0 TcpOutSegs 58883378 0.0 # Average of 41 4K segments per TSO packet TcpExtTCPDelivered 58878971 0.0 TcpExtTCPDeliveredCE 9664 0.0 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20220309015757.2532973-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09net/tls: Provide {__,}tls_driver_ctx() unconditionallyDimitris Michailidis
Having the definitions of {__,}tls_driver_ctx() under an #if guard means code referencing them also needs to rely on the preprocessor. The protection doesn't appear needed so make the definitions unconditional. Fixes: db37bc177dae ("net/funeth: add the data path") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09ptp: idt82p33: use rsmu driver to access i2c/spi busMin Li
rsmu (Renesas Synchronization Management Unit ) driver is located in drivers/mfd and responsible for creating multiple devices including idt82p33 phc, which will then use the exposed regmap and mutex handle to access i2c/spi bus. Signed-off-by: Min Li <min.li.xe@renesas.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/1646748651-16811-1-git-send-email-min.li.xe@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09drivers/nvdimm: Add perf interface to expose nvdimm performance statsKajol Jain
A common interface is added to get performance stats reporting support for nvdimm devices. Added interface defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. Interface also added support for pmu register/unregister functions, cpu hotplug feature along with macros for handling events addition via sysfs. It adds attribute groups for format, cpumask and events to the pmu structure. User could use the standard perf tool to access perf events exposed via nvdimm pmu. [Declare pmu functions in nd.h file to resolve implicit-function-declaration warning and make hotplug function static as reported by kernel test robot] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Link: https://lore.kernel.org/all/202202241242.zqzGkguy-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Madhavan Srinivasan <maddy@in.ibm.com> Link: https://lore.kernel.org/r/20220225143024.47947-3-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-09drivers/nvdimm: Add nvdimm pmu structureKajol Jain
A structure is added called nvdimm_pmu, for performance stats reporting support of nvdimm devices. It can be used to add device pmu data such as pmu data structure for performance stats, nvdimm device pointer along with cpumask attributes. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@in.ibm.com> Link: https://lore.kernel.org/r/20220225143024.47947-2-kjain@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-10Merge tag 'amd-drm-next-5.18-2022-03-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.18-2022-03-09: amdgpu: - Misc code cleanups - Misc display fixes - PSR display fixes - More RAS cleanup - Hotplug fix - Bump minor version for hotplug tests - SR-IOV fixes - GC 10.3.7 updates - Remove some firmwares which are no longer used - Mode2 reset refactor - Aldebaran fixes - Add VCN fwlog feature for VCN debugging - CS code cleanup - Fix clang warning - Fix CS clean up rebase breakage amdkfd: - SVM fixes - SMI event fixes and cleanups - vmid_pasid mapping fix for gfx10.3 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220309224439.2178877-1-alexander.deucher@amd.com
2022-03-10Merge tag 'drm-msm-next-2022-03-08' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next Follow-up pull req for v5.18 to pull in some important fixes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvwHFHEd+9df-0aBOCfmw+ULvTS3f18sJuq_cvGKLDSjw@mail.gmail.com
2022-03-09bpf: Add "live packet" mode for XDP in BPF_PROG_RUNToke Høiland-Jørgensen
This adds support for running XDP programs through BPF_PROG_RUN in a mode that enables live packet processing of the resulting frames. Previous uses of BPF_PROG_RUN for XDP returned the XDP program return code and the modified packet data to userspace, which is useful for unit testing of XDP programs. The existing BPF_PROG_RUN for XDP allows userspace to set the ingress ifindex and RXQ number as part of the context object being passed to the kernel. This patch reuses that code, but adds a new mode with different semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES flag. When running BPF_PROG_RUN in this mode, the XDP program return codes will be honoured: returning XDP_PASS will result in the frame being injected into the networking stack as if it came from the selected networking interface, while returning XDP_TX and XDP_REDIRECT will result in the frame being transmitted out that interface. XDP_TX is translated into an XDP_REDIRECT operation to the same interface, since the real XDP_TX action is only possible from within the network drivers themselves, not from the process context where BPF_PROG_RUN is executed. Internally, this new mode of operation creates a page pool instance while setting up the test run, and feeds pages from that into the XDP program. The setup cost of this is amortised over the number of repetitions specified by userspace. To support the performance testing use case, we further optimise the setup step so that all pages in the pool are pre-initialised with the packet data, and pre-computed context and xdp_frame objects stored at the start of each page. This makes it possible to entirely avoid touching the page content on each XDP program invocation, and enables sending up to 9 Mpps/core on my test box. Because the data pages are recycled by the page pool, and the test runner doesn't re-initialise them for each run, subsequent invocations of the XDP program will see the packet data in the state it was after the last time it ran on that particular page. This means that an XDP program that modifies the packet before redirecting it has to be careful about which assumptions it makes about the packet content, but that is only an issue for the most naively written programs. Enabling the new flag is only allowed when not setting ctx_out and data_out in the test specification, since using it means frames will be redirected somewhere else, so they can't be returned. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com
2022-03-09net/mlx5: DR, Add support for ConnectX-7 steeringYevgeny Kliteynik
Add support for a new SW format version that is implemented by ConnectX-7. Except for several differences, the STEv2 is identical to STEv1, so for most callbacks the STEv2 context struct will call STEv1 functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: DR, Add support for matching on Internet Header Length (IHL)Yevgeny Kliteynik
Add support for matching on new field - Internet Header Length (IHL). Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add debugfs counters for page commands failuresMoshe Shemesh
Add the following new debugfs counters for debug and verbosity: fw_pages_alloc_failed - number of pages FW requested but driver failed to allocate. give_pages_dropped - number of pages given to FW, but command give pages failed by FW. reclaim_pages_discard - number of pages which were about to reclaim back and FW failed the command. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add pages debugfsMoshe Shemesh
Add pages debugfs to expose the following counters for debuggability: fw_pages_total - How many pages were given to FW and not returned yet. vfs_pages - For SRIOV, how many pages were given to FW for virtual functions usage. host_pf_pages - For ECPF, how many pages were given to FW for external hosts physical functions usage. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Move debugfs entries to separate structMoshe Shemesh
Move the debugfs entry pointers under priv to their own struct. Add get function for device debugfs root. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Change release_all_pages cap bit locationMoshe Shemesh
mlx5 FW has changed release_all_pages cap bit by one bit offset to reflect a fix in the FW flow for release_all_pages. The driver should use the new bit to ensure it calls release_all_pages only if the FW fix is there. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Add command failures data to debugfsMoshe Shemesh
Add new counters to command interface debugfs to count command failures. The following counters added: total_failed - number of times command failed (any kind of failure). failed_mbox_status - number of times command failed on bad status returned by FW. In addition, add data about last command failure to command interface debugfs: last_failed_errno - last command failed returned errno. last_failed_mbox_status - last bad status returned by FW. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5e: SHAMPO, reduce TIR indicationBen Ben-Ishay
SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the first place to enforce suitability between connected TIR and RQ, this enforcement does not exist in current the Firmware implementation and was redundant in the first place. Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload") Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09net/mlx5: Fix size field in bufferx_reg structMohammad Kabat
According to HW spec the field "size" should be 16 bits in bufferx register. Fixes: e281682bf294 ("net/mlx5_core: HW data structs/types definitions cleanup") Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-03-09arm64/mte: Remove asymmetric mode from the prctl() interfaceMark Brown
As pointed out by Evgenii Stepanov one potential issue with the new ABI for enabling asymmetric is that if there are multiple places where MTE is configured in a process, some of which were compiled with the old prctl.h and some of which were compiled with the new prctl.h, there may be problems keeping track of which MTE modes are requested. For example some code may disable only sync and async modes leaving asymmetric mode enabled when it intended to fully disable MTE. In order to avoid such mishaps remove asymmetric mode from the prctl(), instead implicitly allowing it if both sync and async modes are requested. This should not disrupt userspace since a process requesting both may already see a mix of sync and async modes due to differing defaults between CPUs or changes in default while the process is running but it does mean that userspace is unable to explicitly request asymmetric mode without changing the system default for CPUs. Reported-by: Evgenii Stepanov <eugenis@google.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Evgenii Stepanov <eugenis@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Branislav Rankov <branislav.rankov@arm.com> Link: https://lore.kernel.org/r/20220309131200.112637-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-03-09block: add ->poll_bio to block_device_operationsMing Lei
Prepare for supporting IO polling for bio-based driver. Add ->poll_bio callback so that bio-based driver can provide their own logic for polling bio. Also fix ->submit_bio_bio typo in comment block above __submit_bio_noacct. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-03-09net: tcp: fix shim definition of tcp_inbound_md5_hashVladimir Oltean
When CONFIG_TCP_MD5SIG isn't enabled, there is a compilation bug due to the fact that the static inline definition of tcp_inbound_md5_hash() has an unexpected semicolon. Remove it. Fixes: 1330b6ef3313 ("skb: make drop reason booleanable") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220309122012.668986-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09dt-bindings: clock: add QCOM SM6125 display clock bindingsMartin Botka
Add device tree bindings for display clock controller for Qualcomm Technology Inc's SM6125 SoC. Signed-off-by: Martin Botka <martin.botka@somainline.org> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220303131812.302302-3-marijn.suijten@somainline.org
2022-03-09clk: qcom: gcc: Add emac GDSC support for SM8150Bhupesh Sharma
Add the EMAC GDSC defines and driver structures for SM8150. Cc: Stephen Boyd <sboyd@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220303084824.284946-4-bhupesh.sharma@linaro.org
2022-03-09clk: qcom: gcc: Add UFS_CARD and UFS_PHY GDSCs for SM8150Bhupesh Sharma
Add the UFS_CARD and UFS_PHY GDSC defines & driver structures for SM8150. Cc: Stephen Boyd <sboyd@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220303082140.240745-2-bhupesh.sharma@linaro.org
2022-03-09clk: qcom: gcc: Add PCIe0 and PCIe1 GDSC for SM8150Bhupesh Sharma
Add the PCIe0 and PCIe1 GDSC defines & driver structures for SM8150. Cc: Stephen Boyd <sboyd@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220302203045.184500-4-bhupesh.sharma@linaro.org
2022-03-09clk: qcom: smd: Add missing RPM clocks for msm8992/4Konrad Dybcio
XO and MSS_CFG were omitted when first adding the clocks for these SoCs. Add them, and while at it, move the XO clock to the top of the definition list, as ideally everyone should start using it sooner or later.. Fixes: b4297844995f ("clk: qcom: smd: Add support for MSM8992/4 rpm clocks") Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20220226214126.21209-2-konrad.dybcio@somainline.org
2022-03-09Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2022-03-09 1) Fix IPv6 PMTU discovery for xfrm interfaces. From Lina Wang. 2) Revert failing for policies and states that are configured with XFRMA_IF_ID 0. It broke a user configuration. From Kai Lueke. 3) Fix a possible buffer overflow in the ESP output path. 4) Fix ESP GSO for tunnel and BEET mode on inter address family tunnels. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09regulator: Add bindings for Richtek RT5190A PMICChiYuan Huang
Add bindings for Richtek RT5190A PMIC. Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/1646812903-32496-2-git-send-email-u0084500@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ASoC: Intel: soc-acpi: quirk topology filename dynamicallyPierre-Louis Bossart
Different topology filenames may be required depending on which SSP is used, and whether or not digital mics are present. This patch adds a tplg_quirk_mask and in the case of the SOF driver adds the relevant configurations. This is a short-term solution to the ES8336 support issues. In a long-term solution, we would need an interface where the machine driver or platform driver have the ability to alter the topology hard-coded low-level hardware support, e.g. by substituting an interface for another, or disabling an interface that is not supported on a given skew. BugLink: https://github.com/thesofproject/linux/issues/3248 Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20220308192610.392950-7-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ALSA: intel-nhlt: add helper to detect SSP link maskPierre-Louis Bossart
The NHLT information can be used to figure out which SSPs are enabled in a platform. The 'SSP' link type is too broad for machine drivers, since it can cover the Bluetooth sideband and the analog audio codec connections, so this helper exposes a parameter to filter with the device type (DEVICE_I2S refers to analog audio codec in NHLT parlance). The helper returns a mask, since more than one SSP may be used for analog audio, e.g. the NHLT spec describes the use of SSP0 for amplifiers and SSP1 for headset codec. Note that if more than one bit is set, it's impossible to determine which SSP is connected to what external component. Additional platform-specific information based on e.g. DMI quirks would still be required in the machine driver to configure the relevant dailinks. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Acked-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20220308192610.392950-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ASoC: soc-acpi: add information on I2S/TDM link maskPierre-Louis Bossart
The platform driver may have information on which I2S/TDM link(s) to enable in the machine driver. In the case of Intel devices, this may be extracted from NHLT tables in platform firmware. This link information is necessary to make sure machine driver and topology are aligned. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20220308192610.392950-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ASoC: soc-acpi: fix kernel-doc descriptorPierre-Louis Bossart
Add missing dmic_num mention and clarify that 'links' mean 'SoundWire links', not to be used for other links. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20220308192610.392950-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ASoC: SOF: make struct snd_sof_dai IPC agnosticRanjani Sridharan
Remove the comp_dai and dai_config members of struct snd_sof_dai and replace it with a void *private field. Introduce a new struct sof_dai_private_data that will contain the pointer to these two fields. The topology parser will populate this structure and save it as part of the "private" member in snd_sof_dai. Change all users of these fields to use the private member instead. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20220308164344.577647-18-ranjani.sridharan@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09ASoC: SOF: make struct snd_sof_widget IPC agnosticRanjani Sridharan
Parse the UUID token and save it in the new uuid field in struct snd_sof_widget. struct sof_ipc_comp_ext is no longer needed. So remove it too. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20220308164344.577647-12-ranjani.sridharan@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-03-09kasan: fix a missing header include of static_keys.hJoey Gouly
The kasan-enabled.h header relies on static keys, so make sure to include the header to avoid compilation errors (with JUMP_LABEL=n). It fixes the following: ./include/linux/kasan-enabled.h:9:1: warning: data definition has no type or storage class 9 | DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled); | ^~~~~~~~~~~~~~~~~~~~~~~~ error: type defaults to 'int' in declaration of 'DECLARE_STATIC_KEY_FALSE' [-Werror=implicit-int] Fixes: f9b5e46f4097eb29 ("kasan: split kasan_*enabled() functions into a separate header") Cc: Peter Collingbourne <pcc@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20220301154518.19456-1-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-03-09skb: make drop reason booleanableJakub Kicinski
We have a number of cases where function returns drop/no drop decision as a boolean. Now that we want to report the reason code as well we have to pass extra output arguments. We can make the reason code evaluate correctly as bool. I believe we're good to reorder the reasons as they are reported to user space as strings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09net: dsa: felix: avoid early deletion of host FDB entriesVladimir Oltean
The Felix driver declares FDB isolation but puts all standalone ports in VID 0. This is mostly problem-free as discussed with Alvin here: https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870 however there is one catch. DSA still thinks that FDB entries are installed on the CPU port as many times as there are user ports, and this is problematic when multiple user ports share the same MAC address. Consider the default case where all user ports inherit their MAC address from the DSA master, and then the user runs: ip link set swp0 address 00:01:02:03:04:05 The above will make dsa_slave_set_mac_address() call dsa_port_standalone_host_fdb_add() for 00:01:02:03:04:05 in port 0's standalone database, and dsa_port_standalone_host_fdb_del() for the old address of swp0, again in swp0's standalone database. Both the ->port_fdb_add() and ->port_fdb_del() will be propagated down to the felix driver, which will end up deleting the old MAC address from the CPU port. But this is still in use by other user ports, so we end up breaking unicast termination for them. There isn't a problem in the fact that DSA keeps track of host standalone addresses in the individual database of each user port: some drivers like sja1105 need this. There also isn't a problem in the fact that some drivers choose the same VID/FID for all standalone ports. It is just that the deletion of these host addresses must be delayed until they are known to not be in use any longer, and only the driver has this knowledge. Since DSA keeps these addresses in &cpu_dp->fdbs and &cpu_db->mdbs, it is just a matter of walking over those lists and see whether the same MAC address is present on the CPU port in the port db of another user port. I have considered reusing the generic dsa_port_walk_fdbs() and dsa_port_walk_mdbs() schemes for this, but locking makes it difficult. In the ->port_fdb_add() method and co, &dp->addr_lists_lock is held, but dsa_port_walk_fdbs() also acquires that lock. Also, even assuming that we introduce an unlocked variant of the address iterator, we'd still need some relatively complex data structures, and a void *ctx in the dsa_fdb_walk_cb_t which we don't currently pass, such that drivers are able to figure out, after iterating, whether the same MAC address is or isn't present in the port db of another port. All the above, plus the fact that I expect other drivers to follow the same model as felix where all standalone ports use the same FID, made me conclude that a generic method provided by DSA is necessary: dsa_fdb_present_in_other_db() and the mdb equivalent. Felix calls this from the ->port_fdb_del() handler for the CPU port, when the database was classified to either a port db, or a LAG db. For symmetry, we also call this from ->port_fdb_add(), because if the address was installed once, then installing it a second time serves no purpose: it's already in hardware in VID 0 and it affects all standalone ports. This change moves dsa_db_equal() from switch.c to dsa.c, since it now has one more caller. Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-08mptcp: introduce implicit endpointsPaolo Abeni
In some edge scenarios, an MPTCP subflows can use a local address mapped by a "implicit" endpoint created by the in-kernel path manager. Such endpoints presence can be confusing, as it's creation is hard to track and will prevent the later endpoint creation from the user-space using the same address. Define a new endpoint flag to mark implicit endpoints and allow the user-space to replace implicit them with user-provided data at endpoint creation time. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08mptcp: add tracepoint in mptcp_sendmsg_fragGeliang Tang
The tracepoint in get_mapping_status() only dumped the incoming mpext fields. This patch added a new tracepoint in mptcp_sendmsg_frag() to dump the outgoing mpext too. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08scsi: don't use disk->private_data to find the scsi_driverChristoph Hellwig
Requiring every ULP to have the scsi_drive as first member of the private data is rather fragile and not necessary anyway. Just use the driver hanging off the SCSI device instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220308055200.735835-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08blk-mq: manage hctx map via xarrayMing Lei
First code becomes more clean by switching to xarray from plain array. Second use-after-free on q->queue_hw_ctx can be fixed because queue_for_each_hw_ctx() may be run when updating nr_hw_queues is in-progress. With this patch, q->hctx_table is defined as xarray, and this structure will share same lifetime with request queue, so queue_for_each_hw_ctx() can use q->hctx_table to lookup hctx reliably. Reported-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220308073219.91173-7-ming.lei@redhat.com [axboe: fix blk_mq_hw_ctx forward declaration] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08fs: remove fs.f_write_hintChristoph Hellwig
The value is now completely unused except for reporting it back through the F_GET_FILE_RW_HINT ioctl, so remove the value and the two ioctls for it. Trying to use the F_SET_FILE_RW_HINT and F_GET_FILE_RW_HINT fcntls will now return EINVAL, just like it would on a kernel that never supported this functionality in the first place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220308060529.736277-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>