path: root/include
Age / Commit message / Author
2023-01-31rxrpc: De-atomic call->ackr_window and call->ackr_nr_unackedDavid Howells
call->ackr_window doesn't need to be atomic as ACK generation and ACK transmission are now done in the same thread, so drop the atomic64 handling and split it into two separate members. Similarly, call->ackr_nr_unacked doesn't need to be atomic now either. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
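The idea behind this change can be pictured with a small C sketch. It assumes, purely for illustration, that the old 64-bit atomic carried two 32-bit sequence numbers in its low and high halves; the names `sketch_call`, `window_base`, `window_top` and all helper functions are hypothetical and are not taken from the rxrpc sources.

```c
#include <assert.h>
#include <stdint.h>

/* "Before" (assumed layout): two 32-bit values share one 64-bit word so
 * that a single atomic load or store always sees a consistent pair. */
static inline uint64_t sketch_pack_window(uint32_t base, uint32_t top)
{
	return ((uint64_t)top << 32) | base;
}

static inline uint32_t sketch_packed_base(uint64_t packed)
{
	return (uint32_t)packed;
}

static inline uint32_t sketch_packed_top(uint64_t packed)
{
	return (uint32_t)(packed >> 32);
}

/* "After": plain members, which is only safe because a single thread
 * both generates and transmits ACKs. */
struct sketch_call {
	uint32_t window_base;
	uint32_t window_top;
	unsigned int nr_unacked;
};

static inline void sketch_set_window(struct sketch_call *call,
				     uint32_t base, uint32_t top)
{
	assert(call);
	call->window_base = base;
	call->window_top = top;
}
```

With only one thread touching the fields, the packing and unpacking helpers are unnecessary and the two members can be read and written directly.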
2023-01-31rxrpc: Generate extra pings for RTT during heavy-receive callDavid Howells
When doing a call that has a single transmitted data packet and a massive amount of received data packets, we only ping for one RTT sample, which means we don't get a good reading on it. Fix this by converting occasional IDLE ACKs into PING ACKs to elicit a response. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-31rxrpc: Shrink the tabulation in the rxrpc trace header a bitDavid Howells
Shrink the tabulation in the rxrpc trace header a bit to allow for fields with long type names that have been removed. Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31rxrpc: Remove whitespace before ')' in trace headerDavid Howells
Work around checkpatch warnings in the rxrpc trace header by removing whitespace before ')' on lines defining the trace record struct. Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31block: Remove mm.h from bvec.hMatthew Wilcox
This was originally added for the definition of nth_page(), but we no longer use nth_page() in this header, so we can drop the heavyweight mm.h now. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230131050132.2627124-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-31kunit: fix kunit_test_init_section_suites(...)Brendan Higgins
Looks like kunit_test_init_section_suites(...) was messed up in a merge conflict. This fixes it. kunit_test_init_section_suites(...) was not updated to avoid the extra level of indirection when .kunit_test_suites was flattened. Given no-one was actively using it, this went unnoticed for a long period of time. Fixes: e5857d396f35 ("kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites") Signed-off-by: Brendan Higgins <brendan.higgins@linux.dev> Signed-off-by: David Gow <davidgow@google.com> Tested-by: Martin Fernandez <martin.fernandez@eclypsium.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-01-31Merge branch '20230112204446.30236-2-quic_molvera@quicinc.com' into arm64-for-6.3Bjorn Andersson
Merge DT binding in order to get GCC clock defines.
2023-01-31Merge branch 'icc-qdu1000-immutable' of https://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into HEADBjorn Andersson
Merge DT binding to gain interconnect defines.
2023-01-31sched/clock: Make local_clock() noinstrPeter Zijlstra
With sched_clock() noinstr, provide a noinstr implementation of local_clock(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.760767043@infradead.org
2023-01-31sched/clock/x86: Mark sched_clock() noinstrPeter Zijlstra
In order to use sched_clock() from noinstr code, mark it and all its implementations noinstr. The whole pvclock thing (used by KVM/Xen) is a bit of a pain, since it calls out to watchdogs; create a pvclock_clocksource_read_nowd() variant that doesn't do that and can be noinstr. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.702003578@infradead.org
2023-01-31cpuidle: tracing: Warn about !rcu_is_watching()Peter Zijlstra
When using noinstr, WARN when tracing hits when RCU is disabled. Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.466670589@infradead.org
2023-01-31cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUGPeter Zijlstra
In order to avoid WARN/BUG from generating nested or even recursive warnings, force rcu_is_watching() true during WARN/lockdep_rcu_suspicious(). Notably, things like unwinding the stack can trigger rcu_dereference() warnings, which then trigger more unwinding, which then triggers more warnings, and so on. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230126151323.408156109@infradead.org
2023-01-31Merge tag 'v6.2-rc6' into sched/core, to pick up fixesIngo Molnar
Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31vdso/bits.h: Add BIT_ULL() for the sake of consistencyAndy Shevchenko
The minimization done in 3945ff37d2f4 ("linux/bits.h: Extract common header for vDSO") was required to isolate the VDSO build from the larger kernel header impact. The split added some inconsistency, since BIT() and BIT_ULL() are now defined in different files, which confuses the unprepared reader. Move BIT_ULL() to vdso/bits.h. No functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221128141003.77929-1-andriy.shevchenko@linux.intel.com
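As a reminder of why the two macros belong together, here is a minimal sketch of a `BIT()`/`BIT_ULL()` pair. The `SKETCH_` names are stand-ins chosen for this example, not the kernel's definitions; the point is only that the `unsigned long` variant can be as narrow as 32 bits on some targets, while the `unsigned long long` variant always covers bit 63.

```c
#include <limits.h>

/* Illustrative stand-ins for BIT() and BIT_ULL(); not the kernel macros. */
#define SKETCH_BIT(nr)     (1UL << (nr))
#define SKETCH_BIT_ULL(nr) (1ULL << (nr))

/* unsigned long long is at least 64 bits wide, so bits 0..63 always fit. */
_Static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
	       "unsigned long long must hold 64 bits");

/* A flag above bit 31 has to use the ULL variant to be portable. */
static inline unsigned long long sketch_high_flag(void)
{
	return SKETCH_BIT_ULL(40);
}
```

Keeping both definitions in one header means a reader looking up one macro finds the other next to it.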
2023-01-31cc2520: move to gpio descriptorsArnd Bergmann
cc2520 supports both probing from static platform_data and from devicetree, but there have never been any definitions of the platform data in the mainline kernel, so it's safe to assume that only the DT path is used. After folding cc2520_platform_data into the driver itself, the GPIO handling can be simplified by moving to the modern gpiod interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230126161658.2983292-1-arnd@kernel.org Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2023-01-30net: mscc: ocelot: expose vsc7514_regmap definitionColin Foster
The VSC7514 target regmap is identical for ones shared with similar hardware, specifically the VSC7512. Share this resource, and change the name to match the pattern of other exported resources. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30net: mscc: ocelot: expose ocelot_reset routineColin Foster
Resetting the switch core is the same whether it is done internally or externally. Move this routine to the ocelot library so it can be used by other drivers. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30net: mscc: ocelot: expose vcap_props structureColin Foster
The vcap_props structure is common to other devices, specifically the VSC7512 chip that can only be controlled externally. Export this structure so it doesn't need to be recreated. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30net: mscc: ocelot: expose regfield definition to be used by other driversColin Foster
The ocelot_regfields struct is common between several different chips, some of which can only be controlled externally. Export this structure so it doesn't have to be duplicated in these other drivers. Rename the structure as well, to follow the conventions of other shared resources. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30net: mscc: ocelot: expose ocelot wm functionsColin Foster
Expose ocelot_wm functions so they can be shared with other drivers. Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-30net/mlx5: Prepare for fast crypto key update if hardware supports itJianbo Liu
Add CAP for crypto offload, do the simple initialization if hardware supports it. Currently set log_dek_obj_range to 12, so 4k DEKs will be created in one bulk allocation. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
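The relationship between the log value and the bulk size is a plain power of two. A tiny sketch, with a hypothetical helper name that does not exist in the mlx5 driver:

```c
#include <assert.h>

/* Number of objects created by one bulk allocation for a given
 * log2 range; sketch_bulk_size() is a made-up name for this example. */
static inline unsigned int sketch_bulk_size(unsigned int log_obj_range)
{
	assert(log_obj_range < 32); /* keep the shift well defined */
	return 1u << log_obj_range;
}
```

With `log_dek_obj_range` set to 12 this gives 2^12 = 4096 objects, the "4k DEKs" mentioned in the commit message.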
2023-01-30net/mlx5: Change key type to key purposeJianbo Liu
Change the naming of key type in DEK fields and macros, to be consistent with the device spec. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-30net/mlx5: Add IFC bits and enums for crypto keyJianbo Liu
Add and extend structure layouts and defines for fast crypto key update. This is a prerequisite to support bulk creation, key modification and destruction, software wrapped DEK, and SYNC_CRYPTO command. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-30net/mlx5: Add IFC bits for general obj create paramJianbo Liu
Before this patch, the log_obj_range was defined inside general_obj_in_cmd_hdr to support bulk allocation. However, we need to modify/query one of the objects in the bulk in a later patch, so change those fields to param bits for parameters specific to the cmd header, and add general_obj_create_param according to what was updated in the spec. We will also add general_obj_query_param for modify/query later. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-01-30Merge tag 'amlogic-arm64-dt-for-v6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into arm/dtArnd Bergmann
Amlogic ARM64 DT changes for v6.3:
- Merge of immutable bindings branch with Reset & power domain binding
- New boards:
  - Odroid-N2L (Smaller version of Odroid-N2+)
  - BananaPi M2-Pro (Variant of BPI-M5 with on-board wifi)
  - Radxa Zero2 (New version of Radxa Zero with A311D SoC)
- Add DT node for the VIPNano-QI on the A311D
- DT bindings fixups covering all SoC families
  - MAC address nodes
  - ethernet PHY node name
  - scpi & child node names
  - SD/SDIO node name
  - USB supply name
  - invalid clock-names
  - rng node name
  - rtc node name
  - ETH phy mux node name
  - button & adc keys node name
  - leds node names
  - RK818 pmic properties
- remove CPU opps below 1GHz for G12A boards, like it was done for G12B/SM1
- Fix WiFi/Bt definition around P212 & Khadas VIM1
- Add audio node to P212
- Fix FAN trip definition to Odroid-HC4
- Fix gpio-fan gpios definition
- Permit Radxa Zero OTG on USB1
- Fix VDDIO_C enable gpio by using OPEN DRAIN flag
* tag 'amlogic-arm64-dt-for-v6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: (43 commits)
  arm64: dts: meson: add support for Radxa Zero2
  dt-bindings: arm: amlogic: add support for Radxa Zero2
  arm64: dts: meson: add support for BananaPi M2-Pro
  dt-bindings: arm: amlogic: add support for BananaPi M2-Pro
  arm64: dts: meson: bananapi-m5: convert dts to dtsi
  arm64: dts: meson: bananapi-m5: remove redundant status from sound node
  arm64: dts: meson: bananapi-m5: switch VDDIO_C pin to OPEN_DRAIN
  arm64: dts: meson: radxa-zero: allow usb otg mode
  arm64: dts: meson-gxm-khadas-vim2: use gpio-fan matrix instead of an array
  arm64: dts: meson-g12b-odroid: Add initial support for Hardkernel ODROID-N2L
  arm64: dts: meson-g12b: move common node into new odroid.dtsi
  dt-bindings: arm: amlogic: document Odroid-N2L
  arm64: dts: amlogic: meson-sm1-odroid-hc4: fix active fan thermal trip
  arm64: dts: meson: add audio playback to S905X-P212 dts
  arm64: dts: meson: remove WiFi/BT nodes from Khadas VIM1
  arm64: dts: meson: move pwm_ef node in P212 dtsi
  arm64: dts: meson: add Broadcom WiFi to P212 dtsi
  arm64: dts: amlogic: meson-g12b-odroid-go-ultra: fix rk818 pmic properties
  arm64: dts: amlogic: meson-gxbb-kii-pro: fix led node name
  arm64: dts: amlogic: meson-gxl-s905d-phicomm-n1: fix led node name
  ...
Link: https://lore.kernel.org/r/c1641ffd-71c9-9ac9-89d9-c22da4acea10@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-30kunit: fix bug in KUNIT_EXPECT_MEMEQRae Moar
In KUNIT_EXPECT_MEMEQ and KUNIT_EXPECT_MEMNEQ, add check if one of the inputs is NULL and fail if this is the case. Currently, the kernel crashes if one of the inputs is NULL. Instead, fail the test and add an appropriate error message. Fixes: b8a926bea8b1 ("kunit: Introduce KUNIT_EXPECT_MEMEQ and KUNIT_EXPECT_MEMNEQ macros") This was found by the kernel test robot: https://lore.kernel.org/all/202212191448.D6EDPdOh-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rae Moar <rmoar@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
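The shape of the fix, checking for NULL before comparing, can be sketched in ordinary user-space C. This is not the KUnit macro itself: `sketch_expect_memeq()` and the `sketch_result` enum are hypothetical names used only to show the order of the checks.

```c
#include <stddef.h>
#include <string.h>

enum sketch_result {
	SKETCH_PASS,
	SKETCH_FAIL_MISMATCH,
	SKETCH_FAIL_NULL,
};

/* Compare two buffers, but report a failure instead of dereferencing
 * a NULL pointer when either input is missing. */
static enum sketch_result sketch_expect_memeq(const void *left,
					      const void *right,
					      size_t size)
{
	if (!left || !right)
		return SKETCH_FAIL_NULL;

	return memcmp(left, right, size) == 0 ? SKETCH_PASS
					      : SKETCH_FAIL_MISMATCH;
}
```

Returning a distinct result for the NULL case is what lets the caller print an appropriate error message rather than crash.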
2023-01-30Merge tag 'qcom-arm64-for-6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dtArnd Bergmann
Qualcomm ARM64 Devicetree updates for v6.3
This introduces support for the new Snapdragon 8 Gen 2 (SM8550) platform. In addition to adding support for the MTP on this platform, support for the following devices is introduced:
- GPLUS FL8005A
- Google Zombie with LTE and NVMe
- Google Zombie with NVMe
- Lenovo Tab P11
- Motorola G5 Plus
- Motorola G7 Power
- Motorola Moto G6
- Samsung Galaxy J5 (2016)
- Samsung Galaxy Tab A 8.0
- Samsung Galaxy Tab A 9.7
- Xiaomi Mi A1
- Xiaomi Mi A2 Lite
- Xiaomi Redmi 5 Plus
- Xiaomi Redmi Note 4X
On IPQ8074 the PCIe PHY register regions and PHY clock names are corrected.
On MSM8916 DMA for the I2C controllers is introduced and blsp_dma is unconditionally enabled. Per-sensor calibration data is provided for the thermal sensor (tsens) block. The GPLUS FL8005A device is introduced and gains support for touchscreen and flash LED. An additional Samsung Galaxy J5 variant is added, and support is added for hall sensor and MUIC.
Per-sensor calibration information is introduced for the thermal sensor on MSM8956 as well.
On MSM8996, GPLL0 is added as a possible Kryo clock controller input, and a carveout is added to get modem metadata out of System RAM. Missing bus clocks are added for agnoc2. SDHCI1 is enabled on the Sony Xperia Tone platform and USB is limited to high-speed, to make USB work.
MSM8998 gains the same modem carveout as other platforms, and the description of the clock hierarchy is improved.
On QCS404 the clock hierarchy description is improved, the CDSP PAS node is adjusted to match the binding and the thermal sensor (tsens) gains per-sensor calibration information.
On SC7180 the Data Capture and Compare block is introduced, and a carveout for the modem metadata is introduced, to get this out of System RAM. Pazquel360 gains touchscreen support, the regulator off-on-time is adjusted for the Trogdor eDP and touchscreen. Data lane and frequency properties are introduced for the DisplayPort links.
SC7280 also gets Data Capture and Compare support, as well as the dedicated modem metadata region. Herobrine gains DP audio support. IPA description is updated so that it's only active on boards with a modem.
On SC8280XP the display subsystem is introduced, currently with support for most of the DisplayPort controllers. GPR, SoundWire and LPASS are introduced, for audio support. Missing I2C and SPI controllers are introduced. Support for EDP is introduced for the CRD, the Lenovo ThinkPad X13s and the SA8295P ADP automotive board. The SA8540P Ride platform enables one i2c and pcie controllers. A CMA region is defined for the CRD and X13s, to avoid allocation issues from the NVMe support.
Fairphone FP3 gains NFC support and the Sony Xperia Nile platform gains a description of simplefb.
SDM670 gains QFPROM definition.
SDM845 gains a carveout for the modem metadata and support for the Data Capture and Compare block is introduced. Lenovo Yoga C630 firmware paths are aligned with all other Qualcomm platforms.
On SM6125 apss SMMU is introduced and streams are defined for USB and SDHCI controllers. GPI DMA description is introduced, as well as missing SPI and I2C serial engines. On Sony Xperia 10 IIa regulator definitions are improved, SDHCI2 is introduced, and I2C and related GPI DMA blocks are enabled.
On SM6350 IPA is introduced. DDR and L3 scaling is introduced based on CPUfreq. Fairphone FP4, on SM7225, also has IPA enabled, and the Flash LED is enabled as well.
On SM8150 the display subsystem is introduced, with clock controller, DPU and two DSI controllers. The Data Capture and Compare block is introduced. For the Sony Xperia Kumano platform, GPIO keys and NFC support is introduced.
For SM8350 PCIe is introduced, as is the display subsystem with display clock controller, DPU and two DSI controllers. #interconnect-cells is changed to 2, to align with other platforms and allow for active-only votes. The display is enabled and the LT9611uxc found on the SM8350 Hardware Development Kit board is described, to provide HDMI output.
On SM8450 the display subsystem is introduced, with DPU and two DSI controllers. GIC-ITS support is introduced for both PCIe0 and PCIe1. SPMI bus support is introduced and pmics are wired up across the various devices. The display subsystem is enabled and the LT9611uxc is described to provide HDMI output on the SM8450 Hardware Development Kit. On the Sony Xperia Nagara platform, GPIO keys and GPIO line names are introduced. The SLG51000 PMIC and camera regulators are defined as well.
Support for SM8550 is introduced, with support for storage, USB, remoteprocs, PCIe, low-speed buses, crypto and display subsystem. These blocks are enabled on the MTP.
Lastly, the work continues to align Devicetree source with bindings across all platforms.
* tag 'qcom-arm64-for-6.3' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (320 commits)
  arm64: dts: qcom: sc7280: Add a carveout for modem metadata
  arm64: dts: qcom: sc7180: Add a carveout for modem metadata
  arm64: dts: qcom: sdm845: Add a carveout for modem metadata
  arm64: dts: qcom: msm8998: Add a carveout for modem metadata
  arm64: dts: qcom: msm8996: Add a carveout for modem metadata
  arm64: dts: qcom: ipq8074: correct PCIe QMP PHY output clock names
  arm64: dts: qcom: ipq8074: fix Gen3 PCIe node
  arm64: dts: qcom: ipq8074: set Gen2 PCIe pcie max-link-speed
  arm64: dts: qcom: ipq8074: correct Gen2 PCIe ranges
  arm64: dts: qcom: ipq8074: fix Gen3 PCIe QMP PHY
  arm64: dts: qcom: ipq8074: fix Gen2 PCIe QMP PHY
  arm64: dts: qcom: sdm845-db845c: drop label from I2C controllers
  arm64: dts: qcom: msm8996: support using GPLL0 as kryocc input
  arm64: dts: qcom: sm8450: Allow both GIC-ITS and internal MSI controller
  arm64: dts: qcom: sm8550-mtp: Add USB PHYs and HC nodes
  arm64: dts: qcom: sm8550: Add USB PHYs and controller nodes
  arm64: dts: qcom: sm8250: drop unused properties from tx-macro
  arm64: dts: qcom: sm8250: drop unused clock-frequency from wsa-macro
  arm64: dts: qcom: align OPP table node name with DT schema
  arm64: dts: qcom: rename mdp nodes to display-controller
  ...
Link: https://lore.kernel.org/r/20230126202528.3691539-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-30rxrpc: Fix trace stringDavid Howells
Fix a trace string to indicate that it's discarding the local endpoint for a preallocated peer, not a preallocated connection. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2023-01-30devlink: remove devlink featuresJiri Pirko
Devlink features were introduced to disallow devlink reload calls of userspace before the devlink was fully initialized. The reason for this workaround was the fact that devlink reload was originally called without devlink instance lock held. However, with recent changes that converted devlink reload to be performed under devlink instance lock, this is redundant so remove devlink features entirely. Note that mlx5 used this to enable devlink reload conditionally only when device didn't act as multi port slave. Move the multi port check into mlx5_devlink_reload_down() callback alongside with the other checks preventing the device from reload in certain states. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30Merge tag 'batadv-next-pullrequest-20230127' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- drop prandom.h includes, by Sven Eckelmann
- fix mailing list address, by Sven Eckelmann
- multicast feature preparation, by Linus Lüssing (2 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-30net: stmmac: do not stop RX_CLK in Rx LPI state for qcs404 SoCAndrey Konovalov
Currently in phy_init_eee() the driver unconditionally configures the PHY to stop RX_CLK after entering Rx LPI state. This causes an LPI interrupt storm on my qcs404-base board. Change the PHY initialization so that for "qcom,qcs404-ethqos" compatible device RX_CLK continues to run even in Rx LPI state. Signed-off-by: Andrey Konovalov <andrey.konovalov@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-29s390/bpf: Implement arch_prepare_bpf_trampoline()Ilya Leoshkevich
arch_prepare_bpf_trampoline() is used for direct attachment of eBPF programs to various places, bypassing kprobes. It's responsible for calling a number of eBPF programs before, instead and/or after whatever they are attached to.
Add a s390x implementation, paying attention to the following:
- Reuse the existing JIT infrastructure, where possible.
- Like the existing JIT, prefer making multiple passes instead of backpatching. Currently 2 passes is enough. If literal pool is introduced, this needs to be raised to 3. However, at the moment adding literal pool only makes the code larger. If branch shortening is introduced, the number of passes needs to be increased even further.
- Support both regular and ftrace calling conventions, depending on the trampoline flags.
- Use expolines for indirect calls.
- Handle the mismatch between the eBPF and the s390x ABIs.
- Sign-extend fmod_ret return values.
invoke_bpf_prog() produces about 120 bytes; it might be possible to slightly optimize this, but reaching 50 bytes, like on x86_64, looks unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi and __bpf_prog_exit as literals already takes at least 5 * 12 = 60 bytes, and we can't use relative addressing for most of them. Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230129190501.1624747-5-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()Yu Kuai
Currently the parent pd can be freed before the child pd:
t1: remove cgroup C1
blkcg_destroy_blkgs
 blkg_destroy
  list_del_init(&blkg->q_node)
  // remove blkg from queue list
  percpu_ref_kill(&blkg->refcnt)
   blkg_release
    call_rcu
t2: from t1
__blkg_release
 blkg_free
  schedule_work
t4: deactivate policy
blkcg_deactivate_policy
 pd_free_fn
 // parent of C1 is freed first
t3: from t2
blkg_free_workfn
 pd_free_fn
If a policy (for example, ioc_timer_fn() from iocost) accesses the parent pd from the child pd after pd_offline_fn(), a UAF can be triggered.
Fix the problem by delaying 'list_del_init(&blkg->q_node)' from blkg_destroy() to blkg_free_workfn(), and using a new disk level mutex to synchronize blkg_free_workfn() and blkcg_deactivate_policy().
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230119110350.2287325-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29block: introduce bdev_zone_no helperPankaj Raghav
Add a generic bdev_zone_no() helper to calculate zone number for a given sector in a block device. This helper internally uses disk_zone_no() to find the zone number. Use the helper bdev_zone_no() to calculate nr of zones. This lets us make modifications to the math if needed in one place. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20230110143635.77300-4-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
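The arithmetic behind such a helper is short. The sketch below is a user-space illustration that assumes a power-of-two zone size expressed in sectors; `sketch_zone_no()` and `sketch_ilog2()` are hypothetical names, and the real helpers take a block device rather than a raw zone size.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of log2(v) for v > 0; a simple stand-in for the example. */
static inline unsigned int sketch_ilog2(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Zone index of a sector, assuming zone_sectors is a power of two. */
static inline unsigned int sketch_zone_no(uint64_t sector,
					  uint64_t zone_sectors)
{
	assert(zone_sectors && (zone_sectors & (zone_sectors - 1)) == 0);
	return (unsigned int)(sector >> sketch_ilog2(zone_sectors));
}
```

Because every caller goes through one function, a later change to the math (for instance, to support other zone sizes) would only have to be made here.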
2023-01-29block: add a new helper bdev_{is_zone_start, offset_from_zone_start}Pankaj Raghav
Instead of open coding to check for zone start, add a helper to improve readability and store the logic in one place. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230110143635.77300-3-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
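A rough sketch of what the two helpers compute, again in user-space C and assuming a power-of-two zone size in sectors. The `sketch_` names are hypothetical; the actual helpers operate on a block device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Distance in sectors from the start of the zone containing 'sector'. */
static inline uint64_t sketch_offset_from_zone_start(uint64_t sector,
						     uint64_t zone_sectors)
{
	assert(zone_sectors && (zone_sectors & (zone_sectors - 1)) == 0);
	return sector & (zone_sectors - 1);
}

/* A sector starts a zone exactly when its offset within the zone is 0. */
static inline bool sketch_is_zone_start(uint64_t sector,
					uint64_t zone_sectors)
{
	return sketch_offset_from_zone_start(sector, zone_sectors) == 0;
}
```

Callers that previously open-coded the mask can then read as a question ("is this a zone start?") instead of as bit arithmetic.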
2023-01-29block: remove superfluous check for request queue in bdev_is_zoned()Pankaj Raghav
Remove the superfluous request queue check in bdev_is_zoned() as bdev_get_queue() can never return NULL. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20230110143635.77300-2-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29ublk_drv: add mechanism for supporting unprivileged ublk deviceMing Lei
An unprivileged ublk device is helpful for the container use case: a ublk device created in one unprivileged container can be controlled and accessed by this container only.
Implement this feature by adding the flag UBLK_F_UNPRIVILEGED_DEV. If this flag isn't set, any control command has to be run by a privileged user. Otherwise, any control command can be sent by any unprivileged user, but the user has to be permitted to access the ublk char device to be controlled.
In the case of UBLK_F_UNPRIVILEGED_DEV:
1) The command UBLK_CMD_ADD_DEV is always allowed, and the user needs to provide the owner's uid/gid in this command so that udev can set the correct ownership for the created ublk device, since the device owner uid/gid can be queried via the command UBLK_CMD_GET_DEV_INFO.
2) Other control commands can only be run successfully if the current user is allowed to access the specified ublk char device; to run the permission check, the path of the ublk char device has to be provided by these commands.
Also add one control command, UBLK_CMD_GET_DEV_INFO2, which always includes the char dev path in the payload, since userspace may not know whether this device was created in unprivileged mode.
To apply this mechanism, the system administrator needs to take the following policies:
1) chmod 0666 /dev/ublk-control
2) change ownership of ublkcN & ublkbN
- chown owner_uid:owner_gid /dev/ublkcN
- chown owner_uid:owner_gid /dev/ublkbN
Both can be done via one simple udev rule.
Userspace: https://github.com/ming1/ubdsrv/tree/unprivileged-ublk
'ublk add -t $TYPE --un_privileged=1' creates one unprivileged ublk device if the user is unprivileged.
Link: https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/ Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230106041711.914434-7-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29ublk_drv: add device parameter UBLK_PARAM_TYPE_DEVTMing Lei
Userspace side only knows device ID, but the associated path of ublkc* and ublkb* could be changed by udev, and that depends on userspace's policy, so add parameter of UBLK_PARAM_TYPE_DEVT for retrieving major/minor of the ublkc* and ublkb*, then user may figure out major/minor of the ublk disks he/she owns. With major/minor, it is easy to find the device node path. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230106041711.914434-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29drbd: make limits unsignedChristoph Böhmwalder
These are almost always used as unsigned integers, so mark them as such. Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20230113123538.144276-4-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29drbd: fix DRBD_VOLUME_MAX 65535 -> 65534Robert Altnoeder
The protocol uses -1 as a reserved value for 'no specific volume', and since the protocol field is a 16 bit unsigned value, -1 is converted to 65535. Therefore, limit the range of valid volume numbers to [0, 65534]. Signed-off-by: Robert Altnoeder <robert.altnoeder@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20230113123538.144276-3-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
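The off-by-one is easy to see in a few lines of C. The macro and function names below are made up for the example and only mirror the reasoning in the commit message: -1 stored in a 16-bit unsigned field becomes 65535, so that value cannot also be a valid volume number.

```c
#include <stdbool.h>
#include <stdint.h>

/* -1 converted to a 16-bit unsigned value: the reserved "no volume" marker. */
#define SKETCH_VOLUME_NONE ((uint16_t)-1)

/* Largest volume number that does not collide with the marker. */
#define SKETCH_VOLUME_MAX  65534U

_Static_assert(SKETCH_VOLUME_NONE == 65535, "-1 wraps to 65535 in 16 bits");

/* Valid volume numbers are the closed range [0, 65534]. */
static inline bool sketch_volume_valid(unsigned int vnr)
{
	return vnr <= SKETCH_VOLUME_MAX;
}
```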
2023-01-29drbd: adjust drbd_limits license headerChristoph Böhmwalder
See also commit 93c68cc46a07 ("drbd: use consistent license"). We only want to license drbd under GPL-2.0, so use the corresponding SPDX header consistently. Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20230113123538.144276-2-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29drbd: split off drbd_config into separate fileChristoph Böhmwalder
To be more similar to what we do in the out-of-tree module and ease the upstreaming process. Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20230113123506.144082-4-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29drbd: drop API_VERSION defineChristoph Böhmwalder
Use the genetlink api version as defined in drbd_genl_api.h. Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20230113123506.144082-3-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29block: save user max_sectors limitKeith Busch
The user can set the max_sectors limit to any valid value via sysfs /sys/block/<dev>/queue/max_sectors_kb attribute. If the device limits are ever rescanned, though, the limit reverts back to the potentially artificially low BLK_DEF_MAX_SECTORS value. Preserve the user's setting as the max_sectors limit as long as it's valid. The user can reset back to defaults by writing 0 to the sysfs file. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20230105205146.3610282-3-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
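The behaviour described above can be sketched as a small state update. This is an illustration under stated assumptions, not the block layer code: the struct, the function names and the `SKETCH_DEF_MAX_SECTORS` value are placeholders, and clamping a user value to the hardware limit is this example's reading of "as long as it's valid".

```c
#include <assert.h>

#define SKETCH_DEF_MAX_SECTORS 2560u /* placeholder default for the example */

struct sketch_limits {
	unsigned int max_hw_sectors;   /* what the device supports */
	unsigned int max_user_sectors; /* 0 means "no user setting" */
	unsigned int max_sectors;      /* effective limit */
};

static inline unsigned int sketch_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Recompute the effective limit, e.g. after device limits are rescanned. */
static void sketch_update_max_sectors(struct sketch_limits *lim)
{
	assert(lim && lim->max_hw_sectors > 0);

	if (lim->max_user_sectors)
		/* Keep the user's choice, capped by the hardware limit. */
		lim->max_sectors = sketch_min(lim->max_user_sectors,
					      lim->max_hw_sectors);
	else
		/* No user setting: fall back to the default cap. */
		lim->max_sectors = sketch_min(SKETCH_DEF_MAX_SECTORS,
					      lim->max_hw_sectors);
}
```

Storing the user's value separately from the effective limit is what allows a rescan to reapply it, and writing 0 simply clears the stored value.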
2023-01-29block: make BLK_DEF_MAX_SECTORS unsignedKeith Busch
This is used as an unsigned value, so define it that way to avoid having to cast it. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20230105205146.3610282-2-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29io_uring: optimise ctx flags layoutPavel Begunkov
There may be a different cost for reading just one byte or more, so it's beneficial to keep ctx flag bits that we access together in a single byte. That affected code generation of __io_cq_unlock_post_flush() and removed one memory load. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bbe8ca4705704690319d65e45845f9fc9d35f420.1673887636.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
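A minimal sketch of the layout idea, with entirely hypothetical flag and struct names (the commit message does not list the real ones): flags that are tested together on the hot path share one byte, so a single load and mask covers all of them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hot-path flags, all placed in the same byte. */
#define SKETCH_HOT_A 0x01u
#define SKETCH_HOT_B 0x02u
#define SKETCH_HOT_C 0x04u

struct sketch_ctx {
	uint8_t hot_flags;  /* flags tested together on the hot path */
	uint8_t cold_flags; /* rarely tested flags live in another byte */
};

/* One byte load and one mask test answer the whole question. */
static inline bool sketch_needs_slow_path(const struct sketch_ctx *ctx)
{
	return (ctx->hot_flags &
		(SKETCH_HOT_A | SKETCH_HOT_B | SKETCH_HOT_C)) != 0;
}
```

Whether this actually saves a load depends on the compiler and the surrounding code; the commit reports that it did for `__io_cq_unlock_post_flush()`.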
2023-01-29io_uring: add io_req_local_work_add wake fast pathPavel Begunkov
Don't wake the master task after queueing a deferred tw unless it's currently waiting in io_cqring_wait. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/717702d772825a6647e6c315b4690277ba84c3fc.1673274244.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29io_uring: add lazy poll_wq activationPavel Begunkov
Even though io_poll_wq_wake()'s waitqueue_active reuses a barrier we do for another waitqueue, it's not going to be the case in the future and so we want to have a fast path for it when the ring has never been polled. Move poll_wq wake ups into __io_commit_cqring_flush() using a new flag called ->poll_activated. The idea behind the flag is to set it when the ring was polled for the first time. This requires additional sync to not miss events, which is done here by using task_work for ->task_complete rings, and by default enabling the flag for all other types of rings. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/060785e8e9137a920b232c0c7f575b131af19cac.1673274244.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29io_uring: separate wq for ring pollingPavel Begunkov
Don't use ->cq_wait for ring polling but add a separate wait queue for it. We need it for following patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dea0be0bf990503443c5c6c337fc66824af7d590.1673274244.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29io_uring: move submitter_task out of cold cachelinePavel Begunkov
->submitter_task is used somewhat more frequently now than before, i.e. for local tw enqueue and run, so let's move it from the end of ctx, which is full of cold data, to the first cacheline with mostly constants. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/415ca91dc5ad1dec612b892e489cda98e1069542.1673274244.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>