Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Introduce and use a single page frag cache for allocating small skb
heads, clawing back the 10-20% performance regression in UDP flood
test from previous fixes.
- Run packets which already went thru HW coalescing thru SW GRO. This
significantly improves TCP segment coalescing and simplifies
deployments as different workloads benefit from HW or SW GRO.
- Shrink the size of the base zero-copy send structure.
- Move TCP init under a new slow / sleepable version of DO_ONCE().
BPF:
- Add BPF-specific, any-context-safe memory allocator.
- Add helpers/kfuncs for PKCS#7 signature verification from BPF
programs.
- Define a new map type and related helpers for user space -> kernel
communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF).
- Allow targeting BPF iterators to loop through resources of one
task/thread.
- Add ability to call selected destructive functions. Expose
crash_kexec() to allow BPF to trigger a kernel dump. Use
CAP_SYS_BOOT check on the loading process to judge permissions.
- Enable BPF to collect custom hierarchical cgroup stats efficiently
by integrating with the rstat framework.
- Support struct arguments for trampoline based programs. Only
structs with size <= 16B and x86 are supported.
- Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping
sockets (instead of just TCP and UDP sockets).
- Add a helper for accessing CLOCK_TAI for time sensitive network
related programs.
- Support accessing network tunnel metadata's flags.
- Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open.
- Add support for writing to Netfilter's nf_conn:mark.
Protocols:
- WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation
(MLO) work (802.11be, WiFi 7).
- vsock: improve support for SO_RCVLOWAT.
- SMC: support SO_REUSEPORT.
- Netlink: define and document how to use netlink in a "modern" way.
Support reporting missing attributes via extended ACK.
- IPSec: support collect metadata mode for xfrm interfaces.
- TCPv6: send consistent autoflowlabel in SYN_RECV state and RST
packets.
- TCP: introduce optional per-netns connection hash table to allow
better isolation between namespaces (opt-in, at the cost of memory
and cache pressure).
- MPTCP: support TCP_FASTOPEN_CONNECT.
- Add NEXT-C-SID support in Segment Routing (SRv6) End behavior.
- Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets.
- Open vSwitch:
- Allow specifying ifindex of new interfaces.
- Allow conntrack and metering in non-initial user namespace.
- TLS: support the Korean ARIA-GCM crypto algorithm.
- Remove DECnet support.
Driver API:
- Allow selecting the conduit interface used by each port in DSA
switches, at runtime.
- Ethernet Power Sourcing Equipment and Power Device support.
- Add tc-taprio support for queueMaxSDU parameter, i.e. setting per
traffic class max frame size for time-based packet schedules.
- Support PHY rate matching - adapting between differing host-side
and link-side speeds.
- Introduce QUSGMII PHY mode and 1000BASE-KX interface mode.
- Validate OF (device tree) nodes for DSA shared ports; make
phylink-related properties mandatory on DSA and CPU ports.
Enforcing more uniformity should allow transitioning to phylink.
- Require that flash component name used during update matches one of
the components for which version is reported by info_get().
- Remove "weight" argument from driver-facing NAPI API as much as
possible. It's one of those magic knobs which seemed like a good
idea at the time but is too indirect to use in practice.
- Support offload of TLS connections with 256 bit keys.
New hardware / drivers:
- Ethernet:
- Microchip KSZ9896 6-port Gigabit Ethernet Switch
- Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs
- Analog Devices ADIN1110 and ADIN2111 industrial single pair
Ethernet (10BASE-T1L) MAC+PHY.
- Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP).
- Ethernet SFPs / modules:
- RollBall / Hilink / Turris 10G copper SFPs
- HALNy GPON module
- WiFi:
- CYW43439 SDIO chipset (brcmfmac)
- CYW89459 PCIe chipset (brcmfmac)
- BCM4378 on Apple platforms (brcmfmac)
Drivers:
- CAN:
- gs_usb: HW timestamp support
- Ethernet PHYs:
- lan8814: cable diagnostics
- Ethernet NICs:
- Intel (100G):
- implement control of FCS/CRC stripping
- port splitting via devlink
- L2TPv3 filtering offload
- nVidia/Mellanox:
- tunnel offload for sub-functions
- MACSec offload, w/ Extended packet number and replay window
offload
- significantly restructure, and optimize the AF_XDP support,
align the behavior with other vendors
- Huawei:
- configuring DSCP map for traffic class selection
- querying standard FEC statistics
- querying SerDes lane number via ethtool
- Marvell/Cavium:
- egress priority flow control
- MACSec offload
- AMD/SolarFlare:
- PTP over IPv6 and raw Ethernet
- small / embedded:
- ax88772: convert to phylink (to support SFP cages)
- altera: tse: convert to phylink
- ftgmac100: support fixed link
- enetc: standard Ethtool counters
- macb: ZynqMP SGMII dynamic configuration support
- tsnep: support multi-queue and use page pool
- lan743x: Rx IP & TCP checksum offload
- igc: add xdp frags support to ndo_xdp_xmit
- Ethernet high-speed switches:
- Marvell (prestera):
- support SPAN port features (traffic mirroring)
- nexthop object offloading
- Microchip (sparx5):
- multicast forwarding offload
- QoS queuing offload (tc-mqprio, tc-tbf, tc-ets)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- support RGMII cmode
- NXP (felix):
- standardized ethtool counters
- Microchip (lan966x):
- QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets)
- traffic policing and mirroring
- link aggregation / bonding offload
- QUSGMII PHY mode support
- Qualcomm 802.11ax WiFi (ath11k):
- cold boot calibration support on WCN6750
- support to connect to a non-transmit MBSSID AP profile
- enable remain-on-channel support on WCN6750
- Wake-on-WLAN support for WCN6750
- support to provide transmit power from firmware via nl80211
- support to get power save duration for each client
- spectral scan support for 160 MHz
- MediaTek WiFi (mt76):
- WiFi-to-Ethernet bridging offload for MT7986 chips
- RealTek WiFi (rtw89):
- P2P support"
* tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits)
eth: pse: add missing static inlines
once: rename _SLOW to _SLEEPABLE
net: pse-pd: add regulator based PSE driver
dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller
ethtool: add interface to interact with Ethernet Power Equipment
net: mdiobus: search for PSE nodes by parsing PHY nodes.
net: mdiobus: fwnode_mdiobus_register_phy() rework error handling
net: add framework to support Ethernet PSE and PDs devices
dt-bindings: net: phy: add PoDL PSE property
net: marvell: prestera: Propagate nh state from hw to kernel
net: marvell: prestera: Add neighbour cache accounting
net: marvell: prestera: add stub handler neighbour events
net: marvell: prestera: Add heplers to interact with fib_notifier_info
net: marvell: prestera: Add length macros for prestera_ip_addr
net: marvell: prestera: add delayed wq and flush wq on deinit
net: marvell: prestera: Add strict cleanup of fib arbiter
net: marvell: prestera: Add cleanup of allocated fib_nodes
net: marvell: prestera: Add router nexthops ABI
eth: octeon: fix build after netif_napi_add() changes
net/mlx5: E-Switch, Return EBUSY if can't get mode lock
...
|
|
into clk-next
- Convert Baikal-T1 CCU driver to platform driver
- Split reset support out of primary Baikal-T1 CCU driver
- Add some missing clks required for RPiVid Video Decoder on RaspberryPi
- Mark PLLC critical on bcm2835
- Support for Renesas VersaClock7 clock generator family
* clk-baikal:
clk: baikal-t1: Convert to platform device driver
clk: baikal-t1: Add DDR/PCIe directly controlled resets support
dt-bindings: clk: baikal-t1: Add DDR/PCIe reset IDs
clk: baikal-t1: Move reset-controls code into a dedicated module
clk: baikal-t1: Add SATA internal ref clock buffer
clk: baikal-t1: Add shared xGMAC ref/ptp clocks internal parent
clk: baikal-t1: Fix invalid xGMAC PTP clock divider
clk: vc5: Fix 5P49V6901 outputs disabling when enabling FOD
* clk-broadcom:
clk: bcm: rpi: Add support for VEC clock
clk: bcm: rpi: Handle pixel clock in firmware
clk: bcm: rpi: Add support HEVC clock
clk: bcm2835: fix bcm2835_clock_rate_from_divisor declaration
clk: bcm2835: Round UART input clock up
clk: bcm2835: Make peripheral PLLC critical
* clk-vc5:
clk: vc5: Add support for IDT/Renesas VersaClock 5P49V6975
dt-bindings: clock: vc5: Add 5P49V6975
clk: vc5: Use regmap_{set,clear}_bits() where appropriate
clk: vc5: Check IO access results
* clk-versaclock:
clk: Renesas versaclock7 ccf device driver
dt-bindings: Renesas versaclock7 device tree bindings
|
|
into clk-next
- More devm helpers for fixed rate registration
- Add Spreadtrum UMS512 SoC clk support
- Various PXA168 clk driver fixes
* clk-fixed-rate:
clk: fixed-rate: add devm_clk_hw_register_fixed_rate
clk: asm9260: use parent index to link the reference clock
* clk-spreadtrum:
clk: sprd: Add clocks support for UMS512
* clk-pxa:
clk: pxa: add a check for the return value of kzalloc()
clk: mmp: pxa168: control shared SDH bits with separate clock
dt-bindings: marvell,pxa168: add clock ids for SDH AXI clocks
clk: mmp: pxa168: add clocks for SDH2 and SDH3
dt-bindings: marvell,pxa168: add clock id for SDH3
clk: mmp: pxa168: fix GPIO clock enable bits
clk: mmp: pxa168: add muxes for more peripherals
clk: mmp: pxa168: fix incorrect parent clocks
clk: mmp: pxa168: fix const-correctness
clk: mmp: pxa168: add new clocks for peripherals
dt-bindings: marvell,pxa168: add clock ids for additional dividers
clk: mmp: pxa168: fix incorrect dividers
clk: mmp: pxa168: add additional register defines
* clk-ti:
clk: davinci: cfgchip: Use dev_err_probe() helper
clk: davinci: pll: fix spelling typo in comment
MAINTAINERS: add header file to TI DAVINCI SERIES CLOCK DRIVER
|
|
'clk-allwinner' and 'clk-imx' into clk-next
* clk-rockchip:
dt-bindings: clock: rockchip: change SPDX-License-Identifier
dt-bindings: clock: convert rockchip,rk3128-cru.txt to YAML
clk: rockchip: Add clock controller support for RV1126 SoC
dt-bindings: clock: rockchip: Document RV1126 CRU
clk: rockchip: Add dt-binding header for RV1126
clk: rockchip: Add MUXTBL variant
* clk-renesas:
clk: renesas: r8a779g0: Add EtherAVB clocks
clk: renesas: r8a779g0: Add PFC/GPIO clocks
clk: renesas: r8a779g0: Add I2C clocks
clk: renesas: r8a779g0: Add watchdog clock
dt-bindings: clock: renesas,rzg2l: Document RZ/Five SoC
clk: renesas: r8a779f0: Add MSIOF clocks
clk: renesas: r9a09g011: Add IIC clock and reset entries
clk: renesas: r9a07g044: Add conditional compilation for r9a07g044_cpg_info
clk: renesas: r8a779f0: Add TMU and parent SASYNC clocks
clk: renesas: r8a779f0: Add CMT clocks
clk: renesas: r8a779f0: Add SDH0 clock
* clk-microchip:
clk: at91: sama5d2: Add Generic Clocks for UART/USART
clk: microchip: add PolarFire SoC fabric clock support
dt-bindings: clk: add PolarFire SoC fabric clock ids
dt-bindings: clk: document PolarFire SoC fabric clocks
dt-bindings: clk: rename mpfs-clkcfg binding
clk: microchip: mpfs: update module authorship & licencing
clk: microchip: mpfs: convert periph_clk to clk_gate
clk: microchip: mpfs: convert cfg_clk to clk_divider
clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
clk: microchip: mpfs: simplify control reg access
clk: microchip: mpfs: move id & offset out of clock structs
clk: microchip: mpfs: add MSS pll's set & round rate
MAINTAINERS: add polarfire soc reset controller
reset: add polarfire soc reset support
clk: microchip: mpfs: add reset controller
dt-bindings: clk: microchip: mpfs: add reset controller support
clk: microchip: mpfs: make the rtc's ahb clock critical
clk: microchip: mpfs: fix clk_cfg array bounds violation
* clk-allwinner:
clk: sunxi-ng: ccu-sun9i-a80-usb: Use dev_err_probe() helper
clk: sunxi-ng: ccu-sun9i-a80-de: Use dev_err_probe() helper
clk: sunxi-ng: sun8i-de2: Use dev_err_probe() helper
clk: sunxi-ng: d1: Limit PLL rates to stable ranges
* clk-imx:
clk: imx: scu: fix memleak on platform_device_add() fails
clk: imx93: add SAI IPG clk
clk: imx93: add MU1/2 clock
clk: imx93: switch to use new clk gate API
clk: imx: add i.MX93 clk gate
clk: imx: clk-composite-93: check white_list
clk: imx: clk-composite-93: check slice busy
dt-bindings: clock: imx93-clock: add more MU/SAI clocks
dt-bindings: clock: imx8mm: don't use multiple blank lines
clk: imx8mp: tune the order of enet_qos_root_clk
|
|
into clk-next
- Add resets for MediaTek MT8195 PCIe and USB
- Remove DaVinci DM644x and DM646x clk driver support
* clk-samsung:
clk: samsung: MAINTAINERS: add Krzysztof Kozlowski
clk: samsung: exynos850: Implement CMU_MFCMSCL domain
clk: samsung: exynos850: Implement CMU_IS domain
clk: samsung: exynos850: Implement CMU_AUD domain
clk: samsung: exynos850: Style fixes
clk: samsung: exynosautov9: add fsys1 clock support
clk: samsung: exynosautov9: add fsys0 clock support
clk: samsung: exynosautov9: correct register offsets of peric0/c1
clk: samsung: exynosautov9: add missing gate clks for peric0/c1
dt-bindings: clock: exynos850: Add Exynos850 CMU_MFCMSCL
dt-bindings: clock: exynos850: Add Exynos850 CMU_IS
dt-bindings: clock: exynos850: Add Exynos850 CMU_AUD
dt-bindings: clock: exynosautov9: add schema for cmu_fsys0/1
dt-bindings: clock: exynosautov9: add fsys1 clock definitions
dt-bindings: clock: exynosautov9: add fys0 clock definitions
clk: samsung: exynos7885: Add TREX clocks
clk: samsung: exynos7885: Implement CMU_FSYS domain
dt-bindings: clock: exynosautov9: correct clock numbering of peric0/c1
clk: samsung: exynos-clkout: Use of_device_get_match_data()
* clk-mtk: (42 commits)
clk: mediatek: add driver for MT8365 SoC
clk: mediatek: Export required common code symbols
clk: mediatek: Provide mtk_devm_alloc_clk_data
dt-bindings: clock: mediatek: add bindings for MT8365 SoC
clk: mediatek: mt8192: deduplicate parent clock lists
clk: mediatek: Migrate remaining clk_unregister_*() to clk_hw_unregister_*()
clk: mediatek: fix unregister function in mtk_clk_register_dividers cleanup
clk: mediatek: clk-mt8192: Add clock mux notifier for mfg_pll_sel
clk: mediatek: clk-mt8192-mfg: Propagate rate changes to parent
clk: mediatek: clk-mt8195-topckgen: Drop univplls from mfg mux parents
clk: mediatek: clk-mt8195-topckgen: Add GPU clock mux notifier
clk: mediatek: clk-mt8195-topckgen: Register mfg_ck_fast_ref as generic mux
clk: mediatek: clk-mt8195-mfg: Reparent mfg_bg3d and propagate rate changes
clk: mediatek: mt8183: Add clk mux notifier for MFG mux
clk: mediatek: mux: add clk notifier functions
clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent
clk: mediatek: Use mtk_clk_register_gates_with_dev in simple probe
clk: mediatek: gate: Export mtk_clk_register_gates_with_dev
clk: mediatek: add VDOSYS1 clock
dt-bindings: clk: mediatek: Add MT8195 DPI clocks
...
* clk-rm:
clk: davinci: remove PLL and PSC clocks for DaVinci DM644x and DM646x
* clk-ast:
clk: ast2600: BCLK comes from EPLL
* clk-qcom: (97 commits)
clk: qcom: gcc-sm6375: Ensure unsigned long type
clk: qcom: gcc-sm6375: Remove unused variables
clk: qcom: kpss-xcc: convert to parent data API
clk: introduce (devm_)hw_register_mux_parent_data_table API
clk: qcom: gcc-msm8939: use ARRAY_SIZE instead of specifying num_parents
clk: qcom: gcc-msm8939: use parent_hws where possible
dt-bindings: clock: move qcom,gcc-msm8939 to qcom,gcc-msm8916.yaml
clk: qcom: gcc-sm6350: Update the .pwrsts for usb gdscs
clk: qcom: gcc-sc8280xp: use retention for USB power domains
clk: qcom: gdsc: add missing error handling
dt-bindings: clocks: qcom,gcc-sc8280xp: Fix typos
clk: qcom: Add global clock controller driver for SM6375
dt-bindings: clock: add SM6375 QCOM global clock bindings
clk: qcom: alpha: Add support for programming the PLL_FSM_LEGACY_MODE bit
clk: qcom: gcc-sc7280: Update the .pwrsts for usb gdscs
clk: qcom: gcc-sc7180: Update the .pwrsts for usb gdsc
clk: qcom: gdsc: Fix the handling of PWRSTS_RET support
clk: qcom: Add SC8280XP GPU clock controller
dt-bindings: clock: Add Qualcomm SC8280XP GPU binding
clk: qcom: smd: Add SM6375 clocks
...
|
|
'clk-xilinx' into clk-next
- Miscellaneous of_node_put() fixes
- Nuke dt-bindings/clk path (again) by moving headers to dt-bindings/clock
- Convert gpio-clk-gate binding to YAML
- Various fixes to AMD/Xilinx Zynqmp clk driver
- Graduate AMD/Xilinx "clocking wizard" driver from staging
* clk-ofnode:
clk: ti: Balance of_node_get() calls for of_find_node_by_name()
clk: tegra20: Fix refcount leak in tegra20_clock_init
clk: tegra: Fix refcount leak in tegra114_clock_init
clk: tegra: Fix refcount leak in tegra210_clock_init
clk: sprd: Hold reference returned by of_get_parent()
clk: berlin: Add of_node_put() for of_get_parent()
clk: at91: dt-compat: Hold reference returned by of_get_parent()
clk: qoriq: Hold reference returned by of_get_parent()
clk: oxnas: Hold reference returned by of_get_parent()
clk: st: Hold reference returned by of_get_parent()
clk: tegra: Add missing of_node_put()
clk: meson: Hold reference returned by of_get_parent()
clk: nomadik: Add missing of_node_put()
* clk-bindings:
dt-bindings: clock: drop minItems equal to maxItems
dt-bindings: clock: gpio-gate-clock: Convert to json-schema
dt-bindings: clock: Move versaclock.h to dt-bindings/clock
dt-bindings: clock: Move lochnagar.h to dt-bindings/clock
* clk-cleanup:
clk: allow building lan966x as a module
clk: clk-xgene: simplify if-if to if-else
clk: nxp: fix typo in comment
clk: mvebu: armada-37xx-tbg: Remove the unneeded result variable
clk: ti: dra7-atl: Fix reference leak in of_dra7_atl_clk_probe
clkdev: Simplify devm_clk_hw_register_clkdev() function
clkdev: Remove never used devm_clk_release_clkdev()
clk: Remove never used devm_of_clk_del_provider()
clk: pistachio: Fix initconst confusion
clk: clk-npcm7xx: Remove unused struct npcm7xx_clk_gate_data and npcm7xx_clk_div_fixed_data
clk: do not initialize ret
clk: remove extra empty line
clk: Fix comment typo
clk: move from strlcpy with unused retval to strscpy
* clk-zynq:
clk: zynqmp: pll: rectify rate rounding in zynqmp_pll_round_rate
clk: zynqmp: Check the return type zynqmp_pm_query_data
clk: zynqmp: Add a check for NULL pointer
clk: zynqmp: Replaced strncpy() with strscpy()
clk: zynqmp: Fix stack-out-of-bounds in strncpy`
clk: zynqmp: make bestdiv unsigned
* clk-xilinx:
clk: clocking-wizard: Depend on HAS_IOMEM
clk: clocking-wizard: Use dev_err_probe() helper
clk: clocking-wizard: Update the compatible
clk: clocking-wizard: Fix the reconfig for 5.2
clk: clocking-wizard: Rename nr-outputs to xlnx,nr-outputs
clk: clocking-wizard: Move clocking-wizard out
dt-bindings: add documentation of xilinx clocking wizard
|
|
This PLL frequency needs a UL postfix to avoid compiler warnings on
32-bit architectures.
Fixes: 184fdd873d83 ("clk: qcom: Add global clock controller driver for SM6375")
Cc: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
gcc_parent_data_15 and gcc_parent_map_15 are not used in this driver.
Remove them.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Link: https://lore.kernel.org/r/20221003211438.25691-1-konrad.dybcio@somainline.org
Fixes: 184fdd873d83 ("clk: qcom: Add global clock controller driver for SM6375")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS updates from Borislav Petkov:
- Fix the APEI MCE callback handler to consult the hardware about the
granularity of the memory error instead of hard-coding it
- Offline memory pages on Intel machines after 2 errors reported per
page
* tag 'ras_core_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Retrieve poison range from hardware
RAS/CEC: Reduce offline page threshold for Intel systems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
- Add support for Skylake-S CPUs to ie31200_edac
- Improve error decoding speed of the Intel drivers by avoiding the
ACPI facilities but doing decoding in the driver itself
- Other misc improvements to the Intel drivers
- The usual cleanups and fixlets all over EDAC land
* tag 'edac_updates_for_v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/i7300: Correct the i7300_exit() function name in comment
x86/sb_edac: Add row column translation for Broadwell
EDAC/i10nm: Print an extra register set of retry_rd_err_log
EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM
EDAC/skx_common: Add ChipSelect ADXL component
EDAC/ppc_4xx: Reorder symbols to get rid of a few forward declarations
EDAC: Remove obsolete declarations in edac_module.h
EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs
EDAC/skx_common: Make output format similar
EDAC/skx_common: Use driver decoder first
EDAC/mc: Drop duplicated dimm->nr_pages debug printout
EDAC/mc: Replace spaces with tabs in memtype flags definition
EDAC/wq: Remove unneeded flush_workqueue()
EDAC/ie31200: Add Skylake-S support
|
|
The for_each_acpi_consumer_dev() takes a reference to the iterator
and if we break a loop we must drop that reference. This usually
happens when error handling is involved. However it's not the case
for skl_int3472_fill_clk_pdata().
Don't leak reference on error by dropping it properly.
Fixes: 43cf36974d76 ("platform/x86: int3472: Support multiple clock consumers")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
If an error is detected as a result of user-space process accessing a
corrupt memory location, the CPU may take an abort. Then the platform
firmware reports kernel via NMI like notifications, e.g. NOTIFY_SEA,
NOTIFY_SOFTWARE_DELEGATED, etc.
For NMI like notifications, commit 7f17b4a121d0 ("ACPI: APEI: Kick the
memory_failure() queue for synchronous errors") keep track of whether
memory_failure() work was queued, and make task_work pending to flush out
the queue so that the work is processed before return to user-space.
The code use init_mm to check whether the error occurs in user space:
if (current->mm != &init_mm)
The condition is always true, becase _nobody_ ever has "init_mm" as a real
VM any more.
In addition to abort, errors can also be signaled as asynchronous
exceptions, such as interrupt and SError. In such case, the interrupted
current process could be any kind of thread. When a kernel thread is
interrupted, the work ghes_kick_task_work deferred to task_work will never
be processed because entry_handler returns to call ret_to_kernel() instead
of ret_to_user(). Consequently, the estatus_node alloced from
ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed.
After around 200 allocations in our platform, the ghes_estatus_pool will
run of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a
result, the event failed to be processed.
sdei: event 805 on CPU 113 failed with error: -2
Finally, a lot of unhandled events may cause platform firmware to exceed
some threshold and reboot.
The condition should generally just do
if (current->mm)
as described in active_mm.rst documentation.
Then if an asynchronous error is detected when a kernel thread is running,
(e.g. when detected by a background scrubber), do not add task_work to it
as the original patch intends to do.
Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit d60cd06331a3 ("PM: ACPI: reboot: Use S5 for reboot") caused Dell
PowerEdge r440 hangs at reboot.
The issue is fixed by commit 2ca1c94ce0b6 ("tg3: Disable tg3 device on
system reboot to avoid triggering AER"), so use the new sysoff API to
reinstate S5 for reboot on ACPI-based systems.
Using S5 for reboot is default behavior under Windows: "A full shutdown
(S5) occurs when a system restart is requested" [1].
Link: https://docs.microsoft.com/en-us/windows/win32/power/system-power-state # [1]
Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This change adds support for ACPI devices that use ExclusiveAndWake or
SharedAndWake in their _CRS GpioInt definition (instead of using _PRW),
and also provide power resources. Previously the ACPI subsystem had no
idea if the device had a wake capable interrupt armed. This resulted
in the ACPI device PM system placing the device into D3Cold, and thus
cutting power to the device. With this change we will now query the
_S0W method to figure out the appropriate wake capable D-state.
Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Device tree already has a mechanism to pass the wake_irq. It does this
by looking for the wakeup-source property and setting the
I2C_CLIENT_WAKE flag. This CL adds the ACPI equivalent. It uses the
ACPI interrupt wake flag to determine if the interrupt can be used to
wake the system. Previously the i2c drivers had to make assumptions and
blindly enable the wake IRQ. This can cause spurious wake events. e.g.,
If there is a device with an Active Low interrupt and the device gets
powered off while suspending, the interrupt line will go low since it's
no longer powered and wakes the system. For this reason we should
respect the board designers wishes and honor the wake bit defined on the
interrupt.
Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPI IRQ/Interrupt resources contain a bit that describes if the
interrupt should wake the system. This change exposes that bit via
a new IORESOURCE_IRQ_WAKECAPABLE flag. Drivers should check this flag
before arming an IRQ to wake the system.
Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The ACPI spec defines the SharedAndWake and ExclusiveAndWake share type
keywords. This is an indication that the GPIO IRQ can also be used as a
wake source. This change exposes the wake_capable bit so drivers can
correctly enable wake functionality instead of making an assumption.
Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Combine all queued EDAC changes for submission into v6.1:
* edac-drivers:
EDAC/ie31200: Add Skylake-S support
* edac-misc:
EDAC/i7300: Correct the i7300_exit() function name in comment
x86/sb_edac: Add row column translation for Broadwell
EDAC/i10nm: Print an extra register set of retry_rd_err_log
EDAC/i10nm: Retrieve and print retry_rd_err_log registers for HBM
EDAC/skx_common: Add ChipSelect ADXL component
EDAC/ppc_4xx: Reorder symbols to get rid of a few forward declarations
EDAC: Remove obsolete declarations in edac_module.h
EDAC/i10nm: Add driver decoder for Ice Lake and Tremont CPUs
EDAC/skx_common: Make output format similar
EDAC/skx_common: Use driver decoder first
EDAC/mc: Drop duplicated dimm->nr_pages debug printout
EDAC/mc: Replace spaces with tabs in memtype flags definition
EDAC/wq: Remove unneeded flush_workqueue()
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Convert the driver to parent data API. From the Documentation pll8_vote
and pxo should be declared in the DTS so fw_name can be used instead of
parent_names. .name is changed to the legacy pxo_board following how
it's declared in other drivers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220914144743.17369-2-ansuelsmth@gmail.com
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Fix a couple races found with a new torture test
- Improve errors when api functions are used incorrectly
- Improve tracing for lock requests from user space
- Fix use after free in recently added tracing cod.
- Small internal code cleanups
* tag 'dlm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: fix possible use after free if tracing
fs: dlm: const void resource name parameter
fs: dlm: LSFL_CB_DELAY only for kernel lockspaces
fs: dlm: remove DLM_LSFL_FS from uapi
fs: dlm: trace user space callbacks
fs: dlm: change ls_clear_proc_locks to spinlock
fs: dlm: remove dlm_del_ast prototype
fs: dlm: handle rcom in else if branch
fs: dlm: allow lockspaces have zero lvblen
fs: dlm: fix invalid derefence of sb_lvbptr
fs: dlm: handle -EINVAL as log_error()
fs: dlm: use __func__ for function name
fs: dlm: handle -EBUSY first in unlock validation
fs: dlm: handle -EBUSY first in lock arg validation
fs: dlm: fix race between test_bit() and queue_work()
fs: dlm: fix race in lowcomms
|
|
Merge in the left-over fixes before the net-next pull-request.
Conflicts:
drivers/net/ethernet/mediatek/mtk_ppe.c
ae3ed15da588 ("net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear")
9d8cb4c096ab ("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc")
https://lore.kernel.org/all/6cb6893b-4921-a068-4c30-1109795110bb@tessares.net/
kernel/bpf/helpers.c
8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF")
5679ff2f138f ("bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF")
8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types")
https://lore.kernel.org/all/20221003201957.13149-1-daniel@iogearbox.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add generic, regulator based PSE driver to support simple Power Sourcing
Equipment without automatic classification support.
This driver was tested on 10Bast-T1L switch with regulator based PoDL PSE.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add interface to support Power Sourcing Equipment. At current step it
provides generic way to address all variants of PSE devices as defined
in IEEE 802.3-2018 but support only objects specified for IEEE 802.3-2018 104.4
PoDL Power Sourcing Equipment (PSE).
Currently supported and mandatory objects are:
IEEE 802.3-2018 30.15.1.1.3 aPoDLPSEPowerDetectionStatus
IEEE 802.3-2018 30.15.1.1.2 aPoDLPSEAdminState
IEEE 802.3-2018 30.15.1.2.1 acPoDLPSEAdminControl
This is minimal interface needed to control PSE on each separate
ethernet port but it provides not all mandatory objects specified in
IEEE 802.3-2018.
Since "PoDL PSE" and "PSE" have similar names, but some different values
I decide to not merge them and keep separate naming schema. This should
allow as to be as close to IEEE 802.3 spec as possible and avoid name
conflicts in the future.
This implementation is connected to PHYs instead of MACs because PSE
auto classification can potentially interfere with PHY auto negotiation.
So, may be some extra PHY related initialization will be needed.
With WIP version of ethtools interaction with PSE capable link looks
as following:
$ ip l
...
5: t1l1@eth0: <BROADCAST,MULTICAST> ..
...
$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: disabled
PoDL PSE Power Detection Status: disabled
$ ethtool --set-pse t1l1 podl-pse-admin-control enable
$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: enabled
PoDL PSE Power Detection Status: delivering power
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some PHYs can be linked with PSE (Power Sourcing Equipment), so search
for related nodes and attach it to the phydev.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rework error handling as preparation for PSE patch. This patch should
make it easier to extend this function.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This framework was create with intention to provide support for Ethernet PSE
(Power Sourcing Equipment) and PDs (Powered Device).
At current step this patch implements generic PSE support for PoDL (Power over
Data Lines 802.3bu) specification with reserving name space for PD devices as
well.
This framework can be extended to support 802.3af and 802.3at "Power via the
Media Dependent Interface" (or PoE/Power over Ethernet)
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
"Most of the collected changes here are fixes across the tree for
various hardening features (details noted below).
The most notable new feature here is the addition of the memcpy()
overflow warning (under CONFIG_FORTIFY_SOURCE), which is the next step
on the path to killing the common class of "trivially detectable"
buffer overflow conditions (i.e. on arrays with sizes known at compile
time) that have resulted in many exploitable vulnerabilities over the
years (e.g. BleedingTooth).
This feature is expected to still have some undiscovered false
positives. It's been in -next for a full development cycle and all the
reported false positives have been fixed in their respective trees.
All the known-bad code patterns we could find with Coccinelle are also
either fixed in their respective trees or in flight.
The commit message in commit 54d9469bc515 ("fortify: Add run-time WARN
for cross-field memcpy()") for the feature has extensive details, but
I'll repeat here that this is a warning _only_, and is not intended to
actually block overflows (yet). The many patches fixing array sizes
and struct members have been landing for several years now, and we're
finally able to turn this on to find any remaining stragglers.
Summary:
Various fixes across several hardening areas:
- loadpin: Fix verity target enforcement (Matthias Kaehlcke).
- zero-call-used-regs: Add missing clobbers in paravirt (Bill
Wendling).
- CFI: clean up sparc function pointer type mismatches (Bart Van
Assche).
- Clang: Adjust compiler flag detection for various Clang changes
(Sami Tolvanen, Kees Cook).
- fortify: Fix warnings in arch-specific code in sh, ARM, and xen.
Improvements to existing features:
- testing: improve overflow KUnit test, introduce fortify KUnit test,
add more coverage to LKDTM tests (Bart Van Assche, Kees Cook).
- overflow: Relax overflow type checking for wider utility.
New features:
- string: Introduce strtomem() and strtomem_pad() to fill a gap in
strncpy() replacement needs.
- um: Enable FORTIFY_SOURCE support.
- fortify: Enable run-time struct member memcpy() overflow warning"
* tag 'hardening-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (27 commits)
Makefile.extrawarn: Move -Wcast-function-type-strict to W=1
hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero
sparc: Unbreak the build
x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled
x86/paravirt: clean up typos and grammaros
fortify: Convert to struct vs member helpers
fortify: Explicitly check bounds are compile-time constants
x86/entry: Work around Clang __bdos() bug
ARM: decompressor: Include .data.rel.ro.local
fortify: Adjust KUnit test for modular build
sh: machvec: Use char[] for section boundaries
kunit/memcpy: Avoid pathological compile-time string size
lib: Improve the is_signed_type() kunit test
LoadPin: Require file with verity root digests to have a header
dm: verity-loadpin: Only trust verity targets with enforcement
LoadPin: Fix Kconfig doc about format of file with verity digests
um: Enable FORTIFY_SOURCE
lkdtm: Update tests for memcpy() run-time warnings
fortify: Add run-time WARN for cross-field memcpy()
fortify: Use SIZE_MAX instead of (size_t)-1
...
|
|
We poll nexthops in HW and call for each active nexthop appropriate
neighbour.
Also we provide implicity neighbour resolving.
For example, user have added nexthop route:
# ip route add 5.5.5.5 via 1.1.1.2
But neighbour 1.1.1.2 doesn't exist. In this case we will try to call
neigh_event_send, even if there is no traffic.
This is useful, when you have add route, which will be used after some
time but with a lot of traffic (burst). So, we has prepared, offloaded
route in advance.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move forward and use new PRESTERA_FIB_TYPE_UC_NH to provide basic
nexthop routes support.
Provide deinitialization sequence for all created router objects.
Limitations:
- Only "local" and "main" tables supported
- Only generic interfaces supported for router (no bridges or vlans)
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Actual handler will be added in next patches
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This will be used to implement nexthops related logic in next patches.
Also try to keep ipv4/6 abstraction to be able to reuse helpers for ipv6
in the future.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add macros to determine IP address length (internal driver types).
This will be used in next patches for nexthops logic.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Flushing workqueues ensures, that no more pending works, related to just
unregistered or deinitialized notifiers. After that we can free memory.
Delayed wq will be used for neighbours in next patches.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This will, ensure, that there is no more, preciously allocated fib_cache
entries left after deinit.
Will be used to free allocated resources of nexthop routes, that points
to "not our" port (e.g. eth0).
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Do explicity cleanup on router_hw_fini, to ensure, that all allocated
objects cleaned. This will be used in cases,
when upper layer (cache) is not mapped to router_hw layer.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
- Add functions to allocate/delete/set nexthop group
- NOTE: non-ECMP nexthop is nexthop group with allocated size = 1
- Add function to read state of HW nh (if packets going through it)
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kcfi updates from Kees Cook:
"This replaces the prior support for Clang's standard Control Flow
Integrity (CFI) instrumentation, which has required a lot of special
conditions (e.g. LTO) and work-arounds.
The new implementation ("Kernel CFI") is specific to C, directly
designed for the Linux kernel, and takes advantage of architectural
features like x86's IBT. This series retains arm64 support and adds
x86 support.
GCC support is expected in the future[1], and additional "generic"
architectural support is expected soon[2].
Summary:
- treewide: Remove old CFI support details
- arm64: Replace Clang CFI support with Clang KCFI support
- x86: Introduce Clang KCFI support"
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1]
Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2]
* tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
x86: Add support for CONFIG_CFI_CLANG
x86/purgatory: Disable CFI
x86: Add types to indirectly called assembly functions
x86/tools/relocs: Ignore __kcfi_typeid_ relocations
kallsyms: Drop CONFIG_CFI_CLANG workarounds
objtool: Disable CFI warnings
objtool: Preserve special st_shndx indexes in elf_update_symbol
treewide: Drop __cficanonical
treewide: Drop WARN_ON_FUNCTION_MISMATCH
treewide: Drop function_nocfi
init: Drop __nocfi from __init
arm64: Drop unneeded __nocfi attributes
arm64: Add CFI error handling
arm64: Add types to indirect called assembly functions
psci: Fix the function type for psci_initcall_t
lkdtm: Emit an indirect call for CFI tests
cfi: Add type helper macros
cfi: Switch to -fsanitize=kcfi
cfi: Drop __CFI_ADDRESSABLE
cfi: Remove CONFIG_CFI_CLANG_SHADOW
...
|
|
Guenter reports I missed a netif_napi_add() call
in one of the platform-specific drivers:
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c: In function 'octeon_mgmt_probe':
drivers/net/ethernet/cavium/octeon/octeon_mgmt.c:1399:9: error: too many arguments to function 'netif_napi_add'
1399 | netif_napi_add(netdev, &p->napi, octeon_mgmt_napi_poll,
| ^~~~~~~~~~~~~~
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: b48b89f9c189 ("net: drop the weight argument from netif_napi_add")
Link: https://lore.kernel.org/r/20221002175650.1491124-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It is to avoid tc retrying during device mode change.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, qos group will be updated and qos will be enabled when
unregistering devlink port. Actually no need to update group if qos
is not enabled.
Add a check to prevent unnecessary enabling and disabling qos for
every port.
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Dmytro Linkin <dlinkin@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before this commit a fwd dest flow table resulted in ignoring vport dests
which is incorrect and is supported.
With this commit the dests can be a mix of flow table and vport dests.
There is still a limitation that there cannot be more than one flow table dest.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, driver sets the same grace period for fw fatal health reporter
to any type of function.
Since the lower level functions are more vulnerable to fw fatal errors as a
result of parent function closure/reload, set a smaller grace period for
the lower level functions, as follows:
1. For ECPF: 180 seconds.
2. For PF: 60 seconds.
3. For VF/SF: 30 seconds.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Start health poll at earlier stage, so if fw fatal issue occurred before
or during initialization commands such as init_hca or set_hca_cap the
poll health can detect and indicate that the driver is already in error
state.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the rx_oversize_pkts_buffer counter to ethtool statistics.
This counter exposes the number of dropped received packets due to
length which arrived to RQ and exceed software buffer size allocated by
the device for incoming traffic. It might imply that the device MTU is
larger than the software buffers size.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When XSK frame size is 3072 (or another power of two multiplied by 3),
KLM mechanism for NIC virtual memory page mapping can be optimized by
replacing it with KSM.
Before this change, two KLM entries were needed to map an XSK frame that
is not a power of two: one entry maps the UMEM memory up to the frame
length, the other maps the rest of the stride to the garbage page.
When the frame length divided by 3 is a power of two, it can be mapped
using 3 KSM entries, and the fourth will map the rest of the stride to
the garbage page. All 4 KSM entries are of the same size, which allows
for a much faster lookup.
Frame size 3072 is useful in certain use cases, because it allows
packing 4 frames into 3 pages. Generally speaking, other frame sizes
equal to PAGE_SIZE minus a power of two can be optimized in a similar
way, but it will require many more KSMs per frame, which slows down UMRs
a little bit, but more importantly may hit the limit for the maximum
number of KSM entries.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On striding RQ, when the XSK frame size doesn't match the MKey page
size, KLM is used for memory mappings, which is a slower mechanism than
MTT or KSM. It may happen in two cases:
1. Frame size is not a power of two (only possible in the unaligned mode
of XSK).
2. Frame size is 2048 bytes, and the firmware doesn't support MKey pages
smaller than 4096 bytes.
Depending on the case, print a warning and recommend to disable striding
RQ or upgrade the firmware.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
XSK RQs support striding RQ linear mode, but the stride size may be
bigger than the XSK frame size, because:
1. The stride size must be a power of two.
2. The stride size must be equal to the UMR page size. Each XSK frame is
treated as a separate page, because they aren't necessarily adjacent in
physical memory, so the driver can't put more than one stride per page.
3. The minimal MTT page size is 4096 on older firmware.
That means that if XSK frame size is 2048 or not a power of two, the
strides may be bigger than XSK frames. Normally, it's not a problem if
the hardware enforces the MTU. However, traffic between vports skips the
hardware MTU check, and oversized packets may be received.
If an oversized packet is bigger than the XSK frame but not bigger than
the stride, it will cause overwriting of the adjacent UMEM region. If
the packet takes more than one stride, they can be recycled for reuse,
so it's not a problem when the XSK frame size matches the stride size.
Work around the above issue by leveraging KLM to make a more
fine-grained mapping. The beginning of each stride is mapped to the
frame memory, and the padding up to the closest power of two is mapped
to the overflow page that doesn't belong to UMEM. This way, application
data corruption won't happen upon receiving packets bigger than MTU.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make mlx5e_mpwrq_mtts_per_wqe take into account that KSM requires
smaller alignment than MTT.
Ensure that there is always an even amount of MTTs in a UMR WQE, so that
complete octwords are formed, and no garbage is mapped.
Drop extra alignment in MLX5_MTT_OCTW that may cause setting too big
ucseg->xlt_octowords, also leading to mapping garbage.
Generalize some calculations by introducing the MLX5_OCTWORD constant.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of passing the unaligned flag, pass an enum that indicates the
UMR mode. The next commit will add the third mode (KLM for certain
configurations of XSK), which will be added to this enum instead of
adding another bool flag everywhere.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
XSK need_wakeup mechanism allows the driver to stop busy waiting for
buffers when the fill ring is empty, yield to the application and signal
it that the driver needs to be waken up after the application refills
the fill ring.
Add protection against the race condition on the RX (refill) side: if
the application refills buffers after xskrq->post_wqes is called, but
before mlx5e_xsk_update_rx_wakeup, NAPI will exit, skipping taking these
buffers to the hardware WQ, and the application won't wake it up again.
Optimize the whole need_wakeup logic, removing unneeded flows, to
compensate for this new check.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|