Age | Commit message (Collapse) | Author |
|
The cs35l56 smart amplifier has some informational firmware controls
that are populated by a tuning bin file to unique values - logging these
during firmware load identifies the specific configuration being used on
that device instance.
Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/2fcc0e6fc5b8669acb026bebe44a4995ac83b967.1747142267.git.simont@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The cs35l56 smart amplifier has some informational firmware controls
that are populated by a tuning bin file to unique values - logging these
during firmware load identifies the specific configuration being used on
that device instance.
Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://patch.msgid.link/47762a5f1ce2b178ad863c6698296aea09b72e10.1747142267.git.simont@opensource.cirrus.com
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver only supports 512 bytes ECC step size and 4 bit ECC strength
at the moment, however it does not reject unsupported step/strength
configurations. Due to this, whenever the driver is used with a flash
chip which needs stronger ECC protection, the following warning is shown
in the kernel log:
[ 0.574648] spi-nand spi0.0: GigaDevice SPI NAND was found.
[ 0.635748] spi-nand spi0.0: 256 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[ 0.649079] nand: WARNING: (null): the ECC used on your system is too weak compared to the one required by the NAND chip
Although the message indicates that something is wrong, but it often gets
unnoticed, which can cause serious problems. For example when the user
writes something into the flash chip despite the warning, the written data
may won't be readable by the bootloader or by the boot ROM. In the worst
case, when the attached SPI NAND chip is the boot device, the board may not
be able to boot anymore.
Also, it is not even possible to create a backup of the flash, because
reading its content results in bogus data. For example, dumping the first
page of the flash gives this:
# hexdump -C -n 2048 /dev/mtd0
00000000 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000040 0f 0f 0f 0f 0f 0f 0f 0d 0f 0f 0f 0f 0f 0f 0f 0f |................|
00000050 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
000001c0 0f 0f 0f 0f ff 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
000001d0 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000200 0f 0f 0f 0f f5 5b ff ff 0f 0f 0f 0f 0f 0f 0f 0f |.....[..........|
00000210 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
000002f0 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 1f 0f 0f |................|
00000300 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
000003c0 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f ff 0f 0f 0f |................|
000003d0 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000400 0f 0f 0f 0f 0f 0f 0f 0f e9 74 c9 06 f5 5b ff ff |.........t...[..|
00000410 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
000005d0 0f 0f 0f 0f ff 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
000005e0 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000600 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f c6 be 0f c3 |................|
00000610 e9 74 c9 06 f5 5b ff ff 0f 0f 0f 0f 0f 0f 0f 0f |.t...[..........|
00000620 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000770 0f 0f 0f 0f 8f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
00000780 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000800
#
Doing the same by using the downstream kernel results in different output:
# hexdump -C -n 2048 /dev/mtd0
00000000 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f 0f |................|
*
00000800
#
This patch adds some sanity checks to the code to prevent using the driver
with unsupported ECC step/strength configurations. After the change, probing
of the driver fails in such cases:
[ 0.655038] spi-nand spi0.0: GigaDevice SPI NAND was found.
[ 0.659159] spi-nand spi0.0: 256 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[ 0.669138] qcom_snand 79b0000.spi: only 4 bits ECC strength is supported
[ 0.677476] nand: No suitable ECC configuration
[ 0.689909] spi-nand spi0.0: probe with driver spi-nand failed with error -95
This helps to avoid the aforementioned hassles until support for 8 bit ECC
strength gets implemented.
Fixes: 7304d1909080 ("spi: spi-qpic: add driver for QCOM SPI NAND flash Interface")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Link: https://patch.msgid.link/20250501-qpic-snand-validate-ecc-v1-1-532776581a66@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The SPI interface is activated before the CPOL setting is applied. In
that moment, the clock idles high and CS goes low. After a short delay,
CPOL and other settings are applied, which may cause the clock to change
state and idle low. This transition is not part of a clock cycle, and it
can confuse the receiving device.
To prevent this unexpected transition, activate the interface while CPOL
and the other settings are being applied.
Signed-off-by: Alessandro Grassi <alessandro.grassi@mailbox.org>
Link: https://patch.msgid.link/20250502095520.13825-1-alessandro.grassi@mailbox.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
When using HDMI PLL frequency division coefficient at 50.25MHz
that is calculated by rk_hdptx_phy_clk_pll_calc(), it fails to
get PHY LANE lock. Although the calculated values are within the
allowable range of PHY PLL configuration.
In order to fix the PHY LANE lock error and provide the expected
50.25MHz output, manually compute the required PHY PLL frequency
division coefficient and add it to ropll_tmds_cfg configuration
table.
Signed-off-by: Algea Cao <algea.cao@rock-chips.com>
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250427095124.3354439-1-algea.cao@rock-chips.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
The kernel test robot reported the following error:
drivers/net/ethernet/freescale/enetc/ntmp.c: In function 'ntmp_fill_request_hdr':
drivers/net/ethernet/freescale/enetc/ntmp.c:203:38: error: implicit
declaration of function 'FIELD_PREP' [-Wimplicit-function-declaration]
203 | cbd->req_hdr.access_method = FIELD_PREP(NTMP_ACCESS_METHOD,
| ^~~~~~~~~~
Therefore, add "bitfield.h" to ntmp_private.h to fix this issue.
Fixes: 4701073c3deb ("net: enetc: add initial netc-lib driver to support NTMP")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505101047.NTMcerZE-lkp@intel.com/
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are wrong "#endif" comments in .h files need to be corrected.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
JH7110 USB 2.0 host fails to detect USB 2.0 devices occasionally. With a
long time of debugging and testing, we found that setting Rx clock gating
control signal to normal power consumption mode can solve this problem.
Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Link: https://lore.kernel.org/r/20250422101244.51686-1-hal.feng@starfivetech.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Because current PWM Kconfig is sorting by symbol name,
it looks strange ordering in menuconfig.
=> [ ] Renesas R-Car PWM support
=> [ ] Renesas TPU PWM support
[ ] Rockchip PWM support
=> [ ] Renesas RZ/G2L General PWM Timer support
=> [ ] Renesas RZ/G2L MTU3a PWM Timer support
Let's use common CONFIG_PWM_RENESAS_xxx symbol name for Renesas,
and sort it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/r/877c2mxrrr.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into pwm/for-next
Renesas ARM defconfig updates for v6.16 (take two)
- Enable modular support for the Renesas RZ/G2L GPT and MSIOF sound in
the ARM64 defconfig,
- Enable more support for RZN1D-DB/EB in shmobile_defconfig.
It is merged into the pwm tree due to the next patch renaming
PWM_RZG2L_GPT which has a use added in the arm64 defconfig.
|
|
The global pseudo-constants 'page_offset_base', 'vmalloc_base' and
'vmemmap_base' are not used extremely early during the boot, and cannot be
used safely until after the KASLR memory randomization code in
kernel_randomize_memory() executes, which may update their values.
So there is no point in setting these variables extremely early, and it
can wait until after the kernel itself is mapped and running from its
permanent virtual mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250513111157.717727-9-ardb+git@google.com
|
|
Warnings generated with 'make W=1':
arch/x86/power/hibernate.c:47: warning: Function parameter or struct member 'pfn' not described in 'pfn_is_nosave'
arch/x86/power/hibernate.c:92: warning: Function parameter or struct member 'max_size' not described in 'arch_hibernation_header_save'
Add missing parameter documentation in hibernate functions to
fix kernel-doc warnings.
Signed-off-by: Shivank Garg <shivankg@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250514062637.3287779-2-shivankg@amd.com
|
|
Building the kernel with W=1 generates the following warning:
arch/x86/mm/pat/memtype.c:692: warning: Function parameter or struct member 'pfn' not described in 'pat_pfn_immune_to_uc_mtrr'
Add missing parameter documentation to fix the kernel-doc warning.
Signed-off-by: Shivank Garg <shivankg@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250514062637.3287779-3-shivankg@amd.com
|
|
In tegra_ahub_probe(), check the result of function
of_device_get_match_data(), return an error code in case it fails.
Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com>
Link: https://patch.msgid.link/20250513123744.3041724-1-ruc_gongyuanjun@163.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Recent patch fixed an old commit
'fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds")'
which is to include building of ppc_save_reg.c only when XMON
and KEXEC_CORE and PPC_BOOK3S are enabled. This was valid, since
ppc_save_regs was called only in replay_system_reset() of old
irq.c which was under BOOK3S.
But there has been multiple refactoring of irq.c and have
added call to ppc_save_regs() from __replay_soft_interrupts
-> replay_soft_interrupts which is part of irq_64.c included
under CONFIG_PPC64. And since ppc_save_regs is called in
CRASH_DUMP path as part of crash_setup_regs in kexec.h,
CONFIG_PPC32 also needs it.
So with this recent patch which enabled the building of
ppc_save_regs.c caused a build break when none of these
(XMON, KEXEC_CORE, BOOK3S) where enabled as part of config.
Patch to enable building of ppc_save_regs.c by defaults.
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250511041111.841158-1-maddy@linux.ibm.com
|
|
A change to QEMU resulted in all nvme controllers (single and
multi-controller subsystems) to have its CMIC.MCTRS bit set which
indicates the subsystem supports multiple controllers and it is possible
a namespace can be shared between those multiple controllers in a
multipath configuration.
When a namespace of a CMIC.MCTRS enabled subsystem is allocated, a
multipath node is created. The queue limits for this node are inherited
from the namespace being allocated. When inheriting queue limits, the
features being inherited need to be specified. The atomic write feature
(BLK_FEAT_ATOMIC_WRITES) was not specified so the atomic queue limits
were not inherited by the multipath disk node which resulted in the sysfs
atomic write attributes being zeroed. The fix is to include
BLK_FEAT_ATOMIC_WRITES in the list of features to be inherited.
Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Prior to this patch, the mark is sanitized (applying the state's mask to
the state's value) only on inserts when checking if a conflicting XFRM
state or policy exists.
We discovered in Cilium that this same sanitization does not occur
in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark
is simply compared to the state's value:
if ((mark & x->mark.m) != x->mark.v)
continue;
Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will
never match any packet.
This commit updates __xfrm_state_insert and xfrm_policy_insert to store
the sanitized marks, thus removing this footgun.
This has the side effect of changing the ip output, as the
returned mark will have the mask applied to it when printed.
Fixes: 3d6acfa7641f ("xfrm: SA lookups with mark")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Co-developed-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
The name is Mimi Phuong-Thao Vo.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241110162139.5179-2-thorsten.blum@linux.dev
|
|
Now that neither crc16_table nor crc16_byte() is used outside
lib/crc16.c, fold them into lib/crc16.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250513022115.39109-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
Instead of looping through each byte and calling crc16_byte(), instead
just call crc16() on the whole buffer. No functional change.
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250513022115.39109-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
When running the following code on an ext4 filesystem with inline_data
feature enabled, it will lead to the bug below.
fd = open("file1", O_RDWR | O_CREAT | O_TRUNC, 0666);
ftruncate(fd, 30);
pwrite(fd, "a", 1, (1UL << 40) + 5UL);
That happens because write_begin will succeed as when
ext4_generic_write_inline_data calls ext4_prepare_inline_data, pos + len
will be truncated, leading to ext4_prepare_inline_data parameter to be 6
instead of 0x10000000006.
Then, later when write_end is called, we hit:
BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);
at ext4_write_inline_data.
Fix it by using a loff_t type for the len parameter in
ext4_prepare_inline_data instead of an unsigned int.
[ 44.545164] ------------[ cut here ]------------
[ 44.545530] kernel BUG at fs/ext4/inline.c:240!
[ 44.545834] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[ 44.546172] CPU: 3 UID: 0 PID: 343 Comm: test Not tainted 6.15.0-rc2-00003-g9080916f4863 #45 PREEMPT(full) 112853fcebfdb93254270a7959841d2c6aa2c8bb
[ 44.546523] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 44.546523] RIP: 0010:ext4_write_inline_data+0xfe/0x100
[ 44.546523] Code: 3c 0e 48 83 c7 48 48 89 de 5b 41 5c 41 5d 41 5e 41 5f 5d e9 e4 fa 43 01 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 0b <0f> 0b 0f 1f 44 00 00 55 41 57 41 56 41 55 41 54 53 48 83 ec 20 49
[ 44.546523] RSP: 0018:ffffb342008b79a8 EFLAGS: 00010216
[ 44.546523] RAX: 0000000000000001 RBX: ffff9329c579c000 RCX: 0000010000000006
[ 44.546523] RDX: 000000000000003c RSI: ffffb342008b79f0 RDI: ffff9329c158e738
[ 44.546523] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[ 44.546523] R10: 00007ffffffff000 R11: ffffffff9bd0d910 R12: 0000006210000000
[ 44.546523] R13: fffffc7e4015e700 R14: 0000010000000005 R15: ffff9329c158e738
[ 44.546523] FS: 00007f4299934740(0000) GS:ffff932a60179000(0000) knlGS:0000000000000000
[ 44.546523] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.546523] CR2: 00007f4299a1ec90 CR3: 0000000002886002 CR4: 0000000000770eb0
[ 44.546523] PKRU: 55555554
[ 44.546523] Call Trace:
[ 44.546523] <TASK>
[ 44.546523] ext4_write_inline_data_end+0x126/0x2d0
[ 44.546523] generic_perform_write+0x17e/0x270
[ 44.546523] ext4_buffered_write_iter+0xc8/0x170
[ 44.546523] vfs_write+0x2be/0x3e0
[ 44.546523] __x64_sys_pwrite64+0x6d/0xc0
[ 44.546523] do_syscall_64+0x6a/0xf0
[ 44.546523] ? __wake_up+0x89/0xb0
[ 44.546523] ? xas_find+0x72/0x1c0
[ 44.546523] ? next_uptodate_folio+0x317/0x330
[ 44.546523] ? set_pte_range+0x1a6/0x270
[ 44.546523] ? filemap_map_pages+0x6ee/0x840
[ 44.546523] ? ext4_setattr+0x2fa/0x750
[ 44.546523] ? do_pte_missing+0x128/0xf70
[ 44.546523] ? security_inode_post_setattr+0x3e/0xd0
[ 44.546523] ? ___pte_offset_map+0x19/0x100
[ 44.546523] ? handle_mm_fault+0x721/0xa10
[ 44.546523] ? do_user_addr_fault+0x197/0x730
[ 44.546523] ? do_syscall_64+0x76/0xf0
[ 44.546523] ? arch_exit_to_user_mode_prepare+0x1e/0x60
[ 44.546523] ? irqentry_exit_to_user_mode+0x79/0x90
[ 44.546523] entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ 44.546523] RIP: 0033:0x7f42999c6687
[ 44.546523] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[ 44.546523] RSP: 002b:00007ffeae4a7930 EFLAGS: 00000202 ORIG_RAX: 0000000000000012
[ 44.546523] RAX: ffffffffffffffda RBX: 00007f4299934740 RCX: 00007f42999c6687
[ 44.546523] RDX: 0000000000000001 RSI: 000055ea6149200f RDI: 0000000000000003
[ 44.546523] RBP: 00007ffeae4a79a0 R08: 0000000000000000 R09: 0000000000000000
[ 44.546523] R10: 0000010000000005 R11: 0000000000000202 R12: 0000000000000000
[ 44.546523] R13: 00007ffeae4a7ac8 R14: 00007f4299b86000 R15: 000055ea61493dd8
[ 44.546523] </TASK>
[ 44.546523] Modules linked in:
[ 44.568501] ---[ end trace 0000000000000000 ]---
[ 44.568889] RIP: 0010:ext4_write_inline_data+0xfe/0x100
[ 44.569328] Code: 3c 0e 48 83 c7 48 48 89 de 5b 41 5c 41 5d 41 5e 41 5f 5d e9 e4 fa 43 01 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 0b <0f> 0b 0f 1f 44 00 00 55 41 57 41 56 41 55 41 54 53 48 83 ec 20 49
[ 44.570931] RSP: 0018:ffffb342008b79a8 EFLAGS: 00010216
[ 44.571356] RAX: 0000000000000001 RBX: ffff9329c579c000 RCX: 0000010000000006
[ 44.571959] RDX: 000000000000003c RSI: ffffb342008b79f0 RDI: ffff9329c158e738
[ 44.572571] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[ 44.573148] R10: 00007ffffffff000 R11: ffffffff9bd0d910 R12: 0000006210000000
[ 44.573748] R13: fffffc7e4015e700 R14: 0000010000000005 R15: ffff9329c158e738
[ 44.574335] FS: 00007f4299934740(0000) GS:ffff932a60179000(0000) knlGS:0000000000000000
[ 44.575027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.575520] CR2: 00007f4299a1ec90 CR3: 0000000002886002 CR4: 0000000000770eb0
[ 44.576112] PKRU: 55555554
[ 44.576338] Kernel panic - not syncing: Fatal exception
[ 44.576517] Kernel Offset: 0x1a600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Reported-by: syzbot+fe2a25dae02a207717a0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=fe2a25dae02a207717a0
Fixes: f19d5870cbf7 ("ext4: add normal write support for inline data")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://patch.msgid.link/20250415-ext4-prepare-inline-overflow-v1-1-f4c13d900967@igalia.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory
allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is
not freed. Fix that by jumping to the error path that frees them, by
calling qlcnic_free_mbx_args(). This was found using static analysis.
Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation")
Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The functionality of mdiobus_register_board_info() typically isn't
optional for the caller. Therefore remove the stub.
Note: Currently we have only one caller of mdiobus_register_board_info(),
in a DSA/PHYLINK context. Therefore CONFIG_MDIO_DEVICE is selected anyway.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/410a2222-c4e8-45b0-9091-d49674caeb00@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New timestamping API was introduced in commit 66f7223039c0 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the mlxsw driver to the new API, so that the
ndo_eth_ioctl() path can be removed completely.
The UAPI is still ioctl-only, but it's best to remove the "ioctl"
mentions from the driver in case a netlink variant appears.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250512154411.848614-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It can't vary, stop storing the same magic number everywhere.
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Alex Elder <elder@kernel.org>
Link: https://patch.msgid.link/20250512-topic-ipa_smem-v1-1-302679514a0d@oss.qualcomm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The first paragraph makes no grammatical sense. I suppose a portion of
the intended sentece is missing: "[The challenge with ] stacked PHCs
(...) is that they uncover bugs".
Rephrase, and at the same time simplify the structure of the sentence a
little bit, it is not easy to follow.
Fixes: 94d9f78f4d64 ("docs: networking: timestamping: add section for stacked PHC devices")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://patch.msgid.link/20250512131751.320283-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New timestamping API was introduced in commit 66f7223039c0 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the ENETC driver to the new API, so that the
ndo_eth_ioctl() path can be removed completely.
Move the enetc_hwtstamp_get() and enetc_hwtstamp_set() calls away from
enetc_ioctl() to dedicated net_device_ops for the LS1028A PF and VF
(NETC v4 does not yet implement enetc_ioctl()), adapt the prototypes and
export these symbols (enetc_ioctl() is also exported).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20250512112402.4100618-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For unknown reasons, sometimes the value of MISC interrupt is 0 in the
IRQ handle function. In this case, wx_intr_enable() is also should be
invoked to clear the interrupt. Otherwise, the next interrupt would
never be reported.
Fixes: a9843689e2de ("net: txgbe: add sriov function support")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/F4F708403CE7090B+20250512100652.139510-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently we are requiring AlwaysRefCounted in most trait bounds for gem
objects, and implementing it by hand for our only current type of gem
object. However, all gem objects use the same functions for reference
counting - and all gem objects support reference counting.
We're planning on adding support for shmem gem objects, let's move this
around a bit by instead making IntoGEMObject require AlwaysRefCounted as a
trait bound, and then provide a blanket AlwaysRefCounted implementation for
any object that implements IntoGEMObject so all gem object types can use
the same AlwaysRefCounted implementation. This also makes things less
verbose by making the AlwaysRefCounted trait bound implicit for any
IntoGEMObject bound.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250513221046.903358-5-lyude@redhat.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
There's a few changes here:
* The rename, of course (this should also let us drop the clippy annotation
here)
* Return *mut bindings::drm_gem_object instead of
&Opaque<bindings::drm_gem_object> - the latter doesn't really have any
benefit and just results in conversion from the rust type to the C type
having to be more verbose than necessary.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250513221046.903358-4-lyude@redhat.com
[ Fixup s/into_gem_obj()/as_raw()/ in safety comment. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
MACsec offload is not supported in switchdev mode for uplink
representors. When switching to the uplink representor profile, the
MACsec offload feature must be cleared from the netdevice's features.
If left enabled, attempts to add offloads result in a null pointer
dereference, as the uplink representor does not support MACsec offload
even though the feature bit remains set.
Clear NETIF_F_HW_MACSEC in mlx5e_fix_uplink_rep_features().
Kernel log:
Oops: general protection fault, probably for non-canonical address 0xdffffc000000000f: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000078-0x000000000000007f]
CPU: 29 UID: 0 PID: 4714 Comm: ip Not tainted 6.14.0-rc4_for_upstream_debug_2025_03_02_17_35 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__mutex_lock+0x128/0x1dd0
Code: d0 7c 08 84 d2 0f 85 ad 15 00 00 8b 35 91 5c fe 03 85 f6 75 29 49 8d 7e 60 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 a6 15 00 00 4d 3b 76 60 0f 85 fd 0b 00 00 65 ff
RSP: 0018:ffff888147a4f160 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001
RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000078
RBP: ffff888147a4f2e0 R08: ffffffffa05d2c19 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000018 R15: ffff888152de0000
FS: 00007f855e27d800(0000) GS:ffff88881ee80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004e5768 CR3: 000000013ae7c005 CR4: 0000000000372eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
Call Trace:
<TASK>
? die_addr+0x3d/0xa0
? exc_general_protection+0x144/0x220
? asm_exc_general_protection+0x22/0x30
? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
? __mutex_lock+0x128/0x1dd0
? lockdep_set_lock_cmp_fn+0x190/0x190
? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
? mutex_lock_io_nested+0x1ae0/0x1ae0
? lock_acquire+0x1c2/0x530
? macsec_upd_offload+0x145/0x380
? lockdep_hardirqs_on_prepare+0x400/0x400
? kasan_save_stack+0x30/0x40
? kasan_save_stack+0x20/0x40
? kasan_save_track+0x10/0x30
? __kasan_kmalloc+0x77/0x90
? __kmalloc_noprof+0x249/0x6b0
? genl_family_rcv_msg_attrs_parse.constprop.0+0xb5/0x240
? mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
mlx5e_macsec_add_secy+0xf9/0x700 [mlx5_core]
? mlx5e_macsec_add_rxsa+0x11a0/0x11a0 [mlx5_core]
macsec_update_offload+0x26c/0x820
? macsec_set_mac_address+0x4b0/0x4b0
? lockdep_hardirqs_on_prepare+0x284/0x400
? _raw_spin_unlock_irqrestore+0x47/0x50
macsec_upd_offload+0x2c8/0x380
? macsec_update_offload+0x820/0x820
? __nla_parse+0x22/0x30
? genl_family_rcv_msg_attrs_parse.constprop.0+0x15e/0x240
genl_family_rcv_msg_doit+0x1cc/0x2a0
? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
? cap_capable+0xd4/0x330
genl_rcv_msg+0x3ea/0x670
? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
? lockdep_set_lock_cmp_fn+0x190/0x190
? macsec_update_offload+0x820/0x820
netlink_rcv_skb+0x12b/0x390
? genl_family_rcv_msg_dumpit+0x2a0/0x2a0
? netlink_ack+0xd80/0xd80
? rwsem_down_read_slowpath+0xf90/0xf90
? netlink_deliver_tap+0xcd/0xac0
? netlink_deliver_tap+0x155/0xac0
? _copy_from_iter+0x1bb/0x12c0
genl_rcv+0x24/0x40
netlink_unicast+0x440/0x700
? netlink_attachskb+0x760/0x760
? lock_acquire+0x1c2/0x530
? __might_fault+0xbb/0x170
netlink_sendmsg+0x749/0xc10
? netlink_unicast+0x700/0x700
? __might_fault+0xbb/0x170
? netlink_unicast+0x700/0x700
__sock_sendmsg+0xc5/0x190
____sys_sendmsg+0x53f/0x760
? import_iovec+0x7/0x10
? kernel_sendmsg+0x30/0x30
? __copy_msghdr+0x3c0/0x3c0
? filter_irq_stacks+0x90/0x90
? stack_depot_save_flags+0x28/0xa30
___sys_sendmsg+0xeb/0x170
? kasan_save_stack+0x30/0x40
? copy_msghdr_from_user+0x110/0x110
? do_syscall_64+0x6d/0x140
? lock_acquire+0x1c2/0x530
? __virt_addr_valid+0x116/0x3b0
? __virt_addr_valid+0x1da/0x3b0
? lock_downgrade+0x680/0x680
? __delete_object+0x21/0x50
__sys_sendmsg+0xf7/0x180
? __sys_sendmsg_sock+0x20/0x20
? kmem_cache_free+0x14c/0x4e0
? __x64_sys_close+0x78/0xd0
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f855e113367
Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007ffd15e90c88 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f855e113367
RDX: 0000000000000000 RSI: 00007ffd15e90cf0 RDI: 0000000000000004
RBP: 00007ffd15e90dbc R08: 0000000000000028 R09: 000000000045d100
R10: 00007f855e011dd8 R11: 0000000000000246 R12: 0000000000000019
R13: 0000000067c6b785 R14: 00000000004a1e80 R15: 0000000000000000
</TASK>
Modules linked in: 8021q garp mrp sch_ingress openvswitch nsh mlx5_ib mlx5_fwctl mlx5_dpll mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core]
---[ end trace 0000000000000000 ]---
Fixes: 8ff0ac5be144 ("net/mlx5: Add MACsec offload Tx command support")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1746958552-561295-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
net/mlx5: HWS, Complex Matchers and rehash mechanism fixes
Motivation:
----------
A matcher can match a certain set of match parameters. However,
the number and size of match params for a single matcher are
limited — all the parameters must fit within a single definer.
A common example of this limitation is IPv6 address matching, where
matching both source and destination IPs requires more bits than a
single definer can support.
SW Steering addresses this limitation by chaining multiple Steering
Table Entries (STEs) within the same matcher, where each STE matches
on a subset of the parameters.
In HW Steering, such chaining is not possible — the matcher's STEs
are managed in a hash table, and a single definer is used to calculate
the hash index for STEs.
Overview:
--------
To address this limitation in HW Steering, we introduce
*Complex Matchers*, which consist of two chained matchers. This allows
matching on twice as many parameters. Complex Matchers are filled with
*Complex Rules* — rules that are split into two parts and inserted into
their respective matchers.
The first half of the Complex Matcher is a regular matcher and points
to the second half, which is an *Isolated Matcher*. An Isolated Matcher
has its own isolated table and is accessible only by traffic coming
from the first half of the Complex Matcher.
This splitting of matchers/rules into multiple parts is transparent to
users. It is hidden behind the BWC HWS API. It becomes visible only
when dumping steering debug information, where the Complex Matcher
appears as two separate matchers: one in the user-created table and
another in its isolated table.
Implementation Details:
----------------------
All user actions are performed on the second part of the rules only.
The first part handles matching and applies two actions: modify header
(set metadata, see details below) and go-to-table (directing traffic
to the isolated table containing the isolated matcher).
Rule updates (updating rule actions) are applied to the second part
of the rule since user-provided actions are not executed in the first
matcher.
We use REG_C_6 metadata register to set and match on unique per-rule
tag (see details below).
Splitting rules into two parts introduces new challenges:
1. Invalid Combinations
Consider two rules with different matching values:
- Rule 1: A+B
- Rule 2: C+D
Let's split the rules into two parts as follows:
|-----Complex Matcher-------|
| |
| 1st matcher 2nd matcher |
| |---| |---| |
| | A | | B | |
| |---| -----> |---| |
| | C | | D | |
| |---| |---| |
| |
|---------------------------|
Splitting these rules results in invalid combinations: A+D and C+B:
any packet that matched on A will be forwarded to the 2nd matcher,
where it will try to match on B (which is legal, and it is what the
user asked for), but it will also try to match on D (which is not
what the user asked for). To resolve this, we assign unique tags
to each rule on the first matcher and match on these tags on the
second matcher:
|----------| |---------|
| A | | B, TagA |
| action: | | |
| set TagA | | |
|----------| --> |---------|
| C | | D, TagB |
| action: | | |
| set TagB | | |
|----------| |---------|
2. Duplicated Entries:
Consider two rules with overlapping values:
- Rule 1: A+B
- Rule 2: A+D
Let's split the rules into two parts as follows:
|---| |---|
| A | | B |
|---| --> |---|
| | | D |
|---| |---|
This leads to the duplicated entries on the first matcher, which HWS
doesn't allow: subsequent delete of either of the rules will delete
the only entry in the first matcher, leaving the remaining rule
broken. To address this, we use a reference count for entries in the
first matcher and delete STEs only when their refcount reaches zero.
Both challenges are resolved by having a per-matcher data structure
(implemented with rhashtable) that manages refcounts for the first part
of the rules and holds unique tags (managed via IDA) for these rules to
set and to match on the second matcher.
Limitations:
-----------
We utilize metadata register REG_C_6 in this implementation, so its
usage anywhere along the flow that might include the need for Complex
Matcher is prohibited.
The number and size of match parameters remain limited — now
constrained by what can be represented by two definers instead of one.
This architectural limitation arises from the structure of Complex
Matchers. If future requirements demand more parameters, Complex
Matchers can be extended beyond two matchers.
Additionally, there is an implementation limit of 32 match parameters
per matcher (disregarding parameter size). This limit can be lifted
if needed.
Patches:
-------
- Patches 1-3: small additions/refactoring in preparation for
Complex Matcher: exposed mlx5hws_table_ft_set_next_ft() in header,
added definer function to convert field name enum to string,
expose the polling function mlx5hws_bwc_queue_poll() in a header.
- Patch 4: in preparation for Complex Matcher, this patch adds
support for Isolated Matcher.
- Patch 5: the main patch - Complex Matchers implementation.
[2]
Patch 6: fixing the usecase where rule insertion was failing,
but rehash couldn't be initiated if the number of rules in
the table is below the rehash threshold.
Patch 7: fixing the usecase where many rules in parallel
would require rehash, due to the way the counting of rules
was done.
Patch 8: fixing the case where rules were requiring action
template extension in parallel, leading to unneeded extensions
with the same templates.
Patch 9: refactor and simplify the rehash loop.
Patch 10: dump error completion details, which helps a lot
in trying to understand what went wrong, especially during
rehash.
====================
Link: https://patch.msgid.link/1746992290-568936-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Failing to insert/delete a rule should not happen. If it does happen,
it would be good to know at which stage it happened and what was the
failure. This patch adds printing of bad CQE details.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-11-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reworking the rehash loop - simplifying the code and making it less
error prone:
- Instead of doing round-robin on all the queues with batch of rules in
each cycle, just go over all the queues and move all the rules that
belong to this queue.
- If at some stage of moving the rule we get a failure (which should
not happen), this can't be rolled back. So instead of aborting
rehash and leaving the matcher in a broken state, allow the loop
to continue: attempt to move the rest of the rules and delete the
old matcher. A rule that failed to move to a new matcher will loose
its match STE once the rehash is completed and the old matcher is
deleted, so the rule won't match any traffic any more. This rule's
packets will fall back to the steering pipeline w/o HW offload.
Rehash procedure will return an error, which will cause the rule
insertion to fail for the rule that started this whole rehash.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-10-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a rule is inserted into a matcher, we search for the suitable
action template. If such template is not found, action template array
is extended with the new template. However, when several threads are
performing this in parallel, there is a race - we can end up with
extending the action templates array with the same template.
This patch is doing the following:
- refactor the code to find action template index in rule create and
update, have the common code in an auxiliary function
- after locking all the queues, check again if the action template
array still needs to be extended
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-9-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the counter that counts number of rules in a matcher is
increased only when rule insertion is completed. In a multi-threaded
usecase this can lead to a scenario that many rules can be in process
of insertion in the same matcher, while none of them has completed
the insertion and the rule counter is not updated. This results in
a rule insertion failure for many of them at first attempt, which
leads to all of them requiring rehash and requiring locking of all
the queue locks.
This patch fixes the case by increasing the rule counter in the
beginning of insertion process and decreasing in case of any failure.
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-8-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rules are inserted into hash table in accordance with their hash index.
When a certain number of rules is reached, the table is rehashed:
a bigger new table is allocated and all the rules are moved there.
But sometimes a new rule can't be inserted into the hash table
because its index is full, even though the number of rules in the
table is well below the threshold. The hash function is not perfect,
so such cases are not rare. When that happens, we want to do the same
rehash, in order to increase the table size and lower the probability
for such cases.
This patch fixes the usecase where rule insertion was failing, but
rehash couldn't be initiated due to low number of rules: it adds flag
that denotes that rehash is required, even if the number of rules in
the table is below the rehash threshold.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-7-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds support for Complex Matchers/Rules
Overview:
--------
A matcher can match on a certain set of match parameters. However, the
number and size of match params for a single matcher are limited: all
the parameters must fit within a single definer.
A common example of this limitation is IPv6 address matching, where
matching both source and destination IPs requires more bits than a
single definer can support.
SW Steering addresses this limitation by chaining multiple Steering
Table Entries (STEs) within the same matcher, where each STE matches
on a subset of the parameters.
In HW Steering, such chaining is not possible — the matcher's STEs
are managed in a hash table, and a single definer is used to calculate
the hash index for STEs.
To address this limitation in HW Steering, we introduce Complex
Matchers, which consist of two chained matchers. This allows matching
on twice as many parameters. Complex Matchers are filled with Complex
Rules — rules that are split into two parts and inserted into their
respective matchers.
The first half of the Complex Matcher is a regular matcher and points
to the second half, which is an Isolated Matcher. An Isolated Matcher
has its own isolated table and is accessible only by traffic coming
from the first half of the Complex Matcher.
This splitting of matchers/rules into multiple parts is transparent to
users. It is hidden under the BWC HWS API. It becomes visible only when
dumping steering debug information, where the Complex Matcher appears
as two separate matchers: one in the user-created table and another
in its isolated table.
Some implementation details:
---------------------------
All user actions are performed on the second part of the rules only.
The first part handles matching and applies two actions: modify header
(set metadata, see details below) and go-to-table (directing traffic to
the isolated table containing the isolated matcher).
Rule updates (updating rule actions) are applied to the second part of
the rule since user-provided actions are not executed in the first
matcher.
We use REG_C_6 metadata register to set and match on unique per-rule
tag (see details below).
Splitting rules into two parts introduces new challenges:
1. Invalid Combinations
Consider two rules with different matching values:
- Rule 1: A+B
- Rule 2: C+D
Let's split the rules into two parts as follows:
|---| |---|
| A | | B |
|---| --> |---|
| C | | D |
|---| |---|
Splitting these rules results in invalid combinations like A+D
and C+B.
To resolve this, we assign unique tags to each rule on the first
matcher and match these tags on the second matcher (the tag is
implemented through modify_hdr action that sets value to metadata
register REG_C_6):
|----------| |---------|
| A | | B, TagA |
| action: | | |
| set TagA | | |
|----------| --> |---------|
| C | | D, TagB |
| action: | | |
| set TagB | | |
|----------| |---------|
2. Duplicated Entries:
Consider two rules with overlapping values:
- Rule 1: A+B
- Rule 2: A+D
Let's split the rules into two parts as follows:
|---| |---|
| A | | B |
|---| --> |---|
| | | D |
|---| |---|
This leads to the duplicated entries on the first matcher, which HWS
doesn't allow: subsequent delete of either of the rules will delete
the only entry in the first matcher, leaving the remaining rule
broken.
To address this, we use a reference count for entries in the first
matcher and delete STEs only when their refcount reaches zero.
Both challenges are resolved by having a per-matcher data structure
(implemented with rhashtable) that manages refcounts for the first part
of the rules and holds unique tags (managed via IDA) for these rules to
set and to match on the second matcher.
Limitations:
-----------
We utilize metadata register REG_C_6 in this implementation, so its
usage anywhere along the steering of the flow that might include the
need for Complex Matcher is prohibited.
The number and size of match parameters remain limited — now it is
constrained by what can be represented by two definers instead of one.
This architectural limitation arises from the structure of Complex
Matchers. If future requirements demand more parameters,
Complex Matchers can be extended beyond two matchers.
Additionally, there is an implementation limit of 32 match parameters
per rule (disregarding parameter size). This limit can be lifted if
needed.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for complex matcher support, introduce the isolated
matcher.
Isolated matcher is a matcher that has its own isolated table.
It is used as the second half of the complex matcher: when the rule
is split into two parts (complex rule), then matching on the first
part will send the packet to the isolated matcher that will try to
match on the second part. In case of miss, the packet goes back to
the matcher's end flow table.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for complex matcher, expose the function that is
polling queue for completion (mlx5hws_bwc_queue_poll) in header
file, so that it will be used by complex matcher code.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for complex matcher support, add function for
converting definer fname to str, which will be used in following
patches.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for complex matcher support, make function
mlx5hws_table_ft_set_next_ft() non-static and expose it in header.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There's a few issues with this function, mainly:
* This function -probably- should have been unsafe from the start. Pointers
are not always necessarily valid, but you want a function that does
field-projection for a pointer that can travel outside of the original
struct to be unsafe, at least if I understand properly.
* *mut Self is not terribly useful in this context, the majority of uses of
from_gem_obj() grab a *mut Self and then immediately convert it into a
&'a Self. It also goes against the ffi conventions we've set in the rest
of the kernel thus far.
* from_gem_obj() also doesn't follow the naming conventions in the rest of
the DRM bindings at the moment, as_ref() would be a better name.
So, let's:
* Make from_gem_obj() unsafe
* Convert it to return &'a Self
* Rename it to as_ref()
* Update all call locations
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250513221046.903358-3-lyude@redhat.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
There is usually not much of a reason to use a raw pointer in a data
struct, so move this to NonNull instead.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250513221046.903358-2-lyude@redhat.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Improve the installation procedure for the systemd service unit
'cpupower.service', to be more distro-agnostic. Do not install the
service unit configuration file to /etc/default/ (a directory that
is used by Debian and Debian-derivatives and only rarely by other
distros).
Also, clarify the role of the configuration file in its own comments.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#ma8a3fa80acc4036af6c754e8ecabacc55b288ad1
Link: https://lore.kernel.org/r/20250513163937.61062-5-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix the installation procedure for the systemd service unit
'cpupower.service'. Do not call "systemctl daemon-reload" in the
Makefile, but explain when this command should be manually issued
in the README file.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#mfbb938f9c0d5a21173acb92a061eb9205fd0abfe
Link: https://lore.kernel.org/r/20250513163937.61062-4-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix the use of DESTDIR variable in the Makefile, as far as the
installation of the systemd service unit 'cpupower.service' is
concerned.
This was caused by a misunderstanding about the purpose of the DESTDIR
variable in the Makefile, which is instead meant to support staged
installations: its value should not end up into installed file contents.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#mfbb938f9c0d5a21173acb92a061eb9205fd0abfe
Link: https://www.gnu.org/prep/standards/html_node/DESTDIR.html
Link: https://lore.kernel.org/r/20250513163937.61062-3-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
These tests:
"SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes"
"SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes"
output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)".
They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data
have been received by the other side. However, sometimes there is a delay
in updating this "unsent bytes" counter, and the test fails even though
the counter properly goes to 0 several milliseconds later.
The delay occurs in the kernel because the used buffer notification
callback virtio_vsock_tx_done(), called upon receipt of the data by the
other side, doesn't update the counter itself. It delegates that to
a kernel thread (via vsock->tx_work). Sometimes that thread is delayed
more than the test expects.
Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs.
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test")
Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit ce6cb8113c84 ("tools: ynl-gen: individually free previous
values on double set"), specifying the "multi-attr" property raises an
error unless the "nested-attributes" property is specified as well:
File "tools/net/ynl/./pyynl/ynl_gen_c.py", line 1147, in _load_nested_sets
child = self.pure_nested_structs.get(nested)
^^^^^^
UnboundLocalError: cannot access local variable 'nested' where it is not associated with a value
This appears to be a bug since there are existing specs which omit
"nested-attributes" on "multi-attr" attributes. Also, according to
Documentation/userspace-api/netlink/specs.rst, multi-attr "is the
recommended way of implementing arrays (no extra nesting)", suggesting
that nesting should even be avoided in favor of multi-attr.
Fix the indentation of the if-block introduced by the commit to avoid
the error.
Fixes: ce6cb8113c84 ("tools: ynl-gen: individually free previous values on double set")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/d6b58684b7e5bfb628f7313e6893d0097904e1d1.1746940107.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix several build errors when CONFIG_MODULES=n, including the following:
../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module'
195 | for (int i = 0; i < mod->its_num_pages; i++) {
Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|