Age | Commit message (Collapse) | Author |
|
UDP-encapsulated ESP data paths'
Mike Yu says:
====================
Currently, IPsec crypto offload is enabled for GRO code path. However, there
are other code paths where the XFRM stack is involved; for example, IPv6 ESP
packets handled by xfrm6_esp_rcv() in ESP layer, and IPv4 UDP-encapsulated
ESP packets handled by udp_rcv() in UDP layer.
This patchset extends the crypto offload support to cover these two cases.
This is useful for devices with traffic accounting (e.g., Android), where GRO
can lead to inaccurate accounting on the underlying network. For example, VPN
traffic might not be counted on the wifi network interface wlan0 if the packets
are handled in GRO code path before entering the network stack for accounting.
Below is the RX data path scenario the crypto offload can be applied to.
+-----------+ +-------+
| HW Driver |-->| wlan0 |--------+
+-----------+ +-------+ |
v
+---------------+ +------+
+------>| Network Stack |-->| Apps |
| +---------------+ +------+
| |
| v
+--------+ +------------+
| ipsec1 |<--| XFRM Stack |
+--------+ +------------+
====================
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
When updating the callbacks, adding their kernel-doc was forgotten. Add
it now.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/r/20240712165527.75e4ddc9@canb.auug.org.au
Fixes: a93c2e5fe766 ("i2c: reword i2c_algorithm according to newest specification")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow
This release includes significant updates, with the primary
change being the renaming from "master/slave" to
"controller/target" to adhere to I2C v7 and SMBus 3.2 standards.
New Support:
- Added support for Intel Arrow Lake-H.
- Added I2C support in the Arioha SoC by linking the Mediatek
I2C controller.
Cleanups:
- Added the MODULE_DESCRIPTION() macro, resolving a modpost
warning in the ALi 1563 Southbridge driver.
- Constified the regmap_config declaration in the i2c-designware
driver.
- Improved the coding style in the Renesas R-Car driver by
removing unnecessary semicolons after brackets.
General improvements:
- In the OMAP device, replaced NOIRQ_SYSTEM_SLEEP_PM_OPS with
RUNTIME_PM_OPS to enable waking up the controller during
suspend() before suspend_noirq() kicks in.
- Improved logging in the Xilinx driver.
- Added a warning (WARN()) in the Renesas R-Car driver for
spurious interrupts.
DTS Changes:
- Removed address-cell and size-cell from the Atmel at91sam,
nVidia Tegra 20, and Samsung S3c2410 devices.
- Fixed Texas Instruments OMAP4 I2C controller to comply with
the i2c-controller.yaml schema.
- Improved indentation in DTS examples for several I2C devices.
- Converted the NXP LPC1788 binding to the dt-schema.
- Added documentation for the compatible string
thead,th1520-i2c.
- Added the "power-domains" property for the Meson I2C driver.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
This tag includes three fixes for the Renesas R-Car driver:
1. Ensures the device is in a known state after probing.
2. Allows clearing the NO_RXDMA flag after a reset.
3. Forces a reset before any transfer on Gen3+ platforms to
prevent disruption of the configuration during parallel
transfers.
|
|
Describe the pool argument in the kernel-doc comment for
swiotlb_del_transient.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
idpf: XDP chapter I: convert Rx to libeth
Alexander Lobakin says:
XDP for idpf is currently 5 chapters:
* convert Rx to libeth (this);
* convert Tx and stats to libeth;
* generic XDP and XSk code changes, libeth_xdp;
* actual XDP for idpf via libeth_xdp;
* XSk for idpf (^).
Part I does the following:
* splits &idpf_queue into 4 (RQ, SQ, FQ, CQ) and puts them on a diet;
* ensures optimal cacheline placement, strictly asserts CL sizes;
* moves currently unused/dead singleq mode out of line;
* reuses libeth's Rx ptype definitions and helpers;
* uses libeth's Rx buffer management for both header and payload;
* eliminates memcpy()s and coherent DMA uses on hotpath, uses
napi_build_skb() instead of in-place short skb allocation.
Most idpf patches, except for the queue split, removes more lines
than adds.
Expect far better memory utilization and +5-8% on Rx depending on
the case (+17% on skb XDP_DROP :>).
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
idpf: use libeth Rx buffer management for payload buffer
idpf: convert header split mode to libeth + napi_build_skb()
libeth: support different types of buffers for Rx
idpf: remove legacy Page Pool Ethtool stats
idpf: reuse libeth's definitions of parsed ptype structures
idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ
idpf: merge singleq and splitq &net_device_ops
idpf: strictly assert cachelines of queue and queue vector structures
idpf: avoid bloating &idpf_q_vector with big %NR_CPUS
idpf: split &idpf_queue into 4 strictly-typed queue structures
idpf: stop using macros for accessing queue descriptors
libeth: add cacheline / struct layout assertion helpers
page_pool: use __cacheline_group_{begin, end}_aligned()
cache: add __cacheline_group_{begin, end}_aligned() (+ couple more)
====================
Link: https://patch.msgid.link/20240710203031.188081-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-07-12
We've added 23 non-merge commits during the last 3 day(s) which contain
a total of 18 files changed, 234 insertions(+), 243 deletions(-).
The main changes are:
1) Improve BPF verifier by utilizing overflow.h helpers to check
for overflows, from Shung-Hsi Yu.
2) Fix NULL pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT
when attr->attach_prog_fd was not specified, from Tengda Wu.
3) Fix arm64 BPF JIT when generating code for BPF trampolines with
BPF_TRAMP_F_CALL_ORIG which corrupted upper address bits,
from Puranjay Mohan.
4) Remove test_run callback from lwt_seg6local_prog_ops which never worked
in the first place and caused syzbot reports,
from Sebastian Andrzej Siewior.
5) Relax BPF verifier to accept non-zero offset on KF_TRUSTED_ARGS/
/KF_RCU-typed BPF kfuncs, from Matt Bobrowski.
6) Fix a long standing bug in libbpf with regards to handling of BPF
skeleton's forward and backward compatibility, from Andrii Nakryiko.
7) Annotate btf_{seq,snprintf}_show functions with __printf,
from Alan Maguire.
8) BPF selftest improvements to reuse common network helpers in sk_lookup
test and dropping the open-coded inetaddr_len() and make_socket() ones,
from Geliang Tang.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
selftests/bpf: Test for null-pointer-deref bugfix in resolve_prog_type()
bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT
selftests/bpf: DENYLIST.aarch64: Skip fexit_sleep again
bpf: use check_sub_overflow() to check for subtraction overflows
bpf: use check_add_overflow() to check for addition overflows
bpf: fix overflow check in adjust_jmp_off()
bpf: Eliminate remaining "make W=1" warnings in kernel/bpf/btf.o
bpf: annotate BTF show functions with __printf
bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG
selftests/bpf: Close obj in error path in xdp_adjust_tail
selftests/bpf: Null checks for links in bpf_tcp_ca
selftests/bpf: Use connect_fd_to_fd in sk_lookup
selftests/bpf: Use start_server_addr in sk_lookup
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Add ASSERT_OK_FD macro
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: fix compilation failure when CONFIG_NF_FLOW_TABLE=m
bpf: Remove tst_run from lwt_seg6local_prog_ops.
bpf: relax zero fixed offset constraint on KF_TRUSTED_ARGS/KF_RCU
...
====================
Link: https://patch.msgid.link/20240712212448.5378-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
f7ce5eb2cb79 ("bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()")
20c8ad72eb7f ("eth: bnxt: use the RSS context XArray instead of the local list")
Adjacent changes:
net/ethtool/ioctl.c
503757c80928 ("net: ethtool: Fix RSS setting")
eac9122f0c41 ("net: ethtool: record custom RSS contexts in the XArray")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
eth: bnxt: use the new RSS API
Convert bnxt from using the set_rxfh API to separate create/modify/remove
callbacks.
Two small extensions to the core APIs are necessary:
- the ability to discard contexts if for some catastrophic reasons
device can no longer provide them;
- the ability to reserve space in the context for RSS table growth.
The driver is adjusted to store indirection tables on u32 to make
it easier to use core structs directly.
With that out of the way the conversion is fairly straightforward.
Since the opposition to discarding contexts was relatively mild
and its what bnxt does already, I'm sticking to that. We may very
well need to revisit that at a later time.
v1: https://lore.kernel.org/all/20240702234757.4188344-1-kuba@kernel.org/
====================
Link: https://patch.msgid.link/20240711220713.283778-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of allocating a separate indir table in the vnic use
the one already present in the RSS context allocated by the core.
This saves some LoC and also we won't have to worry about syncing
the local version back to the core, once core learns how to dump
contexts.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-12-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ethtool core stores indirection table with u32 entries, "just to be safe".
Switch the type in the driver, so that it's easier to swap local tables
for the core ones. Memory allocations already use sizeof(*entry), switch
the memset()s as well.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bnxt allocates tables of max size, and changes the used size
based on number of active rings. The unused entries get padded
out with zeros. bnxt_modify_rss() seems to always pad out
the table of the main / default RSS context, instead of
the table of the modified context.
I haven't observed any behavior change due to this patch,
so I don't think it's a fix. Not entirely sure what role
the padding plays, 0 is a valid queue ID.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core already maintains all RSS contexts in an XArray, no need
to keep a second list in the driver.
Remove bnxt_get_max_rss_ctx_ring() completely since core performs
the same check already.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core can allocate space for per-context driver-private data,
use it for struct bnxt_rss_ctx. Inline bnxt_alloc_rss_ctx()
at this point, most of the init (as in the actions bnxt_del_one_rss_ctx()
will undo) is open coded in bnxt_create_rxfh_context(), anyway.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New RSS context API removes old contexts on netdev unregister.
No need to wipe them manually.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core will allocate IDs for the driver, from the range
[1, BNXT_MAX_ETH_RSS_CTX], no need to track the allocations.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the new ethtool ops for RSS context management. The conversion
is pretty straightforward cut / paste of the right chunks of the
combined handler. Main change is that we let the core pick the IDs
(bitmap will be removed separately for ease of review), so we need
to tell the core when we lose a context.
Since the new API passes rxfh as const, change bnxt_modify_rss()
to also take const.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Contexts get deleted from FW when the device is down, but they
are kept in SW and re-added back on open. bnxt_set_rxfh_context()
apparently does not want to deal with complexity of dealing with
both the device down and device up cases. This is perhaps acceptable
for creating new contexts, but not being able to delete contexts
makes core-driven cleanups messy. Specifically with the new RSS
API core will try to delete contexts automatically after bringing
the device down.
Support the delete-while-down case. Skip the FW logic and delete
just the driver state.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some drivers (bnxt but I think also mlx5 from ML discussions) change
the size of the indirection table depending on the number of Rx rings.
Decouple the max table size from the size of the currently used table,
so that we can reserve space in the context for table growth.
Static members in ethtool_ops are good enough for now, we can add
callbacks to read the max size more dynamically if someone needs
that.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
RSS contexts may get lost from a device, in various extreme circumstances.
Specifically if the firmware leaks resources and resets, or crashes and
either recovers in partially working state or the crash causes a
different FW version to run - creating the context again may fail.
Drivers should do their absolute best to prevent this from happening.
When it does, however, telling user that a context exists, when it can't
possibly be used any more is counter productive. Add a helper for
drivers to discard contexts. Print an error, in the future netlink
notification will also be sent.
More robust approaches were proposed, like keeping the contexts
but marking them as "dead" (but possibly resurrected by next reset).
That may be better but it's unclear at this stage whether the
effort is worth the benefits.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Smatch complains that:
fs/erofs/zdata.c:1047 z_erofs_scan_folio()
error: uninitialized symbol 'err'.
The issue is if we hit this (!(map->m_flags & EROFS_MAP_MAPPED)) {
condition then "err" isn't set. It's inside a loop so we would have to
hit that condition on every iteration. Initialize "err" to zero to
solve this.
Fixes: 5b9654efb604 ("erofs: teach z_erofs_scan_folios() to handle multi-page folios")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/f78ab50e-ed6d-4275-8dd4-a4159fa565a2@stanley.mountain
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
"A quick follow up to yesterday's pull. We got a regressions report for
the bnxt patch as soon as it got to your tree. The ethtool fix is also
good to have, although it's an older regression.
Current release - regressions:
- eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW
when user tries to decrease the ring count
Previous releases - regressions:
- ethtool: fix RSS setting, accept "no change" setting if the driver
doesn't support the new features
- eth: i40e: remove needless retries of NVM update, don't wait 20min
when we know the firmware update won't succeed"
* tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
octeontx2-af: fix issue with IPv4 match for RSS
octeontx2-af: fix issue with IPv6 ext match for RSS
octeontx2-af: fix detection of IP layer
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
octeontx2-af: replace cpt slot with lf id on reg write
i40e: fix: remove needless retries of NVM update
net: ethtool: Fix RSS setting
|
|
On older chips not supporting multiple RSS contexts, reducing
ethtool channels will crash:
BUG: kernel NULL pointer dereference, address: 00000000000000b8
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 7032 Comm: ethtool Tainted: G S 6.10.0-rc4 #1
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83
RSP: 0018:ffffaaa501d23ba8 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8efdf600c940 RCX: 0000000000000000
RDX: 000000000000007f RSI: ffffffffacf429c4 RDI: ffff8efdf600ca78
RBP: 0000000000000080 R08: 0000000000000000 R09: 0000000000000100
R10: 0000000000000001 R11: ffffaaa501d238c0 R12: 0000000000000080
R13: 0000000000000000 R14: ffff8efdf600c000 R15: 0000000000000006
FS: 00007f977a7d2740(0000) GS:ffff8f041f840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b8 CR3: 00000002320aa004 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die_body+0x15/0x60
? page_fault_oops+0x157/0x440
? do_user_addr_fault+0x60/0x770
? _raw_spin_lock_irqsave+0x12/0x40
? exc_page_fault+0x61/0x120
? asm_exc_page_fault+0x22/0x30
? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en]
bnxt_set_channels+0x9d/0x340 [bnxt_en]
ethtool_set_channels+0x14b/0x210
__dev_ethtool+0xdf8/0x2890
? preempt_count_add+0x6a/0xa0
? percpu_counter_add_batch+0x23/0x90
? filemap_map_pages+0x417/0x4a0
? avc_has_extended_perms+0x185/0x420
? __pfx_udp_ioctl+0x10/0x10
? sk_ioctl+0x55/0xf0
? kmalloc_trace_noprof+0xe0/0x210
? dev_ethtool+0x54/0x170
dev_ethtool+0xa2/0x170
dev_ioctl+0xbe/0x530
sock_do_ioctl+0xa3/0xf0
sock_ioctl+0x20d/0x2e0
bp->rss_ctx_list is not initialized if the chip or firmware does not
support multiple RSS contexts. Fix it by adding a check in
bnxt_get_max_rss_ctx_ring() before proceeding to reference
bp->rss_ctx_list.
Fixes: 0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts")
Reported-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The locking rules in the driver came from era when sysfs attributes
could live past the point of time when device would be unbound from
the driver, and so used module-global semaphore (potentially shared
between multiple yealink devices). Thankfully these times are long
gone and attributes will not be accessible once they are removed.
Simplify the logic by moving to per-device mutex, stop checking if
there is driver data instance attached to the interface, and use
guard notation to acquire the mutex.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240710234855.311366-2-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct usb_driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240710234855.311366-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct usb_driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Ville Syrjälä <syrjala@sci.fi>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo8gaF_lKPAfcye1@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct driver->dev_groups pointer to have the driver core
do it.
This also fixes issue with the attribute not being deleted on driver
unbind.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo9nofWJ1xg9MgKs@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes,
set struct driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo9kSFeGOZB9b3rq@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
These functions are redundant after commit 0daa7a0afd0f ("hwrng: Avoid
manual device_create_file() calls").
Let's call misc_(de)register() directly.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
We checked that "nlimbs" is non-zero in the outside if statement so delete
the duplicate check here.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Implementations of hash functions often have special cases when lengths
are a multiple of the hash function's internal block size (e.g. 64 for
SHA-256, 128 for SHA-512). Currently, when the fuzz testing code
generates lengths, it doesn't prefer any length mod 64 over any other.
This limits the coverage of these special cases.
Therefore, this patch updates the fuzz testing code to generate
power-of-2 lengths and divide messages exactly in half a bit more often.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
We could leak stack memory through the payload field when running
AES with a key from one of the hardware's key slots. Fix this by
ensuring the payload field is set to 0 in such cases.
This does not affect the common use case when the key is supplied
from main memory via the descriptor payload.
Signed-off-by: David Gstir <david@sigma-star.at>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202405270146.Y9tPoil8-lkp@intel.com/
Fixes: 3d16af0b4cfa ("crypto: mxs-dcp: Add support for hardware-bound keys")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Do not enable by default the CN10K HW random generator driver.
CN10K Random Number Generator is available only on some specific
Marvell SoCs, however the driver is in practice enabled by default on
all arm64 configs.
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Replace the deprecated[1] use of strncpy() in bacct_add_tsk(). Since this
is UAPI, include trailing padding in the copy.
Link: https://github.com/KSPP/linux/issues/90 [1]
Link: https://lkml.kernel.org/r/20240711171308.work.995-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Cc: "Dr. Thomas Orgis" <thomas.orgis@uni-hamburg.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ismael Luceno <ismael@iodev.co.uk>
Cc: Peng Liu <liupeng256@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Use the swap() macro to simplify the functions solve_linear_system() and
gf_poly_gcd() and improve their readability. Remove the local variable
tmp.
Fixes the following three Coccinelle/coccicheck warnings reported by
swap.cocci:
WARNING opportunity for swap()
WARNING opportunity for swap()
WARNING opportunity for swap()
Link: https://lkml.kernel.org/r/20240708224023.9312-2-thorsten.blum@toblux.com
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Replace commas between expression statements with semicolons.
Link: https://lkml.kernel.org/r/20240709034323.586185-1-nichen@iscas.ac.cn
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit eb8f689046b8 ("Use separate sections for __dev/
_cpu/__mem code/data").
Check section mismatch to __meminit* only when CONFIG_MEMORY_HOTPLUG=n.
With this change, the linker script and modpost become simpler, and we
can get rid of the __ref annotations from the memory hotplug code.
[sfr@canb.auug.org.au: remove MEM_KEEP from arch/powerpc/kernel/vmlinux.lds.S]
Link: https://lkml.kernel.org/r/20240710093213.2aefb25f@canb.auug.org.au
Link: https://lkml.kernel.org/r/20240706160511.2331061-2-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
These macros are not used anywhere.
Link: https://lkml.kernel.org/r/20240706160511.2331061-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
'struct kobj_type' is not modified in this driver. It is only used with
kobject_init_and_add() which takes a "const struct kobj_type *" parameter.
Constifying this structure moves some data to a read-only section, so
increase overall security.
On a x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
22403 4184 24 26611 67f3 fs/nilfs2/sysfs.o
After:
=====
text data bss dec hex filename
22723 3928 24 26675 6833 fs/nilfs2/sysfs.o
Link: https://lkml.kernel.org/r/20240708143242.3296-1-konishi.ryusuke@gmail.com
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The legacy PMD-sized THP counters at /proc/vmstat include thp_file_alloc,
thp_file_fallback and thp_file_fallback_charge, which rather confusingly
refer to shmem THP and do not include any other types of file pages. This
is inconsistent since in most other places in the kernel, THP counters are
explicitly separated for anon, shmem and file flavours. However, we are
stuck with it since it constitutes a user ABI.
Recently, commit 66f44583f9b6 ("mm: shmem: add mTHP counters for anonymous
shmem") added equivalent mTHP stats for shmem, keeping the same "file_"
prefix in the names. But in future, we may want to add extra stats to
cover actual file pages, at which point, it would all become very
confusing.
So let's take the opportunity to rename these new counters "shmem_" before
the change makes it upstream and the ABI becomes immutable. While we are
at it, let's improve the documentation for the legacy counters to make it
clear that they count shmem pages only.
Link: https://lkml.kernel.org/r/20240710095503.3193901-1-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert to use folio_alloc_mpol() helper() in __read_swap_cache_async().
Link: https://lkml.kernel.org/r/20240709105508.3933823-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This issue is not from any report yet, but by code observation only.
This is yet another fix besides Hugh's patch [1] but on relevant code
path, where eager split of folio can happen if the folio is already on
deferred list during a folio migration.
Here the issue is NUMA path (migrate_misplaced_folio()) may start to
encounter such folio split now even with MR_NUMA_MISPLACED hint applied.
Then when migrate_pages() didn't migrate all the folios, it's possible the
split small folios be put onto the list instead of the original folio.
Then putting back only the head page won't be enough.
Fix it by putting back all the folios on the list.
[1] https://lore.kernel.org/all/46c948b4-4dd8-6e03-4c7b-ce4e81cfa536@google.com/
[akpm@linux-foundation.org: remove now unused local `nr_pages']
Link: https://lkml.kernel.org/r/20240708215537.2630610-1-peterx@redhat.com
Fixes: 7262f208ca68 ("mm/migrate: split source folio if it is on deferred split list")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Make clear_shadow_entry() clear shadow entries in `struct folio_batch` so
that it can reduce contention on i_lock and i_pages locks, e.g.,
watchdog: BUG: soft lockup - CPU#29 stuck for 11s! [fio:2701649]
clear_shadow_entry+0x3d/0x100
mapping_try_invalidate+0x117/0x1d0
invalidate_mapping_pages+0x10/0x20
invalidate_bdev+0x3c/0x50
blkdev_common_ioctl+0x5f7/0xa90
blkdev_ioctl+0x109/0x270
Also, rename clear_shadow_entry() to clear_shadow_entries() accordingly.
[yuzhao@google.com: v2]
Link: https://lkml.kernel.org/r/20240710060933.3979380-1-yuzhao@google.com
Link: https://lkml.kernel.org/r/20240708212753.3120511-1-yuzhao@google.com
Reported-by: Bharata B Rao <bharata@amd.com>
Closes: https://lore.kernel.org/d2841226-e27b-4d3d-a578-63587a3aa4f3@amd.com/
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The page cannot become compound pages again just after a folio is split as
an extra refcnt is held. So the MF_MSG_DIFFERENT_COMPOUND case is
obsolete and can be removed to get rid of this false assumption and code
burden. But add one WARN_ON() here to keep the situation clear.
Link: https://lkml.kernel.org/r/20240708030544.196919-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Now that folio_undo_large_rmappable() is an inline function checking
order and large_rmappable for itself (and __folio_undo_large_rmappable()
is now declared even when CONFIG_TRANASPARENT_HUGEPAGE is off) there is
no need for folio_migrate_mapping() to check large and large_rmappable
first (in the mapping case when it has had to freeze anyway).
Link: https://lkml.kernel.org/r/68feee73-050e-8e98-7a3a-abf78738d92c@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
__free_pages_core() is only used in bootmem init and hot-add memory init
path. Let's put it in __meminit section.
Link: https://lkml.kernel.org/r/20240706061615.30322-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
After the commit 7fb1b252afb5 ("mm: shmem: add mTHP support for anonymous
shmem"), we can configure different policies through the multi-size THP
sysfs interface for anonymous shmem. But currently "THPeligible"
indicates only whether the mapping is eligible for allocating THP-pages as
well as the THP is PMD mappable or not for anonymous shmem, we need to
support semantics for mTHP with anonymous shmem similar to those for mTHP
with anonymous memory.
Link: https://lkml.kernel.org/r/20240705032309.24933-1-libang.li@antgroup.com
Signed-off-by: Bang Li <libang.li@antgroup.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When folio is isolated, the PG_lru bit is cleared. So the PG_lru check in
stable_page_flags() will miss this kind of isolated folios. Use
folio_test_large_rmappable() instead to also include isolated folios.
Since pagecache supports large folios and the introduction of mTHP, the
semantics of KPF_THP have been expanded, now it indicates not only
PMD-sized THP. Update related documentation to clearly state that KPF_THP
indicates multiple order THPs.
[ran.xiaokai@zte.com.cn: directly use is_zero_folio(), per David]
Link: https://lkml.kernel.org/r/20240708062601.165215-1-ranxiaokai627@163.com
Link: https://lkml.kernel.org/r/20240705104343.112680-1-ranxiaokai627@163.com
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Svetly Todorov <svetly.todorov@memverge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since the introduction of mTHP, the docuementation has stated that
khugepaged would be enabled when any mTHP size is enabled, and disabled
when all mTHP sizes are disabled. There are 2 problems with this; 1.
this is not what was implemented by the code and 2. this is not the
desirable behavior.
Desirable behavior is for khugepaged to be enabled when any PMD-sized THP
is enabled, anon or file. (Note that file THP is still controlled by the
top-level control so we must always consider that, as well as the PMD-size
mTHP control for anon). khugepaged only supports collapsing to PMD-sized
THP so there is no value in enabling it when PMD-sized THP is disabled.
So let's change the code and documentation to reflect this policy.
Further, per-size enabled control modification events were not previously
forwarded to khugepaged to give it an opportunity to start or stop.
Consequently the following was resulting in khugepaged eroneously not
being activated:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
[ryan.roberts@arm.com: v3]
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com
Link: https://lkml.kernel.org/r/20240704091051.2411934-1-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The current memory tier initialization process is distributed across two
different functions, memory_tier_init() and memory_tier_late_init(). This
design is hard to maintain. Thus, this patch is proposed to reduce the
possible code paths by consolidating different initialization patches into
one.
The earlier discussion with Jonathan and Ying is listed here:
https://lore.kernel.org/lkml/20240405150244.00004b49@Huawei.com/
If we want to put these two initializations together, they must be placed
together in the later function. Because only at that time, the HMAT
information will be ready, adist between nodes can be calculated, and
memory tiering can be established based on the adist. So we position the
initialization at memory_tier_init() to the memory_tier_late_init() call.
Moreover, it's natural to keep memory_tier initialization in drivers at
device_initcall() level.
If we simply move the set_node_memory_tier() from memory_tier_init() to
late_initcall(), it will result in HMAT not registering the
mt_adistance_algorithm callback function, because set_node_memory_tier()
is not performed during the memory tiering initialization phase, leading
to a lack of correct default_dram information.
Therefore, we introduced a nodemask to pass the information of the default
DRAM nodes. The reason for not choosing to reuse default_dram_type->nodes
is that it is not clean enough. So in the end, we use a __initdata
variable, which is a variable that is released once initialization is
complete, including both CPU and memory nodes for HMAT to iterate through.
Link: https://lkml.kernel.org/r/20240704072646.437579-1-horen.chuang@linux.dev
Signed-off-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com>
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gregory Price <gourry.memverge@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Ravi Jonnalagadda <ravis.opensrc@micron.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|