Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
This tag includes three fixes for the Renesas R-Car driver:
1. Ensures the device is in a known state after probing.
2. Allows clearing the NO_RXDMA flag after a reset.
3. Forces a reset before any transfer on Gen3+ platforms to
prevent disruption of the configuration during parallel
transfers.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
idpf: XDP chapter I: convert Rx to libeth
Alexander Lobakin says:
XDP for idpf is currently 5 chapters:
* convert Rx to libeth (this);
* convert Tx and stats to libeth;
* generic XDP and XSk code changes, libeth_xdp;
* actual XDP for idpf via libeth_xdp;
* XSk for idpf (^).
Part I does the following:
* splits &idpf_queue into 4 (RQ, SQ, FQ, CQ) and puts them on a diet;
* ensures optimal cacheline placement, strictly asserts CL sizes;
* moves currently unused/dead singleq mode out of line;
* reuses libeth's Rx ptype definitions and helpers;
* uses libeth's Rx buffer management for both header and payload;
* eliminates memcpy()s and coherent DMA uses on hotpath, uses
napi_build_skb() instead of in-place short skb allocation.
Most idpf patches, except for the queue split, removes more lines
than adds.
Expect far better memory utilization and +5-8% on Rx depending on
the case (+17% on skb XDP_DROP :>).
* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
idpf: use libeth Rx buffer management for payload buffer
idpf: convert header split mode to libeth + napi_build_skb()
libeth: support different types of buffers for Rx
idpf: remove legacy Page Pool Ethtool stats
idpf: reuse libeth's definitions of parsed ptype structures
idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ
idpf: merge singleq and splitq &net_device_ops
idpf: strictly assert cachelines of queue and queue vector structures
idpf: avoid bloating &idpf_q_vector with big %NR_CPUS
idpf: split &idpf_queue into 4 strictly-typed queue structures
idpf: stop using macros for accessing queue descriptors
libeth: add cacheline / struct layout assertion helpers
page_pool: use __cacheline_group_{begin, end}_aligned()
cache: add __cacheline_group_{begin, end}_aligned() (+ couple more)
====================
Link: https://patch.msgid.link/20240710203031.188081-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
f7ce5eb2cb79 ("bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()")
20c8ad72eb7f ("eth: bnxt: use the RSS context XArray instead of the local list")
Adjacent changes:
net/ethtool/ioctl.c
503757c80928 ("net: ethtool: Fix RSS setting")
eac9122f0c41 ("net: ethtool: record custom RSS contexts in the XArray")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of allocating a separate indir table in the vnic use
the one already present in the RSS context allocated by the core.
This saves some LoC and also we won't have to worry about syncing
the local version back to the core, once core learns how to dump
contexts.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-12-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ethtool core stores indirection table with u32 entries, "just to be safe".
Switch the type in the driver, so that it's easier to swap local tables
for the core ones. Memory allocations already use sizeof(*entry), switch
the memset()s as well.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bnxt allocates tables of max size, and changes the used size
based on number of active rings. The unused entries get padded
out with zeros. bnxt_modify_rss() seems to always pad out
the table of the main / default RSS context, instead of
the table of the modified context.
I haven't observed any behavior change due to this patch,
so I don't think it's a fix. Not entirely sure what role
the padding plays, 0 is a valid queue ID.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core already maintains all RSS contexts in an XArray, no need
to keep a second list in the driver.
Remove bnxt_get_max_rss_ctx_ring() completely since core performs
the same check already.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core can allocate space for per-context driver-private data,
use it for struct bnxt_rss_ctx. Inline bnxt_alloc_rss_ctx()
at this point, most of the init (as in the actions bnxt_del_one_rss_ctx()
will undo) is open coded in bnxt_create_rxfh_context(), anyway.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
New RSS context API removes old contexts on netdev unregister.
No need to wipe them manually.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Core will allocate IDs for the driver, from the range
[1, BNXT_MAX_ETH_RSS_CTX], no need to track the allocations.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the new ethtool ops for RSS context management. The conversion
is pretty straightforward cut / paste of the right chunks of the
combined handler. Main change is that we let the core pick the IDs
(bitmap will be removed separately for ease of review), so we need
to tell the core when we lose a context.
Since the new API passes rxfh as const, change bnxt_modify_rss()
to also take const.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Contexts get deleted from FW when the device is down, but they
are kept in SW and re-added back on open. bnxt_set_rxfh_context()
apparently does not want to deal with complexity of dealing with
both the device down and device up cases. This is perhaps acceptable
for creating new contexts, but not being able to delete contexts
makes core-driven cleanups messy. Specifically with the new RSS
API core will try to delete contexts automatically after bringing
the device down.
Support the delete-while-down case. Skip the FW logic and delete
just the driver state.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20240711220713.283778-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
"A quick follow up to yesterday's pull. We got a regressions report for
the bnxt patch as soon as it got to your tree. The ethtool fix is also
good to have, although it's an older regression.
Current release - regressions:
- eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW
when user tries to decrease the ring count
Previous releases - regressions:
- ethtool: fix RSS setting, accept "no change" setting if the driver
doesn't support the new features
- eth: i40e: remove needless retries of NVM update, don't wait 20min
when we know the firmware update won't succeed"
* tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
octeontx2-af: fix issue with IPv4 match for RSS
octeontx2-af: fix issue with IPv6 ext match for RSS
octeontx2-af: fix detection of IP layer
octeontx2-af: fix a issue with cpt_lf_alloc mailbox
octeontx2-af: replace cpt slot with lf id on reg write
i40e: fix: remove needless retries of NVM update
net: ethtool: Fix RSS setting
|
|
On older chips not supporting multiple RSS contexts, reducing
ethtool channels will crash:
BUG: kernel NULL pointer dereference, address: 00000000000000b8
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 7032 Comm: ethtool Tainted: G S 6.10.0-rc4 #1
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83
RSP: 0018:ffffaaa501d23ba8 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8efdf600c940 RCX: 0000000000000000
RDX: 000000000000007f RSI: ffffffffacf429c4 RDI: ffff8efdf600ca78
RBP: 0000000000000080 R08: 0000000000000000 R09: 0000000000000100
R10: 0000000000000001 R11: ffffaaa501d238c0 R12: 0000000000000080
R13: 0000000000000000 R14: ffff8efdf600c000 R15: 0000000000000006
FS: 00007f977a7d2740(0000) GS:ffff8f041f840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b8 CR3: 00000002320aa004 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die_body+0x15/0x60
? page_fault_oops+0x157/0x440
? do_user_addr_fault+0x60/0x770
? _raw_spin_lock_irqsave+0x12/0x40
? exc_page_fault+0x61/0x120
? asm_exc_page_fault+0x22/0x30
? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en]
bnxt_set_channels+0x9d/0x340 [bnxt_en]
ethtool_set_channels+0x14b/0x210
__dev_ethtool+0xdf8/0x2890
? preempt_count_add+0x6a/0xa0
? percpu_counter_add_batch+0x23/0x90
? filemap_map_pages+0x417/0x4a0
? avc_has_extended_perms+0x185/0x420
? __pfx_udp_ioctl+0x10/0x10
? sk_ioctl+0x55/0xf0
? kmalloc_trace_noprof+0xe0/0x210
? dev_ethtool+0x54/0x170
dev_ethtool+0xa2/0x170
dev_ioctl+0xbe/0x530
sock_do_ioctl+0xa3/0xf0
sock_ioctl+0x20d/0x2e0
bp->rss_ctx_list is not initialized if the chip or firmware does not
support multiple RSS contexts. Fix it by adding a check in
bnxt_get_max_rss_ctx_ring() before proceeding to reference
bp->rss_ctx_list.
Fixes: 0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts")
Reported-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The locking rules in the driver came from era when sysfs attributes
could live past the point of time when device would be unbound from
the driver, and so used module-global semaphore (potentially shared
between multiple yealink devices). Thankfully these times are long
gone and attributes will not be accessible once they are removed.
Simplify the logic by moving to per-device mutex, stop checking if
there is driver data instance attached to the interface, and use
guard notation to acquire the mutex.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240710234855.311366-2-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct usb_driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240710234855.311366-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct usb_driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Ville Syrjälä <syrjala@sci.fi>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo8gaF_lKPAfcye1@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes
set struct driver->dev_groups pointer to have the driver core
do it.
This also fixes issue with the attribute not being deleted on driver
unbind.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo9nofWJ1xg9MgKs@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Instead of manually creating driver-specific device attributes,
set struct driver->dev_groups pointer to have the driver core
do it.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/Zo9kSFeGOZB9b3rq@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
These functions are redundant after commit 0daa7a0afd0f ("hwrng: Avoid
manual device_create_file() calls").
Let's call misc_(de)register() directly.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
We could leak stack memory through the payload field when running
AES with a key from one of the hardware's key slots. Fix this by
ensuring the payload field is set to 0 in such cases.
This does not affect the common use case when the key is supplied
from main memory via the descriptor payload.
Signed-off-by: David Gstir <david@sigma-star.at>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202405270146.Y9tPoil8-lkp@intel.com/
Fixes: 3d16af0b4cfa ("crypto: mxs-dcp: Add support for hardware-bound keys")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Do not enable by default the CN10K HW random generator driver.
CN10K Random Number Generator is available only on some specific
Marvell SoCs, however the driver is in practice enabled by default on
all arm64 configs.
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The current memory tier initialization process is distributed across two
different functions, memory_tier_init() and memory_tier_late_init(). This
design is hard to maintain. Thus, this patch is proposed to reduce the
possible code paths by consolidating different initialization patches into
one.
The earlier discussion with Jonathan and Ying is listed here:
https://lore.kernel.org/lkml/20240405150244.00004b49@Huawei.com/
If we want to put these two initializations together, they must be placed
together in the later function. Because only at that time, the HMAT
information will be ready, adist between nodes can be calculated, and
memory tiering can be established based on the adist. So we position the
initialization at memory_tier_init() to the memory_tier_late_init() call.
Moreover, it's natural to keep memory_tier initialization in drivers at
device_initcall() level.
If we simply move the set_node_memory_tier() from memory_tier_init() to
late_initcall(), it will result in HMAT not registering the
mt_adistance_algorithm callback function, because set_node_memory_tier()
is not performed during the memory tiering initialization phase, leading
to a lack of correct default_dram information.
Therefore, we introduced a nodemask to pass the information of the default
DRAM nodes. The reason for not choosing to reuse default_dram_type->nodes
is that it is not clean enough. So in the end, we use a __initdata
variable, which is a variable that is released once initialization is
complete, including both CPU and memory nodes for HMAT to iterate through.
Link: https://lkml.kernel.org/r/20240704072646.437579-1-horen.chuang@linux.dev
Signed-off-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com>
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Gregory Price <gourry.memverge@gmail.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Ravi Jonnalagadda <ravis.opensrc@micron.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Using memfd_pin_folios() will ensure that the pages are pinned
correctly using FOLL_PIN. And, this also ensures that we don't
accidentally break features such as memory hotunplug as it would
not allow pinning pages in the movable zone.
Using this new API also simplifies the code as we no longer have
to deal with extracting individual pages from their mappings or
handle shmem and hugetlb cases separately.
Link: https://lkml.kernel.org/r/20240624063952.1572359-9-vivek.kasireddy@intel.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This is mainly a preparatory patch to use memfd_pin_folios() API for
pinning folios. Using folios instead of pages makes sense as the udmabuf
driver needs to handle both shmem and hugetlb cases. And, using the
memfd_pin_folios() API makes this easier as we no longer need to
separately handle shmem vs hugetlb cases in the udmabuf driver.
Note that, the function vmap_udmabuf() still needs a list of pages; so, we
collect all the head pages into a local array in this case.
Other changes in this patch include the addition of helpers for checking
the memfd seals and exporting dmabuf. Moving code from udmabuf_create()
into these helpers improves readability given that udmabuf_create() is a
bit long.
Link: https://lkml.kernel.org/r/20240624063952.1572359-8-vivek.kasireddy@intel.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A user or admin can configure a VMM (Qemu) Guest's memory to be backed by
hugetlb pages for various reasons. However, a Guest OS would still
allocate (and pin) buffers that are backed by regular 4k sized pages. In
order to map these buffers and create dma-bufs for them on the Host, we
first need to find the hugetlb pages where the buffer allocations are
located and then determine the offsets of individual chunks (within those
pages) and use this information to eventually populate a scatterlist.
Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500 options
were passed to the Host kernel and Qemu was launched with these
relevant options: qemu-system-x86_64 -m 4096m....
-device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
-display gtk,gl=on
-object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
-machine memory-backend=mem1
Replacing -display gtk,gl=on with -display gtk,gl=off above would
exercise the mmap handler.
Link: https://lkml.kernel.org/r/20240624063952.1572359-7-vivek.kasireddy@intel.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com> (v2)
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add VM_PFNMAP to vm_flags in the mmap handler to ensure that the mappings
would be managed without using struct page.
And, in the vm_fault handler, use vmf_insert_pfn to share the page's pfn
to userspace instead of directly sharing the page (via struct page *).
Link: https://lkml.kernel.org/r/20240624063952.1572359-6-vivek.kasireddy@intel.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There is no !CONFIG_MMU version of vmf_insert_pfn():
arm-linux-gnueabi-ld: drivers/dma-buf/udmabuf.o: in function `udmabuf_vm_fault':
udmabuf.c:(.text+0xaa): undefined reference to `vmf_insert_pfn'
Link: https://lkml.kernel.org/r/20240624063952.1572359-5-vivek.kasireddy@intel.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
PCIe ACS settings control the level of isolation and the possible P2P paths
between devices. With greater isolation the kernel will create smaller
iommu_groups and with less isolation there is more HW that can achieve P2P
transfers. From a virtualization perspective all devices in the same
iommu_group must be assigned to the same VM as they lack security
isolation.
There is no way for the kernel to automatically know the correct ACS
settings for any given system and workload. Existing command line options
(e.g., disable_acs_redir) allow only for large scale change, disabling all
isolation, but this is not sufficient for more complex cases.
Add a kernel command-line option 'config_acs' to directly control all the
ACS bits for specific devices, which allows the operator to setup the right
level of isolation to achieve the desired P2P configuration. The
definition is future proof; when new ACS bits are added to the spec the
open syntax can be extended.
ACS needs to be setup early in the kernel boot as the ACS settings affect
how iommu_groups are formed. iommu_group formation is a one time event
during initial device discovery, so changing ACS bits after kernel boot can
result in an inaccurate view of the iommu_groups compared to the current
isolation configuration.
ACS applies to PCIe Downstream Ports and multi-function devices. The
default ACS settings are strict and deny any direct traffic between two
functions. This results in the smallest iommu_group the HW can support.
Frequently these values result in slow or non-working P2PDMA.
ACS offers a range of security choices controlling how traffic is
allowed to go directly between two devices. Some popular choices:
- Full prevention
- Translated requests can be direct, with various options
- Asymmetric direct traffic, A can reach B but not the reverse
- All traffic can be direct
Along with some other less common ones for special topologies.
The intention is that this option would be used with expert knowledge of
the HW capability and workload to achieve the desired configuration.
Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[bhelgaas: add example, tidy printk formats]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
One of the true positives that the cfg_access_lock lockdep effort
identified is this sequence:
WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70
RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70
Call Trace:
<TASK>
? __warn+0x8c/0x190
? pci_bridge_secondary_bus_reset+0x5d/0x70
? report_bug+0x1f8/0x200
? handle_bug+0x3c/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? pci_bridge_secondary_bus_reset+0x5d/0x70
pci_reset_bus+0x1d8/0x270
vmd_probe+0x778/0xa10
pci_device_probe+0x95/0x120
Where pci_reset_bus() users are triggering unlocked secondary bus resets.
Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses
pci_bus_lock() before issuing the reset which locks everything *but* the
bridge itself.
For the same motivation as adding:
bridge = pci_upstream_bridge(dev);
if (bridge)
pci_dev_lock(bridge);
to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add
pci_dev_lock() for @bus->self to pci_bus_lock().
Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.com
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuch
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: squash in recursive locking deadlock fix from Keith Busch:
https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
|
|
Similar to commit adbf4c4954e3 ("ubi: block: fix memleak in
ubiblock_create()"), 'dev->gd' is not assigned but dereferenced if
blk_mq_alloc_tag_set() fails, and leading to a null-pointer-dereference.
Fix it by using pr_err() and variable 'dev' to print error log.
Additionally, the log in the error handle path of idr_alloc() has
been improved by using pr_err(), too. Before initializing device
name, using dev_err() will print error log with 'null' instead of
the actual device name, like this:
block (null): ...
~~~~~~
It is unclear. Using pr_err() can print more details of the device.
The improved log is:
ubiblock0_0: ...
Fixes: 77567b25ab9f ("ubi: use blk_mq_alloc_disk and blk_cleanup_disk")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The UBIFS_DFS_DIR_LEN macro, which defines the maximum length of the UBIFS
debugfs directory name, has an incorrect formula and misleading comments.
The current formula is (3 + 1 + 2*2 + 1), which assumes that both UBI device
number and volume ID are limited to 2 characters. However, UBI device number
ranges from 0 to 31 (2 characters), and volume ID ranges from 0 to 127 (up
to 3 characters).
Although the current code works due to the cancellation of mathematical
errors (9 + 1 = 10, which matches the correct UBIFS_DFS_DIR_LEN value), it
can lead to confusion and potential issues in the future.
This patch aims to improve the code clarity and maintainability by making
the following changes:
1. Corrects the UBIFS_DFS_DIR_LEN macro definition to (3 + 1 + 2 + 3 + 1),
accommodating the maximum lengths of both UBI device number and volume ID,
plus the separators and null terminator.
2. Updates the snprintf calls to use UBIFS_DFS_DIR_LEN instead of
UBIFS_DFS_DIR_LEN + 1, removing the unnecessary +1.
3. Modifies the error checks to compare against UBIFS_DFS_DIR_LEN using >=
instead of >, aligning with the corrected macro definition.
4. Removes the redundant +1 in the dfs_dir_name array definitions in ubi.h
and debug.h.
While these changes do not affect the runtime behavior, they make the code
more readable, maintainable, and less prone to future errors.
v2->v3:
- Removes the duplicated UBIFS_DFS_DIR_LEN and UBIFS_DFS_DIR_NAME macro
definitions in ubifs.h, as they are already defined in debug.h.
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
We need to clean-up debugfs and ubiblock if we fail after initialising
them.
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Fixes: 927c145208b0 ("mtd: ubi: attach from device tree")
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
The use of do_div() in ubi_nvmem_reg_read() makes calling it on
32-bit machines rather expensive. Since the 'from' variable is
known to be a 32-bit quantity, it is clearly never needed and
can be optimized into a regular division operation.
Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the ubi_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
In case of a memory allocation failure in the volumes loop we can only
process the already allocated scan_eba and fm_eba array elements on the
error path - others are still uninitialized.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 00abf3041590 ("UBI: Add self_check_eba()")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fix from Ulf Hansson:
- qcom: Skip retention level for rpmhpd's
* tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: qcom: rpmhpd: Skip retention level for Power Domains
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- davinci_mmc: Prevent transmitted data size from exceeding sgm's
length
- sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
* tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length
mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
|
|
This commit introduces the dm target flag mempool_needs_integrity. When
the flag is set, device mapper will call bioset_integrity_create on it's
bio sets. The target can then call bio_integrity_alloc on the bios
allocated from the table's mempool.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
|
s/stream/steam/
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Link: https://patch.msgid.link/20240705045458.65108-2-thorsten.blum@toblux.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
Remove unnecessary semicolon at the end of the switch statement.
This is detected by coccinelle.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240709012223.17393-1-nichen@iscas.ac.cn
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
There are spelling mistakes in a comment and in the module description.
Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20240711083513.282724-1-colin.i.king@gmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
Both KUNIT_FAIL and KUNIT_ASSERT_FAILURE defined to KUNIT_FAIL_ASSERTION
with different tpye of kunit_assert_type. The current naming of
KUNIT_ASSERT_FAILURE and KUNIT_FAIL_ASSERTION is confusing due to their
similarities. To improve readability and symmetry, renames
KUNIT_ASSERT_FAILURE to KUNIT_FAIL_AND_ABORT. Makes the naming
consistent, with KUNIT_FAIL and KUNIT_FAIL_AND_ABORT being symmetrical.
Additionally, an explanation for KUNIT_FAIL_AND_ABORT has been added to
clarify its usage.
Signed-off-by: Eric Chan <ericchancf@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Most of these changes are Qualcomm SoC specific and came in just after
I sent out the last set of fixes. This includes two regression fixes
for SoC drivers, a defconfig change to ensure the Lenovo X13s is
usable and 11 changes to DT files to fix regressions and minor
platform specific issues.
Tony and Chunyan step back from their respective maintainership roles
on the omap and unisoc platforms, and Christophe in turn takes over
maintaining some of the Freescale SoC drivers that he has been taking
care of in practice already.
Lastly, there are two trivial fixes for the davinci and sunxi
platforms"
* tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
MAINTAINERS: Add more maintainers for omaps
ARM: davinci: Convert comma to semicolon
MAINTAINERS: Move myself from SPRD Maintainer to Reviewer
Revert "dt-bindings: cache: qcom,llcc: correct QDU1000 reg entries"
arm64: dts: qcom: qdu1000: Fix LLCC reg property
arm64: dts: qcom: sm6115: add iommu for sdhc_1
arm64: dts: qcom: x1e80100-crd: fix DAI used for headset recording
arm64: dts: qcom: x1e80100-crd: fix WCD audio codec TX port mapping
soc: qcom: pmic_glink: disable UCSI on sc8280xp
arm64: defconfig: enable Elan i2c-hid driver
arm64: dts: qcom: sc8280xp-crd: use external pull up for touch reset
arm64: dts: qcom: sc8280xp-x13s: fix touchscreen power on
arm64: dts: qcom: x1e80100: Fix PCIe 6a reg offsets and add MHI
arm64: dts: qcom: sa8775p: Correct IRQ number of EL2 non-secure physical timer
arm64: dts: allwinner: Fix PMIC interrupt number
arm64: dts: qcom: sc8280xp: Set status = "reserved" on PSHOLD
arm64: dts: qcom: x1e80100-*: Allocate some CMA buffers
arm64: dts: qcom: sc8180x: Fix LLCC reg property again
|
|
* iommu/iommufd/paging-domain-alloc:
RDMA/usnic: Use iommu_paging_domain_alloc()
wifi: ath11k: Use iommu_paging_domain_alloc()
wifi: ath10k: Use iommu_paging_domain_alloc()
drm/msm: Use iommu_paging_domain_alloc()
vhost-vdpa: Use iommu_paging_domain_alloc()
vfio/type1: Use iommu_paging_domain_alloc()
iommufd: Use iommu_paging_domain_alloc()
iommu: Add iommu_paging_domain_alloc() interface
|
|
* iommu/iommufd/attach-handles:
iommu: Extend domain attach group with handle support
iommu: Add attach handle to struct iopf_group
iommu: Remove sva handle list
iommu: Introduce domain attachment handle
|
|
* iommu/pci/ats:
arm64: dts: fvp: Enable PCIe ATS for Base RevC FVP
iommu/of: Support ats-supported device-tree property
dt-bindings: PCI: generic: Add ats-supported property
|
|
* iommu/fwspec-ops-removal:
iommu: Remove iommu_fwspec ops
OF: Simplify of_iommu_configure()
ACPI: Retire acpi_iommu_fwspec_ops()
iommu: Resolve fwspec ops automatically
iommu/mediatek-v1: Clean up redundant fwspec checks
[will: Fixed conflict in drivers/iommu/tegra-smmu.c between fwspec ops
removal and fwspec driver fix as per Robin and Jon]
|
|
* iommu/core:
docs: iommu: Remove outdated Documentation/userspace-api/iommu.rst
iommufd: Use atomic_long_try_cmpxchg() in incr_user_locked_vm()
iommu/iova: Add missing MODULE_DESCRIPTION() macro
iommu/dma: Prune redundant pgprot arguments
iommu: Make iommu_sva_domain_alloc() static
|
|
* iommu/nvidia/tegra:
iommu/tegra-smmu: Pass correct fwnode to iommu_fwspec_init()
|