Age | Commit message (Collapse) | Author |
|
Smatch static checker reported below warning:
drivers/iommu/iommufd/fault.c:131 iommufd_device_get_attach_handle()
warn: 'handle' is an error pointer or valid
Fix it by checking 'handle' with IS_ERR().
Fixes: b7d8833677ba ("iommufd: Fault-capable hwpt attach/detach/replace")
Link: https://lore.kernel.org/r/20240712025819.63147-1-baolu.lu@linux.intel.com
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-iommu/8bb4f37a-4514-4dea-aabb-7380be303895@stanley.mountain/
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
While performing RSS based on IPv4, packets with
IPv4 options are not being considered. Adding changes
to match both plain IPv4 and IPv4 with option header.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While performing RSS based on IPv6, extension ltype
is not being considered. This will be problem for
fragmented packets or packets with extension header.
Adding changes to match IPv6 ext header along with IPv6
ltype.
Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Checksum and length checks are not enabled for IPv4 header with
options and IPv6 with extension headers.
To fix this a change in enum npc_kpu_lc_ltype is required which will
allow adjustment of LTYPE_MASK to detect all types of IP headers.
Fixes: 21e6699e5cd6 ("octeontx2-af: Add NPC KPU profile")
Signed-off-by: Michal Mazur <mmazur2@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes CPT_LF_ALLOC mailbox error due to
incompatible mailbox message format. Specifically, it
corrects the `blkaddr` field type from `int` to `u8`.
Fixes: de2854c87c64 ("octeontx2-af: Mailbox changes for 98xx CPT block")
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace slot id with global CPT lf id on reg read/write as
CPTPF/VF driver would send slot number instead of global
lf id in the reg offset. And also update the mailbox response
with the global lf's register offset.
Fixes: ae454086e3c2 ("octeontx2-af: add mailbox interface for CPT")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If we encounter an error on i2c packet transmit, we won't have a valid
flow anymore; since we didn't transmit a valid packet sequence, we'll
have to wait for the key to timeout instead of dropping it on the reply.
This causes the i2c lock to be held for longer than necessary.
Instead, invalidate the flow on TX error, and release the i2c lock
immediately.
Cc: Bonnie Lo <Bonnie_Lo@wiwynn.com>
Tested-by: Jerry C Chen <Jerry_C_Chen@wiwynn.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The response code from user space is only allowed to be SUCCESS or
INVALID. All other values are treated by the device as a response code of
Response Failure according to PCI spec, section 10.4.2.1. This response
disables the Page Request Interface for the Function.
Add a check in iommufd_fault_fops_write() to avoid invalid response
code.
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Link: https://lore.kernel.org/r/20240710083341.44617-3-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
With the introduction of binder_available_for_proc_work_ilocked() in
commit 1b77e9dcc3da ("ANDROID: binder: remove proc waitqueue") a binder
thread can only "wait_for_proc_work" after its thread->looper has been
marked as BINDER_LOOPER_STATE_{ENTERED|REGISTERED}.
This means an unregistered reader risks waiting indefinitely for work
since it never gets added to the proc->waiting_threads. If there are no
further references to its waitqueue either the task will hang. The same
applies to readers using the (e)poll interface.
I couldn't find the rationale behind this restriction. So this patch
restores the previous behavior of allowing unregistered threads to
"wait_for_proc_work". Note that an error message for this scenario,
which had previously become unreachable, is now re-enabled.
Fixes: 1b77e9dcc3da ("ANDROID: binder: remove proc waitqueue")
Cc: stable@vger.kernel.org
Cc: Martijn Coenen <maco@google.com>
Cc: Arve Hjønnevåg <arve@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20240711201452.2017543-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The I.MX SDHCI driver assumes that the frequency of the 'per' clock
can be obtained even on disabled clocks, which is not always the case.
According to 'clk_get_rate' documentation, it is only valid
once the clock source has been enabled.
Signed-off-by: Ciprian Costea <ciprianmarian.costea@oss.nxp.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240708121018.246476-3-ciprianmarian.costea@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
In case of S32G based platforms, GPIO CD used for card detect
wake mechanism is not available.
For this scenario the newly introduced flag
'ESDHC_FLAG_SKIP_CD_WAKE' is used.
Signed-off-by: Ciprian Costea <ciprianmarian.costea@oss.nxp.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240708121018.246476-2-ciprianmarian.costea@oss.nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
There is no reason to restrict access to this attribute, as it merely
reports whether crash elfcorehdr is automatically updated on CPU hot
plug/unplug and/or online/offline events.
Note that since commit 79365026f8694 ("crash: add a new kexec flag for
hotplug support"), this maps to the same flag which is world-accessible
through /sys/devices/system/memory/crash_hotplug.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Acked-by: Baoquan He <bhe@redhat.com>
Acked-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Link: https://lore.kernel.org/r/20240711103409.319673-1-petr.tesarik@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The Lenovo Yoga C630 Embedded Controller is only present on the Qualcomm
Snapdragon-based Lenovo Yoga C630 laptop. Hence add a dependency on
ARCH_QCOM, to prevent asking the user about this driver when configuring
a kernel without Qualcomm SoC support.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/0e4c9ffdc8a5caffcda2afb8d5480900f7adebf6.1720707932.git.geert+renesas@glider.be
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The Acer Aspire 1 Embedded Controller is only present on the Qualcomm
Snapdragon-based Acer Aspire 1 laptop. Hence add a dependency on
ARCH_QCOM, to prevent asking the user about this driver when configuring
a kernel without Qualcomm SoC support.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Nikita Travkin <nikita@trvn.ru>
Link: https://lore.kernel.org/r/f5f38709c01d369ed9e375ceb2a9a12986457a1a.1720707932.git.geert+renesas@glider.be
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
DPI hardware is an on-chip PCIe device on Marvell's arm64 SoC
platforms. As Arnd suggested, CN10K belongs to ARCH_THUNDER
lineage.
Patch makes mrvl_cn10k_dpi driver dependent on CONFIG_ARCH_THUNDER.
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Link: https://lore.kernel.org/r/20240711120115.4069401-1-vattunuru@marvell.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/virtio/virtio_dma_buf.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240602-md-virtio_dma_buf-v1-1-ce602d47e257@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
With ARCH=powerpc, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/uninorth-agp.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://lore.kernel.org/r/20240615-md-powerpc-drivers-char-agp-v1-1-b79bfd07da42@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/hisi-spmi-controller.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/spmi-pmic-arb.o
Add the missing invocations of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240609-md-drivers-spmi-v1-1-f1d5b24e7a66@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The remove callback's return value is about to change from int to void,
this is done by commit 0edb555a65d1 ("platform: Make
platform_driver::remove() return void"). Prepare for merging the patch by
switching the PiSP driver from remove to remove_new callback.
Fixes: 12187bd5d4f8 ("media: raspberrypi: Add support for PiSP BE")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Naushir Patuck <naush@raspberrypi.com>
Acked-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
|
|
While efi_memory_attributes_table_t::entry isn't used directly as an
array, it is used as a base for pointer arithmetic. The type is wrong
as it's not technically an array of efi_memory_desc_t's; they could be
larger. Regardless, leave the type unchanged and remove the old style
"0" array size. Additionally replace the open-coded entry offset code
with the existing efi_memdesc_ptr() helper.
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The "early" part of the helper's name isn't accurate[1]. Drop it in
preparation for adding a new (not early) usage.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/lkml/CAMj1kXEyDjH0uu3Z4eBesV3PEnKGi5ArXXMp7R-hn8HdRytiPg@mail.gmail.com [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
Documentation say that clock internal config needs to be set BEFORE chip
is enabled. Align code to this and move the CHIP enable after the CHIP
is configured.
While at it also make use of STATUS reg and check when STARTUP is
completed instead of sleep for 1-2 ms.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20240712004556.15601-3-ansuelsmth@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Better handle enabling clock internal setting. In further testing it was
notice that internal clock config MUST be set before clock output config
or the LED CHIP might stop working.
This wasn't documented and was actually found on devices that have 2
chip chained where one chip provide clock for the other.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20240712004556.15601-2-ansuelsmth@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Remove extra x from driver name as this was a typo from copy-paste
error.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20240712004556.15601-1-ansuelsmth@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
The to-be-fixed commit rightfully prevented that the registers will be
cleared. However, the index must be cleared. Otherwise a read message
will re-issue the last work. Fix it and add a comment describing the
situation.
Fixes: c422b6a63024 ("i2c: testunit: don't erase registers after STOP")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
block
It would be possible to put any statement in TP_fast_assign().
This commit obsoletes the helper function and put its statements to
TP_fast_assign() for the code simplicity.
Link: https://lore.kernel.org/r/20240712003010.87341-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.10-2024-07-11:
amdgpu:
- PSR-SU fix
- Reseved VMID fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712005534.803064-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Use write-back caching mode for system memory on DGFX (Thomas)
Driver Changes:
- Do not leak object when finalizing hdcp gsc (Nirmoy)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/vgqz35btnxdddko3byrgww5ii36wig2tvondg2p3j3b3ourj4i@rqgolll3wwkh
|
|
Prior to commit dc6fcaaba5a5 ("drm/omap: Allow build with
COMPILE_TEST=y"), it was only possible to build the omapdrm driver with
a 4KB page size. After that change, when the PAGE_SIZE is 64KB or
larger, clang points out that the driver has some assumptions around the
page size implicitly by passing PAGE_SIZE to a parameter with a type of
u16:
drivers/gpu/drm/omapdrm/omap_gem.c:758:7: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
757 | block = tiler_reserve_2d(fmt, omap_obj->width, omap_obj->height,
| ~~~~~~~~~~~~~~~~
758 | PAGE_SIZE);
| ^~~~~~~~~
arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
25 | #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT)
| ~~~~~~~~~~~~~^~~~~~~~~~~~~
drivers/gpu/drm/omapdrm/omap_gem.c:1504:44: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
1504 | block = tiler_reserve_2d(fmts[i], w, h, PAGE_SIZE);
| ~~~~~~~~~~~~~~~~ ^~~~~~~~~
arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
25 | #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT)
| ~~~~~~~~~~~~~^~~~~~~~~~~~~
2 errors generated.
As there is a lot of use of a u16 type throughout this driver and it
will only ever be run on hardware that has a 4KB page size, just
restrict compile testing to when the page size is less than 64KB (as no
other issues have been discussed and it keeps compile testing relatively
more available).
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240620-omapdrm-restrict-compile-test-to-sub-64kb-page-size-v1-1-5e56de71ffca@kernel.org
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Rename xe perf layer as xe observation layer (Ashutosh)
Driver Changes:
- Drop trace_xe_hw_fence_free (Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zo_3ustogPDVKZwu@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
A fix for fbdev on big endian systems, a condition fix for a sharp panel
at removal, and a fix for qxl to prevent unpinned buffer access under
certain conditions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711-benign-rich-mouflon-2eeafe@houat
|
|
When disabling a netconsole target, enabled_store() is called with
enabled=false. Currently, this results in updating the nt->enabled
field twice:
1. Inside the if/else block, with the target_list_lock spinlock held
2. Later, without the target_list_lock
This patch eliminates the redundancy by setting the field only once,
improving efficiency and reducing potential race conditions.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240709144403.544099-3-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The 'enabled' variable is already a bool, so casting it to its value
is redundant.
Remove the superfluous cast, improving code clarity without changing
functionality.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240709144403.544099-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Linux 6.9+ is unable to start a degraded RAID1 array with one drive,
when that drive has a write-mostly flag set. During such an attempt,
the following assertion in bio_split() is hit:
BUG_ON(sectors <= 0);
Call Trace:
? bio_split+0x96/0xb0
? exc_invalid_op+0x53/0x70
? bio_split+0x96/0xb0
? asm_exc_invalid_op+0x1b/0x20
? bio_split+0x96/0xb0
? raid1_read_request+0x890/0xd20
? __call_rcu_common.constprop.0+0x97/0x260
raid1_make_request+0x81/0xce0
? __get_random_u32_below+0x17/0x70
? new_slab+0x2b3/0x580
md_handle_request+0x77/0x210
md_submit_bio+0x62/0xa0
__submit_bio+0x17b/0x230
submit_bio_noacct_nocheck+0x18e/0x3c0
submit_bio_noacct+0x244/0x670
After investigation, it turned out that choose_slow_rdev() does not set
the value of max_sectors in some cases and because of it,
raid1_read_request calls bio_split with sectors == 0.
Fix it by filling in this variable.
This bug was introduced in
commit dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
but apparently hidden until
commit 0091c5a269ec ("md/raid1: factor out helpers to choose the best rdev from read_balance()")
shortly thereafter.
Cc: stable@vger.kernel.org # 6.9.x+
Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()")
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Paul Luse <paul.e.luse@linux.intel.com>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@o2.pl/
--
Tested on both Linux 6.10 and 6.9.8.
Inside a VM, mdadm testsuite for RAID1 on 6.10 did not find any problems:
./test --dev=loop --no-error --raidtype=raid1
(on 6.9.8 there was one failure, caused by external bitmap support not
compiled in).
Notes:
- I was reliably getting deadlocks when adding / removing devices
on such an array - while the array was loaded with fsstress with 20
concurrent processes. When the array was idle or loaded with fsstress
with 8 processes, no such deadlocks happened in my tests.
This occurred also on unpatched Linux 6.8.0 though, but not on
6.1.97-rc1, so this is likely an independent regression (to be
investigated).
- I was also getting deadlocks when adding / removing the bitmap on the
array in similar conditions - this happened on Linux 6.1.97-rc1
also though. fsstress with 8 concurrent processes did cause it only
once during many tests.
- in my testing, there was once a problem with hot adding an
internal bitmap to the array:
mdadm: Cannot add bitmap while array is resyncing or reshaping etc.
mdadm: failed to set internal bitmap.
even though no such reshaping was happening according to /proc/mdstat.
This seems unrelated, though.
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240711202316.10775-1-mat.jonczyk@o2.pl
|
|
The commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()") delays the start of the sync action. In a
clustered environment, this will cause another node to first
activate the spare disk and skip recovery. As a result, no
nodes will perform recovery when a disk is added or re-added.
Before db5e653d7c9f:
```
node1 node2
----------------------------------------------------------------
md_check_recovery
+ md_update_sb
| sendmsg: METADATA_UPDATED
+ md_choose_sync_action process_metadata_update
| remove_and_add_spares //node1 has not finished adding
+ call mddev->sync_work //the spare disk:do nothing
md_start_sync
starts md_do_sync
md_do_sync
+ grabbed resync_lockres:DLM_LOCK_EX
+ do syncing job
md_check_recovery
sendmsg: METADATA_UPDATED
process_metadata_update
//activate spare disk
... ...
md_do_sync
waiting to grab resync_lockres:EX
```
After db5e653d7c9f:
(note: if 'cmd:idle' sets MD_RECOVERY_INTR after md_check_recovery
starts md_start_sync, setting the INTR action will exacerbate the
delay in node1 calling the md_do_sync function.)
```
node1 node2
----------------------------------------------------------------
md_check_recovery
+ md_update_sb
| sendmsg: METADATA_UPDATED
+ calls mddev->sync_work process_metadata_update
//node1 has not finished adding
//the spare disk:do nothing
md_start_sync
+ md_choose_sync_action
| remove_and_add_spares
+ calls md_do_sync
md_check_recovery
md_update_sb
sendmsg: METADATA_UPDATED
process_metadata_update
//activate spare disk
... ... ... ...
md_do_sync
+ grabbed resync_lockres:EX
+ raid1_sync_request skip sync under
conf->fullsync:0
md_do_sync
1. waiting to grab resync_lockres:EX
2. when node1 could grab EX lock,
node1 will skip resync under recovery_offset:MaxSector
```
How to trigger:
```(commands @node1)
# to easily watch the recovery status
echo 2000 > /proc/sys/dev/raid/speed_limit_max
ssh root@node2 "echo 2000 > /proc/sys/dev/raid/speed_limit_max"
mdadm -CR /dev/md0 -l1 -b clustered -n 2 /dev/sda /dev/sdb --assume-clean
ssh root@node2 mdadm -A /dev/md0 /dev/sda /dev/sdb
mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
mdadm --manage /dev/md0 --add /dev/sdc
=== "cat /proc/mdstat" on both node, there are no recovery action. ===
```
How to fix:
because md layer code logic is hard to restore for speeding up sync job
on local node, we add new cluster msg to pending the another node to
active disk.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Su Yue <glass.su@suse.com>
Acked-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240709104120.22243-2-heming.zhao@suse.com
|
|
The commit 1bbe254e4336 ("md-cluster: check for timeout while a
new disk adding") is correct in terms of code syntax but not
suite real clustered code logic.
When a timeout occurs while adding a new disk, if recv_daemon()
bypasses the unlock for ack_lockres:CR, another node will be waiting
to grab EX lock. This will cause the cluster to hang indefinitely.
How to fix:
1. In dlm_lock_sync(), change the wait behaviour from forever to a
timeout, This could avoid the hanging issue when another node
fails to handle cluster msg. Another result of this change is
that if another node receives an unknown msg (e.g. a new msg_type),
the old code will hang, whereas the new code will timeout and fail.
This could help cluster_md handle new msg_type from different
nodes with different kernel/module versions (e.g. The user only
updates one leg's kernel and monitors the stability of the new
kernel).
2. The old code for __sendmsg() always returns 0 (success) under the
design (must successfully unlock ->message_lockres). This commit
makes this function return an error number when an error occurs.
Fixes: 1bbe254e4336 ("md-cluster: check for timeout while a new disk adding")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Su Yue <glass.su@suse.com>
Acked-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240709104120.22243-1-heming.zhao@suse.com
|
|
To have enough space to write all possible sprintf() args. Currently
'name' size is 16, but the first '%s' specifier may already need at
least 16 characters, since 'bnad->netdev->name' is used there.
For '%d' specifiers, assume that they require:
* 1 char for 'tx_id + tx_info->tcb[i]->id' sum, BNAD_MAX_TXQ_PER_TX is 8
* 2 chars for 'rx_id + rx_info->rx_ctrl[i].ccb->id', BNAD_MAX_RXP_PER_RX
is 16
And replace sprintf with snprintf.
Detected using the static analysis tool - Svace.
Fixes: 8b230ed8ec96 ("bna: Brocade 10Gb Ethernet device driver")
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change caused PSR SU panels to not read from their remote fb,
preventing us from entering self-refresh. It is a regression.
This reverts commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b.
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove wrong EIO to EGAIN conversion and pass all errors as is.
After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only
replace F/W specific error codes with Linux kernel generic, all EIO errors
suddenly started to be converted into EAGAIN which leads nvmupdate to retry
until it timeouts and sometimes fails after more than 20 minutes in the
middle of NVM update, so NVM becomes corrupted.
The bug affects users only at the time when they try to update NVM, and
only F/W versions that generate errors while nvmupdate. For example, X710DA2
with 0x8000ECB7 F/W is affected, but there are probably more...
Command for reproduction is just NVM update:
./nvmupdate64
In the log instead of:
i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM)
appears:
i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM
i40e: eeprom check failed (-5), Tx/Rx traffic disabled
The problematic code did silently convert EIO into EAGAIN which forced
nvmupdate to ignore EAGAIN error and retry the same operation until timeout.
That's why NVM update takes 20+ minutes to finish with the fail in the end.
Fixes: 230f3d53a547 ("i40e: remove i40e_status")
Co-developed-by: Kelvin Kang <kelvin.kang@intel.com>
Signed-off-by: Kelvin Kang <kelvin.kang@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.11
Most likely the last "new features" pull request for v6.11 with
changes both in stack and in drivers. The big thing is the multiple
radios for wiphy feature which makes it possible to better advertise
radio capabilities to user space. mt76 enabled MLO and iwlwifi
re-enabled MLO, ath12k and rtw89 Wi-Fi 6 devices got WoWLAN support.
Major changes:
cfg80211/mac80211
* remove DEAUTH_NEED_MGD_TX_PREP flag
* multiple radios per wiphy support
mac80211_hwsim
* multi-radio wiphy support
ath12k
* DebugFS support for datapath statistics
* WCN7850: support for WoW (Wake on WLAN)
* WCN7850: device-tree bindings
ath11k
* QCA6390: device-tree bindings
iwlwifi
* mvm: re-enable Multi-Link Operation (MLO)
* aggregation (A-MSDU) optimisations
rtw89
* preparation for RTL8852BE-VT support
* WoWLAN support for WiFi 6 chips
* 36-bit PCI DMA support
mt76
* mt7925 Multi-Link Operation (MLO) support
* tag 'wireless-next-2024-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (204 commits)
wifi: mac80211: fix AP chandef capturing in CSA
wifi: iwlwifi: correctly reference TSO page information
wifi: mt76: mt792x: fix scheduler interference in drv own process
wifi: mt76: mt7925: enabling MLO when the firmware supports it
wifi: mt76: mt7925: remove the unused mt7925_mcu_set_chan_info
wifi: mt76: mt7925: update mt7925_mac_link_bss_add for MLO
wifi: mt76: mt7925: update mt7925_mcu_bss_basic_tlv for MLO
wifi: mt76: mt7925: update mt7925_mcu_set_timing for MLO
wifi: mt76: mt7925: update mt7925_mcu_sta_phy_tlv for MLO
wifi: mt76: mt7925: update mt7925_mcu_sta_rate_ctrl_tlv for MLO
wifi: mt76: mt7925: add mt7925_mcu_sta_eht_mld_tlv for MLO
wifi: mt76: mt7925: update mt7925_mcu_sta_update for MLO
wifi: mt76: mt7925: update mt7925_mcu_add_bss_info for MLO
wifi: mt76: mt7925: update mt7925_mcu_bss_mld_tlv for MLO
wifi: mt76: mt7925: update mt7925_mcu_sta_mld_tlv for MLO
wifi: mt76: mt7925: add mt7925_[assign,unassign]_vif_chanctx
wifi: mt76: add def_wcid to struct mt76_wcid
wifi: mt76: mt7925: report link information in rx status
wifi: mt76: mt7925: update rate index according to link id
wifi: mt76: mt7925: add link handling in the mt7925_ipv6_addr_change
...
====================
Link: https://patch.msgid.link/20240711102353.0C849C116B1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Series to fix XOR math for DPA to SPA translation
- Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa()
- Complete DPA->HPA->SPA translation and correct XOR translation issue
- Add new method to verify a CXL target position
- Remove old method of CXL target position verifiation
|
|
R-Car Gen3+ needs a reset before every controller transfer. That erases
configuration of a potentially in parallel running local target
instance. To avoid this disruption, avoid controller transfers if a
local target is running. Also, disable SMBusHostNotify because it
requires being a controller and local target at the same time.
Fixes: 3b770017b03a ("i2c: rcar: handle RXDMA HW behaviour on Gen3")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The CXL Spec 3.1 Table 9-22 requires that the BIOS populate the CFMWS
target list in interleave target order. This means the calculations
the CXL driver added to determine positions when XOR math is in use,
along with the entire XOR vs Modulo call back setup is not needed.
A prior patch added a common method to verify positions.
Remove the now unused code related to the cxl_calc_hb_fn.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/2e2c32a2d0f1007e920b58712d15edad2e48d857.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
When a root decoder is configured the interleave target list is read
from the BIOS populated CFMWS structure. Per the CXL spec 3.1 Table
9-22 the target list is in interleave order. The CXL driver populates
its decoder target list in the same order and stores it in 'struct
cxl_switch_decoder' field "@target: active ordered target list in
current decoder configuration"
Given the promise of an ordered list, the driver can stop duplicating
the work of BIOS and simply check target positions against the ordered
list during region configuration.
The simplified check against the ordered list is presented here.
A follow-on patch will remove the unused code.
For Modulo arithmetic this is not a fix, only a simplification.
For XOR arithmetic this is a fix for HB IW of 3,6,12.
Fixes: f9db85bfec0d ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/35d08d3aba08fee0f9b86ab1cef0c25116ca8a55.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
When a device reports a DPA in events like poison, general_media,
and dram, the driver translates that DPA back to an HPA. Presently,
the CXL driver translation only considers the Modulo position and
will report the wrong HPA for XOR configured root decoders.
Add a helper function that restores the XOR'd bits during DPA->HPA
address translation. Plumb a root decoder callback to the new helper
when XOR interleave arithmetic is in use. For Modulo arithmetic, just
let the callback be NULL - as in no extra work required.
Upon completion of a DPA->HPA translation a couple of checks are
performed on the result. One simply confirms that the calculated
HPA is within the address range of the region. That test is useful
for both Modulo and XOR interleave arithmetic decodes.
A second check confirms that the HPA is within an expected chunk
based on the endpoints position in the region and the region
granularity. An XOR decode disrupts the Modulo pattern making the
chunk check useless.
To align the checks with the proper decode, pull the region range
check inline and use the helper to do the chunk check for Modulo
decodes only.
A cxl-test unit test is posted for upstream review here:
https://lore.kernel.org/20240624210644.495563-1-alison.schofield@intel.com/
Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/1a1ac880d9f889bd6384e657e810431b9a0a72e5.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA
addresses the work it performs is a DPA to HPA translation not a
trace. Tidy up this naming by moving the minimal work done in
cxl_trace_hpa() into cxl_dpa_to_hpa() and use cxl_dpa_to_hpa()
for trace event callbacks.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/452a9b0c525b774c72d9d5851515ffa928750132.1719980933.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:
- Fix broken discard for device mapper VDO target
* tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm vdo: replace max_discard_sectors with max_hw_discard_sectors
|
|
When the PCI devres API was introduced to this driver, it was wrongly
assumed that initializing the device with pcim_enable_device() instead of
pci_enable_device() will make all PCI functions managed.
This is wrong and was caused by the quite confusing PCI devres API in which
some, but not all, functions become managed that way.
The function pci_iomap_range() is never managed.
Replace pci_iomap_range() with the managed function pcim_iomap_range().
Fixes: 8558de401b5f ("drm/vboxvideo: use managed pci functions")
Link: https://lore.kernel.org/r/20240613115032.29098-14-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
|
|
The only managed mapping function currently is pcim_iomap() which doesn't
allow for mapping an area starting at a certain offset, which many drivers
want.
Add pcim_iomap_range() as an exported function.
Link: https://lore.kernel.org/r/20240613115032.29098-13-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Thanks to preceding cleanup steps, pcim_release() is now not needed
anymore and can be replaced by pcim_disable_device(), which is the exact
counterpart to pcim_enable_device().
This permits removing further parts of the old PCI devres implementation.
Replace pcim_release() with pcim_disable_device(). Remove the now unused
function get_pci_dr(). Remove the struct pci_devres from pci.h.
Link: https://lore.kernel.org/r/20240613115032.29098-12-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|