Age | Commit message (Collapse) | Author |
|
This issue was found and also debugged by Carl Lei <me@xecycle.info>.
If the device is not enabled, iblock/file will have not setup their
se_device to bdev/file mappings. If a user tries to config the unmap
settings at this time, we will then crash trying to access a NULL pointer
where the bdev/file should be.
This patch adds a check to make sure the device is configured before
we try to call the configure_unmap callout.
Fixes: 34bd1dcacf0d ("scsi: target: Detect UNMAP support post configuration")
Reported-by: Carl Lei <me@xecycle.info>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20240209215247.5213-1-michael.christie@oracle.com
Reviewed-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The VFs don't run the health thread, so don't try to
stop or restart the non-existent timer or work item.
Fixes: d9407ff11809 ("pds_core: Prevent health thread from running during reset/remove")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240210002002.49483-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We should be doing as little as possible besides freeing Tx
space when our napi routines are called with budget of 0, so
jump out before doing anything besides Tx cleaning.
See commit afbed3f74830 ("net/mlx5e: do as little as possible in napi poll when budget is 0")
for more info.
Fixes: fe8c30b50835 ("ionic: separate interrupt for Tx and Rx")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240210001307.48450-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit introduces and uses the string constants dpp_tx_err and
dpp_rx_err. These are assigned to constant fields of the array
dwxgmac3_error_desc.
It has been reported that on GCC 6 and 7.5.0 this results in warnings
such as:
.../dwxgmac2_core.c:836:20: error: initialiser element is not constant
{ true, "TDPES0", dpp_tx_err },
I have been able to reproduce this using: GCC 7.5.0, 8.4.0, 9.4.0 and 10.5.0.
But not GCC 13.2.0.
So it seems this effects older compilers but not newer ones.
As Jon points out in his report, the minimum compiler supported by
the kernel is GCC 5.1, so it does seem that this ought to be fixed.
It is not clear to me what combination of 'const', if any, would address
this problem. So this patch takes of using #defines for the string
constants
Compile tested only.
Fixes: 46eba193d04f ("net: stmmac: xgmac: fix handling of DPP safety error for DMA channels")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/netdev/c25eb595-8d91-40ea-9f52-efa15ebafdbc@nvidia.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402081135.lAxxBXHk-lkp@intel.com/
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240208-xgmac-const-v1-1-e69a1eeabfc8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Seth reported that on his side XDP traffic can not survive a round of
down/up against i40e interface. Dmesg output was telling us that we were
not able to disable the very first XDP ring. That was due to the fact
that in i40e_vsi_stop_rings() in a pre-work that is done before calling
i40e_vsi_wait_queues_disabled(), XDP Tx queues were not taken into the
account.
To fix this, let us distinguish between Rx and Tx queue boundaries and
take into the account XDP queues for Tx side.
Reported-by: Seth Forshee <sforshee@kernel.org>
Closes: https://lore.kernel.org/netdev/ZbkE7Ep1N1Ou17sA@do-x1extreme/
Fixes: 65662a8dcdd0 ("i40e: Fix logic of disabling queues")
Tested-by: Seth Forshee <sforshee@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently, when interface is being brought down and
i40e_vsi_stop_rings() is called, i40e_pf_rxq_wait() is called two times,
which is wrong. To showcase this scenario, simplified call stack looks
as follows:
i40e_vsi_stop_rings()
i40e_control wait rx_q()
i40e_control_rx_q()
i40e_pf_rxq_wait()
i40e_vsi_wait_queues_disabled()
i40e_pf_rxq_wait() // redundant call
To fix this, let us s/i40e_control_wait_rx_q/i40e_control_rx_q within
i40e_vsi_stop_rings().
Fixes: 65662a8dcdd0 ("i40e: Fix logic of disabling queues")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Mask used for clearing PRTDCB_RETSTCC register in function
i40e_dcb_hw_rx_ets_bw_config() is incorrect as there is used
define I40E_PRTDCB_RETSTCC_ETSTC_SHIFT instead of define
I40E_PRTDCB_RETSTCC_ETSTC_MASK.
The PRTDCB_RETSTCC register is used to configure whether ETS
or strict priority is used as TSA in Rx for particular TC.
In practice it means that once the register is set to use ETS
as TSA then it is not possible to switch back to strict priority
without CoreR reset.
Fix the value in the clearing mask.
Fixes: 90bc8e003be2 ("i40e: Add hardware configuration for software based DCB")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The function i40e_pf_wait_queues_disabled() iterates all PF's VSIs
up to 'pf->hw.func_caps.num_vsis' but this is incorrect because
the real number of VSIs can be up to 'pf->num_alloc_vsi' that
can be higher. Fix this loop.
Fixes: 69129dc39fac ("i40e: Modify Tx disable wait flow in case of DCB reconfiguration")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently when PF administratively sets VF's MAC address and the VF
is put down (VF tries to delete all MACs) then the MAC is removed
from MAC filters and primary VF MAC is zeroed.
Do not allow untrusted VF to remove primary MAC when it was set
administratively by PF.
Reproducer:
1) Create VF
2) Set VF interface up
3) Administratively set the VF's MAC
4) Put VF interface down
[root@host ~]# echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
[root@host ~]# ip link set enp2s0f0v0 up
[root@host ~]# ip link set enp2s0f0 vf 0 mac fe:6c:b5:da:c7:7d
[root@host ~]# ip link show enp2s0f0
23: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 3c:ec:ef:b7:dd:04 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether fe:6c:b5:da:c7:7d brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
[root@host ~]# ip link set enp2s0f0v0 down
[root@host ~]# ip link show enp2s0f0
23: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 3c:ec:ef:b7:dd:04 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
Fixes: 700bbf6c1f9e ("i40e: allow VF to remove any MAC filter")
Fixes: ceb29474bbbc ("i40e: Add support for VF to specify its primary MAC address")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240208180335.1844996-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 2e85d32b1c86 ("xen/xenbus: Add 'will_handle' callback support in
xenbus_watch_path()") added will_handle argument to xenbus_watch_path()
and its wrapper, xenbus_watch_pathfmt(), but didn't document it on the
kerneldoc comments of the function. This is causing warnings that
reported by kernel test robot. Add the documentation to fix it.
Fixes: 2e85d32b1c86 ("xen/xenbus: Add 'will_handle' callback support in xenbus_watch_path()")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401121154.FI8jDGun-lkp@intel.com/
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20240112185903.83737-1-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
For i2c read operation in GSI mode, we are getting timeout
due to malformed TRE basically incorrect TRE sequence
in gpi(drivers/dma/qcom/gpi.c) driver.
I2C driver has geni_i2c_gpi(I2C_WRITE) function which generates GO TRE and
geni_i2c_gpi(I2C_READ)generates DMA TRE. Hence to generate GO TRE before
DMA TRE, we should move geni_i2c_gpi(I2C_WRITE) before
geni_i2c_gpi(I2C_READ) inside the I2C GSI mode transfer function
i.e. geni_i2c_gpi_xfer().
TRE stands for Transfer Ring Element - which is basically an element with
size of 4 words. It contains all information like slave address,
clk divider, dma address value data size etc).
Mainly we have 3 TREs(Config, GO and DMA tre).
- CONFIG TRE : consists of internal register configuration which is
required before start of the transfer.
- DMA TRE : contains DDR/Memory address, called as DMA descriptor.
- GO TRE : contains Transfer directions, slave ID, Delay flags, Length
of the transfer.
I2c driver calls GPI driver API to config each TRE depending on the
protocol.
For read operation tre sequence will be as below which is not aligned
to hardware programming guide.
- CONFIG tre
- DMA tre
- GO tre
As per Qualcomm's internal Hardware Programming Guide, we should configure
TREs in below sequence for any RX only transfer.
- CONFIG tre
- GO tre
- DMA tre
Fixes: d8703554f4de ("i2c: qcom-geni: Add support for GPI DMA")
Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # qrb5165-rb5
Co-developed-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
Signed-off-by: Mukesh Kumar Savaliya <quic_msavaliy@quicinc.com>
Signed-off-by: Viken Dadhaniya <quic_vdadhani@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
|
|
The conversion to kvcalloc() mixed up the object size and count
arguments, causing a warning:
drivers/gpu/drm/nouveau/nouveau_svm.c: In function 'nouveau_svm_fault_buffer_ctor':
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: error: 'kvcalloc' sizes specified with 'sizeof' in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
1010 | buffer->fault = kvcalloc(sizeof(*buffer->fault), buffer->entries, GFP_KERNEL);
| ^
drivers/gpu/drm/nouveau/nouveau_svm.c:1010:40: note: earlier argument should specify number of elements, later size of each element
The behavior is still correct aside from the warning, but fixing it avoids
the warnings and can help the compiler track the individual objects better.
Fixes: 71e4bbca070e ("nouveau/svm: Use kvcalloc() instead of kvzalloc()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240212112230.1117284-1-arnd@kernel.org
|
|
During the cache sync test we verify that values we expect to have been
written only to the cache do not appear in the hardware. This works most
of the time but since we randomly generate both the original and new values
there is a low probability that these values may actually be the same.
Wrap get_random_bytes() to ensure that the values are different, there
are other tests which should have similar verification that we actually
changed something.
While we're at it refactor the test to use three changed values rather
than attempting to use one of them twice, that just complicates checking
that our new values are actually new.
We use random generation to try to avoid data dependencies in the tests.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://msgid.link/r/20240211-regmap-kunit-random-change-v3-1-e387a9ea4468@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add Intel Lunar Lake-M PCI ID to the driver list of supported devices.
This is the same controller found in previous generations.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://msgid.link/r/20240212082027.2462849-1-mika.westerberg@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
MCSPI controller have few limitations regarding the transaction
size when the FIFO buffer is enabled and the WCNT feature is used
to find the end of word, in this case if WCNT is not a multiple of
the FIFO Almost Empty Level (AEL), then the FIFO empty event is not
generated correctly. In addition to this limitation, few other unknown
sequence of events that causes the FIFO empty status to not reflect the
exact status were found when FIFO is being used without DMA enabled
during extended testing in AM65x platform. Till the exact root cause
is found and fixed, revert the FIFO support without DMA.
See J721E Technical Reference Manual (SPRUI1C), section 12.1.5
for further details: http://www.ti.com/lit/pdf/spruil1
This reverts commit 75223bbea840e ("spi: omap2-mcspi: Add FIFO support
without DMA")
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Link: https://msgid.link/r/20240212120049.438495-1-vaishnav.a@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Omit to create scheduler instances when using the legacy uAPI. When
using the legacy NOUVEAU_GEM_PUSHBUF ioctl no scheduler instance is
required, hence omit creating scheduler instances in
nouveau_abi16_ioctl_channel_alloc().
Tested-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202000606.3526-2-dakr@redhat.com
|
|
nouveau_abi16_ioctl_channel_alloc() and nouveau_cli_init() simply call
their corresponding *_fini() counterpart. This can lead to
nouveau_sched_fini() being called without struct nouveau_sched ever
being initialized in the first place.
Instead of embedding struct nouveau_sched into struct nouveau_cli and
struct nouveau_chan_abi16, allocate struct nouveau_sched separately,
such that we can check for the corresponding pointer to be NULL in the
particular *_fini() functions.
It makes sense to allocate struct nouveau_sched separately anyway, since
in a subsequent commit we can also avoid to allocate a struct
nouveau_sched in nouveau_abi16_ioctl_channel_alloc() at all, if the
VM_BIND uAPI has been disabled due to the legacy uAPI being used.
Fixes: 5f03a507b29e ("drm/nouveau: implement 1:1 scheduler - entity relationship")
Reported-by: Timur Tabi <ttabi@nvidia.com>
Tested-by: Timur Tabi <ttabi@nvidia.com>
Closes: https://lore.kernel.org/nouveau/20240131213917.1545604-1-ttabi@nvidia.com/
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202000606.3526-1-dakr@redhat.com
|
|
Limit the link rate to HBR3 or below (<=8.1Gbps) in SST mode.
UHBR (10Gbps+) link rates require 128b/132b channel encoding
which we have not yet hooked up into the SST/no-sideband codepaths.
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208154552.14545-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 6061811d72e14f41f71b6a025510920b187bfcca)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Commit bd077259d0a9 ("drm/i915/vdsc: Add function to read any PPS
register") defines a new macro to calculate the DSC PPS register
addresses with PPS number as an input. This macro correctly calculates
the addresses till PPS 11 since the addresses increment by 4. So in that
case the following macro works correctly to give correct register
address:
_MMIO(_DSCA_PPS_0 + (pps) * 4)
However after PPS 11, the register address for PPS 12 increments by 12
because of RC Buffer memory allocation in between. Because of this
discontinuity in the address space, the macro calculates wrong addresses
for PPS 12 - 16 resulting into incorrect DSC PPS parameter value
read/writes causing DSC corruption.
This fixes it by correcting this macro to add the offset of 12 for PPS
>=12.
v3: Add correct paranthesis for pps argument (Jani Nikula)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10172
Fixes: bd077259d0a9 ("drm/i915/vdsc: Add function to read any PPS register")
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Drew Davenport <ddavenport@chromium.org>
Signed-off-by: Manasi Navare <navaremanasi@chromium.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240205204619.1991673-1-navaremanasi@chromium.org
(cherry picked from commit 6074be620c31dc2ae11af96a1a5ea95580976fb5)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Issue IP reset before shutdown in order to
complete all upstream requests to the SOC.
Without this DevTLB is complaining about
incomplete transactions and NPU cannot resume from
suspend.
This problem is only happening on recent IFWI
releases.
IP reset in rare corner cases can mess up PCI
configuration, so save it before the reset.
After this happens it is also impossible to
issue PLL requests and D0->D3->D0 cycle is needed
to recover the NPU. Add WP 0 request on power up,
so the PUNIT is always notified about NPU reset.
Use D0/D3 cycle for recovery as it can recover
from failed IP reset and FLR cannot.
Fixes: 3f7c0634926d ("accel/ivpu/37xx: Fix hangs related to MMIO reset")
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240207102446.3126981-1-jacek.lawrynowicz@linux.intel.com
|
|
Since commit 24778be20f87 ("spi: convert drivers to use
bits_per_word_mask") the bits_per_word variable is only written to. The
check that was there before isn't needed any more as the spi core
ensures that only 8 bit transfers are used, so the variable can go away
together with all assignments to it.
Fixes: 24778be20f87 ("spi: convert drivers to use bits_per_word_mask")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20240210164006.208149-8-u.kleine-koenig@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
I failed to adapt this driver because it's not enabled in a powerpc
allmodconfig build and also wasn't hit by my grep expertise. Fix
accordingly.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402100815.XQXw9XCF-lkp@intel.com/
Fixes: 2259233110d9 ("spi: bitbang: Follow renaming of SPI "master" to "controller"")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20240210164006.208149-7-u.kleine-koenig@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver uses several symbols declared in <linux/platform_device.h>,
e.g module_platform_driver(). Include this header explicitly now that
<linux/of_platform.h> doesn't include <linux/platform_device.h> any
more.
Fixes: ef175b29a242 ("of: Stop circularly including of_device.h and of_platform.h")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20240210164006.208149-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Currently, GPIO_CTRL bits are set even if the pins are used for
measurements.
GPIO_CTRL bits should only be set if the pin is not used for
other functionality.
Fix this by only setting the GPIO_CTRL bits if the pin has no
other function.
Fixes: 62094060cf3a ("iio: adc: ad4130: add AD4130 driver")
Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20240207132007.253768-2-demonsingur@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
The clk_init_data struct does not have all its members
initialized, causing issues when trying to expose the internal
clock on the CLK pin.
Fix this by zero-initializing the clk_init_data struct.
Fixes: 62094060cf3a ("iio: adc: ad4130: add AD4130 driver")
Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com>
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20240207132007.253768-1-demonsingur@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Update a potentially stale firmware attribute (Maurizio)
- Fixes for the recent verbose error logging (Keith, Chaitanya)
- Protection information payload size fix for passthrough (Francis)
- Fix for a queue freezing issue in virtblk (Yi)
- blk-iocost underflow fix (Tejun)
- blk-wbt task detection fix (Jan)
* tag 'block-6.8-2024-02-10' of git://git.kernel.dk/linux:
virtio-blk: Ensure no requests in virtqueues before deleting vqs.
blk-iocost: Fix an UBSAN shift-out-of-bounds warning
nvme: use ns->head->pi_size instead of t10_pi_tuple structure size
nvme-core: fix comment to reflect right functions
nvme: move passthrough logging attribute to head
blk-wbt: Fix detection of dirty-throttled tasks
nvme-host: fix the updating of the firmware version
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
"A change to accelerate the device detection step in some cases.
In the self-identification step after bus-reset, all nodes in the same
bus broadcast selfID packet including the value of gap count. The
value is related to the cable hops between nodes, and used to
calculate the subaction gap and the arbitration reset gap.
When each node has the different value of the gap count, the
asynchronous communication between them is unreliable, since an
asynchronous transaction could be interrupted by another asynchronous
transaction before completion. The gap count inconsistency can be
resolved by several ways; e.g. the transfer of PHY configuration
packet and generation of bus-reset.
The current implementation of firewire stack can correctly detect the
gap count inconsistency, however the recovery action from the
inconsistency tends to be delayed after reading configuration ROM of
root node. This results in the long time to probe devices in some
combinations of hardware.
Here the stack is changed to schedule the action as soon as possible"
* tag 'firewire-fixes-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: core: send bus reset promptly on gap count error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small driver fixes and one core fix.
The core fix being a fixup to the one in the last pull request which
didn't entirely move checking of scsi_host_busy() out from under the
host lock"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Remove the ufshcd_release() in ufshcd_err_handling_prepare()
scsi: ufs: core: Fix shift issue in ufshcd_clear_cmd()
scsi: lpfc: Use unsigned type for num_sge
scsi: core: Move scsi_host_busy() out of host lock if it is for per-command
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the DSA loopback fixed PHY module.
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20240208164244.3818498-10-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the IP-VLAN based tap driver.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240208164244.3818498-9-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a crash when adding one of the lan966x interfaces under a lag
interface. The issue can be reproduced like this:
ip link add name bond0 type bond miimon 100 mode balance-xor
ip link set dev eth0 master bond0
The reason is because when adding a interface under the lag it would go
through all the ports and try to figure out which other ports are under
that lag interface. And the issue is that lan966x can have ports that are
NULL pointer as they are not probed. So then iterating over these ports
it would just crash as they are NULL pointers.
The fix consists in actually checking for NULL pointers before accessing
something from the ports. Like we do in other places.
Fixes: cabc9d49333d ("net: lan966x: Add lag support for lan966x")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206123054.3052966-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The commit noted in fixes added a bogus requirement that runtime PM managed
devices need to be in the RPM_ACTIVE state for PME polling. In fact, only
devices in low power states should be polled.
However there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present. It's not sufficient to assume
the bridge remains in D0 and cases have been observed where the bridge
passes the D0 test, but the PM state indicates RPM_SUSPENDING and config
space of the polled device becomes inaccessible during pci_pme_wakeup().
Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.
This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.
Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com
Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S <sanath.s@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sanath S <sanath.s@amd.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
"The only notable change here is the patch that changes the way we deal
with spurious errors from the EFI memory attribute protocol. This will
be backported to v6.6, and is intended to ensure that we will not
paint ourselves into a corner when we tighten this further in order to
comply with MS requirements on signed EFI code.
Note that this protocol does not currently exist in x86 production
systems in the field, only in Microsoft's fork of OVMF, but it will be
mandatory for Windows logo certification for x86 PCs in the future.
- Tighten ELF relocation checks on the RISC-V EFI stub
- Give up if the new EFI memory attributes protocol fails spuriously
on x86
- Take care not to place the kernel in the lowest 16 MB of DRAM on
x86
- Omit special purpose EFI memory from memblock
- Some fixes for the CXL CPER reporting code
- Make the PE/COFF layout of mixed-mode capable images comply with a
strict interpretation of the spec"
* tag 'efi-fixes-for-v6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section
cxl/trace: Remove unnecessary memcpy's
cxl/cper: Fix errant CPER prints for CXL events
efi: Don't add memblocks for soft-reserved memory
efi: runtime: Fix potential overflow of soft-reserved region size
efi/libstub: Add one kernel-doc comment
x86/efistub: Avoid placing the kernel below LOAD_PHYSICAL_ADDR
x86/efistub: Give up if memory attribute protocol returns an error
riscv/efistub: Tighten ELF relocation check
riscv/efistub: Ensure GP-relative addressing is not used
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Fix an unintentional truncation of DWC MSI-X address to 32 bits and
update similar MSI code to match (Dan Carpenter)
* tag 'pci-v6.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: dwc: Clean up dw_pcie_ep_raise_msi_irq() alignment
PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- coretemp: Various fixes, and increase number of supported CPU cores
- aspeed-pwm-tacho: Add missing mutex protection
* tag 'hwmon-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (coretemp) Enlarge per package core count limit
hwmon: (coretemp) Fix bogus core_id to attr name mapping
hwmon: (coretemp) Fix out-of-bounds memory access
hwmon: (aspeed-pwm-tacho) mutex for tach reading
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Allow non-sleeping read-only slot-gpio
MMC host:
- sdhci-pci-o2micro: Fix a warm reboot BIOS issue"
* tag 'mmc-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: slot-gpio: Allow non-sleeping GPIO ro
mmc: sdhci-pci-o2micro: Fix a warm reboot issue that disk can't be detected by BIOS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"Core:
- Move the unused cleanup to a _sync initcall
Providers:
- mediatek: Fix race conditions at probe/remove with genpd
- renesas: r8a77980-sysc: CR7 must be always on"
* tag 'pmdomain-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: mediatek: fix race conditions with genpd
pmdomain: renesas: r8a77980-sysc: CR7 must be always on
pmdomain: core: Move the unused cleanup to a _sync initcall
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
- remove the new GPIO device from the global list unconditionally in
error path in core GPIOLIB
* tag 'gpio-fixes-for-v6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: remove GPIO device from the list unconditionally in error path
|
|
Pull drm fixes from Dave Airlie:
"Regular weekly fixes, xe, amdgpu and msm are most of them, with some
misc in i915, ivpu and nouveau, scattered but nothing too intense at
this point.
i915:
- gvt: docs fix, uninit var, MAINTAINERS
ivpu:
- add aborted job status
- disable d3 hot delay
- mmu fixes
nouveau:
- fix gsp rpc size request
- fix dma buffer leaks
- use common code for gsp mem ctor
xe:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics
amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix
msm:
- DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
- DP:
- fixes to handle unknown bpc case correctly for DP
- fix for MISC0 programming
- GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable"
* tag 'drm-fixes-2024-02-09' of git://anongit.freedesktop.org/drm/drm: (43 commits)
drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
drm/xe/vm: don't ignore error when in_kthread
drm/xe: Assume large page size if VMA not yet bound
drm/xe/display: Fix memleak in display initialization
drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
drm/xe: circumvent bogus stringop-overflow warning
drm/xe: Pick correct userptr VMA to repin on REMAP op failure
drm/xe: Take a reference in xe_exec_queue_last_fence_get()
drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
drm/amd/display: Implement bounds check for stream encoder creation in DCN301
drm/amd/display: Increase frame-larger-than for all display_mode_vba files
drm/amd/display: Clear phantom stream count and plane count
drm/amdgpu: Avoid fetching VRAM vendor info
drm/amd/display: Disable ODM by default for DCN35
drm/amd/display: Update phantom pipe enable / disable sequence
drm/amd/display: Fix MST Null Ptr for RV
drm/amdgpu: Fix shared buff copy to user
drm/amd/display: Increase eval/entry delay for DCN35
drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
...
|
|
AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
report incorrect child count. The failing crosspoints report 8 children
while they only have two.
When the driver tries to access the inexistent child nodes, it believes it
has reached an invalid node type and probing fails. The workaround is to
ignore those incorrect child nodes and continue normally.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
[ rm: rewrote simpler generalised version ]
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
S2M NDR BI-ConflictAck opcode is described as 4 in the CXL
r3.0 3.3.9 Table 3.43. However, it is defined as 3 in macro definition.
Fixes: 5d7107c72796 ("perf: CXL Performance Monitoring Unit driver")
Signed-off-by: Hojin Nam <hj96.nam@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240208013415epcms2p2904187c8a863f4d0d2adc980fb91a2dc@epcms2p2
Signed-off-by: Will Deacon <will@kernel.org>
|
|
the driver currently fails to compile on 6.8-rc3 due to:
| spi-ppc4xx.c: In function ‘spi_ppc4xx_of_probe’:
| @346:36: error: invalid use of undefined type ‘struct platform_device’
| 346 | struct device_node *np = op->dev.of_node;
| | ^~
| ... (more similar errors)
it was working with 6.7. Looks like it only needed the include
and its compiling fine!
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Link: https://lore.kernel.org/r/3eb3f9c4407ba99d1cd275662081e46b9e839173.1707490664.git.chunkeey@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Similar to the existing "ports" node name, coresight device tree bindings
have added "in-ports" and "out-ports" as standard node names for a
collection of ports.
Add support for these name to of_graph_get_port_parent() so that
remote-endpoint parsing can find the correct parent node for these
coresight ports too.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-4-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
remote-endpoint properties created a fwnode link from the consumer device
to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
trying to create device links or detecting cycles. So, improve this the
same way we improved finding the consumer of a remote-endpoint property.
Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-3-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
We have a more accurate function to find the right consumer of a
remote-endpoint property instead of searching for a parent with
compatible string property. So, use that instead. While at it, make the
code to find the consumer a bit more flexible and based on the property
being parsed.
Fixes: f7514a663016 ("of: property: fw_devlink: Add support for remote-endpoint")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20240207011803.2637531-2-saravanak@google.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
This reverts commit 398aa9a7e77cf23c2a6f882ddd3dcd96f21771dc.
The update to the gadget API to support EBC feature is incomplete. It's
missing at least the following:
* New usage documentation
* Gadget capability check
* Condition for the user to check how many and which endpoints can be
used as "fifo_mode"
* Description of how it can affect completed request (e.g. dwc3 won't
update TRB on completion -- ie. how it can affect request's actual
length report)
Let's revert this until it's ready.
Fixes: 398aa9a7e77c ("usb: dwc3: Support EBC feature of DWC_usb31")
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/3042f847ff904b4dd4e4cf66a1b9df470e63439e.1707441690.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Do not report the XDP capability NETDEV_XDP_ACT_XSK_ZEROCOPY as the
bonding driver does not support XDP and AF_XDP in zero-copy mode even
if the real NIC drivers do.
Note that the driver used to report everything as supported before a
device was bonded. Instead of just masking out the zero-copy support
from this, have the driver report that no XDP feature is supported
until a real device is bonded. This seems to be more truthful as it is
the real drivers that decide what XDP features are supported.
Fixes: cb9e6e584d58 ("bonding: add xdp_features support")
Reported-by: Prashant Batra <prbatra.mail@gmail.com>
Link: https://lore.kernel.org/all/CAJ8uoz2ieZCopgqTvQ9ZY6xQgTbujmC6XkMTamhp68O-h_-rLg@mail.gmail.com/T/
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20240207084737.20890-1-magnus.karlsson@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I managed to hit following use after free warning recently:
[ 2169.711665] ==================================================================
[ 2169.714009] BUG: KASAN: slab-use-after-free in __run_timers.part.0+0x179/0x4c0
[ 2169.716293] Write of size 8 at addr ffff88812b326a70 by task swapper/4/0
[ 2169.719022] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 6.8.0-rc2jiri+ #2
[ 2169.720974] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 2169.722457] Call Trace:
[ 2169.722756] <IRQ>
[ 2169.723024] dump_stack_lvl+0x58/0xb0
[ 2169.723417] print_report+0xc5/0x630
[ 2169.723807] ? __virt_addr_valid+0x126/0x2b0
[ 2169.724268] kasan_report+0xbe/0xf0
[ 2169.724667] ? __run_timers.part.0+0x179/0x4c0
[ 2169.725116] ? __run_timers.part.0+0x179/0x4c0
[ 2169.725570] __run_timers.part.0+0x179/0x4c0
[ 2169.726003] ? call_timer_fn+0x320/0x320
[ 2169.726404] ? lock_downgrade+0x3a0/0x3a0
[ 2169.726820] ? kvm_clock_get_cycles+0x14/0x20
[ 2169.727257] ? ktime_get+0x92/0x150
[ 2169.727630] ? lapic_next_deadline+0x35/0x60
[ 2169.728069] run_timer_softirq+0x40/0x80
[ 2169.728475] __do_softirq+0x1a1/0x509
[ 2169.728866] irq_exit_rcu+0x95/0xc0
[ 2169.729241] sysvec_apic_timer_interrupt+0x6b/0x80
[ 2169.729718] </IRQ>
[ 2169.729993] <TASK>
[ 2169.730259] asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 2169.730755] RIP: 0010:default_idle+0x13/0x20
[ 2169.731190] Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 9a 7f 1f 02 85 c0 7e 07 0f 00 2d cf 69 43 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 c0 93 04 00
[ 2169.732759] RSP: 0018:ffff888100dbfe10 EFLAGS: 00000242
[ 2169.733264] RAX: 0000000000000001 RBX: ffff888100d9c200 RCX: ffffffff8241bd62
[ 2169.733925] RDX: ffffed109a848b15 RSI: 0000000000000004 RDI: ffffffff8127ac55
[ 2169.734566] RBP: 0000000000000004 R08: 0000000000000000 R09: ffffed109a848b14
[ 2169.735200] R10: ffff8884d42458a3 R11: 000000000000ba7e R12: ffffffff83d7d3a0
[ 2169.735835] R13: 1ffff110201b7fc6 R14: 0000000000000000 R15: ffff888100d9c200
[ 2169.736478] ? ct_kernel_exit.constprop.0+0xa2/0xc0
[ 2169.736954] ? do_idle+0x285/0x290
[ 2169.737323] default_idle_call+0x63/0x90
[ 2169.737730] do_idle+0x285/0x290
[ 2169.738089] ? arch_cpu_idle_exit+0x30/0x30
[ 2169.738511] ? mark_held_locks+0x1a/0x80
[ 2169.738917] ? lockdep_hardirqs_on_prepare+0x12e/0x200
[ 2169.739417] cpu_startup_entry+0x30/0x40
[ 2169.739825] start_secondary+0x19a/0x1c0
[ 2169.740229] ? set_cpu_sibling_map+0xbd0/0xbd0
[ 2169.740673] secondary_startup_64_no_verify+0x15d/0x16b
[ 2169.741179] </TASK>
[ 2169.741686] Allocated by task 1098:
[ 2169.742058] kasan_save_stack+0x1c/0x40
[ 2169.742456] kasan_save_track+0x10/0x30
[ 2169.742852] __kasan_kmalloc+0x83/0x90
[ 2169.743246] mlx5_dpll_probe+0xf5/0x3c0 [mlx5_dpll]
[ 2169.743730] auxiliary_bus_probe+0x62/0xb0
[ 2169.744148] really_probe+0x127/0x590
[ 2169.744534] __driver_probe_device+0xd2/0x200
[ 2169.744973] device_driver_attach+0x6b/0xf0
[ 2169.745402] bind_store+0x90/0xe0
[ 2169.745761] kernfs_fop_write_iter+0x1df/0x2a0
[ 2169.746210] vfs_write+0x41f/0x790
[ 2169.746579] ksys_write+0xc7/0x160
[ 2169.746947] do_syscall_64+0x6f/0x140
[ 2169.747333] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 2169.748049] Freed by task 1220:
[ 2169.748393] kasan_save_stack+0x1c/0x40
[ 2169.748789] kasan_save_track+0x10/0x30
[ 2169.749188] kasan_save_free_info+0x3b/0x50
[ 2169.749621] poison_slab_object+0x106/0x180
[ 2169.750044] __kasan_slab_free+0x14/0x50
[ 2169.750451] kfree+0x118/0x330
[ 2169.750792] mlx5_dpll_remove+0xf5/0x110 [mlx5_dpll]
[ 2169.751271] auxiliary_bus_remove+0x2e/0x40
[ 2169.751694] device_release_driver_internal+0x24b/0x2e0
[ 2169.752191] unbind_store+0xa6/0xb0
[ 2169.752563] kernfs_fop_write_iter+0x1df/0x2a0
[ 2169.753004] vfs_write+0x41f/0x790
[ 2169.753381] ksys_write+0xc7/0x160
[ 2169.753750] do_syscall_64+0x6f/0x140
[ 2169.754132] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[ 2169.754847] Last potentially related work creation:
[ 2169.755315] kasan_save_stack+0x1c/0x40
[ 2169.755709] __kasan_record_aux_stack+0x9b/0xf0
[ 2169.756165] __queue_work+0x382/0x8f0
[ 2169.756552] call_timer_fn+0x126/0x320
[ 2169.756941] __run_timers.part.0+0x2ea/0x4c0
[ 2169.757376] run_timer_softirq+0x40/0x80
[ 2169.757782] __do_softirq+0x1a1/0x509
[ 2169.758387] Second to last potentially related work creation:
[ 2169.758924] kasan_save_stack+0x1c/0x40
[ 2169.759322] __kasan_record_aux_stack+0x9b/0xf0
[ 2169.759773] __queue_work+0x382/0x8f0
[ 2169.760156] call_timer_fn+0x126/0x320
[ 2169.760550] __run_timers.part.0+0x2ea/0x4c0
[ 2169.760978] run_timer_softirq+0x40/0x80
[ 2169.761381] __do_softirq+0x1a1/0x509
[ 2169.761998] The buggy address belongs to the object at ffff88812b326a00
which belongs to the cache kmalloc-256 of size 256
[ 2169.763061] The buggy address is located 112 bytes inside of
freed 256-byte region [ffff88812b326a00, ffff88812b326b00)
[ 2169.764346] The buggy address belongs to the physical page:
[ 2169.764866] page:000000000f2b1e89 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12b324
[ 2169.765731] head:000000000f2b1e89 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 2169.766484] anon flags: 0x200000000000840(slab|head|node=0|zone=2)
[ 2169.767048] page_type: 0xffffffff()
[ 2169.767422] raw: 0200000000000840 ffff888100042b40 0000000000000000 dead000000000001
[ 2169.768183] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[ 2169.768899] page dumped because: kasan: bad access detected
[ 2169.769649] Memory state around the buggy address:
[ 2169.770116] ffff88812b326900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.770805] ffff88812b326980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.771485] >ffff88812b326a00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2169.772173] ^
[ 2169.772787] ffff88812b326a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 2169.773477] ffff88812b326b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 2169.774160] ==================================================================
[ 2169.774845] ==================================================================
I didn't manage to reproduce it. Though the issue seems to be obvious.
There is a chance that the mlx5_dpll_remove() calls
cancel_delayed_work() when the work runs and manages to re-arm itself.
In that case, after delay timer triggers next attempt to queue it,
it works with freed memory.
Fix this by using cancel_delayed_work_sync() instead which makes sure
that work is done when it returns.
Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240206164328.360313-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, I've been hitting following deadlock warning during dpll pin
dump:
[52804.637962] ======================================================
[52804.638536] WARNING: possible circular locking dependency detected
[52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
[52804.639529] ------------------------------------------------------
[52804.640104] python3/2984 is trying to acquire lock:
[52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
[52804.641417]
but task is already holding lock:
[52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
[52804.642747]
which lock already depends on the new lock.
[52804.643551]
the existing dependency chain (in reverse order) is:
[52804.644259]
-> #1 (dpll_lock){+.+.}-{3:3}:
[52804.644836] lock_acquire+0x174/0x3e0
[52804.645271] __mutex_lock+0x119/0x1150
[52804.645723] dpll_lock_dumpit+0x13/0x20
[52804.646169] genl_start+0x266/0x320
[52804.646578] __netlink_dump_start+0x321/0x450
[52804.647056] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.647575] genl_rcv_msg+0x1ed/0x3b0
[52804.648001] netlink_rcv_skb+0xdc/0x210
[52804.648440] genl_rcv+0x24/0x40
[52804.648831] netlink_unicast+0x2f1/0x490
[52804.649290] netlink_sendmsg+0x36d/0x660
[52804.649742] __sock_sendmsg+0x73/0xc0
[52804.650165] __sys_sendto+0x184/0x210
[52804.650597] __x64_sys_sendto+0x72/0x80
[52804.651045] do_syscall_64+0x6f/0x140
[52804.651474] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.652001]
-> #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
[52804.652650] check_prev_add+0x1ae/0x1280
[52804.653107] __lock_acquire+0x1ed3/0x29a0
[52804.653559] lock_acquire+0x174/0x3e0
[52804.653984] __mutex_lock+0x119/0x1150
[52804.654423] netlink_dump+0xb3/0x780
[52804.654845] __netlink_dump_start+0x389/0x450
[52804.655321] genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.655842] genl_rcv_msg+0x1ed/0x3b0
[52804.656272] netlink_rcv_skb+0xdc/0x210
[52804.656721] genl_rcv+0x24/0x40
[52804.657119] netlink_unicast+0x2f1/0x490
[52804.657570] netlink_sendmsg+0x36d/0x660
[52804.658022] __sock_sendmsg+0x73/0xc0
[52804.658450] __sys_sendto+0x184/0x210
[52804.658877] __x64_sys_sendto+0x72/0x80
[52804.659322] do_syscall_64+0x6f/0x140
[52804.659752] entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.660281]
other info that might help us debug this:
[52804.661077] Possible unsafe locking scenario:
[52804.661671] CPU0 CPU1
[52804.662129] ---- ----
[52804.662577] lock(dpll_lock);
[52804.662924] lock(nlk_cb_mutex-GENERIC);
[52804.663538] lock(dpll_lock);
[52804.664073] lock(nlk_cb_mutex-GENERIC);
[52804.664490]
The issue as follows: __netlink_dump_start() calls control->start(cb)
with nlk->cb_mutex held. In control->start(cb) the dpll_lock is taken.
Then nlk->cb_mutex is released and taken again in netlink_dump(), while
dpll_lock still being held. That leads to ABBA deadlock when another
CPU races with the same operation.
Fix this by moving dpll_lock taking into dumpit() callback which ensures
correct lock taking order.
Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc4
DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670
DP:
- fixes to handle unknown bpc case correctly for DP. The current code was
spilling over into other bits of DP configuration register, had to be
fixed to avoid the extra shifts which were causing the spill over
- fix for MISC0 programming in DP driver to program the correct
colorimetry value
GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv+tb1+_cp7ftxcMZbbxE9810rvxeaC50eL=msQ+zkm0g@mail.gmail.com
|