Age | Commit message (Collapse) | Author |
|
Re-emit the unprocessed state after resetting the queue.
Drop the soft_recovery callbacks as the queue reset replaces
it.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set the dirty bit when the memory resource is not cleared
during BO release.
v2(Christian):
- Drop the cleared flag set to false.
- Improve the amdgpu_vram_mgr_set_clear_state() function.
v3:
- Add back the resource clear flag set function call after
being cleared during eviction (Christian).
- Modified the patch subject name.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Cached metrics data validity is 1ms on SMUv13.0.6 SOCs. It's not
reasonable for any client to query gpu_metrics at a faster rate and
constantly interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If dpm tables are already populated on SMU v13.0.6 SOCs, use the cached
data. Otherwise, fetch values from firmware.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
NVMe devices from multiple vendors appear to get stuck in a reset state
that we can't get out of with an NVMe level Controller Reset. The kernel
would report these with messages that look like:
Device not ready; aborting reset, CSTS=0x1
These have historically required a power cycle to make them usable
again, but in many cases, a PCIe FLR is sufficient to restart operation
without a power cycle. Try it if the initial controller reset fails
during any nvme reset attempt.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_sync: fix connectable extended advertising when using static random address
- hci_core: fix typos in macros
- hci_core: add missing braces when using macro parameters
- hci_core: replace 'quirks' integer by 'quirk_flags' bitmap
- SMP: If an unallowed command is received consider it a failure
- SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
- L2CAP: Fix null-ptr-deref in l2cap_sock_resume_cb()
- L2CAP: Fix attempting to adjust outgoing MTU
- btintel: Check if controller is ISO capable on btintel_classify_pkt_type
- btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
* tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
Bluetooth: hci_core: add missing braces when using macro parameters
Bluetooth: hci_core: fix typos in macros
Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
Bluetooth: SMP: If an unallowed command is received consider it a failure
Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
Bluetooth: hci_sync: fix connectable extended advertising when using static random address
Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
====================
Link: https://patch.msgid.link/20250717142849.537425-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When allocating IOVA the candidate range gets aligned to the target
alignment. If the range is close to ULONG_MAX then the ALIGN() can
wrap resulting in a corrupted iova.
Open code the ALIGN() using get_add_overflow() to prevent this.
This simplifies the checks as we don't need to check for length earlier
either.
Consolidate the two copies of this code under a single helper.
This bug would allow userspace to create a mapping that overlaps with some
other mapping or a reserved range.
Cc: stable@vger.kernel.org
Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Reported-by: syzbot+c2f65e2801743ca64e08@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/685af644.a00a0220.2e5631.0094.GAE@google.com
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://patch.msgid.link/all/1-v1-7b4a16fc390b+10f4-iommufd_alloc_overflow_jgg@nvidia.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Antonio Quartulli says:
====================
This bugfix batch includes the following changes:
* properly propagate sk mark to skb->mark field
* reject unexpected incoming netlink attributes
* reset GSO state when moving skb from transport to tunnel layer
* tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next:
ovpn: reset GSO metadata after decapsulation
ovpn: reject unexpected netlink attributes
ovpn: propagate socket mark to skb in UDP
====================
Link: https://patch.msgid.link/20250716115443.16763-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The deadlock appears in a stack trace like:
virtnet_probe()
rtnl_lock()
virtio_config_changed_work()
netdev_notify_peers()
rtnl_lock()
It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the
virtio-net driver is still probing.
The config_work in probe() will get scheduled until virtnet_open() enables
the config change notification via virtio_config_driver_enable().
Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the upcoming ConnectX-10 device ID to the table of supported
PCI device IDs.
Cc: stable@vger.kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
addrconf
Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf.
Commit under Fixes added a new flag change that was not made
to hv_netvsc resulting in the VF being assinged an IPv6.
Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf")
Suggested-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The per-CPU variable ppp::xmit_recursion is protecting against recursion
due to wrong configuration of the ppp unit. The per-CPU variable
relies on disabled BH for its locking. Without per-CPU locking in
local_bh_disable() on PREEMPT_RT this data structure requires explicit
locking.
The ppp::xmit_recursion is used as a per-CPU boolean. The counter is
checked early in the send routing and the transmit path is only entered
if the counter is zero. Then the counter is incremented to avoid
recursion. It used to detect recursion on channel::downl and
ppp::wlock.
Create a struct ppp_xmit_recursion and move the counter into it.
Add local_lock_t to the struct and use local_lock_nested_bh() for
locking. Due to possible nesting, the lock cannot be acquired
unconditionally but it requires an owner field to identify recursion
before attempting to acquire the lock.
The counter is incremented and checked only after the lock is acquired.
Since it functions as a boolean rather than a count, and its role is now
superseded by the owner field, it can be safely removed.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250715150806.700536-2-bigeasy@linutronix.de
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Adds support to get fractional frequency offset for input pins. Implement
the appropriate callback and function that periodicaly performs reference
frequency measurement and notifies DPLL core about changes.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-6-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get/set phase adjustment for both input and output pins.
The phase adjustment is implemented using reference and output phase
offset compensation registers. For input pins the adjustment value can
be arbitrary number but for outputs the value has to be a multiple
of half synthesizer clock cycles.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-5-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Implement phase offset monitor feature to allow a user to monitor
phase offsets across all available inputs.
The device firmware periodically performs phase offsets measurements for
all available DPLL channels and input references. The driver can ask
the firmware to fill appropriate latch registers with measured values.
There are 2 sets of latch registers for phase offsets reporting:
1) DPLL-to-connected-ref: up to 5 registers that contain values for
phase offset between particular DPLL channel and its connected input
reference.
2) selected-DPLL-to-ref: 10 registers that contain values for phase
offsets between selected DPLL channel and all available input
references.
Both are filled with single read request so the driver can read
DPLL-to-connected-ref phase offset for all DPLL channels at once.
This was implemented in the previous patch.
To read selected-DPLL-to-ref registers for all DPLLs a separate read
request has to be sent to device firmware for each DPLL channel.
To implement phase offset monitor feature:
* Extend zl3073x_ref_phase_offsets_update() to select given DPLL channel
in phase offset read request. The caller can set channel==-1 if it
will not read Type2 registers.
* Use this extended function to update phase offset latch registers
during zl3073x_dpll_changes_check() call if phase monitor is enabled
* Extend zl3073x_dpll_pin_phase_offset_check() to check phase offset
changes for all available input references
* Extend zl3073x_dpll_input_pin_phase_offset_get() to report phase
offset values for all available input references
* Implement phase offset monitor callbacks to enable/disable this
feature
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-4-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get phase offset for the connected input pin. Implement
the appropriate callback and function that performs DPLL to connected
reference phase error measurement and notifies DPLL core about changes.
The measurement is performed internally by device on background 40 times
per second but the measured value is read each second and compared with
previous value.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-3-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get/set embedded sync for both input and output pins.
The DPLL is able to lock on input reference when the embedded sync
frequency is 1 PPS and pulse width 25%. The esync on outputs are more
versatille and theoretically supports any esync frequency that divides
current output frequency but for now support the same that supported on
input pins (1 PPS & 25% pulse).
Note that for the output pins the esync divisor shares the same register
used for N-divided signal formats. Due to this the esync cannot be
enabled on outputs configured with one of the N-divided signal formats.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-2-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Another set of changes, notably:
- cfg80211: fix double-free introduced earlier
- mac80211: fix RCU iteration in CSA
- iwlwifi: many cleanups (unused FW APIs, PCIe code, WoWLAN)
- mac80211: some work around how FIPS affects wifi, which was
wrong (RC4 is used by TKIP, not only WEP)
- cfg/mac80211: improvements for unsolicated probe response
handling
* tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (64 commits)
wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump()
wifi: cfg80211: fix off channel operation allowed check for MLO
wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finish
wifi: mac80211_hwsim: Update comments in header
wifi: mac80211: parse unsolicited broadcast probe response data
wifi: cfg80211: parse attribute to update unsolicited probe response template
wifi: mac80211: don't use TPE data from assoc response
wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH async
wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loop
wifi: mac80211: don't mark keys for inactive links as uploaded
wifi: mac80211: only assign chanctx in reconfig
wifi: mac80211_hwsim: Declare support for AP scanning
wifi: mac80211: clean up cipher suite handling
wifi: mac80211: don't send keys to driver when fips_enabled
wifi: cfg80211: Fix interface type validation
wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return value
wifi: mac80211: don't unreserve never reserved chanctx
mwl8k: Add missing check after DMA map
wifi: mac80211: make VHT opmode NSS ignore a debug message
wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions
...
====================
Link: https://patch.msgid.link/20250717094610.20106-47-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Couple of fixes:
- ath12k performance regression from -rc1
- cfg80211 counted_by() removal for scan request
as it doesn't match usage and keeps complaining
- iwlwifi crash with certain older devices
- iwlwifi missing an error path unlock
- iwlwifi compatibility with certain BIOS updates
* tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: Fix botched indexing conversion
wifi: cfg80211: remove scan request n_channels counted_by
wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled
wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap
wifi: iwlwifi: pcie: fix locking on invalid TOP reset
====================
Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If we reload the GuC due to suspend/resume or GT reset then we
have to resend not only any VFs provisioning data, but also PF
configuration, like scheduling parameters (EQ, PT), as otherwise
GuC will continue to use default values.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-3-michal.wajdeczko@intel.com
(cherry picked from commit 1c38dd6afa4a8ecce28e94da794fd1d205c30f51)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.
However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:
$ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset
this PF worker might be still busy, which lead to errors due to
just stopped or disabled GuC CTB communication:
[ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed
[ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe]
[ ] xe 0000:00:02.0: [drm] GT0: reset queued
[ ] xe 0000:00:02.0: [drm] GT0: reset started
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped
[ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled!
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED)
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations
[ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed
While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com
(cherry picked from commit 9f50b729dd61dfb9f4d7c66900d22a7c7353a8c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The check would fail if the address is unaligned, but not when
accounting the offset. Instead of `buf | offset` it should have
been `buf + offset`. To make it more readable and also drop the
uintptr_t, just use the IS_ALIGNED() macro.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 81e139db6900503a2e68009764054fad128fbf95)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
We need the topology to determine GT page fault queue size, move page
fault init after topology init.
Cc: stable@vger.kernel.org
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
(cherry picked from commit beb72acb5b38dbe670d8eb752d1ad7a32f9c4119)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
MOCS uc_index is used even before it is initialized in the following
callstack
guc_prepare_xfer()
__xe_guc_upload()
xe_guc_min_load_for_hwconfig()
xe_uc_init_hwconfig()
xe_gt_init_hwconfig()
Do MOCS index initialization earlier in the device probe.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 241cc827c0987d7173714fc5a95a7c8fc9bf15c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The delay that we are waiting on in brcm_pcie_start_link() is
PCIE_T_RRS_READY_MS, use it.
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
[mani: Removed the redundant comment]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250624231923.990361-3-florian.fainelli@broadcom.com
|
|
After we do the modification on the host side, ensure we write the
result back to VRAM and not the other way around, otherwise the
modification will be lost if treated like a read.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com
(cherry picked from commit c12fe703cab93f9d8bfe0ff32b58e7b1fd52be1f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Skipping TLB invalidations on VF causing unrecoverable
faults. Probable reason for skipping TLB invalidations
on SRIOV could be lack of support for instruction
MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some
additional handling.
Helps in resolving,
[ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]]
ASID: 0
VFID: 0
PDATA: 0x0d92
Faulted Address: 0x0000000002fa0000
FaultType: 0
AccessType: 1
FaultLevel: 0
EngineClass: 3 bcs
EngineInstance: 8
[ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
V2:
- Use Xmas tree (MichalW)
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 97515d0b3ed92 ("drm/xe/vf: Don't emit access to Global HWSP if VF")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit b528e896fa570844d654b5a4617a97fa770a1030)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Currently all paths that set err and then check it for an error
perform immediate returns, hence err always zero at the end of
the function _mlx5r_umr_zap_mkey. The return expression
err ? err : nblocks has a redundant check on the err since err
is always zero, so just return nblocks instead.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20250717112108.4036171-1-colin.i.king@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Commit 2df7168717b7 ("dm: Always split write BIOs to zoned device
limits") updates the device-mapper driver to perform splits for the
write BIOs. However, it did not address the cases where DM targets do
not emulate zone append, such as in the cases of dm-linear or dm-flakey.
For these targets, when the write BIOs span across zone boundaries, they
trigger WARN_ON_ONCE(bio_straddles_zones(bio)) in
blk_zone_wplug_handle_write(). This results in I/O errors. The errors
are reproduced by running blktests test case zbd/004 using zoned
dm-linear or dm-flakey devices.
To avoid the I/O errors, handle the write BIOs regardless whether DM
targets emulate zone append or not, so that all write BIOs are split at
zone boundaries. For that purpose, drop the check for zone append
emulation in dm_zone_bio_needs_split(). Its argument 'md' is no longer
used then drop it also.
Fixes: 2df7168717b7 ("dm: Always split write BIOs to zoned device limits")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lore.kernel.org/r/20250717103539.37279-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.
Setting chunk_sectors limit in this way overrides the stacked limit
already calculated based on the bottom device limits. This is ok, as
when any bios are sent to the bottom devices, the block layer will still
respect the bottom device chunk_sectors.
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-6-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-5-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently we use min io size as the chunk size when deciding on the
atomic write size limits - see blk_stack_atomic_writes_head().
The limit min_io size is not a reliable value to store the chunk size, as
this may be mutated by the block stacking code. Such an example would be
for the min io size less than the physical block size, and the min io size
is raised to the physical block size - see blk_stack_limits().
The block stacking limits will rely on chunk_sectors in future,
so set this value (to the chunk size).
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-4-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NVMe fixes from Christoph:
"- revert the cross-controller atomic write size validation that caused
regressions (Christoph Hellwig)
- fix endianness of command word prints in nvme_log_err_passthru()
(John Garry)
- fix callback lock for TLS handshake (Maurizio Lombardi)
- fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
- fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
(Zheng Qixing)"
* tag 'nvme-6.16-2025-07-17' of git://git.infradead.org/nvme:
nvmet-tcp: fix callback lock for TLS handshake
nvme: fix misaccounting of nvme-mpath inflight I/O
nvme: revert the cross-controller atomic write size validation
nvme: fix endianness of command word prints in nvme_log_err_passthru()
nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
|
|
Have nvmet_req_init() and req->execute() complete failed commands.
Description of the problem:
nvmet_req_init() calls __nvmet_req_complete() internally upon failure,
e.g., unsupported opcode, which calls the "queue_response" callback,
this results in nvmet_pci_epf_queue_response() being called, which will
call nvmet_pci_epf_complete_iod() if data_len is 0 or if dma_dir is
different from DMA_TO_DEVICE. This results in a double completion as
nvmet_pci_epf_exec_iod_work() also calls nvmet_pci_epf_complete_iod()
when nvmet_req_init() fails.
Steps to reproduce:
On the host send a command with an unsupported opcode with nvme-cli,
For example the admin command "security receive"
$ sudo nvme security-recv /dev/nvme0n1 -n1 -x4096
This triggers a double completion as nvmet_req_init() fails and
nvmet_pci_epf_queue_response() is called, here iod->dma_dir is still
in the default state of "DMA_NONE" as set by default in
nvmet_pci_epf_alloc_iod(), so nvmet_pci_epf_complete_iod() is called.
Because nvmet_req_init() failed nvmet_pci_epf_complete_iod() is also
called in nvmet_pci_epf_exec_iod_work() leading to a double completion.
This not only sends two completions to the host but also corrupts the
state of the PCI NVMe target leading to kernel oops.
This patch lets nvmet_req_init() and req->execute() complete all failed
commands, and removes the double completion case in
nvmet_pci_epf_exec_iod_work() therefore fixing the edge cases where
double completions occurred.
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Update the nvme_tcp_start_tls() function to use dev_err() instead of
dev_dbg() when a TLS error is detected. This ensures that handshake
failures are visible by default, aiding in debugging.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Correct a typo error in the NVMe status code constant from
NVME_SC_SELT_TEST_IN_PROGRESS to NVME_SC_SELF_TEST_IN_PROGRESS to
accurately reflect its meaning.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Remove the unnecessary ret = -EMFILE; assignment since it is immediately
overwritten by the result of nvmet_bdev_ns_enable() The initial value
(-EMFILE) is redundant because it has no effect on the code logic or
outcome.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Correct the error log to print ctrl->io_cqes instead of incorrectly using
ctrl->io_sqes for the io cqes size check.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
This commit fixes several typos and grammatical issues across various
nvme host driver files:
- correct "glace" to "glance" in a comment in apple.c
- fix "Idependent" to "Independent" in core.c
- change "unsucceesful" to "unsuccessful", "they blk-mq" to "the blk-mq",
- fix "terminaed" to "terminated" and other grammar in fc.c
- update "O's" to "0's" to clarify meaning in nvme.h
- fix a function name reference in a comment in zns.c:
*_transter_len() -> *_transfer_len().
- fix sysfs_emit() output format in pci.c (replace x%08x with 0x%08x)
These changes improve the code readability and documentation consistency
across the NVMe driver.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Synopsys DesignWare XPCS CSR clock is optional,
so it is better to use devm_clk_get_optional
instead of devm_clk_get.
Signed-off-by: Jack Ping CHNG <jchng@maxlinear.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250715021956.3335631-1-jchng@maxlinear.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This reverts commit e8afa1557f4f963c9a511bd2c6074a941c308685.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-8-tzimmermann@suse.de
|
|
This reverts commit 1a148af06000e545e714fe3210af3d77ff903c11.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-7-tzimmermann@suse.de
|
|
This reverts commit cce16fcd7446dcff7480cd9d2b6417075ed81065.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-6-tzimmermann@suse.de
|
|
This reverts commit f83a9b8c7fd0557b0c50784bfdc1bbe9140c9bf8.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Cc: <stable@vger.kernel.org> # v6.15+
Link: https://lore.kernel.org/r/20250715155934.150656-5-tzimmermann@suse.de
|
|
This reverts commit e91eb3ae415472b28211d7fed07fa283845b311e.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-4-tzimmermann@suse.de
|
|
This reverts commit aec8a40228acb385d60feec59b54573d307e60f3.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
|
|
This reverts commit 415cb45895f43015515473fbc40563ca5eec9a7c.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250715155934.150656-3-tzimmermann@suse.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux
Tony Nguyen says:
====================
Add RDMA support for Intel IPU E2000 in idpf
Tatyana Nikolova says:
This idpf patch series is the second part of the staged submission for
introducing RDMA RoCEv2 support for the IPU E2000 line of products,
referred to as GEN3.
To support RDMA GEN3 devices, the idpf driver uses common definitions
of the IIDC interface and implements specific device functionality in
iidc_rdma_idpf.h.
The IPU model can host one or more logical network endpoints called
vPorts per PCI function that are flexibly associated with a physical
port or an internal communication port.
Other features as it pertains to GEN3 devices include:
* MMIO learning
* RDMA capability negotiation
* RDMA vectors discovery between idpf and control plane
These patches are split from the submission "Add RDMA support for Intel
IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts
and platforms with a variety of general RDMA applications which include
standalone verbs (rping, perftest, etc.), storage and HPC applications.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/
This idpf patch series is the second part of the staged submission for
introducing RDMA RoCEv2 support for the IPU E2000 line of products,
referred to as GEN3.
To support RDMA GEN3 devices, the idpf driver uses common definitions
of the IIDC interface and implements specific device functionality in
iidc_rdma_idpf.h.
The IPU model can host one or more logical network endpoints called
vPorts per PCI function that are flexibly associated with a physical
port or an internal communication port.
Other features as it pertains to GEN3 devices include:
* MMIO learning
* RDMA capability negotiation
* RDMA vectors discovery between idpf and control plane
These patches are split from the submission "Add RDMA support for Intel
IPU E2000 (GEN3)" [1]. The patches have been tested on a range of hosts
and platforms with a variety of general RDMA applications which include
standalone verbs (rping, perftest, etc.), storage and HPC applications.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
[1] https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/
IWL reviews:
v3: https://lore.kernel.org/all/20250708210554.1662-1-tatyana.e.nikolova@intel.com/
v2: https://lore.kernel.org/all/20250612220002.1120-1-tatyana.e.nikolova@intel.com/
v1 (split from previous series):
https://lore.kernel.org/all/20250523170435.668-1-tatyana.e.nikolova@intel.com/
v3: https://lore.kernel.org/all/20250207194931.1569-1-tatyana.e.nikolova@intel.com/
RFC v2: https://lore.kernel.org/all/20240824031924.421-1-tatyana.e.nikolova@intel.com/
RFC: https://lore.kernel.org/all/20240724233917.704-1-tatyana.e.nikolova@intel.com/
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux:
idpf: implement get LAN MMIO memory regions
idpf: implement IDC vport aux driver MTU change handler
idpf: implement remaining IDC RDMA core callbacks and handlers
idpf: implement RDMA vport auxiliary dev create, init, and destroy
idpf: implement core RDMA auxiliary dev create, init, and destroy
idpf: use reserved RDMA vectors from control plane
====================
Link: https://patch.msgid.link/20250714181002.2865694-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|