Age | Commit message (Collapse) | Author |
|
Notice that device_suspend_noirq(), device_suspend_late() and
device_suspend() all set async_error on errors, so they don't really
need to return a value. Accordingly, make them all void and use
async_error in their callers instead of their return values.
Moreover, since async_error is updated concurrently without locking
during asynchronous suspend and resume processing, use READ_ONCE() and
WRITE_ONCE() for accessing it in those places to ensure that all of the
accesses will be carried out as expected.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Link: https://patch.msgid.link/6198088.lOV4Wx5bFT@rjwysocki.net
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth, CAN, WiFi and Netfilter.
More code here than I would have liked. That said, better now than
next week. Nothing particularly scary stands out. The improvement to
the OpenVPN input validation is a bit large but better get them in
before the code makes it to a final release. Some of the changes we
got from sub-trees could have been split better between the fix and
-next refactoring, IMHO, that has been communicated.
We have one known regression in a TI AM65 board not getting link. The
investigation is going a bit slow, a number of people are on vacation.
We'll try to wrap it up, but don't think it should hold up the
release.
Current release - fix to a fix:
- Bluetooth: L2CAP: fix attempting to adjust outgoing MTU, it broke
some headphones and speakers
Current release - regressions:
- wifi: ath12k: fix packets received in WBM error ring with REO LUT
enabled, fix Rx performance regression
- wifi: iwlwifi:
- fix crash due to a botched indexing conversion
- mask reserved bits in chan_state_active_bitmap, avoid FW assert()
Current release - new code bugs:
- nf_conntrack: fix crash due to removal of uninitialised entry
- eth: airoha: fix potential UaF in airoha_npu_get()
Previous releases - regressions:
- net: fix segmentation after TCP/UDP fraglist GRO
- af_packet: fix the SO_SNDTIMEO constraint not taking effect and a
potential soft lockup waiting for a completion
- rpl: fix UaF in rpl_do_srh_inline() for sneaky skb geometry
- virtio-net: fix recursive rtnl_lock() during probe()
- eth: stmmac: populate entire system_counterval_t in get_time_fn()
- eth: libwx: fix a number of crashes in the driver Rx path
- hv_netvsc: prevent IPv6 addrconf after IFF_SLAVE lost that meaning
Previous releases - always broken:
- mptcp: fix races in handling connection fallback to pure TCP
- rxrpc: assorted error handling and race fixes
- sched: another batch of "security" fixes for qdiscs (QFQ, HTB)
- tls: always refresh the queue when reading sock, avoid UaF
- phy: don't register LEDs for genphy, avoid deadlock
- Bluetooth: btintel: check if controller is ISO capable on
btintel_classify_pkt_type(), work around FW returning incorrect
capabilities
Misc:
- make OpenVPN Netlink input checking more strict before it makes it
to a final release
- wifi: cfg80211: remove scan request n_channels __counted_by, it's
only yielding false positives"
* tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
rxrpc: Fix to use conn aborts for conn-wide failures
rxrpc: Fix transmission of an abort in response to an abort
rxrpc: Fix notification vs call-release vs recvmsg
rxrpc: Fix recv-recv race of completed call
rxrpc: Fix irq-disabled in local_bh_enable()
selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
net: bridge: Do not offload IGMP/MLD messages
selftests: Add test cases for vlan_filter modification during runtime
net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime
tls: always refresh the queue when reading sock
virtio-net: fix recursived rtnl_lock() during probe()
net/mlx5: Update the list of the PCI supported devices
hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf
phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept()
Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
netfilter: nf_conntrack: fix crash due to removal of uninitialised entry
net: fix segmentation after TCP/UDP fraglist GRO
ipv6: mcast: Delay put pmc->idev in mld_del_delrec()
net: airoha: fix potential use-after-free in airoha_npu_get()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address three issues introduced during the current development
cycle and related to system suspend and hibernation, one triggering
when asynchronous suspend of devices fails, one possibly affecting
memory management in the core suspend code error path, and one due to
duplicate filesystems freezing during system suspend:
- Fix a deadlock that may occur on asynchronous device suspend
failures due to missing completion updates in error paths (Rafael
Wysocki)
- Drop a misplaced pm_restore_gfp_mask() call, which may cause swap
to be accessed too early if system suspend fails, from
suspend_devices_and_enter() (Rafael Wysocki)
- Remove duplicate filesystems_freeze/thaw() calls, which sometimes
cause systems to be unable to resume, from enter_state() (Zihuan
Zhang)"
* tag 'pm-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Update power.completion for all devices on errors
PM: suspend: clean up redundant filesystems_freeze/thaw() handling
PM: suspend: Drop a misplaced pm_restore_gfp_mask() call
|
|
Rather than checking in the callbacks, check if the reset
type is supported in the caller.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Drop the soft_recovery callbacks as the queue reset replaces
it.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Drop the soft_recovery callbacks as the queue reset replaces
it.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Drop the soft_recovery callbacks as the queue reset replaces
it.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Re-emit the unprocessed state after resetting the queue.
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set the dirty bit when the memory resource is not cleared
during BO release.
v2(Christian):
- Drop the cleared flag set to false.
- Improve the amdgpu_vram_mgr_set_clear_state() function.
v3:
- Add back the resource clear flag set function call after
being cleared during eviction (Christian).
- Modified the patch subject name.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Cached metrics data validity is 1ms on SMUv13.0.6 SOCs. It's not
reasonable for any client to query gpu_metrics at a faster rate and
constantly interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If dpm tables are already populated on SMU v13.0.6 SOCs, use the cached
data. Otherwise, fetch values from firmware.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
NVMe devices from multiple vendors appear to get stuck in a reset state
that we can't get out of with an NVMe level Controller Reset. The kernel
would report these with messages that look like:
Device not ready; aborting reset, CSTS=0x1
These have historically required a power cycle to make them usable
again, but in many cases, a PCIe FLR is sufficient to restart operation
without a power cycle. Try it if the initial controller reset fails
during any nvme reset attempt.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_sync: fix connectable extended advertising when using static random address
- hci_core: fix typos in macros
- hci_core: add missing braces when using macro parameters
- hci_core: replace 'quirks' integer by 'quirk_flags' bitmap
- SMP: If an unallowed command is received consider it a failure
- SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
- L2CAP: Fix null-ptr-deref in l2cap_sock_resume_cb()
- L2CAP: Fix attempting to adjust outgoing MTU
- btintel: Check if controller is ISO capable on btintel_classify_pkt_type
- btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
* tag 'for-net-2025-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU
Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID
Bluetooth: hci_dev: replace 'quirks' integer by 'quirk_flags' bitmap
Bluetooth: hci_core: add missing braces when using macro parameters
Bluetooth: hci_core: fix typos in macros
Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout
Bluetooth: SMP: If an unallowed command is received consider it a failure
Bluetooth: btintel: Check if controller is ISO capable on btintel_classify_pkt_type
Bluetooth: hci_sync: fix connectable extended advertising when using static random address
Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb()
====================
Link: https://patch.msgid.link/20250717142849.537425-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When allocating IOVA the candidate range gets aligned to the target
alignment. If the range is close to ULONG_MAX then the ALIGN() can
wrap resulting in a corrupted iova.
Open code the ALIGN() using get_add_overflow() to prevent this.
This simplifies the checks as we don't need to check for length earlier
either.
Consolidate the two copies of this code under a single helper.
This bug would allow userspace to create a mapping that overlaps with some
other mapping or a reserved range.
Cc: stable@vger.kernel.org
Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Reported-by: syzbot+c2f65e2801743ca64e08@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/685af644.a00a0220.2e5631.0094.GAE@google.com
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://patch.msgid.link/all/1-v1-7b4a16fc390b+10f4-iommufd_alloc_overflow_jgg@nvidia.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Antonio Quartulli says:
====================
This bugfix batch includes the following changes:
* properly propagate sk mark to skb->mark field
* reject unexpected incoming netlink attributes
* reset GSO state when moving skb from transport to tunnel layer
* tag 'ovpn-net-20250716' of https://github.com/OpenVPN/ovpn-net-next:
ovpn: reset GSO metadata after decapsulation
ovpn: reject unexpected netlink attributes
ovpn: propagate socket mark to skb in UDP
====================
Link: https://patch.msgid.link/20250716115443.16763-1-antonio@openvpn.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The deadlock appears in a stack trace like:
virtnet_probe()
rtnl_lock()
virtio_config_changed_work()
netdev_notify_peers()
rtnl_lock()
It happens if the VMM sends a VIRTIO_NET_S_ANNOUNCE request while the
virtio-net driver is still probing.
The config_work in probe() will get scheduled until virtnet_open() enables
the config change notification via virtio_config_driver_enable().
Fixes: df28de7b0050 ("virtio-net: synchronize operstate with admin state on up/down")
Signed-off-by: Zigit Zo <zuozhijie@bytedance.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20250716115717.1472430-1-zuozhijie@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the upcoming ConnectX-10 device ID to the table of supported
PCI device IDs.
Cc: stable@vger.kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1752650969-148501-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
addrconf
Set an additional flag IFF_NO_ADDRCONF to prevent ipv6 addrconf.
Commit under Fixes added a new flag change that was not made
to hv_netvsc resulting in the VF being assinged an IPv6.
Fixes: 8a321cf7becc ("net: add IFF_NO_ADDRCONF and use it in bonding to prevent ipv6 addrconf")
Suggested-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20250716002607.4927-1-litian@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The per-CPU variable ppp::xmit_recursion is protecting against recursion
due to wrong configuration of the ppp unit. The per-CPU variable
relies on disabled BH for its locking. Without per-CPU locking in
local_bh_disable() on PREEMPT_RT this data structure requires explicit
locking.
The ppp::xmit_recursion is used as a per-CPU boolean. The counter is
checked early in the send routing and the transmit path is only entered
if the counter is zero. Then the counter is incremented to avoid
recursion. It used to detect recursion on channel::downl and
ppp::wlock.
Create a struct ppp_xmit_recursion and move the counter into it.
Add local_lock_t to the struct and use local_lock_nested_bh() for
locking. Due to possible nesting, the lock cannot be acquired
unconditionally but it requires an owner field to identify recursion
before attempting to acquire the lock.
The counter is incremented and checked only after the lock is acquired.
Since it functions as a boolean rather than a count, and its role is now
superseded by the owner field, it can be safely removed.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250715150806.700536-2-bigeasy@linutronix.de
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Adds support to get fractional frequency offset for input pins. Implement
the appropriate callback and function that periodicaly performs reference
frequency measurement and notifies DPLL core about changes.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-6-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get/set phase adjustment for both input and output pins.
The phase adjustment is implemented using reference and output phase
offset compensation registers. For input pins the adjustment value can
be arbitrary number but for outputs the value has to be a multiple
of half synthesizer clock cycles.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-5-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Implement phase offset monitor feature to allow a user to monitor
phase offsets across all available inputs.
The device firmware periodically performs phase offsets measurements for
all available DPLL channels and input references. The driver can ask
the firmware to fill appropriate latch registers with measured values.
There are 2 sets of latch registers for phase offsets reporting:
1) DPLL-to-connected-ref: up to 5 registers that contain values for
phase offset between particular DPLL channel and its connected input
reference.
2) selected-DPLL-to-ref: 10 registers that contain values for phase
offsets between selected DPLL channel and all available input
references.
Both are filled with single read request so the driver can read
DPLL-to-connected-ref phase offset for all DPLL channels at once.
This was implemented in the previous patch.
To read selected-DPLL-to-ref registers for all DPLLs a separate read
request has to be sent to device firmware for each DPLL channel.
To implement phase offset monitor feature:
* Extend zl3073x_ref_phase_offsets_update() to select given DPLL channel
in phase offset read request. The caller can set channel==-1 if it
will not read Type2 registers.
* Use this extended function to update phase offset latch registers
during zl3073x_dpll_changes_check() call if phase monitor is enabled
* Extend zl3073x_dpll_pin_phase_offset_check() to check phase offset
changes for all available input references
* Extend zl3073x_dpll_input_pin_phase_offset_get() to report phase
offset values for all available input references
* Implement phase offset monitor callbacks to enable/disable this
feature
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-4-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get phase offset for the connected input pin. Implement
the appropriate callback and function that performs DPLL to connected
reference phase error measurement and notifies DPLL core about changes.
The measurement is performed internally by device on background 40 times
per second but the measured value is read each second and compared with
previous value.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-3-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to get/set embedded sync for both input and output pins.
The DPLL is able to lock on input reference when the embedded sync
frequency is 1 PPS and pulse width 25%. The esync on outputs are more
versatille and theoretically supports any esync frequency that divides
current output frequency but for now support the same that supported on
input pins (1 PPS & 25% pulse).
Note that for the output pins the esync divisor shares the same register
used for N-divided signal formats. Due to this the esync cannot be
enabled on outputs configured with one of the N-divided signal formats.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Prathosh Satish <prathosh.satish@microchip.com>
Co-developed-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Prathosh Satish <Prathosh.Satish@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250715144633.149156-2-ivecera@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Another set of changes, notably:
- cfg80211: fix double-free introduced earlier
- mac80211: fix RCU iteration in CSA
- iwlwifi: many cleanups (unused FW APIs, PCIe code, WoWLAN)
- mac80211: some work around how FIPS affects wifi, which was
wrong (RC4 is used by TKIP, not only WEP)
- cfg/mac80211: improvements for unsolicated probe response
handling
* tag 'wireless-next-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (64 commits)
wifi: cfg80211: fix double free for link_sinfo in nl80211_station_dump()
wifi: cfg80211: fix off channel operation allowed check for MLO
wifi: mac80211: use RCU-safe iteration in ieee80211_csa_finish
wifi: mac80211_hwsim: Update comments in header
wifi: mac80211: parse unsolicited broadcast probe response data
wifi: cfg80211: parse attribute to update unsolicited probe response template
wifi: mac80211: don't use TPE data from assoc response
wifi: mac80211: handle WLAN_HT_ACTION_NOTIFY_CHANWIDTH async
wifi: mac80211: simplify __ieee80211_rx_h_amsdu() loop
wifi: mac80211: don't mark keys for inactive links as uploaded
wifi: mac80211: only assign chanctx in reconfig
wifi: mac80211_hwsim: Declare support for AP scanning
wifi: mac80211: clean up cipher suite handling
wifi: mac80211: don't send keys to driver when fips_enabled
wifi: cfg80211: Fix interface type validation
wifi: mac80211: remove ieee80211_link_unreserve_chanctx() return value
wifi: mac80211: don't unreserve never reserved chanctx
mwl8k: Add missing check after DMA map
wifi: mac80211: make VHT opmode NSS ignore a debug message
wifi: iwlwifi: remove support of several iwl_ppag_table_cmd versions
...
====================
Link: https://patch.msgid.link/20250717094610.20106-47-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Couple of fixes:
- ath12k performance regression from -rc1
- cfg80211 counted_by() removal for scan request
as it doesn't match usage and keeps complaining
- iwlwifi crash with certain older devices
- iwlwifi missing an error path unlock
- iwlwifi compatibility with certain BIOS updates
* tag 'wireless-2025-07-17' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: Fix botched indexing conversion
wifi: cfg80211: remove scan request n_channels counted_by
wifi: ath12k: Fix packets received in WBM error ring with REO LUT enabled
wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap
wifi: iwlwifi: pcie: fix locking on invalid TOP reset
====================
Link: https://patch.msgid.link/20250717091831.18787-5-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If we reload the GuC due to suspend/resume or GT reset then we
have to resend not only any VFs provisioning data, but also PF
configuration, like scheduling parameters (EQ, PT), as otherwise
GuC will continue to use default values.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-3-michal.wajdeczko@intel.com
(cherry picked from commit 1c38dd6afa4a8ecce28e94da794fd1d205c30f51)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
As part of the resume or GT reset, the PF driver schedules work
which is then used to complete restarting of the SR-IOV support,
including resending to the GuC configurations of provisioned VFs.
However, in case of short delay between those two actions, which
could be seen by triggering a GT reset on the suspened device:
$ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset
this PF worker might be still busy, which lead to errors due to
just stopped or disabled GuC CTB communication:
[ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed
[ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe]
[ ] xe 0000:00:02.0: [drm] GT0: reset queued
[ ] xe 0000:00:02.0: [drm] GT0: reset started
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped
[ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled!
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED)
[ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV)
[ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations
[ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed
While this VFs reprovisioning will be successful during next spin
of the worker, to avoid those errors, make sure to cancel restart
worker if we are about to trigger next reset.
Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com
(cherry picked from commit 9f50b729dd61dfb9f4d7c66900d22a7c7353a8c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The check would fail if the address is unaligned, but not when
accounting the offset. Instead of `buf | offset` it should have
been `buf + offset`. To make it more readable and also drop the
uintptr_t, just use the IS_ALIGNED() macro.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 81e139db6900503a2e68009764054fad128fbf95)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
We need the topology to determine GT page fault queue size, move page
fault init after topology init.
Cc: stable@vger.kernel.org
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
(cherry picked from commit beb72acb5b38dbe670d8eb752d1ad7a32f9c4119)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
MOCS uc_index is used even before it is initialized in the following
callstack
guc_prepare_xfer()
__xe_guc_upload()
xe_guc_min_load_for_hwconfig()
xe_uc_init_hwconfig()
xe_gt_init_hwconfig()
Do MOCS index initialization earlier in the device probe.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com>
Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 241cc827c0987d7173714fc5a95a7c8fc9bf15c0)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
The delay that we are waiting on in brcm_pcie_start_link() is
PCIE_T_RRS_READY_MS, use it.
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
[mani: Removed the redundant comment]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250624231923.990361-3-florian.fainelli@broadcom.com
|
|
After we do the modification on the host side, ensure we write the
result back to VRAM and not the other way around, otherwise the
modification will be lost if treated like a read.
Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com
(cherry picked from commit c12fe703cab93f9d8bfe0ff32b58e7b1fd52be1f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Skipping TLB invalidations on VF causing unrecoverable
faults. Probable reason for skipping TLB invalidations
on SRIOV could be lack of support for instruction
MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some
additional handling.
Helps in resolving,
[ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]]
ASID: 0
VFID: 0
PDATA: 0x0d92
Faulted Address: 0x0000000002fa0000
FaultType: 0
AccessType: 1
FaultLevel: 0
EngineClass: 3 bcs
EngineInstance: 8
[ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22
V2:
- Use Xmas tree (MichalW)
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 97515d0b3ed92 ("drm/xe/vf: Don't emit access to Global HWSP if VF")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
(cherry picked from commit b528e896fa570844d654b5a4617a97fa770a1030)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
Currently all paths that set err and then check it for an error
perform immediate returns, hence err always zero at the end of
the function _mlx5r_umr_zap_mkey. The return expression
err ? err : nblocks has a redundant check on the err since err
is always zero, so just return nblocks instead.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20250717112108.4036171-1-colin.i.king@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Commit 2df7168717b7 ("dm: Always split write BIOs to zoned device
limits") updates the device-mapper driver to perform splits for the
write BIOs. However, it did not address the cases where DM targets do
not emulate zone append, such as in the cases of dm-linear or dm-flakey.
For these targets, when the write BIOs span across zone boundaries, they
trigger WARN_ON_ONCE(bio_straddles_zones(bio)) in
blk_zone_wplug_handle_write(). This results in I/O errors. The errors
are reproduced by running blktests test case zbd/004 using zoned
dm-linear or dm-flakey devices.
To avoid the I/O errors, handle the write BIOs regardless whether DM
targets emulate zone append or not, so that all write BIOs are split at
zone boundaries. For that purpose, drop the check for zone append
emulation in dm_zone_bio_needs_split(). Its argument 'md' is no longer
used then drop it also.
Fixes: 2df7168717b7 ("dm: Always split write BIOs to zoned device limits")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Link: https://lore.kernel.org/r/20250717103539.37279-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.
Setting chunk_sectors limit in this way overrides the stacked limit
already calculated based on the bottom device limits. This is ok, as
when any bios are sent to the bottom devices, the block layer will still
respect the bottom device chunk_sectors.
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-6-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Same as done for raid0, set chunk_sectors limit to appropriately set the
atomic write size limit.
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-5-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently we use min io size as the chunk size when deciding on the
atomic write size limits - see blk_stack_atomic_writes_head().
The limit min_io size is not a reliable value to store the chunk size, as
this may be mutated by the block stacking code. Such an example would be
for the min io size less than the physical block size, and the min io size
is raised to the physical block size - see blk_stack_limits().
The block stacking limits will rely on chunk_sectors in future,
so set this value (to the chunk size).
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20250711105258.3135198-4-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NVMe fixes from Christoph:
"- revert the cross-controller atomic write size validation that caused
regressions (Christoph Hellwig)
- fix endianness of command word prints in nvme_log_err_passthru()
(John Garry)
- fix callback lock for TLS handshake (Maurizio Lombardi)
- fix misaccounting of nvme-mpath inflight I/O (Yu Kuai)
- fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
(Zheng Qixing)"
* tag 'nvme-6.16-2025-07-17' of git://git.infradead.org/nvme:
nvmet-tcp: fix callback lock for TLS handshake
nvme: fix misaccounting of nvme-mpath inflight I/O
nvme: revert the cross-controller atomic write size validation
nvme: fix endianness of command word prints in nvme_log_err_passthru()
nvme: fix inconsistent RCU list manipulation in nvme_ns_add_to_ctrl_list()
|
|
Have nvmet_req_init() and req->execute() complete failed commands.
Description of the problem:
nvmet_req_init() calls __nvmet_req_complete() internally upon failure,
e.g., unsupported opcode, which calls the "queue_response" callback,
this results in nvmet_pci_epf_queue_response() being called, which will
call nvmet_pci_epf_complete_iod() if data_len is 0 or if dma_dir is
different from DMA_TO_DEVICE. This results in a double completion as
nvmet_pci_epf_exec_iod_work() also calls nvmet_pci_epf_complete_iod()
when nvmet_req_init() fails.
Steps to reproduce:
On the host send a command with an unsupported opcode with nvme-cli,
For example the admin command "security receive"
$ sudo nvme security-recv /dev/nvme0n1 -n1 -x4096
This triggers a double completion as nvmet_req_init() fails and
nvmet_pci_epf_queue_response() is called, here iod->dma_dir is still
in the default state of "DMA_NONE" as set by default in
nvmet_pci_epf_alloc_iod(), so nvmet_pci_epf_complete_iod() is called.
Because nvmet_req_init() failed nvmet_pci_epf_complete_iod() is also
called in nvmet_pci_epf_exec_iod_work() leading to a double completion.
This not only sends two completions to the host but also corrupts the
state of the PCI NVMe target leading to kernel oops.
This patch lets nvmet_req_init() and req->execute() complete all failed
commands, and removes the double completion case in
nvmet_pci_epf_exec_iod_work() therefore fixing the edge cases where
double completions occurred.
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Update the nvme_tcp_start_tls() function to use dev_err() instead of
dev_dbg() when a TLS error is detected. This ensures that handshake
failures are visible by default, aiding in debugging.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Correct a typo error in the NVMe status code constant from
NVME_SC_SELT_TEST_IN_PROGRESS to NVME_SC_SELF_TEST_IN_PROGRESS to
accurately reflect its meaning.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Remove the unnecessary ret = -EMFILE; assignment since it is immediately
overwritten by the result of nvmet_bdev_ns_enable() The initial value
(-EMFILE) is redundant because it has no effect on the code logic or
outcome.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|