linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-07-12	Merge tag 'drm-misc-next-fixes-2024-07-11' of ↵	Dave Airlie
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-next A fix for fbdev on big endian systems, a condition fix for a sharp panel at removal, and a fix for qxl to prevent unpinned buffer access under certain conditions. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711-benign-rich-mouflon-2eeafe@houat
2024-07-12	tipc: Consolidate redundant functions	Shigeru Yoshida
	link_is_up() and tipc_link_is_up() have the same functionality. Consolidate these functions. Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12	tipc: Remove unused struct declaration	Shigeru Yoshida
	struct tipc_name_table in core.h is not used. Remove this declaration. Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12	Documentation/powerpc: Mention 40x is removed	Christophe Leroy
	Commit 732b32daef80 ("powerpc: Remove core support for 40x") removed 40x. Update documentation accordingly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c2d64bebc514ca892a12e51a68821a6317048d3a.1720694954.git.christophe.leroy@csgroup.eu
2024-07-12	powerpc: Remove 40x leftovers	Christophe Leroy
	Remove stale references to 40x. Fixes: e939da89d024 ("powerpc: Remove 40x from Kconfig and defconfig") Fixes: 548f5244f106 ("powerpc/40x: Remove EP405") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ab30ae302783d8617d407864b92db1b926ab5ab9.1720694914.git.christophe.leroy@csgroup.eu
2024-07-11	Merge branch 'netconsole-fix-potential-race-condition-and-improve-code-clarity'	Jakub Kicinski
	Breno Leitao says: ==================== netconsole: improve code clarity These changes aim to enhance the reliability of netconsole by eliminating the potential race condition and improve maintainability by making the code more straightforward to understand and modify. ==================== Link: https://patch.msgid.link/20240709144403.544099-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net: netconsole: Eliminate redundant setting of enabled field	Breno Leitao
	When disabling a netconsole target, enabled_store() is called with enabled=false. Currently, this results in updating the nt->enabled field twice: 1. Inside the if/else block, with the target_list_lock spinlock held 2. Later, without the target_list_lock This patch eliminates the redundancy by setting the field only once, improving efficiency and reducing potential race conditions. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240709144403.544099-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net: netconsole: Remove unnecessary cast from bool	Breno Leitao
	The 'enabled' variable is already a bool, so casting it to its value is redundant. Remove the superfluous cast, improving code clarity without changing functionality. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240709144403.544099-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12	md/raid1: set max_sectors during early return from choose_slow_rdev()	Mateusz Jończyk
	Linux 6.9+ is unable to start a degraded RAID1 array with one drive, when that drive has a write-mostly flag set. During such an attempt, the following assertion in bio_split() is hit: BUG_ON(sectors <= 0); Call Trace: ? bio_split+0x96/0xb0 ? exc_invalid_op+0x53/0x70 ? bio_split+0x96/0xb0 ? asm_exc_invalid_op+0x1b/0x20 ? bio_split+0x96/0xb0 ? raid1_read_request+0x890/0xd20 ? __call_rcu_common.constprop.0+0x97/0x260 raid1_make_request+0x81/0xce0 ? __get_random_u32_below+0x17/0x70 ? new_slab+0x2b3/0x580 md_handle_request+0x77/0x210 md_submit_bio+0x62/0xa0 __submit_bio+0x17b/0x230 submit_bio_noacct_nocheck+0x18e/0x3c0 submit_bio_noacct+0x244/0x670 After investigation, it turned out that choose_slow_rdev() does not set the value of max_sectors in some cases and because of it, raid1_read_request calls bio_split with sectors == 0. Fix it by filling in this variable. This bug was introduced in commit dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()") but apparently hidden until commit 0091c5a269ec ("md/raid1: factor out helpers to choose the best rdev from read_balance()") shortly thereafter. Cc: stable@vger.kernel.org # 6.9.x+ Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Fixes: dfa8ecd167c1 ("md/raid1: factor out choose_slow_rdev() from read_balance()") Cc: Song Liu <song@kernel.org> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Paul Luse <paul.e.luse@linux.intel.com> Cc: Xiao Ni <xni@redhat.com> Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@o2.pl/ -- Tested on both Linux 6.10 and 6.9.8. Inside a VM, mdadm testsuite for RAID1 on 6.10 did not find any problems: ./test --dev=loop --no-error --raidtype=raid1 (on 6.9.8 there was one failure, caused by external bitmap support not compiled in). Notes: - I was reliably getting deadlocks when adding / removing devices on such an array - while the array was loaded with fsstress with 20 concurrent processes. When the array was idle or loaded with fsstress with 8 processes, no such deadlocks happened in my tests. This occurred also on unpatched Linux 6.8.0 though, but not on 6.1.97-rc1, so this is likely an independent regression (to be investigated). - I was also getting deadlocks when adding / removing the bitmap on the array in similar conditions - this happened on Linux 6.1.97-rc1 also though. fsstress with 8 concurrent processes did cause it only once during many tests. - in my testing, there was once a problem with hot adding an internal bitmap to the array: mdadm: Cannot add bitmap while array is resyncing or reshaping etc. mdadm: failed to set internal bitmap. even though no such reshaping was happening according to /proc/mdstat. This seems unrelated, though. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240711202316.10775-1-mat.jonczyk@o2.pl
2024-07-12	md-cluster: fix no recovery job when adding/re-adding a disk	Heming Zhao
	The commit db5e653d7c9f ("md: delay choosing sync action to md_start_sync()") delays the start of the sync action. In a clustered environment, this will cause another node to first activate the spare disk and skip recovery. As a result, no nodes will perform recovery when a disk is added or re-added. Before db5e653d7c9f: ``` node1 node2 ---------------------------------------------------------------- md_check_recovery + md_update_sb \| sendmsg: METADATA_UPDATED + md_choose_sync_action process_metadata_update \| remove_and_add_spares //node1 has not finished adding + call mddev->sync_work //the spare disk:do nothing md_start_sync starts md_do_sync md_do_sync + grabbed resync_lockres:DLM_LOCK_EX + do syncing job md_check_recovery sendmsg: METADATA_UPDATED process_metadata_update //activate spare disk ... ... md_do_sync waiting to grab resync_lockres:EX ``` After db5e653d7c9f: (note: if 'cmd:idle' sets MD_RECOVERY_INTR after md_check_recovery starts md_start_sync, setting the INTR action will exacerbate the delay in node1 calling the md_do_sync function.) ``` node1 node2 ---------------------------------------------------------------- md_check_recovery + md_update_sb \| sendmsg: METADATA_UPDATED + calls mddev->sync_work process_metadata_update //node1 has not finished adding //the spare disk:do nothing md_start_sync + md_choose_sync_action \| remove_and_add_spares + calls md_do_sync md_check_recovery md_update_sb sendmsg: METADATA_UPDATED process_metadata_update //activate spare disk ... ... ... ... md_do_sync + grabbed resync_lockres:EX + raid1_sync_request skip sync under conf->fullsync:0 md_do_sync 1. waiting to grab resync_lockres:EX 2. when node1 could grab EX lock, node1 will skip resync under recovery_offset:MaxSector ``` How to trigger: ```(commands @node1) # to easily watch the recovery status echo 2000 > /proc/sys/dev/raid/speed_limit_max ssh root@node2 "echo 2000 > /proc/sys/dev/raid/speed_limit_max" mdadm -CR /dev/md0 -l1 -b clustered -n 2 /dev/sda /dev/sdb --assume-clean ssh root@node2 mdadm -A /dev/md0 /dev/sda /dev/sdb mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda mdadm --manage /dev/md0 --add /dev/sdc === "cat /proc/mdstat" on both node, there are no recovery action. === ``` How to fix: because md layer code logic is hard to restore for speeding up sync job on local node, we add new cluster msg to pending the another node to active disk. Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Acked-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240709104120.22243-2-heming.zhao@suse.com
2024-07-12	md-cluster: fix hanging issue while a new disk adding	Heming Zhao
	The commit 1bbe254e4336 ("md-cluster: check for timeout while a new disk adding") is correct in terms of code syntax but not suite real clustered code logic. When a timeout occurs while adding a new disk, if recv_daemon() bypasses the unlock for ack_lockres:CR, another node will be waiting to grab EX lock. This will cause the cluster to hang indefinitely. How to fix: 1. In dlm_lock_sync(), change the wait behaviour from forever to a timeout, This could avoid the hanging issue when another node fails to handle cluster msg. Another result of this change is that if another node receives an unknown msg (e.g. a new msg_type), the old code will hang, whereas the new code will timeout and fail. This could help cluster_md handle new msg_type from different nodes with different kernel/module versions (e.g. The user only updates one leg's kernel and monitors the stability of the new kernel). 2. The old code for __sendmsg() always returns 0 (success) under the design (must successfully unlock ->message_lockres). This commit makes this function return an error number when an error occurs. Fixes: 1bbe254e4336 ("md-cluster: check for timeout while a new disk adding") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Su Yue <glass.su@suse.com> Acked-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240709104120.22243-1-heming.zhao@suse.com
2024-07-11	selftests: openvswitch: retry instead of sleep	Adrian Moreno
	There are a couple of places where the test script "sleep"s to wait for some external condition to be met. This is error prone, specially in slow systems (identified in CI by "KSFT_MACHINE_SLOW=yes"). To fix this, add a "ovs_wait" function that tries to execute a command a few times until it succeeds. The timeout used is set to 5s for "normal" systems and doubled if a slow CI machine is detected. This should make the following work: $ vng --build \ --config tools/testing/selftests/net/config \ --config kernel/configs/debug.config $ vng --run . --user root -- "make -C tools/testing/selftests/ \ KSFT_MACHINE_SLOW=yes TARGETS=net/openvswitch run_tests" Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Ilya Maximets <i.maximets@ovn.org> Link: https://patch.msgid.link/20240710090500.1655212-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	netdevice: define and allocate &net_device _properly_	Alexander Lobakin
	In fact, this structure contains a flexible array at the end, but historically its size, alignment etc., is calculated manually. There are several instances of the structure embedded into other structures, but also there's ongoing effort to remove them and we could in the meantime declare &net_device properly. Declare the array explicitly, use struct_size() and store the array size inside the structure, so that __counted_by() can be applied. Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the allocated buffer is aligned to what the user expects. Also, change its alignment from %NETDEV_ALIGN to the cacheline size as per several suggestions on the netdev ML. bloat-o-meter for vmlinux: free_netdev 445 440 -5 netdev_freemem 24 - -24 alloc_netdev_mqs 1481 1450 -31 On x86_64 with several NICs of different vendors, I was never able to get a &net_device pointer not aligned to the cacheline size after the change. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20240710113036.2125584-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net: psample: fix flag being set in wrong skb	Adrian Moreno
	A typo makes PSAMPLE_ATTR_SAMPLE_RATE netlink flag be added to the wrong sk_buff. Fix the error and make the input sk_buff pointer "const" so that it doesn't happen again. Acked-by: Eelco Chaudron <echaudro@redhat.com> Fixes: 7b1b2b60c63f ("net: psample: allow using rate as probability") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Antoine Tenart <atenart@kernel.org> Link: https://patch.msgid.link/20240710171004.2164034-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12	bna: adjust 'name' buf size of bna_tcb and bna_ccb structures	Alexey Kodanev
	To have enough space to write all possible sprintf() args. Currently 'name' size is 16, but the first '%s' specifier may already need at least 16 characters, since 'bnad->netdev->name' is used there. For '%d' specifiers, assume that they require: * 1 char for 'tx_id + tx_info->tcb[i]->id' sum, BNAD_MAX_TXQ_PER_TX is 8 * 2 chars for 'rx_id + rx_info->rx_ctrl[i].ccb->id', BNAD_MAX_RXP_PER_RX is 16 And replace sprintf with snprintf. Detected using the static analysis tool - Svace. Fixes: 8b230ed8ec96 ("bna: Brocade 10Gb Ethernet device driver") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-11	Revert "drm/amd/display: Reset freesync config before update new state"	Leo Li
	This change caused PSR SU panels to not read from their remote fb, preventing us from entering self-refresh. It is a regression. This reverts commit 6b8487cdf9fc7bae707519ac5b5daeca18d1e85b. Signed-off-by: Leo Li <sunpeng.li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-11	i40e: fix: remove needless retries of NVM update	Aleksandr Loktionov
	Remove wrong EIO to EGAIN conversion and pass all errors as is. After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only replace F/W specific error codes with Linux kernel generic, all EIO errors suddenly started to be converted into EAGAIN which leads nvmupdate to retry until it timeouts and sometimes fails after more than 20 minutes in the middle of NVM update, so NVM becomes corrupted. The bug affects users only at the time when they try to update NVM, and only F/W versions that generate errors while nvmupdate. For example, X710DA2 with 0x8000ECB7 F/W is affected, but there are probably more... Command for reproduction is just NVM update: ./nvmupdate64 In the log instead of: i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM) appears: i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM i40e: eeprom check failed (-5), Tx/Rx traffic disabled The problematic code did silently convert EIO into EAGAIN which forced nvmupdate to ignore EAGAIN error and retry the same operation until timeout. That's why NVM update takes 20+ minutes to finish with the fail in the end. Fixes: 230f3d53a547 ("i40e: remove i40e_status") Co-developed-by: Kelvin Kang <kelvin.kang@intel.com> Signed-off-by: Kelvin Kang <kelvin.kang@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	Merge tag 'wireless-next-2024-07-11' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.11 Most likely the last "new features" pull request for v6.11 with changes both in stack and in drivers. The big thing is the multiple radios for wiphy feature which makes it possible to better advertise radio capabilities to user space. mt76 enabled MLO and iwlwifi re-enabled MLO, ath12k and rtw89 Wi-Fi 6 devices got WoWLAN support. Major changes: cfg80211/mac80211 * remove DEAUTH_NEED_MGD_TX_PREP flag * multiple radios per wiphy support mac80211_hwsim * multi-radio wiphy support ath12k * DebugFS support for datapath statistics * WCN7850: support for WoW (Wake on WLAN) * WCN7850: device-tree bindings ath11k * QCA6390: device-tree bindings iwlwifi * mvm: re-enable Multi-Link Operation (MLO) * aggregation (A-MSDU) optimisations rtw89 * preparation for RTL8852BE-VT support * WoWLAN support for WiFi 6 chips * 36-bit PCI DMA support mt76 * mt7925 Multi-Link Operation (MLO) support * tag 'wireless-next-2024-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (204 commits) wifi: mac80211: fix AP chandef capturing in CSA wifi: iwlwifi: correctly reference TSO page information wifi: mt76: mt792x: fix scheduler interference in drv own process wifi: mt76: mt7925: enabling MLO when the firmware supports it wifi: mt76: mt7925: remove the unused mt7925_mcu_set_chan_info wifi: mt76: mt7925: update mt7925_mac_link_bss_add for MLO wifi: mt76: mt7925: update mt7925_mcu_bss_basic_tlv for MLO wifi: mt76: mt7925: update mt7925_mcu_set_timing for MLO wifi: mt76: mt7925: update mt7925_mcu_sta_phy_tlv for MLO wifi: mt76: mt7925: update mt7925_mcu_sta_rate_ctrl_tlv for MLO wifi: mt76: mt7925: add mt7925_mcu_sta_eht_mld_tlv for MLO wifi: mt76: mt7925: update mt7925_mcu_sta_update for MLO wifi: mt76: mt7925: update mt7925_mcu_add_bss_info for MLO wifi: mt76: mt7925: update mt7925_mcu_bss_mld_tlv for MLO wifi: mt76: mt7925: update mt7925_mcu_sta_mld_tlv for MLO wifi: mt76: mt7925: add mt7925_[assign,unassign]_vif_chanctx wifi: mt76: add def_wcid to struct mt76_wcid wifi: mt76: mt7925: report link information in rx status wifi: mt76: mt7925: update rate index according to link id wifi: mt76: mt7925: add link handling in the mt7925_ipv6_addr_change ... ==================== Link: https://patch.msgid.link/20240711102353.0C849C116B1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net: ethtool: Fix RSS setting	Saeed Mahameed
	When user submits a rxfh set command without touching XFRM_SYM_XOR, rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff. Testing if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR && !ops->cap_rss_sym_xor_supported) return -EOPNOTSUPP; Will always be true on devices that don't set cap_rss_sym_xor_supported, since rxfh.input_xfrm & RXH_XFRM_SYM_XOR is always true, if input_xfrm was not set, i.e RXH_XFRM_NO_CHANGE=0xff, which will result in failure of any command that doesn't require any change of XFRM, e.g RSS context or hash function changes. To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE before testing other conditions. Note that the problem will only trigger with XFRM-aware userspace, old ethtool CLI would continue to work. Fixes: 0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net: reduce rtnetlink_rcv_msg() stack usage	Eric Dumazet
	IFLA_MAX is increasing slowly but surely. Some compilers use more than 512 bytes of stack in rtnetlink_rcv_msg() because it calls rtnl_calcit() for RTM_GETLINK message. Use noinline_for_stack attribute to not inline rtnl_calcit(), and directly use nla_for_each_attr_type() (Jakub suggestion) because we only care about IFLA_EXT_MASK at this stage. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240710151653.3786604-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	net/sched: act_skbmod: convert comma to semicolon	Chen Ni
	Replace a comma between expression statements by a semicolon. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240709072838.1152880-1-nichen@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	bcachefs: bch2_gc_btree() should not use btree_root_lock	Kent Overstreet
	btree_root_lock is for the root keys in btree_root, not the pointers to the nodes themselves; this fixes a lock ordering issue between btree_root_lock and btree node locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11	bcachefs: Set PF_MEMALLOC_NOFS when trans->locked	Kent Overstreet
	proper lock ordering is: fs_reclaim -> btree node locks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11	bcachefs; Use trans_unlock_long() when waiting on allocator	Kent Overstreet
	not using unlock_long() blocks key cache reclaim, and the allocator may take awhile Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11	Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"	Kent Overstreet
	This reverts commit 86d81ec5f5f05846c7c6e48ffb964b24cba2e669. This wasn't tested with memcg enabled, it immediately hits a null ptr deref in list_lru_add(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-12	dt-bindings: i2c: amlogic,meson6-i2c: add optional power-domains	George Stark
	On newer SoCs, the I2C hardware can require a power domain to operate. Since the same compatible is used for older and newer SoCs make power-domains property optional. Signed-off-by: George Stark <gnstark@salutedevices.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-07-12	bootconfig: Remove duplicate included header file linux/bootconfig.h	Thorsten Blum
	The header file linux/bootconfig.h is included whether __KERNEL__ is defined or not. Include it only once before the #ifdef/#else/#endif preprocessor directives and remove the following make includecheck warning: linux/bootconfig.h is included more than once Move the comment to the top and delete the now empty #else block. Link: https://lore.kernel.org/all/20240711084315.1507-1-thorsten.blum@toblux.com/ Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-07-11	Merge branch 'for-6.11/xor_fixes' into cxl-for-next	Dave Jiang
	Series to fix XOR math for DPA to SPA translation - Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa() - Complete DPA->HPA->SPA translation and correct XOR translation issue - Add new method to verify a CXL target position - Remove old method of CXL target position verifiation
2024-07-12	i2c: rcar: ensure Gen3+ reset does not disturb local targets	Wolfram Sang
	R-Car Gen3+ needs a reset before every controller transfer. That erases configuration of a potentially in parallel running local target instance. To avoid this disruption, avoid controller transfers if a local target is running. Also, disable SMBusHostNotify because it requires being a controller and local target at the same time. Fixes: 3b770017b03a ("i2c: rcar: handle RXDMA HW behaviour on Gen3") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-07-11	iommufd: Remove IOMMUFD_PAGE_RESP_FAILURE	Lu Baolu
	The response code of IOMMUFD_PAGE_RESP_FAILURE was defined to be equivalent to the "Response Failure" in PCI spec, section 10.4.2.1. This response code indicates that one or more pages within the associated request group have encountered or caused an unrecoverable error. Therefore, this response disables the PRI at the function. Modern I/O virtualization technologies, like SR-IOV, share PRI among the assignable device units. Therefore, a response failure on one unit might cause I/O failure on other units. Remove this response code so that user space can only respond with SUCCESS or INVALID. The VMM is recommended to emulate a failure response as a PRI reset, or PRI disable and changing to a non-PRI domain. Fixes: c714f15860fc ("iommufd: Add fault and response message definitions") Link: https://lore.kernel.org/r/20240710083341.44617-2-baolu.lu@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-07-11	cxl: Remove defunct code calculating host bridge target positions	Alison Schofield
	The CXL Spec 3.1 Table 9-22 requires that the BIOS populate the CFMWS target list in interleave target order. This means the calculations the CXL driver added to determine positions when XOR math is in use, along with the entire XOR vs Modulo call back setup is not needed. A prior patch added a common method to verify positions. Remove the now unused code related to the cxl_calc_hb_fn. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/2e2c32a2d0f1007e920b58712d15edad2e48d857.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-11	cxl/region: Verify target positions using the ordered target list	Alison Schofield
	When a root decoder is configured the interleave target list is read from the BIOS populated CFMWS structure. Per the CXL spec 3.1 Table 9-22 the target list is in interleave order. The CXL driver populates its decoder target list in the same order and stores it in 'struct cxl_switch_decoder' field "@target: active ordered target list in current decoder configuration" Given the promise of an ordered list, the driver can stop duplicating the work of BIOS and simply check target positions against the ordered list during region configuration. The simplified check against the ordered list is presented here. A follow-on patch will remove the unused code. For Modulo arithmetic this is not a fix, only a simplification. For XOR arithmetic this is a fix for HB IW of 3,6,12. Fixes: f9db85bfec0d ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/35d08d3aba08fee0f9b86ab1cef0c25116ca8a55.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-11	cxl: Restore XOR'd position bits during address translation	Alison Schofield
	When a device reports a DPA in events like poison, general_media, and dram, the driver translates that DPA back to an HPA. Presently, the CXL driver translation only considers the Modulo position and will report the wrong HPA for XOR configured root decoders. Add a helper function that restores the XOR'd bits during DPA->HPA address translation. Plumb a root decoder callback to the new helper when XOR interleave arithmetic is in use. For Modulo arithmetic, just let the callback be NULL - as in no extra work required. Upon completion of a DPA->HPA translation a couple of checks are performed on the result. One simply confirms that the calculated HPA is within the address range of the region. That test is useful for both Modulo and XOR interleave arithmetic decodes. A second check confirms that the HPA is within an expected chunk based on the endpoints position in the region and the region granularity. An XOR decode disrupts the Modulo pattern making the chunk check useless. To align the checks with the proper decode, pull the region range check inline and use the helper to do the chunk check for Modulo decodes only. A cxl-test unit test is posted for upstream review here: https://lore.kernel.org/20240624210644.495563-1-alison.schofield@intel.com/ Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/1a1ac880d9f889bd6384e657e810431b9a0a72e5.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-11	cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()	Alison Schofield
	Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA addresses the work it performs is a DPA to HPA translation not a trace. Tidy up this naming by moving the minimal work done in cxl_trace_hpa() into cxl_dpa_to_hpa() and use cxl_dpa_to_hpa() for trace event callbacks. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/452a9b0c525b774c72d9d5851515ffa928750132.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-12	spi: dt-bindings: at91: Add sama7d65 compatible string	Nicolas Ferre
	Add compatible string for sama7d65. Like sam9x60 and sam9x7, it requires to bind to "atmel,at91rm9200-spi". Group these three under the same enum, sorted alphanumerically, and remove previously added item. Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20240711165402.373634-1-nicolas.ferre@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-07-12	ASoC: dt-bindings: cirrus,cs42xx8: Convert to dtschema	Animesh Agarwal
	Convert the Cirrus Logic CS42448/CS42888 audio CODEC bindings to DT schema format. Set power supply properties to required only for CS42888. Cc: Daniel Baluta <daniel.baluta@nxp.com> Signed-off-by: Animesh Agarwal <animeshagarwal28@gmail.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20240710072756.99765-1-animeshagarwal28@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-07-11	workqueue: Rename wq_update_pod() to unbound_wq_update_pwq()	Lai Jiangshan
	What wq_update_pod() does is just to update the pwq of the specific cpu. Rename it and update the comments. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Remove the arguments @hotplug_cpu and @online from wq_update_pod()	Lai Jiangshan
	The arguments @hotplug_cpu and @online are not used in wq_update_pod() since the functions called by wq_update_pod() don't need them. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Remove the argument @cpu_going_down from wq_calc_pod_cpumask()	Lai Jiangshan
	wq_calc_pod_cpumask() uses wq_online_cpumask, which excludes the cpu going down, so the argument cpu_going_down is unused and can be removed. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Remove the unneeded cpumask empty check in wq_calc_pod_cpumask()	Lai Jiangshan
	The cpumask empty check in wq_calc_pod_cpumask() has long been useless. It just works purely as documents which states that the cpumask is not possible empty after the function returns. Now the code above is even more explicit that the cpumask is not empty, so the document-only empty check can be removed. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Remove cpus_read_lock() from apply_wqattrs_lock()	Lai Jiangshan
	1726a1713590 ("workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.") led to the following possible deadlock: WARNING: possible recursive locking detected 6.10.0-rc5-00004-g1d4c6111406c #1 Not tainted -------------------------------------------- swapper/0/1 is trying to acquire lock: c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730) but task is already holding lock: c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: padata_alloc (kernel/padata.c:1007) ... stack backtrace: ... cpus_read_lock (include/linux/percpu-rwsem.h:53 kernel/cpu.c:488) alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730) padata_alloc (kernel/padata.c:1007 (discriminator 1)) pcrypt_init_padata (crypto/pcrypt.c:327 (discriminator 1)) pcrypt_init (crypto/pcrypt.c:353) do_one_initcall (init/main.c:1267) do_initcalls (init/main.c:1328 (discriminator 1) init/main.c:1345 (discriminator 1)) kernel_init_freeable (init/main.c:1364) kernel_init (init/main.c:1469) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_32.S:737) entry_INT80_32 (arch/x86/entry/entry_32.S:944) This is caused by pcrypt allocating a workqueue while holding cpus_read_lock(), so workqueue code can't do it again as that can lead to deadlocks if down_write starts after the first down_read. The pwq creations and installations have been reworked based on wq_online_cpumask rather than cpu_online_mask making cpus_read_lock() is unneeded during wqattrs changes. Fix the deadlock by removing cpus_read_lock() from apply_wqattrs_lock(). tj: Updated changelog. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Fixes: 1726a1713590 ("workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.") Link: http://lkml.kernel.org/r/202407081521.83b627c1-lkp@intel.com Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Simplify wq_calc_pod_cpumask() with wq_online_cpumask	Lai Jiangshan
	Avoid relying on cpu_online_mask for wqattrs changes so that cpus_read_lock() can be removed from apply_wqattrs_lock(). And with wq_online_cpumask, attrs->__pod_cpumask doesn't need to be reused as a temporary storage to calculate if the pod have any online CPUs @attrs wants since @cpu_going_down is not in the wq_online_cpumask. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	workqueue: Add wq_online_cpumask	Lai Jiangshan
	The new wq_online_mask mirrors the cpu_online_mask except during hotplugging; specifically, it differs between the hotplugging stages of workqueue_offline_cpu() and workqueue_online_cpu(), during which the transitioning CPU is not represented in the mask. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-11	Merge tag 'for-6.10/dm-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mikulas Patocka: - Fix broken discard for device mapper VDO target * tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm vdo: replace max_discard_sectors with max_hw_discard_sectors
2024-07-11	perf build: Conditionally add feature check flags for libtrace{event,fs}	Guilherme Amadio
	This avoids reported warnings when the packages are not installed. [namhyung]: Removed the dummy assignment and unnecessary ifeq checks. Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}") Signed-off-by: Guilherme Amadio <amadio@gentoo.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Thorsten Leemhuis <linux@leemhuis.info> Link: https://lore.kernel.org/r/20240628203432.3273625-1-amadio@gentoo.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-11	Merge branch ↵	Jakub Kicinski
	'ethtool-use-the-rss-context-xarray-in-ring-deactivation-safety-check' Jakub Kicinski says: ==================== ethtool: use the rss context XArray in ring deactivation safety-check Now that we have an XArray storing information about all extra RSS contexts - use it to extend checks already performed using ethtool_get_max_rxfh_channel(). ==================== Link: https://patch.msgid.link/20240710174043.754664-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	ethtool: use the rss context XArray in ring deactivation safety-check	Jakub Kicinski
	ethtool_get_max_rxfh_channel() gets called when user requests deactivating Rx channels. Check the additional RSS contexts, too. While we do track whether RSS context has an indirection table explicitly set by the user, no driver looks at that bit. Assume drivers won't auto-regenerate the additional tables, to be safe. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240710174043.754664-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	ethtool: fail closed if we can't get max channel used in indirection tables	Jakub Kicinski
	Commit 0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts") proves that allowing indirection table to contain channels with out of bounds IDs may lead to crashes. Currently the max channel check in the core gets skipped if driver can't fetch the indirection table or when we can't allocate memory. Both of those conditions should be extremely rare but if they do happen we should try to be safe and fail the channel change. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240710174043.754664-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11	drm/vboxvideo: fix mapping leaks	Philipp Stanner
	When the PCI devres API was introduced to this driver, it was wrongly assumed that initializing the device with pcim_enable_device() instead of pci_enable_device() will make all PCI functions managed. This is wrong and was caused by the quite confusing PCI devres API in which some, but not all, functions become managed that way. The function pci_iomap_range() is never managed. Replace pci_iomap_range() with the managed function pcim_iomap_range(). Fixes: 8558de401b5f ("drm/vboxvideo: use managed pci functions") Link: https://lore.kernel.org/r/20240613115032.29098-14-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2024-07-11	PCI: Add managed pcim_iomap_range()	Philipp Stanner
	The only managed mapping function currently is pcim_iomap() which doesn't allow for mapping an area starting at a certain offset, which many drivers want. Add pcim_iomap_range() as an exported function. Link: https://lore.kernel.org/r/20240613115032.29098-13-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>