linux.git - Linus' kernel tree

Age	Commit message	Author
2024-03-01	i40e: disable NAPI right after disabling irqs when handling xsk_pool	Maciej Fijalkowski
	Disable NAPI before shutting down queues that this particular NAPI contains so that the order of actions in i40e_queue_pair_disable() mirrors what we do in i40e_queue_pair_enable(). Fixes: 123cecd427b6 ("i40e: added queue pair disable/enable functions") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-01	ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}able	Maciej Fijalkowski
	Currently routines that are supposed to toggle state of ring pair do not take care of associated interrupt with queue vector that these rings belong to. This causes funky issues such as dead interface due to irq misconfiguration, as per Pavel's report from Closes: tag. Add a function responsible for disabling single IRQ in EIMC register and call this as a very first thing when disabling ring pair during xsk_pool setup. For enable let's reuse ixgbe_irq_enable_queues(). Besides this, disable/enable NAPI as first/last thing when dealing with closing or opening ring pair that xsk_pool is being configured on. Reported-by: Pavel Vazharov <pavel@x3me.net> Closes: https://lore.kernel.org/netdev/CAJEV1ijxNyPTwASJER1bcZzS9nMoZJqfR86nu_3jFFVXzZQ4NA@mail.gmail.com/ Fixes: 024aa5800f32 ("ixgbe: added Rx/Tx ring disable/enable functions") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-01	nbd: use the atomic queue limits API in nbd_set_size	Christoph Hellwig
	Use queue_limits_start_update / queue_limits_commit_update to update all the limits in one go and with proper sanity checking. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240229143846.1047223-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-01	nbd: freeze the queue for queue limits updates	Christoph Hellwig
	nbd currently updates the logical and physical block sizes as well as the discard_sectors on a live queue. Freeze the queue first to make sure there are no commands in flight that can see torn or inconsistent limits. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240229143846.1047223-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-01	nbd: don't clear discard_sectors in nbd_config_put	Christoph Hellwig
	nbd_config_put currently clears discard_sectors when unusing a device. This is pretty odd behavior and different from the sector size configuration which is simply left in place and then reconfigured when nbd_set_size is called as part of configuring the device. Change nbd_set_size to clear discard_sectors if discard is not supported so that all the queue limits changes are handled in one place. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240229143846.1047223-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-01	pktcdvd: don't set max_hw_sectors on the underlying device	Christoph Hellwig
	pktcdvd sets max_hw_sectors on the queue of the underlying device that it doesn't own (and doesn't reset it ever) since the driver was merged. This can create all kinds of problems as the underlying driver doesn't even know about it changing the limit. As the stated purpose is to not create I/Os larger than a single frame, and pktcdvd never builds bios larger than that, just set REQ_NOMERGE on the bios it submits so that larger I/Os never get built. Note: I don't have packet writing hardware, so this is compile tested only. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240229144408.1047967-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-01	RAS/AMD/FMPM: Add debugfs interface to print record entries	Yazen Ghannam
	It is helpful to see the saved record entries during run time in human-readable format. This is useful for testing during module development. It can also be used by system admins to quickly and easily see the state of the system. Provide a sequential file in debugfs to print fields of interest from the FRU records and their entries. Don't fail to load the module if the debugfs interface is not available. This is a convenience feature which does not affect other module functionality. The new interface reads the record entries and should hold the mutex. Expand the mutex code comment to clarify when it should be held. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240301143748.854090-4-yazen.ghannam@amd.com
2024-03-01	RAS/AMD/FMPM: Save SPA values	Yazen Ghannam
	The system physical address (SPA) of an error is not a stable value. It will change depending on the location of the memory: parts can be swapped. And it will change depending on memory topology: NUMA nodes and/or interleaving can be adjusted. Therefore, the SPA value is not part of the "FRU Memory Poison" record format. And it will not be saved to persistent storage. However, the SPA values can be helpful during debug and for system admins during run time. Save the SPA values in a separate structure. This is updated when records are restored and when new errors are saved. [ bp: Make error messages more user friendly and add and correct comments. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240301143748.854090-3-yazen.ghannam@amd.com
2024-03-01	RAS: Export helper to get ras_debugfs_dir	Borislav Petkov (AMD)
	Export a getter instead of the debugfs node directly so that other in-tree-only RAS modules can use it. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20240301143748.854090-2-yazen.ghannam@amd.com
2024-03-01	dm: use queue_limits_set	Christoph Hellwig
	Use queue_limits_set which validates the limits and takes care of updating the readahead settings instead of directly assigning them to the queue. For that make sure all limits are actually updated before the assignment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20240228225653.947152-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-03-01	iommu/sva: Fix SVA handle sharing in multi device case	Zhangfei Gao
	iommu_sva_bind_device will directly goto out in the multi-device case when it finds an existing domain, skipping the list_add of the handle, which causes the handle to fail to be shared. Fixes: 65d4418c5002 ("iommu/sva: Restore SVA handle sharing") Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240227064821.128-1-zhangfei.gao@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-03-01	Merge tag 'md-6.9-20240301' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.9/block	Jens Axboe
	Pull MD updates from Song: "The major changes are: 1. Refactor raid1 read_balance, by Yu Kuai and Paul Luse. 2. Clean up and fix for md_ioctl, by Li Nan. 3. Other small fixes, by Gui-Dong Han and Heming Zhao." * tag 'md-6.9-20240301' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: (22 commits) md/raid1: factor out helpers to choose the best rdev from read_balance() md/raid1: factor out the code to manage sequential IO md/raid1: factor out choose_bb_rdev() from read_balance() md/raid1: factor out choose_slow_rdev() from read_balance() md/raid1: factor out read_first_rdev() from read_balance() md/raid1-10: factor out a new helper raid1_should_read_first() md/raid1-10: add a helper raid1_check_read_range() md/raid1: fix choose next idle in read_balance() md/raid1: record nonrot rdevs while adding/removing rdevs to conf md/raid1: factor out helpers to add rdev to conf md: add a new helper rdev_has_badblock() md/raid5: fix atomicity violation in raid5_cache_count md/md-bitmap: fix incorrect usage for sb_index md: check mddev->pers before calling md_set_readonly() md: clean up openers check in do_md_stop() and md_set_readonly() md: sync blockdev before stopping raid or setting readonly md: factor out a helper to sync mddev md: Don't clear MD_CLOSING when the raid is about to stop md: return directly before setting did_set_md_closing md: clean up invalid BUG_ON in md_ioctl ...
2024-03-01	netdevsim: add ndo_get_iflink() implementation	David Wei
	Add an implementation for ndo_get_iflink() in netdevsim that shows the ifindex of the linked peer, if any. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Maciek Machnikowski <maciek@machnikowski.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	netdevsim: forward skbs from one connected port to another	David Wei
	Forward skbs sent from one netdevsim port to its connected netdevsim port using dev_forward_skb, in a spirit similar to veth. Add a tx_dropped variable to struct netdevsim, tracking the number of skbs that could not be forwarded using dev_forward_skb(). The xmit() function accessing the peer ptr is protected by an RCU read critical section. The rcu_read_lock() is functionally redundant as since v5.0 all softirqs are implicitly RCU read critical sections; but it is useful for human readers. If another CPU is concurrently in nsim_destroy(), then it will first set the peer ptr to NULL. This does not affect any existing readers that dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is a synchronize_rcu() before the netdev is actually unregistered and freed. This ensures that any readers i.e. xmit() that got a non-NULL peer will complete before the netdev is freed. Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() will dereference NULL, making it safe. The codepath to nsim_destroy() and nsim_create() takes both the newly added nsim_dev_list_lock and rtnl_lock. This makes it safe with concurrent calls to linking two netdevsims together. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Maciek Machnikowski <maciek@machnikowski.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	netdevsim: allow two netdevsim ports to be connected	David Wei
	Add two netdevsim bus attributes to sysfs: /sys/bus/netdevsim/link_device /sys/bus/netdevsim/unlink_device Writing "A M B N" to link_device will link netdevsim M in netnsid A with netdevsim N in netnsid B. Writing "A M" to unlink_device will unlink netdevsim M in netnsid A from its peer, if any. rtnl_lock is taken to ensure nothing changes during the linking. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Maciek Machnikowski <maciek@machnikowski.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	crypto: rk3288 - Fix use after free in unprepare	Herbert Xu
	The unprepare call must be carried out before the finalize call as the latter can free the request. Fixes: c66c17a0f69b ("crypto: rk3288 - Remove prepare/unprepare request") Reported-by: Andrey Skvortsov <andrej.skvortzov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Andrey Skvortsov <andrej.skvortzov@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-03-01	net: bcmasp: Add support for PHY interrupts	Justin Chen
	Hook up the phy interrupts for internal phys to reduce mdio traffic and improve responsiveness of link changes. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: bcmasp: Keep buffers through power management	Justin Chen
	There is no advantage to freeing and re-allocating buffers through suspend and resume. This wastes cycles and makes suspend/resume time longer. We also open ourselves to failed allocations in systems with heavy memory fragmentation. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: phy: mdio-bcm-unimac: Add asp v2.2 support	Justin Chen
	Add mdio compat string for ASP 2.2 ethernet driver. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: bcmasp: Add support for ASP 2.2	Justin Chen
	ASP 2.2 improves power savings during low power modes. A new register was added to toggle to a slower clock during low power modes. EEE was broken for ASP 2.0/2.1. A HW workaround was added for ASP 2.2 that requires toggling a chicken bit. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: phy: qcom: qca808x: fill in possible_interfaces	Robert Marko
	Currently the QCA808x driver does not fill the possible_interfaces. The 2.5G QCA808x supports SGMII and 2500Base-X while the 1G model only supports SGMII, so fill the possible_interfaces accordingly. Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: phy: qcom: qca808x: add helper for checking for 1G only model	Robert Marko
	There are 2 versions of QCA808x, one 2.5G capable and one 1G capable. Currently, this matters only in the .get_features call; however, it will be required for filling supported interface modes, so let's add a helper that can be reused. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: bql: fix building with BQL disabled	Arnd Bergmann
	It is now possible to disable BQL, but that causes the cpsw driver to break: drivers/net/ethernet/ti/am65-cpsw-nuss.c:297:28: error: no member named 'dql' in 'struct netdev_queue' 297 | dql_avail(&netif_txq->dql), There is already a helper function in net/sch_generic.h that could be used to help here. Move its implementation into the common linux/netdevice.h along with the other bql interfaces and change both users over to the new interface. Fixes: ea7f3cfaa588 ("net: bql: allow the config to be disabled") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	ipv6: annotate data-races around cnf.forwarding	Eric Dumazet
	idev->cnf.forwarding and net->ipv6.devconf_all->forwarding might be read locklessly, add appropriate READ_ONCE() and WRITE_ONCE() annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	ipv6: annotate data-races around cnf.hop_limit	Eric Dumazet
	idev->cnf.hop_limit and net->ipv6.devconf_all->hop_limit might be read locklessly, add appropriate READ_ONCE() and WRITE_ONCE() annotations. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Florian Westphal <fw@strlen.de> # for netfilter parts Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-01	net: lan78xx: fix runtime PM count underflow on link stop	Oleksij Rempel
	Current driver has some asymmetry in the runtime PM calls. On lan78xx_open() it will call usb_autopm_get() and unconditionally usb_autopm_put(). And on lan78xx_stop() it will call only usb_autopm_put(). So far, it was working only because this driver does not activate autosuspend by default, so it was visible only by the warning "Runtime PM usage count underflow!". Since, with the current driver, we can't use runtime PM with an active link, execute lan78xx_open()->usb_autopm_put() only in the error case. Otherwise, keep the ref count high as long as the interface is open. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
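The get/put asymmetry described in the lan78xx entry can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct pm_dev`, `pm_get()`, `pm_put()`, `nic_open_old()`, `nic_open_fixed()` and `nic_stop()` are hypothetical stand-ins invented here, not the driver's functions or the kernel's runtime PM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device's runtime PM usage counter. */
struct pm_dev {
	int  usage;      /* references currently held */
	bool underflow;  /* set when a put happens with no reference held */
};

static void pm_get(struct pm_dev *d)
{
	assert(d != NULL);
	d->usage++;
}

static void pm_put(struct pm_dev *d)
{
	assert(d != NULL);
	if (d->usage == 0) {
		/* models the "usage count underflow" warning */
		d->underflow = true;
		return;
	}
	d->usage--;
}

/* Old behaviour: open drops its reference even when it succeeds. */
int nic_open_old(struct pm_dev *d, bool fail)
{
	pm_get(d);
	pm_put(d);
	return fail ? -1 : 0;
}

/* Fixed behaviour: the reference is dropped only on the error path. */
int nic_open_fixed(struct pm_dev *d, bool fail)
{
	pm_get(d);
	if (fail) {
		pm_put(d);
		return -1;
	}
	return 0; /* reference stays held while the interface is open */
}

/* Stop always drops one reference, pairing with a successful open. */
void nic_stop(struct pm_dev *d)
{
	pm_put(d);
}
```

In this model, a successful `nic_open_old()` followed by `nic_stop()` performs two puts for one get, while `nic_open_fixed()` keeps the pair balanced.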
2024-03-01	gpio: fix resource unwinding order in error path	Bartosz Golaszewski
	Hogs are added after ACPI so should be removed before in error path. Fixes: a411e81e61df ("gpiolib: add hogs support for machine code") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
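The rule behind the gpio unwinding fix (undo steps in the reverse of the order they were set up) is commonly written with stacked goto labels. The following is a generic userspace sketch, not the gpiolib code: `struct trace`, `mark()` and `chip_setup()` are hypothetical names, and steps A and B merely stand in for two resources acquired in that order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log recording the order in which steps run. */
struct trace {
	char buf[16];
	int  len;
};

static void mark(struct trace *t, char c)
{
	assert(t != NULL);
	if (t->len < (int)sizeof(t->buf) - 1) {
		t->buf[t->len++] = c;
		t->buf[t->len] = '\0';
	}
}

/*
 * Set up A, then B. fail_at == 1 models a failure right after A,
 * fail_at == 2 a failure right after B, anything else succeeds.
 * Uppercase marks are setup steps, lowercase marks are their undo.
 */
int chip_setup(struct trace *t, int fail_at)
{
	mark(t, 'A');
	if (fail_at == 1)
		goto err_undo_a;

	mark(t, 'B');
	if (fail_at == 2)
		goto err_undo_b;

	return 0;

err_undo_b:
	mark(t, 'b');   /* B was added last, so it is removed first */
err_undo_a:
	mark(t, 'a');
	return -1;
}
```

Because the labels fall through, a failure at any point undoes exactly the steps that had completed, newest first.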
2024-03-01	Drivers: hv: vmbus: Update indentation in create_gpadl_header()	Michael Kelley
	A previous commit left the indentation in create_gpadl_header() unchanged for ease of review. Update the indentation and remove line wrap in two places where it is no longer necessary. No functional change. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240111165451.269418-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240111165451.269418-2-mhklinux@outlook.com>
2024-03-01	Drivers: hv: vmbus: Remove duplication and cleanup code in create_gpadl_header()	Michael Kelley
	create_gpadl_header() creates a message header, and one or more message bodies if the number of GPADL entries exceeds what fits in the header. Currently the code for creating the message header is duplicated in the two halves of the main "if" statement governing whether message bodies are created. Eliminate the duplication by making minor tweaks to the logic and associated comments. While here, simplify the handling of memory allocation errors, and use umin() instead of open coding it. For ease of review, the indentation of sizable chunks of code is not changed. A follow-on patch updates only the indentation. No functional change. Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240111165451.269418-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240111165451.269418-1-mhklinux@outlook.com>
2024-03-01	fbdev/hyperv_fb: Fix logic error for Gen2 VMs in hvfb_getmem()	Michael Kelley
	A recent commit removing the use of screen_info introduced a logic error. The error causes hvfb_getmem() to always return -ENOMEM for Generation 2 VMs. As a result, the Hyper-V frame buffer device fails to initialize. The error was introduced by removing an "else if" clause, leaving Gen2 VMs to always take the -ENOMEM error path. Fix the problem by removing the error path "else" clause. Gen 2 VMs now always proceed through the MMIO memory allocation code, but with "base" and "size" defaulting to 0. Fixes: 0aa0838c84da ("fbdev/hyperv_fb: Remove firmware framebuffers with aperture helpers") Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20240201060022.233666-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240201060022.233666-1-mhklinux@outlook.com>
2024-03-01	hv_utils: Allow implicit ICTIMESYNCFLAG_SYNC	Peter Martincic
	Hyper-V hosts can omit the _SYNC flag due to a bug on resume from modern suspend. In such a case, the guest may fail to update its time-of-day to account for the period when it was suspended, and could proceed with a significantly wrong time-of-day. In such a case when the guest is significantly behind, fix it by treating a _SAMPLE the same as if _SYNC was received so that the guest time-of-day is updated. This is hidden behind param hv_utils.timesync_implicit. Signed-off-by: Peter Martincic <pmartincic@microsoft.com> Acked-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20231127213524.52783-1-pmartincic@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20231127213524.52783-1-pmartincic@linux.microsoft.com>
2024-03-01	gpiolib: Fix the error path order in gpiochip_add_data_with_key()	Andy Shevchenko
	After shuffling the code, error path wasn't updated correctly. Fix it here. Fixes: 2f4133bb5f14 ("gpiolib: No need to call gpiochip_remove_pin_ranges() twice") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-03-01	gpio: 74x164: Enable output pins after registers are reset	Arturas Moskvinas
	Chip outputs are enabled[1] before actual reset is performed[2] which might cause pin output value to flip flop if previous pin value was set to 1. Fix that behavior by making sure chip is fully reset before all outputs are enabled. Flip-flop can be noticed when module is removed and inserted again and one of the pins was changed to 1 before removal. 100 microsecond flipping is noticeable on oscilloscope (100khz SPI bus). For a properly reset chip - output is enabled around 100 microseconds (on 100khz SPI bus) later during probing process hence should be irrelevant behavioral change. Fixes: 7ebc194d0fd4 (gpio: 74x164: Introduce 'enable-gpios' property) Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L130 [1] Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L150 [2] Signed-off-by: Arturas Moskvinas <arturas.moskvinas@gmail.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-02-29	md/raid1: factor out helpers to choose the best rdev from read_balance()	Yu Kuai
	The way that best rdev is chosen: 1) If the read is sequential from one rdev: - if rdev is rotational, use this rdev; - if rdev is non-rotational, use this rdev until total read length exceeds disk opt io size; 2) If the read is not sequential: - if there is idle disk, use it, otherwise: - if the array has non-rotational disk, choose the rdev with minimal inflight IO; - if all the underlying disks are rotational disk, choose the rdev with closest IO; There are no functional changes, just to make code cleaner and prepare for following refactor. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-12-yukuai1@huaweicloud.com
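The selection rules listed in this entry can be sketched as a standalone function. This is a simplified userspace model, not the raid1 code: `struct disk`, its fields and `choose_disk()` are hypothetical names introduced for illustration. It also assumes that a non-rotational disk which has exceeded its optimal IO size falls through to the non-sequential rules, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-disk state; not the kernel's struct md_rdev. */
struct disk {
	bool     nonrot;        /* non-rotational device */
	uint64_t next_seq_sect; /* sector following the previous read */
	uint64_t head_pos;      /* last known position, used for distance */
	unsigned pending;       /* IOs currently in flight */
	unsigned seq_len;       /* sectors read sequentially so far */
	unsigned opt_io_sect;   /* optimal IO size, in sectors */
};

static uint64_t dist(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/* Returns the index of the chosen disk, or -1 when n == 0. */
int choose_disk(const struct disk *d, size_t n, uint64_t sector)
{
	bool has_nonrot = false;
	int idle = -1, min_pending = -1, closest = -1;

	assert(d != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct disk *k = &d[i];

		/* 1) sequential read: stay on this disk, within limits */
		if (k->next_seq_sect == sector &&
		    (!k->nonrot || k->seq_len <= k->opt_io_sect))
			return (int)i;

		/* 2) collect candidates for the non-sequential case */
		if (k->nonrot)
			has_nonrot = true;
		if (idle < 0 && k->pending == 0)
			idle = (int)i;
		if (min_pending < 0 || k->pending < d[min_pending].pending)
			min_pending = (int)i;
		if (closest < 0 ||
		    dist(k->head_pos, sector) < dist(d[closest].head_pos, sector))
			closest = (int)i;
	}

	if (idle >= 0)
		return idle;                         /* an idle disk wins */
	return has_nonrot ? min_pending : closest;   /* else by array type */
}
```

The sketch makes one pass over the disks, so a sequential match on a later disk still takes precedence over an idle disk seen earlier.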
2024-02-29	md/raid1: factor out the code to manage sequential IO	Yu Kuai
	There is no functional change for now, make read_balance() cleaner and prepare to fix problems and refactor the handler of sequential IO. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-11-yukuai1@huaweicloud.com
2024-02-29	md/raid1: factor out choose_bb_rdev() from read_balance()	Yu Kuai
	read_balance() is hard to understand because there are too many states and branches, and it's overlong. This patch factors out the case to read the rdev with bad blocks from read_balance(); there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-10-yukuai1@huaweicloud.com
2024-02-29	md/raid1: factor out choose_slow_rdev() from read_balance()	Yu Kuai
	read_balance() is hard to understand because there are too many states and branches, and it's overlong. This patch factors out the case to read the slow rdev from read_balance(); there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-9-yukuai1@huaweicloud.com
2024-02-29	md/raid1: factor out read_first_rdev() from read_balance()	Yu Kuai
	read_balance() is hard to understand because there are too many states and branches, and it's overlong. This patch factors out the case to read the first rdev from read_balance(); there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-8-yukuai1@huaweicloud.com
2024-02-29	md/raid1-10: factor out a new helper raid1_should_read_first()	Yu Kuai
	If resync is in progress, read_balance() should find the first usable disk, otherwise, data could be inconsistent after resync is done. raid1 and raid10 implement the same checking, hence factor out the checking to make code cleaner. Note that raid1 is using 'mddev->recovery_cp', which is updated after all resync IO is done, while raid10 is using 'conf->next_resync', which is inaccurate because raid10 updates it before submitting resync IO. Fortunately, raid10 read IO can't run concurrently with resync IO, hence there is no problem. And this patch also switches raid10 to use 'mddev->recovery_cp'. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-7-yukuai1@huaweicloud.com
2024-02-29	md/raid1-10: add a helper raid1_check_read_range()	Yu Kuai
	The checking and handling of bad blocks appear many times during read_balance() in raid1 and raid10. This helper will be used in later patches to simplify read_balance() a lot. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-6-yukuai1@huaweicloud.com
2024-02-29	md/raid1: fix choose next idle in read_balance()	Yu Kuai
	Commit 12cee5a8a29e ("md/raid1: prevent merging too large request") added the case choose next idle in read_balance(): read_balance: for_each_rdev if(next_seq_sect == this_sector || dist == 0) -> sequential reads best_disk = disk; if (...) choose_next_idle = 1 continue; for_each_rdev -> iterate next rdev if (pending == 0) best_disk = disk; -> choose the next idle disk break; if (choose_next_idle) -> keep using this rdev if there are no other idle disks continue However, commit 2e52d449bcec ("md/raid1: add failfast handling for reads.") removed the code: - /* If device is idle, use it */ - if (pending == 0) { - best_disk = disk; - break; - } Hence choose next idle will never work now, fix this problem by following: 1) don't set best_disk in this case, read_balance() will choose the best disk after iterating all the disks; 2) add 'pending' so that other idle disk will be chosen; 3) add a new local variable 'sequential_disk' to record the disk, and if there is no other idle disk, 'sequential_disk' will be chosen; Fixes: 2e52d449bcec ("md/raid1: add failfast handling for reads.") Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-5-yukuai1@huaweicloud.com
2024-02-29	md/raid1: record nonrot rdevs while adding/removing rdevs to conf	Yu Kuai
	For raid1, each read will iterate all the rdevs from conf and check if any rdev is non-rotational, then choose rdev with minimal IO inflight if so, or rdev with closest distance otherwise. Disk nonrot info can be changed through sysfs entry: /sys/block/[disk_name]/queue/rotational However, consider that this should only be used for testing, and user really shouldn't do this in real life. Record the number of non-rotational disks in conf, to avoid checking each rdev in IO fast path and simplify read_balance() a little bit. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-4-yukuai1@huaweicloud.com
2024-02-29	md/raid1: factor out helpers to add rdev to conf	Yu Kuai
	There are no functional changes, just make code cleaner and prepare to record disk non-rotational information while adding and removing rdev to conf. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-3-yukuai1@huaweicloud.com
2024-02-29	md: add a new helper rdev_has_badblock()	Yu Kuai
	The current api is_badblock() must pass in 'first_bad' and 'bad_sectors', however, many callers just want to know if there are badblocks or not, and these callers must define two local variables that will never be used. Add a new helper rdev_has_badblock() that will only return if there are badblocks or not, remove unnecessary local variables and replace is_badblock() with the new helper in many places. There are no functional changes, and the new helper will also be used later to refactor read_balance(). Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-2-yukuai1@huaweicloud.com
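The shape of such a helper, a thin wrapper that hides two out-parameters the caller does not need, can be sketched in userspace. This is a model only: `struct bad_range`, `model_is_badblock()` and `model_has_badblock()` are hypothetical names, and the single-range overlap check is an assumption made for illustration rather than the md badblocks implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one recorded bad range on a device. */
struct bad_range {
	uint64_t start; /* first bad sector */
	uint64_t len;   /* number of bad sectors, 0 means none recorded */
};

/*
 * Simplified stand-in for an is_badblock()-style query: reports whether
 * [s, s + sectors) overlaps the bad range and fills the out-parameters.
 */
static int model_is_badblock(const struct bad_range *bb, uint64_t s,
			     uint64_t sectors, uint64_t *first_bad,
			     uint64_t *bad_sectors)
{
	assert(bb != NULL);
	if (bb->len == 0 || s + sectors <= bb->start ||
	    bb->start + bb->len <= s)
		return 0;
	*first_bad = bb->start;
	*bad_sectors = bb->len;
	return 1;
}

/*
 * Wrapper for callers that only need a yes/no answer: the unused
 * out-parameters live here instead of in every caller.
 */
bool model_has_badblock(const struct bad_range *bb, uint64_t s,
			uint64_t sectors)
{
	uint64_t first_bad;
	uint64_t bad_sectors;

	return model_is_badblock(bb, s, sectors, &first_bad, &bad_sectors) != 0;
}
```

Callers that need the location of the bad range would keep using the full query; only the yes/no call sites switch to the wrapper.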
2024-03-01	drm/nouveau: keep DMA buffers required for suspend/resume	Sid Pranjale
	Nouveau deallocates a few buffers post GPU init which are required for GPU suspend/resume to function correctly. This is likely not as big an issue on systems where the NVGPU is the only GPU, but on multi-GPU setups it leads to a regression where the kernel module errors and results in a system-wide rendering freeze. This commit addresses that regression by moving the two buffers required for suspend and resume to be deallocated at driver unload instead of post init. Fixes: 042b5f83841fb ("drm/nouveau: fix several DMA buffer leaks") Signed-off-by: Sid Pranjale <sidpranjale127@protonmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01	nouveau: report byte usage in VRAM usage.	Dave Airlie
	Turns out usage is always in bytes not shifted. Fixes: 72fa02fdf833 ("nouveau: add an ioctl to report vram usage") Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01	Merge tag 'amd-drm-fixes-6.8-2024-02-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes	Dave Airlie
	amd-drm-fixes-6.8-2024-02-29: amdgpu: - Fix potential buffer overflow - Fix power min cap - Suspend/resume fix - SI PM fix - eDP fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229152424.6646-1-alexander.deucher@amd.com
2024-03-01	Merge tag 'drm-msm-fixes-2024-02-28' of https://gitlab.freedesktop.org/drm/msm into drm-fixes	Dave Airlie
	Fixes for v6.8-rc7 DP: - Revert a change which was causing a HDP regression Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvhWvHiPGQ1pRD2XPAQoHEM2M35kjhrsSAEtzh8AMSRvg@mail.gmail.com
2024-03-01	Merge tag 'drm-xe-fixes-2024-02-29' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes	Dave Airlie
	UAPI Changes: - A couple of tracepoint updates from Priyanka and Lucas. - Make sure BINDs are completed before accepting UNBINDs on LR vms. - Don't arbitrarily restrict max number of batched binds. - Add uapi for dumpable bos (agreed on IRC). - Remove unused uapi flags and a leftover comment. Driver Changes: - A couple of fixes related to the execlist backend. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZeCBg4MA2hd1oggN@fedora
2024-03-01	Merge tag 'drm-misc-fixes-2024-02-29' of https://anongit.freedesktop.org/git/drm/drm-misc into drm-fixes	Dave Airlie
	A reset fix for host1x, a resource leak fix and a probe fix for aux-hpd, a use-after-free fix and a boot fix for a pmic_glink qcom driver in drivers/soc, a fix for the simpledrm/tegra transition, a kunit fix for the TTM tests, a font handling fix for fbcon, two allocation fixes and a kunit test to cover them for drm/buddy Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229-angelic-adorable-teal-fbfabb@houat