linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-19	mmc: mvsdio: fix deferred probing	Sergey Shtylyov
	The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-5-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mtk-sd: fix deferred probing	Sergey Shtylyov
	The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-4-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: meson-gx: fix deferred probing	Sergey Shtylyov
	The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: cbcaac6d7dd2 ("mmc: meson-gx-mmc: Fix platform_get_irq's error checking") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230617203622.6812-3-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: bcm2835: fix deferred probing	Sergey Shtylyov
	The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 660fc733bd74 ("mmc: bcm2835: Add new driver for the sdhost controller.") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20230617203622.6812-2-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUS	Jisheng Zhang
	mmc host drivers should have enabled the asynchronous probe option, but it seems like we didn't set it for litex_mmc when introducing litex mmc support, so let's set it now. Tested with linux-on-litex-vexriscv on sipeed tang nano 20K fpga. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Acked-by: Gabriel Somlo <gsomlo@gmail.com> Fixes: 92e099104729 ("mmc: Add driver for LiteX's LiteSDCard interface") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230617085319.2139-1-jszhang@kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Break out a helper function	Linus Walleij
	These four lines clearing, masking and resetting the state of the busy detect state machine is repeated five times in the code so break this out to a small helper so things are easier to read. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-9-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Use a switch statement machine	Linus Walleij
	As is custom, use a big switch statement to transition between the edges of the state machine inside the ux500 ->busy_complete callback. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-8-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Use state machine state as exit condition	Linus Walleij
	Return true if and only if we reached the state MMCI_BUSY_DONE in the ux500 ->busy_complete() callback. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-7-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Retry the busy start condition	Linus Walleij
	This makes the ux500 ->busy_complete() callback re-read the status register 10 times while waiting for the busy signal to assert in the status register. If this does not happen, we bail out regarding the command completed already, i.e. before we managed to start to check the busy status. There is a comment in the code about this, let's just implement it to be certain that we can catch this glitch if it happens. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-6-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Make busy complete state machine explicit	Linus Walleij
	This refactors the ->busy_complete() callback currently only used by Ux500 and STM32 to handle busy detection on hardware where one and the same IRQ is fired whether we get a start or an end signal on busy detect. The code is currently using the cached status from the command IRQ in ->busy_status as a state to select what to do next: if this state is non-zero we are waiting for IRQs and if it is zero we treat the state as the starting point for a busy detect wait cycle. Make this explicit by creating a state machine where the ->busy_complete callback moves between three states. The Ux500 busy detect code currently assumes this order: we enable the busy detect IRQ, get a busy start IRQ, then a busy end IRQ, and then we clear and mask this IRQ and proceed. We insert debug prints for unexpected states. This works as before on most cards, however on a problematic card that is not working with busy detect, and which I have been debugging, the following happens a lot: [ 3.380554] mmci-pl18x 80005000.mmc: no busy signalling in time [ 3.387420] mmci-pl18x 80005000.mmc: no busy signalling in time [ 3.394561] mmci-pl18x 80005000.mmc: lost busy status when waiting for busy start IRQ This probably means that the busy detect start IRQ has already occurred when we start executing the ->busy_complete() callbacks, and the busy detect end IRQ is counted as the start IRQ, and this is what is causing the card to not be detected properly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-5-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Break out error check in busy detect	Linus Walleij
	The busy detect callback for Ux500 checks for an error in the status in the first if() clause. The only practical reason is that if an error occurs, the if()-clause is not executed, and the code falls through to the last if()-clause if (host->busy_status) which will clear and disable the irq. Make this explicit instead: it is easier to read. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-4-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Stash status while waiting for busy	Linus Walleij
	Some interesting flags can arrive while we are waiting for the first busy detect IRQ so OR then onto the stashed flags so they are not missed. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-3-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Unwind big if() clause	Linus Walleij
	This does two things: firsr replace the hard-to-read long if-expression: if (!host->busy_status && !(status & err_msk) && (readl(base + MMCISTATUS) & host->variant->busy_detect_flag)) { With the more readable: if (!host->busy_status && !(status & err_msk)) { status = readl(base + MMCISTATUS); if (status & host->variant->busy_detect_flag) { Second notice that the re-read MMCISTATUS register is now stored into the status variable, using logic OR because what if something else changed too? While we are at it, explain what the function is doing. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-2-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	mmc: mmci: Clear busy_status when starting command	Linus Walleij
	If we are starting a command which can generate a busy response, then clear the variable host->busy_status if the variant is using a ->busy_complete callback. We are lucky that the member is zero by default and hopefully always gets cleared in the ->busy_complete callback even on errors, but it's just fragile so make sure it is always initialized to zero. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230405-pl180-busydetect-fix-v7-1-69a7164f2a61@linaro.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-19	wifi: mac80211: check EHT basic MCS/NSS set	Johannes Berg
	Check that all the NSS in the EHT basic MCS/NSS set are actually supported, otherwise disable EHT for the connection. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.737827c906c9.I0c11a3cd46ab4dcb774c11a5bbc30aecfb6fce11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: cfg80211: search all RNR elements for colocated APs	Benjamin Berg
	An AP reporting colocated APs may send more than one reduced neighbor report element. As such, iterate all elements instead of only parsing the first one when looking for colocated APs. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.ffe2c014f478.I372a4f96c88f7ea28ac39e94e0abfc465b5330d4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: cfg80211: stop parsing after allocation failure	Benjamin Berg
	The error handling code would break out of the loop incorrectly, causing the rest of the message to be misinterpreted. Fix this by also jumping out of the surrounding while loop, which will trigger the error detection code. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.0ffac98475cf.I6f5c08a09f5c9fced01497b95a9841ffd1b039f8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: update multi-link element STA reconfig	Johannes Berg
	Update the MLE STA reconfig sub-type to 802.11be D3.0 format, which includes the operation update field. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.2e1383b31f07.I8055a111c8fcf22e833e60f5587a4d8d21caca5b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: mac80211: agg-tx: prevent start/stop race	Johannes Berg
	There were crashes reported in this code, and the timer_shutdown() warning in one of the previous patches indicates that the timeout timer for the AP response (addba_resp_timer) is still armed while we're stopping the aggregation session. After a very long deliberation of the code, so far the only way I could find that might cause this would be the following sequence: - session start requested - session start indicated to driver, but driver returns IEEE80211_AMPDU_TX_START_DELAY_ADDBA - session stop requested, sets HT_AGG_STATE_WANT_STOP - session stop worker runs ___ieee80211_stop_tx_ba_session(), sets HT_AGG_STATE_STOPPING From here on, the order doesn't matter exactly, but: 1. driver calls ieee80211_start_tx_ba_cb_irqsafe(), setting HT_AGG_STATE_START_CB 2. driver calls ieee80211_stop_tx_ba_cb_irqsafe(), setting HT_AGG_STATE_STOP_CB 3. the worker will run ieee80211_start_tx_ba_cb() for HT_AGG_STATE_START_CB 4. the worker will run ieee80211_stop_tx_ba_cb() for HT_AGG_STATE_STOP_CB (the order could also be 1./3./2./4.) This will cause ieee80211_start_tx_ba_cb() to send out the AddBA request frame to the AP and arm the timer, but we're already in the middle of stopping and so the ieee80211_stop_tx_ba_cb() will no longer assume it needs to stop anything. Prevent this by checking for WANT_STOP/STOPPING in the start CB, and warn if we're sending a frame on a stopping session. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.e5b52777462a.I0b2ed6658e81804279f5d7c9c1918cb1f6626bf2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: mac80211: agg-tx: add a few locking assertions	Johannes Berg
	This is all true today, but difficult to understand since the callers are in other files etc. Add two new lockdep assertions to make things easier to read. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.7f03dec6a90b.I762c11e95da005b80fa0184cb1173b99ec362acf@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: ieee80211: reorder presence checks in MLE per-STA profile	Johannes Berg
	In ieee80211_mle_sta_prof_size_ok(), the presence checks aren't ordered by field order, so that's a bit confusing. Reorder them. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.fdbf17320a37.I517cf27fdc3f6e5d6a2615182da47ba4bdf14039@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: mac80211: Support link removal using Reconfiguration ML element	Ilan Peer
	Add support for handling link removal indicated by the Reconfiguration Multi-Link element. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Gregory Greenman <gregory.greenman@intel.com> Link: https://lore.kernel.org/r/20230618214436.d8a046dc0c1a.I4dcf794da2a2d9f4e5f63a4b32158075d27c0660@changeid [use cfg80211_links_removed() API instead] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: mac80211: add set_active_links variant not locking sdata	Benjamin Berg
	There are cases where keeping sdata locked for an operation. Add a variant that does not take sdata lock to permit these usecases. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	wifi: mac80211: add ___ieee80211_disconnect variant not locking sdata	Benjamin Berg
	There are cases where keeping sdata locked for an operation. Add a variant that does not take sdata lock to permit these usecases. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-19	ovl: port to new mount api	Christian Brauner
	We recently ported util-linux to the new mount api. Now the mount(8) tool will by default use the new mount api. While trying hard to fall back to the old mount api gracefully there are still cases where we run into issues that are difficult to handle nicely. Now with mount(8) and libmount supporting the new mount api I expect an increase in the number of bug reports and issues we're going to see with filesystems that don't yet support the new mount api. So it's time we rectify this. When ovl_fill_super() fails before setting sb->s_root, we need to cleanup sb->s_fs_info. The logic is a bit convoluted but tl;dr: If sget_fc() has succeeded fc->s_fs_info will have been transferred to sb->s_fs_info. So by the time ->fill_super()/ovl_fill_super() is called fc->s_fs_info is NULL consequently fs_context->free() won't call ovl_free_fs(). If we fail before sb->s_root() is set then ->put_super() won't be called which would call ovl_free_fs(). IOW, if we fail in ->fill_super() before sb->s_root we have to clean it up. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: factor out ovl_parse_options() helper	Amir Goldstein
	For parsing a single mount option. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: store enum redirect_mode in config instead of a string	Amir Goldstein
	Do all the logic to set the mode during mount options parsing and do not keep the option string around. Use a constant_table to translate from enum redirect mode to string in preperation for new mount api option parsing. The mount option "off" is translated to either "follow" or "nofollow", depending on the "redirect_always_follow" build/module config, so in effect, there are only three possible redirect modes. This results in a minor change to the string that is displayed in show_options() - when redirect_dir is enabled by default and the user mounts with the option "redirect_dir=off", instead of displaying the mode "redirect_dir=off" in show_options(), the displayed mode will be either "redirect_dir=follow" or "redirect_dir=nofollow", depending on the value of "redirect_always_follow" build/module config. The displayed mode reflects the effective mode, so mounting overlayfs again with the dispalyed redirect_dir option will result with the same effective and displayed mode. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: pass ovl_fs to xino helpers	Amir Goldstein
	Internal ovl methods should use ovl_fs and not sb as much as possible. Use a constant_table to translate from enum xino mode to string in preperation for new mount api option parsing. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: clarify ovl_get_root() semantics	Amir Goldstein
	Change the semantics to take a reference on upperdentry instead of transferrig the reference. This is needed for upcoming port to new mount api. Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: negate the ofs->share_whiteout boolean	Amir Goldstein
	The default common case is that whiteout sharing is enabled. Change to storing the negated no_shared_whiteout state, so we will not need to initialize it. This is the first step towards removing all config and feature initializations out of ovl_fill_super(). Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	ovl: check type and offset of struct vfsmount in ovl_entry	Christian Brauner
	Porting overlayfs to the new amount api I started experiencing random crashes that couldn't be explained easily. So after much debugging and reasoning it became clear that struct ovl_entry requires the point to struct vfsmount to be the first member and of type struct vfsmount. During the port I added a new member at the beginning of struct ovl_entry which broke all over the place in the form of random crashes and cache corruptions. While there's a comment in ovl_free_fs() to the effect of "Hack! Reuse ofs->layers as a vfsmount array before freeing it" there's no such comment on struct ovl_entry which makes this easy to trip over. Add a comment and two static asserts for both the offset and the type of pointer in struct ovl_entry. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2023-06-19	EDAC/amd64: Cache and use GPU node map	Yazen Ghannam
	AMD systems have historically provided an "AMD Node ID" that is a unique identifier for each die in a multi-die package. This was associated with a unique instance of the AMD Northbridge on a legacy system. And now it is associated with a unique instance of the AMD Data Fabric on modern systems. Each instance is referred to as a "Node"; this is an AMD-specific term not to be confused with NUMA nodes. The data fabric provides a number of interfaces accessible through a set of functions in a single PCI device. There is one PCI device per Data Fabric (AMD Node), and multi-die systems will see multiple such PCI devices. The AMD Node ID matches a Node's position in the PCI hierarchy. For example, the Node 0 is accessed using the first PCI device, Node 1 is accessed using the second, and so on. A logical CPU can find its AMD Node ID using CPUID. Furthermore, the AMD Node ID is used within the hardware fabric, so it is not purely a logical value. Heterogeneous AMD systems, with a CPU Data Fabric connected to GPU data fabrics, follow a similar convention. Each CPU and GPU die has a unique AMD Node ID value, and each Node ID corresponds to PCI devices in sequential order. However, there are two caveats: 1) GPUs are not x86, and they don't have CPUID to read their AMD Node ID like on CPUs. This means the value is more implicit and based on PCI enumeration and hardware-specifics. 2) There is a gap in the hardware values for AMD Node IDs. Values 0-7 are for CPUs and values 8-15 are for GPUs. For example, a system with one CPU die and two GPUs dies will have the following values: CPU0 -> AMD Node 0 GPU0 -> AMD Node 8 GPU1 -> AMD Node 9 EDAC is the only subsystem where this has a practical effect. Memory errors on AMD systems are commonly reported through MCA to a CPU on the local AMD Node. The error information is passed along to EDAC where the AMD EDAC modules use the AMD Node ID of reporting logical CPU to access AMD Node information. However, memory errors from a GPU die will be reported to the CPU die. Therefore, the logical CPU's AMD Node ID can't be used since it won't match the AMD Node ID of the GPU die. The AMD Node ID of the GPU die is provided as part of the MCA information, and the value will match the hardware enumeration (e.g. 8-15). Handle this situation by discovering GPU dies the same way as CPU dies in the AMD NB code. But do a "node id" fixup in AMD64 EDAC where it's needed. The GPU data fabrics provide a register with the base AMD Node ID for their local "type", i.e. GPU data fabric. This value is the same for all fabrics of the same type in a system. Read and cache the base AMD Node ID from one of the GPU devices during module initialization. Use this to fixup the "node id" when reporting memory errors at runtime. [ bp: Squash a fix making gpu_node_map static as reported by Tom Rix <trix@redhat.com>. Link: https://lore.kernel.org/r/20230610210930.174074-1-trix@redhat.com ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230515113537.1052146-6-muralimk@amd.com
2023-06-19	ovl: implement lazy lookup of lowerdata in data-only layers	Amir Goldstein
	Defer lookup of lowerdata in the data-only layers to first data access or before copy up. We perform lowerdata lookup before copy up even if copy up is metadata only copy up. We can further optimize this lookup later if needed. We do best effort lazy lookup of lowerdata for d_real_inode(), because this interface does not expect errors. The only current in-tree caller of d_real_inode() is trace_uprobe and this caller is likely going to be followed reading from the file, before placing uprobes on offset within the file, so lowerdata should be available when setting the uprobe. Tested-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: prepare for lazy lookup of lowerdata inode	Amir Goldstein
	Make the code handle the case of numlower > 1 and missing lowerdata dentry gracefully. Missing lowerdata dentry is an indication for lazy lookup of lowerdata and in that case the lowerdata_redirect path is stored in ovl_inode. Following commits will defer lookup and perform the lazy lookup on access. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: prepare to store lowerdata redirect for lazy lowerdata lookup	Amir Goldstein
	Prepare to allow ovl_lookup() to leave the last entry in a non-dir lowerstack empty to signify lazy lowerdata lookup. In this case, ovl_lookup() stores the redirect path from metacopy to lowerdata in ovl_inode, which is going to be used later to perform the lazy lowerdata lookup. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: implement lookup in data-only layers	Amir Goldstein
	Lookup in data-only layers only for a lower metacopy with an absolute redirect xattr. The metacopy xattr is not checked on files found in the data-only layers and redirect xattr are not followed in the data-only layers. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: introduce data-only lower layers	Amir Goldstein
	Introduce the format lowerdir=lower1:lower2::lowerdata1::lowerdata2 where the lower layers on the right of the :: separators are not merged into the overlayfs merge dirs. Data-only lower layers are only allowed at the bottom of the stack. The files in those layers are only meant to be accessible via absolute redirect from metacopy files in lower layers. Following changes will implement lookup in the data layers. This feature was requested for composefs ostree use case, where the lower data layer should only be accessiable via absolute redirects from metacopy inodes. The lower data layers are not required to a have a unique uuid or any uuid at all, because they are never used to compose the overlayfs inode st_ino/st_dev. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: remove unneeded goto instructions	Amir Goldstein
	There is nothing in the out goto target of ovl_get_layers(). Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: deduplicate lowerdata and lowerstack[]	Amir Goldstein
	The ovl_inode contains a copy of lowerdata in lowerstack[], so the lowerdata inode member can be removed. Use accessors ovl_lowerdata*() to get the lowerdata whereever the member was accessed directly. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: deduplicate lowerpath and lowerstack[]	Amir Goldstein
	The ovl_inode contains a copy of lowerpath in lowerstack[0], so the lowerpath member can be removed. Use accessor ovl_lowerpath() to get the lowerpath whereever the member was accessed directly. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: move ovl_entry into ovl_inode	Amir Goldstein
	The lower stacks of all the ovl inode aliases should be identical and there is redundant information in ovl_entry and ovl_inode. Move lowerstack into ovl_inode and keep only the OVL_E_FLAGS per overlay dentry. Following patches will deduplicate redundant ovl_inode fields. Note that for pure upper and negative dentries, OVL_E(dentry) may be NULL now, so it is imporatnt to use the ovl_numlower() accessor. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: factor out ovl_free_entry() and ovl_stack_*() helpers	Amir Goldstein
	In preparation for moving lowerstack into ovl_inode. Note that in ovl_lookup() the temp stack dentry refs are now cloned into the final ovl_lowerstack instead of being transferred, so cleanup always needs to call ovl_stack_free(stack). Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: use ovl_numlower() and ovl_lowerstack() accessors	Amir Goldstein
	This helps fortify against dereferencing a NULL ovl_entry, before we move the ovl_entry reference into ovl_inode. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: use OVL_E() and OVL_E_FLAGS() accessors	Amir Goldstein
	Instead of open coded instances, because we are about to split the two apart. Reviewed-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: update of dentry revalidate flags after copy up	Amir Goldstein
	After copy up, we may need to update d_flags if upper dentry is on a remote fs and lower dentries are not. Add helpers to allow incremental update of the revalidate flags. Fixes: bccece1ead36 ("ovl: allow remote upper") Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: fix null pointer dereference in ovl_get_acl_rcu()	Zhihao Cheng
	Following process: P1 P2 path_openat link_path_walk may_lookup inode_permission(rcu) ovl_permission acl_permission_check check_acl get_cached_acl_rcu ovl_get_inode_acl realinode = ovl_inode_real(ovl_inode) drop_cache __dentry_kill(ovl_dentry) iput(ovl_inode) ovl_destroy_inode(ovl_inode) dput(oi->__upperdentry) dentry_kill(upperdentry) dentry_unlink_inode upperdentry->d_inode = NULL ovl_inode_upper upperdentry = ovl_i_dentry_upper(ovl_inode) d_inode(upperdentry) // returns NULL IS_POSIXACL(realinode) // NULL pointer dereference , will trigger an null pointer dereference at realinode: [ 205.472797] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 205.476701] CPU: 2 PID: 2713 Comm: ls Not tainted 6.3.0-12064-g2edfa098e750-dirty #1216 [ 205.478754] RIP: 0010:do_ovl_get_acl+0x5d/0x300 [ 205.489584] Call Trace: [ 205.489812] <TASK> [ 205.490014] ovl_get_inode_acl+0x26/0x30 [ 205.490466] get_cached_acl_rcu+0x61/0xa0 [ 205.490908] generic_permission+0x1bf/0x4e0 [ 205.491447] ovl_permission+0x79/0x1b0 [ 205.491917] inode_permission+0x15e/0x2c0 [ 205.492425] link_path_walk+0x115/0x550 [ 205.493311] path_lookupat.isra.0+0xb2/0x200 [ 205.493803] filename_lookup+0xda/0x240 [ 205.495747] vfs_fstatat+0x7b/0xb0 Fetch a reproducer in [Link]. Use the helper ovl_i_path_realinode() to get realinode and then do non-nullptr checking. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217404 Fixes: 332f606b32b6 ("ovl: enable RCU'd ->get_acl()") Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Christian Brauner <brauner@kernel.org> Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: fix null pointer dereference in ovl_permission()	Zhihao Cheng
	Following process: P1 P2 path_lookupat link_path_walk inode_permission ovl_permission ovl_i_path_real(inode, &realpath) path->dentry = ovl_i_dentry_upper(inode) drop_cache __dentry_kill(ovl_dentry) iput(ovl_inode) ovl_destroy_inode(ovl_inode) dput(oi->__upperdentry) dentry_kill(upperdentry) dentry_unlink_inode upperdentry->d_inode = NULL realinode = d_inode(realpath.dentry) // return NULL inode_permission(realinode) inode->i_sb // NULL pointer dereference , will trigger an null pointer dereference at realinode: [ 335.664979] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 335.668032] CPU: 0 PID: 2592 Comm: ls Not tainted 6.3.0 [ 335.669956] RIP: 0010:inode_permission+0x33/0x2c0 [ 335.678939] Call Trace: [ 335.679165] <TASK> [ 335.679371] ovl_permission+0xde/0x320 [ 335.679723] inode_permission+0x15e/0x2c0 [ 335.680090] link_path_walk+0x115/0x550 [ 335.680771] path_lookupat.isra.0+0xb2/0x200 [ 335.681170] filename_lookup+0xda/0x240 [ 335.681922] vfs_statx+0xa6/0x1f0 [ 335.682233] vfs_fstatat+0x7b/0xb0 Fetch a reproducer in [Link]. Use the helper ovl_i_path_realinode() to get realinode and then do non-nullptr checking. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217405 Fixes: 4b7791b2e958 ("ovl: handle idmappings in ovl_permission()") Cc: <stable@vger.kernel.org> # v5.19 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Christian Brauner <brauner@kernel.org> Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	ovl: let helper ovl_i_path_real() return the realinode	Zhihao Cheng
	Let helper ovl_i_path_real() return the realinode to prepare for checking non-null realinode in RCU walking path. [msz] Use d_inode_rcu() since we are depending on the consitency between dentry and inode being non-NULL in an RCU setting. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Fixes: ffa5723c6d25 ("ovl: store lower path in ovl_inode") Cc: <stable@vger.kernel.org> # v5.19 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2023-06-19	Documentation: virt: Clean up paravirt_ops doc	Randy Dunlap
	Clarify language. Clean up grammar. Hyphenate some words. Change "low-ops" to "low-level" since "low-ops" isn't defined or even mentioned anywhere else in the kernel source tree. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230610054310.6242-1-rdunlap@infradead.org
2023-06-19	wifi: cfg80211/nl80211: Add support to indicate STA MLD setup links removal	Veerendranath Jakkam
	STA MLD setup links may get removed if AP MLD remove the corresponding affiliated APs with Multi-Link reconfiguration as described in P802.11be_D3.0, section 35.3.6.2.2 Removing affiliated APs. Currently, there is no support to notify such operation to cfg80211 and userspace. Add support for the drivers to indicate STA MLD setup links removal to cfg80211 and notify the same to userspace. Upon receiving such indication from the driver, clear the MLO links information of the removed links in the WDEV. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/20230317142153.237900-1-quic_vjakkam@quicinc.com [rename function and attribute, fix kernel-doc] Signed-off-by: Johannes Berg <johannes.berg@intel.com>