summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-03-18wifi: iwlwifi: mld: KUnit: test iwl_mld_channel_load_allows_emlsrMiri Korenblit
Add tests to check that iwl_mld_channel_load_allows_emlsr decides correctly whether EMLSR is allowed or not. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20250313002008.06fdf416c62f.If6e8f0e017287e79364eac9366f93c9ab964a673@changeid [fix kunit visibility macro] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18bnxt_en: add .set_module_eeprom_by_page() supportDamodharam Ammepalli
Add support for .set_module_eeprom_by_page() callback which implements generic solution for modules eeprom access. This implementation also supports CMIS 5.0.3 compliant eeprom FW download. Sample Usage: ethtool --flash-module-firmware enp177s0np0 file dummy.bin Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-8-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Refactor bnxt_get_module_eeprom_by_page()Michael Chan
In preparation for adding .set_module_eeprom_by_page(), extract the common error checking done in bnxt_get_module_eeprom_by_page() into a new common function that can be re-used for .set_module_eeprom_by_page(). Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-7-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Update firmware interface to 1.10.3.97Michael Chan
The main changes are adding i2c write for module eeprom and a new v2 PCIe statistics structure. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-6-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Query FW parameters when the CAPS_CHANGE bit is setshantiprasad shettar
Newer FW can set the CAPS_CHANGE flag during ifup if some capabilities or configurations have changed. For example, the CoS queue configurations may have changed. Support this new flag by treating it almost like FW reset. The driver will essentially rediscover all features and capabilities, reconfigure all backing store context memory, reset everything to default, and reserve all resources. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: shantiprasad shettar <shantiprasad.shettar@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-5-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Add devlink support for ENABLE_ROCE nvm parameterPavan Chebbi
Add set/show support for the ENABLE_ROCE NVM parameter to enable/disable RoCE for a PF. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Co-developed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-4-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Refactor bnxt_hwrm_nvm_req()Michael Chan
bnxt_hwrm_nvm_req() first searches the nvm_params[] array for the NVM parameter to set or get. The array entry contains all the NVM information about that parameter. The information is then used to send the FW message to set or get the parameter. Refactor it to only do the array search in bnxt_hwrm_nvm_req() and pass the array entry to the new function __bnxt_hwrm_nvm_req() to send the FW message. The next patch will be able to use the new function. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-3-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18bnxt_en: Add support for a new ethtool dump flag 3Vasuthevan Maheswaran
When doing a live coredump with ethtool -w, the context data cached in the NIC is not dumped by the FW by default. The reason is that retrieving this cached context data with traffic running may cause problems. Add a new dump flag 3 to allow the option to include this cached context data which may be useful in some debug scenarios. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Vasuthevan Maheswaran <vasuthevan.maheswaran@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250310183129.3154117-2-michael.chan@broadcom.com Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ixgbe: add support for thermal sensor event receptionJedrzej Jagielski
E610 NICs unlike the previous devices utilising ixgbe driver are notified in the case of overheating by the FW ACI event. In event of overheat when threshold is exceeded, FW suspends all traffic and sends overtemp event to the driver. Then driver logs appropriate message and disables the adapter instance. The card remains in that state until the platform is rebooted. This approach is a solution to the fact current version of the E610 FW doesn't support reading thermal sensor data by the SW. So give to user at least any info that overtemp event has occurred, without interface disappearing from the OS without any note. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Jeremiah Lokan <jeremiahx.j.lokan@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-7-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ixgbe: add PTP support for E610 devicePiotr Kwapulinski
Add PTP support for E610 adapter. The E610 is based on X550 and adds firmware managed link, enhanced security capabilities and support for updated server manageability. It does not introduce any new PTP features compared to X550. Reviewed-by: Milena Olech <milena.olech@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Tested-by: Bharath R <bharath.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-6-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ice: E825C PHY register cleanupKarol Kolacinski
Minor PTP register refactor, including logical grouping E825C 1-step timestamping registers. Remove unused register definitions (PHY_REG_GPCS_BITSLIP, PHY_REG_REVISION). Also, apply preferred GENMASK macro (instead of ICE_M) for register fields definition affected by this patch. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-5-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ice: Refactor E825C PHY registers info structKarol Kolacinski
Simplify ice_phy_reg_info_eth56g struct definition to include base address for the very first quad. Use base address info and 'step' value to determine address for specific PHY quad. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-4-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ice: rename ice_ptp_init_phc_eth56g functionKarol Kolacinski
Refactor the code by changing ice_ptp_init_phc_eth56g function name to ice_ptp_init_phc_e825, to be consistent with the naming pattern for other devices. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-3-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ice: Add E830 checksum offload supportPaul Greenwalt
E830 supports raw receive and generic transmit checksum offloads. Raw receive checksum support is provided by hardware calculating the checksum over the whole packet, regardless of type. The calculated checksum is provided to driver in the Rx flex descriptor. Then the driver assigns the checksum to skb->csum and sets skb->ip_summed to CHECKSUM_COMPLETE. Generic transmit checksum support is provided by hardware calculating the checksum given two offsets: the start offset to begin checksum calculation, and the offset to insert the calculated checksum in the packet. Support is advertised to the stack using NETIF_F_HW_CSUM feature. E830 has the following limitations when both generic transmit checksum offload and TCP Segmentation Offload (TSO) are enabled: 1. Inner packet header modification is not supported. This restriction includes the inability to alter TCP flags, such as the push flag. As a result, this limitation can impact the receiver's ability to coalesce packets, potentially degrading network throughput. 2. The Maximum Segment Size (MSS) is limited to 1023 bytes, which prevents support of Maximum Transmission Unit (MTU) greater than 1063 bytes. Therefore NETIF_F_HW_CSUM and NETIF_F_ALL_TSO features are mutually exclusive. NETIF_F_HW_CSUM hardware feature support is indicated but is not enabled by default. Instead, IP checksums and NETIF_F_ALL_TSO are the defaults. Enforcement of mutual exclusivity of NETIF_F_HW_CSUM and NETIF_F_ALL_TSO is done in ice_set_features(). Mutual exclusivity of IP checksums and NETIF_F_HW_CSUM is handled by netdev_fix_features(). When NETIF_F_HW_CSUM is requested the provided skb->csum_start and skb->csum_offset are passed to hardware in the Tx context descriptor generic checksum (GCS) parameters. Hardware calculates the 1's complement from skb->csum_start to the end of the packet, and inserts the result in the packet at skb->csum_offset. Co-developed-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Alice Michael <alice.michael@intel.com> Co-developed-by: Eric Joyner <eric.joyner@intel.com> Signed-off-by: Eric Joyner <eric.joyner@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250310174502.3708121-2-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18ata: libata-core: Add ATA_QUIRK_NO_LPM_ON_ATI for certain Samsung SSDsNiklas Cassel
Before commit 7627a0edef54 ("ata: ahci: Drop low power policy board type") the ATI AHCI controllers specified board type 'board_ahci' rather than board type 'board_ahci'. This means that LPM was historically not enabled for the ATI AHCI controllers. By looking at commit 7a8526a5cd51 ("libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD."), it is clear that, for some unknown reason, that Samsung SSDs do not play nice with ATI AHCI controllers. (When using other AHCI controllers, NCQ can be enabled on these Samsung SSDs without issues.) In a similar way, from user reports, it is clear the ATI AHCI controllers can enable LPM on e.g. Maxtor HDDs perfectly fine, but when enabling LPM on certain Samsung SSDs, things break. (E.g. the SSDs will not get detected by the ATI AHCI controller even after a COMRESET.) Yet, when using LPM on these Samsung SSDs with other AHCI controllers, e.g. Intel AHCI controllers, these Samsung drives appear to work perfectly fine. Considering that the combination of ATI + Samsung, for some unknown reason, does not seem to work well, disable LPM when detecting an ATI AHCI controller with a problematic Samsung SSD. Apply this new ATA_QUIRK_NO_LPM_ON_ATI quirk for all Samsung SSDs that have already been reported to not play nice with ATI (ATA_QUIRK_NO_NCQ_ON_ATI). Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type") Suggested-by: Hans de Goede <hdegoede@redhat.com> Reported-by: Eric <eric.4.debian@grabatoulnz.fr> Closes: https://lore.kernel.org/linux-ide/Z8SBZMBjvVXA7OAK@eldamar.lan/ Tested-by: Eric <eric.4.debian@grabatoulnz.fr> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250317170348.1748671-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2025-03-18wifi: iwlwifi: mld: KUnit: create chanctx with a custom widthMiri Korenblit
Currently iwlmld_kunit_add_chanctx receives a band, picks a predefined static chandef, and creates the chanctx from it. Change it to receive a bandwidth as well. Otherwise, the bandwidth in the chanctx/phy will be different than what test specified in the iwl_mld_kunit_link. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250313002008.85a1285d34cd.Ia71cdcd4241fe73501bc93e3cb2c6bb3f631b9ec@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: KUnit: introduce iwl_mld_kunit_linkMiri Korenblit
To allow setting up association/EMLSR states with more flexibility, change the relevant functions to receive a new struct, iwl_mld_kunit_link, which will contain all the link parameters (for now just link id, band and bandwidth). Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250313002008.f336491ccc4e.I6b727765eb394a3dbb78cea71e356be1bdc4a17c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: allow EMLSR for unequal bandwidthMiri Korenblit
Allow EMLSR if the bandwidths of the links are unequal if one of the following conditions is true: 1. in low latency mode 2. bandwidth of the secondary link is greater than the bandwidth of the primary 3. the primary link is active and is loaded enough to justify EMLSR Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250313002008.150c330711c4.Ifd72d2e076783991852a7f1756948b4f0efb9fea@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: prevent toggling EMLSR due to FW requestsMiri Korenblit
We exit EMLSR mode if the FW requested to do so. To prevent repeated toggling of the EMLSR mode (frequent entry and exit), add this exit reason to the EMLSR prevention mechanism. This mechanism avoids re-entering EMLSR for a certain period of time after multiple exits caused by the same reason. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250313002008.f0e74a7f99af.I447c8788afba85a2a5040ae2c1213b6e05ec14f3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: remove IWL_MLD_EMLSR_BLOCKED_FWMiri Korenblit
The channel load logic moves from the FW to the driver. - Implement the logic: allow EMLSR only if the candidate primary link is active and if its average channel load exceeds the threshold. - Remove IWL_MLD_EMLSR_BLOCKED_FW. Instead, treat ESR_RECOMMEND_LEAVE in the EMLSR_RECOMMENDATION notif as an EXIT reason. Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250313002008.6729a8d67815.Iab39bf0982d8cdbb0db701d31854101c2fcf3b64@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: add support for DHC_TOOLS_UMAC_GET_TAS_STATUS commandPagadala Yesu Anjaneyulu
Add debugfs file in mld to retrieve TAS status per radio, TAS block list, current mcc, OEM name and OEM allowed list. This will add ability to get TAS status to user application via debugfs and required for debugging. Add the required API definitions and some debug host command utils. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250313002008.66524c6ea198.I1625135284fc075148a55dd9ac629e94ca881fe4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: Ensure wiphy lock is held during debugfs read operationsPagadala Yesu Anjaneyulu
The WIPHY_DEBUGFS_READ_WRITE_FILE_OPS_MLD macro is intended to call read/write handlers with the wiphy lock held. However, the current implementation uses the MLD_DEBUGFS_READ_WRAPPER macro, which does not hold the wiphy lock during read operations. This fix updates the WIPHY_DEBUGFS_READ_WRITE_FILE_OPS_MLD macro to use the WIPHY_DEBUGFS_READ_WRAPPER_MLD macro instead, ensuring that the wiphy lock is held during both read and write operations. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250313002008.2001d2335e9d.I607a8bd12efc6d1190cef1fca44279dbdd2756ea@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: Add support for WIPHY_DEBUGFS_READ_FILE_OPS_MLD macroPagadala Yesu Anjaneyulu
Introduced the WIPHY_DEBUGFS_READ_FILE_OPS_MLD macro to enable reading data from the driver while holding the wiphy lock. This will enable read operations with wiphy locked. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250313002008.b0ddb6b0a144.I1fab63f2c6f52fea61cc5d7b27775aed58adfd8d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18wifi: iwlwifi: mld: Rename WIPHY_DEBUGFS_HANDLER_WRAPPER to ↵Pagadala Yesu Anjaneyulu
WIPHY_DEBUGFS_WRITE_HANDLER_WRAPPER Renamed the macro WIPHY_DEBUGFS_HANDLER_WRAPPER to WIPHY_DEBUGFS_WRITE_HANDLER_WRAPPER to better reflect its purpose as a write handler. Additionally, updated the corresponding macro WIPHY_DEBUGFS_HANDLER_WRAPPER_MLD to WIPHY_DEBUGFS_WRITE_HANDLER_WRAPPER_MLD for consistency. This change does not alter the functionality but enhances the maintainability of the code. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250313002008.bb8a1d7907c8.I53325f2f37ccaad2b212d35d10616e06c1555e48@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18Merge net-next/main to resolve conflictsJohannes Berg
There are a few conflicts between the work that went into wireless and that's here now, resolve them. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-03-18ata: libata: Fix NCQ Non-Data log not supported printNiklas Cassel
Currently, both ata_dev_config_ncq_send_recv() - which checks for NCQ Send/Recv Log (Log Address 13h) and ata_dev_config_ncq_non_data() - which checks for NCQ Non-Data Log (Log Address 12h), uses the same print when the log is not supported: "NCQ Send/Recv Log not supported" This seems like a copy paste error, since NCQ Non-Data Log is actually a separate log. Fix the print to reference the correct log. Fixes: 284b3b77ea88 ("libata: NCQ encapsulation for ZAC MANAGEMENT OUT") Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250317111754.1666084-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2025-03-18net: phylink: Use phy_caps to get an interface's capabilities and modesMaxime Chevallier
Phylink has internal code to get the MAC capabilities of a given PHY interface (what are the supported speed and duplex). Extract that into phy_caps, but use the link_capa for conversion. Add an internal phylink helper for the link caps -> mac caps conversion, and use this in phylink_caps_to_linkmodes(). Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-14-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phylink: Convert capabilities to linkmodes using phy_capsMaxime Chevallier
phylink_caps_to_linkmodes() is used to derive a list of linkmodes that can be conceivably exposed using a given set of speeds and duplex through phylink's MAC capabilities. This list can be derived from the link_caps array in phy_caps, provided we convert the MAC capabilities into a LINK_CAPA bitmask first. Introduce an internal phylink helper phylink_caps_to_link_caps() to convert from MAC capabilities into phy_caps, then phy_caps_linkmodes() to do the link_caps -> linkmodes conversion. This avoids having to update phylink for every new linkmode. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-13-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phylink: Add a mapping between MAC_CAPS and LINK_CAPSMaxime Chevallier
phylink allows MAC drivers to report the capabilities in terms of speed, duplex and pause support. This is done through a dedicated set of enum values in the form of the MAC_ capabilities. They are very close to what the LINK_CAPA_xxx can express, with the difference that LINK_CAPA don't have any information about Pause/Asym Pause support. To prepare converting phylink to using the phy_caps, add the mapping between MAC capabilities and phy_caps. While doing so, we move the phylink_caps_params array up a bit to simplify future commits. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-12-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: drop phy_settings and the associated lookup helpersMaxime Chevallier
The phy_settings array is no longer relevant as it has now been replaced by the link_caps array and associated phy_caps helpers. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-11-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phylink: Use phy_caps_lookup for fixed-link configurationMaxime Chevallier
When phylink creates a fixed-link configuration, it finds a matching linkmode to set as the advertised, lp_advertising and supported modes based on the speed and duplex of the fixed link. Use the newly introduced phy_caps_lookup to get these modes instead of phy_lookup_settings(). This has the side effect that the matched settings and configured linkmodes may now contain several linkmodes (the intersection of supported linkmodes from the phylink settings and the linkmodes that match speed/duplex) instead of the one from phy_lookup_settings(). Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-10-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_device: Use link_capabilities lookup for PHY aneg configMaxime Chevallier
When configuring PHY advertising with autoneg disabled, we lookd for an exact linkmode to advertise and configure for the requested Speed and Duplex, specially at or over 1G. Using phy_caps_lookup allows us to build a list of the supported linkmodes at that speed that we can advertise instead of the first mode that matches. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-9-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_caps: Allow looking-up link caps based on speed and duplexMaxime Chevallier
As the link_caps array is efficient for <speed,duplex> lookups, implement a function for speed/duplex lookups that matches a given mask. This replicates to some extent the phy_lookup_settings() behaviour, matching full link_capabilities instead of a single linkmode. phy.c's phy_santize_settings() and phylink's phylink_ethtool_ksettings_set() performs such lookup using the phy_settings table, but are only interested in the actual speed/duplex that were matched, rathet than the individual linkmode. Similar to phy_lookup_settings(), the newly introduced phy_caps_lookup() will run through the link_caps[] array by descending speed/duplex order. If the link_capabilities for a given <speed/duplex> tuple intersects the passed linkmodes, we consider that a match. Similar to phy_lookup_settings(), we also allow passing an 'exact' boolean, allowing non-exact match. Here, we MUST always match the linkmodes mask, but we allow matching on lower speed settings. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-8-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_caps: Implement link_capabilities lookup by linkmodeMaxime Chevallier
In several occasions, phylib needs to lookup a set of matching speed and duplex against a given linkmode set. Instead of relying on the phy_settings array and thus iterate over the whole linkmodes list, use the link_capabilities array to lookup these matches, as we aren't interested in the actual link setting that matches but rather the speed and duplex for that setting. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-7-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_caps: Introduce phy_caps_validMaxime Chevallier
With the link_capabilities array, it's trivial to validate a given mask againts a <speed, duplex> tuple. Create a helper for that purpose, and use it to replace a phy_settings lookup in phy_check_valid(); Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-6-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_caps: Move __set_linkmode_max_speed to phy_capsMaxime Chevallier
Convert the __set_linkmode_max_speed to use the link_capabilities array. This makes it easy to clamp the linkmodes to a given max speed. Introduce a new helper phy_caps_linkmode_max_speed to replace the previous one that used phy_settings. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-5-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: phy_caps: Move phy_speeds to phy_capsMaxime Chevallier
Use the newly introduced link_capabilities array to derive the list of possible speeds when given a combination of linkmodes. As link_capabilities is indexed by speed, we don't have to iterate the whole phy_settings array. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-4-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18net: phy: Use an internal, searchable storage for the linkmodesMaxime Chevallier
The canonical definition for all the link modes is in linux/ethtool.h, which is complemented by the link_mode_params array stored in net/ethtool/common.h . That array contains all the metadata about each of these modes, including the Speed and Duplex information. Phylib and phylink needs that information as well for internal management of the link, which was done by duplicating that information in locally-stored arrays and lookup functions. This makes it easy for developpers adding new modes to forget modifying phylib and phylink accordingly. However, the link_mode_params array in net/ethtool/common.c is fairly inefficient to search through, as it isn't sorted in any manner. Phylib and phylink perform a lot of lookup operations, mostly to filter modes by speed and/or duplex. We therefore introduce the link_caps private array in phy_caps.c, that indexes linkmodes in a more efficient manner. Each element associated a tuple <speed, duplex> to a bitfield of all the linkmodes runs at these speed/duplex. We end-up with an array that's fairly short, easily addressable and that it optimised for the typical use-cases of phylib/phylink. That array is initialized at the same time as phylib. As the link_mode_params array is part of the net stack, which phylink depends on, it should always be accessible from phylib. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250307173611.129125-3-maxime.chevallier@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18mtd: spi-nor: drop unused <linux/of_platform.h>Tudor Ambarus
There's nothing used in the SPI NOR core from <linux/of_platform.h>, drop the header inclusion. Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20250307-spi-nor-headers-cleanup-v1-3-c186a9511c1e@linaro.org Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
2025-03-18mtd: spi-nor: explicitly include <linux/of.h>Tudor Ambarus
The core driver is using of_property_read_bool() and relies on implicit inclusion of <linux/of.h>, which comes from <linux/mtd/mtd.h>. It is good practice to directly include all headers used, it avoids implicit dependencies and spurious breakage if someone rearranges headers and causes the implicit include to vanish. Include the missing header. Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20250307-spi-nor-headers-cleanup-v1-1-c186a9511c1e@linaro.org Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
2025-03-17drivers/base/memory: improve add_boot_memory_block()Gavin Shan
Patch series "drivers/base/memory: Two cleanups", v3. Two cleanups to drivers/base/memory. This patch (of 2)L It's unnecessary to count the present sections for the specified block since the block will be added if any section in the block is present. Besides, for_each_present_section_nr() can be reused as Andrew Morton suggested. Improve by using for_each_present_section_nr() and dropping the unnecessary @section_count. No functional changes intended. Link: https://lkml.kernel.org/r/20250311233045.148943-1-gshan@redhat.com Link: https://lkml.kernel.org/r/20250311233045.148943-2-gshan@redhat.com Signed-off-by: Gavin Shan <gshan@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17device/dax: properly refcount device dax pages when mappingAlistair Popple
Device DAX pages are currently not reference counted when mapped, instead relying on the devmap PTE bit to ensure mapping code will not get/put references. This requires special handling in various page table walkers, particularly GUP, to manage references on the underlying pgmap to ensure the pages remain valid. However there is no reason these pages can't be refcounted properly at map time. Doning so eliminates the need for the devmap PTE bit, freeing up a precious PTE bit. It also simplifies GUP as it no longer needs to manage the special pgmap references and can instead just treat the pages normally as defined by vm_normal_page(). Link: https://lkml.kernel.org/r/968d3a8e9157e7492e85d065765c027e525f9fc9.1740713401.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: Dan Wiliams <dan.j.williams@intel.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: linmiaohe <linmiaohe@huawei.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17fs/dax: properly refcount fs dax pagesAlistair Popple
Currently fs dax pages are considered free when the refcount drops to one and their refcounts are not increased when mapped via PTEs or decreased when unmapped. This requires special logic in mm paths to detect that these pages should not be properly refcounted, and to detect when the refcount drops to one instead of zero. On the other hand get_user_pages(), etc. will properly refcount fs dax pages by taking a reference and dropping it when the page is unpinned. Tracking this special behaviour requires extra PTE bits (eg. pte_devmap) and introduces rules that are potentially confusing and specific to FS DAX pages. To fix this, and to possibly allow removal of the special PTE bits in future, convert the fs dax page refcounts to be zero based and instead take a reference on the page each time it is mapped as is currently the case for normal pages. This may also allow a future clean-up to remove the pgmap refcounting that is currently done in mm/gup.c. Link: https://lkml.kernel.org/r/c7d886ad7468a20452ef6e0ddab6cfe220874e7c.1740713401.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: linmiaohe <linmiaohe@huawei.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17dcssblk: mark DAX broken, remove FS_DAX_LIMITED supportDan Williams
The dcssblk driver has long needed special case supoprt to enable limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode works around the incomplete support for ZONE_DEVICE on s390 by forgoing the ability of dax-mapped pages to support GUP. Now, pending cleanups to fsdax that fix its reference counting [1] depend on the ability of all dax drivers to supply ZONE_DEVICE pages. To allow that work to move forward, dax support needs to be paused for dcssblk until ZONE_DEVICE support arrives. That work has been known for a few years [2], and the removal of "pte_devmap" requirements [3] makes the conversion easier. For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL (dcssblk was the only user). Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1] Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2] Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3] Link: https://lkml.kernel.org/r/33eef2379c0d240f40cc15453fad2df1a4ae34c8.1740713401.git-series.apopple@nvidia.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Tested-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alistair Popple <apopple@nvidia.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: linmiaohe <linmiaohe@huawei.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17mm: allow compound zone device pagesAlistair Popple
Zone device pages are used to represent various type of device memory managed by device drivers. Currently compound zone device pages are not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user of higher order zone device pages and have their own page reference counting. A future change will unify FS DAX reference counting with normal page reference counting rules and remove the special FS DAX reference counting. Supporting that requires compound zone device pages. Supporting compound zone device pages requires compound_head() to distinguish between head and tail pages whilst still preserving the special struct page fields that are specific to zone device pages. A tail page is distinguished by having bit zero being set in page->compound_head, with the remaining bits pointing to the head page. For zone device pages page->compound_head is shared with page->pgmap. The page->pgmap field must be common to all pages within a folio, even if the folio spans memory sections. Therefore pgmap is the same for both head and tail pages and can be moved into the folio and we can use the standard scheme to find compound_head from a tail page. Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Balbir Singh <balbirs@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: linmiaohe <linmiaohe@huawei.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17mm/mm_init: move p2pdma page refcount initialisation to p2pdmaAlistair Popple
Currently ZONE_DEVICE page reference counts are initialised by core memory management code in __init_zone_device_page() as part of the memremap() call which driver modules make to obtain ZONE_DEVICE pages. This initialises page refcounts to 1 before returning them to the driver. This was presumably done because it drivers had a reference of sorts on the page. It also ensured the page could always be mapped with vm_insert_page() for example and would never get freed (ie. have a zero refcount), freeing drivers of manipulating page reference counts. However it complicates figuring out whether or not a page is free from the mm perspective because it is no longer possible to just look at the refcount. Instead the page type must be known and if GUP is used a secondary pgmap reference is also sometimes needed. To simplify this it is desirable to remove the page reference count for the driver, so core mm can just use the refcount without having to account for page type or do other types of tracking. This is possible because drivers can always assume the page is valid as core kernel will never offline or remove the struct page. This means it is now up to drivers to initialise the page refcount as required. P2PDMA uses vm_insert_page() to map the page, and that requires a non-zero reference count when initialising the page so set that when the page is first mapped. Link: https://lkml.kernel.org/r/6aedb0ac2886dcc4503cb705273db5b3863a0b66.1740713401.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: linmiaohe <linmiaohe@huawei.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael "Camp Drill Sergeant" Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17dax: remove access to page->indexMatthew Wilcox (Oracle)
This looks like a complete mess (why are we setting page->index at page fault time?), but I no longer care about DAX, and there's no reason to let DAX hold us back from removing page->index. Link: https://lkml.kernel.org/r/20241216155408.8102-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-17scsi: st: Tighten the page format heuristics with MODE SELECTKai Mäkisara
In the days when SCSI-2 was emerging, some drives did claim SCSI-2 but did not correctly implement it. The st driver first tries MODE SELECT with the page format bit set to set the block descriptor. If not successful, the non-page format is tried. The test only tests the sense code and this triggers also from illegal parameter in the parameter list. The test is limited to "old" devices and made more strict to remove false alarms. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-4-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-03-17scsi: st: ERASE does not change tape locationKai Mäkisara
The SCSI ERASE command erases from the current position onwards. Don't clear the position variables. Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-3-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-03-17scsi: st: Fix array overflow in st_setup()Kai Mäkisara
Change the array size to follow parms size instead of a fixed value. Reported-by: Chenyuan Yang <chenyuan0y@gmail.com> Closes: https://lore.kernel.org/linux-scsi/CALGdzuoubbra4xKOJcsyThdk5Y1BrAmZs==wbqjbkAgmKS39Aw@mail.gmail.com/ Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Link: https://lore.kernel.org/r/20250311112516.5548-2-Kai.Makisara@kolumbus.fi Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>