summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-28rhashtable: Restore RCU marking on rhash_lock_headHerbert Xu
This patch restores the RCU marking on bucket_table->buckets as it really does need RCU protection. Its removal had led to a fatal bug. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28rhashtable: Fix unprotected RCU dereference in __rht_ptrHerbert Xu
The rcu_dereference call in rht_ptr_rcu is completely bogus because we've already dereferenced the value in __rht_ptr and operated on it. This causes potential double readings which could be fatal. The RCU dereference must occur prior to the comparison in __rht_ptr. This patch changes the order of RCU dereference so that it is done first and the result is then fed to __rht_ptr. The RCU marking changes have been minimised using casts which will be removed in a follow-up patch. Fixes: ba6306e3f648 ("rhashtable: Remove RCU marking from...") Reported-by: "Gong, Sishuai" <sishuai@purdue.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'introduce-PLDM-firmware-update-library'David S. Miller
Jacob Keller says: ==================== introduce PLDM firmware update library This series goal is to enable support for updating the ice hardware flash using the devlink flash command. The ice firmware update files are distributed using the file format described by the "PLDM for Firmware Update" standard: https://www.dmtf.org/documents/pmci/pldm-firmware-update-specification-100 Because this file format is standard, this series introduces a new library that handles the generic logic for parsing the PLDM file header. The library uses a design that is very similar to the Mellanox mlxfw module. That is, a simple ops table is setup and device drivers instantiate an instance of the pldmfw library with the device specific operations. Doing so allows for each device to implement the low level behavior for how to interact with its firmware. This series includes the library and an implementation for the ice hardware. I've removed all of the parameters, and the proposed changes to support overwrite mode. I'll be working on the overwrite mask suggestion from Jakub as a follow-up series. Because the PLDM file format is a standard and not something that is specific to the Intel hardware, I opted to place this update library in lib/pldmfw. I should note that while I tried to make the library generic, it does not attempt to mimic the complete "Update Agent" as defined in the standard. This is mostly due to the fact that the actual interfaces exposed to software for the ice hardware would not allow this. This series depends on some work just recently and is based on top of the patch series sent by Tony published at: https://lore.kernel.org/netdev/20200723234720.1547308-1-anthony.l.nguyen@intel.com/T/#t Changes since v2 RFC * Removed overwrite mode patches, as this can become a follow up series with a separate discussion * Fixed a minor bug in the pldm_timestamp structure not being packed. * Dropped Cc for other driver maintainers, as this series no longer includes changes to the core flash update command. * Re-ordered patches slightly. Changes since v1 RFC * Removed the "allow_downgrade_on_flash_update" parameter. Instead, the driver will always attempt to flash the device, even when firmware indicates that it would be a downgrade. A dev_warn is used to indicate when this occurs. * Removed the "ignore_pending_flash_update". Instead, the driver will always check for and cancel any previous pending update. A devlink flash status message will be sent when this cancellation occurs. * Removed the "reset_after_flash_update" parameter. This will instead be implemented as part of a devlink reset interface, work left for a future change. * Replaced the "flash_update_preservation_level" parameter with a new "overwrite" mode attribute on the flash update command. For ice, this mode will select the preservation level. For all other drivers, I modified them to check that the mode is "OVERWRITE_NOTHING", and have Cc'd the maintainers to get their input. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: implement device flash update via devlinkJacob Keller
Use the newly added pldmfw library to implement device flash update for the Intel ice networking device driver. This support uses the devlink flash update interface. The main parts of the flash include the Option ROM, the netlist module, and the main NVM data. The PLDM firmware file contains modules for each of these components. Using the pldmfw library, the provided firmware file will be scanned for the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for the main NVM module containing the primary device firmware, and "fw.netlist" containing the netlist module. The flash is separated into two banks, the active bank containing the running firmware, and the inactive bank which we use for update. Each module is updated in a staged process. First, the inactive bank is erased, preparing the device for update. Second, the contents of the component are copied to the inactive portion of the flash. After all components are updated, the driver signals the device to switch the active bank during the next EMP reset (which would usually occur during the next reboot). Although the firmware AdminQ interface does report an immediate status for each command, the NVM erase and NVM write commands receive status asynchronously. The driver must not continue writing until previous erase and write commands have finished. The real status of the NVM commands is returned over the receive AdminQ. Implement a simple interface that uses a wait queue so that the main update thread can sleep until the completion status is reported by firmware. For erasing the inactive banks, this can take quite a while in practice. To help visualize the process to the devlink application and other applications based on the devlink netlink interface, status is reported via the devlink_flash_update_status_notify. While we do report status after each 4k block when writing, there is no real status we can report during erasing. We simply must wait for the complete module erasure to finish. With this implementation, basic flash update for the ice hardware is supported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: add flags indicating pending update of firmware moduleJacob Keller
After a flash update, the pending status of the update can be determined from the device capabilities. Read the appropriate device capability and store whether there is a pending update awaiting a reboot. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: Add AdminQ commands for FW updateCudzilo, Szymon T
Add structures, identifiers, and helper functions for several AdminQ commands related to performing a firmware update for the ice hardware. These will be used in future code for implementing the devlink .flash_update handler. Signed-off-by: Cudzilo, Szymon T <szymon.t.cudzilo@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28ice: Add support for unified NVM update flow capabilityJacek Naczyk
Extends function parsing response from Discover Device Capability AQC to check if the device supports unified NVM update flow. Signed-off-by: Jacek Naczyk <jacek.naczyk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Add pldmfw library for PLDM firmware updateJacob Keller
The pldmfw library is used to implement common logic needed to flash devices based on firmware files using the format described by the PLDM for Firmware Update standard. This library consists of logic to parse the PLDM file format from a firmware file object, as well as common logic for sending the relevant PLDM header data to the device firmware. A simple ops table is provided so that device drivers can implement device specific hardware interactions while keeping the common logic to the pldmfw library. This library will be used by the Intel ice networking driver as part of implementing device flash update via devlink. The library aims to be vendor and device agnostic. For this reason, it has been placed in lib/pldmfw, in the hopes that other devices which use the PLDM firmware file format may benefit from it in the future. However, do note that not all features defined in the PLDM standard have been implemented. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: ethernet: mtk_eth_soc: Always call mtk_gmac0_rgmii_adjust() for mt7623René van Dorst
Modify mtk_gmac0_rgmii_adjust() so it can always be called. mtk_gmac0_rgmii_adjust() sets-up the TRGMII clocks. Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-By: David Woodhouse <dwmw2@infradead.org> Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'mptcp-Exchange-MPTCP-DATA_FIN-DATA_ACK-before-TCP-FIN'David S. Miller
Mat Martineau says: ==================== mptcp: Exchange MPTCP DATA_FIN/DATA_ACK before TCP FIN This series allows the MPTCP-level connection to be closed with the peers exchanging DATA_FIN and DATA_ACK according to the state machine in appendix D of RFC 8684. The process is very similar to the TCP disconnect state machine. The prior code sends DATA_FIN only when TCP FIN packets are sent, and does not allow for the MPTCP-level connection to be half-closed. Patch 8 ("mptcp: Use full MPTCP-level disconnect state machine") is the core of the series. Earlier patches in the series have some small fixes and helpers in preparation, and the final four small patches do some cleanup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Safely store sequence number when sending dataMat Martineau
The MPTCP socket's write_seq member can be read without the msk lock held, so use WRITE_ONCE() to store it. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Safely read sequence number when lock isn't heldMat Martineau
The MPTCP socket's write_seq member should be read with READ_ONCE() when the msk lock is not held. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Skip unnecessary skb extension allocation for bare acksMat Martineau
Bare TCP ack skbs are freed right after MPTCP sees them, so the work to allocate, zero, and populate the MPTCP skb extension is wasted. Detect these skbs and do not add skb extensions to them. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Only use subflow EOF signaling on fallback connectionsMat Martineau
The MPTCP state machine handles disconnections on non-fallback connections, but the mptcp_sock still needs to get notified when fallback subflows disconnect. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Use full MPTCP-level disconnect state machineMat Martineau
RFC 8684 appendix D describes the connection state machine for MPTCP. This patch implements the DATA_FIN / DATA_ACK exchanges and MPTCP-level socket state changes described in that appendix, rather than simply sending DATA_FIN along with TCP FIN when disconnecting subflows. DATA_FIN is now sent and acknowledged before shutting down the subflows. Received DATA_FIN information (if not part of a data packet) is written to the MPTCP socket when the incoming DSS option is parsed by the subflow, and the MPTCP worker is scheduled to process the flag. DATA_FIN received as part of a full DSS mapping will be handled when the mapping is processed. The DATA_FIN is acknowledged by the worker if the reader is caught up. If there is still data to be moved to the MPTCP-level queue, ack_seq will be incremented to account for the DATA_FIN when it reaches the end of the stream and a DATA_ACK will be sent to the peer. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Add helper to process acks of DATA_FINMat Martineau
After DATA_FIN has been sent, the peer will acknowledge it. An ack of the relevant MPTCP-level sequence number will update the MPTCP connection state appropriately. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Add mptcp_close_state() helperMat Martineau
This will be used to transition to the appropriate state on close and determine if a DATA_FIN needs to be sent for that state transition. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Track received DATA_FIN sequence number and add related helpersMat Martineau
Incoming DATA_FIN headers need to propagate the presence of the DATA_FIN bit and the associated sequence number to the MPTCP layer, even when arriving on a bare ACK that does not get added to the receive queue. Add structure members to store the DATA_FIN information and helpers to set and check those values. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Use MPTCP-level flag for sending DATA_FINMat Martineau
Since DATA_FIN information is the same for every subflow, store it only in the mptcp_sock. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Remove outdated and incorrect commentMat Martineau
mptcp_close() acquires the msk lock, so it clearly should not be held before the function is called. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Return EPIPE if sending is shut down during a sendmsgMat Martineau
A MPTCP socket where sending has been shut down should not attempt to send additional data, since DATA_FIN has already been sent. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mptcp: Allow DATA_FIN in headers without TCP FINMat Martineau
RFC 8684-compliant DATA_FIN needs to be sent and ack'd before subflows are closed with TCP FIN, so write DATA_FIN DSS headers whenever their transmission has been enabled by the MPTCP connection-level socket. Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge tag 'mlx5-fixes-2020-07-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes-2020-07-28 This series introduces some fixes to mlx5 driver. v1->v2: - Drop the "Hold reference on mirred devices" patch, until Or's comments are addressed. - Imporve "Modify uplink state" patch commit message per Or's request. Please pull and let me know if there is any problem. For -Stable: For -stable v4.9 ('net/mlx5e: Fix error path of device attach') For -stable v4.15 ('net/mlx5: Verify Hardware supports requested ptp function on a given pin') For -stable v5.3 ('net/mlx5e: Modify uplink state on interface up/down') For -stable v5.4 ('net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev') ('net/mlx5: E-switch, Destroy TSAR when fail to enable the mode') For -stable v5.5 ('net/mlx5: E-switch, Destroy TSAR after reload interface') For -stable v5.7 ('net/mlx5: Fix a bug of using ptp channel index as pin index') ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-29bpf: Fix build without CONFIG_NET when using BPF XDP linkAndrii Nakryiko
Entire net/core subsystem is not built without CONFIG_NET. linux/netdevice.h just assumes that it's always there, so the easiest way to fix this is to conditionally compile out bpf_xdp_link_attach() use in bpf/syscall.c. Fixes: aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200728190527.110830-1-andriin@fb.com
2020-07-29bpf, selftests: use :: 1 for localhost in tcp_server.pyJohn Fastabend
Using localhost requires the host to have a /etc/hosts file with that specific line in it. By default my dev box did not, they used ip6-localhost, so the test was failing. To fix remove the need for any /etc/hosts and use ::1. I could just add the line, but this seems easier. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159594714197.21431.10113693935099326445.stgit@john-Precision-5820-Tower
2020-07-28Merge branch 'sockptr_t-fixes-v2'David S. Miller
Christoph Hellwig says: ==================== sockptr_t fixes v2 a bunch of fixes for the sockptr_t conversion Changes since v1: - fix a user pointer dereference braino in bpfilter ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: improve the user pointer check in init_user_sockptrChristoph Hellwig
Make sure not just the pointer itself but the whole range lies in the user address space. For that pass the length and then use the access_ok helper to do the check. Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces") Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: remove sockptr_advanceChristoph Hellwig
sockptr_advance never properly worked. Replace it with _offset variants of copy_from_sockptr and copy_to_sockptr. Fixes: ba423fdaa589 ("net: add a new sockptr_t type") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: make sockptr_is_null strict aliasing safeChristoph Hellwig
While the kernel in general is not strict aliasing safe we can trivially do that in sockptr_is_null without affecting code generation, so always check the actually assigned union member. Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28netfilter: arp_tables: restore a SPDX identifierChristoph Hellwig
This was accidentally removed in an unrelated commit. Fixes: c2f12630c60f ("netfilter: switch nf_setsockopt to sockptr_t") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'net-lan78xx-fix-NULL-deref-and-memory-leak'David S. Miller
Johan Hovold says: ==================== net: lan78xx: fix NULL deref and memory leak The first two patches fix a NULL-pointer dereference at probe that can be triggered by a malicious device and a small transfer-buffer memory leak, respectively. For another subsystem I would have marked them: Cc: stable@vger.kernel.org # 4.3 The third one replaces the driver's current broken endpoint lookup helper, which could end up accepting incomplete interfaces and whose results weren't even useeren Johan ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: lan78xx: replace bogus endpoint lookupJohan Hovold
Drop the bogus endpoint-lookup helper which could end up accepting interfaces based on endpoints belonging to unrelated altsettings. Note that the returned bulk pipes and interrupt endpoint descriptor were never actually used. Instead the bulk-endpoint numbers are hardcoded to 1 and 2 (matching the specification), while the interrupt- endpoint descriptor was assumed to be the third descriptor created by USB core. Try to bring some order to this by dropping the bogus lookup helper and adding the missing endpoint sanity checks while keeping the interrupt- descriptor assumption for now. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: lan78xx: fix transfer-buffer memory leakJohan Hovold
The interrupt URB transfer-buffer was never freed on disconnect or after probe errors. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: lan78xx: add missing endpoint sanity checkJohan Hovold
Add the missing endpoint sanity check to prevent a NULL-pointer dereference should a malicious device lack the expected endpoints. Note that the driver has a broken endpoint-lookup helper, lan78xx_get_endpoints(), which can end up accepting interfaces in an altsetting without endpoints as long as *some* altsetting has a bulk-in and a bulk-out endpoint. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge branch 'mlxsw-Add-support-for-QSFP-DD-transceiver-type'David S. Miller
Ido Schimmel says: ==================== mlxsw: Add support for QSFP-DD transceiver type This patch set from Vadim adds support for Quad Small Form Factor Pluggable Double Density (QSFP-DD) modules in mlxsw. Patch #1 enables dumping of QSFP-DD module information through ethtool. Patch #2 enables reading of temperature thresholds from QSFP-DD modules for hwmon and thermal zone purposes. Changes since v1 [1]: Only rebase on top of net-next. After discussing with Andrew and Adrian we agreed that current approach is OK and that in the future we can follow Andrew's suggestion to "make a new API where user space can request any pages it want, and specify the size of the page". This should allow us "to work around known issues when manufactures get their EEPROM wrong". [1] https://lore.kernel.org/netdev/20200626144724.224372-1-idosch@idosch.org/#t ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mlxsw: core: Add support for temperature thresholds reading for QSFP-DD ↵Vadim Pasternak
transceivers Allow QSFP-DD transceivers temperature thresholds reading for hardware monitoring and thermal control. For this type, the thresholds are located in page 02h according to the "Module and Lane Thresholds" description from Common Management Interface Specification. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28mlxsw: core: Add ethtool support for QSFP-DD transceiversVadim Pasternak
The Quad Small Form Factor Pluggable Double Density (QSFP-DD) hardware specification defines a form factor that supports up to 400 Gbps in aggregate over an 8x50-Gbps electrical interface. The QSFP-DD supports both optical and copper interfaces. Implementation is based on Common Management Interface Specification; Rev 4.0 May 8, 2019. Table 8-2 "Identifier and Status Summary (Lower Page)" from this spec defines "Id and Status" fields located at offsets 00h - 02h. Bit 2 at offset 02h ("Flat_mem") specifies QSFP EEPROM memory mode, which could be "upper memory flat" or "paged". Flat memory mode is coded "1", and indicates that only page 00h is implemented in EEPROM. Paged memory is coded "0" and indicates that pages 00h, 01h, 02h, 10h and 11h are implemented. Pages 10h and 11h are currently not supported by the driver. "Flat" memory mode is used for the passive copper transceivers. For this type only page 00h (256 bytes) is available. "Paged" memory is used for the optical transceivers. For this type pages 00h (256 bytes), 01h (128 bytes) and 02h (128 bytes) are available. Upper page 01h contains static advertising field, while upper page 02h contains the module-defined thresholds and lane-specific monitors. Extend enumerator 'mlxsw_reg_mcia_eeprom_module_info_id' with additional field 'MLXSW_REG_MCIA_EEPROM_MODULE_INFO_TYPE_ID'. This field is used to indicate for QSFP-DD transceiver type which memory mode is to be used. Expose 256 bytes buffer for QSFP-DD passive copper transceiver and 512 bytes buffer for optical. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Merge tag 'mlx5-updates-2020-07-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-07-28 Misc and small update to mlx5 driver: 1) Aya adds PCIe relaxed ordering support for mlx5 netdev queues. 2) Eran Refactors pages data base to be per vf/function to speedup unload time. 3) Parav changes eswitch steering initialization to account for tota_vports rather than for only active vports and Link non uplink representors to PCI device, for uniform naming scheme. 4) Tariq, trivial RX code improvements and missing inidirect calls wrappers. 5) Small cleanup patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28usb: hso: check for return value in hso_serial_common_create()Rustam Kovhaev
in case of an error tty_register_device_attr() returns ERR_PTR(), add IS_ERR() check Reported-and-tested-by: syzbot+67b2bd0e34f952d0321e@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=67b2bd0e34f952d0321e Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28farsync: use generic power managementVaibhav Gupta
The .suspend() and .resume() callbacks are not defined for this driver. Still, their power management structure follows the legacy framework. To bring it under the generic framework, simply remove the binding of callbacks from "struct pci_driver". Change code indentation from space to tab in "struct pci_driver". Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net/mlx5e: Fix kernel crash when setting vf VLANID on a VF devAlaa Hleihel
After the cited commit, function 'mlx5_eswitch_set_vport_vlan' started to acquire esw->state_lock. However, esw is not defined for VF devices, hence attempting to set vf VLANID on a VF dev will cause a kernel panic. Fix it by moving up the (redundant) esw validation from function '__mlx5_eswitch_set_vport_vlan' since the rest of the callers now have and use a valid esw. For example with vf device eth4: # ip link set dev eth4 vf 0 vlan 0 Trace of the panic: [ 411.409842] BUG: unable to handle page fault for address: 00000000000011b8 [ 411.449745] #PF: supervisor read access in kernel mode [ 411.452348] #PF: error_code(0x0000) - not-present page [ 411.454938] PGD 80000004189c9067 P4D 80000004189c9067 PUD 41899a067 PMD 0 [ 411.458382] Oops: 0000 [#1] SMP PTI [ 411.460268] CPU: 4 PID: 5711 Comm: ip Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1 [ 411.462447] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 411.464158] RIP: 0010:__mutex_lock+0x4e/0x940 [ 411.464928] Code: fd 41 54 49 89 f4 41 52 53 89 d3 48 83 ec 70 44 8b 1d ee 03 b0 01 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 45 85 db 75 0a <48> 3b 7f 60 0f 85 7e 05 00 00 49 8d 45 68 41 56 41 b8 01 00 00 00 [ 411.467678] RSP: 0018:ffff88841fcd74b0 EFLAGS: 00010246 [ 411.468562] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 411.469715] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000001158 [ 411.470812] RBP: ffff88841fcd7550 R08: ffffffffa00fa1ce R09: 0000000000000000 [ 411.471835] R10: ffff88841fcd7570 R11: 0000000000000000 R12: 0000000000000002 [ 411.472862] R13: 0000000000001158 R14: ffffffffa00fa1ce R15: 0000000000000000 [ 411.474004] FS: 00007faee7ca6b80(0000) GS:ffff88846fc00000(0000) knlGS:0000000000000000 [ 411.475237] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 411.476129] CR2: 00000000000011b8 CR3: 000000041909c006 CR4: 0000000000360ea0 [ 411.477260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 411.478340] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 411.479332] Call Trace: [ 411.479760] ? __nla_validate_parse.part.6+0x57/0x8f0 [ 411.482825] ? mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core] [ 411.483804] mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core] [ 411.484733] mlx5e_set_vf_vlan+0x41/0x50 [mlx5_core] [ 411.485545] do_setlink+0x613/0x1000 [ 411.486165] __rtnl_newlink+0x53d/0x8c0 [ 411.486791] ? mark_held_locks+0x49/0x70 [ 411.487429] ? __lock_acquire+0x8fe/0x1eb0 [ 411.488085] ? rcu_read_lock_sched_held+0x52/0x60 [ 411.488998] ? kmem_cache_alloc_trace+0x16d/0x2d0 [ 411.489759] rtnl_newlink+0x47/0x70 [ 411.490357] rtnetlink_rcv_msg+0x24e/0x450 [ 411.490978] ? netlink_deliver_tap+0x92/0x3d0 [ 411.491631] ? validate_linkmsg+0x330/0x330 [ 411.492262] netlink_rcv_skb+0x47/0x110 [ 411.492852] netlink_unicast+0x1ac/0x270 [ 411.493551] netlink_sendmsg+0x336/0x450 [ 411.494209] sock_sendmsg+0x30/0x40 [ 411.494779] ____sys_sendmsg+0x1dd/0x1f0 [ 411.495378] ? copy_msghdr_from_user+0x5c/0x90 [ 411.496082] ___sys_sendmsg+0x87/0xd0 [ 411.496683] ? lock_acquire+0xb9/0x3a0 [ 411.497322] ? lru_cache_add+0x5/0x170 [ 411.497944] ? find_held_lock+0x2d/0x90 [ 411.498568] ? handle_mm_fault+0xe46/0x18c0 [ 411.499205] ? __sys_sendmsg+0x51/0x90 [ 411.499784] __sys_sendmsg+0x51/0x90 [ 411.500341] do_syscall_64+0x59/0x2e0 [ 411.500938] ? asm_exc_page_fault+0x8/0x30 [ 411.501609] ? rcu_read_lock_sched_held+0x52/0x60 [ 411.502350] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 411.503093] RIP: 0033:0x7faee73b85a7 [ 411.503654] Code: Bad RIP value. Fixes: 0e18134f4f9f ("net/mlx5e: Eswitch, use state_lock to synchronize vlan change") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Modify uplink state on interface up/downRon Diskin
When setting the PF interface up/down, notify the firmware to update uplink state via MODIFY_VPORT_STATE, when E-Switch is enabled. This behavior will prevent sending traffic out on uplink port when PF is down, such as sending traffic from a VF interface which is still up. Currently when calling mlx5e_open/close(), the driver only sends PAOS command to notify the firmware to set the physical port state to up/down, however, it is not sufficient. When VF is in "auto" state, it follows the uplink state, which was not updated on mlx5e_open/close() before this patch. When switchdev mode is enabled and uplink representor is first enabled, set the uplink port state value back to its FW default "AUTO". Fixes: 63bfd399de55 ("net/mlx5e: Send PAOS command on interface up/down") Signed-off-by: Ron Diskin <rondi@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Query PPS pin operational status before registering itEran Ben Elisha
In a special configuration, a ConnectX6-Dx pin pps-out might be activated when driver is loaded. Fix the driver to always read the operational pin mode when registering it, and advertise it accordingly. Fixes: ee7f12205abc ("net/mlx5e: Implement 1PPS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Fix slab-out-of-bounds in mlx5e_rep_is_lag_netdevRaed Salem
mlx5e_rep_is_lag_netdev is used as first check as part of netdev events handler for bond device of non-uplink representors, this handler can get any netdevice under the same network namespace of mlx5e netdevice. Current code treats the netdev as mlx5e netdev and only later on verifies this, hence causes the following Kasan trace: [15402.744990] ================================================================== [15402.746942] BUG: KASAN: slab-out-of-bounds in mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core] [15402.749009] Read of size 8 at addr ffff880391f3f6b0 by task ovs-vswitchd/5347 [15402.752065] CPU: 7 PID: 5347 Comm: ovs-vswitchd Kdump: loaded Tainted: G B O --------- -t - 4.18.0-g3dcc204d291d-dirty #1 [15402.755349] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [15402.757600] Call Trace: [15402.758968] dump_stack+0x71/0xab [15402.760427] print_address_description+0x6a/0x270 [15402.761969] kasan_report+0x179/0x2d0 [15402.763445] ? mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core] [15402.765121] mlx5e_rep_is_lag_netdev+0xcb/0xf0 [mlx5_core] [15402.766782] mlx5e_rep_esw_bond_netevent+0x129/0x620 [mlx5_core] Fix by deferring the violating access to be post the netdev verify check. Fixes: 7e51891a237f ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Verify Hardware supports requested ptp function on a given pinEran Ben Elisha
Fix a bug where driver did not verify Hardware pin capabilities for PTP functions. Fixes: ee7f12205abc ("net/mlx5e: Implement 1PPS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Fix a bug of using ptp channel index as pin indexEran Ben Elisha
On PTP mlx5_ptp_enable(on=0) flow, driver mistakenly used channel index as pin index. After ptp patch marked in fixes tag was introduced, driver can freely call ptp_find_pin() as part of the .enable() callback. Fix driver mlx5_ptp_enable(on=0) flow to always use ptp_find_pin(). With that, Driver will use the correct pin index in mlx5_ptp_enable(on=0) flow. In addition, when initializing the pins, always set channel to zero. As all pins can be attached to all channels, let ptp_set_pinfunc() to move them between the channels. For stable branches, this fix to be applied only on kernels that includes both patches in fixes tag. Otherwise, mlx5_ptp_enable(on=0) will be stuck on pincfg_mux. Fixes: 62582a7ee783 ("ptp: Avoid deadlocks in the programmable pin code.") Fixes: ee7f12205abc ("net/mlx5e: Implement 1PPS support") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Fix missing cleanup of ethtool steering during rep rx cleanupMaor Dickman
The cited commit add initialization of ethtool steering during representor rx initializations without cleaning it up in representor rx cleanup, this may cause for stale ethtool flows to remain after moving back from switchdev mode to legacy mode. Fixed by calling ethtool steering cleanup during rep rx cleanup. Fixes: 6783e8b29f63 ("net/mlx5e: Init ethtool steering for representors") Signed-off-by: Maor Dickman <maord@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5e: Fix error path of device attachAya Levin
On failure to attach the netdev, fix the rollback by re-setting the device's state back to MLX5E_STATE_DESTROYING. Failing to attach doesn't stop statistics polling via .ndo_get_stats64. In this case, although the device is not attached, it falsely continues to query the firmware for counters. Setting the device's state back to MLX5E_STATE_DESTROYING prevents the firmware counters query. Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: Fix forward to next namespaceMaor Gottlieb
The steering tree is as follow (nic RX as example): --------- |root_ns| --------- | -------------------------------- | | | ---------- ---------- --------- |p(prio)0| | p1 | | pn | ---------- ---------- --------- | | ---------------- --------------- |ns(e.g bypass)| |ns(e.g. lag) | ---------------- --------------- | | | ---- ---- ---- |p0| |p1| |pn| ---- ---- ---- | ---- |FT| ---- find_next_chained_ft(prio) returns the first flow table in the next priority. If prio is a parent of a flow table then it returns the first flow table in the next priority in the same namespace, else if prio is parent of namespace, then it should return the first flow table in the next namespace. Currently if the user requests to forward to next namespace, the code calls to find_next_chained_ft with the prio of the next namespace and not the prio of the namesapce itself. Fixes: 9254f8ed15b6 ("net/mlx5: Add support in forward to namespace") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28net/mlx5: E-switch, Destroy TSAR after reload interfaceParav Pandit
When eswitch offloads is enabled, TSAR is created before reloading the interfaces. However when eswitch offloads mode is disabled, TSAR is disabled before reloading the interfaces. To keep the eswitch enable/disable sequence as mirror, destroy TSAR after reloading the interfaces. Fixes: 1bd27b11c1df ("net/mlx5: Introduce E-switch QoS management") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>