summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-24gve: gve_rx_copy: Move padding to an argumentBailey Forrest
Future use cases will have a different padding value. Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24gve: Move some static functions to a common fileBailey Forrest
These functions will be shared by the GQI and DQO variants of the GVNIC driver as of follow-up patches in this series. Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24gve: Update GVE documentation to describe DQOBailey Forrest
DQO is a new descriptor format for our next generation virtual NIC. Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24dt-bindings: interrupt-controller: Convert ARM VIC to json-schemaSudeep Holla
Convert the ARM VIC binding document to DT schema format using json-schema. Cc: Rob Herring <robh@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20210617205317.3060163-1-sudeep.holla@arm.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-06-24of: of_reserved_mem: mark nomap memory instead of removingDong Aisheng
Since commit 86588296acbf ("fdt: Properly handle "no-map" field in the memory region"), nomap memory is changed to call memblock_mark_nomap() instead of memblock_remove(). But it only changed the reserved memory with fixed addr and size case in early_init_dt_reserve_memory_arch(), not including the dynamical allocation by size case in early_init_dt_alloc_reserved_memory_arch(). Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Reviewed-by: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20210611131153.3731147-2-aisheng.dong@nxp.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-06-24of: of_reserved_mem: only call memblock_free for normal reserved memoryDong Aisheng
For nomap case, the memory block will be removed by memblock_remove() in early_init_dt_alloc_reserved_memory_arch(). So it's meaningless to call memblock_free() on error path. Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Link: https://lore.kernel.org/r/20210611131153.3731147-1-aisheng.dong@nxp.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-06-24PCI: intel-gw: Fix INTx enableMartin Blumenstingl
The legacy PCI interrupt lines need to be enabled using PCIE_APP_IRNEN bits 13 (INTA), 14 (INTB), 15 (INTC) and 16 (INTD). The old code however was taking (for example) "13" as raw value instead of taking BIT(13). Define the legacy PCI interrupt bits using the BIT() macro and then use these in PCIE_APP_IRN_INT. Link: https://lore.kernel.org/r/20210106135540.48420-1-martin.blumenstingl@googlemail.com Fixes: ed22aaaede44 ("PCI: dwc: intel: PCIe RC controller driver") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rahul Tanwar <rtanwar@maxlinear.com>
2021-06-24ipv6: fix out-of-bound access in ip6_parse_tlv()Eric Dumazet
First problem is that optlen is fetched without checking there is more than one byte to parse. Fix this by taking care of IPV6_TLV_PAD1 before fetching optlen (under appropriate sanity checks against len) Second problem is that IPV6_TLV_PADN checks of zero padding are performed before the check of remaining length. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Fixes: c1412fce7ecc ("net/ipv6/exthdrs.c: Strict PadN option checking") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24Merge branch 'macsec-key-length'David S. Miller
Antoine Tenart says: ==================== net: macsec: fix key length when offloading The key length used to copy the key to offloading drivers and to store it is wrong and was working by chance as it matched the default key length. But using a different key length fails. Fix it by using instead the max length accepted in uAPI to store the key and the actual key length when copying it. This was tested on the MSCC PHY driver but not on the Atlantic MAC (looking at the code it looks ok, but testing would be appreciated). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: atlantic: fix the macsec key lengthAntoine Tenart
The key length used to store the macsec key was set to MACSEC_KEYID_LEN (16), which is an issue as: - This was never meant to be the key length. - The key length can be > 16. Fix this by using MACSEC_MAX_KEY_LEN instead (the max length accepted in uAPI). Fixes: 27736563ce32 ("net: atlantic: MACSec egress offload implementation") Fixes: 9ff40a751a6f ("net: atlantic: MACSec ingress offload implementation") Reported-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: phy: mscc: fix macsec key lengthAntoine Tenart
The key length used to store the macsec key was set to MACSEC_KEYID_LEN (16), which is an issue as: - This was never meant to be the key length. - The key length can be > 16. Fix this by using MACSEC_MAX_KEY_LEN instead (the max length accepted in uAPI). Fixes: 28c5107aa904 ("net: phy: mscc: macsec support") Reported-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: macsec: fix the length used to copy the key for offloadingAntoine Tenart
The key length used when offloading macsec to Ethernet or PHY drivers was set to MACSEC_KEYID_LEN (16), which is an issue as: - This was never meant to be the key length. - The key length can be > 16. Fix this by using MACSEC_MAX_KEY_LEN to store the key (the max length accepted in uAPI) and secy->key_len to copy it. Fixes: 3cf3227a21d1 ("net: macsec: hardware offloading infrastructure") Reported-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24trace/hwlat: Switch disable_migrate to mode noneDaniel Bristot de Oliveira
When in the round-robin mode, if the tracer detects a change in the hwlatd thread affinity by an external tool, e.g., taskset, the round-robin logic is disabled. The disable_migrate variable currently tracks this. With the addition of the "mode" config and the mode "none," the disable_migrate logic is equivalent to switch to the "none" mode. Hence, instead of using a hidden variable to track this behavior, switch the mode to none, informing the user about this change. Link: https://lkml.kernel.org/r/a679af672458d6b1f62252605905c5214030f247.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-24trace/hwlat: Implement the mode config optionDaniel Bristot de Oliveira
Provides the "mode" config to the hardware latency detector. hwlatd has two different operation modes. The default mode is the "round-robin" one, in which a single hwlatd thread runs, migrating among the allowed CPUs in a "round-robin" fashion. This is the current behavior. The "none" sets the allowed cpumask for a single hwlatd thread at the startup, but skips the round-robin, letting the scheduler handle the migration. In preparation to the per-cpu mode. Link: https://lkml.kernel.org/r/f3b1271262aa030c680e26615c1b9b2d71e55e92.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-24trace/hwlat: Fix Clark's emailDaniel Bristot de Oliveira
Clark's email is williams@redhat.com. No functional change. Link: https://lkml.kernel.org/r/6fa4b49e17ab8a1ff19c335ab7cde38d8afb0e29.1624372313.git.bristot@redhat.com Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-24usbnet: add usbnet_event_names[] for keventYajun Deng
Modify the netdev_dbg content from int to char * in usbnet_defer_kevent(), this looks more readable. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24bootconfig/tracing/ktest: Add ktest examples of testing bootconfigSteven Rostedt (VMware)
bootconfig is a new feature that appends scripts onto the initrd, and the kernel executes the scripts as an extended kernel command line. Need to add tests to test that the happened. To test the bootconfig properly, the initrd needs to be updated and the kernel rebooted. ktest is the perfect solution to perform these tests. Add a example bootconfig.conf in the tools/testing/ktest/examples/include and example bootconfig scripts in tools/testing/ktest/examples/bootconfig and also include verifier scripts that ktest will install on the target and run to make sure that the bootconfig options in the scripts took place after the target rebooted with the new initrd update. Link: https://lkml.kernel.org/r/20210618112647.6a81dec5@oasis.local.home Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-24vfio: use the new pci_dev_trylock() helper to simplify try lockLuis Chamberlain
Use the new pci_dev_trylock() helper to simplify our locking. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20210623022824.308041-3-mcgrof@kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-24PCI: Export pci_dev_trylock() and pci_dev_unlock()Luis Chamberlain
Other places in the kernel use this form, and so just provide a common path for it. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20210623022824.308041-2-mcgrof@kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-24vfio/mdpy: Fix memory leak of object mdev_state->vconfigColin Ian King
In the case where the call to vfio_register_group_dev fails the error return path kfree's mdev_state but not mdev_state->vconfig. Fix this by kfree'ing mdev_state->vconfig before returning. Addresses-Coverity: ("Resource leak") Fixes: 437e41368c01 ("vfio/mdpy: Convert to use vfio_register_group_dev()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210622183710.28954-1-colin.king@canonical.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-24libceph: set global_id as soon as we get an auth ticketIlya Dryomov
Commit 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") delayed the setting of global_id too much. It is set only after all tickets are received, but in pre-nautilus clusters an auth ticket and the service tickets are obtained in separate steps (for a total of three MAuth replies). When the service tickets are requested, global_id is used to build an authorizer; if global_id is still 0 we never get them and fail to establish the session. Moving the setting of global_id into protocol implementations. This way global_id can be set exactly when an auth ticket is received, not sooner nor later. Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24libceph: don't pass result into ac->ops->handle_reply()Ilya Dryomov
There is no result to pass in msgr2 case because authentication failures are reported through auth_bad_method frame and in MAuth case an error is returned immediately. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24perf top: Add cgroup support for perf top (-G)Joshua Martinez
Added callback option (-G) to support cgroups for 'perf top'. Added condition to make sure -cgroup and --all-cgroups aren't both enabled. Example: $perf top -e cycles -G system.slice/docker-6b95a5eb649c0d671eba3835f0d93973d05a088f3ae8602246bde37affb1ba3e.scope -a --stdio PerfTop: 3330 irqs/sec kernel:68.2% exact: 0.0% lost: 0/0 drop: 0/11075 [4000Hz cpu-clock], (all, 4 CPUs) ------------------------------------------------------------------------------------------------------------------------------------------------------- 27.32% [unknown] [.] 0x00007f8ab7b69352 11.44% [kernel] [k] 0xffffffff968cd657 3.12% [kernel] [k] 0xffffffff96160e96 2.63% [kernel] [k] 0xffffffff96160eb0 1.96% [kernel] [k] 0xffffffff9615fcf6 1.42% [kernel] [k] 0xffffffff964ddfc7 1.09% [kernel] [k] 0xffffffff96160e90 0.81% [kernel] [k] 0xffffffff96160eb3 0.67% [kernel] [k] 0xffffffff9615fec1 0.57% [kernel] [k] 0xffffffff961ee1d0 0.53% [unknown] [.] 0x00007f8ab7b6666c 0.53% [kernel] [k] 0xffffffff96160e64 0.52% [kernel] [k] 0xffffffff9616c303 0.51% [kernel] [k] 0xffffffffc08e7d50 ... Signed-off-by: Joshua Martinez <joshuamart@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: joshua martinez <joshuamart@google.com> Link: http://lore.kernel.org/lkml/20210616231829.3735671-1-joshuamart@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-24spi: Fix self assignment issue with ancillary->modeColin Ian King
There is an assignment of ancillary->mode to itself which looks dubious since the proceeding comment states that the speed and mode is taken over from the SPI main device, indicating that ancillary->mode should assigned using the value spi->mode. Fix this. Addresses-Coverity: ("Self assignment") Fixes: 0c79378c0199 ("spi: add ancillary device support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210623172300.161484-1-colin.king@canonical.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-24RDMA/cma: Protect RMW with qp_mutexHåkon Bugge
The struct rdma_id_private contains three bit-fields, tos_set, timeout_set, and min_rnr_timer_set. These are set by accessor functions without any synchronization. If two or all accessor functions are invoked in close proximity in time, there will be Read-Modify-Write from several contexts to the same variable, and the result will be intermittent. Fixed by protecting the bit-fields by the qp_mutex in the accessor functions. The consumer of timeout_set and min_rnr_timer_set is in rdma_init_qp_attr(), which is called with qp_mutex held for connected QPs. Explicit locking is added for the consumers of tos and tos_set. This commit depends on ("RDMA/cma: Remove unnecessary INIT->INIT transition"), since the call to rdma_init_qp_attr() from cma_init_conn_qp() does not hold the qp_mutex. Fixes: 2c1619edef61 ("IB/cma: Define option to set ack timeout and pack tos_set") Fixes: 3aeffc46afde ("IB/cma: Introduce rdma_set_min_rnr_timer()") Link: https://lore.kernel.org/r/1624369197-24578-3-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-24ASoC: qcom: lpass-cpu: mark IRQ_CLEAR register as volatile and readableSrinivas Kandagatla
Currently IRQ_CLEAR register is marked as write-only, however using regmap_update_bits on this register will have some side effects. so mark IRQ_CLEAR register appropriately as readable and volatile. Fixes: da0363f7bfd3 ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20210624092153.5771-1-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-24RDMA/cma: Remove unnecessary INIT->INIT transitionHåkon Bugge
In rdma_create_qp(), a connected QP will be transitioned to the INIT state. Afterwards, the QP will be transitioned to the RTR state by the cma_modify_qp_rtr() function. But this function starts by performing an ib_modify_qp() to the INIT state again, before another ib_modify_qp() is performed to transition the QP to the RTR state. Hence, there is no need to transition the QP to the INIT state in rdma_create_qp(). Link: https://lore.kernel.org/r/1624369197-24578-2-git-send-email-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-24Merge branch 'add-sparx5i-driver'David S. Miller
Steen Hegelund says: ==================== Adding the Sparx5i Switch Driver This series provides the Microchip Sparx5i Switch Driver The SparX-5 Enterprise Ethernet switch family provides a rich set of Enterprise switching features such as advanced TCAM-based VLAN and QoS processing enabling delivery of differentiated services, and security through TCAMbased frame processing using versatile content aware processor (VCAP). IPv4/IPv6 Layer 3 (L3) unicast and multicast routing is supported with up to 18K IPv4/9K IPv6 unicast LPM entries and up to 9K IPv4/3K IPv6 (S,G) multicast groups. L3 security features include source guard and reverse path forwarding (uRPF) tasks. Additional L3 features include VRF-Lite and IP tunnels (IP over GRE/IP). The SparX-5 switch family features a highly flexible set of Ethernet ports with support for 10G and 25G aggregation links, QSGMII, USGMII, and USXGMII. The device integrates a powerful 1 GHz dual-core ARM® Cortex®-A53 CPU enabling full management of the switch and advanced Enterprise applications. The SparX-5 switch family targets managed Layer 2 and Layer 3 equipment in SMB, SME, and Enterprise where high port count 1G/2.5G/5G/10G switching with 10G/25G aggregation links is required. The SparX-5 switch family consists of following SKUs: VSC7546 SparX-5-64 supports up to 64 Gbps of bandwidth with the following primary port configurations. - 6 ×10G - 16 × 2.5G + 2 × 10G - 24 × 1G + 4 × 10G VSC7549 SparX-5-90 supports up to 90 Gbps of bandwidth with the following primary port configurations. - 9 × 10G - 16 × 2.5G + 4 × 10G - 48 × 1G + 4 × 10G VSC7552 SparX-5-128 supports up to 128 Gbps of bandwidth with the following primary port configurations. - 12 × 10G - 6 x 10G + 2 x 25G - 16 × 2.5G + 8 × 10G - 48 × 1G + 8 × 10G VSC7556 SparX-5-160 supports up to 160 Gbps of bandwidth with the following primary port configurations. - 16 × 10G - 10 × 10G + 2 × 25G - 16 × 2.5G + 10 × 10G - 48 × 1G + 10 × 10G VSC7558 SparX-5-200 supports up to 200 Gbps of bandwidth with the following primary port configurations. - 20 × 10G - 8 × 25G In addition, the device supports one 10/100/1000/2500/5000 Mbps SGMII/SerDes node processor interface (NPI) Ethernet port. Time sensitive networking (TSN) is supported through a comprehensive set of features including frame preemption, cut-through, frame replication and elimination for reliability, enhanced scheduling: credit-based shaping, time-aware shaping, cyclic queuing, and forwarding, and per-stream policing and filtering. Together with IEEE 1588 and IEEE 802.1AS support, this guarantees low-latency deterministic networking for Industrial Ethernet. The Sparx5i support is developed on the PCB134 and PCB135 evaluation boards. - PCB134 main networking features: - 12x SFP+ front 10G module slots (connected to Sparx5i through SFI). - 8x SFP28 front 25G module slots (connected to Sparx5i through SFI high speed). - Optional, one additional 10/100/1000BASE-T (RJ45) Ethernet port (on-board VSC8211 PHY connected to Sparx5i through SGMII). - PCB135 main networking features: - 48x1G (10/100/1000M) RJ45 front ports using 12xVSC8514 QuadPHY’s each connected to VSC7558 through QSGMII. - 4x10G (1G/2.5G/5G/10G) RJ45 front ports using the AQR407 10G QuadPHY each port connects to VSC7558 through SFI. - 4x SFP28 25G module slots on back connected to VSC7558 through SFI high speed. - Optional, one additional 1G (10/100/1000M) RJ45 port using an on-board VSC8211 PHY, which can be connected to VSC7558 NPI port through SGMII using a loopback add-on PCB) This series provides support for: - SFPs and DAC cables via PHYLINK with a number of 5G, 10G and 25G devices and media types. - Port module configuration for 10M to 25G speeds with SGMII, QSGMII, 1000BASEX, 2500BASEX and 10GBASER as appropriate for these modes. - SerDes configuration via the Sparx5i SerDes driver (see below). - Host mode providing register based injection and extraction. - Switch mode providing MAC/VLAN table learning and Layer2 switching offloaded to the Sparx5i switch. - STP state, VLAN support, host/bridge port mode, Forwarding DB, and configuration and statistics via ethtool. More support will be added at a later stage. The Sparx5i Chip Register Model can be browsed at this location: https://github.com/microchip-ung/sparx-5_reginfo and the datasheet is available here: https://ww1.microchip.com/downloads/en/DeviceDoc/SparX-5_Family_L2L3_Enterprise_10G_Ethernet_Switches_Datasheet_00003822B.pdf The series depends on the following series currently on their way into the kernel: - 25G Base-R phy mode Link: https://lore.kernel.org/r/20210611125453.313308-1-steen.hegelund@microchip.com/ - Sparx5 Reset Driver Link: https://lore.kernel.org/r/20210416084054.2922327-1-steen.hegelund@microchip.com/ ChangeLog: v5: - cover letter - updated the description to match the latest data sheets - basic driver - added error message in case of reset controller error - port struct: replacing has_sfp with inband, adding pause_adv - host mode - port cleanup: unregisters netdevs and then removes phylink etc - checking for pause_adv when comparing port config changes - getting duplex and pause state in the link_up callback. - getting inband, autoneg and pause_adv config in the pcs_config callback. - port - use only the pause_adv bits when getting aneg status - use the inband state when updating the PCS and port config v4: - basic driver: Using devm_reset_control_get_optional_shared to get the reset control, and let the reset framework check if it is valid. - host mode (phylink): Use the PCS operations to get state and update configuration. Removed the setting of interface modes. Let phylink control this. Using the new 5gbase-r and 25gbase-r modes. Using a helper function to check if one of the 3 base-r modes has been selected. Currently it will not be possible to change the interface mode by changing the speed (e.g via ethtool). This will be added later. v3: - basic driver: - removed unneeded braces - release reference to ports node after use - use dev_err_probe to handle DEFER - update error value when bailing out (a few cases) - updated formatting of port struct and grouping of bool values - simplified the spx5_rmw and spx5_inst_rmw inline functions - host mode (netdev): - removed lockless flag - added port timer init - host mode (packet - manual injection): - updated error counters in error situations - implemented timer handling of watermark threshold: stop and restart netif queues. - fixed error message handling (rate limited) - fixed comment style error - used DIV_ROUND_UP macro - removed a debug message for open ports v2: - Updated bindings: - drop minItems for the reg property - Statistics implementation: - Reorganized statistics into ethtool groups: eth-phy, eth-mac, eth-ctrl, rmon as defined by the IEEE 802.3 categories and RFC 2819. - The remaining statistics are provided by the classic ethtool statistics command. - Hostmode support: - Removed netdev renaming - Validate ethernet address in sparx5_set_mac_address() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24arm64: dts: sparx5: Add the Sparx5 switch nodeSteen Hegelund
This provides the configuration for the currently available evaluation boards PCB134 and PCB135. The series depends on the following series currently on its way into the kernel: - Sparx5 Reset Driver Link: https://lore.kernel.org/r/20210416084054.2922327-1-steen.hegelund@microchip.com/ Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add ethtool configuration and statistics supportSteen Hegelund
This adds statistic counters for the network interfaces provided by the driver. It also adds CPU port counters (which are not exposed by ethtool). This also adds support for configuring the network interface parameters via ethtool: speed, duplex, aneg etc. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add calendar bandwidth allocation supportSteen Hegelund
This configures the Sparx5 calendars according to the bandwidth requested in the Device Tree nodes. It also checks if the total requested bandwidth is within the specs of the detected Sparx5 models limits. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add switching supportSteen Hegelund
This adds SwitchDev support by hardware offloading the software bridge. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add vlan supportSteen Hegelund
This adds Sparx5 VLAN support. Sparx5 has more VLAN features than provided here, but these will be added in later series. For now we only add the basic L2 features. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add mactable supportSteen Hegelund
This adds the Sparx5 MAC tables: listening for MAC table updates and updating on request. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add port module supportSteen Hegelund
This add configuration of the Sparx5 port module instances. Sparx5 has in total 65 logical ports (denoted D0 to D64) and 33 physical SerDes connections (S0 to S32). The 65th port (D64) is fixed allocated to SerDes0 (S0). The remaining 64 ports can in various multiplexing scenarios be connected to the remaining 32 SerDes using QSGMII, or USGMII or USXGMII extenders. 32 of the ports can have a 1:1 mapping to the 32 SerDes. Some additional ports (D65 to D69) are internal to the device and do not connect to port modules or SerDes macros. For example, internal ports are used for frame injection and extraction to the CPU queues. The 65 logical ports are split up into the following blocks. - 13 x 5G ports (D0-D11, D64) - 32 x 2G5 ports (D16-D47) - 12 x 10G ports (D12-D15, D48-D55) - 8 x 25G ports (D56-D63) Each logical port supports different line speeds, and depending on the speeds supported, different port modules (MAC+PCS) are needed. A port supporting 5 Gbps, 10 Gbps, or 25 Gbps as maximum line speed, will have a DEV5G, DEV10G, or DEV25G module to support the 5 Gbps, 10 Gbps (incl 5 Gbps), or 25 Gbps (including 10 Gbps and 5 Gbps) speeds. As well as, it will have a shadow DEV2G5 port module to support the lower speeds (10/100/1000/2500Mbps). When a port needs to operate at lower speed and the shadow DEV2G5 needs to be connected to its corresponding SerDes Not all interface modes are supported in this series, but will be added at a later stage. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add hostmode with phylink supportSteen Hegelund
This patch adds netdevs and phylink support for the ports in the switch. It also adds register based injection and extraction for these ports. Frame DMA support for injection and extraction will be added in a later series. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: sparx5: add the basic sparx5 driverSteen Hegelund
This adds the Sparx5 basic SwitchDev driver framework with IO range mapping, switch device detection and core clock configuration. Support for ports, phylink, netdev, mactable etc. are in the following patches. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24dt-bindings: net: sparx5: Add sparx5-switch bindingsSteen Hegelund
Document the Sparx5 switch device driver bindings Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: Lars Povlsen <lars.povlsen@microchip.com> Signed-off-by: Bjarni Jonasson <bjarni.jonasson@microchip.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24Merge tag 'linux-can-fixes-for-5.13-20210624' of git://git.kernel.org/David S. Miller
pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-06-24 this is a pull request of 2 patches for net/master. The first patch is by Norbert Slusarek and prevent allocation of filter for optlen == 0 in the j1939 CAN protocol. The last patch is by Stephane Grosjean and fixes a potential starvation in the TX path of the peak_pciefd driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24Merge branch 'ibmvnic-fixes'David S. Miller
Sukadev Bhattiprolu says: ==================== ibmvnic: Assorted bug fixes Assorted bug fixes that we tested over the last several weeks. Thanks to Brian King, Cris Forno, Dany Madden and Rick Lindsley for reviews and help with testing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24ibmvnic: parenthesize a checkSukadev Bhattiprolu
Parenthesize a check to be more explicit and to fix a sparse warning seen on some distros. Fixes: 91dc5d2553fbf ("ibmvnic: fix miscellaneous checks") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24ibmvnic: free tx_pool if tso_pool alloc failsSukadev Bhattiprolu
Free tx_pool and clear it, if allocation of tso_pool fails. release_tx_pools() assumes we have both tx and tso_pools if ->tx_pool is non-NULL. If allocation of tso_pool fails in init_tx_pools(), the assumption will not be true and we would end up dereferencing ->tx_buff, ->free_map fields from a NULL pointer. Fixes: 3205306c6b8d ("ibmvnic: Update TX pool initialization routine") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24ibmvnic: set ltb->buff to NULL after freeingSukadev Bhattiprolu
free_long_term_buff() checks ltb->buff to decide whether we have a long term buffer to free. So set ltb->buff to NULL afer freeing. While here, also clear ->map_id, fix up some coding style and log an error. Fixes: 9c4eaabd1bb39 ("Check CRQ command return codes") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24ibmvnic: account for bufs already saved in indir_bufSukadev Bhattiprolu
This fixes a crash in replenish_rx_pool() when called from ibmvnic_poll() after a previous call to replenish_rx_pool() encountered an error when allocating a socket buffer. Thanks to Rick Lindsley and Dany Madden for helping debug the crash. Fixes: 4f0b6812e9b9 ("ibmvnic: Introduce batched RX buffer descriptor transmission") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24ibmvnic: clean pending indirect buffs during resetSukadev Bhattiprolu
We batch subordinate command response queue (scrq) descriptors that we need to send to the VIOS using an "indirect" buffer. If after we queue one or more scrqs in the indirect buffer encounter an error (say fail to allocate an skb), we leave the queued scrq descriptors in the indirect buffer until the next call to ibmvnic_xmit(). On the next call to ibmvnic_xmit(), it is possible that the adapter is going through a reset and it is possible that the long term buffers have been unmapped on the VIOS side. If we proceed to flush (send) the packets that are in the indirect buffer, we will end up using the old map ids and this can cause the VIOS to trigger an unnecessary FATAL error reset. Instead of flushing packets remaining on the indirect_buff, discard (clean) them instead. Fixes: 0d973388185d4 ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24Revert "ibmvnic: remove duplicate napi_schedule call in open function"Dany Madden
This reverts commit 7c451f3ef676c805a4b77a743a01a5c21a250a73. When a vnic interface is taken down and then up, connectivity is not restored. We bisected it to this commit. Reverting this commit until we can fully investigate the issue/benefit of the change. Fixes: 7c451f3ef676 ("ibmvnic: remove duplicate napi_schedule call in open function") Reported-by: Cristobal Forno <cforno12@linux.ibm.com> Reported-by: Abdul Haleem <abdhalee@in.ibm.com> Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24Revert "ibmvnic: simplify reset_long_term_buff function"Sukadev Bhattiprolu
This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054. We tried to optimize the number of hcalls we send and skipped sending the REQUEST_MAP calls for some maps. However during resets, we need to resend all the maps to the VIOS since the VIOS does not remember the old values. In fact we may have failed over to a new VIOS which will not have any of the mappings. When we send packets with map ids the VIOS does not know about, it triggers a FATAL reset. While the client does recover from the FATAL error reset, we are seeing a large number of such resets. Handling FATAL resets is lot more unnecessary work than issuing a few more hcalls so revert the commit and resend the maps to the VIOS. Fixes: 1c7d45e7b2c ("ibmvnic: simplify reset_long_term_buff function") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: mdiobus: fix fwnode_mdbiobus_register() fallback caseMarcin Wojtas
The fallback case of fwnode_mdbiobus_register() (relevant for !CONFIG_FWNODE_MDIO) was defined with wrong argument name, causing a compilation error. Fix that. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24net: ip: avoid OOM kills with large UDP sends over loopbackJakub Kicinski
Dave observed number of machines hitting OOM on the UDP send path. The workload seems to be sending large UDP packets over loopback. Since loopback has MTU of 64k kernel will try to allocate an skb with up to 64k of head space. This has a good chance of failing under memory pressure. What's worse if the message length is <32k the allocation may trigger an OOM killer. This is entirely avoidable, we can use an skb with page frags. af_unix solves a similar problem by limiting the head length to SKB_MAX_ALLOC. This seems like a good and simple approach. It means that UDP messages > 16kB will now use fragments if underlying device supports SG, if extra allocator pressure causes regressions in real workloads we can switch to trying the large allocation first and falling back. v4: pre-calculate all the additions to alloclen so we can be sure it won't go over order-2 Reported-by: Dave Jones <dsj@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24RDMA/hns: Add window selection field of congestion controlYixing Liu
The window selection field is necessary for congestion control of HIP09, it is got from firmware and then filled into QPC. Some algorithms need it to decide whether to limit the number of windows. Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW") Link: https://lore.kernel.org/r/1624364163-44185-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>