linux.git - Linus' kernel tree

Age	Commit message	Author
2017-03-09	qed: Fix copy of uninitialized memory	robert.foss@collabora.com
	In qed_ll2_start_ooo() the ll2_info variable is uninitialized and then passed to qed_ll2_acquire_connection() where it is copied into a new memory space. This shouldn't cause any issue as long as none of the copied memory is ever read. But the potential for a bug being introduced by reading this memory is real. Detected by CoverityScan, CID#1399632 ("Uninitialized scalar variable") Signed-off-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: stmicro: replace kzalloc with devm_kzalloc	Joao Pinto
	The axi variable was not being freed upon device removal. Allocating it with devm_kzalloc ensures that it is properly freed. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mediatek: Use eth_hw_addr_random()	Tobias Klauser
	Use eth_hw_addr_random() to set a random dev_addr and update addr_assign_type instead of open-coding it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
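To make "a random dev_addr" concrete, here is a minimal userspace sketch in Python of how a randomly generated Ethernet address is usually shaped: the multicast bit is cleared and the locally administered bit is set. The function names `random_ether_addr` and `format_mac` are hypothetical and exist only for this illustration; this is not the driver's code.

```python
import os


def random_ether_addr() -> bytes:
    """Return 6 random bytes shaped like a unicast, locally administered MAC."""
    addr = bytearray(os.urandom(6))
    addr[0] &= 0xFE  # clear the multicast (group) bit
    addr[0] |= 0x02  # set the locally administered bit
    return bytes(addr)


def format_mac(addr: bytes) -> str:
    """Render the address in the usual colon-separated hex form."""
    return ":".join(f"{b:02x}" for b in addr)
```

Calling `format_mac(random_ether_addr())` yields a different address on every call, which is why a driver also needs to record that the address was randomly assigned.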
2017-03-09	Merge branch 'thunderx-misc-fixes'	David S. Miller
	Sunil Goutham says: ==================== net: thunderx: Miscellaneous fixes This patch set fixes multiple issues such as IOMMU translation faults when kernel is booted with IOMMU enabled on host, incorrect MAC ID reading from ACPI tables and IPv6 UDP packet drop due to failure of checksum validation. Changes from v1: - As suggested by David Miller, got rid of conditional calling of DMA map/unmap APIs. Also updated commit message in 'IOMMU translation faults' patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: thunderx: Allow IPv6 frames with zero UDP checksum	Thanneeru Srinivasulu
	Do not consider IPv6 frames with zero UDP checksum as frames with bad checksum and drop them. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: thunderx: Fix invalid mac addresses for node1 interfaces	Sunil Goutham
	When booted with ACPI, random mac addresses are being assigned to node1 interfaces due to mismatch of bgx_id in BGX driver and ACPI tables. This patch fixes this issue by setting maximum BGX devices per node based on platform/soc instead of a macro. This change will set the bgx_id appropriately. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: thunderx: Fix LMAC mode debug prints for QSGMII mode	Sunil Goutham
	When BGX/LMACs are in QSGMII mode, for some LMACs, mode info is not being printed. This patch will fix that. With changes already done to not do any sort of serdes 2 lane mapping config calculation in kernel driver, we can get rid of this logic. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: thunderx: Fix IOMMU translation faults	Sunil Goutham
	ACPI support has been added to ARM IOMMU driver in 4.10 kernel and that has resulted in VNIC interfaces throwing translation faults when kernel is booted with ACPI as driver was not using DMA API. This patch fixes the issue by using DMA API which in turn will create translation tables when IOMMU is enabled. Also VNIC doesn't have a separate receive buffer ring per receive queue, so there is no 1:1 descriptor index matching between CQE_RX and the index in buffer ring from where a buffer has been used for DMA'ing. Unlike other NICs, here it's not possible to maintain dma address to virt address mappings within the driver. This leaves us no other choice but to use IOMMU's IOVA address conversion API to get buffer's virtual address which can be given to network stack for processing. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	rds: ib: add error handle	Zhu Yanjun
	In the function rds_ib_setup_qp, error handling is missing. When an error occurs, a memory leak is possible. As such, error handling is added. Cc: Joe Jin <joe.jin@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	liquidio: improve UDP TX performance	VSR Burru
	Improve UDP TX performance by:
	* reducing the ring size from 2K to 512
	* replacing the numerous streaming DMA allocations for info buffers and gather lists with one large consistent DMA allocation per ring

	BQL is not effective here. We reduced the ring size because there is heavy overhead with dma_map_single every so often. With iommu=on, dma_map_single in PF Tx data path was taking longer time (~700usec) for every ~250 packets. Debugged intel_iommu code, and found that PF driver is utilizing too many static IO virtual address mapping entries (for gather list entries and info buffers): about 100K entries for two PF's each using 8 rings. Also, finding an empty entry (in rbtree of device domain's iova mapping in kernel) during Tx path becomes a bottleneck every so often; the loop to find the empty entry goes through over 40K iterations; this is too costly and was the major overhead. Overhead is low when this loop quits quickly.

	Netperf benchmark numbers before and after patch:

	PF UDP TX
	+--------+--------+------------+------------+---------+
	|        |        | Before     | After      |         |
	| Number |        | Patch      | Patch      |         |
	| of     | Packet | Throughput | Throughput | Percent |
	| Flows  | Size   | (Gbps)     | (Gbps)     | Change  |
	+--------+--------+------------+------------+---------+
	|        | 360    | 0.52       | 0.93       | +78.9   |
	| 1      | 1024   | 1.62       | 2.84       | +75.3   |
	|        | 1518   | 2.44       | 4.21       | +72.5   |
	+--------+--------+------------+------------+---------+
	|        | 360    | 0.45       | 1.59       | +253.3  |
	| 4      | 1024   | 1.34       | 5.48       | +308.9  |
	|        | 1518   | 2.27       | 8.31       | +266.1  |
	+--------+--------+------------+------------+---------+
	|        | 360    | 0.40       | 1.61       | +302.5  |
	| 8      | 1024   | 1.64       | 4.24       | +158.5  |
	|        | 1518   | 2.87       | 6.52       | +127.2  |
	+--------+--------+------------+------------+---------+

	VF UDP TX
	+--------+--------+------------+------------+---------+
	|        |        | Before     | After      |         |
	| Number |        | Patch      | Patch      |         |
	| of     | Packet | Throughput | Throughput | Percent |
	| Flows  | Size   | (Gbps)     | (Gbps)     | Change  |
	+--------+--------+------------+------------+---------+
	|        | 360    | 1.28       | 1.49       | +16.4   |
	| 1      | 1024   | 4.44       | 4.39       | -1.1    |
	|        | 1518   | 6.08       | 6.51       | +7.1    |
	+--------+--------+------------+------------+---------+
	|        | 360    | 2.35       | 2.35       | 0.0     |
	| 4      | 1024   | 6.41       | 8.07       | +25.9   |
	|        | 1518   | 9.56       | 9.54       | -0.2    |
	+--------+--------+------------+------------+---------+
	|        | 360    | 3.41       | 3.65       | +7.0    |
	| 8      | 1024   | 9.35       | 9.34       | -0.1    |
	|        | 1518   | 9.56       | 9.57       | +0.1    |
	+--------+--------+------------+------------+---------+

	Signed-off-by: VSR Burru <veerasenareddy.burru@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: ipv6: Remove redundant RTA_OIF in multipath routes	David Ahern
	Dinesh reported that RTA_MULTIPATH nexthops are 8-bytes larger with IPv6 than IPv4. The recent refactoring for multipath support in netlink messages does discriminate between non-multipath which needs the OIF and multipath which adds a rtnexthop struct for each hop making the RTA_OIF attribute redundant. Resolve by adding a flag to the info function to skip the oif for multipath. Fixes: beb1afac518d ("net: ipv6: Add support to dump multipath routes via RTA_MULTIPATH attribute") Reported-by: Dinesh Dutt <ddutt@cumulusnetworks.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
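The 8-byte difference matches the size of one netlink route attribute carrying a 32-bit interface index: a 4-byte `struct rtattr` header (two 16-bit fields) plus a 4-byte payload. The sketch below packs such an attribute in Python to show the arithmetic. The helper names `rta_align` and `pack_rta_u32` are hypothetical, and this is an illustration of the wire format rather than the kernel's code.

```python
import struct

RTA_OIF = 4      # attribute type value from linux/rtnetlink.h
RTA_ALIGNTO = 4  # netlink attributes are padded to 4-byte boundaries


def rta_align(length: int) -> int:
    """Round a length up to the attribute alignment."""
    return (length + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def pack_rta_u32(rta_type: int, value: int) -> bytes:
    """Pack one attribute: u16 length, u16 type, then a u32 payload."""
    payload = struct.pack("=I", value)
    rta_len = 4 + len(payload)  # the header itself is 4 bytes
    raw = struct.pack("=HH", rta_len, rta_type) + payload
    return raw.ljust(rta_align(rta_len), b"\0")
```

With this layout, `len(pack_rta_u32(RTA_OIF, ifindex))` is 8 for any interface index, so dropping the attribute from each nexthop saves exactly that much.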
2017-03-09	tg3: Add the ability to conditionally build w/ HWMON	Florian Fainelli
	Introduce a Kconfig option: CONFIG_TIGON3_HWMON, which allows support for thermal sensors reported by Tigon3 NICs to be built in or left out. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	Merge tag 'for-linus-4.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip	Linus Torvalds
	Pull xen fix and cleanup from Juergen Gross: "This contains one fix for MSIX handling under Xen and a trivial cleanup patch" * tag 'for-linus-4.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: Remove duplicate inclusion of linux/init.h xen: do not re-use pirq number cached in pci device msi msg data
2017-03-09	mm: introduce __p4d_alloc()	Kirill A. Shutemov
	For full 5-level paging we need a helper to allocate p4d page table. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09	mm: convert generic code to 5-level paging	Kirill A. Shutemov
	Convert all non-architecture-specific code to 5-level paging. It's mostly mechanical: adding handling for one more page table level in places where we deal with pud_t. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
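To illustrate what "one more page table level" means, the sketch below splits a virtual address into per-level table indices. It assumes the x86-64 layout with 4 KB pages and 9 index bits per level, which gives a 57-bit address with five levels; other architectures use different widths. The function name `split_vaddr` is hypothetical, and this is a userspace illustration rather than kernel code.

```python
LEVELS = ("pgd", "p4d", "pud", "pmd", "pte")  # top level first
PAGE_SHIFT = 12      # assumed 4 KB pages
BITS_PER_LEVEL = 9   # assumed 512 entries per table


def split_vaddr(vaddr: int) -> dict:
    """Return the table index used at each level plus the page offset."""
    parts = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT + BITS_PER_LEVEL * (len(LEVELS) - 1)
    for name in LEVELS:
        parts[name] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        shift -= BITS_PER_LEVEL
    return parts
```

The p4d index sits between the pgd and pud indices, which is why code that walks from pgd to pud has to learn about the new intermediate step.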
2017-03-09	asm-generic: introduce <asm-generic/pgtable-nop4d.h>	Kirill A. Shutemov
	As with pgtable-nopud.h for 4-level paging, this new header is the base for converting an architecture to a properly folded p4d_t level. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09	arch, mm: convert all architectures to use 5level-fixup.h	Kirill A. Shutemov
	If an architecture uses 4level-fixup.h we don't need to do anything as it includes 5level-fixup.h. If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK before inclusion of the header. This makes asm-generic code use 5level-fixup.h. If an architecture has 4-level paging or folds levels on its own, include 5level-fixup.h directly. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09	asm-generic: introduce __ARCH_USE_5LEVEL_HACK	Kirill A. Shutemov
	We are going to introduce <asm-generic/pgtable-nop4d.h> to provide an abstraction for a properly folded p4d level (as opposed to the 5level-fixup.h hack). The new header will be included from pgtable-nopud.h. If an architecture uses <asm-generic/nopd.h>, we cannot use 5level-fixup.h directly to quickly convert the architecture to 5-level paging as it would conflict with pgtable-nop4d.h. With this patch an architecture can define __ARCH_USE_5LEVEL_HACK before including <asm-generic/nopd.h> to use 5level-fixup.h. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09	asm-generic: introduce 5level-fixup.h	Kirill A. Shutemov
	We are going to switch core MM to the 5-level paging abstraction. This is a preparation step which adds <asm-generic/5level-fixup.h>. As with 4level-fixup.h, the new header allows us to quickly make all architectures compatible with 5-level paging in core MM. In the long run we would like to switch architectures to a properly folded p4d level by using <asm-generic/pgtable-nop4d.h>, but it requires more changes to arch-specific code. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09	x86/cpufeature: Add 5-level paging detection	Kirill A. Shutemov
	Look for 'la57' in /proc/cpuinfo to see if your machine supports 5-level paging. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
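The check described above can be scripted. The sketch below parses text in the /proc/cpuinfo format and looks for a flag as a whole word on a "flags" line. The function name `has_cpu_flag` is hypothetical; the only assumption about the file format is that flags appear space-separated after a "flags :" key, as they do on x86.

```python
def has_cpu_flag(cpuinfo_text: str, flag: str = "la57") -> bool:
    """Return True if any 'flags' line lists the given flag as a whole word."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            if flag in value.split():
                return True
    return False
```

On a Linux x86 machine this could be called as `has_cpu_flag(open("/proc/cpuinfo").read())`.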
2017-03-09	Merge branch 'mvpp2-add-initial-support-for-PPv2.2'	David S. Miller
	Thomas Petazzoni says: ==================== net: mvpp2: add initial support for PPv2.2 The goal of this patch series is to add basic support for PPv2.2 in the existing mvpp2 driver. mvpp2 currently supported the PPv2.1 version of the IP, used in the 32 bits Marvell Armada 375 SoC. PPv2.2 is an evolution of this IP block, used in the 64 bits Marvell Armada 7K/8K SoCs. In order to ease the review, the introduction of PPv2.2 support has been made into multiple small commits, with the final commit adding the compatible string that makes the PPv2.2 support actually usable. The series remain fully bisectable. People interested in testing the code will find the full series (plus a few Device Tree patches) at: https://github.com/MISL-EBU-System-SW/mainline-public/tree/4.11/mvpp2.2-support-v3 I'd like to thank Stefan Chulski and Marcin Wojtas, who helped me a lot in the development of this patch series, by reviewing the patches, and giving lots of useful hints to debug the driver on PPv2.2. Thanks as well to Russell King for reviewing previous iterations of this series, and providing suggestions and fixes. Changes between v2 and v3: - Rebased on v4.11-rc1. - Add patch "net: mvpp2: fix DMA address calculation in mvpp2_txq_inc_put()", to properly take into account the "packet offset" field of the TX descriptors. Without this, we were getting DMA_API_DEBUG warnings that we are unmapping DMA mappings with a non-mapped DMA address. - In patch "net: mvpp2: add and use accessors for TX/RX descriptors", add a function named mvpp2_txdesc_offset_get(), which is needed for the DMA address calculation fix. - In patch "net: mvpp2: add and use accessors for TX/RX descriptors", fix the calculation of tx_desc physical address and packet offset in mvpp2_tx_frag_process(). The offset was assigned into the buffer physical address, and the physical address to the packet offset, which meant the fragment process was completely broken. 
- In patch "net: mvpp2: adjust the allocation/free of BM pools for PPv2.2" fix how MVPP22_BM_ADDR_HIGH_VIRT_RLS_MASK is used. This mask is already shifted. So the value should be shifted before being masked and not the opposite. - Add a new patch "net: mvpp2: set dma mask and coherent dma mask on PPv2.2", to set the DMA mask and DMA coherent mask. By setting the DMA mask to 40 bits we avoid using bounce buffers when network packets are above the 4 GB limit. The coherent mask remains set to 32 bits, because the BM pools must all have the same high 32 bits in their addresses. - Use "dma" instead of "phys" where appropriate, as suggested by Russell King. - Use the "cookie" field of the RX descriptor to store the physical address instead of the virtual address, and then use phys_to_virt() to get the virtual address. This allows to work around the limit that the "cookie" field only has 40 bits, which is not sufficient to store a virtual address on 64 bits platforms. This was suggested by Russell King. As part of this change, also got rid of all the compile time conditionals on CONFIG_ARCH_DMA_ADDR_T_64BIT, to get better compile-time coverage. - In patch "net: mvpp2: handle misc PPv2.1/PPv2.2 differences": * Instead of calling mvpp21_port_power_up(port) only on PPv2.1, remove this function, and call its relevant parts directly from ->probe(). Only mvpp2_port_fc_adv_enable() is PPv2.1 specific. Reported by Russell King. * Add a mvpp22_port_mii_set() function that properly initializes SGMII support on PPv2.2. Code provided by Russell King. - In patch "net: mvpp2: handle register mapping and access for PPv2.2": * Adjust the code to match the change of the DT binding in terms of mapping the second register area on PPv2.2. * Rework the register accessors to remove the get_cpu()/put_cpu(), and instead use separate accessors for global registers vs. per-CPU registers. 
- Add a few new patches removing dead/unused/useless code: net: mvpp2: remove support for buffer header net: mvpp2: remove unused register definition MVPP2_TXQ_THRESH_REG net: mvpp2: remove mvpp2_txq_pend_desc_num_get() function - Fix a number of checkpatch warnings. Changes between v1 and v2: - Made a separate series from the set of patches doing preparation changes/fixes to the mvpp2 driver. - Rebased on top of v4.10-rc1. - Update Kconfig text of the mvpp2 driver to mention the support for Armada 7K and 8K (PPv2.2). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
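One of the v3 fixes listed above concerns a mask that is "already shifted", so the value has to be shifted into place before the mask is applied. The sketch below shows why the order matters. The shift and mask values are illustrative placeholders, not taken from the driver, and the function names are hypothetical.

```python
FIELD_SHIFT = 8       # illustrative bit position of the field
FIELD_MASK = 0xFF00   # illustrative mask, already shifted into position


def encode_shift_then_mask(value: int) -> int:
    """Correct order for a pre-shifted mask: move the value, then clip it."""
    return (value << FIELD_SHIFT) & FIELD_MASK


def encode_mask_then_shift(value: int) -> int:
    """Wrong order: the mask tests bits where the value has not arrived yet."""
    return (value & FIELD_MASK) << FIELD_SHIFT
```

With a small input such as 0x12, the first function places it in the field, while the second masks it to zero before shifting.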
2017-03-09	net: mvpp2: finally add the PPv2.2 compatible string	Thomas Petazzoni
	Now that the mvpp2 driver has been modified to accommodate the support for PPv2.2, we can finally advertise this support by adding the appropriate compatible string. At the same time, we update the Kconfig description of the MVPP2 driver. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: set dma mask and coherent dma mask on PPv2.2	Thomas Petazzoni
	On PPv2.2, the streaming mappings can be anywhere in the first 40 bits of the physical address space. However, for the coherent mappings, we still need them to be in the first 32 bits of the address space, because all BM pools share a single register to store the high 32 bits of the BM pool address, which means all BM pools must be allocated in the same 4GB memory area. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
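The two constraints above can be expressed as simple address checks. The sketch below models a DMA mask as "all address bits below N may be set" and tests whether buffers satisfy the 40-bit and 32-bit limits, and whether a set of pool addresses shares the same upper 32 bits. All names are hypothetical, and this is a userspace model rather than the kernel DMA API.

```python
def dma_bit_mask(bits: int) -> int:
    """Mask with the low `bits` address bits set."""
    return (1 << bits) - 1


STREAMING_MASK = dma_bit_mask(40)  # streaming mappings: first 40 bits
COHERENT_MASK = dma_bit_mask(32)   # coherent mappings: first 32 bits


def fits(addr: int, size: int, mask: int) -> bool:
    """True if the whole buffer [addr, addr + size) is addressable."""
    return addr + size - 1 <= mask


def same_4gb_area(addrs) -> bool:
    """True if all addresses share the same high 32 bits."""
    return len({a >> 32 for a in addrs}) <= 1
```

A buffer just above 4 GB passes the streaming check but fails the coherent one, which is the situation the separate masks are meant to handle.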
2017-03-09	net: mvpp2: add support for an additional clock needed for PPv2.2	Thomas Petazzoni
	The PPv2.2 variant of the network controller needs an additional clock, the "MG clock" in order for the IP block to operate properly. This commit adds support for this additional clock to the driver, reworking as needed the error handling path. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: adapt rxq distribution to PPv2.2	Thomas Petazzoni
	In PPv2.1, we have a maximum of 8 RXQs per port, with a default of 4 RXQs per port, and we were assigning RXQs 0->3 to the first port, 4->7 to the second port, 8->11 to the third port, etc. In PPv2.2, we have a maximum of 32 RXQs per port, and we must allocate RXQs from the range of 32 RXQs available for each port. So port 0 must use RXQs in the range 0->31, port 1 in the range 32->63, etc. This commit adapts the mvpp2 to this difference between PPv2.1 and PPv2.2: - The constant definition MVPP2_MAX_RXQ is replaced by a new field 'max_port_rxqs' in 'struct mvpp2', which stores the maximum number of RXQs per port. This field is initialized during ->probe() depending on the IP version. - MVPP2_RXQ_TOTAL_NUM is removed, and instead we calculate the total number of RXQs by multiplying the number of ports by the maximum of RXQs per port. This was anyway used in only one place. - In mvpp2_port_probe(), the calculation of port->first_rxq is adjusted to cope with the different allocation strategy between PPv2.1 and PPv2.2. Due to this change, the 'next_first_rxq' argument of this function is no longer needed and is removed. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
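The two allocation strategies described above can be sketched as a small calculation. The version tags `MVPP21` and `MVPP22` and the function names below are placeholders for illustration and are not claimed to match the driver's definitions; the per-port numbers (8 maximum and 4 default on PPv2.1, 32 on PPv2.2) come from the commit message.

```python
MVPP21, MVPP22 = 21, 22  # placeholder version tags for this sketch


def max_port_rxqs(hw_version: int) -> int:
    """Maximum RXQs per port for each IP version, per the text above."""
    return 8 if hw_version == MVPP21 else 32


def first_rxq(hw_version: int, port_id: int, rxqs_per_port: int = 4) -> int:
    """First RXQ index assigned to a port."""
    if hw_version == MVPP21:
        # Ports are packed back to back: 0->3, 4->7, 8->11, ...
        return port_id * rxqs_per_port
    # Each port owns a fixed window of 32 RXQs: 0->31, 32->63, ...
    return port_id * max_port_rxqs(hw_version)


def total_rxqs(hw_version: int, num_ports: int) -> int:
    """Total RXQ count: number of ports times the per-port maximum."""
    return num_ports * max_port_rxqs(hw_version)
```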
2017-03-09	net: mvpp2: rework RXQ interrupt group initialization for PPv2.2	Thomas Petazzoni
	This commit adjusts how the MVPP2_ISR_RXQ_GROUP_REG register is configured, since it changed between PPv2.1 and PPv2.2. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: add AXI bridge initialization for PPv2.2	Thomas Petazzoni
	The PPv2.2 unit is connected to an AXI bus on Armada 7K/8K, so this commit adds the necessary initialization of the AXI bridge. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: handle misc PPv2.1/PPv2.2 differences	Thomas Petazzoni
	This commit handles a few miscellaneous differences between PPv2.1 and PPv2.2 in different areas, where code done for PPv2.1 doesn't apply for PPv2.2 or needs to be adjusted (getting the MAC address, disabling PHY polling, etc.). Thanks to Russell King for providing the initial implementation of mvpp22_port_mii_set(). Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: handle register mapping and access for PPv2.2	Thomas Petazzoni
	This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. 
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
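The select-then-access pattern described above can be modelled with a toy device in which every CPU window has its own queue selector. This is only a userspace model: the class `FakePPv22`, the register constants, and their offsets are made up for this sketch, and only the 64 KB window size comes from the commit message.

```python
ADDR_SPACE_SIZE = 64 * 1024  # per-CPU "address space" size from the text
TXQ_NUM_REG = 0x10           # made-up offset: selects a TX queue
TXQ_PENDING_REG = 0x14       # made-up offset: reads the selected queue


def window_offset(cpu: int, reg: int) -> int:
    """Offset of a register inside the given CPU's window."""
    return cpu * ADDR_SPACE_SIZE + reg


class FakePPv22:
    """Toy device: the TX queue selector is private to each CPU window."""

    def __init__(self, ncpus: int, pending_by_txq: dict):
        self.selected = [0] * ncpus
        self.pending = dict(pending_by_txq)

    def percpu_write(self, cpu: int, reg: int, val: int) -> None:
        if reg != TXQ_NUM_REG:
            raise ValueError("register not modelled")
        self.selected[cpu] = val

    def percpu_read(self, cpu: int, reg: int) -> int:
        if reg != TXQ_PENDING_REG:
            raise ValueError("register not modelled")
        # The result depends on what THIS window selected earlier.
        return self.pending[self.selected[cpu]]
```

Reading the pending count through a different window than the one used for the selection returns data for whatever queue that other window last selected, which is why both accesses must go through the same CPU window.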
2017-03-09	net: mvpp2: adjust mvpp2_{rxq, txq}_init for PPv2.2	Thomas Petazzoni
	In PPv2.2, the MVPP2_RXQ_DESC_ADDR_REG and MVPP2_TXQ_DESC_ADDR_REG registers have a slightly different layout, because they need to contain a 64-bit address for the RX and TX descriptor arrays. This commit adjusts those functions accordingly. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: adapt mvpp2_defaults_set() to PPv2.2	Thomas Petazzoni
	This commit modifies the mvpp2_defaults_set() function to not do the loopback and FIFO threshold initialization, which are not needed for PPv2.2. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: adapt the mvpp2_rxq_*_pool_set functions to PPv2.2	Thomas Petazzoni
	The MVPP2_RXQ_CONFIG_REG register has a slightly different layout between PPv2.1 and PPv2.2, so this commit adapts the functions modifying this register to accommodate for both the PPv2.1 and PPv2.2 cases. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: adjust the allocation/free of BM pools for PPv2.2	Thomas Petazzoni
	This commit adjusts the allocation and freeing of BM pools to support PPv2.2. This involves: - Checking that the number of buffer pointers is a multiple of 16, as required by the hardware. - Adjusting the size of the DMA coherent area allocated for buffer pointers. Indeed, PPv2.2 needs space for 2 pointers of 64-bits per buffer, as opposed to 2 pointers of 32-bits per buffer in PPv2.1. The size in bytes is now stored in a new field of the mvpp2_bm_pool structure. - On PPv2.2, getting the DMA address and cookie (used for the physical address) of each buffer requires reading the MVPP22_BM_ADDR_HIGH_ALLOC to get the high order bits of those addresses. A new utility function mvpp2_bm_bufs_get_addrs() is introduced to handle this. - On PPv2.2, releasing a buffer requires writing the high order 32 bits of the DMA address and cookie to MVPP22_BM_PHY_VIRT_HIGH_RLS_REG. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
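The sizing rule and the address reassembly described above can be sketched as follows. The version tags and function names are hypothetical. The sketch assumes, as the text states, two pointers per buffer (32-bit on PPv2.1, 64-bit on PPv2.2) and a buffer-pointer count that must be a multiple of 16.

```python
MVPP21, MVPP22 = 21, 22  # placeholder version tags for this sketch


def bm_pool_size_bytes(buf_count: int, hw_version: int) -> int:
    """Size of the coherent area holding the buffer pointers of one pool."""
    if buf_count % 16 != 0:
        raise ValueError("buffer pointer count must be a multiple of 16")
    ptr_bytes = 4 if hw_version == MVPP21 else 8
    return 2 * ptr_bytes * buf_count  # two pointers per buffer


def join_addr(low32: int, high_bits: int) -> int:
    """Rebuild a wide address from its low 32 bits and separately read high bits."""
    return (high_bits << 32) | (low32 & 0xFFFF_FFFF)
```

For the same number of buffers, the PPv2.2 area is twice the size of the PPv2.1 one, which is why the size is worth storing per pool instead of recomputing it.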
2017-03-09	net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors	Thomas Petazzoni
	This commit adds the definition of the PPv2.2 HW descriptors, adjusts the mvpp2_tx_desc and mvpp2_rx_desc structures accordingly, and adapts the accessors to work on both PPv2.1 and PPv2.2. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: introduce an intermediate union for the TX/RX descriptors	Thomas Petazzoni
	Since the format of the HW descriptors is different between PPv2.1 and PPv2.2, this commit introduces an intermediate union, with for now only the PPv2.1 descriptors. The bulk of the driver code only manipulates opaque mvpp2_tx_desc and mvpp2_rx_desc pointers, and the descriptors can only be accessed and modified through the accessor functions. A follow-up commit will add the descriptor definitions for PPv2.2. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: add hw_version field in "struct mvpp2"	Thomas Petazzoni
	In preparation to the introduction for the support of PPv2.2 in the mvpp2 driver, this commit adds a hw_version field to the struct mvpp2, and uses the .data field of the DT match table to fill it in. Having the MVPP21 and MVPP22 definitions available will allow to start adding the necessary conditional code to support PPv2.2. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: add and use accessors for TX/RX descriptors	Thomas Petazzoni
	The PPv2.2 IP has a different TX and RX descriptor layout compared to PPv2.1. In order to prepare for the introduction of PPv2.2 support in mvpp2, this commit adds accessors for the different fields of the TX and RX descriptors, and changes the code to use them. For now, the mvpp2_port argument passed to the accessors is not used, but it will be used in follow-up to update the descriptor according to the version of the IP being used. Apart from the mechanical changes to use the newly introduced accessors, a few other changes, needed to use the accessors, are made: - The mvpp2_txq_inc_put() function now takes a mvpp2_port as first argument, as it is needed to use the accessors. - Similarly, the mvpp2_bm_cookie_build() gains a mvpp2_port first argument, for the same reason. - In mvpp2_rx_error(), instead of accessing the RX descriptor in each case of the switch, we introduce a local variable to store the packet size. - In mvpp2_tx_frag_process() and mvpp2_tx() instead of accessing the packet size from the TX descriptor, we use the actual value available in the function, which is used to set the TX descriptor packet size a few lines before. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: store physical address of buffer in rx_desc->buf_cookie	Thomas Petazzoni
	The RX descriptors of the PPv2 hardware can store several pieces of information, among which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptor. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, they remain two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
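The reasoning above can be checked with a small model. The sketch treats the kernel linear mapping as a constant offset between physical and virtual addresses, which is a simplification. `LINEAR_MAP_BASE` is an arbitrary illustrative value, and the helpers below merely share their names with the kernel functions without reproducing them.

```python
COOKIE_BITS = 40                          # cookie field width on PPv2.2
LINEAR_MAP_BASE = 0xFFFF_8000_0000_0000   # illustrative base, not a real constant


def fits_cookie(value: int) -> bool:
    """True if the value can be stored in the 40-bit cookie field."""
    return 0 <= value < (1 << COOKIE_BITS)


def phys_to_virt(phys: int) -> int:
    """Simplified linear mapping: virtual = physical + fixed base."""
    return phys + LINEAR_MAP_BASE


def virt_to_phys(virt: int) -> int:
    """Inverse of the simplified linear mapping."""
    return virt - LINEAR_MAP_BASE
```

A physical address below 2^40 fits in the cookie while its 64-bit virtual counterpart does not, and the virtual address can still be recovered from the stored physical one.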
2017-03-09	net: mvpp2: remove mvpp2_txq_pend_desc_num_get() function	Thomas Petazzoni
	The mvpp2_txq_pend_desc_num_get() function only selects a TX queue, and reads the number of pending descriptors. It is used in only one place, in mvpp2_txq_clean(), where the TX queue has already been selected by a write to MVPP2_TXQ_NUM_REG. Therefore, this function is useless, and the caller can simply read the value of the MVPP2_TXQ_PENDING_REG register instead. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: remove unused register definition MVPP2_TXQ_THRESH_REG	Thomas Petazzoni
	This register is no longer used since commit edc660fa09e2 ("net: mvpp2: replace TX coalescing interrupts with hrtimer"). Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: remove support for buffer header	Thomas Petazzoni
	The "buffer header" functionality is used by the hardware to split an incoming packet over multiple BM buffers if they are not large enough. However, the mvpp2 driver guarantees that a pool of BM buffers has buffers with a size large enough to store MTU-sized packets. Therefore, this functionality is completely unused, and the code can be removed, and we should never get a descriptor with bit MVPP2_RXD_BUF_HDR set. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	net: mvpp2: use "dma" instead of "phys" where appropriate	Thomas Petazzoni
	As indicated by Russell King, the mvpp2 driver currently makes heavy use of "phys" or "phys_addr" to store what really is a DMA address. This commit clarifies this by using "dma" or "dma_addr" where appropriate. This is especially important as we are going to introduce more changes where the distinction between physical address and DMA address will be key. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	dt-bindings: net: update Marvell PPv2 binding for PPv2.2 support	Thomas Petazzoni
	The Marvell PPv2 Device Tree binding was so far only used to describe the PPv2.1 network controller, used in the Marvell Armada 375. A new version of this IP block, PPv2.2 is used in the Marvell Armada 7K/8K processor. This commit extends the existing binding so that it can also be used to describe PPv2.2 hardware. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	Merge branch 'mlx4-order-0-allocations-and-page-recycling'	David S. Miller
	Eric Dumazet says: ==================== mlx4: order-0 allocations and page recycling As mentioned half a year ago, we better switch mlx4 driver to order-0 allocations and page recycling. This reduces vulnerability surface thanks to better skb->truesize tracking and provides better performance in most cases. (33 Gbit for one TCP flow on my lab hosts) I will provide for linux-4.13 a patch on top of this series, trying to improve data locality as described in https://www.spinics.net/lists/netdev/msg422258.html v2 provides an ethtool -S new counter (rx_alloc_pages) and code factorization, plus Tariq fix. v3 includes various fixes based on Tariq tests and feedback from Saeed and Tariq. v4 rebased on net-next for inclusion in linux-4.12, as requested by Tariq. Worth noting this patch series deletes ~250 lines of code ;) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: remove duplicate code in mlx4_en_process_rx_cq()	Eric Dumazet
	We should keep one way to build skbs, regardless of GRO being on or off. Note that I made sure to defer as much as possible the point we need to pull data from the frame, so that any future prefetch() calls we might add are more effective. These skb attributes derive from the CQE or ring: ip_summed, csum, hash, vlan offload, hwtstamps, queue_mapping. As a bonus, this patch removes mlx4 dependency on eth_get_headlen() which is very often broken enough to give us headaches. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: make validate_loopback() more generic	Eric Dumazet
	Testing a boolean in fast path is not worth duplicating the code allocating packets, when GRO is on or off. If this proves to be a problem, we might later use a jump label. Next patch will remove this duplicated code and ease code review. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: factorize page_address() calls	Eric Dumazet
	We need to compute the frame virtual address at different points. Do it once. The following patch will use the new va address for validate_loopback(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: do not access rx_desc from mlx4_en_process_rx_cq()	Eric Dumazet
	Instead of fetching dma address from rx_desc->data[0].addr, prefer using frags[0].dma + frags[0].page_offset to avoid a potential cache line miss. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: add rx_alloc_pages counter in ethtool -S	Eric Dumazet
	This new counter tracks the number of pages that we allocated for one port. lpaa24:~# ethtool -S eth0 | egrep 'rx_alloc_pages|rx_packets' rx_packets: 306755183 rx_alloc_pages: 932897 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-09	mlx4: add page recycling in receive path	Eric Dumazet
	Same technique as some Intel drivers, for arches where PAGE_SIZE = 4096. In most cases, pages are reused because they were consumed before we could loop around the RX ring. This brings back performance, and is even better: a single TCP flow reaches 30Gbit on my hosts. v2: added full memset() in mlx4_en_free_frag(), as Tariq found it was needed if we switch to large MTU, as priv->log_rx_info can dynamically be changed. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>