summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-06-22ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTERWANG Cong
In commit 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") I assumed NETDEV_REGISTER and NETDEV_UNREGISTER are paired, unfortunately, as reported by jeffy, netdev_wait_allrefs() could rebroadcast NETDEV_UNREGISTER event until all refs are gone. We have to add an additional check to avoid this corner case. For netdev_wait_allrefs() dev->reg_state is NETREG_UNREGISTERED, for dev_change_net_namespace(), dev->reg_state is NETREG_REGISTERED. So check for dev->reg_state != NETREG_UNREGISTERED. Fixes: 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") Reported-by: jeffy <jeffy.chen@rock-chips.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22net/phy: micrel: configure intterupts after autoneg workaroundZach Brown
The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an autoneg failure case by resetting the hardware. This turns off intterupts. Things will work themselves out if the phy polls, as it will figure out it's state during a poll. However if the phy uses only intterupts, the phy will stall, since interrupts are off. This patch fixes the issue by calling config_intr after resetting the phy. Fixes: d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg ") Signed-off-by: Zach Brown <zach.brown@ni.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22net/mlx5e: Use device ID definesMyron Stowe
Use Mellanox device ID definitions in the driver's mlx5 ID table so tools such as 'grep' and 'cscope' can be used to help find correlated material (such as INTx Masking quirks: d76d2fe05fd PCI: Convert Mellanox broken INTx quirks to be for listed devices only). No functional change intended. Signed-off-by: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22liquidio: stop using huge static buffer, save 4096k in .dataDenys Vlasenko
Only compile-tested - I don't have the hardware. >From code inspection, octeon_pci_write_core_mem() appears to be safe wrt unaligned source. In any case, u8 fbuf[] was not guaranteed to be aligned anyway. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Felix Manlunas <felix.manlunas@cavium.com> CC: Prasad Kanneganti <prasad.kanneganti@cavium.com> CC: Derek Chickles <derek.chickles@cavium.com> CC: David Miller <davem@davemloft.net> CC: netdev@vger.kernel.org CC: linux-kernel@vger.kernel.org Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22KVM: x86: fix singlestepping over syscallPaolo Bonzini
TF is handled a bit differently for syscall and sysret, compared to the other instructions: TF is checked after the instruction completes, so that the OS can disable #DB at a syscall by adding TF to FMASK. When the sysret is executed the #DB is taken "as if" the syscall insn just completed. KVM emulates syscall so that it can trap 32-bit syscall on Intel processors. Fix the behavior, otherwise you could get #DB on a user stack which is not nice. This does not affect Linux guests, as they use an IST or task gate for #DB. This fixes CVE-2017-7518. Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-06-22Merge tag 'kvm-s390-master-4.12-2' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: fix shadow table handling for nested guests Some odd-ball cases (real-space designation ASCEs) are handled wrong for the shadow page tables. Fix it.
2017-06-22Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.gitKalle Valo
ath.git patches for 4.13. Major changes: wil6210 * add low level RF sector interface via nl80211 vendor commands * add module parameter ftm_mode to load separate firmware for factory testing * support devices with different PCIe bar size * add support for PCIe D3hot in system suspend * remove ioctl interface which should not be in a wireless driver ath10k * go back to using dma_alloc_coherent() for firmware scratch memory * add per chain RSSI reporting
2017-06-22net/mlx5: Fix offset of hca cap reserved fieldOr Gerlitz
The offending commit pushed fwd the field by two bits but didn't increment the offset, fix that. Currently, no damage was done b/c this is just a field name, but lets have it right. Fixes: f32f5bd2eb7e ('net/mlx5: Configure cache line size for start and end padding') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: IPoIB, Support the flash device ethtool callbackOr Gerlitz
This callback further invokes the mlxfw module to flash the new firmware file to the device. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Support the flash device ethtool callbackOr Gerlitz
This callback further invokes the mlxfw module to flash the new firmware file to the device. As the firmware flash process takes about 20 seconds and ethtool takes the rtnl lock during the flash_device callback, we release the rtnl lock at the beginning of the flash process and take it again before leaving the callback. This way, rtnl is not held during the process. To make sure the device does not get deleted while being flashed, we take a reference to it before releasing rtnl lock. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5: Add mlxfw callbacksOr Gerlitz
Add mlx5 implementation for the ones defined by the mlxfw shared module to be used while flashing the device firmware. The callbacks do their job through the MCQI, MCC and MCDA registers. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5: Add helper functions to set/query MCC/MCDA/MCQI registersOr Gerlitz
To be used by the mlx5 callbacks exposed to the mlxfw module. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5: Enhance MCAM reg to allow query on access reg supportOr Gerlitz
Enhance MCAM to allow the driver to query which access regs are supported. For now, expose the regs needed for FW flashing. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5: Add MCC (Management Component Control) register definitionsOr Gerlitz
MCC (Management Component Control) allows to control a firmware component update. MCDA (Management Component Data Access) allows to read and write a firmware component. MCQI (Management Component Query Information) allows to query information about firmware components. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22mlxfw: Make the module selectableOr Gerlitz
There are upcoming NIC (mlx5) use-cases where people want to avoid building the mlxfw module, allow for that. The mlxsw module is untouched and keeps selecting mlxfw. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Add header re-write offloading of IPv6 hop-limitOr Gerlitz
For environments where flow-based ipv6 router is offloaded. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Use macro for TC header re-write offload field mappingOr Gerlitz
Use a macro for the static mapping between the enumeration of field supported by the firmware for header re-write to the corresponding network header field. This improves the readability of the code and doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Offload TC matching on ip ttlOr Gerlitz
Enable offloading of TC matching on ip ttl / hop-limit As matching on ttl is supported only by newer HW brands (ConnectX-5), we should do capability check before attempting to offload that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Relocate the TC match on ip tos offload code sectionOr Gerlitz
The code section for offloading matches on ip tos (L3) should come before and not after the one that deals with tcp/udp (L4) matches. Otherwise, we might come up with wrong min-inline requirement, when one attempts to match on both L3 and L4. Fixes: fd7da28b280d ('net/mlx5e: Offload TC matching on ip tos / traffic-class') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Introduce RX Page-ReuseTariq Toukan
Introduce a Page-Reuse mechanism in non-Striding RQ RX datapath. A WQE (RX descriptor) buffer is a page, that in most cases was fully wasted on a packet that is much smaller, requiring a new page for the next round. In this patch, we implement a page-reuse mechanism, that resembles a `SW Striding RQ`. We allow the WQE to reuse its allocated page as much as it could, until the page is fully consumed. In each round, the WQE is capable of receiving packet of maximal size (MTU). Yet, upon the reception of a packet, the WQE knows the actual packet size, and consumes the exact amount of memory needed to build a linear SKB. Then, it updates the buffer pointer within the page accordingly, for the next round. Feature is mutually exclusive with XDP (packet-per-page) and LRO (session size is a power of two, needs unused page). Performance tests: iperf tcp tests show huge gain: -------------------------------------------- num streams | BW before | BW after | ratio | 1 | 22.2 | 30.9 | 1.39x | 8 | 64.2 | 93.6 | 1.46x | 64 | 56.7 | 91.4 | 1.61x | -------------------------------------------- Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Enhance RX SKB headroom logicTariq Toukan
In the RX memory scheme of non Striding RQ, we use linear SKBs. Keeping NET_IP_ALIGN in headroom can improve performance on some archs. In addition, take this headroom into account when calculating the LRO WQE size. These are not needed in Striding RQ as they're done implicitly within the non-linear SKB allocation. Fixes: 1bfecfca565c ("net/mlx5e: Build RX SKB on demand") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22net/mlx5e: Build SKB with exact frag_sizeTariq Toukan
Build the SKB over the receive packet instead of the whole page. Getting the SKB's linear data and shared_info closer improves locality. In addition, this opens up the possibility to make use of other parts of the page in the downstream page-reuse patch. Fixes: 1bfecfca565c ("net/mlx5e: Build RX SKB on demand") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-22powerpc/powernv/npu-dma: Add explicit flush when sending an ATSDAlistair Popple
NPU2 requires an extra explicit flush to an active GPU PID when sending address translation shoot downs (ATSDs) to reliably flush the GPU TLB. This patch adds just such a flush at the end of each sequence of ATSDs. We can safely use PID 0 which is always reserved and active on the GPU. PID 0 is only used for init_mm which will never be a user mm on the GPU. To enforce this we add a check in pnv_npu2_init_context() just in case someone tries to use PID 0 on the GPU. Signed-off-by: Alistair Popple <alistair@popple.id.au> [mpe: Use true/false for bool literals] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-22KVM: s390: gaccess: fix real-space designation asce handling for gmap shadowsHeiko Carstens
For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-22perf/x86/intel: Add 1G DTLB load/store miss support for SKLKan Liang
Current DTLB load/store miss events (0x608/0x649) only counts 4K,2M and 4M page size. Need to extend the events to support any page size (4K/2M/4M/1G). The complete DTLB load/store miss events are: DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08 DTLB_STORE_MISSES.WALK_COMPLETED 0xe49 Signed-off-by: Kan Liang <Kan.liang@intel.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-22i2c: imx: Use correct function to write to registerMichail Georgios Etairidis
The i2c-imx driver incorrectly uses readb()/writeb() to read and write to the appropriate registers when performing a repeated start. The appropriate imx_i2c_read_reg()/imx_i2c_write_reg() functions should be used instead. Performing a repeated start results in a kernel panic. The platform is imx. Signed-off-by: Michail G Etairidis <m.etairidis@beck-ipc.com> Fixes: ce1a78840ff7 ("i2c: imx: add DMA support for freescale i2c driver") Fixes: 054b62d9f25c ("i2c: imx: fix the i2c bus hang issue when do repeat restart") Acked-by: Fugang Duan <fugang.duan@nxp.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-06-22esp6_offload: Fix IP6CB(skb)->nhoff for ESP GROYossi Kuperman
IP6CB(skb)->nhoff is the offset of the nexthdr field in an IPv6 header, unless there are extension headers present, in which case nhoff points to the nexthdr field of the last extension header. In non-GRO code path, nhoff is set by ipv6_rcv before any XFRM code is executed. Conversely, in GRO code path (when esp6_offload is loaded), nhoff is not set. The following functions fail to read the correct value and eventually the packet is dropped: xfrm6_transport_finish xfrm6_tunnel_input xfrm6_rcv_tnl Set nhoff to the proper offset of nexthdr in esp6_gro_receive. Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-06-22xfrm6: Fix IPv6 payload_len in xfrm6_transport_finishYossi Kuperman
IPv6 payload length indicates the size of the payload, including any extension headers. In xfrm6_transport_finish, ipv6_hdr(skb)->payload_len is set to the payload size only, regardless of the presence of any extension headers. After ESP GRO transport mode decapsulation, ipv6_rcv trims the packet according to the wrong payload_len, thus corrupting the packet. Set payload_len to account for extension headers as well. Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-06-21Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "This contains a set of fixes for xen-blkback by way of Konrad, and a performance regression fix for blk-mq for shared tags. The latter could account for as much as a 50x reduction in performance, with the test case from the user with 500 name spaces. A more realistic setup on my end with 32 drives showed a 3.5x drop. The fix has been thoroughly tested before being committed" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: fix performance regression with shared tags xen-blkback: don't leak stack data via response ring xen/blkback: don't use xen_blkif_get() in xen-blkback kthread xen/blkback: don't free be structure too early xen/blkback: fix disconnect while I/Os in flight
2017-06-21xfs: don't allow bmap on rt filesDarrick J. Wong
bmap returns a dumb LBA address but not the block device that goes with that LBA. Swapfiles don't care about this and will blindly assume that the data volume is the correct blockdev, which is totally bogus for files on the rt subvolume. This results in the swap code doing IOs to arbitrary locations on the data device(!) if the passed in mapping is a realtime file, so just turn off bmap for rt files. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-06-22kbuild: fix header installation under fakechroot environmentRichard Genoud
Since commit fcc8487d477a ("uapi: export all headers under uapi directories") fakechroot make bindeb-pkg fails, mismatching files for directories: touch: cannot touch 'usr/include/video/uvesafb.h/.install': Not a directory This due to a bug in fakechroot: when using the function $(wildcard $(srcdir)/*/.) in a makefile, under a fakechroot environment, not only directories but also files are returned. To circumvent that, we are using the functions: $(sort $(dir $(wildcard $(srcdir)/*/)))) Fixes: fcc8487d477a ("uapi: export all headers under uapi directories") Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two entries being added at the same time to the IFLA policy table, whilst parallel bug fixes to decnet routing dst handling overlapping with the dst gc removal in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21ACPI / scan: Fix enumeration for special SPI and I2C devicesJarkko Nikula
Commit f406270bf73d ("ACPI / scan: Set the visited flag for all enumerated devices") caused that two group of special SPI or I2C devices do not enumerate. SPI and I2C devices are expected to be enumerated by the SPI and I2C subsystems but change caused that acpi_bus_attach() marks those devices with acpi_device_set_enumerated(). First group of devices are matched using Device Tree compatible property with special _HID "PRP0001". Those devices have matched scan handler, acpi_scan_attach_handler() retuns 1 and acpi_bus_attach() marks them with acpi_device_set_enumerated(). Second group of devices without valid _HID such as "LNXVIDEO" have device->pnp.type.platform_id set to zero and change again marks them with acpi_device_set_enumerated(). Fix this by flagging the SPI and I2C devices during struct acpi_device object initialization time and let the code in acpi_bus_attach() to go through the device_attach() and acpi_default_enumeration() path for all SPI and I2C devices. Fixes: f406270bf73d (ACPI / scan: Set the visited flag for all enumerated devices) Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: 4.11+ <stable@vger.kernel.org> # 4.11+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix refcounting wrt timers which hold onto inet6 address objects, from Xin Long. 2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg. 3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel. 4) Several mlx5 driver fixes (firmware readiness, timestamp cap reporting, devlink command validity checking, tc offloading, etc.) From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz. 5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan. 6) Fix dst refcount bug in decnet, from Wei Wang. 7) Netdev can be double freed in register_vlan_device(). Fix from Gao Feng. 8) Don't allow object to be destroyed while it is being dumped in SCTP, from Xin Long. 9) Fix dpaa_eth build when modular, from Madalin Bucur. 10) Fix throw route leaks, from Serhey Popovych. 11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table, also from Serhey Popovych. 12) Fix premature TX SKB free in stmmac, from Niklas Cassel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits) igmp: add a missing spin_lock_init() net: stmmac: free an skb first when there are no longer any descriptors using it sfc: remove duplicate up_write on VF filter_sem rtnetlink: add IFLA_GROUP to ifla_policy ipv6: Do not leak throw route references dt-bindings: net: sms911x: Add missing optional VDD regulators dpaa_eth: reuse the dma_ops provided by the FMan MAC device fsl/fman: propagate dma_ops net/core: remove explicit do_softirq() from busy_poll_stop() fib_rules: Resolve goto rules target on delete sctp: ensure ep is not destroyed before doing the dump net/hns:bugfix of ethtool -t phy self_test net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev cxgb4: notify uP to route ctrlq compl to rdma rspq ip6_tunnel: Correct tos value in collect_md mode decnet: always not take dst->__refcnt when inserting dst into hash table ip6_tunnel: fix potential issue in __ip6_tnl_rcv ip_tunnel: fix potential issue in ip_tunnel_rcv brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it ...
2017-06-21Merge branch 'qed-File-split-and-rename-towards-iWARP-support'David S. Miller
Michal Kalderon says: ==================== qed*: File split and rename towards iWARP support This patch series makes a few more infrastructure changes towards adding iWARP support. Hopefully this is the last infrastructure change prior to the iWARP RFC. Patch #1-3 take care of taking all the common iWARP/RoCE code out of qed_roce.[ch] and placing it in qed_rdma.[ch] Patch #4 renames the roce interface file as it is common for RoCE and iWARP. This patch touches qedr as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21qed*: Rename qed_roce_if.h to qed_rdma_if.hKalderon, Michal
Rename the qed_roce_if file to qed_rdma_if as it represents a common interface for RoCE and iWARP. this commit affects RDMA/qedr as well. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21qed: Split rdma content between qed_rdma and qed_roceKalderon, Michal
This patch places common iWARP / RoCE code in qed_rdma and roce specific code in qed_roce There is one new function ( qed_roce_setup ) added, the rest of the patch removes content from the files and removes some static definitions. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21qed: Duplicate qed_roce.[ch] to qed_rdma.[ch]Kalderon, Michal
This patch adds files that will contain common code for RoCE/iWARP. The files are currently identical to qed_roce.c / qed_roce.h and intentionally not added to the makefile. The next patch in the series will modify the files so that roce specific code is left in qed_roce and common roce/iwarp code will be placed in qed_rdma This patch is the result of a simple cp qed_rdma.c qed_roce.c cp qed_rdma.h qed_roce.h Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21qed: Cleanup qed_roce before duplicating itKalderon, Michal
The next patch in the series will duplicate qed_roce as part of code preprations for iWARP support. Do some cleanup before duplicating Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21Merge tag 'pinctrl-v4.12-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull more pin control fixes from Linus Walleij: "Some late arriving fixes. I should have sent earlier, just swamped with work as usual. Thomas patch makes AMD systems usable despite firmware bugs so it is fairly important. - Make the AMD driver use a regular interrupt rather than a chained one, so the system does not lock up. - Fix a function call error deep inside the STM32 driver" * tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: stm32: Fix bad function call pinctrl/amd: Use regular interrupt instead of chained
2017-06-21bpf: expose prog id for cls_bpf and act_bpfDaniel Borkmann
In order to be able to retrieve the attached programs from cls_bpf and act_bpf, we need to expose the prog ids via netlink so that an application can later on get an fd based on the id through the BPF_PROG_GET_FD_BY_ID command, and dump related prog info via BPF_OBJ_GET_INFO_BY_FD command for bpf(2). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - revert of a commit to magicmouse driver that regressess certain devices, from Daniel Stone - quirk for a specific Dell mouse, from Sebastian Parschauer * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse" HID: Add quirk for Dell PIXART OEM mouse
2017-06-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fix from Jiri Kosina: "Fix the way how livepatches are being stacked with respect to RCU, from Petr Mladek" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Fix stacking of patches with respect to RCU
2017-06-21Merge branch 'ufs-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more ufs fixes from Al Viro: "More UFS fixes, unfortunately including build regression fix for the 64-bit s_dsize commit. Fixed in this pile: - trivial bug in signedness of 32bit timestamps on ufs1 - ESTALE instead of ufs_error() when doing open-by-fhandle on something deleted - build regression on 32bit in ufs_new_fragments() - calculating that many percents of u64 pulls libgcc stuff on some of those. Mea culpa. - fix hysteresis loop broken by typo in 2.4.14.7 (right next to the location of previous bug). - fix the insane limits of said hysteresis loop on filesystems with very low percentage of reserved blocks. If it's 5% or less, just use the OPTSPACE policy. - calculate those limits once and mount time. This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_ survive cross-builds. Again, my apologies for missing that, especially since I have noticed a related percentage-of-64bit issue in earlier patches (when dealing with amount of reserved blocks). Self-LART applied..." * 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ufs: fix the logics for tail relocation ufs_iget(): fail with -ESTALE on deleted inode fix signedness of timestamps on ufs1
2017-06-21Allow stack to grow up to address space limitHelge Deller
Fix expand_upwards() on architectures with an upward-growing stack (parisc, metag and partly IA-64) to allow the stack to reliably grow exactly up to the address space limit given by TASK_SIZE. Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-21mm: fix new crash in unmapped_area_topdown()Hugh Dickins
Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Cc: stable@vger.kernel.org Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones <davej@codemonkey.org.uk> Debugged-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-21blk-mq: fix performance regression with shared tagsJens Axboe
If we have shared tags enabled, then every IO completion will trigger a full loop of every queue belonging to a tag set, and every hardware queue for each of those queues, even if nothing needs to be done. This causes a massive performance regression if you have a lot of shared devices. Instead of doing this huge full scan on every IO, add an atomic counter to the main queue that tracks how many hardware queues have been marked as needing a restart. With that, we can avoid looking for restartable queues, if we don't have to. Max reports that this restores performance. Before this patch, 4K IOPS was limited to 22-23K IOPS. With the patch, we are running at 950-970K IOPS. Fixes: 6d8c6c0f97ad ("blk-mq: Restart a single queue if tag sets are shared") Reported-by: Max Gurtovoy <maxg@mellanox.com> Tested-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-21dm io: fix duplicate bio completion due to missing ref countMike Snitzer
If only a subset of the devices associated with multiple regions support a given special operation (eg. DISCARD) then the dec_count() that is used to set error for the region must increment the io->count. Otherwise, when the dec_count() is called it can cause the dm-io caller's bio to be completed multiple times. As was reported against the dm-mirror target that had mirror legs with a mix of discard capabilities. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=196077 Reported-by: Zhang Yi <yizhan@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-21dm integrity: fix to not disable/enable interrupts from interrupt contextMike Snitzer
Use spin_lock_irqsave and spin_unlock_irqrestore rather than spin_{lock,unlock}_irq in submit_flush_bio(). Otherwise lockdep issues the following warning: DEBUG_LOCKS_WARN_ON(current->hardirq_context) WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:2748 trace_hardirqs_on_caller+0x107/0x180 Reported-by: Ondrej Kozina <okozina@redhat.com> Tested-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com>
2017-06-21sock: avoid dirtying incoming_cpu if not neededPaolo Abeni
for connected socket, the incoming_cpu field in the sock struct is not going to change frequently, but we are setting it unconditionally for each packet. Since sk_incoming_cpu and sk_flags share the same cacheline, and the latter is access by udp_recvmsg(), this cause a cache miss for each packet for UDP connected socket. With this patch, we set the incoming cpu field only when the ingress cpu really changes. This gives a small but measurable performance improvement for connected UDP socket. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>