linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-07-08	net: axienet: fix a potential double free in axienet_probe()	Wen Yang
	There is a possible use-after-free issue in the axienet_probe(): 1701: np = of_parse_phandle(pdev->dev.of_node, "axistream-connected", 0); 1702: if (np) { ... 1787: of_node_put(np); ---> released here 1788: lp->eth_irq = platform_get_irq(pdev, 0); 1789: } else { ... 1801: } 1802: if (IS_ERR(lp->dma_regs)) { ... 1805: of_node_put(np); ---> double released here 1806: goto free_netdev; 1807: } We solve this problem by removing the unnecessary of_node_put(). Fixes: 28ef9ebdb64c ("net: axienet: make use of axistream-connected attribute optional") Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Anirudha Sarangi <anirudh@xilinx.com> Cc: John Linn <John.Linn@xilinx.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Robert Hancock <hancock@sedsystems.ca> Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Robert Hancock <hancock@sedsystems.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	sunrpc/cache: remove the exporting of cache_seq_next	Denis Efremov
	The function cache_seq_next is declared static and marked EXPORT_SYMBOL_GPL, which is at best an odd combination. Because the function is not used outside of the net/sunrpc/cache.c file it is defined in, this commit removes the EXPORT_SYMBOL_GPL() marking. Fixes: d48cf356a130 ("SUNRPC: Remove non-RCU protected lookup") Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-08	Merge branch 'locking-core-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - rwsem scalability improvements, phase #2, by Waiman Long, which are rather impressive: "On a 2-socket 40-core 80-thread Skylake system with 40 reader and writer locking threads, the min/mean/max locking operations done in a 5-second testing window before the patchset were: 40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810 40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255 After the patchset, they became: 40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741 40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098" There's a lot of changes to the locking implementation that makes it similar to qrwlock, including owner handoff for more fair locking. Another microbenchmark shows how across the spectrum the improvements are: "With a locking microbenchmark running on 5.1 based kernel, the total locking rates (in kops/s) on a 2-socket Skylake system with equal numbers of readers and writers (mixed) before and after this patchset were: # of Threads Before Patch After Patch ------------ ------------ ----------- 2 2,618 4,193 4 1,202 3,726 8 802 3,622 16 729 3,359 32 319 2,826 64 102 2,744" The changes are extensive and the patch-set has been through several iterations addressing various locking workloads. There might be more regressions, but unless they are pathological I believe we want to use this new implementation as the baseline going forward. - jump-label optimizations by Daniel Bristot de Oliveira: the primary motivation was to remove IPI disturbance of isolated RT-workload CPUs, which resulted in the implementation of batched jump-label updates. Beyond the improvement of the real-time characteristics kernel, in one test this patchset improved static key update overhead from 57 msecs to just 1.4 msecs - which is a nice speedup as well. - atomic64_t cross-arch type cleanups by Mark Rutland: over the last ~10 years of atomic64_t existence the various types used by the APIs only had to be self-consistent within each architecture - which means they became wildly inconsistent across architectures. Mark puts and end to this by reworking all the atomic64 implementations to use 's64' as the base type for atomic64_t, and to ensure that this type is consistently used for parameters and return values in the API, avoiding further problems in this area. - A large set of small improvements to lockdep by Yuyang Du: type cleanups, output cleanups, function return type and othr cleanups all around the place. - A set of percpu ops cleanups and fixes by Peter Zijlstra. - Misc other changes - please see the Git log for more details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) locking/lockdep: increase size of counters for lockdep statistics locking/atomics: Use sed(1) instead of non-standard head(1) option locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING x86/jump_label: Make tp_vec_nr static x86/percpu: Optimize raw_cpu_xchg() x86/percpu, sched/fair: Avoid local_clock() x86/percpu, x86/irq: Relax {set,get}_irq_regs() x86/percpu: Relax smp_processor_id() x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}() locking/rwsem: Guard against making count negative locking/rwsem: Adaptive disabling of reader optimistic spinning locking/rwsem: Enable time-based spinning on reader-owned rwsem locking/rwsem: Make rwsem->owner an atomic_long_t locking/rwsem: Enable readers spinning on writer locking/rwsem: Clarify usage of owner's nonspinaable bit locking/rwsem: Wake up almost all readers in wait queue locking/rwsem: More optimal RT task handling of null owner locking/rwsem: Always release wait_lock before waking up tasks locking/rwsem: Implement lock handoff to prevent lock starvation locking/rwsem: Make rwsem_spin_on_owner() return owner state ...
2019-07-09	selftests/bpf: fix test_reuseport_array on s390	Ilya Leoshkevich
	Fix endianness issue: passing a pointer to 64-bit fd as a 32-bit key does not work on big-endian architectures. So cast fd to 32-bits when necessary. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-08	net: stmmac: enable clause 45 mdio support	Kweh Hock Leong
	DWMAC4 is capable to support clause 45 mdio communication. This patch enable the feature on stmmac_mdio_write() and stmmac_mdio_read() by following phy_write_mmd() and phy_read_mmd() mdiobus read write implementation format. Reviewed-by: Li, Yifan <yifan2.li@intel.com> Signed-off-by: Kweh Hock Leong <hock.leong.kweh@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Drivers: hv: vmbus: Break out ISA independent parts of mshyperv.h	Michael Kelley
	Break out parts of mshyperv.h that are ISA independent into a separate file in include/asm-generic. This move facilitates ARM64 code reusing these definitions and avoids code duplication. No functionality or behavior is changed. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-08	net: openvswitch: use netif_ovs_is_port() instead of opencode	Taehee Yoo
	Use netif_ovs_is_port() function instead of open code. This patch doesn't change logic. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	MAINTAINERS: Add page_pool maintainer entry	Jesper Dangaard Brouer
	In this release cycle the number of NIC drivers using page_pool will likely reach 4 drivers. It is about time to add a maintainer entry. Add myself and Ilias. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'mvpp2-cls-ether'	David S. Miller
	Maxime Chevallier says: ==================== net: mvpp2: Add classification based on the ETHER flow This series adds support for classification of the ETHER flow in the mvpp2 driver. The first patch allows detecting when a user specifies a flow_type that isn't supported by the driver, while the second adds support for this flow_type by adding the mapping between the ETHER_FLOW enum value and the relevant classifier flow entries. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: mvpp2: cls: Add support for ETHER_FLOW	Maxime Chevallier
	Users can specify classification actions based on the 'ether' flow type. In that case, this will apply to all ethernet traffic, superseeding flows such as 'udp4' or 'tcp6'. Add support for this flow type in the PPv2 classifier, by mapping the ETHER_FLOW value to the corresponding entries in the classifier. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: mvpp2: cls: Report an error for unsupported flow types	Maxime Chevallier
	Add a missing check to detect flow types that we don't support, so that user can be informed of this. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'core-rcu-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The changes in this cycle are: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) tools/memory-model: Improve data-race detection tools/memory-model: Change definition of rcu-fence tools/memory-model: Expand definition of barrier tools/memory-model: Do not use "herd" to refer to "herd7" tools/memory-model: Fix comment in MP+poonceonces.litmus Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic() rcu: Don't return a value from rcu_assign_pointer() rcu: Force inlining of rcu_read_lock() rcu: Fix irritating whitespace error in rcu_assign_pointer() rcu: Upgrade sync_exp_work_done() to smp_mb() rcutorture: Upper case solves the case of the vanishing NULL pointer torture: Suppress propagating trace_printk() warning rcutorture: Dump trace buffer for callback pipe drain failures torture: Add --trust-make to suppress "make clean" torture: Make --cpus override idleness calculations torture: Run kernel build in source directory torture: Add function graph-tracing cheat sheet torture: Capture qemu output rcutorture: Tweak kvm options rcutorture: Add trivial RCU implementation ...
2019-07-08	selftests: txring_overwrite: fix incorrect test of mmap() return value	Frank de Brabander
	If mmap() fails it returns MAP_FAILED, which is defined as ((void ) -1). The current if-statement incorrectly tests if ring is NULL. Fixes: 358be656406d ("selftests/net: add txring_overwrite") Signed-off-by: Frank de Brabander <debrabander@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'vsock-virtio-fixes'	David S. Miller
	Stefano Garzarella says: ==================== vsock/virtio: several fixes in the .probe() and .remove() During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock before registering the driver", Stefan pointed out some possible issues in the .probe() and .remove() callbacks of the virtio-vsock driver. This series tries to solve these issues: - Patch 1 adds RCU critical sections to avoid use-after-free of 'the_virtio_vsock' pointer. - Patch 2 stops workers before to call vdev->config->reset(vdev) to be sure that no one is accessing the device. - Patch 3 moves the works flush at the end of the .remove() to avoid use-after-free of 'vsock' object. v3: - Patch 1: use rcu_dereference_protected() to get the_virtio_vosck value in the virtio_vsock_probe() [Jason] v2: https://patchwork.kernel.org/cover/11022343/ v1: https://patchwork.kernel.org/cover/10964733/ Before this series the guest crashes in a few second. After this series the test runs (~12h) without issues. Tested on an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait) with these scripts to stress the .probe()/.remove() path: - guest while true; do cat /dev/urandom \| nc-vsock -l 4321 > /dev/null & cat /dev/urandom \| nc-vsock -l 5321 > /dev/null & cat /dev/urandom \| nc-vsock -l 6321 > /dev/null & cat /dev/urandom \| nc-vsock -l 7321 > /dev/null & wait done - host while true; do cat /dev/urandom \| nc-vsock 3 4321 > /dev/null & cat /dev/urandom \| nc-vsock 3 5321 > /dev/null & cat /dev/urandom \| nc-vsock 3 6321 > /dev/null & cat /dev/urandom \| nc-vsock 3 7321 > /dev/null & sleep 2 echo "device_del v1" \| nc 127.0.0.1 1234 sleep 1 echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" \| nc 127.0.0.1 1234 sleep 1 done ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: fix flush of works during the .remove()	Stefano Garzarella
	This patch moves the flush of works after vdev->config->del_vqs(vdev), because we need to be sure that no workers run before to free the 'vsock' object. Since we stopped the workers using the [tx\|rx\|event]_run flags, we are sure no one is accessing the device while we are calling vdev->config->reset(vdev), so we can safely move the workers' flush. Before the vdev->config->del_vqs(vdev), workers can be scheduled by VQ callbacks, so we must flush them after del_vqs(), to avoid use-after-free of 'vsock' object. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: stop workers during the .remove()	Stefano Garzarella
	Before to call vdev->config->reset(vdev) we need to be sure that no one is accessing the device, for this reason, we add new variables in the struct virtio_vsock to stop the workers during the .remove(). This patch also add few comments before vdev->config->reset(vdev) and vdev->config->del_vqs(vdev). Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock	Stefano Garzarella
	Some callbacks used by the upper layers can run while we are in the .remove(). A potential use-after-free can happen, because we free the_virtio_vsock without knowing if the callbacks are over or not. To solve this issue we move the assignment of the_virtio_vsock at the end of .probe(), when we finished all the initialization, and at the beginning of .remove(), before to release resources. For the same reason, we do the same also for the vdev->priv. We use RCU to be sure that all callbacks that use the_virtio_vsock ended before freeing it. This is not required for callbacks that use vdev->priv, because after the vdev->config->del_vqs() we are sure that they are ended and will no longer be invoked. We also take the mutex during the .remove() to avoid that .probe() can run while we are resetting the device. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'b53-docs'	David S. Miller
	Benedikt Spranger says: ==================== Document the configuration of b53 this is the third round to document the configuration of a b53 supported switch. v3..v2: - fix a typo - improve b53 configuration in DSA_TAG_PROTO_NONE showcase. - grade up from RFC to patch for mainline inclusion. v1..v2: - split out generic parts of the configuration. - target comments by Andrew Lunn and Florian Fainelli. - make changes visible to build system ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Documentation: net: dsa: b53: Describe b53 configuration	Benedikt Spranger
	Document the different needs of documentation for the b53 driver. Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Documentation: net: dsa: Describe DSA switch configuration	Benedikt Spranger
	Document DSA tagged and VLAN based switch configuration by showcases. Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09	power: reset: nvmem-reboot-mode: add CONFIG_OF dependency	Arnd Bergmann
	Without CONFIG_OF, we get a build failure in the reboot-mode implementation: drivers/power/reset/reboot-mode.c: In function 'reboot_mode_register': drivers/power/reset/reboot-mode.c:72:2: error: implicit declaration of function 'for_each_property_of_node'; did you mean 'for_each_child_of_node'? [-Werror=implicit-function-declaration] for_each_property_of_node(np, prop) { Add a Kconfig dependency like we have for the other users of CONFIG_REBOOT_MODE. Fixes: 7a78a7f7695b ("power: reset: nvmem-reboot-mode: use NVMEM as reboot mode write interface") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2019-07-08	nfp: tls: fix error return code in nfp_net_tls_add()	Wei Yongjun
	Fix to return negative error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1f35a56cf586 ("nfp: tls: add/delete TLS TX connections") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'bnxt_en-XDP_REDIRECT'	David S. Miller
	Michael Chan says: ==================== bnxt_en: Add XDP_REDIRECT support. This patch series adds XDP_REDIRECT support by Andy Gospodarek. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: add page_pool support	Andy Gospodarek
	This removes contention over page allocation for XDP_REDIRECT actions by adding page_pool support per queue for the driver. The performance for XDP_REDIRECT actions scales linearly with the number of cores performing redirect actions when using the page pools instead of the standard page allocator. v2: Fix up the error path from XDP registration, noted by Ilias Apalodimas. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: optimized XDP_REDIRECT support	Andy Gospodarek
	This adds basic support for XDP_REDIRECT in the bnxt_en driver. Next patch adds the more optimized page pool support. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: Refactor __bnxt_xmit_xdp().	Michael Chan
	__bnxt_xmit_xdp() is used by XDP_TX and ethtool loopback packet transmit. Refactor it so that it can be re-used by the XDP_REDIRECT logic. Restructure the TX interrupt handler logic to cleanly separate XDP_TX logic in preparation for XDP_REDIRECT. Acked-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	bnxt_en: rename some xdp functions	Andy Gospodarek
	Renaming bnxt_xmit_xdp to __bnxt_xmit_xdp to get ready for XDP_REDIRECT support and reduce confusion/namespace collision. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	Merge branch 'cpsw-Add-XDP-support'	David S. Miller
	Ivan Khoronzhuk says: ==================== net: ethernet: ti: cpsw: Add XDP support This patchset adds XDP support for TI cpsw driver and base it on page_pool allocator. It was verified on af_xdp socket drop, af_xdp l2f, ebpf XDP_DROP, XDP_REDIRECT, XDP_PASS, XDP_TX. It was verified with following configs enabled: CONFIG_JIT=y CONFIG_BPFILTER=y CONFIG_BPF_SYSCALL=y CONFIG_XDP_SOCKETS=y CONFIG_BPF_EVENTS=y CONFIG_HAVE_EBPF_JIT=y CONFIG_BPF_JIT=y CONFIG_CGROUP_BPF=y Link on previous v7: https://lkml.org/lkml/2019/7/4/715 Also regular tests with iperf2 were done in order to verify impact on regular netstack performance, compared with base commit: https://pastebin.com/JSMT0iZ4 v8..v9: - fix warnings on arm64 caused by typos in type casting v7..v8: - corrected dma calculation based on headroom instead of hard start - minor comment changes v6..v7: - rolled back to v4 solution but with small modification - picked up patch: https://www.spinics.net/lists/netdev/msg583145.html - added changes related to netsec fix and cpsw v5..v6: - do changes that is rx_dev while redirect/flush cycle is kept the same - dropped net: ethernet: ti: davinci_cpdma: return handler status - other changes desc in patches v4..v5: - added two plreliminary patches: net: ethernet: ti: davinci_cpdma: allow desc split while down net: ethernet: ti: cpsw_ethtool: allow res split while down - added xdp alocator refcnt on xdp level, avoiding page pool refcnt - moved flush status as separate argument for cpdma_chan_process - reworked cpsw code according to last changes to allocator - added missed statistic counter v3..v4: - added page pool user counter - use same pool for ndevs in dual mac - restructured page pool create/destroy according to the last changes in API v2..v3: - each rxq and ndev has its own page pool v1..v2: - combined xdp_xmit functions - used page allocation w/o refcnt juggle - unmapped page for skb netstack - moved rxq/page pool allocation to open/close pair - added several preliminary patches: net: page_pool: add helper function to retrieve dma addresses net: page_pool: add helper function to unmap dma addresses net: ethernet: ti: cpsw: use cpsw as drv data net: ethernet: ti: cpsw_ethtool: simplify slave loops ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: cpsw: add XDP support	Ivan Khoronzhuk
	Add XDP support based on rx page_pool allocator, one frame per page. Page pool allocator is used with assumption that only one rx_handler is running simultaneously. DMA map/unmap is reused from page pool despite there is no need to map whole page. Due to specific of cpsw, the same TX/RX handler can be used by 2 network devices, so special fields in buffer are added to identify an interface the frame is destined to. Thus XDP works for both interfaces, that allows to test xdp redirect between two interfaces easily. Also, each rx queue have own page pools, but common for both netdevs. XDP prog is common for all channels till appropriate changes are added in XDP infrastructure. Also, once page_pool recycling becomes part of skb netstack some simplifications can be added, like removing page_pool_release_page() before skb receive. In order to keep rx_dev while redirect, that can be somehow used in future, do flush in rx_handler, that allows to keep rx dev the same while redirect. It allows to conform with tracing rx_dev pointed by Jesper. Also, there is probability, that XDP generic code can be extended to support multi ndev drivers like this one, using same rx queue for several ndevs, based on switchdev for instance or else. In this case, driver can be modified like exposed here: https://lkml.org/lkml/2019/7/3/243 Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: cpsw_ethtool: allow res split while down	Ivan Khoronzhuk
	That's possible to set channel num while interfaces are down. When interface gets up it should resplit budget. This resplit can happen after phy is up but only if speed is changed, so should be set before this, for this allow it to happen while changing number of channels, when interfaces are down. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: davinci_cpdma: allow desc split while down	Ivan Khoronzhuk
	That's possible to set ring params while interfaces are down. When interface gets up it uses number of descs to fill rx queue and on later on changes to create rx pools. Usually, this resplit can happen after phy is up, but it can be needed before this, so allow it to happen while setting number of rx descs, when interfaces are down. Also, if no dependency on intf state, move it to cpdma layer, where it should be. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: ethernet: ti: davinci_cpdma: add dma mapped submit	Ivan Khoronzhuk
	In case if dma mapped packet needs to be sent, like with XDP page pool, the "mapped" submit can be used. This patch adds dma mapped submit based on regular one. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	net: core: page_pool: add user refcnt and reintroduce page_pool_destroy	Ivan Khoronzhuk
	Jesper recently removed page_pool_destroy() (from driver invocation) and moved shutdown and free of page_pool into xdp_rxq_info_unreg(), in-order to handle in-flight packets/pages. This created an asymmetry in drivers create/destroy pairs. This patch reintroduce page_pool_destroy and add page_pool user refcnt. This serves the purpose to simplify drivers error handling as driver now drivers always calls page_pool_destroy() and don't need to track if xdp_rxq_info_reg_mem_model() was unsuccessful. This could be used for a special cases where a single RX-queue (with a single page_pool) provides packets for two net_device'es, and thus needs to register the same page_pool twice with two xdp_rxq_info structures. This patch is primarily to ease API usage for drivers. The recently merged netsec driver, actually have a bug in this area, which is solved by this API change. This patch is a modified version of Ivan Khoronzhuk's original patch. Link: https://lore.kernel.org/netdev/20190625175948.24771-2-ivan.khoronzhuk@linaro.org/ Fixes: 5c67bf0ec4d0 ("net: netsec: Use page_pool API") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	of/fdt: pass early_init_dt_reserve_memory_arch() with bool type nomap	Masahiro Yamada
	The third argument 'nomap' of early_init_dt_reserve_memory_arch() is bool. It is preferred to pass it with a bool type parameter. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-08	dma-mapping: mark dma_alloc_need_uncached as __always_inline	Christoph Hellwig
	Without the __always_inline at least i386 configs that have CONFIG_OPTIMIZE_INLINING set seem fail to inline dma_alloc_need_uncached, leading to a linker error because of undefined symbols. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
2019-07-08	docs: automarkup.py: ignore exceptions when seeking for xrefs	Mauro Carvalho Chehab
	When using the automarkup extension with: make pdfdocs without passing an specific book, the code will raise an exception: File "/devel/v4l/docs/Documentation/sphinx/automarkup.py", line 86, in auto_markup node.parent.replace(node, markup_funcs(name, app, node)) File "/devel/v4l/docs/Documentation/sphinx/automarkup.py", line 59, in markup_funcs 'function', target, pxref, lit_text) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/domains/c.py", line 308, in resolve_xref contnode, target) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/util/nodes.py", line 450, in make_refnode '#' + targetid) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/builders/latex/__init__.py", line 159, in get_relative_uri return self.get_target_uri(to, typ) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/builders/latex/__init__.py", line 152, in get_target_uri raise NoUri sphinx.environment.NoUri This happens because not all references will belong to a single PDF/LaTeX document. Better to just ignore those than breaking Sphinx build. Fixes: d74b0d31ddde ("Docs: An initial automarkup extension for sphinx") Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> [jc: Narrowed the "except" and tweaked the comment] Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-07-08	docs: Move binderfs to admin-guide	Matthew Wilcox (Oracle)
	The documentation is more appropriate for the administrator than for the internal kernel API section it is currently in. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-07-08	of/platform: Drop superfluous cast in of_device_make_bus_id()	Geert Uytterhoeven
	There is no need to cast "u64" to "unsigned long long" before printing it, as both types have been made identical on all architectures many years ago. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rob Herring <robh@kernel.org>
2019-07-08	IB/core: Work on the caller socket net namespace in nldev_newlink()	Parav Pandit
	While creating new RDMA devices based on netdevice name, consider the net namespace of the caller skb's socket similar to rest of the doit() callbacks and nldev_dellink() which deletes the RDMA device created using nldev_newlink(). Fixes: 3856ec4b93c94 ("RDMA/core: Add RDMA_NLDEV_CMD_NEWLINK/DELLINK support") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	nfc: fix potential illegal memory access	Yang Wei
	The frags_q is not properly initialized, it may result in illegal memory access when conn_info is NULL. The "goto free_exit" should be replaced by "goto exit". Signed-off-by: Yang Wei <albin_yang@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	macb: fix build warning for !CONFIG_OF	Arnd Bergmann
	When CONFIG_OF is disabled, we get a harmless warning about the newly added variable: drivers/net/ethernet/cadence/macb_main.c:48:39: error: 'mgmt' defined but not used [-Werror=unused-variable] static struct sifive_fu540_macb_mgmt *mgmt; Move the variable closer to its use inside of the #ifdef. Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	gve: fix unused variable/label warnings	Arnd Bergmann
	On unusual page sizes, we get harmless warnings: drivers/net/ethernet/google/gve/gve_rx.c:283:6: error: unused variable 'pagecount' [-Werror,-Wunused-variable] drivers/net/ethernet/google/gve/gve_rx.c:336:1: error: unused label 'have_skb' [-Werror,-Wunused-label] Change the preprocessor #if to regular if() to avoid this. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM	Konstantin Taranov
	Calculate the correct byte_len on the receiving side when a work completion is generated with IB_WC_RECV_RDMA_WITH_IMM opcode. According to the IBA byte_len must indicate the number of written bytes, whereas it was always equal to zero for the IB_WC_RECV_RDMA_WITH_IMM opcode, even though data was transferred. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Konstantin Taranov <konstantin.taranov@inf.ethz.ch> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	RDMA/mlx5: Set RDMA DIM to be enabled by default	Leon Romanovsky
	Enable RDMA DIM by default for better user experience. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink	Yamin Friedman
	Added parameter in ib_device for enabling dynamic interrupt moderation so that it can be configured in userspace using rdma tool. In order to set adaptive-moderation for an ib device the command is: rdma dev set [DEV] adaptive-moderation [on\|off] Please set on/off. rdma dev show 0: mlx5_0: node_type ca fw 16.26.0055 node_guid 248a:0703:00a5:29d0 sys_image_guid 248a:0703:00a5:29d0 adaptive-moderation on rdma resource show cq dev mlx5_0 cqn 0 cqe 1023 users 4 poll-ctx UNBOUND_WORKQUEUE adaptive-moderation off comm [ib_core] Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	RDMA/core: Provide RDMA DIM support for ULPs	Yamin Friedman
	Added the interface in the infiniband driver that applies the rdma_dim adaptive moderation. There is now a special function for allocating an ib_cq that uses rdma_dim. Performance improvement (ConnectX-5 100GbE, x86) running FIO benchmark over NVMf between two equal end-hosts with 56 cores across a Mellanox switch using null_blk device: READS without DIM: blk size \| BW \| IOPS \| 99th percentile latency \| 99.99th latency 512B \| 3.8GiB/s \| 7.7M \| 1401 usec \| 2442 usec 4k \| 7.0GiB/s \| 1.8M \| 4817 usec \| 6587 usec 64k \| 10.7GiB/s\| 175k \| 9896 usec \| 10028 usec IO WRITES without DIM: blk size \| BW \| IOPS \| 99th percentile latency \| 99.99th latency 512B \| 3.6GiB/s \| 7.5M \| 1434 usec \| 2474 usec 4k \| 6.3GiB/s \| 1.6M \| 938 usec \| 1221 usec 64k \| 10.7GiB/s\| 175k \| 8979 usec \| 12780 usec IO READS with DIM: blk size \| BW \| IOPS \| 99th percentile latency \| 99.99th latency 512B \| 4GiB/s \| 8.2M \| 816 usec \| 889 usec 4k \| 10.1GiB/s\| 2.65M\| 3359 usec \| 5080 usec 64k \| 10.7GiB/s\| 175k \| 9896 usec \| 10028 usec IO WRITES with DIM: blk size \| BW \| IOPS \| 99th percentile latency \| 99.99th latency 512B \| 3.9GiB/s \| 8.1M \| 799 usec \| 922 usec 4k \| 9.6GiB/s \| 2.5M \| 717 usec \| 1004 usec 64k \| 10.7GiB/s\| 176k \| 8586 usec \| 12256 usec The rdma_dim algorithm was designed to measure the effectiveness of moderation on the flow in a general way and thus should be appropriate for all RDMA storage protocols. rdma_dim is configured to be the default option based on performance improvement seen after extensive tests. Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	linux/dim: Implement RDMA adaptive moderation (DIM)	Yamin Friedman
	RDMA DIM implements a different algorithm from net DIM and is based on completions which is how we can implement interrupt moderation in RDMA. The algorithm optimizes for number of completions and ratio between completions and events. In order to avoid long latencies, the implementation performs fast reduction of moderation level when the traffic changes. Signed-off-by: Yamin Friedman <yaminf@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	Merge tag 'blk-dim-v2' into rdma.git for-next	Jason Gunthorpe
	Generic DIM From: Tal Gilboa and Yamin Fridman Implement net DIM over a generic DIM library, add RDMA DIM dim.h lib exposes an implementation of the DIM algorithm for dynamically-tuned interrupt moderation for networking interfaces. We want a similar functionality for other protocols, which might need to optimize interrupts differently. Main motivation here is DIM for NVMf storage protocol. Current DIM implementation prioritizes reducing interrupt overhead over latency. Also, in order to reduce DIM's own overhead, the algorithm might take some time to identify it needs to change profiles. While this is acceptable for networking, it might not work well on other scenarios. Here we propose a new structure to DIM. The idea is to allow a slightly modified functionality without the risk of breaking Net DIM behavior for netdev. We verified there are no degradations in current DIM behavior with the modified solution. Suggested solution: - Common logic is implemented in lib/dim/dim.c - Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses the common logic in dim.c - Any new DIM logic will be implemented in "lib/dim/new_dim.c". This new implementation will expose modified versions of profiles, dim_step() and dim_decision(). - DIM API is declared in include/linux/dim.h for all implementations. Pros for this solution are: - Zero impact on existing net_dim implementation and usage - Relatively more code reuse (compared to two separate solutions) - Increased extensibility Required for dependencies in the next series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-08	net: stmmac: Re-work the queue selection for TSO packets	Jose Abreu
	Ben Hutchings says: "This is the wrong place to change the queue mapping. stmmac_xmit() is called with a specific TX queue locked, and accessing a different TX queue results in a data race for all of that queue's state. I think this commit should be reverted upstream and in all stable branches. Instead, the driver should implement the ndo_select_queue operation and override the queue mapping there." Fixes: c5acdbee22a1 ("net: stmmac: Send TSO packets always from Queue 0") Suggested-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-08	drm/amd/display: avoid 64-bit division	Arnd Bergmann
	On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Acked-by: Slava Abramov <slava.abramov@amd.com> Tested-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>