git.armlinux.org.uk/linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2018-09-26	Merge tag 'iommu-fixes-v4.19-rc5' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Joerg writes: "IOMMU Fixes for Linux v4.19-rc5 Three fixes queued up: - Warning fix for Rockchip IOMMU where there were IRQ handlers for offlined hardware. - Fix for Intel VT-d because recent changes caused boot failures on some machines because it tried to allocate to much contiguous memory. - Fix for AMD IOMMU to handle eMMC devices correctly that appear as ACPI HID devices." * tag 'iommu-fixes-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Return devid as alias for ACPI HID devices iommu/vt-d: Handle memory shortage on pasid table allocation iommu/rockchip: Free irqs in shutdown handler
2018-09-26	Bluetooth: SMP: fix crash in unpairing	Matias Karhumaa
	In case unpair_device() was called through mgmt interface at the same time when pairing was in progress, Bluetooth kernel module crash was seen. [ 600.351225] general protection fault: 0000 [#1] SMP PTI [ 600.351235] CPU: 1 PID: 11096 Comm: btmgmt Tainted: G OE 4.19.0-rc1+ #1 [ 600.351238] Hardware name: Dell Inc. Latitude E5440/08RCYC, BIOS A18 05/14/2017 [ 600.351272] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth] [ 600.351276] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01 [ 600.351279] RSP: 0018:ffffa9be839b3b50 EFLAGS: 00010246 [ 600.351282] RAX: ffff9c999ac565a0 RBX: ffff9c9996e98c00 RCX: ffff9c999aa28b60 [ 600.351285] RDX: dead000000000200 RSI: 0000000000000010 RDI: ffff9c999e403500 [ 600.351287] RBP: ffffa9be839b3b70 R08: 0000000000000000 R09: ffffffff92a25c00 [ 600.351290] R10: ffffa9be839b3ae8 R11: 0000000000000001 R12: ffff9c995375b800 [ 600.351292] R13: 0000000000000000 R14: ffff9c99619a5000 R15: ffff9c9962a01c00 [ 600.351295] FS: 00007fb2be27c700(0000) GS:ffff9c999e880000(0000) knlGS:0000000000000000 [ 600.351298] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 600.351300] CR2: 00007fb2bdadbad0 CR3: 000000041c328001 CR4: 00000000001606e0 [ 600.351302] Call Trace: [ 600.351325] smp_failure+0x4f/0x70 [bluetooth] [ 600.351345] smp_cancel_pairing+0x74/0x80 [bluetooth] [ 600.351370] unpair_device+0x1c1/0x330 [bluetooth] [ 600.351399] hci_sock_sendmsg+0x960/0x9f0 [bluetooth] [ 600.351409] ? apparmor_socket_sendmsg+0x1e/0x20 [ 600.351417] sock_sendmsg+0x3e/0x50 [ 600.351422] sock_write_iter+0x85/0xf0 [ 600.351429] do_iter_readv_writev+0x12b/0x1b0 [ 600.351434] do_iter_write+0x87/0x1a0 [ 600.351439] vfs_writev+0x98/0x110 [ 600.351443] ? ep_poll+0x16d/0x3d0 [ 600.351447] ? ep_modify+0x73/0x170 [ 600.351451] do_writev+0x61/0xf0 [ 600.351455] ? do_writev+0x61/0xf0 [ 600.351460] __x64_sys_writev+0x1c/0x20 [ 600.351465] do_syscall_64+0x5a/0x110 [ 600.351471] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 600.351474] RIP: 0033:0x7fb2bdb62fe0 [ 600.351477] Code: 73 01 c3 48 8b 0d b8 6e 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 69 c7 2c 00 00 75 10 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 80 01 00 48 89 04 24 [ 600.351479] RSP: 002b:00007ffe062cb8f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 [ 600.351484] RAX: ffffffffffffffda RBX: 000000000255b3d0 RCX: 00007fb2bdb62fe0 [ 600.351487] RDX: 0000000000000001 RSI: 00007ffe062cb920 RDI: 0000000000000004 [ 600.351490] RBP: 00007ffe062cb920 R08: 000000000255bd80 R09: 0000000000000000 [ 600.351494] R10: 0000000000000353 R11: 0000000000000246 R12: 0000000000000001 [ 600.351497] R13: 00007ffe062cbbe0 R14: 0000000000000000 R15: 0000000000000000 [ 600.351501] Modules linked in: algif_hash algif_skcipher af_alg cmac ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay arc4 nls_iso8859_1 dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp dell_laptop kvm_intel crct10dif_pclmul dell_smm_hwmon crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_rapl_perf uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media hid_multitouch input_leds joydev serio_raw dell_wmi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic dell_smbios dcdbas sparse_keymap [ 600.351569] snd_hda_intel btusb snd_hda_codec btrtl btbcm btintel snd_hda_core bluetooth(OE) snd_hwdep snd_pcm iwlmvm ecdh_generic wmi_bmof dell_wmi_descriptor snd_seq_midi mac80211 snd_seq_midi_event lpc_ich iwlwifi snd_rawmidi snd_seq snd_seq_device snd_timer cfg80211 snd soundcore mei_me mei dell_rbtn dell_smo8800 mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid hid i915 nouveau kvmgt vfio_mdev mdev vfio_iommu_type1 vfio kvm irqbypass i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt mxm_wmi psmouse ahci sdhci_pci cqhci libahci fb_sys_fops sdhci drm e1000e video wmi [ 600.351637] ---[ end trace e49e9f1df09c94fb ]--- [ 600.351664] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth] [ 600.351666] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01 [ 600.351669] RSP: 0018:ffffa9be839b3b50 EFLAGS: 00010246 [ 600.351672] RAX: ffff9c999ac565a0 RBX: ffff9c9996e98c00 RCX: ffff9c999aa28b60 [ 600.351674] RDX: dead000000000200 RSI: 0000000000000010 RDI: ffff9c999e403500 [ 600.351676] RBP: ffffa9be839b3b70 R08: 0000000000000000 R09: ffffffff92a25c00 [ 600.351679] R10: ffffa9be839b3ae8 R11: 0000000000000001 R12: ffff9c995375b800 [ 600.351681] R13: 0000000000000000 R14: ffff9c99619a5000 R15: ffff9c9962a01c00 [ 600.351684] FS: 00007fb2be27c700(0000) GS:ffff9c999e880000(0000) knlGS:0000000000000000 [ 600.351686] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 600.351689] CR2: 00007fb2bdadbad0 CR3: 000000041c328001 CR4: 00000000001606e0 Crash happened because list_del_rcu() was called twice for smp->ltk. This was possible if unpair_device was called right after ltk was generated but before keys were distributed. In this commit smp_cancel_pairing was refactored to cancel pairing if it is in progress and otherwise just removes keys. Once keys are removed from rcu list, pointers to smp context's keys are set to NULL to make sure removed list items are not accessed later. This commit also adjusts the functionality of mgmt unpair_device() little bit. Previously pairing was canceled only if pairing was in state that keys were already generated. With this commit unpair_device() cancels pairing already in earlier states. Bug was found by fuzzing kernel SMP implementation using Synopsys Defensics. Reported-by: Pekka Oikarainen <pekka.oikarainen@synopsys.com> Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2018-09-26	mac80211_hwsim: do not omit multicast announce of first added radio	Martin Willi
	The allocation of hwsim radio identifiers uses a post-increment from 0, so the first radio has idx 0. This idx is explicitly excluded from multicast announcements ever since, but it is unclear why. Drop that idx check and announce the first radio as well. This makes userspace happy if it relies on these events. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	mac80211_hwsim: fix race in radio destruction from netlink notifier	Martin Willi
	The asynchronous destruction from a work-queue of radios tagged with destroy-on-close may race with the owning namespace about to exit, resulting in potential use-after-free of that namespace. Instead of using a work-queue, move radios about to destroy to a temporary list, which can be worked on synchronously after releasing the lock. This should be safe to do from the netlink socket notifier, as the namespace is guaranteed to not get released. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	mac80211_hwsim: fix locking when iterating radios during ns exit	Martin Willi
	The cleanup of radios during namespace exit has recently been reworked to directly delete a radio while temporarily releasing the spinlock, fixing a race condition between the work-queue execution and namespace exits. However, the temporary unlock allows unsafe modifications on the iterated list, resulting in a potential crash when continuing the iteration of additional radios. Move radios about to destroy to a temporary list, and clean that up after releasing the spinlock once iteration is complete. Fixes: 8cfd36a0b53a ("mac80211_hwsim: fix use-after-free bug in hwsim_exit_net") Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT	Masashi Honma
	Use array_index_nospec() to sanitize ridx with respect to speculation. Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	cfg80211: fix reg_query_regdb_wmm kernel-doc	Randy Dunlap
	Drop @ptr from kernel-doc for function reg_query_regdb_wmm(). This function parameter was recently removed so update the kernel-doc to match that and remove the kernel-doc warnings. Removes 109 occurrences of this warning message: ../include/net/cfg80211.h:4869: warning: Excess function parameter 'ptr' description in 'reg_query_regdb_wmm' Fixes: 38cb87ee47fb ("cfg80211: make wmm_rule part of the reg_rule structure") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: linux-wireless@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	mac80211: allocate TXQs for active monitor interfaces	Felix Fietkau
	Monitor mode interfaces with the active flag are passed down to the driver. Drivers using TXQ expect that all interfaces have allocated TXQs before they get added. Fixes: 79af1f866193d ("mac80211: avoid allocating TXQs that won't be used") Cc: stable@vger.kernel.org Reported-by: Catrinel Catrinescu <cc@80211.de> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-09-26	iommu/amd: Return devid as alias for ACPI HID devices	Arindam Nath
	ACPI HID devices do not actually have an alias for them in the IVRS. But dev_data->alias is still used for indexing into the IOMMU device table for devices being handled by the IOMMU. So for ACPI HID devices, we simply return the corresponding devid as an alias, as parsed from IVRS table. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Fixes: 2bf9a0a12749 ('iommu/amd: Add iommu support for ACPI HID devices') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-09-25	net: phy: marvell: Fix build.	David S. Miller
	Local variable 'autoneg' doesn't even exist: drivers/net/phy/marvell.c: In function 'm88e1121_config_aneg': drivers/net/phy/marvell.c:468:25: error: 'autoneg' undeclared (first use in this function); did you mean 'put_net'? if (phydev->autoneg != autoneg \|\| changed) { ^~~~~~~ Fixes: d6ab93364734 ("net: phy: marvell: Avoid unnecessary soft reset") Reported-by:Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	tipc: lock wakeup & inputq at tipc_link_reset()	Parthasarathy Bhuvaragan
	In tipc_link_reset() we copy the wakeup queue to input queue using skb_queue_splice_init(link->wakeupq, link->inputq). This is performed without holding any locks. The lists might be simultaneously be accessed by other cpu threads in tipc_sk_rcv(), something leading to to random missing packets. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	tipc: reset bearer if device carrier not ok	Parthasarathy Bhuvaragan
	If we detect that under lying carrier detects errors and goes down, we reset the bearer. Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	ARM: dts: stm32: update SPI6 dmas property on stm32mp157c	Amelie Delaunay
	Remove unused parameter from SPI6 dmas property on stm32mp157c SoC. Fixes: dc3f8c86c10d ("ARM: dts: stm32: add SPI support on stm32mp157c") Signed-off-by: Amelie Delaunay <amelie.delaunay@st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com> [olof: Without this patch, SPI6 will fall back to interrupt mode with lower perfmance] Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-25	bridge: br_arp_nd_proxy: set icmp6_router if neigh has NTF_ROUTER	Roopa Prabhu
	Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2018-09-25 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow for RX stack hardening by implementing the kernel's flow dissector in BPF. Idea was originally presented at netconf 2017 [0]. Quote from merge commit: [...] Because of the rigorous checks of the BPF verifier, this provides significant security guarantees. In particular, the BPF flow dissector cannot get inside of an infinite loop, as with CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot read outside of packet bounds, because all memory accesses are checked. Also, with BPF the administrator can decide which protocols to support, reducing potential attack surface. Rarely encountered protocols can be excluded from dissection and the program can be updated without kernel recompile or reboot if a bug is discovered. [...] Also, a sample flow dissector has been implemented in BPF as part of this work, from Petar and Willem. [0] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf 2) Add support for bpftool to list currently active attachment points of BPF networking programs providing a quick overview similar to bpftool's perf subcommand, from Yonghong. 3) Fix a verifier pruning instability bug where a union member from the register state was not cleared properly leading to branches not being pruned despite them being valid candidates, from Alexei. 4) Various smaller fast-path optimizations in XDP's map redirect code, from Jesper. 5) Enable to recognize BPF_MAP_TYPE_REUSEPORT_SOCKARRAY maps in bpftool, from Roman. 6) Remove a duplicate check in libbpf that probes for function storage, from Taeung. 7) Fix an issue in test_progs by avoid checking for errno since on success its value should not be checked, from Mauricio. 8) Fix unused variable warning in bpf_getsockopt() helper when CONFIG_INET is not configured, from Anders. 9) Fix a compilation failure in the BPF sample code's use of bpf_flow_keys, from Prashant. 10) Minor cleanups in BPF code, from Yue and Zhong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: dsa: lantiq_gswip: Depend on HAS_IOMEM	Hauke Mehrtens
	The driver uses devm_ioremap_resource() which is only available when CONFIG_HAS_IOMEM is set, make the driver depend on this config option. User mode Linux does not have CONFIG_HAS_IOMEM set and the driver was failing on this architecture. Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	Merge branch 'net-phy-Eliminate-unnecessary-soft'	David S. Miller
	Florian Fainelli says: ==================== net: phy: Eliminate unnecessary soft This patch series eliminates unnecessary software resets of the PHY. This should hopefully not break anybody's hardware; but I would appreciate testing to make sure this is is the case. Sorry for this long email list, I wanted to make sure I reached out to all people who made changes to the Marvell PHY driver. Thank you! Changes since RFT: - added Tested-by tags from Wang, Dongsheng, Andrew, Chris and Clemens ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: phy: marvell: Avoid unnecessary soft reset	Florian Fainelli
	The BMCR.RESET bit on the Marvell PHYs has a special meaning in that it commits the register writes into the HW for it to latch and be configured appropriately. Doing software resets causes link drops, and this is unnecessary disruption if nothing changed. Determine from marvell_set_polarity()'s return code whether the register value was changed and if it was, propagate that to the logic that hits the software reset bit. This avoids doing unnecessary soft reset if the PHY is configured in the same state it was previously, this also eliminates the need for a m88e1111_config_aneg() function since it now is the same as marvell_config_aneg(). Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com> Tested-by: Chris Healy <cphealy@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: phy: Stop with excessive soft reset	Florian Fainelli
	While consolidating the PHY reset in phy_init_hw() an unconditionaly BMCR soft-reset I became quite trigger happy with those. This was later on deactivated for the Generic PHY driver on the premise that a prior software entity (e.g: bootloader) might have applied workarounds in commit 0878fff1f42c ("net: phy: Do not perform software reset for Generic PHY"). Since we have a hook to wire-up a soft_reset callback, just use that and get rid of the call to genphy_soft_reset() entirely. This speeds up initialization and link establishment for most PHYs out there that do not require a reset. Fixes: 87aa9f9c61ad ("net: phy: consolidate PHY reset in phy_init_hw()") Tested-by: Wang, Dongsheng <dongsheng.wang@hxt-semitech.com> Tested-by: Chris Healy <cphealy@gmail.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	Merge branch '40GbE' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-09-25 This series contains updates to i40e and xsk. Mariusz fixes an issue where the VF link state was not being updated properly when the PF is down or up. Also cleaned up the promiscuous configuration during a VF reset. Patryk simplifies the code a bit to use the variables for PF and HW that are declared, rather than using the VSI pointers. Cleaned up the message length parameter to several virtchnl functions, since it was not being used (or needed). Harshitha fixes two potential race conditions when trying to change VF settings by creating a helper function to validate that the VF is enabled and that the VSI is set up. Sergey corrects a double "link down" message by putting in a check for whether or not the link is up or going down. Björn addresses an AF_XDP zero-copy issue that buffers passed from userspace to the kernel was leaked when the hardware descriptor ring was torn down. A zero-copy capable driver picks buffers off the fill ring and places them on the hardware receive ring to be completed at a later point when DMA is complete. Similar on the transmit side; The driver picks buffers off the transmit ring and places them on the transmit hardware ring. In the typical flow, the receive buffer will be placed onto an receive ring (completed to the user), and the transmit buffer will be placed on the completion ring to notify the user that the transfer is done. However, if the driver needs to tear down the hardware rings for some reason (interface goes down, reconfiguration and such), the userspace buffers cannot be leaked. They have to be reused or completed back to userspace. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	Merge branch ↵	David S. Miller
	'Refactor-classifier-API-to-work-with-Qdisc-blocks-without-rtnl-lock' Vlad Buslov says: ==================== Refactor classifier API to work with Qdisc/blocks without rtnl lock Currently, all netlink protocol handlers for updating rules, actions and qdiscs are protected with single global rtnl lock which removes any possibility for parallelism. This patch set is a third step to remove rtnl lock dependency from TC rules update path. Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added. Handlers registered with this flag are called without RTNL taken. End goal is to have rule update handlers(RTM_NEWTFILTER, RTM_DELTFILTER, etc.) to be registered with UNLOCKED flag to allow parallel execution. However, there is no intention to completely remove or split rtnl lock itself. This patch set addresses specific problems in implementation of classifiers API that prevent its control path from being executed concurrently. Additional changes are required to refactor classifiers API and individual classifiers for parallel execution. This patch set lays groundwork to eventually register rule update handlers as rtnl-unlocked by modifying code in cls API that works with Qdiscs and blocks. Following patch set does the same for chains and classifiers. The goal of this change is to refactor tcf_block_find() and its dependencies to allow concurrent execution: - Extend Qdisc API with rcu to lookup and take reference to Qdisc without relying on rtnl lock. - Extend tcf_block with atomic reference counting and rcu. - Always take reference to tcf_block while working with it. - Implement tcf_block_release() to release resources obtained by tcf_block_find() - Create infrastructure to allow registering Qdiscs with class ops that do not require the caller to hold rtnl lock. All three netlink rule update handlers use tcf_block_find() to lookup Qdisc and block, and this patch set introduces additional means of synchronization to substitute rtnl lock in cls API. Some functions in cls and sch APIs have historic names that no longer clearly describe their intent. In order not make this code even more confusing when introducing their concurrency-friendly versions, rename these functions to describe actual implementation. Changes from V2 to V3: - Patch 1: - Explicitly include refcount.h in rtnetlink.h. - Patch 3: - Move rcu_head field to the end of struct Qdisc. - Rearrange local variable declarations in qdisc_lookup_rcu(). - Patch 5: - Remove tcf_qdisc_put() and inline its content to callers. Changes from V1 to V2: - Rebase on latest net-next. - Patch 8 - remove. - Patch 9 - fold into patch 11. - Patch 11: - Rename tcf_block_{get\|put}() to tcf_block_refcnt_{get\|put}(). - Patch 13 - remove. ==================== Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: use reference counting for tcf blocks on rules update	Vlad Buslov
	In order to remove dependency on rtnl lock on rules update path, always take reference to block while using it on rules update path. Change tcf_block_get() error handling to properly release block with reference counting, instead of just destroying it, in order to accommodate potential concurrent users. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: implement tcf_block_refcnt_{get\|put}()	Vlad Buslov
	Implement get/put function for blocks that only take/release the reference and perform deallocation. These functions are intended to be used by unlocked rules update path to always hold reference to block while working with it. They use on new fine-grained locking mechanisms introduced in previous patches in this set, instead of relying on global protection provided by rtnl lock. Extract code that is common with tcf_block_detach_ext() into common function __tcf_block_put(). Extend tcf_block with rcu to allow safe deallocation when it is accessed concurrently. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: protect block idr with spinlock	Vlad Buslov
	Protect block idr access with spinlock, instead of relying on rtnl lock. Take tn->idr_lock spinlock during block insertion and removal. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: implement functions to put and flush all chains	Vlad Buslov
	Extract code that flushes and puts all chains on tcf block to two standalone function to be shared with functions that locklessly get/put reference to block. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: change tcf block reference counter type to refcount_t	Vlad Buslov
	As a preparation for removing rtnl lock dependency from rules update path, change tcf block reference counter type to refcount_t to allow modification by concurrent users. In block put function perform decrement and check reference counter once to accommodate concurrent modification by unlocked users. After this change tcf_chain_put at the end of block put function is called with block->refcnt==0 and will deallocate block after the last chain is released, so there is no need to manually deallocate block in this case. However, if block reference counter reached 0 and there are no chains to release, block must still be deallocated manually. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: use Qdisc rcu API instead of relying on rtnl lock	Vlad Buslov
	As a preparation from removing rtnl lock dependency from rules update path, use Qdisc rcu and reference counting capabilities instead of relying on rtnl lock while working with Qdiscs. Create new tcf_block_release() function, and use it to free resources taken by tcf_block_find(). Currently, this function only releases Qdisc and it is extended in next patches in this series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: add helper function to take reference to Qdisc	Vlad Buslov
	Implement function to take reference to Qdisc that relies on rcu read lock instead of rtnl mutex. Function only takes reference to Qdisc if reference counter isn't zero. Intended to be used by unlocked cls API. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: extend Qdisc with rcu	Vlad Buslov
	Currently, Qdisc API functions assume that users have rtnl lock taken. To implement rtnl unlocked classifiers update interface, Qdisc API must be extended with functions that do not require rtnl lock. Extend Qdisc structure with rcu. Implement special version of put function qdisc_put_unlocked() that is called without rtnl lock taken. This function only takes rtnl lock if Qdisc reference counter reached zero and is intended to be used as optimization. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: sched: rename qdisc_destroy() to qdisc_put()	Vlad Buslov
	Current implementation of qdisc_destroy() decrements Qdisc reference counter and only actually destroy Qdisc if reference counter value reached zero. Rename qdisc_destroy() to qdisc_put() in order for it to better describe the way in which this function currently implemented and used. Extract code that deallocates Qdisc into new private qdisc_destroy() function. It is intended to be shared between regular qdisc_put() and its unlocked version that is introduced in next patch in this series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	net: core: netlink: add helper refcount dec and lock function	Vlad Buslov
	Rtnl lock is encapsulated in netlink and cannot be accessed by other modules directly. This means that reference counted objects that rely on rtnl lock cannot use it with refcounter helper function that atomically releases decrements reference and obtains mutex. This patch implements simple wrapper function around refcount_dec_and_lock that obtains rtnl lock if reference counter value reached 0. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25	blk-mq: Allow blocking queue tag iter callbacks	Keith Busch
	A recent commit runs tag iterator callbacks under the rcu read lock, but existing callbacks do not satisfy the non-blocking requirement. The commit intended to prevent an iterator from accessing a queue that's being modified. This patch fixes the original issue by taking a queue reference instead of reading it, which allows callbacks to make blocking calls. Fixes: f5bbbbe4d6357 ("blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter") Acked-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-25	nvme: properly propagate errors in nvme_mpath_init	Susobhan Dey
	Signed-off-by: Susobhan Dey <susobhan.dey@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-25	dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration	Christoph Hellwig
	The patch adding the infrastructure failed to actually add the symbol declaration, oops.. Fixes: faef87723a ("dma-noncoherent: add a arch_sync_dma_for_cpu_all hook") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paul Burton <paul.burton@mips.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-25	RDMA/core: Set right entry state before releasing reference	Parav Pandit
	Currently add_modify_gid() for IB link layer has followong issue in cache update path. When GID update event occurs, core releases reference to the GID table without updating its state and/or entry pointer. CPU-0 CPU-1 ------ ----- ib_cache_update() IPoIB ULP add_modify_gid() [..] put_gid_entry() refcnt = 0, but state = valid, entry is valid. (work item is not yet executed). ipoib_create_ah() rdma_create_ah() rdma_get_gid_attr() <-- Tries to acquire gid_attr which has refcnt = 0. This is incorrect. GID entry state and entry pointer is provides the accurate GID enty state. Such fields must be updated with rwlock to protect against readers and, such fields must be in sane state before refcount can drop to zero. Otherwise above race condition can happen leading to use-after-free situation. Following backtrace has been observed when cache update for an IB port is triggered while IPoIB ULP is creating an AH. Therefore, when updating GID entry, first mark a valid entry as invalid through state and set the barrier so that no callers can acquired the GID entry, followed by release reference to it. refcount_t: increment on 0; use-after-free. WARNING: CPU: 4 PID: 29106 at lib/refcount.c:153 refcount_inc_checked+0x30/0x50 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] RIP: 0010:refcount_inc_checked+0x30/0x50 RSP: 0018:ffff8802ad36f600 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000008 RDI: ffffffff86710100 RBP: ffff8802d6e60a30 R08: ffffed005d67bf8b R09: ffffed005d67bf8b R10: 0000000000000001 R11: ffffed005d67bf8a R12: ffff88027620cee8 R13: ffff8802d6e60988 R14: ffff8802d6e60a78 R15: 0000000000000202 FS: 0000000000000000(0000) GS:ffff8802eb200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3ab35e5c88 CR3: 00000002ce84a000 CR4: 00000000000006e0 IPv6: ADDRCONF(NETDEV_CHANGE): ib1: link becomes ready Call Trace: rdma_get_gid_attr+0x220/0x310 [ib_core] ? lock_acquire+0x145/0x3a0 rdma_fill_sgid_attr+0x32c/0x470 [ib_core] rdma_create_ah+0x89/0x160 [ib_core] ? rdma_fill_sgid_attr+0x470/0x470 [ib_core] ? ipoib_create_ah+0x52/0x260 [ib_ipoib] ipoib_create_ah+0xf5/0x260 [ib_ipoib] ipoib_mcast_join_complete+0xbbe/0x2540 [ib_ipoib] Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()	Zhao Qiang
	There is a copy and paste bug so we accidentally use the RX_ shift when we're in TX_ mode. Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> (cherry picked from commit 3cb31b634052ed458922e0c8e2b4b093d7fb60b9) Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-25	soc: fsl: qbman: qman: avoid allocating from non existing gen_pool	Alexandre Belloni
	If the qman driver didn't probe, calling qman_alloc_fqid_range, qman_alloc_pool_range or qman_alloc_cgrid_range (as done in dpaa_eth) will pass a NULL pointer to gen_pool_alloc, leading to a NULL pointer dereference. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: Roy Pledge <roy.pledge@nxp.com> Signed-off-by: Li Yang <leoyang.li@nxp.com> (cherry picked from commit f72487a2788aa70c3aee1d0ebd5470de9bac953a) Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-25	Merge tag 'arm-soc/for-4.19/devicetree-fixes' of ↵	Olof Johansson
	https://github.com/Broadcom/stblinux into fixes This pull request contains Broadcom ARM-based SoCs Device Tree changes intended for 4.19, please pull the following: - Florian fixes the PPI and SPI interrupts in the BCM63138 (DSL) SoC DTS * tag 'arm-soc/for-4.19/devicetree-fixes' of https://github.com/Broadcom/stblinux: ARM: dts: BCM63xx: Fix incorrect interrupt specifiers Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-25	IB/mlx5: Destroy the DEVX object upon error flow	Yishai Hadas
	Upon DEVX object creation the object must be destroyed upon a follows error flow. Fixes: 7efce3691d33 ("IB/mlx5: Add obj create and destroy functionality") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	IB/uverbs: Free uapi on destroy	Mark Bloch
	Make sure we free struct uverbs_api once we clean the radix tree. It was allocated by uverbs_alloc_api(). Fixes: 9ed3e5f44772 ("IB/uverbs: Build the specs into a radix tree at runtime") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25	i40e: disallow changing the number of descriptors when AF_XDP is on	Björn Töpel
	When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a user to change the number of descriptors via e.g. "ethtool -G IFNAME". Otherwise, the size of the stash/reuse queue can grow unbounded, which would result in OOM or leaking userspace buffers. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: clean zero-copy XDP Rx ring on shutdown/reset	Björn Töpel
	Outstanding Rx descriptors are temporarily stored on a stash/reuse queue. When/if the HW rings comes up again, entries from the stash are used to re-populate the ring. The latter required some restructuring of the allocation scheme for the AF_XDP zero-copy implementation. There is now a fast, and a slow allocation. The "fast allocation" is used from the fast-path and obtains free buffers from the fill ring and the internal recycle mechanism. The "slow allocation" is only used in ring setup, and obtains buffers from the fill ring and the stash (if any). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	net: xsk: add a simple buffer reuse queue	Jakub Kicinski
	XSK UMEM is strongly single producer single consumer so reuse of frames is challenging. Add a simple "stash" of FILL packets to reuse for drivers to optionally make use of. This is useful when driver has to free (ndo_stop) or resize a ring with an active AF_XDP ZC socket. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: clean zero-copy XDP Tx ring on shutdown/reset	Björn Töpel
	When the zero-copy enabled XDP Tx ring is torn down, due to configuration changes, outstanding frames on the hardware descriptor ring are queued on the completion ring. The completion ring has a back-pressure mechanism that will guarantee that there is sufficient space on the ring. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: Remove unused msglen parameter from virtchnl functions	Patryk Małek
	msglen parameter seems to be unused in several virtchnl function. This patch removes it from signatures of those functions. Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: fix double 'NIC Link is Down' messages	Sergey Nemov
	When isup is false meaning that interface is going to shut down set new speed to 0 to avoid double 'NIC Link is Down' messages. Signed-off-by: Sergey Nemov <sergey.nemov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: add a helper function to validate a VF based on the vf id	Harshitha Ramamurthy
	When we are trying to change VF settings, it is possible for 2 race conditions to happen. One, when the VF is created but not yet enabled. Second, the VF is enabled but the VSI is still not created or not yet re-created in the VF reset flow. This patch introduces a helper function to validate that the VF is enabled and that the VSI is set up. This patch also calls this function from other functions which could get into these race conditions. While we are poking around here, remove unnecessary parenthesis that checkpatch was complaining about. Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: use declared variables for pf and hw	Patryk Małek
	In order to slightly simplify the code use the variables for pf and hw that are declared in i40e_set_mac function. Signed-off-by: Patryk Małek <patryk.malek@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: Unset promiscuous settings on VF reset	Mariusz Stachura
	This patch cleans up promiscuous configuration when a VF reset occurs. Previously the promiscuous mode settings were still there after the VF driver removal. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25	i40e: Fix VF's link state notification	Mariusz Stachura
	This resolves an issue where the VF link state was not being updated when the PF is down or up, and the VF link state would always show that it is running. Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>