summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-03-07l2tp: do not accept arbitrary socketsEric Dumazet
syzkaller found an issue caused by lack of sufficient checks in l2tp_tunnel_create() RAW sockets can not be considered as UDP ones for instance. In another patch, we shall replace all pr_err() by less intrusive pr_debug() so that syzkaller can find other bugs faster. Acked-by: Guillaume Nault <g.nault@alphalink.fr> Acked-by: James Chapman <jchapman@katalix.com> ================================================================== BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 dst_release: dst:00000000d53d0d0f refcnt:-1 Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242 CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23b/0x360 mm/kasan/report.c:412 __asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435 setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69 l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596 pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707 SYSC_connect+0x213/0x4a0 net/socket.c:1640 SyS_connect+0x24/0x30 net/socket.c:1621 do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07net: Fix hlist corruptions in inet_evict_bucket()Kirill Tkhai
inet_evict_bucket() iterates global list, and several tasks may call it in parallel. All of them hash the same fq->list_evictor to different lists, which leads to list corruption. This patch makes fq be hashed to expired list only if this has not been made yet by another task. Since inet_frag_alloc() allocates fq using kmem_cache_zalloc(), we may rely on list_evictor is initially unhashed. The problem seems to exist before async pernet_operations, as there was possible to have exit method to be executed in parallel with inet_frags::frags_work, so I add two Fixes tags. This also may go to stable. Fixes: d1fe19444d82 "inet: frag: don't re-use chainlist for evictor" Fixes: f84c6821aa54 "net: Convert pernet_subsys, registered from inet_init()" Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07net: smsc911x: Fix unload crash when link is upJeremy Linton
The smsc911x driver will crash if it is rmmod'ed while the netdev is up like: Call trace: phy_detach+0x94/0x150 phy_disconnect+0x40/0x50 smsc911x_stop+0x104/0x128 [smsc911x] __dev_close_many+0xb4/0x138 dev_close_many+0xbc/0x190 rollback_registered_many+0x140/0x460 rollback_registered+0x68/0xb0 unregister_netdevice_queue+0x100/0x118 unregister_netdev+0x28/0x38 smsc911x_drv_remove+0x58/0x130 [smsc911x] platform_drv_remove+0x30/0x50 device_release_driver_internal+0x15c/0x1f8 driver_detach+0x54/0x98 bus_remove_driver+0x64/0xe8 driver_unregister+0x34/0x60 platform_driver_unregister+0x20/0x30 smsc911x_cleanup_module+0x14/0xbca8 [smsc911x] SyS_delete_module+0x1e8/0x238 __sys_trace_return+0x0/0x4 This is caused by the mdiobus being unregistered/free'd and the code in phy_detach() attempting to manipulate mdio related structures from unregister_netdev() calling close() To fix this, we delay the mdiobus teardown until after the netdev is deregistered. Reported-by: Matt Sealey <matt.sealey@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routesStefano Brivio
Currently, administrative MTU changes on a given netdevice are not reflected on route exceptions for MTU-less routes, with a set PMTU value, for that device: # ip -6 route get 2001:db8::b 2001:db8::b from :: dev vti_a proto kernel src 2001:db8::a metric 256 pref medium # ping6 -c 1 -q -s10000 2001:db8::b > /dev/null # ip netns exec a ip -6 route get 2001:db8::b 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0 cache expires 571sec mtu 4926 pref medium # ip link set dev vti_a mtu 3000 # ip -6 route get 2001:db8::b 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0 cache expires 571sec mtu 4926 pref medium # ip link set dev vti_a mtu 9000 # ip -6 route get 2001:db8::b 2001:db8::b from :: dev vti_a src 2001:db8::a metric 0 cache expires 571sec mtu 4926 pref medium The first issue is that since commit fb56be83e43d ("net-ipv6: on device mtu change do not add mtu to mtu-less routes") we don't call rt6_exceptions_update_pmtu() from rt6_mtu_change_route(), which handles administrative MTU changes, if the regular route is MTU-less. However, PMTU exceptions should be always updated, as long as RTAX_MTU is not locked. Keep the check for MTU-less main route, as introduced by that commit, but, for exceptions, call rt6_exceptions_update_pmtu() regardless of that check. Once that is fixed, one problem remains: MTU changes are not reflected if the new MTU is higher than the previous one, because rt6_exceptions_update_pmtu() doesn't allow that. We should instead allow PMTU increase if the old PMTU matches the local MTU, as that implies that the old MTU was the lowest in the path, and PMTU discovery might lead to different results. The existing check in rt6_mtu_change_route() correctly took that case into account (for regular routes only), so factor it out and re-use it also in rt6_exceptions_update_pmtu(). While at it, fix comments style and grammar, and try to be a bit more descriptive. Reported-by: Xiumei Mu <xmu@redhat.com> Fixes: fb56be83e43d ("net-ipv6: on device mtu change do not add mtu to mtu-less routes") Fixes: f5bbe7ee79c2 ("ipv6: prepare rt6_mtu_change() for exception table") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07net: qcom/emac: Use proper free methods during TXHemanth Puranik
This patch fixes the warning messages/call traces seen if DMA debug is enabled, In case of fragmented skb's memory was allocated using dma_map_page but freed using dma_unmap_single. This patch modifies buffer allocations in TX path to use dma_map_page in all the places and dma_unmap_page while freeing the buffers. Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org> Acked-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07qed: Free RoCE ILT Memory on rmmod qedrMichal Kalderon
Rdma requires ILT Memory to be allocated for it's QPs. Each ILT entry points to a page used by several Rdma QPs. To avoid allocating all the memory in advance, the rdma implementation dynamically allocates memory as more QPs are added, however it does not dynamically free the memory. The memory should have been freed on rmmod qedr, but isn't. This patch adds the memory freeing on rmmod qedr (currently it will be freed with qed is removed). An outcome of this bug, is that if qedr is unloaded and loaded without unloaded qed, there will be no more RoCE traffic. The reason these are related, is that the logic of detecting the first QP ever opened is by asking whether ILT memory for RoCE has been allocated. In addition, this patch modifies freeing of the Task context to always use the PROTOCOLID_ROCE and not the protocol passed, this is because task context for iWARP and ROCE both use the ROCE protocol id, as opposed to the connection context. Fixes: dbb799c39717 ("qed: Initialize hardware for new protocols") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-03-05 This series contains fixes to e1000e only. Benjamin Poirier provides all but one fix in this series, starting with workaround for a VMWare e1000e emulation issue where ICR reads 0x0 on the emulated device. Partially reverted a previous commit dealing with the "Other" interrupt throttling to avoid unforeseen fallout from these changes that are not strictly necessary. Restored the ICS write for receive and transmit queue interrupts in the case that txq or rxq bits were set in ICR and the Other interrupt handler read and cleared ICR before the queue interrupt was raised. Fixed an bug where interrupts may be missed if ICR is read while INT_ASSERTED is not set, so avoid the problem by setting all bits related to events that can trigger the Other interrupt in IMS. Fixed the return value for check_for_link() when auto-negotiation is off. Pierre-Yves Kerbrat fixes e1000e to use dma_zalloc_coherent() to make sure the ring is memset to 0 to prevent the area from containing garbage. v2: added an additional e1000e fix to the series ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07PCI: dwc: Fix enumeration end when reaching root subordinateKoen Vandeputte
The subordinate value indicates the highest bus number which can be reached downstream though a certain device. Commit a20c7f36bd3d ("PCI: Do not allocate more buses than available in parent") ensures that downstream devices cannot assign busnumbers higher than the upstream device subordinate number, which was indeed illogical. By default, dw_pcie_setup_rc() inits the Root Complex subordinate to a value of 0x01. Due to this combined with above commit, enumeration stops digging deeper downstream as soon as bus num 0x01 has been assigned, which is always the case for a bridge device. This results in all devices behind a bridge bus remaining undetected, as these would be connected to bus 0x02 or higher. Fix this by initializing the RC to a subordinate value of 0xff, which is not altering hardware behaviour in any way, but informs probing function pci_scan_bridge() later on which reads this value back from register. The following nasty errors during boot are also fixed by this: pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01] (conflicts with (null) [bus 01]) ... pci_bus 0000:03: [bus 03] partially hidden behind bridge 0000:01 [bus 01] ... pci_bus 0000:04: [bus 04] partially hidden behind bridge 0000:01 [bus 01] ... pci_bus 0000:05: [bus 05] partially hidden behind bridge 0000:01 [bus 01] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 05 pci_bus 0000:02: busn_res: can not insert [bus 02-05] under [bus 01] (conflicts with (null) [bus 01]) pci_bus 0000:02: [bus 02-05] partially hidden behind bridge 0000:01 [bus 01] Fixes: a20c7f36bd3d ("PCI: Do not allocate more buses than available in parent") Tested-by: Niklas Cassel <niklas.cassel@axis.com> Tested-by: Fabio Estevam <fabio.estevam@nxp.com> Tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Signed-off-by: Koen Vandeputte <koen.vandeputte@ncentric.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Lucas Stach <l.stach@pengutronix.de> Cc: stable@vger.kernel.org # v4.15+ Cc: Binghui Wang <wangbinghui@hisilicon.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jianguo Sun <sunjianguo1@huawei.com> Cc: Jingoo Han <jingoohan1@gmail.com> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Minghuan Lian <minghuan.Lian@freescale.com> Cc: Mingkai Hu <mingkai.hu@freescale.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Pratyush Anand <pratyush.anand@gmail.com> Cc: Richard Zhu <hongxing.zhu@nxp.com> Cc: Roy Zang <tie-fei.zang@freescale.com> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: Stanimir Varbanov <svarbanov@mm-sol.com> Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Xiaowei Song <songxiaowei@hisilicon.com> Cc: Zhou Wang <wangzhou1@hisilicon.com>
2018-03-07net: usbnet: fix potential deadlock on 32bit hostsEric Dumazet
Marek reported a LOCKDEP issue occurring on 32bit host, that we tracked down to the fact that usbnet could either run from soft or hard irqs. This patch adds u64_stats_update_begin_irqsave() and u64_stats_update_end_irqrestore() helpers to solve this case. [ 17.768040] ================================ [ 17.772239] WARNING: inconsistent lock state [ 17.776511] 4.16.0-rc3-next-20180227-00007-g876c53a7493c #453 Not tainted [ 17.783329] -------------------------------- [ 17.787580] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ 17.793607] swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 17.798751] (&syncp->seq#5){?.-.}, at: [<9b22e5f0>] asix_rx_fixup_internal+0x188/0x288 [ 17.806790] {IN-HARDIRQ-W} state was registered at: [ 17.811677] tx_complete+0x100/0x208 [ 17.815319] __usb_hcd_giveback_urb+0x60/0xf0 [ 17.819770] xhci_giveback_urb_in_irq+0xa8/0x240 [ 17.824469] xhci_td_cleanup+0xf4/0x16c [ 17.828367] xhci_irq+0xe74/0x2240 [ 17.831827] usb_hcd_irq+0x24/0x38 [ 17.835343] __handle_irq_event_percpu+0x98/0x510 [ 17.840111] handle_irq_event_percpu+0x1c/0x58 [ 17.844623] handle_irq_event+0x38/0x5c [ 17.848519] handle_fasteoi_irq+0xa4/0x138 [ 17.852681] generic_handle_irq+0x18/0x28 [ 17.856760] __handle_domain_irq+0x6c/0xe4 [ 17.860941] gic_handle_irq+0x54/0xa0 [ 17.864666] __irq_svc+0x70/0xb0 [ 17.867964] arch_cpu_idle+0x20/0x3c [ 17.871578] arch_cpu_idle+0x20/0x3c [ 17.875190] do_idle+0x144/0x218 [ 17.878468] cpu_startup_entry+0x18/0x1c [ 17.882454] start_kernel+0x394/0x400 [ 17.886177] irq event stamp: 161912 [ 17.889616] hardirqs last enabled at (161912): [<7bedfacf>] __netdev_alloc_skb+0xcc/0x140 [ 17.897893] hardirqs last disabled at (161911): [<d58261d0>] __netdev_alloc_skb+0x94/0x140 [ 17.904903] exynos5-hsi2c 12ca0000.i2c: tx timeout [ 17.906116] softirqs last enabled at (161904): [<387102ff>] irq_enter+0x78/0x80 [ 17.906123] softirqs last disabled at (161905): [<cf4c628e>] irq_exit+0x134/0x158 [ 17.925722]. [ 17.925722] other info that might help us debug this: [ 17.933435] Possible unsafe locking scenario: [ 17.933435]. [ 17.940331] CPU0 [ 17.942488] ---- [ 17.944894] lock(&syncp->seq#5); [ 17.948274] <Interrupt> [ 17.950847] lock(&syncp->seq#5); [ 17.954386]. [ 17.954386] *** DEADLOCK *** [ 17.954386]. [ 17.962422] no locks held by swapper/0/0. Fixes: c8b5d129ee29 ("net: usbnet: support 64bit stats") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07sch_netem: fix skb leak in netem_enqueue()Alexey Kodanev
When we exceed current packets limit and we have more than one segment in the list returned by skb_gso_segment(), netem drops only the first one, skipping the rest, hence kmemleak reports: unreferenced object 0xffff880b5d23b600 (size 1024): comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s) hex dump (first 32 bytes): 00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000d8a19b9d>] __alloc_skb+0xc9/0x520 [<000000001709b32f>] skb_segment+0x8c8/0x3710 [<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830 [<00000000c921cba1>] inet_gso_segment+0x476/0x1370 [<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510 [<000000002182660a>] __skb_gso_segment+0x1dd/0x620 [<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem] [<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120 [<00000000fc5f7327>] ip_finish_output2+0x998/0xf00 [<00000000d309e9d3>] ip_output+0x1aa/0x2c0 [<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670 [<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0 [<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540 [<0000000013d06d02>] tasklet_action+0x1ca/0x250 [<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3 [<00000000e7ed027c>] irq_exit+0x1e2/0x210 Fix it by adding the rest of the segments, if any, to skb 'to_free' list. Add new __qdisc_drop_all() and qdisc_drop_all() functions because they can be useful in the future if we need to drop segmented GSO packets in other places. Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2018-03-05 Here are a few more Bluetooth fixes for the 4.16 kernel: - btusb: reset/resume fixes for Yoga 920 and Dell OptiPlex 3060 - Fix for missing encryption refresh with the Security Manager protocol ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07fsl/fman: avoid sleeping in atomic context while adding an addressDenis Kirjanov
__dev_mc_add grabs an adress spinlock so use atomic context in kmalloc. / # ifconfig eth0 inet 192.168.0.111 [ 89.331622] BUG: sleeping function called from invalid context at mm/slab.h:420 [ 89.339002] in_atomic(): 1, irqs_disabled(): 0, pid: 1035, name: ifconfig [ 89.345799] 2 locks held by ifconfig/1035: [ 89.349908] #0: (rtnl_mutex){+.+.}, at: [<(ptrval)>] devinet_ioctl+0xc0/0x8a0 [ 89.357258] #1: (_xmit_ETHER){+...}, at: [<(ptrval)>] __dev_mc_add+0x28/0x80 [ 89.364520] CPU: 1 PID: 1035 Comm: ifconfig Not tainted 4.16.0-rc3-dirty #8 [ 89.371464] Call Trace: [ 89.373908] [e959db60] [c066f948] dump_stack+0xa4/0xfc (unreliable) [ 89.380177] [e959db80] [c00671d8] ___might_sleep+0x248/0x280 [ 89.385833] [e959dba0] [c01aec34] kmem_cache_alloc_trace+0x174/0x320 [ 89.392179] [e959dbd0] [c04ab920] dtsec_add_hash_mac_address+0x130/0x240 [ 89.398874] [e959dc00] [c04a9d74] set_multi+0x174/0x1b0 [ 89.404093] [e959dc30] [c04afb68] dpaa_set_rx_mode+0x68/0xe0 [ 89.409745] [e959dc40] [c057baf8] __dev_mc_add+0x58/0x80 [ 89.415052] [e959dc60] [c060fd64] igmp_group_added+0x164/0x190 [ 89.420878] [e959dca0] [c060ffa8] ip_mc_inc_group+0x218/0x460 [ 89.426617] [e959dce0] [c06120fc] ip_mc_up+0x3c/0x190 [ 89.431662] [e959dd10] [c0607270] inetdev_event+0x250/0x620 [ 89.437227] [e959dd50] [c005f190] notifier_call_chain+0x80/0xf0 [ 89.443138] [e959dd80] [c0573a74] __dev_notify_flags+0x54/0xf0 [ 89.448964] [e959dda0] [c05743f8] dev_change_flags+0x48/0x60 [ 89.454615] [e959ddc0] [c0606744] devinet_ioctl+0x544/0x8a0 [ 89.460180] [e959de10] [c060987c] inet_ioctl+0x9c/0x1f0 [ 89.465400] [e959de80] [c05479a8] sock_ioctl+0x168/0x460 [ 89.470708] [e959ded0] [c01cf3ec] do_vfs_ioctl+0xac/0x8c0 [ 89.476099] [e959df20] [c01cfc40] SyS_ioctl+0x40/0xc0 [ 89.481147] [e959df40] [c0011318] ret_from_syscall+0x0/0x3c [ 89.486715] --- interrupt: c01 at 0x1006943c [ 89.486715] LR = 0x100c45ec Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Acked-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07Merge branch 'rhltable-dups'David S. Miller
Paul Blakey says: ==================== rhashtable: Fix rhltable duplicates insertion On our mlx5 driver fs_core.c, we use the rhltable interface to store flow groups. We noticed that sometimes we get a warning that flow group isn't found at removal. This rare case was caused when a specific scenario happened, insertion of a flow group with a similar match criteria (a duplicate), but only where the flow group rhash_head was second (or not first) on the relevant rhashtable bucket list. The first patch fixes it, and the second one adds a test that show it is now working. Paul. v3 --> v2 changes: * added missing fix in rhashtable_lookup_one code path as well. v1 --> v2 changes: * Changed commit messages to better reflect the change ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07test_rhashtable: add test case for rhltable with duplicate objectsPaul Blakey
Tries to insert duplicates in the middle of bucket's chain: bucket 1: [[val 21 (tid=1)]] -> [[ val 1 (tid=2), val 1 (tid=0) ]] Reuses tid to distinguish the elements insertion order. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07rhashtable: Fix rhlist duplicates insertionPaul Blakey
When inserting duplicate objects (those with the same key), current rhlist implementation messes up the chain pointers by updating the bucket pointer instead of prev next pointer to the newly inserted node. This causes missing elements on removal and travesal. Fix that by properly updating pprev pointer to point to the correct rhash_head next pointer. Issue: 1241076 Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7 Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface') Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07ARM: ux500: Fix PMU IRQ regressionLinus Walleij
Commit 2b05f6ae1ee5 ("ARM: ux500: remove PMU IRQ bouncer") deleted some code to bounce and work around the weird PMU IRQs in the DB8500 ASIC, but did a semantic mistake: since the auxdata was now unused, the call to of_platform_populate() was removed, but this does not work: the default platform population will only kick in if .init_machine() is assigned NULL, and since the U8540 was still using the callback that was not the case. Fix this by reinstating the call to of_platform_populate(), but pass NULL as auxdata. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 2b05f6ae1ee5 ("ARM: ux500: remove PMU IRQ bouncer") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-07dt-bindings: net: renesas-ravb: Make stream buffer optionalGeert Uytterhoeven
The Stream Buffer for EtherAVB-IF (STBE) is an optional component, and is not present on all SoCs. Document this in the DT bindings, including a list of SoCs that do have it. Fixes: 785ec87483d1e24a ("ravb: document R8A77970 bindings") Fixes: f231c4178a655b09 ("dt-bindings: net: renesas-ravb: Add support for R8A77995 RAVB") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-07Merge remote-tracking branches 'regulator/fix/resume' and ↵Mark Brown
'regulator/fix/stm32-vfrefbuf' into regulator-linus
2018-03-07brcmfmac: fix P2P_DEVICE ethernet address generationArend Van Spriel
The firmware has a requirement that the P2P_DEVICE address should be different from the address of the primary interface. When not specified by user-space, the driver generates the MAC address for the P2P_DEVICE interface using the MAC address of the primary interface and setting the locally administered bit. However, the MAC address of the primary interface may already have that bit set causing the creation of the P2P_DEVICE interface to fail with -EBUSY. Fix this by using a random address instead to determine the P2P_DEVICE address. Cc: stable@vger.kernel.org # 3.10.y Reported-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-03-07brcmfmac: add possibility to obtain firmware errorArend Van Spriel
The feature module needs to evaluate the actual firmware error return upon a control command. This adds a flag to struct brcmf_if that the caller can set. This flag is checked to determine the error code that needs to be returned. Fixes: b69c1df47281 ("brcmfmac: separate firmware errors from i/o errors") Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-03-07fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in ↵Peter Malone
sbusfb_ioctl_helper(). Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper(). 'index' is defined as an int in sbusfb_ioctl_helper(). We retrieve this from the user: if (get_user(index, &c->index) || __get_user(count, &c->count) || __get_user(ured, &c->red) || __get_user(ugreen, &c->green) || __get_user(ublue, &c->blue)) return -EFAULT; and then we use 'index' in the following way: red = cmap->red[index + i] >> 8; green = cmap->green[index + i] >> 8; blue = cmap->blue[index + i] >> 8; This is a classic information leak vulnerability. 'index' should be an unsigned int, given its usage above. This patch is straight-forward; it changes 'index' to unsigned int in two switch-cases: FBIOGETCMAP_SPARC && FBIOPUTCMAP_SPARC. This patch fixes CVE-2018-6412. Signed-off-by: Peter Malone <peter.malone@gmail.com> Acked-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2018-03-07Merge tag 'v4.16-rc4' of ↵Bartlomiej Zolnierkiewicz
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into fbdev-for-next Linux 4.16-rc4
2018-03-07ovl: update Kconfig textsMiklos Szeredi
Add some hints about overlayfs kernel config options. Enabling NFS export by default is especially recommended against, as it incurs a performance penalty even if the filesystem is not actually exported. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-03-07Revert "nvme: create 'slaves' and 'holders' entries for hidden controllers"Christoph Hellwig
This reverts commit e9a48034d7d1318ece7d4a235838a86c94db9d68. The slaves and holders link for the hidden gendisks confuse lsblk so that it errors out on, or doesn't report the nvme multipath devices. Given that we don't need holder relationships for something that can't even be directly accessed we should just stop creating those links. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: stable@vger.kernel.org Signed-off-by: Keith Busch <keith.busch@intel.com>
2018-03-07xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_protoYossi Kuperman
Artem Savkov reported that commit 5efec5c655dd leads to a packet loss under IPSec configuration. It appears that his setup consists of a TUN device, which does not have a MAC header. Make sure MAC header exists. Note: TUN device sets a MAC header pointer, although it does not have one. Fixes: 5efec5c655dd ("xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version") Reported-by: Artem Savkov <artem.savkov@gmail.com> Tested-by: Artem Savkov <artem.savkov@gmail.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-03-07Merge tag 'perf-urgent-for-mingo-4.16-20180306' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Be more robust when drawing arrows in the annotation TUI, avoiding a segfault when jump instructions have as a target addresses in functions other that the one currently being annotated. The full fix will come in the following days, when jumping to other functions will work as call instructions (Arnaldo Carvalho de Melo) - Prevent auxtrace_queues__process_index() from queuing AUX area data for decoding when the --no-itrace option has been used (Adrian Hunter) - Sync copy of kvm UAPI headers and x86's cpufeatures.h (Arnaldo Carvalho de Melo) - Fix 'perf stat' CSV output format for non-supported counters (Ilya Pronin) - Fix crash in 'perf record|perf report' pipe mode (Jiri Olsa) - Fix annoying 'perf top' overwrite fallback message on older kernels (Kan Liang) - Fix the usage on the 'perf kallsyms' man page (Sangwon Hong) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/entry/64/compat: Save one instruction in entry_INT80_compat()Dominik Brodowski
As %rdi is never user except in the following push, there is no need to restore %rdi to the original value. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/entry: Do not special-case clone(2) in compat entryDominik Brodowski
With the CPU renaming registers on its own, and all the overhead of the syscall entry/exit, it is doubtful whether the compiled output of mov %r8, %rax mov %rcx, %r8 mov %rax, %rcx jmpq sys_clone is measurably slower than the hand-crafted version of xchg %r8, %rcx So get rid of this special case. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls: Use COMPAT_SYSCALL_DEFINEx() macros for x86-only compat syscallsDominik Brodowski
While at it, convert declarations of type "unsigned" to "unsigned int". Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls: Use proper syscall definition for sys_ioperm()Dominik Brodowski
Using SYSCALL_DEFINEx() is recommended, so use it also here. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/entry: Remove stale syscall prototypeDominik Brodowski
sys32_vm86_warning() is long gone. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07x86/syscalls/32: Simplify $entry == $compat entriesDominik Brodowski
If the compat entry point is equivalent to the native entry point, it does not need to be specified explicitly. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: luto@amacapital.net Cc: viro@zeniv.linux.org.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-07objtool: Fix 32-bit buildJosh Poimboeuf
Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit host. This also simplifies read_retpoline_hints() a bit and makes its implementation similar to most of the other annotation reading functions. Reported-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: b5bc2231b8ad ("objtool: Add retpoline validation") Link: http://lkml.kernel.org/r/2ca46c636c23aa9c9d57d53c75de4ee3ddf7a7df.1520380691.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-06RDMA/bnxt_re: Avoid Hard lockup during error CQE processingSelvin Xavier
Hitting the following hardlockup due to a race condition in error CQE processing. [26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req [26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa [26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 [26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace [26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded [26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, [26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] [26156.472639] Call Trace: [26156.475379] <NMI> [<ffffffff98d0d722>] dump_stack+0x19/0x1b [26156.481833] [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140 [26156.489341] [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100 [26156.496256] [<ffffffff98787c24>] perf_event_overflow+0x14/0x20 [26156.502887] [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510 [26156.509813] [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50 [26156.516738] [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150 [26156.523273] [<ffffffff98d17be8>] do_nmi+0x218/0x460 [26156.528834] [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e [26156.534980] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.543268] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.551556] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.559842] <EOE> [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf [26156.567555] [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30 [26156.573696] [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re] [26156.581789] [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re] [26156.589493] [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re] The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to put the error QP in flush list. Since SQ and RQ of each error qp are added to two different flush list, we need to protect it using locks of corresponding CQs. Difference in order of acquiring the lock in SQ poll_cq and RQ poll_cq can cause a hard lockup. Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock. Instead of this lock, introduces qplib_cq.flush_lock to handle addition/deletion of QPs in flush list. Also, always invoke the flush_lock in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential deadlock. Other than the poll_cq context, the movement of QP to/from flush list can be done in modify_qp context or from an async error event from HW. Synchronize these operations using the bnxt_re verbs layer CQ locks. To achieve this, adds a call back to the HW abstraction layer(qplib) to bnxt_re ib_verbs layer in case of async error event. Also, removes the buddy cq functions as it is no longer required. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06RDMA/core: Reduce poll batch for direct cq pollingMax Gurtovoy
Fix warning limit for kernel stack consumption: drivers/infiniband/core/cq.c: In function 'ib_process_cq_direct': drivers/infiniband/core/cq.c:78:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Using smaller ib_wc array on the stack brings us comfortably below that limit again. Fixes: 246d8b184c10 ("IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct") Reported-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sergey Gorenko <sergeygo@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()Dan Carpenter
"err" is either zero or possibly uninitialized here. It should be -EINVAL. Fixes: 427c1e7bcd7e ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06IB/mlx5: When not in dual port RoCE mode, use provided port as nativeMark Bloch
The series that introduced dual port RoCE mode assumed that we don't have a dual port HCA that use the mlx5 driver, this is not the case for Connect-IB HCAs. This reasoning led to assigning 1 as the native port index which causes issue when the second port is used. For example query_pkey() when called on the second port will return values of the first port. Make sure that we assign the right port index as the native port index. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06IB/mlx4: Include GID type when deleting GIDs from HW table under RoCEJack M
The commit cited below added a gid_type field (RoCEv1 or RoCEv2) to GID properties. When adding GIDs, this gid_type field was copied over to the hardware gid table. However, when deleting GIDs, the gid_type field was not copied over to the hardware gid table. As a result, when running RoCEv2, all RoCEv2 gids in the hardware gid table were set to type RoCEv1 when any gid was deleted. This problem would persist until the next gid was added (which would again restore the gid_type field for all the gids in the hardware gid table). Fix this by copying over the gid_type field to the hardware gid table when deleting gids, so that the gid_type of all remaining gids is preserved when a gid is deleted. Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDsJack Morgenstein
When using IPv4 addresses in RoCEv2, the GID format for the mapped IPv4 address should be: ::ffff:<4-byte IPv4 address>. In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords zeroed out by memset, which resulted in deleting the ffff field. However, since procedure ipv6_addr_v4mapped() already verifies that the gid has format ::ffff:<ipv4 address>, no change is needed for the gid, and the memset can simply be removed. Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06scsi: sd: Keep disk read-only when re-reading partitionJeremy Cline
If the read-only flag is true on a SCSI disk, re-reading the partition table sets the flag back to false. To observe this bug, you can run: 1. blockdev --setro /dev/sda 2. blockdev --rereadpt /dev/sda 3. blockdev --getro /dev/sda This commit reads the disk's old state and combines it with the device disk-reported state rather than unconditionally marking it as RW. Reported-by: Li Ning <lining916740672@icloud.com> Signed-off-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-06scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failureBill Kuzeja
Because of the shifting around of code in qla2x00_probe_one recently, failures during adapter initialization can lead to problems, i.e. NULL pointer crashes and doubly freed data structures which cause eventual panics. This V2 version makes the relevant memory free routines idempotent, so repeat calls won't cause any harm. I also removed the problematic probe_init_failed exit point as it is not needed. Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure") Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-06scsi: sd_zbc: Fix potential memory leakDamien Le Moal
Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early return if sd_zbc_report_zones() fails. Reported-by: David.butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-06scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIMHannes Reinecke
The firmware event workqueue should not be marked as WQ_MEM_RECLAIM as it's doesn't need to make forward progress under memory pressure. In the current state it will result in a deadlock if the device had been forcefully removed. Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-06RDMA/qedr: Fix iWARP write and send with immediateKalderon, Michal
iWARP does not support RDMA WRITE or SEND with immediate data. Driver should check this before submitting to FW and return an immediate error Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06RDMA/qedr: Fix kernel panic when running fio over NFSoRDMAKalderon, Michal
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock, leading to a case where two context's accessing poll_cq at the same time lead to one of them having a pointer to an old latest_cqe and reading an invalid cqe element Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06RDMA/qedr: Fix iWARP connect with port mapperKalderon, Michal
Fix iWARP connect and listen to use the mapped port for ipv4 and ipv6. Without this fixed, running on a server that has iwpmd enabled will not use the correct port Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06RDMA/qedr: Fix ipv6 destination address resolutionKalderon, Michal
The wrong parameter was passed to dst_neigh_lookup Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06dm table: allow upgrade from bio-based to specialized bio-based variantMike Snitzer
In practice this is really only meaningful in the context of the DM multipath target (which uses dm_table_set_type() to set the type of device DM should create via its "queue_mode" option). So this change allows a DM multipath device with "queue_mode bio" to be upgraded from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED -- iff the underlying device(s) are NVMe. DM_TYPE_NVME_BIO_BASED is just a DM core implementation detail that allows for NVMe-specific optimizations (e.g. use direct_make_request instead of generic_make_request). If in the future there is no benefit or need to distinguish NVMe vs not: then it will be removed. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-03-06dm mpath: remove unnecessary NVMe branching in favor of scsi_dh checksMike Snitzer
This eliminates the "queue_mode" configuration's "nvme" mode. There wasn't anything NVMe-specific about that mode. It was named "nvme" because it was a short name for the mode. But the entire point of the mode was to optimize the multipath target for underlying devices that are _not_ SCSI-based. Devices that aren't SCSI have no need for the various SCSI device handler (scsi_dh) specific code in DM multipath. But rather than narrowly define this scsi_dh vs not branching in terms of "nvme": invert the logic so that we're just checking whether a multipath device is layered on SCSI devices with scsi_dh attached. This allows any future storage technology to avoid scsi_dh specific code in the multipath target too. Fixes: 848b8aefd4 ("dm mpath: optimize NVMe bio-based support") Suggested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-03-06dm table: fix "nvme" testMikulas Patocka
The strncmp function should compare 4 bytes. Fixes: 22c11858e8002 ("dm: introduce DM_TYPE_NVME_BIO_BASED") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>