git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-12-14	tools/kvm_stat: handle invalid regular expressions	Stefan Raspl
	Passing an invalid regular expression on the command line results in a traceback. Note that interactive specification of invalid regular expressions is not affected To reproduce, run "kvm_stat -f '*'". Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: add hint on '-f help' to man page	Stefan Raspl
	The man page update for this new functionality was omitted. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: fix child trace events accounting	Stefan Raspl
	Child trace events were included in calculation of the overall total, which is used for calculation of the percentages of the '%Total' column. However, the parent trace envents' stats summarize the child trace events, hence we'd incorrectly account for them twice, leading to slightly wrong stats. With this fix, we use the correct total. Consequently, the sum of the child trace events' '%Total' column values is identical to the respective value of the respective parent event. However, this also means that the sum of the '%Total' column values will aggregate to more than 100 percent. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: fix extra handling of 'help' with fields filter	Stefan Raspl
	Commit 67fbcd62f54d ("tools/kvm_stat: add '-f help' to get the available event list") added support for '-f help'. However, the extra handling of 'help' will also take effect when 'help' is specified as a regex in interactive mode via 'f'. This results in display of all events while only those matching this regex should be shown. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: fix missing field update after filter change	Stefan Raspl
	When updating the fields filter, tracepoint events of fields previously not visible were not enabled, as TracepointProvider.update_fields() updated the member variable directly instead of using the setter, which triggers the event enable/disable. To reproduce, run 'kvm_stat -f kvm_exit', press 'c' to remove the filter, and notice that no add'l fields that do not match the regex 'kvm_exit' will appear. This issue was introduced by commit c469117df059 ("tools/kvm_stat: simplify initializers"). Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: fix drilldown in events-by-guests mode	Stefan Raspl
	When displaying debugfs events listed by guests, an attempt to switch to reporting of stats for individual child trace events results in garbled output. Reason is that when toggling drilldown, the update of the stats doesn't honor when events are displayed by guests, as indicated by Tui._display_guests. To reproduce, run 'kvm_stat -d' and press 'b' followed by 'x'. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	tools/kvm_stat: fix command line option '-g'	Stefan Raspl
	Specifying a guest via '-g foo' always results in an error: $ kvm_stat -g foo Usage: kvm_stat [options] kvm_stat: error: Error while searching for guest "foo", use "-p" to specify a pid instead Reason is that Tui.get_pid_from_gname() is not static, as it is supposed to be. Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	kvm: x86: fix WARN due to uninitialized guest FPU state	Peter Xu
	------------[ cut here ]------------ Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers. WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90 CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10 RIP: 0010:ex_handler_fprestore+0x88/0x90 Call Trace: fixup_exception+0x4e/0x60 do_general_protection+0xff/0x270 general_protection+0x22/0x30 RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm] RSP: 0018:ffff8803d5627810 EFLAGS: 00010246 kvm_vcpu_reset+0x3b4/0x3c0 [kvm] kvm_apic_accept_events+0x1c0/0x240 [kvm] kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 do_syscall_64+0x15f/0x600 where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu. To fix it, move kvm_load_guest_fpu to the very beginning of kvm_arch_vcpu_ioctl_run. Cc: stable@vger.kernel.org Fixes: f775b13eedee2f7f3c6fdd4e90fb79090ce5d339 Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	KVM: X86: Fix load RFLAGS w/o the fixed bit	Wanpeng Li
	* Guest State * CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871 CR3 = 0x00000000fffbc000 RSP = 0x0000000000000000 RIP = 0x0000000000000000 RFLAGS=0x00000000 DR7 = 0x0000000000000400 ^^^^^^^^^^ The failed vmentry is triggered by the following testcase when ept=Y: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_regs regs = { .rflags = 0, }; ioctl(r[4], KVM_SET_REGS, &regs); ioctl(r[4], KVM_RUN, 0); } X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1 of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails. This patch fixes it by oring X86_EFLAGS_FIXED during ioctl. Cc: stable@vger.kernel.org Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Quan Xu <quan.xu0@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	KVM: MMU: Fix infinite loop when there is no available mmu page	Wanpeng Li
	The below test case can cause infinite loop in kvm when ept=0. #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); ioctl(r[4], KVM_RUN, 0); } It doesn't setup the memory regions, mmu_alloc_shadow/direct_roots() in kvm return 1 when kvm fails to allocate root page table which can result in beblow infinite loop: vcpu_run() { for (;;) { r = vcpu_enter_guest()::kvm_mmu_reload() returns 1 if (r <= 0) break; if (need_resched()) cond_resched(); } } This patch fixes it by returning -ENOSPC when there is no available kvm mmu page for root page table. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Fixes: 26eeb53cf0f (KVM: MMU: Bail out immediately if there is no available mmu page) Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14	drm/drm_lease: Prevent deadlock in case drm_lease_create() fails	Marius Vlad
	This case can been seen when creating the lease with the same objects passed. [ 605.515097] 2 locks held by testapp/3337: [ 605.519027] #0: (&dev->mode_config.idr_mutex){......}, at: [<ffff0000085f1664>] drm_mode_create_lease_ioctl+0x384/0x858 [ 605.530045] #1: (&dev->mode_config.idr_mutex){......}, at: [<ffff0000085f11bc>] drm_lease_destroy+0x2c/0x110 Which was causing the process to hang: [ 605.398827] [<ffff0000080856cc>] __switch_to+0x94/0xa8 [ 605.404030] [<ffff000008c05d00>] __schedule+0x1b0/0x698 [ 605.409322] [<ffff000008c06224>] schedule+0x3c/0xa8 [ 605.414260] [<ffff000008c06628>] schedule_preempt_disabled+0x20/0x38 [ 605.420677] [<ffff000008c07370>] mutex_lock_nested+0x158/0x340 [ 605.426572] [<ffff0000085f11bc>] drm_lease_destroy+0x2c/0x110 [ 605.432389] [<ffff0000085cecf0>] drm_master_put+0xc0/0xc8 [ 605.437845] [<ffff0000085f175c>] drm_mode_create_lease_ioctl+0x47c/0x858 [ 605.444612] [<ffff0000085d4460>] drm_ioctl+0x198/0x448 [ 605.449811] [<ffff000008201134>] do_vfs_ioctl+0xa4/0x748 [ 605.455192] [<ffff000008201864>] SyS_ioctl+0x8c/0xa0 [ 605.460216] [<ffff000008082f4c>] __sys_trace_return+0x0/0x4 drm_mode_create_lease_ioctl() calls drm_lease_create() which acquires a lock on dev->mode_config.idr_mutex. In case of failure, drm_lease_create() calls drm_master_put() which in turn tries to acquire the same lock when calling drm_lease_destroy(). v2: - Reverse the order at exit in case of fail, so that unlocking takes place before dropping the reference. - Include detail information about deadlock (Daniel Vetter) Signed-off-by: Marius Vlad <marius-cristian.vlad@nxp.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171213181048.32719-1-marius-cristian.vlad@nxp.com
2017-12-13	Merge tag 'xfs-4.15-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds
	Pull xfs fixes from Darrick Wong: "Here are a few more bug fixes & cleanups for 4.15-rc4: - clean up duplicate includes - remove ancient 'no-alloc' crap code that occasionally caused hard fs shutdowns due to lack of proper space reservations - fix regression in FIEMAP behavior when reporting xattr extents" * tag 'xfs-4.15-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: make iomap_begin functions trim iomaps consistently xfs: remove "no-allocation" reservations for file creations fs: xfs: remove duplicate includes
2017-12-13	Merge tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux Pull RISC-V fixes from Palmer Dabbelt: "This contains three small fixes: - A fix to a typo in sys_riscv_flush_icache. This only effects error handling, but I think it's a small and obvious enough change that it's sane outside the merge window. - The addition of smp_mb__after_spinlock(), which was recently removed due to an incorrect comment. This is largly a comment change (as there's a big one now), and while it's necessary for complience with the RISC-V memory model the lack of this fence shouldn't manifest as a bug on current implementations. Nonetheless, it still seems saner to have the fence in 4.15. - The removal of some of the HVC_RISCV_SBI driver that snuck into the arch port. This is compile-time dead code in 4.15 (as the driver isn't in yet), and during the review process we found a better way to implement early printk on RISC-V. While this change doesn't do anything, it will make staging our HVC driver easier: without this change the HVC driver we hope to upstream won't build on 4.15 (because the 4.15 arch code would reference a function that no longer exists). I don't think this is the last patch set we'll want for 4.15: I think I'll want to remove some of the first-level irqchip driver that snuck in as well, which will look a lot like the HVC patch here. This is pending some asm-generic cleanup I'm doing that I haven't quite gotten clean enough to send out yet, though, but hopefully it'll be ready by next week (and still OK for that late)" * tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: RISC-V: Remove unused CONFIG_HVC_RISCV_SBI code RISC-V: Resurrect smp_mb__after_spinlock() RISC-V: Logical vs Bitwise typo
2017-12-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2017-12-13 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Addition of explicit scheduling points to map alloc/free in order to avoid having to hold the CPU for too long, from Eric. 2) Fixing of a corruption in overlapping perf_event_output calls from different BPF prog types on the same CPU out of different contexts, from Daniel. 3) Fallout fixes for recent correction of broken uapi for BPF_PROG_TYPE_PERF_EVENT. um had a missing asm header that needed to be pulled in from asm-generic and for BPF selftests the asm-generic include did not work, so similar asm include scheme was adapted for that problematic header that perf is having with other header files under tools, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	drm: rework delayed connector cleanup in connector_iter	Daniel Vetter
	PROBE_DEFER also uses system_wq to reprobe drivers, which means when that again fails, and we try to flush the overall system_wq (to get all the delayed connectore cleanup work_struct completed), we deadlock. Fix this by using just a single cleanup work, so that we can only flush that one and don't block on anything else. That means a free list plus locking, a standard pattern. v2: - Correctly free connectors only on last ref. Oops (Chris). - use llist_head/node (Chris). v3 - Add init_llist_head (Chris). Fixes: a703c55004e1 ("drm: safely free connectors from connector_iter") Fixes: 613051dac40d ("drm: locking&new iterators for connector_list") Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sean Paul <seanpaul@chromium.org> Cc: <stable@vger.kernel.org> # v4.11+: 613051dac40d ("drm: locking&new iterators for connector_list" Cc: <stable@vger.kernel.org> # v4.11+ Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: David Airlie <airlied@linux.ie> Cc: Javier Martinez Canillas <javier@dowhile0.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Guillaume Tucker <guillaume.tucker@collabora.com> Cc: Mark Brown <broonie@kernel.org> Cc: Kevin Hilman <khilman@baylibre.com> Cc: Matt Hart <matthew.hart@linaro.org> Cc: Thierry Escande <thierry.escande@collabora.co.uk> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Enric Balletbo i Serra <enric.balletbo@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171213124936.17914-1-daniel.vetter@ffwll.ch
2017-12-13	bpf/tracing: fix kernel/events/core.c compilation error	Yonghong Song
	Commit f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp") introduced a perf ioctl command to query prog array attached to the same perf tracepoint. The commit introduced a compilation error under certain config conditions, e.g., (1). CONFIG_BPF_SYSCALL is not defined, or (2). CONFIG_TRACING is defined but neither CONFIG_UPROBE_EVENTS nor CONFIG_KPROBE_EVENTS is defined. Error message: kernel/events/core.o: In function `perf_ioctl': core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array' This patch fixed this error by guarding the real definition under CONFIG_BPF_EVENTS and provided static inline dummy function if CONFIG_BPF_EVENTS was not defined. It renamed the function from bpf_event_query_prog_array to perf_event_query_prog_array and moved the definition from linux/bpf.h to linux/trace_events.h so the definition is in proximity to other prog_array related functions. Fixes: f371b304f12e ("bpf/tracing: allow user space to query prog array on the same tp") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-13	Merge branch 'mlx4-misc-fixes'	David S. Miller
	Tariq Toukan says: ==================== mlx4 misc fixes This patchset contains misc bug fixes from the team to the mlx4 Core and Eth drivers. Patch 1 by Eugenia fixes an MTU issue in selftest. Patch 2 by Eran fixes an accounting issue in the resource tracker. Patch 3 by Eran fixes a race condition that causes counter inconsistency. Series generated against net commit: 200809716aed fou: fix some member types in guehdr v2: Patch 2: Add reviewer credit, rephrase commit message. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net/mlx4_en: Fill all counters under one call of stats lock	Eran Ben Elisha
	Before this patch, the stats_lock was acquired twice. In between the locks Driver sent command to gather some more statistics (per priority and counter statistics). If the stats lock was acquired by get statistics NDO in between we would have report out of sync counters. Fix this by collecting all stats from Firmware in advance and then fill the Software structs under one lock. Fixes: 0b131561a7d6 ("net/mlx4_en: Add Flow control statistics display via ethtool") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net/mlx4_core: Fix wrong calculation of free counters	Eran Ben Elisha
	The field res_free indicates the total number of counters which are available for allocation (reserved and unreserved). Fixed a bug where the reserved counters were subtracted from res_free before any allocation was performed. Before this fix, free counters which were not reserved could not be allocated. Fixes: 9de92c60beaa ("net/mlx4_core: Adjust counter grant policy in the resource tracker") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net/mlx4_en: Fix selftest for small MTUs	Eugenia Emantayev
	Set the minimal MTU threshold for running loopback selftest. MTU should be big enough to include packet payload, NET_IP_ALIGN, Ethernet headers and preamble length. Fixes: e7c1c2c46201 ("mlx4_en: Added self diagnostics test implementation") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: dsa: lan9303: Introduce lan9303_read_wait	Egil Hjelmeland
	Simplify lan9303_indirect_phy_wait_for_completion() and lan9303_switch_wait_for_completion() by using a new function lan9303_read_wait() Changes v1 -> v2: - param 'mask' type u32 - removed param 'value' (will probably never be used) - add newline before return Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: marvell: avoid configuring fiber page for SGMII-to-Copper	Russell King
	When in SGMII-to-Copper mode, the fiber page is used for the MAC facing link, and does not require configuration of the fiber auto-negotiation settings. Avoid trying. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	dwc-xlgmac: Add co-maintainer	Jie Deng
	Jose Abreu will join to maintain dwc-xlgmac. He will help with new feature development for this driver. Thanks Jose and welcome on board! Signed-off-by: Jie Deng <jiedeng@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	tcp: refresh tcp_mstamp from timers callbacks	Eric Dumazet
	Only the retransmit timer currently refreshes tcp_mstamp We should do the same for delayed acks and keepalives. Even if RFC 7323 does not request it, this is consistent to what linux did in the past, when TS values were based on jiffies. Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Mike Maloney <maloney@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Mike Maloney <maloney@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	tcp: fix potential underestimation on rcv_rtt	Wei Wang
	When ms timestamp is used, current logic uses 1us in tcp_rcv_rtt_update() when the real rcv_rtt is within 1 - 999us. This could cause rcv_rtt underestimation. Fix it by always using a min value of 1ms if ms timestamp is used. Fixes: 645f4c6f2ebd ("tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	Merge branch 'hv_netvsc-minor-changes'	David S. Miller
	Stephen Hemminger says: ==================== hv_netvsc: minor changes This includes minor cleanup of code in send and receive path and also a new statistic to check for allocation failures. This also eliminates some of the extra RCU when not needed. There is a theoritical bug where buffered data could be blocked for longer than necessary if the ring buffer got full. This has not been seen in the wild, found by inspection. The reference count between net device and internal RNDIS is not needed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: empty current transmit aggregation if flow blocked	Stephen Hemminger
	If the transmit queue is known full, then don't keep aggregating data. And the cp_partial flag which indicates that the current aggregation buffer is full can be folded in to avoid more conditionals. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: remove open_cnt reference count	Stephen Hemminger
	There is only ever a single instance of network device object referencing the internal rndis object. Therefore the open_cnt atomic is not necessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: pass netvsc_device to receive callback	Stephen Hemminger
	The netvsc_receive_callback function was using RCU to find the appropriate underlying netvsc_device. Since calling function already had that pointer, this was unnecessary. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: simplify function args in receive status path	Stephen Hemminger
	The caller (netvsc_receive) already has the net device pointer, and should just pass that to functions rather than the hyperv device. This eliminates several impossible error paths in the process. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: track memory allocation failures in ethtool stats	Stephen Hemminger
	When skb can not be allocated, update ethtool statisitics rather than rx_dropped which is intended for netif_receive. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	hv_netvsc: copy_to_send buf can be void	Stephen Hemminger
	Since only caller does not care about return value. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	Merge branch 'phylink-dsa-prep'	David S. Miller
	Florian Fainelli says: ==================== PHYLINK preparatory patches for DSA In preparation for having DSA migrate to PHYLINK, I had to come up with a number of preparatory patches: - we need to be able to pass phy_flags from an external component calling phylink_of_phy_connect() - DSA tries to connect through OF first, then fallsback using its own internal MDIO bus, in that case we would both show an error, but also not know what the correct phy_interface_t would be, instead use the PHY device/driver provided one - Finally bcm_sf2 makes use of all possible PHYs out there: internal, external, fixed, and MoCA, the latter requires a bit of help to signal link notifications through a MMIO interrupt, as well a report a correct PORT type Changes in v2: - rebased against latest net-next/master - added kernel doc documentation - dropped error message in phylink_of_phy_connect() as suggested by Russell ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: phylink: Report MoCA as PORT_BNC	Florian Fainelli
	Similarly to what PHYLIB already does, make sure that PHY_INTERFACE_MODE_MOCA is reported as PORT_BNC. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: phylink: Allow setting a custom link state callback	Florian Fainelli
	phylink_get_fixed_state() currently consults an optional "link_gpio" GPIO descriptor, expand this mechanism to allow specifying a custom callback. This is necessary to support out of band link notifcation (e.g: from an interrupt within a MMIO register). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: phylink: Remove error message	Florian Fainelli
	Some subsystems like DSA may be trying to connect to a PHY through OF first, and then attempt a connect using a local MDIO bus, remove the error message: "unable to find PHY node" so we can let MAC drivers whether to print it or not. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: phylink: Use PHY device interface if N/A	Florian Fainelli
	We may not always be able to resolve a correct phy_interface_t value before actually connecting to the PHY device, when that happens, just have phylink_connect_phy() utilize what the PHY device/driver provided. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: phylink: Allow specifying PHY device flags	Florian Fainelli
	In order to let subsystems like DSA fully utilize PHYLINK, we need to be able to communicate phy_device::flags from of_phy_{connect,attach} even when using PHYLINK APIs. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	tcp: pause Fast Open globally after third consecutive timeout	Yuchung Cheng
	Prior to this patch, active Fast Open is paused on a specific destination IP address if the previous connections to the IP address have experienced recurring timeouts . But recent experiments by Microsoft (https://goo.gl/cykmn7) and Mozilla browsers indicate the isssue is often caused by broken middle-boxes sitting close to the client. Therefore it is much better user experience if Fast Open is disabled out-right globally to avoid experiencing further timeouts on connections toward other destinations. This patch changes the destination-IP disablement to global disablement if a connection experiencing recurring timeouts or aborts due to timeout. Repeated incidents would still exponentially increase the pause time, starting from an hour. This is extremely conservative but an unfortunate compromise to minimize bad experience due to broken middle-boxes. Reported-by: Dragana Damjanovic <ddamjanovic@mozilla.com> Reported-by: Patrick McManus <mcmanus@ducksong.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Wei Wang <weiwan@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: ethernet: ti: cpdma: correct error handling for chan create	Ivan Khoronzhuk
	It's not correct to return NULL when that is actually an error and function returns errors in any other wrong case. In the same time, the cpsw driver and davinci emac doesn't check error case while creating channel and it can miss actual error. Also remove WARNs replacing them on dev_err msgs. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	cxgb4: Add support for ethtool i2c dump	Arjun Vynipadath
	Adds support for ethtool get_module_info() and get_module_eeprom() callbacks that will dump necessary information for a SFP. Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	skge: remove redundunt free_irq under spinlock	Stephen Hemminger
	The code to handle multi-port SKGE boards was freeing IRQ twice. The first one was under lock and might sleep. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: avoid skb_warn_bad_offload on IS_ERR	Willem de Bruijn
	skb_warn_bad_offload warns when packets enter the GSO stack that require skb_checksum_help or vice versa. Do not warn on arbitrary bad packets. Packet sockets can craft many. Syzkaller was able to demonstrate another one with eth_type games. In particular, suppress the warning when segmentation returns an error, which is for reasons other than checksum offload. See also commit 36c92474498a ("net: WARN if skb_checksum_help() is called on skb requiring segmentation") for context on this warning. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: sk_pacing_shift_update() helper	Eric Dumazet
	In commit 3a9b76fd0db9 ("tcp: allow drivers to tweak TSQ logic") I gave a code sample to set sk->sk_pacing_shift that was not complete. Better add a helper that can be used by drivers without worries, and maybe amended in the future. A wifi driver might use it from its ndo_start_xmit() Following call would setup TCP to allow up to ~8ms of queued data per flow. sk_pacing_shift_update(skb->sk, 7); Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: bridge: use rhashtable for fdbs	Nikolay Aleksandrov
	Before this patch the bridge used a fixed 256 element hash table which was fine for small use cases (in my tests it starts to degrade above 1000 entries), but it wasn't enough for medium or large scale deployments. Modern setups have thousands of participants in a single bridge, even only enabling vlans and adding a few thousand vlan entries will cause a few thousand fdbs to be automatically inserted per participating port. So we need to scale the fdb table considerably to cope with modern workloads, and this patch converts it to use a rhashtable for its operations thus improving the bridge scalability. Tests show the following results (10 runs each), at up to 1000 entries rhashtable is ~3% slower, at 2000 rhashtable is 30% faster, at 3000 it is 2 times faster and at 30000 it is 50 times faster. Obviously this happens because of the properties of the two constructs and is expected, rhashtable keeps pretty much a constant time even with 10000000 entries (tested), while the fixed hash table struggles considerably even above 10000. As a side effect this also reduces the net_bridge struct size from 3248 bytes to 1344 bytes. Also note that the key struct is 8 bytes. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: meson-gxl: make function meson_gxl_read_status static	Colin Ian King
	The function meson_gxl_read_status is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: symbol 'meson_gxl_read_status' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: marvell10g: remove XGMII as an option for 88x3310	Russell King
	Remove XGMII as an option for the 88x3310 PHY driver, as the PHY doesn't support XGMII's 32-bit data lanes. It supports USXGMII, which is not XGMII, but a single-lane serdes interface - see https://developer.cisco.com/site/usgmii-usxgmii/ Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	of_mdio / mdiobus: ensure mdio devices have fwnode correctly populated	Russell King
	Ensure that all mdio devices populate the struct device fwnode pointer as well as the of_node pointer to allow drivers that wish to use fwnode APIs to work. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	net: phy: fix resume handling	Russell King
	When a PHY has the BMCR_PDOWN bit set, it may decide to ignore writes to other registers, or reset the registers to power-on defaults. Micrel PHYs do this for their interrupt registers. The current structure of phylib tries to enable interrupts before resuming (and releasing) the BMCR_PDOWN bit. This fails, causing Micrel PHYs to stop working after a suspend/resume sequence if they are using interrupts. Fix this by ensuring that the PHY driver resume methods do not take the phydev->lock mutex themselves, but the callers of phy_resume() take that lock. This then allows us to move the call to phy_resume() before we enable interrupts in phy_start(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-13	ARM: dts: vf610-zii-dev: use XAUI for DSA link ports	Russell King
	Use XAUI rather than XGMII for DSA link ports, as this is the interface mode that the switches actually use. XAUI is the 4 lane bus with clock per direction, whereas XGMII is a 32 bit bus with clock. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>