linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2019-05-23	net: phy: lxt: Add suspend/resume support to LXT971 and LXT973.	Christophe Leroy
	All LXT PHYs implement the standard "power down" bit 11 of BMCR, so this patch adds support using the generic genphy_{suspend,resume} functions added by commit 0f0ca340e57b ("phy: power management support"). LXT970 is left aside because all registers get cleared upon "power down" exit. Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23	devlink: add warning in case driver does not set port type	Jiri Pirko
	Prevent misbehavior of drivers who would not set port type for longer period of time. Drivers should always set port type. Do WARN if that happens. Note that it is perfectly fine to temporarily not have the type set, during initialization and port type change. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23	net: mvpp2: cls: Fix leaked ethtool_rx_flow_rule	Maxime Chevallier
	The flow_rule is only used when configuring the classification tables, and should be free'd once we're done using it. The current code only frees it in the error path. Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23	docs: fix multiple doc build warnings in enumeration.rst	Jonathan Corbet
	The conversion of acpi/enumeration.txt to RST included one markup error, leading to many warnings like: .../firmware-guide/acpi/enumeration.rst:430: WARNING: Unexpected indentation. Add the missing colon and create some peace. Fixes: c24bc66e8157 ("Documentation: ACPI: move enumeration.txt to firmware-guide/acpi and convert to reST") Cc: Changbin Du <changbin.du@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-23	lib/list_sort: fix kerneldoc build error	Jonathan Corbet
	Commit 043b3f7b6388 ("lib/list_sort: simplify and remove MAX_LIST_LENGTH_BITS") added some useful kerneldoc info, but also broke the docs build: ./lib/list_sort.c:128: WARNING: Definition list ends without a blank line; unexpected unindent. ./lib/list_sort.c:161: WARNING: Unexpected indentation. ./lib/list_sort.c:162: WARNING: Block quote ends without a blank line; unexpected unindent. Fix the offending literal block and make the error go away. Fixes: 043b3f7b6388 ("lib/list_sort: simplify and remove MAX_LIST_LENGTH_BITS") Cc: George Spelvin <lkml@sdf.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-23	docs: fix numaperf.rst and add it to the doc tree	Jonathan Corbet
	Commit 13bac55ef7ae ("doc/mm: New documentation for memory performance") added numaperf.rst, but did not add it to the TOC tree. There was also an incorrectly marked literal block leading to this warning sequence: numaperf.rst:24: WARNING: Unexpected indentation. numaperf.rst:24: WARNING: Inline substitution_reference start-string without end-string. numaperf.rst:25: WARNING: Block quote ends without a blank line; unexpected unindent. Fix the block and add the file to the document tree. Fixes: 13bac55ef7ae ("doc/mm: New documentation for memory performance") Cc: Keith Busch <keith.busch@intel.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-23	doc: Cope with the deprecation of AutoReporter	Jonathan Corbet
	AutoReporter is going away; recent versions of sphinx emit a warning like: Documentation/sphinx/kerneldoc.py:125: RemovedInSphinx20Warning: AutodocReporter is now deprecated. Use sphinx.util.docutils.switch_source_input() instead. Make the switch. But switch_source_input() only showed up in 1.7, so we have to do ugly version checks to keep things working in older versions. Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-23	doc: Cope with Sphinx logging deprecations	Jonathan Corbet
	Recent versions of sphinx will emit messages like: Documentation/sphinx/kerneldoc.py:103: RemovedInSphinx20Warning: app.warning() is now deprecated. Use sphinx.util.logging instead. Switch to sphinx.util.logging to make this unsightly message go away. Alas, that interface was only added in version 1.6, so we have to add a version check to keep things working with older sphinxes. Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-05-23	Merge tag 'docs-5.2-fixes' of git://git.lwn.net/linux	Linus Torvalds
	Pull documentation fixes from Jonathan Corbet: "A handful of fixes for a docs build problem, along with catching the spdxcheck.py script up with the current state of affairs" * tag 'docs-5.2-fixes' of git://git.lwn.net/linux: Documentation: kdump: fix minor typo scripts/spdxcheck.py: Add dual license subdirectory scripts/spdxcheck.py: Fix path to deprecated licenses counter: fix Documentation build error due to incorrect source file name
2019-05-23	arm64: Handle erratum 1418040 as a superset of erratum 1188873	Marc Zyngier
	We already mitigate erratum 1188873 affecting Cortex-A76 and Neoverse-N1 r0p0 to r2p0. It turns out that revisions r0p0 to r3p1 of the same cores are affected by erratum 1418040, which has the same workaround as 1188873. Let's expand the range of affected revisions to match 1418040, and repaint all occurences of 1188873 to 1418040. Whilst we're there, do a bit of reformating in silicon-errata.txt and drop a now unnecessary dependency on ARM_ARCH_TIMER_OOL_WORKAROUND. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	arm64/module: deal with ambiguity in PRELxx relocation ranges	Ard Biesheuvel
	The R_AARCH64_PREL16 and R_AARCH64_PREL32 relocations are documented as permitting a range of [-2^15 .. 2^16), resp. [-2^31 .. 2^32). It is also documented that this means we cannot detect overflow in some cases, which is bad. Since we always interpret the targets of these relocations as signed quantities (e.g., in the ksymtab handling code), let's tighten the overflow checks so that targets that are out of range for our signed interpretation of the relocated quantity get flagged. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	Merge branch 'bpf-jmp-seq-limit'	Daniel Borkmann
	Alexei Starovoitov says: ==================== Patch 1 - jmp sequence limit Patch 2 - improve existing tests Patch 3 - add pyperf-based realistic bpf program that takes advantage of higher limit and use it as a stress test v1->v2: fixed nit in patch 3. added Andrii's acks ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-23	selftests/bpf: add pyperf scale test	Alexei Starovoitov
	Add a snippet of pyperf bpf program used to collect python stack traces as a scale test for the verifier. At 189 loop iterations llvm 9.0 starts ignoring '#pragma unroll' and generates partially unrolled loop instead. Hence use 50, 100, and 180 loop iterations to stress test. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-23	selftests/bpf: adjust verifier scale test	Alexei Starovoitov
	Adjust scale tests to check for new jmp sequence limit. BPF_JGT had to be changed to BPF_JEQ because the verifier was too smart. It tracked the known safe range of R0 values and pruned the search earlier before hitting exact 8192 limit. bpf_semi_rand_get() was too (un)?lucky. k = 0; was missing in bpf_fill_scale2. It was testing a bit shorter sequence of jumps than intended. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-23	bpf: bump jmp sequence limit	Alexei Starovoitov
	The limit of 1024 subsequent jumps was causing otherwise valid programs to be rejected. Bump it to 8192 and make the error more verbose. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-23	ACPI/IORT: Fix build error when IOMMU_SUPPORT is disabled	Lorenzo Pieralisi
	If IOMMU_SUPPORT is not enabled (and therefore IOMMU_API is not selected), struct iommu_fwspec is an empty struct and IOMMU_FWSPEC_PCI_RC_ATS is not defined, resulting in the following compilation errors: drivers/acpi/arm64/iort.c: In function iort_iommu_configure: drivers/acpi/arm64/iort.c:1079:21: error: struct iommu_fwspec has no member named flag: dev->iommu_fwspec->flags \|= IOMMU_FWSPEC_PCI_RC_ATS; ^~ drivers/acpi/arm64/iort.c:1079:32: error: IOMMU_FWSPEC_PCI_RC_ATS undeclared (first use in this function) dev->iommu_fwspec->flags \|= IOMMU_FWSPEC_PCI_RC_ATS; ^~~~~~~~~~~~~~~~~~~~~~~ drivers/acpi/arm64/iort.c:1079:32: note: each undeclared identifier is reported only once for each function it appears in Move iort_iommu_configure() (and the helpers functions it relies on) into CONFIG_IOMMU_API preprocessor guarded code so that when CONFIG_IOMMU_SUPPORT is not enabled we prevent compiling code that is basically equivalent to no-OP, fixing the build errors. Cc: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/20190515034253.79348-1-wangkefeng.wang@huawei.com/ Fixes: 5702ee24182f ("ACPI/IORT: Check ATS capability in root complex nodes") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	arm64/kernel: kaslr: reduce module randomization range to 2 GB	Ard Biesheuvel
	The following commit 7290d5809571 ("module: use relative references for __ksymtab entries") updated the ksymtab handling of some KASLR capable architectures so that ksymtab entries are emitted as pairs of 32-bit relative references. This reduces the size of the entries, but more importantly, it gets rid of statically assigned absolute addresses, which require fixing up at boot time if the kernel is self relocating (which takes a 24 byte RELA entry for each member of the ksymtab struct). Since ksymtab entries are always part of the same module as the symbol they export, it was assumed at the time that a 32-bit relative reference is always sufficient to capture the offset between a ksymtab entry and its target symbol. Unfortunately, this is not always true: in the case of per-CPU variables, a per-CPU variable's base address (which usually differs from the actual address of any of its per-CPU copies) is allocated in the vicinity of the ..data.percpu section in the core kernel (i.e., in the per-CPU reserved region which follows the section containing the core kernel's statically allocated per-CPU variables). Since we randomize the module space over a 4 GB window covering the core kernel (based on the -/+ 4 GB range of an ADRP/ADD pair), we may end up putting the core kernel out of the -/+ 2 GB range of 32-bit relative references of module ksymtab entries that refer to per-CPU variables. So reduce the module randomization range a bit further. We lose 1 bit of randomization this way, but this is something we can tolerate. Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	arm64: errata: Add workaround for Cortex-A76 erratum #1463225	Will Deacon
	Revisions of the Cortex-A76 CPU prior to r4p0 are affected by an erratum that can prevent interrupts from being taken when single-stepping. This patch implements a software workaround to prevent userspace from effectively being able to disable interrupts. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	arm64: Remove useless message during oops	Will Deacon
	During an oops, we print the name of the current task and its pid twice. We also helpfully advertise its stack limit as "0x(____ptrval____)". Drop these useless messages. Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23	ALSA: hda/realtek - Set default power save node to 0	Kailang Yang
	I measured power consumption between power_save_node=1 and power_save_node=0. It's almost the same. Codec will enter to runtime suspend and suspend. That pin also will enter to D3. Don't need to enter to D3 by single pin. So, Disable power_save_node as default. It will avoid more issues. Windows Driver also has not this option at runtime PM. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-05-22	ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST	Eric Dumazet
	ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST Fixes: 3580d04aa674 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23	Merge branch 'drm-fixes-5.2' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-fixes Fixes for 5.2: - Fix for DMCU firmware issues for stable - Add missing polaris10 pci id to kfd - Screen corruption fix on picasso - Fix for driver reload on vega10 - SR-IOV fixes - Locking fix in new SMU code - Compute profile switching fix for KFD Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190522205425.3657-1-alexander.deucher@amd.com
2019-05-23	Merge tag 'drm-misc-fixes-2019-05-22' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes - sun4i fixes to hdmi phy as well as u16 overflow in dsi (left from -next-fixes) - gma500 fix to make lvds detection more reliable - select devfreq for panfrost since it can't probe without it Signed-off-by: Dave Airlie <airlied@redhat.com> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20190522194440.GA22359@art_vandelay
2019-05-23	Merge branch 'vmwgfx-fixes-5.2' of ↵	Dave Airlie
	git://people.freedesktop.org/~thomash/linux into drm-fixes A set of misc fixes for various issues that have surfaced recently. All Cc'd stable except the dma iterator fix which shouldn't really cause any real issues on older kernels. Signed-off-by: Dave Airlie <airlied@redhat.com> From: "Thomas Hellstrom (VMware)" <thomas@shipmail.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190522115408.33185-1-thomas@shipmail.org
2019-05-22	libbpf: emit diff of mismatched public API, if any	Andrii Nakryiko
	It's easy to have a mismatch of "intended to be public" vs really exposed API functions. While Makefile does check for this mismatch, if it actually occurs it's not trivial to determine which functions are accidentally exposed. This patch dumps out a diff showing what's not supposed to be exposed facilitating easier fixing. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-22	ipv4/igmp: fix another memory leak in igmpv3_del_delrec()	Eric Dumazet
	syzbot reported memory leaks [1] that I have back tracked to a missing cleanup from igmpv3_del_delrec() when (im->sfmode != MCAST_INCLUDE) Add ip_sf_list_clear_all() and kfree_pmc() helpers to explicitely handle the cleanups before freeing. [1] BUG: memory leak unreferenced object 0xffff888123e32b00 (size 64): comm "softirq", pid 0, jiffies 4294942968 (age 8.010s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 e0 00 00 01 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000006105011b>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000006105011b>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000006105011b>] slab_alloc mm/slab.c:3326 [inline] [<000000006105011b>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<000000004bba8073>] kmalloc include/linux/slab.h:547 [inline] [<000000004bba8073>] kzalloc include/linux/slab.h:742 [inline] [<000000004bba8073>] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline] [<000000004bba8073>] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085 [<00000000a46a65a0>] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475 [<000000005956ca89>] do_ip_setsockopt.isra.0+0x1795/0x1930 net/ipv4/ip_sockglue.c:957 [<00000000848e2d2f>] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246 [<00000000b9db185c>] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616 [<000000003028e438>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130 [<0000000015b65589>] __sys_setsockopt+0x98/0x120 net/socket.c:2078 [<00000000ac198ef0>] __do_sys_setsockopt net/socket.c:2089 [inline] [<00000000ac198ef0>] __se_sys_setsockopt net/socket.c:2086 [inline] [<00000000ac198ef0>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086 [<000000000a770437>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301 [<00000000d3adb93b>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 9c8bb163ae78 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	Merge branch 'bnxt_en-Bug-fixes'	David S. Miller
	Michael Chan says: =================== bnxt_en: Bug fixes. There are 4 driver fixes in this series: 1. Fix RX buffer leak during OOM condition. 2. Call pci_disable_msix() under correct conditions to prevent hitting BUG. 3. Reduce unneeded mmeory allocation in kdump kernel to prevent OOM. 4. Don't read device serial number on VFs because it is not supported. Please queue #1, #2, #3 for -stable as well. Thanks. =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	bnxt_en: Device serial number is supported only for PFs.	Vasundhara Volam
	Don't read DSN on VFs that do not have the PCI capability. Fixes: 03213a996531 ("bnxt: move bp->switch_id initialization to PF probe") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	bnxt_en: Reduce memory usage when running in kdump kernel.	Michael Chan
	Skip RDMA context memory allocations, reduce to 1 ring, and disable TPA when running in the kdump kernel. Without this patch, the driver fails to initialize with memory allocation errors when running in a typical kdump kernel. Fixes: cf6daed098d1 ("bnxt_en: Increase context memory allocations on 57500 chips for RDMA.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	bnxt_en: Fix possible BUG() condition when calling pci_disable_msix().	Michael Chan
	When making configuration changes, the driver calls bnxt_close_nic() and then bnxt_open_nic() for the changes to take effect. A parameter irq_re_init is passed to the call sequence to indicate if IRQ should be re-initialized. This irq_re_init parameter needs to be included in the bnxt_reserve_rings() call. bnxt_reserve_rings() can only call pci_disable_msix() if the irq_re_init parameter is true, otherwise it may hit BUG() because some IRQs may not have been freed yet. Fixes: 41e8d7983752 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	bnxt_en: Fix aggregation buffer leak under OOM condition.	Michael Chan
	For every RX packet, the driver replenishes all buffers used for that packet and puts them back into the RX ring and RX aggregation ring. In one code path where the RX packet has one RX buffer and one or more aggregation buffers, we missed recycling the aggregation buffer(s) if we are unable to allocate a new SKB buffer. This leads to the aggregation ring slowly running out of buffers over time. Fix it by properly recycling the aggregation buffers. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Reported-by: Rakesh Hemnani <rhemnani@fb.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	hv_sock: perf: loop in send() to maximize bandwidth	Sunil Muthuswamy
	Currently, the hv_sock send() iterates once over the buffer, puts data into the VMBUS channel and returns. It doesn't maximize on the case when there is a simultaneous reader draining data from the channel. In such a case, the send() can maximize the bandwidth (and consequently minimize the cpu cycles) by iterating until the channel is found to be full. Perf data: Total Data Transfer: 10GB/iteration Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket reader Packet size: 64KB CPU sys time was captured using the 'time' command for the writer to send 10GB of data. 'Send Buffer Loop' is with the patch applied. The values below are over 10 iterations. \|--------------------------------------------------------\| \| \| Current \| Send Buffer Loop \| \|--------------------------------------------------------\| \| \| Throughput \| CPU sys \| Throughput \| CPU sys \| \| \| (MB/s) \| time (s) \| (MB/s) \| time (s) \| \|--------------------------------------------------------\| \| Min \| 407 \| 7.048 \| 401 \| 5.958 \| \|--------------------------------------------------------\| \| Max \| 455 \| 7.563 \| 542 \| 6.993 \| \|--------------------------------------------------------\| \| Avg \| 440 \| 7.411 \| 451 \| 6.639 \| \|--------------------------------------------------------\| \| Median \| 446 \| 7.417 \| 447 \| 6.761 \| \|--------------------------------------------------------\| Observation: 1. The avg throughput doesn't really change much with this change for this scenario. This is most probably because the bottleneck on throughput is somewhere else. 2. The average system (or kernel) cpu time goes down by 10%+ with this change, for the same amount of data transfer. Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	hv_sock: perf: Allow the socket buffer size options to influence the actual ↵	Sunil Muthuswamy
	socket buffers Currently, the hv_sock buffer size is static and can't scale to the bandwidth requirements of the application. This change allows the applications to influence the socket buffer sizes using the SO_SNDBUF and the SO_RCVBUF socket options. Few interesting points to note: 1. Since the VMBUS does not allow a resize operation of the ring size, the socket buffer size option should be set prior to establishing the connection for it to take effect. 2. Setting the socket option comes with the cost of that much memory being reserved/allocated by the kernel, for the lifetime of the connection. Perf data: Total Data Transfer: 1GB Single threaded reader/writer Results below are summarized over 10 iterations. Linux hvsocket writer + Windows hvsocket reader: \|---------------------------------------------------------------------------------------------\| \|Packet size -> \| 128B \| 1KB \| 4KB \| 64KB \| \|---------------------------------------------------------------------------------------------\| \|SO_SNDBUF size \| \| Throughput in MB/s (min/max/avg/median): \| \| v \| \| \|---------------------------------------------------------------------------------------------\| \| Default \| 109/118/114/116 \| 636/774/701/700 \| 435/507/480/476 \| 410/491/462/470 \| \| 16KB \| 110/116/112/111 \| 575/705/662/671 \| 749/900/854/869 \| 592/824/692/676 \| \| 32KB \| 108/120/115/115 \| 703/823/767/772 \| 718/878/850/866 \| 1593/2124/2000/2085 \| \| 64KB \| 108/119/114/114 \| 592/732/683/688 \| 805/934/903/911 \| 1784/1943/1862/1843 \| \|---------------------------------------------------------------------------------------------\| Windows hvsocket writer + Linux hvsocket reader: \|---------------------------------------------------------------------------------------------\| \|Packet size -> \| 128B \| 1KB \| 4KB \| 64KB \| \|---------------------------------------------------------------------------------------------\| \|SO_RCVBUF size \| \| Throughput in MB/s (min/max/avg/median): \| \| v \| \| \|---------------------------------------------------------------------------------------------\| \| Default \| 69/82/75/73 \| 313/343/333/336 \| 418/477/446/445 \| 659/701/676/678 \| \| 16KB \| 69/83/76/77 \| 350/401/375/382 \| 506/548/517/516 \| 602/624/615/615 \| \| 32KB \| 62/83/73/73 \| 471/529/496/494 \| 830/1046/935/939 \| 944/1180/1070/1100 \| \| 64KB \| 64/70/68/69 \| 467/533/501/497 \| 1260/1590/1430/1431 \| 1605/1819/1670/1660 \| \|---------------------------------------------------------------------------------------------\| Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv6: Fix redirect with VRF	David Ahern
	IPv6 redirect is broken for VRF. __ip6_route_redirect walks the FIB entries looking for an exact match on ifindex. With VRF the flowi6_oif is updated by l3mdev_update_flow to the l3mdev index and the FLOWI_FLAG_SKIP_NH_OIF set in the flags to tell the lookup to skip the device match. For redirects the device match is requires so use that flag to know when the oif needs to be reset to the skb device index. Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4/igmp: shrink struct ip_sf_list	Eric Dumazet
	Removing two 4 bytes holes allows to use kmalloc-32 kmem cache instead of kmalloc-64 on 64bit kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	neighbor: Add tracepoint to __neigh_create	David Ahern
	Add tracepoint to __neigh_create to enable debugging of new entries. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	selftests: pmtu: Simplify cleanup and namespace names	David Ahern
	The point of the pause-on-fail argument is to leave the setup as is after a test fails to allow a user to debug why it failed. Move the cleanup after posting the result to the user to make it so. Random names for the namespaces are not user friendly when trying to debug a failure. Make them simpler and more direct for the tests. Run cleanup at the beginning to ensure they are cleaned up if they already exist. Remove cleanup_done. There is no harm in doing cleanup twice; just ignore any errors related to not existing - which is already done. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	selftests: fib-onlink: Make quiet by default	David Ahern
	Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by default. Add getopt parsing of inputs and support for -v (verbose) and -p (pause on fail). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	net: Set strict_start_type for routes and rules	David Ahern
	New userspace on an older kernel can send unknown and unsupported attributes resulting in an incompelete config which is almost always wrong for routing (few exceptions are passthrough settings like the protocol that installed the route). Set strict_start_type in the policies for IPv4 and IPv6 routes and rules to detect new, unsupported attributes and fail the route add. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	Merge branch 'net-Export-functions-for-nexthop-code'	David S. Miller
	David Ahern says: ==================== net: Export functions for nexthop code This set exports ipv4 and ipv6 fib functions for use by the nexthop code. It also adds new ones to send route notifications if a nexthop configuration changes. v2 - repost of patches dropped at the end of the last dev window added patch 8 which exports nh_update_mtu since it is inline with the other patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4: Rename and export nh_update_mtu	David Ahern
	Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the nexthop code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4: export fib_info_update_nh_saddr	David Ahern
	Add scope as input argument versus relying on fib_info reference in fib_nh, and export fib_info_update_nh_saddr. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4: export fib_flush	David Ahern
	As nexthops are deleted, fib entries referencing it are marked dead. Export fib_flush so those entries can be removed in a timely manner. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4: export fib_check_nh	David Ahern
	Change fib_check_nh to take net, table and scope as input arguments over struct fib_config and export for use by nexthop code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv4: Add function to send route updates	David Ahern
	Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE notifications with NLM_F_REPLACE set for entries linked to a fib_info that have nh_updated flag set. This helper will be used by the nexthop code to notify userspace of routes that are impacted when a nexthop config is updated via replace. The new function and its helper are similar to how fib_flush and fib_table_flush work for address delete and link down events. This notification is needed for legacy apps that do not understand the new nexthop object. Apps that are nexthop aware can use the RTA_NH_ID attribute in the route notification to just ignore it. In the future this should be wrapped in a sysctl to allow OS'es that are fully updated to avoid the notificaton storm. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv6: export function to send route updates	David Ahern
	Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This helper will be used by the nexthop code to notify userspace of routes that are impacted when a nexthop config is updated via replace. This notification is needed for legacy apps that do not understand the new nexthop object. Apps that are nexthop aware can use the RTA_NH_ID attribute in the route notification to just ignore it. In the future this should be wrapped in a sysctl to allow OS'es that are fully updated to avoid the notificaton storm. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv6: Add hook to bump sernum for a route to stubs	David Ahern
	Add hook to ipv6 stub to bump the sernum up to the root node for a route. This is needed by the nexthop code when a nexthop config changes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	ipv6: Add delete route hook to stubs	David Ahern
	Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop code to remove entries linked to a nexthop that is getting deleted. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	Merge branch 'net-phy-T1-support'	David S. Miller
	Andrew Lunn says: ==================== net: phy: T1 support T1 PHYs make use of a single twisted pair, rather than the traditional 2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in the list of PHY features used by current T1 drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-22	net: phy: Make phy_basic_t1_features use base100t1.	Andrew Lunn
	Now that there is a link mode for 100BaseT1, use it in phy_basic_t1_features so T1 PHY drivers will indicate this mode via the Ethtool API. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>