linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-07-31	cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms	Srinivas Pandruvada
	Dynamic boosting of HWP performance on IO wake showed significant improvement to IO workloads. This series was intended for Skylake Xeon platforms only and feature was enabled by default based on CPU model number. But some Xeon platforms reused the Skylake desktop CPU model number. This caused some undesirable side effects to some graphics workloads. Since they are heavily IO bound, the increase in CPU performance decreased the power available for GPU to do its computing and hence decrease in graphics benchmark performance. For example on a Skylake desktop, GpuTest benchmark showed average FPS reduction from 529 to 506. This change makes sure that HWP boost feature is only enabled for Skylake server platforms by using ACPI FADT preferred PM Profile. If some desktop users wants to get benefit of boost, they can still enable boost from intel_pstate sysfs attribute "hwp_dynamic_boost". Fixes: 41ab43c9c89e (cpufreq: intel_pstate: enable boost for Skylake Xeon) Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410 Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-31	Merge branch 'acpi-soc'	Rafael J. Wysocki
	Merge a fix for hibernation regression in the ACPI driver for Intel SoCs (LPSS). * acpi-soc: ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
2018-07-31	Merge tag 'perf-urgent-for-mingo-4.18-20180730' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Update the tools copy of several files, including perf_event.h, powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and x86's memcpy_64.s (used in 'perf bench mem'), silencing the respective warnings during the perf tools build. - Fix the build on the alpine:edge distro. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-31	perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices	Kan Liang
	Masayoshi Mizuma reported that a warning message is shown while a CPU is hot-removed on Broadwell servers: WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988 uncore_pci_remove+0x10b/0x150 Call Trace: pci_device_remove+0x42/0xd0 device_release_driver_internal+0x148/0x220 pci_stop_bus_device+0x76/0xa0 pci_stop_root_bus+0x44/0x60 acpi_pci_root_remove+0x1f/0x80 acpi_bus_trim+0x57/0x90 acpi_bus_trim+0x2e/0x90 acpi_device_hotplug+0x2bc/0x4b0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x174/0x3a0 worker_thread+0x4c/0x3d0 kthread+0xf8/0x130 This bug was introduced by: commit 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs") The index of "QPI Port 2 filter" was hardcode to 2, but this conflicts with the index of "PCU.3" which is "HSWEP_PCI_PCU_3", which equals to 2 as well. To fix the conflict, the hardcoded index needs to be cleaned up: - introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2 filter" on Broadwell, - increase UNCORE_EXTRA_PCI_DEV_MAX by one, - clean up the hardcoded index. Debugged-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: msys.mizuma@gmail.com Cc: stable@vger.kernel.org Fixes: 15a3e845b01c ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs") Link: http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: "Several smallish fixes, I don't think any of this requires another -rc but I'll leave that up to you: 1) Don't leak uninitialzed bytes to userspace in xfrm_user, from Eric Dumazet. 2) Route leak in xfrm_lookup_route(), from Tommi Rantala. 3) Premature poll() returns in AF_XDP, from Björn Töpel. 4) devlink leak in netdevsim, from Jakub Kicinski. 5) Don't BUG_ON in fib_compute_spec_dst, the condition can legitimately happen. From Lorenzo Bianconi. 6) Fix some spectre v1 gadgets in generic socket code, from Jeremy Cline. 7) Don't allow user to bind to out of range multicast groups, from Dmitry Safonov with a follow-up by Dmitry Safonov. 8) Fix metrics leak in fib6_drop_pcpu_from(), from Sabrina Dubroca" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) netlink: Don't shift with UB on nlk->ngroups net/ipv6: fix metrics leak xen-netfront: wait xenbus state change when load module manually can: ems_usb: Fix memory leak on ems_usb_disconnect() openvswitch: meter: Fix setting meter id for new entries netlink: Do not subscribe to non-existent groups NET: stmmac: align DMA stuff to largest cache line length tcp_bbr: fix bw probing to raise in-flight data for very small BDPs net: socket: Fix potential spectre v1 gadget in sock_is_registered net: socket: fix potential spectre v1 gadget in socketcall net: mdio-mux: bcm-iproc: fix wrong getter and setter pair ipv4: remove BUG_ON() from fib_compute_spec_dst enic: handle mtu change for vf properly net: lan78xx: fix rx handling before first packet is send nfp: flower: fix port metadata conversion bug bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog() bpf: fix bpf_skb_load_bytes_relative pkt length check perf build: Build error in libbpf missing initialization net: ena: Fix use of uninitialized DMA address bits field bpf: btf: Use exact btf value_size match in map_check_btf() ...
2018-07-30	scsi: qedi: Fix a potential buffer overflow	Bart Van Assche
	Tell snprintf() to store at most 255 characters in the output buffer instead of 256. This patch avoids that smatch reports the following warning: drivers/scsi/qedi/qedi_main.c:891: qedi_get_boot_tgt_info() error: snprintf() is printing too much 256 vs 255 Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: <QLogic-Storage-Upstream@cavium.com> Cc: <stable@vger.kernel.org> Acked-by: Nilesh Javali <nilesh.javali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-30	scsi: qla2xxx: Fix memory leak for allocating abort IOCB	Quinn Tran
	In the case of IOCB QFull, Initiator code can leave behind a stale pointer to an SRB structure on the outstanding command array. Fixes: 82de802ad46e ("scsi: qla2xxx: Preparation for Target MQ.") Cc: stable@vger.kernel.org #v4.16+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc	Linus Torvalds
	Pull sparc fixes from David Miller: "Some small __init annotation and build fixes from Stephen Rostedt and Thomas Petazzoni" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: use asm-generic version of msi.h sparc: move MSI related definitions to where they are used sparc/time: Add missing __init to init_tick_ops()
2018-07-30	squashfs: more metadata hardening	Linus Torvalds
	Anatoly reports another squashfs fuzzing issue, where the decompression parameters themselves are in a compressed block. This causes squashfs_read_data() to be called in order to read the decompression options before the decompression stream having been set up, making squashfs go sideways. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Acked-by: Phillip Lougher <phillip.lougher@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-31	net: xsk: don't return frames via the allocator on error	Jakub Kicinski
	xdp_return_buff() is used when frame has been successfully handled (transmitted) or if an error occurred during delayed processing and there is no way to report it back to xdp_do_redirect(). In case of __xsk_rcv_zc() error is propagated all the way back to the driver, so there is no need to call xdp_return_buff(). Driver will recycle the frame anyway after seeing that error happened. Fixes: 173d3adb6f43 ("xsk: add zero-copy support for Rx") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31	tools/bpftool: fix a percpu_array map dump problem	Yonghong Song
	I hit the following problem when I tried to use bpftool to dump a percpu array. $ sudo ./bpftool map show 61: percpu_array name stub flags 0x0 key 4B value 4B max_entries 1 memlock 4096B ... $ sudo ./bpftool map dump id 61 bpftool: malloc.c:2406: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) \|\| \ ((unsigned long) (old_size) >= MINSIZE && \ prev_inuse (old_top) && \ ((unsigned long) old_end & (pagesize - 1)) == 0)' failed. Aborted Further debugging revealed that this is due to miscommunication between bpftool and kernel. For example, for the above percpu_array with value size of 4B. The map info returned to user space has value size of 4B. In bpftool, the values array for lookup is allocated like: info->value_size * get_possible_cpus() = 4 * get_possible_cpus() In kernel (kernel/bpf/syscall.c), the values array size is rounded up to multiple of 8. round_up(map->value_size, 8) * num_possible_cpus() = 8 * num_possible_cpus() So when kernel copies the values to user buffer, the kernel will overwrite beyond user buffer boundary. This patch fixed the issue by allocating and stepping through percpu map value array properly in bpftool. Fixes: 71bb428fe2c19 ("tools: bpf: add bpftool") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-30	audit: fix potential null dereference 'context->module.name'	Yi Wang
	The variable 'context->module.name' may be null pointer when kmalloc return null, so it's better to check it before using to avoid null dereference. Another one more thing this patch does is using kstrdup instead of (kmalloc + strcpy), and signal a lost record via audit_log_lost. Cc: stable@vger.kernel.org # 4.11 Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2018-07-30	sparc: use asm-generic version of msi.h	Thomas Petazzoni
	This is necessary to be able to include <linux/msi.h> when CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Without this, a build with CONFIG_GENERIC_MSI_IRQ_DOMAIN fails with: In file included from drivers//ata/ahci.c:45:0: >> include/linux/msi.h:226:10: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? msi_alloc_info_t arg); ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:230:9: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? msi_alloc_info_t arg); ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:239:12: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? msi_alloc_info_t arg); ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:240:22: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? void (msi_finish)(msi_alloc_info_t arg, int retval); ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:241:20: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? void (set_desc)(msi_alloc_info_t arg, ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:316:18: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? int nvec, msi_alloc_info_t args); ^~~~~~~~~~~~~~~~ sg_alloc_fn include/linux/msi.h:318:29: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'? int virq, int nvec, msi_alloc_info_t *args); ^~~~~~~~~~~~~~~~ sg_alloc_fn Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	sparc: move MSI related definitions to where they are used	Thomas Petazzoni
	The definitions in arch/sparc/include/asm/msi.h are only used in arch/sparc/mm/srmmu.c, so it makes sense to have them in the C file directly. In addition, having a custom arch/sparc/include/asm/msi.h prevents from using the asm-generic version of this header, which is necessary to be able to include <linux/msi.h> when CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	sparc/time: Add missing __init to init_tick_ops()	Steven Rostedt (VMware)
	Code that was added to force gcc not to inline any function that isn't explicitly declared as inline uncovered that init_tick_ops() isn't marked as "__init". It is only called by __init functions and more importantly it too calls an __init function which would require it to be __init as well. Link: http://lkml.kernel.org/r/201806060444.hdHcKOBy%fengguang.wu@intel.com Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	netlink: Don't shift with UB on nlk->ngroups	Dmitry Safonov
	On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in hang during boot. Check for 0 ngroups and use (unsigned long long) as a type to shift. Fixes: 7acf9d4237c4 ("netlink: Do not subscribe to non-existent groups"). Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	Merge tag 'linux-can-fixes-for-4.18-20180730' of ↵	David S. Miller
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-07-30 this is a pull request of one patch for net/master. The patch by Anton Vasilyev and the Linux Driver Verification project fixes a memory leak in the ems_usb driver's disconnect function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - a build race fix - a Xen entry fix - a TSC_DEADLINE quirk future-proofing fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Fix if_changed build flip/flop bug x86/entry/64: Remove %ebx handling from error_entry/exit x86/apic: Future-proof the TSC_DEADLINE quirk for SKX
2018-07-30	Merge branch 'sched-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes: - a deadline scheduler related bug fix which triggered a kernel warning - an RT_RUNTIME_SHARE fix - a stop_machine preemption fix - a potential NULL dereference fix in sched_domain_debug_one()" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE sched/deadline: Update rq_clock of later_rq when pushing a task stop_machine: Disable preemption after queueing stopper threads sched/topology: Check variable group before dereferencing it
2018-07-30	arc: fix type warnings in arc/mm/cache.c	Randy Dunlap
	Fix type warnings in arch/arc/mm/cache.c. ../arch/arc/mm/cache.c: In function 'flush_anon_page': ../arch/arc/mm/cache.c:1062:55: warning: passing argument 2 of '__flush_dcache_page' makes integer from pointer without a cast [-Wint-conversion] __flush_dcache_page((phys_addr_t)page_address(page), page_address(page)); ^~~~~~~~~~~~~~~~~~ ../arch/arc/mm/cache.c:1013:59: note: expected 'long unsigned int' but argument is of type 'void *' void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr) ~~~~~~~~~~~~~~^~~~~ Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Cc: Elad Kanfi <eladkan@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Ofer Levi <oferle@mellanox.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	arc: fix build errors in arc/include/asm/delay.h	Randy Dunlap
	Fix build errors in arch/arc/'s delay.h: - add "extern unsigned long loops_per_jiffy;" - add <asm-generic/types.h> for "u64" In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32: ../arch/arc/include/asm/delay.h: In function '__udelay': ../arch/arc/include/asm/delay.h:61:12: error: 'u64' undeclared (first use in this function) loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32; ^~~ In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32: ../arch/arc/include/asm/delay.h: In function '__udelay': ../arch/arc/include/asm/delay.h:63:37: error: 'loops_per_jiffy' undeclared (first use in this function) loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32; ^~~~~~~~~~~~~~~ Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Cc: Elad Kanfi <eladkan@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Ofer Levi <oferle@mellanox.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c	Randy Dunlap
	Fix printk format warning in arch/arc/plat-eznps/mtm.c: In file included from ../include/linux/printk.h:7, from ../include/linux/kernel.h:14, from ../include/linux/list.h:9, from ../include/linux/smp.h:12, from ../arch/arc/plat-eznps/mtm.c:17: ../arch/arc/plat-eznps/mtm.c: In function 'set_mtm_hs_ctr': ../include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Wformat=] #define KERN_SOH "\001" /* ASCII Start Of Header / ^~~~~~ ../include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH' #define KERN_ERR KERN_SOH "3" / error conditions / ^~~~~~~~ ../include/linux/printk.h:308:9: note: in expansion of macro 'KERN_ERR' printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^~~~~~~~ ../arch/arc/plat-eznps/mtm.c:166:3: note: in expansion of macro 'pr_err' pr_err("* Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n", ^~~~~~ ../arch/arc/plat-eznps/mtm.c:166:40: note: format string is defined here pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n", ~^ %ld The hs_ctr variable can just be int instead of long, so also change kstrtol() to kstrtoint() and leave the format string as %d. Also add 2 header files since they are used in mtm.c and we prefer not to depend on accidental/indirect #includes. Cc: linux-snps-arc@lists.infradead.org Cc: Ofer Levi <oferle@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	arc: [plat-eznps] fix data type errors in platform headers	Randy Dunlap
	Add <linux/types.h> to fix build errors. Both ctop.h and <soc/nps/common.h> use u32 types and cause many errors. Examples: ../include/soc/nps/common.h:71:4: error: unknown type name 'u32' u32 __reserved:20, cluster:4, core:4, thread:4; ../include/soc/nps/common.h:76:3: error: unknown type name 'u32' u32 value; ../include/soc/nps/common.h:124:4: error: unknown type name 'u32' u32 base:8, cl_x:4, cl_y:4, ../include/soc/nps/common.h:127:3: error: unknown type name 'u32' u32 value; ../arch/arc/plat-eznps/include/plat/ctop.h:83:4: error: unknown type name 'u32' u32 gen:1, gdis:1, clk_gate_dis:1, asb:1, ../arch/arc/plat-eznps/include/plat/ctop.h:86:3: error: unknown type name 'u32' u32 value; ../arch/arc/plat-eznps/include/plat/ctop.h:93:4: error: unknown type name 'u32' u32 csa:22, dmsid:6, __reserved:3, cs:1; ../arch/arc/plat-eznps/include/plat/ctop.h:95:3: error: unknown type name 'u32' u32 value; Cc: linux-snps-arc@lists.infradead.org Cc: Ofer Levi <oferle@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: - AMD IBS data corruptor fix (uncovered by UBSAN) - an Intel PEBS entry unwind error fix - a HW-tracing crash fix - a MAINTAINERS update" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix crash when using HW tracing kernel filters perf/x86/intel: Fix unwind errors from PEBS entries (mk-II) MAINTAINERS: Add Naveen N. Rao as kprobes co-maintainer perf/x86/amd/ibs: Don't access non-started event
2018-07-30	Merge branch 'locking-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "A paravirt UP-patching fix, and an I2C MUX driver lockdep warning fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code i2c/mux, locking/core: Annotate the nested rt_mutex usage locking/rtmutex: Allow specifying a subclass for nested locking
2018-07-30	Merge branch 'efi-urgent-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Ingo Molnar: "An UEFI variables fix for SEV guests" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Access EFI MMIO data as unencrypted when SEV is active
2018-07-30	ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc	Ofer Levi
	Fixing compilation issue caused by missing struct nps_host_reg_aux_dpc definition. Fixes: 3f9cd874dcc87 ("ARC: [plat-eznps] avoid toggling of DPC register") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ofer Levi <oferle@mellanox.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	ARC: add SMP_CACHE_BYTES value validate	Eugeniy Paltsev
	Check that SMP_CACHE_BYTES (and hence ARCH_DMA_MINALIGN) is larger or equal to any cache line length by comparing it with values previously read from ARC cache BCR registers. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30	net/ipv6: fix metrics leak	Sabrina Dubroca
	Since commit d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info"), ipv6 metrics are shared and refcounted. rt6_set_from() assigns the rt->from pointer and increases the refcount on from's metrics. This reference is never released. Introduce the fib6_metrics_release() helper and use it to release the metrics. Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	xen-netfront: wait xenbus state change when load module manually	Xiao Liang
	When loading module manually, after call xenbus_switch_state to initializes the state of the netfront device, the driver state did not change so fast that may lead no dev created in latest kernel. This patch adds wait to make sure xenbus knows the driver is not in closed/unknown state. Current state: [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Cannot get device settings: No such device Cannot get wake-on-lan settings: No such device Cannot get message level: No such device Cannot get link status: No such device No data available With the patch installed. [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Link detected: yes Signed-off-by: Xiao Liang <xiliang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30	perf tools: Fix the build on the alpine:edge distro	Arnaldo Carvalho de Melo
	The UAPI file byteorder/little_endian.h uses the __always_inline define without including the header where it is defined, linux/stddef.h, this ends up working in all the other distros because that file gets included seemingly by luck from one of the files included from little_endian.h. But not on Alpine:edge, that fails for all files where perf_event.h is included but linux/stddef.h isn't include before that. Adding the missing linux/stddef.h file where it breaks on Alpine:edge to fix that, in all other distros, that is just a very small header anyway. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-9r1pifftxvuxms8l7ir73p5l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30	tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'	Arnaldo Carvalho de Melo
	To cope with the changes in: 12c89130a56a ("x86/asm/memcpy_mcsafe: Add write-protection-fault handling") 60622d68227d ("x86/asm/memcpy_mcsafe: Return bytes remaining") bd131544aa7e ("x86/asm/memcpy_mcsafe: Add labels for __memcpy_mcsafe() write fault handling") da7bc9c57eb0 ("x86/asm/memcpy_mcsafe: Remove loop unrolling") This needed introducing a file with a copy of the mcsafe_handle_tail() function, that is used in the new memcpy_64.S file, as well as a dummy mcsafe_test.h header. Testing it: $ nm ~/bin/perf \| grep mcsafe 0000000000484130 T mcsafe_handle_tail 0000000000484300 T __memcpy_mcsafe $ $ perf bench mem memcpy # Running 'mem/memcpy' benchmark: # function 'default' (Default memcpy() provided by glibc) # Copying 1MB bytes ... 44.389205 GB/sec # function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 22.710756 GB/sec # function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 42.459239 GB/sec # function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB bytes ... 42.459239 GB/sec $ This silences this perf tools build warning: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-igdpciheradk3gb3qqal52d0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30	tools headers uapi: Refresh linux/bpf.h copy	Arnaldo Carvalho de Melo
	To get the changes in: 4c79579b44b1 ("bpf: Change bpf_fib_lookup to return lookup status") That do not entail changes in tools/perf/ use of it, elliminating the following perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-yei494y6b3mn6bjzz9g0ws12@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30	tools headers powerpc: Update asm/unistd.h copy to pick new	Arnaldo Carvalho de Melo
	The new 'io_pgetevents' syscall was wired up in PowerPC in the following cset: b2f82565f2ca ("powerpc: Wire up io_pgetevents") Update tools/arch/powerpc/ copy of the asm/unistd.h file so that 'perf trace' on PowerPC gets it in its syscall table. This elliminated the following perf build warning: Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/unistd.h' differs from latest version at 'arch/powerpc/include/uapi/asm/unistd.h' Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Breno Leitao <leitao@debian.org> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Link: https://lkml.kernel.org/n/tip-9uvu7tz4ud3bxxfyxwryuz47@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30	tools headers uapi: Update tools's copy of linux/perf_event.h	Arnaldo Carvalho de Melo
	To get the changes in: 6cbc304f2f36 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)") That do not imply any changes in the tooling side, the (ab)use of sample_type is entirely done in kernel space, nothing for userspace to witness here. This cures the following warning during perf's build: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-o64mjoy35s9gd1gitunw1zg4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-30	virtio_balloon: fix another race between migration and ballooning	Jiang Biao
	Kernel panic when with high memory pressure, calltrace looks like, PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java" #0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb #1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942 #2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30 #3 [ffff881ec7ed7778] oops_end at ffffffff816902c8 #4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46 #5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc #6 [ffff881ec7ed7838] __node_set at ffffffff81680300 #7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f #8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5 #9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8 [exception RIP: _raw_spin_lock_irqsave+47] RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8 RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008 RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098 R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 It happens in the pagefault and results in double pagefault during compacting pages when memory allocation fails. Analysed the vmcore, the page leads to second pagefault is corrupted with _mapcount=-256, but private=0. It's caused by the race between migration and ballooning, and lock missing in virtballoon_migratepage() of virtio_balloon driver. This patch fix the bug. Fixes: e22504296d4f64f ("virtio_balloon: introduce migration primitives to balloon pages") Cc: stable@vger.kernel.org Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Huang Chong <huang.chong@zte.com.cn> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-07-30	media: v4l: vsp1: Fix deadlock in VSPDL DRM pipelines	Laurent Pinchart
	The VSP uses a lock to protect the BRU and BRS assignment when configuring pipelines. The lock is taken in vsp1_du_atomic_begin() and released in vsp1_du_atomic_flush(), as well as taken and released in vsp1_du_setup_lif(). This guards against multiple pipelines trying to assign the same BRU and BRS at the same time. The DRM framework calls the .atomic_begin() operations in a loop over all CRTCs included in an atomic commit. On a VSPDL (the only VSP type where this matters), a single VSP instance handles two CRTCs, with a single lock. This results in a deadlock when the .atomic_begin() operation is called on the second CRTC. The DRM framework serializes atomic commits that affect the same CRTCs, but doesn't know about two CRTCs sharing the same VSPDL. Two commits affecting the VSPDL LIF0 and LIF1 respectively can thus race each other, hence the need for a lock. This could be fixed on the DRM side by forcing serialization of commits affecting CRTCs backed by the same VSPDL, but that would negatively affect performances, as the locking is only needed when the BRU and BRS need to be reassigned, which is an uncommon case. The lock protects the whole .atomic_begin() to .atomic_flush() sequence. The only operation that can occur in-between is vsp1_du_atomic_update(), which doesn't touch the BRU and BRS, and thus doesn't need to be protected by the lock. We can thus only take the lock around the pipeline setup calls in vsp1_du_atomic_flush(), which fixes the deadlock. Fixes: f81f9adc4ee1 ("media: v4l: vsp1: Assign BRU and BRS to pipelines dynamically") Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-07-30	media: rc: read out of bounds if bpf reports high protocol number	Sean Young
	The repeat period is read from a static array. If a keydown event is reported from bpf with a high protocol number, we read out of bounds. This is unlikely to end up with a reasonable repeat period at the best of times, in which case no timely key up event is generated. Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-07-30	ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+	Vincent Whitchurch
	When building the kernel as Thumb-2 with binutils 2.29 or newer, if the assembler has seen the .type directive (via ENDPROC()) for a symbol, it automatically handles the setting of the lowest bit when the symbol is used with ADR. The badr macro on the other hand handles this lowest bit manually. This leads to a jump to a wrong address in the wrong state in the syscall return path: Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2 Modules linked in: CPU: 0 PID: 652 Comm: modprobe Tainted: G D 4.18.0-rc3+ #8 PC is at ret_fast_syscall+0x4/0x62 LR is at sys_brk+0x109/0x128 pc : [<80101004>] lr : [<801c8a35>] psr: 60000013 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 50c5387d Table: 9e82006a DAC: 00000051 Process modprobe (pid: 652, stack limit = 0x(ptrval)) 80101000 <ret_fast_syscall>: 80101000: b672 cpsid i 80101002: f8d9 2008 ldr.w r2, [r9, #8] 80101006: f1b2 4ffe cmp.w r2, #2130706432 ; 0x7f000000 80101184 <local_restart>: 80101184: f8d9 a000 ldr.w sl, [r9] 80101188: e92d 0030 stmdb sp!, {r4, r5} 8010118c: f01a 0ff0 tst.w sl, #240 ; 0xf0 80101190: d117 bne.n 801011c2 <__sys_trace> 80101192: 46ba mov sl, r7 80101194: f5ba 7fc8 cmp.w sl, #400 ; 0x190 80101198: bf28 it cs 8010119a: f04f 0a00 movcs.w sl, #0 8010119e: f3af 8014 nop.w {20} 801011a2: f2af 1ea2 subw lr, pc, #418 ; 0x1a2 To fix this, add a new symbol name which doesn't have ENDPROC used on it and use that with badr. We can't remove the badr usage since that would would cause breakage with older binutils. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-07-30	can: ems_usb: Fix memory leak on ems_usb_disconnect()	Anton Vasilyev
	ems_usb_probe() allocates memory for dev->tx_msg_buffer, but there is no its deallocation in ems_usb_disconnect(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-07-29	Linux 4.18-rc7v4.18-rc7	Linus Torvalds

2018-07-29	openvswitch: meter: Fix setting meter id for new entries	Justin Pettit
	The meter code would create an entry for each new meter. However, it would not set the meter id in the new entry, so every meter would appear to have a meter id of zero. This commit properly sets the meter id when adding the entry. Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure") Signed-off-by: Justin Pettit <jpettit@ovn.org> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29	Merge tag 'ext4_for_linus_stable' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Some miscellaneous ext4 fixes for 4.18; one fix is for a regression introduced in 4.18-rc4. Sorry for the late-breaking pull. I was originally going to wait for the next merge window, but Eric Whitney found a regression introduced in 4.18-rc4, so I decided to push out the regression plus the other fixes now. (The other commits have been baking in linux-next since early July)" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix check to prevent initializing reserved inodes ext4: check for allocation block validity with block group locked ext4: fix inline data updates with checksums enabled ext4: clear mmp sequence number when remounting read-only ext4: fix false negatives and false positives in ext4_check_descriptors()
2018-07-29	netlink: Do not subscribe to non-existent groups	Dmitry Safonov
	Make ABI more strict about subscribing to group > ngroups. Code doesn't check for that and it looks bogus. (one can subscribe to non-existing group) Still, it's possible to bind() to all possible groups with (-1) Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: netdev@vger.kernel.org Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29	squashfs: be more careful about metadata corruption	Linus Torvalds
	Anatoly Trosinenko reports that a corrupted squashfs image can cause a kernel oops. It turns out that squashfs can end up being confused about negative fragment lengths. The regular squashfs_read_data() does check for negative lengths, but squashfs_read_metadata() did not, and the fragment size code just blindly trusted the on-disk value. Fix both the fragment parsing and the metadata reading code. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-29	ext4: fix check to prevent initializing reserved inodes	Theodore Ts'o
	Commit 8844618d8aa7: "ext4: only look at the bg_flags field if it is valid" will complain if block group zero does not have the EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct, since a freshly created file system has this flag cleared. It gets almost immediately after the file system is mounted read-write --- but the following somewhat unlikely sequence will end up triggering a false positive report of a corrupted file system: mkfs.ext4 /dev/vdc mount -o ro /dev/vdc /vdc mount -o remount,rw /dev/vdc Instead, when initializing the inode table for block group zero, test to make sure that itable_unused count is not too large, since that is the case that will result in some or all of the reserved inodes getting cleared. This fixes the failures reported by Eric Whiteney when running generic/230 and generic/231 in the the nojournal test case. Fixes: 8844618d8aa7 ("ext4: only look at the bg_flags field if it is valid") Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-07-29	NET: stmmac: align DMA stuff to largest cache line length	Eugeniy Paltsev
	As for today STMMAC_ALIGN macro (which is used to align DMA stuff) relies on L1 line length (L1_CACHE_BYTES). This isn't correct in case of system with several cache levels which might have L1 cache line length smaller than L2 line. This can lead to sharing one cache line between DMA buffer and other data, so we can lose this data while invalidate DMA buffer before DMA transaction. Fix that by using SMP_CACHE_BYTES instead of L1_CACHE_BYTES for aligning. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29	Merge branch 'turbostat' of ↵	Rafael J. Wysocki
	git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat utility fixes for 4.18 from Len Brown: "Three of them are for regressions since Linux-4.17" * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: version 18.07.27 tools/power turbostat: Read extended processor family from CPUID tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes tools/power turbostat: fix x2apic debug message output file tools/power turbostat: fix bogus summary values tools/power turbostat: fix -S on UP systems tools/power turbostat: Update turbostat(8) RAPL throttling column description
2018-07-29	ACPICA: AML Parser: ignore control method status in module-level code	Erik Schmauss
	Previous change in the AML parser code blindly set all non-successful dispatcher statuses to AE_OK. That approach is incorrect, though, because successful control method invocations from module-level return AE_CTRL_TRANSFER. Overwriting AE_OK to this status causes the AML parser to think that there was no return value from the control method invocation. Fixes: 92c0f4af386 (ACPICA: AML Parser: ignore dispatcher error status during table load) Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-28	tcp_bbr: fix bw probing to raise in-flight data for very small BDPs	Neal Cardwell
	For some very small BDPs (with just a few packets) there was a quantization effect where the target number of packets in flight during the super-unity-gain (1.25x) phase of gain cycling was implicitly truncated to a number of packets no larger than the normal unity-gain (1.0x) phase of gain cycling. This meant that in multi-flow scenarios some flows could get stuck with a lower bandwidth, because they did not push enough packets inflight to discover that there was more bandwidth available. This was really only an issue in multi-flow LAN scenarios, where RTTs and BDPs are low enough for this to be an issue. This fix ensures that gain cycling can raise inflight for small BDPs by ensuring that in PROBE_BW mode target inflight values with a super-unity gain are always greater than inflight values with a gain <= 1. Importantly, this applies whether the inflight value is calculated for use as a cwnd value, or as a target inflight value for the end of the super-unity phase in bbr_is_next_cycle_phase() (both need to be bigger to ensure we can probe with more packets in flight reliably). This is a candidate fix for stable releases. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>