linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2018-09-24	powerpc/pseries: Fix unitialized timer reset on migration	Michael Bringmann
	After migration of a powerpc LPAR, the kernel executes code to update the system state to reflect new platform characteristics. Such changes include modifications to device tree properties provided to the system by PHYP. Property notifications received by the post_mobility_fixup() code are passed along to the kernel in general through a call to of_update_property() which in turn passes such events back to all modules through entries like the '.notifier_call' function within the NUMA module. When the NUMA module updates its state, it resets its event timer. If this occurs after a previous call to stop_topology_update() or on a system without VPHN enabled, the code runs into an unitialized timer structure and crashes. This patch adds a safety check along this path toward the problem code. An example crash log is as follows. ibmvscsi 30000081: Re-enabling adapter! ------------[ cut here ]------------ kernel BUG at kernel/time/timer.c:958! Oops: Exception in kernel mode, sig: 5 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag lockd unix_diag af_packet_diag netlink_diag grace fscache sunrpc xts vmx_crypto pseries_rng sg binfmt_misc ip_tables xfs libcrc32c sd_mod ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod CPU: 11 PID: 3067 Comm: drmgr Not tainted 4.17.0+ #179 ... NIP mod_timer+0x4c/0x400 LR reset_topology_timer+0x40/0x60 Call Trace: 0xc0000003f9407830 (unreliable) reset_topology_timer+0x40/0x60 dt_update_callback+0x100/0x120 notifier_call_chain+0x90/0x100 __blocking_notifier_call_chain+0x60/0x90 of_property_notify+0x90/0xd0 of_update_property+0x104/0x150 update_dt_property+0xdc/0x1f0 pseries_devicetree_update+0x2d0/0x510 post_mobility_fixup+0x7c/0xf0 migration_store+0xa4/0xc0 kobj_attr_store+0x30/0x60 sysfs_kf_write+0x64/0xa0 kernfs_fop_write+0x16c/0x240 __vfs_write+0x40/0x200 vfs_write+0xc8/0x240 ksys_write+0x5c/0x100 system_call+0x58/0x6c Fixes: 5d88aa85c00b ("powerpc/pseries: Update CPU maps when device tree is updated") Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-09-24	HID: intel-ish-hid: Enable Ice Lake mobile	Srinivas Pandruvada
	Added PCI ID for Ice Lake mobile platform. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-09-24	HID: i2c-hid: Remove RESEND_REPORT_DESCR quirk and its handling	Hans de Goede
	Commit 52cf93e63ee6 ("HID: i2c-hid: Don't reset device upon system resume") removes the need for the RESEND_REPORT_DESCR quirk for Raydium devices, but kept it for the SIS device id 10FB touchscreens, as the author of that commit could not determine if the quirk is still necessary there. I've tested suspend/resume on a Toshiba Click Mini L9W-B which is the device for which this quirk was added in the first place and with the "Don't reset device upon system resume" fix the quirk is no longer necessary, so this commit removes it. Note even better I also had some other devices with SIS touchscreens which suspend/resume issues, where the RESEND_REPORT_DESCR quirk did not help. I've also tested these devices with the "Don't reset device upon system resume" fix and I'm happy to report that that fix also fixes touchscreen resume on the following devices: Asus T100HA Asus T200TA Peaq C1010 Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-09-24	Merge branch 'clockevents/4.19-fixes' of ↵	Thomas Gleixner
	https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent Pull clockevent fixed from Daniel Lezcano: - Add the CLOCK_SOURCE_SUSPEND_NONSTOP for non-am43 SoCs (Keerthy) - Fix set_next_event handler for the fttmr010 (Tao Ren)
2018-09-23	net: aquantia: memory corruption on jumbo frames	Friedemann Gerold
	This patch fixes skb_shared area, which will be corrupted upon reception of 4K jumbo packets. Originally build_skb usage purpose was to reuse page for skb to eliminate needs of extra fragments. But that logic does not take into account that skb_shared_info should be reserved at the end of skb data area. In case packet data consumes all the page (4K), skb_shinfo location overflows the page. As a consequence, __build_skb zeroed shinfo data above the allocated page, corrupting next page. The issue is rarely seen in real life because jumbo are normally larger than 4K and that causes another code path to trigger. But it 100% reproducible with simple scapy packet, like: sendp(IP(dst="192.168.100.3") / TCP(dport=443) \ / Raw(RandString(size=(4096-40))), iface="enp1s0") Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code") Reported-by: Friedemann Gerold <f.gerold@b-c-s.de> Reported-by: Michael Rauch <michael@rauch.be> Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de> Tested-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	Merge branch 'netpoll-avoid-capture-effects-for-NAPI-drivers'	David S. Miller
	Eric Dumazet says: ==================== netpoll: avoid capture effects for NAPI drivers As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture, showing one ksoftirqd eating all cycles can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller() : Most NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. This patch series take care of the first round, we will handle other drivers in future rounds. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	tun: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. tun uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	nfp: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. nfp uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	bnxt: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. bnxt uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	bnx2x: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. bnx2x uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	mlx5: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. mlx5 uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	mlx4: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. mlx4 uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	i40evf: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. i40evf uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	ice: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ice uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	igb: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. igb uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	ixgb: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ixgb uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. This also removes a problematic use of disable_irq() in a context it is forbidden, as explained in commit af3e0fcf7887 ("8139too: Use disable_irq_nosync() in rtl8139_poll_controller()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	fm10k: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture lasts for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. fm10k uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	ixgbevf: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ixgbevf uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	ixgbe: remove ndo_poll_controller	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ixgbe uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	bonding: use netpoll_poll_dev() helper	Eric Dumazet
	We want to allow NAPI drivers to no longer provide ndo_poll_controller() method, as it has been proven problematic. team driver must not look at its presence, but instead call netpoll_poll_dev() which factorize the needed actions. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	netpoll: make ndo_poll_controller() optional	Eric Dumazet
	As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller(). NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. This patch allows netpoll_poll_dev() to process NAPI contexts even for drivers not providing ndo_poll_controller(), allowing for following patches in NAPI drivers. Also we export netpoll_poll_dev() so that it can be called by bonding/team drivers in following patches. Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-24	clocksource/drivers/fttmr010: Fix set_next_event handler	Tao Ren
	Currently, the aspeed MATCH1 register is updated to <current_count - cycles> in set_next_event handler, with the assumption that COUNT register value is preserved when the timer is disabled and it continues decrementing after the timer is enabled. But the assumption is wrong: RELOAD register is loaded into COUNT register when the aspeed timer is enabled, which means the next event may be delayed because timer interrupt won't be generated until <0xFFFFFFFF - current_count + cycles>. The problem can be fixed by updating RELOAD register to <cycles>, and COUNT register will be re-loaded when the timer is enabled and interrupt is generated when COUNT register overflows. The test result on Facebook Backpack-CMM BMC hardware (AST2500) shows the issue is fixed: without the patch, usleep(100) suspends the process for several milliseconds (and sometimes even over 40 milliseconds); after applying the fix, usleep(100) takes averagely 240 microseconds to return under the same workload level. Signed-off-by: Tao Ren <taoren@fb.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Lei YU <mine260309@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2018-09-23	mlxsw: Make MLXSW_SP1_FWREV_MINOR a hard requirement	Petr Machata
	Up until now, mlxsw tolerated firmware versions that weren't exactly matching the required version, if the branch number matched. That allowed the users to test various firmware versions as long as they were on the right branch. On the other hand, it made it impossible for mlxsw to put a hard lower bound on a version that fixes all problems known to date. If a user had a somewhat older FW version installed, mlxsw would start up just fine, possibly performing non-optimally as it would use features that trigger problematic behavior. Therefore tweak the check to accept any FW version that is: - on the same branch as the preferred version, and - the same as or newer than the preferred version. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	rds: Fix build regression.	David S. Miller
	Use DECLARE_* not DEFINE_* Fixes: 8360ed6745df ("RDS: IB: Use DEFINE_PER_CPU_SHARED_ALIGNED for rds_ib_stats") Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23	Linux 4.19-rc5	Greg Kroah-Hartman

2018-09-23	Merge tag 'mfd-fixes-4.19' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Lee writes: "MFD fixes for v4.19 - Fix Dialog DA9063 regulator constraints issue causing failure in probe - Fix OMAP Device Tree compatible strings to match DT" * tag 'mfd-fixes-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: omap-usb-host: Fix dts probe of children mfd: da9063: Fix DT probing with constraints
2018-09-23	Merge tag 'sunxi-fixes-for-4.19-2' of ↵	Olof Johansson
	https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Allwinner fixes - round 2 One additional fix regarding HDMI on the R40 SoC. Based on preliminary tests and code dumps for the R40, it was thought that the whole HDMI block was the same on the R40 and A64. Recent tests regarding the A64 showed that this was not the case. The HDMI PHY on the A64 only has one clock parent. How this occurs at the hardware level is unclear, as Allwinner has not given any feedback on this matter. Nevertheless it is clear that the hardware acts differently between the A64 and R40 in such a way that the R40's HDMI PHY is not backward compatible with the A64's. As such we need to drop the fallback compatible string in the R40's device tree. This was added in v4.19-rc1. * tag 'sunxi-fixes-for-4.19-2' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: ARM: dts: sun8i: drop A64 HDMI PHY fallback compatible from R40 DT Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-23	MAINTAINERS: update the Annapurna Labs maintainer email	Antoine Tenart
	Free Electrons became Bootlin. Update my email accordingly. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-09-23	Merge tag 'for-linus-4.19d-rc5-tag' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Juergen writes: "xen: Two small fixes for xen drivers." * tag 'for-linus-4.19d-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: issue warning message when out of grant maptrack entries xen/x86/vpmu: Zero struct pt_regs before calling into sample handling code
2018-09-23	Merge tag 'for-linus-20180922' of git://git.kernel.dk/linux-block	Greg Kroah-Hartman
	Jens writes: "Just a single fix in this pull request, fixing a regression in /proc/diskstats caused by the unification of timestamps." * tag 'for-linus-20180922' of git://git.kernel.dk/linux-block: block: use nanosecond resolution for iostat
2018-09-23	Merge branch 'x86-urgent-for-linus' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Thomas writes: "A set of fixes for x86: - Resolve the kvmclock regression on AMD systems with memory encryption enabled. The rework of the kvmclock memory allocation during early boot results in encrypted storage, which is not shareable with the hypervisor. Create a new section for this data which is mapped unencrypted and take care that the later allocations for shared kvmclock memory is unencrypted as well. - Fix the build regression in the paravirt code introduced by the recent spectre v2 updates. - Ensure that the initial static page tables cover the fixmap space correctly so early console always works. This worked so far by chance, but recent modifications to the fixmap layout can - depending on kernel configuration - move the relevant entries to a different place which is not covered by the initial static page tables. - Address the regressions and issues which got introduced with the recent extensions to the Intel Recource Director Technology code. - Update maintainer entries to document reality" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Expand static page table for fixmap space MAINTAINERS: Add X86 MM entry x86/intel_rdt: Add Reinette as co-maintainer for RDT MAINTAINERS: Add Borislav to the x86 maintainers x86/paravirt: Fix some warning messages x86/intel_rdt: Fix incorrect loop end condition x86/intel_rdt: Fix exclusive mode handling of MBA resource x86/intel_rdt: Fix incorrect loop end condition x86/intel_rdt: Do not allow pseudo-locking of MBA resource x86/intel_rdt: Fix unchecked MSR access x86/intel_rdt: Fix invalid mode warning when multiple resources are managed x86/intel_rdt: Global closid helper to support future fixes x86/intel_rdt: Fix size reporting of MBA resource x86/intel_rdt: Fix data type in parsing callbacks x86/kvm: Use __bss_decrypted attribute in shared variables x86/mm: Add .bss..decrypted section to hold shared variables
2018-09-23	Merge branch 'perf-urgent-for-linus' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Thomas writes: "- Provide a strerror_r wrapper so lib/bpf can be built on systems without _GNU_SOURCE - Unbreak the man page generator when building out of tree" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf Documentation: Fix out-of-tree asciidoctor man page generation tools lib bpf: Provide wrapper for strerror_r to build in !_GNU_SOURCE systems
2018-09-23	Merge branch 'efi-urgent-for-linus' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Thomas writes: "Make the EFI arm stub device tree loader default on to unbreak existing EFI boot loaders which do not have DTB support." * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub/arm: default EFI_ARMSTUB_DTB_LOADER to y
2018-09-22	Merge branch 'hv_netvsc-Support-LRO-RSC-in-the-vSwitch'	David S. Miller
	Haiyang Zhang says: ==================== hv_netvsc: Support LRO/RSC in the vSwitch The patch adds support for LRO/RSC in the vSwitch feature. It reduces the per packet processing overhead by coalescing multiple TCP segments when possible. The feature is enabled by default on VMs running on Windows Server 2019 and later. The patch set also adds ethtool command handler and documents. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	hv_netvsc: Update document for LRO/RSC support	Haiyang Zhang
	Update document for LRO/RSC support, and the command line info to change the setting. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	hv_netvsc: Add handler for LRO setting change	Haiyang Zhang
	This patch adds the handler for LRO setting change, so that a user can use ethtool command to enable / disable LRO feature. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	hv_netvsc: Add support for LRO/RSC in the vSwitch	Haiyang Zhang
	LRO/RSC in the vSwitch is a feature available in Windows Server 2019 hosts and later. It reduces the per packet processing overhead by coalescing multiple TCP segments when possible. This patch adds netvsc driver support for this feature. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	net-ethtool: ETHTOOL_GUFO did not and should not require CAP_NET_ADMIN	Maciej Żenczykowski
	So it should not fail with EPERM even though it is no longer implemented... This is a fix for: (userns)$ egrep ^Cap /proc/self/status CapInh: 0000003fffffffff CapPrm: 0000003fffffffff CapEff: 0000003fffffffff CapBnd: 0000003fffffffff CapAmb: 0000003fffffffff (userns)$ tcpdump -i usb_rndis0 tcpdump: WARNING: usb_rndis0: SIOCETHTOOL(ETHTOOL_GUFO) ioctl failed: Operation not permitted Warning: Kernel filter failed: Bad file descriptor tcpdump: can't remove kernel filter: Bad file descriptor With this change it returns EOPNOTSUPP instead of EPERM. See also https://github.com/the-tcpdump-group/libpcap/issues/689 Fixes: 08a00fea6de2 "net: Remove references to NETIF_F_UFO from ethtool." Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-22	device-dax: Add missing address_space_operations	Dave Jiang
	With address_space_operations missing for device dax, namely the .set_page_dirty, we hit a kernel warning when running destructive ndctl unit test: make TESTS=device-dax check WARNING: CPU: 3 PID: 7380 at fs/buffer.c:581 __set_page_dirty+0xb1/0xc0 Setting address_space_operations to noop_set_page_dirty and noop_invalidatepage for device dax to prevent fallback to __set_page_dirty_buffers() and block_invalidatepage() respectively. Fixes: 2232c6382a ("device-dax: Enable page_mapping()") Acked-by: Jeff Moyer <jmoyer@redhat.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-21	Merge branch 'net-dsa-b53-SGMII-modes-fixes'	David S. Miller
	Florian Fainelli says: ==================== net: dsa: b53: SGMII modes fixes Here are two additional fixes that are required in order for SGMII to work correctly. This was discovered with using a copper SFP which would make us use SGMII mode, we would actually leave the HW configured in its default mode: Fiber. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	net: dsa: b53: Also include SGMII for mac_config and mac_link_state	Florian Fainelli
	In both 802.3z and SGMII modes we need to configure the MAC accordingly to flip between Fiber and SGMII modes, and we need to read the MAC status from the SGMII in-band control word. Fixes: 0e01491de646 ("net: dsa: b53: Add SerDes support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	net: dsa: b53: Fix B53_SERDES_DIGITAL_CONTROL offset	Florian Fainelli
	Maths went wrong, to get 0x20, we need to do 0x1e + (x) * 2, not 0x18, fix that offset so we access the correct registers. This would make us not access the correct SerDes Digital control words, status would be fine and so we would not be correctly flipping between Fiber and SGMII modes resulting in incorrect status words being pulled into the SerDes digital status register. Fixes: 0e01491de646 ("net: dsa: b53: Add SerDes support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	net: dsa: b53: Don't assign autonegotiation enabled	Florian Fainelli
	PHYLINK takes care of filing the right information into state->an_enabled, get rid of the read from the SerDes's BMCR register. Fixes: 0e01491de646 ("net: dsa: b53: Add SerDes support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	decnet: Remove unnecessary check for dev->name	Nathan Chancellor
	Clang warns that the address of a pointer will always evaluated as true in a boolean context. net/decnet/dn_dev.c:1366:10: warning: address of array 'dev->name' will always evaluate to 'true' [-Wpointer-bool-conversion] dev->name ? dev->name : "???", ~~~~~^~~~ ~ 1 warning generated. Link: https://github.com/ClangBuiltLinux/linux/issues/116 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	selftests/net: add ipv6 tests to ip_defrag selftest	Peter Oskolkov
	This patch adds ipv6 defragmentation tests to ip_defrag selftest, to complement existing ipv4 tests. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	net/ipfrag: let ip[6]frag_high_thresh in ns be higher than in init_net	Peter Oskolkov
	Currently, ip[6]frag_high_thresh sysctl values in new namespaces are hard-limited to those of the root/init ns. There are at least two use cases when it would be desirable to set the high_thresh values higher in a child namespace vs the global hard limit: - a security/ddos protection policy may lower the thresholds in the root/init ns but allow for a special exception in a child namespace - testing: a test running in a namespace may want to set these thresholds higher in its namespace than what is in the root/init ns The new behavior: # ip netns add testns # ip netns exec testns bash # sysctl -w net.ipv4.ipfrag_high_thresh=9000000 net.ipv4.ipfrag_high_thresh = 9000000 # sysctl net.ipv4.ipfrag_high_thresh net.ipv4.ipfrag_high_thresh = 9000000 # sysctl -w net.ipv6.ip6frag_high_thresh=9000000 net.ipv6.ip6frag_high_thresh = 9000000 # sysctl net.ipv6.ip6frag_high_thresh net.ipv6.ip6frag_high_thresh = 9000000 The old behavior: # ip netns add testns # ip netns exec testns bash # sysctl -w net.ipv4.ipfrag_high_thresh=9000000 net.ipv4.ipfrag_high_thresh = 9000000 # sysctl net.ipv4.ipfrag_high_thresh net.ipv4.ipfrag_high_thresh = 4194304 # sysctl -w net.ipv6.ip6frag_high_thresh=9000000 net.ipv6.ip6frag_high_thresh = 9000000 # sysctl net.ipv6.ip6frag_high_thresh net.ipv6.ip6frag_high_thresh = 4194304 Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	ipv6: discard IP frag queue on more errors	Peter Oskolkov
	This is similar to how ipv4 now behaves: commit 0ff89efb5246 ("ip: fail fast on IP defrag errors"). Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	RDS: IB: Use DEFINE_PER_CPU_SHARED_ALIGNED for rds_ib_stats	Nathan Chancellor
	Clang warns when two declarations' section attributes don't match. net/rds/ib_stats.c:40:1: warning: section does not match previous declaration [-Wsection] DEFINE_PER_CPU_SHARED_ALIGNED(struct rds_ib_statistics, rds_ib_stats); ^ ./include/linux/percpu-defs.h:142:2: note: expanded from macro 'DEFINE_PER_CPU_SHARED_ALIGNED' DEFINE_PER_CPU_SECTION(type, name, PER_CPU_SHARED_ALIGNED_SECTION) \ ^ ./include/linux/percpu-defs.h:93:9: note: expanded from macro 'DEFINE_PER_CPU_SECTION' extern __PCPU_ATTRS(sec) __typeof__(type) name; \ ^ ./include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS' __percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \ ^ net/rds/ib.h:446:1: note: previous attribute is here DECLARE_PER_CPU(struct rds_ib_statistics, rds_ib_stats); ^ ./include/linux/percpu-defs.h:111:2: note: expanded from macro 'DECLARE_PER_CPU' DECLARE_PER_CPU_SECTION(type, name, "") ^ ./include/linux/percpu-defs.h:87:9: note: expanded from macro 'DECLARE_PER_CPU_SECTION' extern __PCPU_ATTRS(sec) __typeof__(type) name ^ ./include/linux/percpu-defs.h:49:26: note: expanded from macro '__PCPU_ATTRS' __percpu __attribute__((section(PER_CPU_BASE_SECTION sec))) \ ^ 1 warning generated. The initial definition was added in commit ec16227e1414 ("RDS/IB: Infiniband transport") and the cache aligned definition was added in commit e6babe4cc4ce ("RDS/IB: Stats and sysctls") right after. The definition probably should have been updated in net/rds/ib.h, which is what this patch does. Link: https://github.com/ClangBuiltLinux/linux/issues/114 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	net/ipv4: avoid compile error in fib_info_nh_uses_dev	Eric Dumazet
	net/ipv4/fib_frontend.c: In function 'fib_info_nh_uses_dev': net/ipv4/fib_frontend.c:322:6: error: unused variable 'ret' [-Werror=unused-variable] cc1: all warnings being treated as errors Fixes: 78f2756c5fc0 ("net/ipv4: Move device validation to helper") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21	Merge branch 'tcp-switch-to-Early-Departure-Time-model'	David S. Miller
	Eric Dumazet says: ==================== tcp: switch to Early Departure Time model In the early days, pacing has been implemented in sch_fq (FQ) in a generic way : - SO_MAX_PACING_RATE could be used by any sockets. - TCP would vary effective pacing rate based on CWND*MSS/SRTT - FQ would ensure delays between packets based on current sk->sk_pacing_rate, but with some quantum based artifacts. (inflating RPC tail latencies) - BBR then tweaked the pacing rate in its various phases (PROBE, DRAIN, ...) This worked reasonably well, but had the side effect that TCP RTT samples would be inflated by the sojourn time of the packets in FQ. Also note that when FQ is not used and TCP wants pacing, the internal pacing fallback has very different behavior, since TCP emits packets at the time they should be sent (with unreasonable assumptions about scheduling costs) Van Jacobson gave a talk at Netdev 0x12 in Montreal, about letting TCP (or applications for UDP messages) decide of the Earliest Departure Time, instead of letting packet schedulers derive it from pacing rate. https://www.netdevconf.org/0x12/session.html?evolving-from-afap-teaching-nics-about-time https://www.files.netdevconf.org/d/46def75c2ef345809bbe/files/?p=/Evolving%20from%20AFAP%20%E2%80%93%20Teaching%20NICs%20about%20time.pdf Recent additions in linux provided SO_TXTIME and a new ETF qdisc supporting the new skb->tstamp role This patch series converts TCP and FQ to the same model. This might in the future allow us to relax tight TSQ limits (if FQ is present in the output path), and thus lower number of callbacks to tcp_write_xmit(), thanks to batching. This will be followed by FQ change allowing SO_TXTIME support so that QUIC servers can let the pacing being done in FQ (or offloaded if network device permits) For example, a TCP flow rated at 24Mbps now shows a more meaningful RTT Before : ESTAB 0 211408 10.246.7.151:41558 10.246.7.152:33723 cubic wscale:8,8 rto:203 rtt:2.195/0.084 mss:1448 rcvmss:536 advmss:1448 cwnd:20 ssthresh:20 bytes_acked:36897937 segs_out:25488 segs_in:12454 data_segs_out:25486 send 105.5Mbps lastsnd:1 lastrcv:12851 lastack:1 pacing_rate 24.0Mbps/24.0Mbps delivery_rate 22.9Mbps busy:12851ms unacked:4 rcv_space:29200 notsent:205616 minrtt:0.026 After : ESTAB 0 192584 10.246.7.151:61612 10.246.7.152:34375 cubic wscale:8,8 rto:201 rtt:0.165/0.129 mss:1448 rcvmss:536 advmss:1448 cwnd:20 ssthresh:20 bytes_acked:170755401 segs_out:117931 segs_in:57651 data_segs_out:117929 send 1404.1Mbps lastsnd:1 lastrcv:56915 lastack:1 pacing_rate 24.0Mbps/24.0Mbps delivery_rate 24.2Mbps busy:56915ms unacked:4 rcv_space:29200 notsent:186792 minrtt:0.054 A nice side effect of this patch series is a reduction of max/p99 latencies of RPC workloads, since the FQ quantum no longer adds artifact. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>