summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-06-28Merge branch 'stmmac-10GbE-using-XGMAC'David S. Miller
Jose Abreu says: ==================== net: stmmac: 10GbE using XGMAC Support for 10Gb Link using XGMAC core plus some performance tweaks. Tested in a PCI based setup. iperf3 TCP results: TSO ON, MTU=1500, TX Queues = 1, RX Queues = 1, Flow Control ON Pinned CPU (-A), Zero-Copy (-Z) [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-600.00 sec 643 GBytes 9.21 Gbits/sec 1 sender [ 5] 0.00-600.00 sec 643 GBytes 9.21 Gbits/sec receiver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Update Kconfig entryJose Abreu
We support more speeds now. Update the Kconfig entry. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Only disable interrupts if NAPI is scheduledJose Abreu
Only disable the interrupts if RX NAPI gets to be scheduled. Also, schedule the TX NAPI only when the interrupts are disabled. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Update RX Tail Pointer to last free entryJose Abreu
Update the RX Tail Pointer to the last available SKB entry. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Enable support for > 32 Bits addressing in XGMACJose Abreu
Currently, stmmac only supports 32 bits addressing for SKB. Enable the support for upto 48 bits addressing in XGMAC core. This avoids the use of bounce buffers and increases performance. Changes from v1: - Fallback to 32 bits in failure (Andrew) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Do not disable interrupts when cleaning TXJose Abreu
This is a performance killer and anyways the interrupts are being disabled by RX NAPI so no need to disable them again. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Add the missing speeds that XGMAC supportsJose Abreu
XGMAC supports following speeds: - 10G XGMII - 5G XGMII - 2.5G XGMII - 2.5G GMII - 1G GMII - 100M MII - 10M MII Add them to the stmmac driver. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: dwxgmac: Fix the undefined burst settingJose Abreu
Undefined burst shall only be set if pdata asks to. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Decrease default RX Watchdog valueJose Abreu
For performance reasons decrease the default RX Watchdog value for the minimum allowed. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Do not try to enable PHY EEE if MAC does not support itJose Abreu
Do not enable EEE feature in the PHY if MAC does not support it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: dwxgmac: Enable EDMA by defaultJose Abreu
Enable the EDMA feature by default which gives higher performance. Changes from v1: - Do not use magic values (David) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28net: stmmac: Fix case when PHY handle is not presentJose Abreu
Some DT bindings do not have the PHY handle. Let's fallback to manually discovery in case phylink_of_phy_connect() fails. Changes from v1: - Fixup comment style (Sergei) Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic") Reported-by: Katsuhiro Suzuki <katsuhiro@katsuster.net> Tested-by: Katsuhiro Suzuki <katsuhiro@katsuster.net> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge tag 'wireless-drivers-for-davem-2019-06-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 5.2 Hopefully the last set of fixes for 5.2. Nothing special this around, only small fixes and support for new cards. iwlwifi * add new cards for 22000 series and smaller fixes wl18xx * fix a clang warning about unused variables mwifiex * properly handle small vendor IEs (a regression from the recent security fix) ath * fix few SPDX tags mt76 * fix A-MSDU aggregation which got broken in v5.2-rc1 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28nl80211: Fix undefined behavior in bit shiftJiunn Chang
Shifting signed 32-bit value by 31 bits is undefined. Changing most significant bit to unsigned. Signed-off-by: Jiunn Chang <c0d1n61at3@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-06-27sis900: remove TxIDLESergej Benilov
Before "sis900: fix TX completion" patch, TX completion was done on TxIDLE interrupt. TX completion also was the only thing done on TxIDLE interrupt. Since "sis900: fix TX completion", TX completion is done on TxDESC interrupt. So it is not necessary any more to set and to check for TxIDLE. Eliminate TxIDLE from sis900. Correct some typos, too. Signed-off-by: Sergej Benilov <sergej.benilov@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27tipc: add dst_cache support for udp mediaXin Long
As other udp/ip tunnels do, tipc udp media should also have a lockless dst_cache supported on its tx path. Here we add dst_cache into udp_replicast to support dst cache for both rmcast and rcast, and rmcast uses ub->rcast and each rcast uses its own node in ub->rcast.list. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27Merge branch 'nfp-extend-flower-capabilities-for-GRE-tunnel-offload'David S. Miller
Jakub Kicinski says: ==================== nfp: extend flower capabilities for GRE tunnel offload Pieter says: This set extends the flower match and action components to offload GRE decapsulation with classification and encapsulation actions. The first 3 patches are refactor and cleanup patches for improving readability and reusability. Patch 4 and 5 implement GRE decap and encap functionality respectively. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27nfp: flower: add GRE encap action supportPieter Jansen van Vuuren
Add new GRE encapsulation support, which allows offload of filters using tunnel_key set action in combination with actions that egress to GRE type ports. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27nfp: flower: add GRE decap classification supportPieter Jansen van Vuuren
Extend the existing tunnel matching support to include GRE decap classification. Specifically matching existing tunnel fields for NVGRE (GRE with protocol field set to TEB). Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27nfp: flower: rename tunnel related functions in action offloadPieter Jansen van Vuuren
Previously tunnel related functions in action offload only applied to UDP tunnels. Rename these functions in preparation for new tunnel types. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27nfp: flower: add helper functions for tunnel classificationPieter Jansen van Vuuren
Adds IPv4 address and TTL/TOS helper functions, which is done in preparation for compiling new tunnel types. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27nfp: flower: refactor tunnel key layer calculationPieter Jansen van Vuuren
Refactor the key layer calculation function, in particular the tunnel key layer calculation by introducing helper functions. This is done in preparation for supporting GRE tunnel offloads. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27Merge branch 'net-dsa-microchip-Further-regmap-cleanups'David S. Miller
Marek Vasut says: ==================== net: dsa: microchip: Further regmap cleanups This patchset cleans up KSZ9477 switch driver by replacing various ad-hoc polling implementations and register RMW with regmap functions. Each polling function is replaced separately to make it easier to review and possibly bisect, but maybe the patches can be squashed. ==================== Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: microchip: Replace bit RMW with regmapMarek Vasut
Regmap provides read-modify-write function to update bitfields in registers. Replace ad-hoc read-modify-write with regmap_update_bits() where applicable. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: microchip: Replace ksz9477_wait_alu_sta_ready polling with regmapMarek Vasut
Regmap provides polling function to poll for bits in a register. This function is another reimplementation of polling for bit being clear in a register. Replace this with regmap polling function. Moreover, inline the function parameters, as the function is never called with any other parameter values than this one. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: microchip: Replace ksz9477_wait_alu_ready polling with regmapMarek Vasut
Regmap provides polling function to poll for bits in a register. This function is another reimplementation of polling for bit being clear in a register. Replace this with regmap polling function. Moreover, inline the function parameters, as the function is never called with any other parameter values than this one. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: microchip: Replace ksz9477_wait_vlan_ctrl_ready polling with regmapMarek Vasut
Regmap provides polling function to poll for bits in a register. This function is another reimplementation of polling for bit being clear in a register. Replace this with regmap polling function. Moreover, inline the function parameters, as the function is never called with any other parameter values than this one. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: microchip: Replace ad-hoc polling with regmapMarek Vasut
Regmap provides polling function to poll for bits in a register, use in instead of reimplementing it. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A handful of clk driver fixes and one core framework fix - Do a DT/firmware lookup in clk_core_get() even when the DT index is a nonsensical value - Fix some clk data typos in the Amlogic DT headers/code - Avoid returning junk in the TI clk driver when an invalid clk is looked for - Fix dividers for the emac clks on Stratix10 SoCs - Fix default HDA rates on Tegra210 to correct distorted audio" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: socfpga: stratix10: fix divider entry for the emac clocks clk: Do a DT parent lookup even when index < 0 clk: tegra210: Fix default rates for HDA clocks clk: ti: clkctrl: Fix returning uninitialized data clk: meson: meson8b: fix a typo in the VPU parent names array variable clk: meson: fix MPLL 50M binding id typo
2019-06-28Merge tag 'for-5.2/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix incorrect uses of kstrndup and DM logging macros in DM's early init code. - Fix DM log-writes target's handling of super block sectors so updates are made in order through use of completion. - Fix DM core's argument splitting code to avoid undefined behaviour reported as a side-effect of UBSAN analysis on ppc64le. - Fix DM verity target to limit the amount of error messages that can result from a corrupt block being found. * tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm verity: use message limit for data block corruption message dm table: don't copy from a NULL pointer in realloc_argv() dm log writes: make sure super sector log updates are written in order dm init: remove trailing newline from calls to DMERR() and DMINFO() dm init: fix incorrect uses of kstrndup()
2019-06-28Merge tag 'for-linus-20190627' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull pidfd fixes from Christian Brauner: "Userspace tools and libraries such as strace or glibc need a cheap and reliable way to tell whether CLONE_PIDFD is supported. The easiest way is to pass an invalid fd value in the return argument, perform the syscall and verify the value in the return argument has been changed to a valid fd. However, if CLONE_PIDFD is specified we currently check if pidfd == 0 and return EINVAL if not. The check for pidfd == 0 was originally added to enable us to abuse the return argument for passing additional flags along with CLONE_PIDFD in the future. However, extending legacy clone this way would be a terrible idea and with clone3 on the horizon and the ability to reuse CLONE_DETACHED with CLONE_PIDFD there's no real need for this clutch. So remove the pidfd == 0 check and help userspace out. Also, accordig to Al, anon_inode_getfd() should only be used past the point of no failure and ksys_close() should not be used at all since it is far too easy to get wrong. Al's motto being "basically, once it's in descriptor table, it's out of your control". So Al's patch switches back to what we already had in v1 of the original patchset and uses a anon_inode_getfile() + put_user() + fd_install() sequence in the success path and a fput() + put_unused_fd() in the failure path. The other two changes should be trivial" * tag 'for-linus-20190627' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: proc: remove useless d_is_dir() check copy_process(): don't use ksys_close() on cleanups samples: make pidfd-metadata fail gracefully on older kernels fork: don't check parent_tidptr with CLONE_PIDFD
2019-06-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fix for one corner case in HID++ protocol with respect to handling very long reports, from Hans de Goede - power management fix in Intel-ISH driver, from Hyungwoo Yang - use-after-free fix in Intel-ISH driver, from Dan Carpenter - a couple of new device IDs/quirks from Kai-Heng Feng, Kyle Godbey and Oleksandr Natalenko * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: intel-ish-hid: fix wrong driver_data usage HID: multitouch: Add pointstick support for ALPS Touchpad HID: logitech-dj: Fix forwarding of very long HID++ reports HID: uclogic: Add support for Huion HS64 tablet HID: chicony: add another quirk for PixArt mouse HID: intel-ish-hid: Fix a use after free in load_fw_from_host()
2019-06-28Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "A smaller batch of fixes, nothing that stands out as risky or scary. Mostly DTS tweaks for a few issues: - GPU fixlets for Meson - CPU idle fix for LS1028A - PWM interrupt fixes for i.MX6UL Also, enable a driver (FSL_EDMA) on arm64 defconfig, and a warning and two MAINTAINER tweaks" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: dts: imx6ul: fix PWM[1-4] interrupts ARM: omap2: remove incorrect __init annotation ARM: dts: gemini Fix up DNS-313 compatible string ARM: dts: Blank D-Link DIR-685 console arm64: defconfig: Enable FSL_EDMA driver arm64: dts: ls1028a: Fix CPU idle fail. MAINTAINERS: BCM53573: Add internal Broadcom mailing list MAINTAINERS: BCM2835: Add internal Broadcom mailing list ARM: dts: meson8b: fix the operating voltage of the Mali GPU ARM: dts: meson8b: drop undocumented property from the Mali GPU node ARM: dts: meson8: fix GPU interrupts and drop an undocumented property
2019-06-28Merge tag 'afs-fixes-20190620' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "The in-kernel AFS client has been undergoing testing on opendev.org on one of their mirror machines. They are using AFS to hold data that is then served via apache, and Ian Wienand had reported seeing oopses, spontaneous machine reboots and updates to volumes going missing. This patch series appears to have fixed the problem, very probably due to patch (2), but it's not 100% certain. (1) Fix the printing of the "vnode modified" warning to exclude checks on files for which we don't have a callback promise from the server (and so don't expect the server to tell us when it changes). Without this, for every file or directory for which we still have an in-core inode that gets changed on the server, we may get a message logged when we next look at it. This can happen in bulk if, for instance, someone does "vos release" to update a R/O volume from a R/W volume and a whole set of files are all changed together. We only really want to log a message if the file changed and the server didn't tell us about it or we failed to track the state internally. (2) Fix accidental corruption of either afs_vlserver struct objects or the the following memory locations (which could hold anything). The issue is caused by a union that points to two different structs in struct afs_call (to save space in the struct). The call cleanup code assumes that it can simply call the cleanup for one of those structs if not NULL - when it might be actually pointing to the other struct. This means that every Volume Location RPC op is going to corrupt something. (3) Fix an uninitialised spinlock. This isn't too bad, it just causes a one-off warning if lockdep is enabled when "vos release" is called, but the spinlock still behaves correctly. (4) Fix the setting of i_block in the inode. This causes du, for example, to produce incorrect results, but otherwise should not be dangerous to the kernel" * tag 'afs-fixes-20190620' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Fix setting of i_blocks afs: Fix uninitialised spinlock afs_volume::cb_break_lock afs: Fix vlserver record corruption afs: Fix over zealous "vnode modified" warnings
2019-06-28Merge tag 'csky-for-linus-5.2-fixup-gcc-unwind' of ↵Linus Torvalds
git://github.com/c-sky/csky-linux Pull arch/csky fixup from Guo Ren: "A fixup patch for rt_sigframe in signal.c" * tag 'csky-for-linus-5.2-fixup-gcc-unwind' of git://github.com/c-sky/csky-linux: csky: Fixup libgcc unwind error
2019-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix ppp_mppe crypto soft dependencies, from Takashi Iawi. 2) Fix TX completion to be finite, from Sergej Benilov. 3) Use register_pernet_device to avoid a dst leak in tipc, from Xin Long. 4) Double free of TX cleanup in Dirk van der Merwe. 5) Memory leak in packet_set_ring(), from Eric Dumazet. 6) Out of bounds read in qmi_wwan, from Bjørn Mork. 7) Fix iif used in mcast/bcast looped back packets, from Stephen Suryaputra. 8) Fix neighbour resolution on raw ipv6 sockets, from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits) af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET sctp: change to hold sk after auth shkey is created successfully ipv6: fix neighbour resolution with raw socket ipv6: constify rt6_nexthop() net: dsa: microchip: Use gpiod_set_value_cansleep() net: aquantia: fix vlans not working over bridged network ipv4: reset rt_iif for recirculated mcast/bcast out pkts team: Always enable vlan tx offload net/smc: Fix error path in smc_init net/smc: hold conns_lock before calling smc_lgr_register_conn() bonding: Always enable vlan tx offload net/ipv6: Fix misuse of proc_dointvec "skip_notify_on_dev_down" ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop qmi_wwan: Fix out-of-bounds read tipc: check msg->req data len in tipc_nl_compat_bearer_disable net: macb: do not copy the mac address if NULL net/packet: fix memory leak in packet_set_ring() net/tls: fix page double free on TX cleanup net/sched: cbs: Fix error path of cbs_module_init tipc: change to use register_pernet_device ...
2019-06-27Merge branch 'bpf-sockopt-hooks'Alexei Starovoitov
Stanislav Fomichev says: ==================== This series implements two new per-cgroup hooks: getsockopt and setsockopt along with a new sockopt program type. The idea is pretty similar to recently introduced cgroup sysctl hooks, but implementation is simpler (no need to convert to/from strings). What this can be applied to: * move business logic of what tos/priority/etc can be set by containers (either pass or reject) * handle existing options (or introduce new ones) differently by propagating some information in cgroup/socket local storage Compared to a simple syscall/{g,s}etsockopt tracepoint, those hooks are context aware. Meaning, they can access underlying socket and use cgroup and socket local storage. v9: * allow overwriting setsocktop arguments (Alexei Starovoitov) (see individual changes for more changelog details) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27bpftool: support cgroup sockoptStanislav Fomichev
Support sockopt prog type and cgroup hooks in the bpftool. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27bpf: add sockopt documentationStanislav Fomichev
Provide user documentation about sockopt prog type and cgroup hooks. v9: * add details about setsockopt context and inheritance v7: * add description for retval=0 and optlen=-1 v6: * describe cgroup chaining, add example v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTIStanislav Fomichev
sockopt test that verifies chaining behavior. v9: * setsockopt chaining example v7: * rework the test to verify cgroup getsockopt chaining Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27selftests/bpf: add sockopt test that exercises sk helpersStanislav Fomichev
socktop test that introduces new SOL_CUSTOM sockopt level and stores whatever users sets in sk storage. Whenever getsockopt is called, the original value is retrieved. v9: * SO_SNDBUF example to override user-supplied buffer v7: * use retval=0 and optlen-1 v6: * test 'ret=1' use-case as well (Alexei Starovoitov) v4: * don't call bpf_sk_fullsock helper v3: * drop (__u8 *)(long) casts for optval{,_end} v2: * new test Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27selftests/bpf: add sockopt testStanislav Fomichev
Add sockopt selftests: * require proper expected_attach_type * enforce context field read/write access * test bpf_sockopt_handled handler * test EPERM * test limiting optlen from getsockopt * test out-of-bounds access v9: * add tests for setsockopt argument mangling v7: * remove return 2; test retval=0 and optlen=-1 v3: * use DW for optval{,_end} loads v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27selftests/bpf: test sockopt section nameStanislav Fomichev
Add tests that make sure libbpf section detection works. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27libbpf: support sockopt hooksStanislav Fomichev
Make libbpf aware of new sockopt hooks so it can derive prog type and hook point from the section names. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27bpf: sync bpf.h to tools/Stanislav Fomichev
Export new prog type and hook points to the libbpf. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27bpf: implement getsockopt and setsockopt hooksStanislav Fomichev
Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks. BPF_CGROUP_SETSOCKOPT can modify user setsockopt arguments before passing them down to the kernel or bypass kernel completely. BPF_CGROUP_GETSOCKOPT can can inspect/modify getsockopt arguments that kernel returns. Both hooks reuse existing PTR_TO_PACKET{,_END} infrastructure. The buffer memory is pre-allocated (because I don't think there is a precedent for working with __user memory from bpf). This might be slow to do for each {s,g}etsockopt call, that's why I've added __cgroup_bpf_prog_array_is_empty that exits early if there is nothing attached to a cgroup. Note, however, that there is a race between __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup program layout might have changed; this should not be a problem because in general there is a race between multiple calls to {s,g}etsocktop and user adding/removing bpf progs from a cgroup. The return code of the BPF program is handled as follows: * 0: EPERM * 1: success, continue with next BPF program in the cgroup chain v9: * allow overwriting setsockopt arguments (Alexei Starovoitov): * use set_fs (same as kernel_setsockopt) * buffer is always kzalloc'd (no small on-stack buffer) v8: * use s32 for optlen (Andrii Nakryiko) v7: * return only 0 or 1 (Alexei Starovoitov) * always run all progs (Alexei Starovoitov) * use optval=0 as kernel bypass in setsockopt (Alexei Starovoitov) (decided to use optval=-1 instead, optval=0 might be a valid input) * call getsockopt hook after kernel handlers (Alexei Starovoitov) v6: * rework cgroup chaining; stop as soon as bpf program returns 0 or 2; see patch with the documentation for the details * drop Andrii's and Martin's Acked-by (not sure they are comfortable with the new state of things) v5: * skip copy_to_user() and put_user() when ret == 0 (Martin Lau) v4: * don't export bpf_sk_fullsock helper (Martin Lau) * size != sizeof(__u64) for uapi pointers (Martin Lau) * offsetof instead of bpf_ctx_range when checking ctx access (Martin Lau) v3: * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko) * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii Nakryiko) * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau) * use BPF_FIELD_SIZEOF() for consistency (Martin Lau) * new CG_SOCKOPT_ACCESS macro to wrap repeated parts v2: * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau) * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau) * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau) * added [0,2] return code check to verifier (Martin Lau) * dropped unused buf[64] from the stack (Martin Lau) * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau) * dropped bpf_target_off from ctx rewrites (Martin Lau) * use return code for kernel bypass (Martin Lau & Andrii Nakryiko) Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-27Merge branch 'bpf-af-xdp-mlx5e'Daniel Borkmann
Tariq Toukan says: ==================== This series contains improvements to the AF_XDP kernel infrastructure and AF_XDP support in mlx5e. The infrastructure improvements are required for mlx5e, but also some of them benefit to all drivers, and some can be useful for other drivers that want to implement AF_XDP. The performance testing was performed on a machine with the following configuration: - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz - Mellanox ConnectX-5 Ex with 100 Gbit/s link The results with retpoline disabled, single stream: txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU) rxdrop: 12.2 Mpps l2fwd: 9.4 Mpps The results with retpoline enabled, single stream: txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU) rxdrop: 9.9 Mpps l2fwd: 6.8 Mpps v2 changes: Added patches for mlx5e and addressed the comments for v1. Rebased for bpf-next. v3 changes: Rebased for the newer bpf-next, resolved conflicts in libbpf. Addressed Björn's comments for coding style. Fixed a bug in error handling flow in mlx5e_open_xsk. v4 changes: UAPI is not changed, XSK RX queues are exposed to the kernel. The lower half of the available amount of RX queues are regular queues, and the upper half are XSK RX queues. The patch "xsk: Extend channels to support combined XSK/non-XSK traffic" was dropped. The final patch was reworked accordingly. Added "net/mlx5e: Attach/detach XDP program safely", as the changes introduced in the XSK patch base on the stuff from this one. Added "libbpf: Support drivers with non-combined channels", which aligns the condition in libbpf with the condition in the kernel. Rebased over the newer bpf-next. v5 changes: In v4, ethtool reports the number of channels as 'combined' and the number of XSK RX queues as 'rx' for mlx5e. It was changed, so that 'rx' is 0, and 'combined' reports the double amount of channels if there is an active UMEM - to make libbpf happy. The patch for libbpf was dropped. Although it's still useful and fixes things, it raises some disagreement, so I'm dropping it - it's no longer useful for mlx5e anymore after the change above. v6 changes: As Maxim is out of office, I rebased the series on behalf of him, solved some conflicts, and re-spinned. ==================== Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-27net/mlx5e: Add XSK zero-copy supportMaxim Mikityanskiy
This commit adds support for AF_XDP zero-copy RX and TX. We create a dedicated XSK RQ inside the channel, it means that two RQs are running simultaneously: one for non-XSK traffic and the other for XSK traffic. The regular and XSK RQs use a single ID namespace split into two halves: the lower half is regular RQs, and the upper half is XSK RQs. When any zero-copy AF_XDP socket is active, changing the number of channels is not allowed, because it would break to mapping between XSK RQ IDs and channels. XSK requires different page allocation and release routines. Such functions as mlx5e_{alloc,free}_rx_mpwqe and mlx5e_{get,put}_rx_frag are generic enough to be used for both regular and XSK RQs, and they use the mlx5e_page_{alloc,release} wrappers around the real allocation functions. Function pointers are not used to avoid losing the performance with retpolines. Wherever it's certain that the regular (non-XSK) page release function should be used, it's called directly. Only the stats that could be meaningful for XSK are exposed to the userspace. Those that don't take part in the XSK flow are not considered. Note that we don't wait for WQEs on the XSK RQ (unlike the regular RQ), because the newer xdpsock sample doesn't provide any Fill Ring entries at the setup stage. We create a dedicated XSK SQ in the channel. This separation has its advantages: 1. When the UMEM is closed, the XSK SQ can also be closed and stop receiving completions. If an existing SQ was used for XSK, it would continue receiving completions for the packets of the closed socket. If a new UMEM was opened at that point, it would start getting completions that don't belong to it. 2. Calculating statistics separately. When the userspace kicks the TX, the driver triggers a hardware interrupt by posting a NOP to a dedicated XSK ICO (internal control operations) SQ, in order to trigger NAPI on the right CPU core. This XSK ICO SQ is protected by a spinlock, as the userspace application may kick the TX from any core. Store the pointers to the UMEMs in the net device private context, independently from the kernel. This way the driver can distinguish between the zero-copy and non-zero-copy UMEMs. The kernel function xdp_get_umem_from_qid does not care about this difference, but the driver is only interested in zero-copy UMEMs, particularly, on the cleanup it determines whether to close the XSK RQ and SQ or not by looking at the presence of the UMEM. Use state_lock to protect the access to this area of UMEM pointers. LRO isn't compatible with XDP, but there may be active UMEMs while XDP is off. If this is the case, don't allow LRO to ensure XDP can be reenabled at any time. The validation of XSK parameters typically happens when XSK queues open. However, when the interface is down or the XDP program isn't set, it's still possible to have active AF_XDP sockets and even to open new, but the XSK queues will be closed. To cover these cases, perform the validation also in these flows: 1. A new UMEM is registered, but the XSK queues aren't going to be created due to missing XDP program or interface being down. 2. MTU changes while there are UMEMs registered. Having this early check prevents mlx5e_open_channels from failing at a later stage, where recovery is impossible and the application has no chance to handle the error, because it got the successful return value for an MTU change or XSK open operation. The performance testing was performed on a machine with the following configuration: - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz - Mellanox ConnectX-5 Ex with 100 Gbit/s link The results with retpoline disabled, single stream: txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU) rxdrop: 12.2 Mpps l2fwd: 9.4 Mpps The results with retpoline enabled, single stream: txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU) rxdrop: 9.9 Mpps l2fwd: 6.8 Mpps Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-27net/mlx5e: Move queue param structs to en/params.hMaxim Mikityanskiy
structs mlx5e_{rq,sq,cq,channel}_param are going to be used in the upcoming XSK RX and TX patches. Move them to a header file to make them accessible from other C files. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>