summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-09-09igc: Add tx_csum offload functionalitySasha Neftin
Add IP generic TX checksum offload functionality. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09ixgbe: sync the first fragment unconditionallyFiro Yang
In Xen environment, if Xen-swiotlb is enabled, ixgbe driver could possibly allocate a page, DMA memory buffer, for the first fragment which is not suitable for Xen-swiotlb to do DMA operations. Xen-swiotlb have to internally allocate another page for doing DMA operations. This mechanism requires syncing the data from the internal page to the page which ixgbe sends to upper network stack. However, since commit f3213d932173 ("ixgbe: Update driver to make use of DMA attributes in Rx path"), the unmap operation is performed with DMA_ATTR_SKIP_CPU_SYNC. As a result, the sync is not performed. Since the sync isn't performed, the upper network stack could receive a incomplete network packet. By incomplete, it means the linear data on the first fragment(between skb->head and skb->end) is invalid. So we have to copy the data from the internal xen-swiotlb page to the page which ixgbe sends to upper network stack through the sync operation. More details from Alexander Duyck: Specifically since we are mapping the frame with DMA_ATTR_SKIP_CPU_SYNC we have to unmap with that as well. As a result a sync is not performed on an unmap and must be done manually as we skipped it for the first frag. As such we need to always sync before possibly performing a page unmap operation. Fixes: f3213d932173 ("ixgbe: Update driver to make use of DMA attributes in Rx path") Signed-off-by: Firo Yang <firo.yang@suse.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09i40e: Remove EMPR traces from debugfs facilityMauro S. M. Rodrigues
Since commit '5098850c9b9b ("i40e/i40evf: i40e_register.h updates")' it is no longer possible to trigger an EMP Reset from debugfs, but it's possible to request it either way, to end up with a bad reset request: echo empr > /sys/kernel/debug/i40e/0002\:01\:00.1/command i40e 0002:01:00.1: debugfs: forcing EMPR i40e 0002:01:00.1: bad reset request 0x00010000 So let's remove this piece of code and show the available valid commands as it is when any invalid command is issued. Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09i40e: Implement debug macro hw_dbg using dev_dbgMauro S. M. Rodrigues
There are several uses of hw_dbg in the code, producing no output. This patch implements it using dev_debug. Initially the intention was to implement it using netdev_dbg, analogously to what is done in ixgbe for instance. That approach was avoided due to some early usages of hw_dbg, like i40e_pf_reset, before the VSI structure initialization causing NULL pointer dereference during the driver probe if the debug messages were turned on as soon as the module is probed. v2: - Use dev_dbg instead of pr_debug, and take advantage of dev_name instead of crafting pretty much the same device name locally as suggested by Jakub Kicinski. Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09Merge tag 'regulator-fix-v5.3-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "This is obviouly very late, containing three small and simple driver specific fixes. The main one is the TWL fix, this fixes issues with cpufreq on the PMICs used with BeagleBoard generation OMAP SoCs which had been broken due to changes in the generic OPP code exposing a bug in the regulator driver for these devices causing them to think that OPPs weren't supported on the system. Sorry about sending this so late, I hadn't registered that the TWL issue manifested in cpufreq" * tag 'regulator-fix-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: twl: voltage lists for vdd1/2 on twl4030 regulator: act8945a-regulator: fix ldo register addresses in set_mode hook regulator: slg51000: Fix a couple NULL vs IS_ERR() checks
2019-09-09i40e: fix hw_dbg usage in i40e_hmc_get_object_vaMauro S. M. Rodrigues
The mentioned function references a i40e_hw attribute, as parameter for hw_dbg, but it doesn't exist in the function scope. Fixes it by changing parameters from i40e_hmc_info to i40e_hw which can retrieve the necessary i40e_hmc_info. v2: - Fixed reverse xmas tree code style issue as suggested by Jakub Kicinski Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09igc: Remove unneeded PCI bus definesSasha Neftin
PCIe device control 2 defines does not use internally. This patch comes to clean up those. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09iavf: allow permanent MAC address to changeMitch Williams
Allow the VF to override the "permanent" MAC address set by the host. This allows bonding to work in the case where the administrator has set the VF MAC. Note that the VF must still be set to Trusted on the host if this change is to be accepted by the PF driver. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09igc: Add NVM checksum validationSasha Neftin
Add NVM checksum validation during probe functionality. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09fm10k: use a local variable for the frag pointerJacob Keller
In the function fm10k_xmit_frame_ring, we recently switched to using the skb_frag_size accessor instead of directly using the size member of the skb fragment. This made the for loop slightly harder to read because it created a very long line that is difficult to split up. Avoid this by using a local variable in the for loop, so that we do not have to break the line on an open parenthesis. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09Documentation: iavf: Update the Intel LAN driver doc for iavfJeff Kirsher
Update the LAN driver documentation to include the latest feature implementation and driver capabilities. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2019-09-09igc: Remove useless forward declarationSasha Neftin
Move igc_phy_setup_autoneg, igc_wait_autoneg and igc_set_fc_watermarks up to avoid forward declaration. It is not necessary to forward declare these static methods. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09e1000e: Make speed detection on hotplugging cable more reliableKai-Heng Feng
After hot plugging an 1Gbps Ethernet cable with 1Gbps link partner, the MII_BMSR may report 10Mbps, renders the network rather slow. The issue has much lower fail rate after commit 59653e6497d1 ("e1000e: Make watchdog use delayed work"), which essentially introduces some delay before running the watchdog task. But there's still a chance that the hot plugging event and the queued watchdog task gets run at the same time, then the original issue can be observed once again. So let's use mod_delayed_work() to add a deterministic 1 second delay before running watchdog task, after an interrupt. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09ixgbevf: Link lost in VM on ixgbevf when restoring from freeze or suspendRadoslaw Tyl
This patch fixed issue in VM which shows no link when hypervisor is restored from low-power state. The driver is responsible for re-enabling any features of the device that had been disabled during suspend calls, such as IRQs and bus mastering. Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09iavf: remove unused debug function iavf_debug_dYueHaibing
There is no caller of function iavf_debug_d() in tree since commit 75051ce4c5d8 ("iavf: Fix up debug print macro"), so it can be removed. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-09virtio_ring: fix unmap of indirect descriptorsMatthias Lange
The function virtqueue_add_split() DMA-maps the scatterlist buffers. In case a mapping error occurs the already mapped buffers must be unmapped. This happens by jumping to the 'unmap_release' label. In case of indirect descriptors the release is wrong and may leak kernel memory. Because the implementation assumes that the head descriptor is already mapped it starts iterating over the descriptor list starting from the head descriptor. However for indirect descriptors the head descriptor is never mapped in case of an error. The fix is to initialize the start index with zero in case of indirect descriptors and use the 'desc' pointer directly for iterating over the descriptor chain. Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-09-09drm/i915: Restore relaxed padding (OCL_OOB_SUPPRES_ENABLE) for skl+Chris Wilson
This bit was fliped on for "syncing dependencies between camera and graphics". BSpec has no recollection why, and it is causing unrecoverable GPU hangs with Vulkan compute workloads. From BSpec, setting bit5 to 0 enables relaxed padding requirements for buffers, 1D and 2D non-array, non-MSAA, non-mip-mapped linear surfaces; and *must* be set to 0h on skl+ to ensure "Out of Bounds" case is suppressed. Reported-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998 Fixes: 8424171e135c ("drm/i915/gen9: h/w w/a: syncing dependencies between camera and graphics") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: denys.kostin@globallogic.com Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.1+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190904100707.7377-1-chris@chris-wilson.co.uk (cherry picked from commit 9d7b01e93526efe79dbf75b69cc5972b5a4f7b37) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-09-09drm/i915: Limit MST to <= 8bpc once againVille Syrjälä
My attempt at allowing MST to use the higher color depths has regressed some configurations. Apparently people have setups where all MST streams will fit into the DP link with 8bpc but won't fit with higher color depths. What we really should be doing is reducing the bpc for all the streams on the same link until they start to fit. But that requires a bit more work, so in the meantime let's revert back closer to the old behavior and limit MST to at most 8bpc. Cc: stable@vger.kernel.org Cc: Lyude Paul <lyude@redhat.com> Tested-by: Geoffrey Bennett <gmux22@gmail.com> Fixes: f1477219869c ("drm/i915: Remove the 8bpc shackles from DP MST") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111505 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190828102059.2512-1-ville.syrjala@linux.intel.com Reviewed-by: Lyude Paul <lyude@redhat.com> (cherry picked from commit 75427b2a2bffc083d51dec389c235722a9c69b05) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2019-09-09gpio: fix line flag validation in lineevent_createKent Gibson
lineevent_create should not allow any of GPIOHANDLE_REQUEST_OUTPUT, GPIOHANDLE_REQUEST_OPEN_DRAIN or GPIOHANDLE_REQUEST_OPEN_SOURCE to be set. Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines") Cc: stable <stable@vger.kernel.org> Signed-off-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-09-09gpio: fix line flag validation in linehandle_createKent Gibson
linehandle_create should not allow both GPIOHANDLE_REQUEST_INPUT and GPIOHANDLE_REQUEST_OUTPUT to be set. Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines") Cc: stable <stable@vger.kernel.org> Signed-off-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-09-09gpio: mockup: add missing single_release()Wei Yongjun
When using single_open() for opening, single_release() should be used instead of seq_release(), otherwise there is a memory leak. Fixes: 2a9e27408e12 ("gpio: mockup: rework debugfs interface") Cc: stable <stable@vger.kernel.org> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2019-09-08Linux 5.3-rc8v5.3-rc8Linus Torvalds
2019-09-08netfilter: nf_tables_offload: move indirect flow_block callback logic to corePablo Neira Ayuso
Add nft_offload_init() and nft_offload_exit() function to deal with the init and the exit path of the offload infrastructure. Rename nft_indr_block_get_and_ing_cmd() to nft_indr_block_cb(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08Merge tag 'compiler-attributes-for-linus-v5.3-rc8' of ↵Linus Torvalds
git://github.com/ojeda/linux Pull section attribute fix from Miguel Ojeda: "Fix Oops in Clang-compiled kernels (Nick Desaulniers)" * tag 'compiler-attributes-for-linus-v5.3-rc8' of git://github.com/ojeda/linux: include/linux/compiler.h: fix Oops for Clang-compiled kernels
2019-09-08Merge tag 'gpio-v5.3-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "All related to the PCA953x driver when handling chips with more than 8 ports, now that works again" * tag 'gpio-v5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: use pca953x_read_regs instead of regmap_bulk_read gpio: pca953x: correct type of reg_direction
2019-09-08netfilter: nf_tables_offload: avoid excessive stack usageArnd Bergmann
The nft_offload_ctx structure is much too large to put on the stack: net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=] Use dynamic allocation here, as we do elsewhere in the same function. Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08netfilter: nf_tables: Fix an Oops in nf_tables_updobj() error handlingDan Carpenter
The "newobj" is an error pointer so we can't pass it to kfree(). It doesn't need to be freed so we can remove that and I also renamed the error label. Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08include/linux/compiler.h: fix Oops for Clang-compiled kernelsNick Desaulniers
GCC unescapes escaped string section names while Clang does not. Because __section uses the `#` stringification operator for the section name, it doesn't need to be escaped. This fixes an Oops observed in distro's that use systemd and not net.core.bpf_jit_enable=1, when their kernels are compiled with Clang. Link: https://github.com/ClangBuiltLinux/linux/issues/619 Link: https://bugs.llvm.org/show_bug.cgi?id=42950 Link: https://marc.info/?l=linux-netdev&m=156412960619946&w=2 Link: https://lore.kernel.org/lkml/20190904181740.GA19688@gmail.com/ Acked-by: Will Deacon <will@kernel.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> [Cherry-picked from the __section cleanup series for 5.3] [Adjusted commit message] Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-09-08x86/timer: Force PIT initialization when !X86_FEATURE_ARATJan Stancek
KVM guests with commit c8c4076723da ("x86/timer: Skip PIT initialization on modern chipsets") applied to guest kernel have been observed to have unusually higher CPU usage with symptoms of increase in vm exits for HLT and MSW_WRITE (MSR_IA32_TSCDEADLINE). This is caused by older QEMUs lacking support for X86_FEATURE_ARAT. lapic clock retains CLOCK_EVT_FEAT_C3STOP and nohz stays inactive. There's no usable broadcast device either. Do the PIT initialization if guest CPU lacks X86_FEATURE_ARAT. On real hardware it shouldn't matter as ARAT and DEADLINE come together. Fixes: c8c4076723da ("x86/timer: Skip PIT initialization on modern chipsets") Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-09-07Revert "x86/apic: Include the LDR when clearing out APIC registers"Linus Torvalds
This reverts commit 558682b5291937a70748d36fd9ba757fb25b99ae. Chris Wilson reports that it breaks his CPU hotplug test scripts. In particular, it breaks offlining and then re-onlining the boot CPU, which we treat specially (and the BIOS does too). The symptoms are that we can offline the CPU, but it then does not come back online again: smpboot: CPU 0 is now offline smpboot: Booting Node 0 Processor 0 APIC 0x0 smpboot: do_boot_cpu failed(-1) to wakeup CPU#0 Thomas says he knows why it's broken (my personal suspicion: our magic handling of the "cpu0_logical_apicid" thing), but for 5.3 the right fix is to just revert it, since we've never touched the LDR bits before, and it's not worth the risk to do anything else at this stage. [ Hotpluging of the boot CPU is special anyway, and should be off by default. See the "BOOTPARAM_HOTPLUG_CPU0" config option and the cpu0_hotplug kernel parameter. In general you should not do it, and it has various known limitations (hibernate and suspend require the boot CPU, for example). But it should work, even if the boot CPU is special and needs careful treatment - Linus ] Link: https://lore.kernel.org/lkml/156785100521.13300.14461504732265570003@skylake-alporthouse-com/ Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bandan Das <bsd@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-07ipc: fix sparc64 ipc() wrapperArnd Bergmann
Matt bisected a sparc64 specific issue with semctl, shmctl and msgctl to a commit from my y2038 series in linux-5.1, as I missed the custom sys_ipc() wrapper that sparc64 uses in place of the generic version that I patched. The problem is that the sys_{sem,shm,msg}ctl() functions in the kernel now do not allow being called with the IPC_64 flag any more, resulting in a -EINVAL error when they don't recognize the command. Instead, the correct way to do this now is to call the internal ksys_old_{sem,shm,msg}ctl() functions to select the API version. As we generally move towards these functions anyway, change all of sparc_ipc() to consistently use those in place of the sys_*() versions, and move the required ksys_*() declarations into linux/syscalls.h The IS_ENABLED(CONFIG_SYSVIPC) check is required to avoid link errors when ipc is disabled. Reported-by: Matt Turner <mattst88@gmail.com> Fixes: 275f22148e87 ("ipc: rename old-style shmctl/semctl/msgctl syscalls") Cc: stable@vger.kernel.org Tested-by: Matt Turner <mattst88@gmail.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-07Merge tag 'char-misc-5.3-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull Documentation updates from Greg KH: "A few small patches for the documenation file that came in through the char-misc tree in -rc7 for your tree. They fix the mistake in the .rst format that kept the table of companies from showing up in the html output, and most importantly, add people's names to the list showing support for our process" * tag 'char-misc-5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Documentation/process: Add Qualcomm process ambassador for hardware security issues Documentation/process/embargoed-hardware-issues: Microsoft ambassador Documentation/process: Add Google contact for embargoed hardware issues Documentation/process: Volunteer as the ambassador for Xen
2019-09-07Documentation/process: Add Qualcomm process ambassador for hardware security ↵Trilok Soni
issues Add Trilok Soni as process ambassador for hardware security issues from Qualcomm. Signed-off-by: Trilok Soni <tsoni@codeaurora.org> Link: https://lore.kernel.org/r/1567796517-8964-1-git-send-email-tsoni@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-07Merge tag 'dmaengine-fix-5.3' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull dmaengine fixes from Vinod Koul: "Some late fixes for drivers: - memory leak in ti crossbar dma driver - cleanup of omap dma probe - Fix for link list configuration in sprd dma driver - Handling fixed for DMACHCLR if iommu is mapped in rcar dma" * tag 'dmaengine-fix-5.3' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: rcar-dmac: Fix DMACHCLR handling if iommu is mapped dmaengine: sprd: Fix the DMA link-list configuration dmaengine: ti: omap-dma: Add cleanup in omap_dma_probe() dmaengine: ti: dma-crossbar: Fix a memory leak bug
2019-09-07Merge branch 'net-tls-small-TX-offload-optimizations'David S. Miller
Jakub Kicinski says: ==================== net/tls: small TX offload optimizations This set brings small TLS TX device optimizations. The biggest gain comes from fixing a misuse of non temporal copy instructions. On a synthetic workload modelled after customer's RFC application I see 3-5% percent gain. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net/tls: align non temporal copy to cache linesJakub Kicinski
Unlike normal TCP code TLS has to touch the cache lines it copies into to fill header info. On memory-heavy workloads having non temporal stores and normal accesses targeting the same cache line leads to significant overhead. Measured 3% overhead running 3600 round robin connections with additional memory heavy workload. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net/tls: remove the record tail optimizationJakub Kicinski
For TLS device offload the tag/message authentication code are filled in by the device. The kernel merely reserves space for them. Because device overwrites it, the contents of the tag make do no matter. Current code tries to save space by reusing the header as the tag. This, however, leads to an additional frag being created and defeats buffer coalescing (which trickles all the way down to the drivers). Remove this optimization, and try to allocate the space for the tag in the usual way, leave the memory uninitialized. If memory allocation fails rewind the record pointer so that we use the already copied user data as tag. Note that the optimization was actually buggy, as the tag for TLS 1.2 is 16 bytes, but header is just 13, so the reuse may had looked past the end of the page.. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net/tls: use RCU for the adder to the offload record listJakub Kicinski
All modifications to TLS record list happen under the socket lock. Since records form an ordered queue readers are only concerned about elements being removed, additions can happen concurrently. Use RCU primitives to ensure the correct access types (READ_ONCE/WRITE_ONCE). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net/tls: unref frags in orderJakub Kicinski
It's generally more cache friendly to walk arrays in order, especially those which are likely not in cache. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2019-09-06 Here's the main bluetooth-next pull request for the 5.4 kernel. - Cleanups & fixes to btrtl driver - Fixes for Realtek devices in btusb, e.g. for suspend handling - Firmware loading support for BCM4345C5 - hidp_send_message() return value handling fixes - Added support for utilizing Fast Advertising Interval - Various other minor cleanups & fixes Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07nfp: flower: cmsg rtnl locks can timeout reify messagesFred Lotter
Flower control message replies are handled in different locations. The truly high priority replies are handled in the BH (tasklet) context, while the remaining replies are handled in a predefined Linux work queue. The work queue handler orders replies into high and low priority groups, and always start servicing the high priority replies within the received batch first. Reply Type: Rtnl Lock: Handler: CMSG_TYPE_PORT_MOD no BH tasklet (mtu) CMSG_TYPE_TUN_NEIGH no BH tasklet CMSG_TYPE_FLOW_STATS no BH tasklet CMSG_TYPE_PORT_REIFY no WQ high CMSG_TYPE_PORT_MOD yes WQ high (link/mtu) CMSG_TYPE_MERGE_HINT yes WQ low CMSG_TYPE_NO_NEIGH no WQ low CMSG_TYPE_ACTIVE_TUNS no WQ low CMSG_TYPE_QOS_STATS no WQ low CMSG_TYPE_LAG_CONFIG no WQ low A subset of control messages can block waiting for an rtnl lock (from both work queue priority groups). The rtnl lock is heavily contended for by external processes such as systemd-udevd, systemd-network and libvirtd, especially during netdev creation, such as when flower VFs and representors are instantiated. Kernel netlink instrumentation shows that external processes (such as systemd-udevd) often use successive rtnl_trylock() sequences, which can result in an rtnl_lock() blocked control message to starve for longer periods of time during rtnl lock contention, i.e. netdev creation. In the current design a single blocked control message will block the entire work queue (both priorities), and introduce a latency which is nondeterministic and dependent on system wide rtnl lock usage. In some extreme cases, one blocked control message at exactly the wrong time, just before the maximum number of VFs are instantiated, can block the work queue for long enough to prevent VF representor REIFY replies from getting handled in time for the 40ms timeout. The firmware will deliver the total maximum number of REIFY message replies in around 300us. Only REIFY and MTU update messages require replies within a timeout period (of 40ms). The MTU-only updates are already done directly in the BH (tasklet) handler. Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts caused by a blocked work queue waiting on rtnl locks. Signed-off-by: Fred Lotter <frederik.lotter@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: hns3: make array spec_opcode static const, makes object smallerColin Ian King
Don't populate the array spec_opcode on the stack but instead make it static const. Makes the object code smaller by 48 bytes. Before: text data bss dec hex filename 6914 1040 128 8082 1f92 hns3/hns3vf/hclgevf_cmd.o After: text data bss dec hex filename 6866 1040 128 8034 1f62 hns3/hns3vf/hclgevf_cmd.o (gcc version 9.2.1, amd64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07be2net: make two arrays static const, makes object smallerColin Ian King
Don't populate the arrays on the stack but instead make them static const. Makes the object code smaller by 281 bytes. Before: text data bss dec hex filename 87553 5672 0 93225 16c29 benet/be_cmds.o After: text data bss dec hex filename 87112 5832 0 92944 16b10 benet/be_cmds.o (gcc version 9.2.1, amd64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07ionic: Remove unused including <linux/version.h>YueHaibing
Remove including <linux/version.h> that don't need it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: gso: Fix skb_segment splat when splitting gso_size mangled skb having ↵Shmulik Ladkani
linear-headed frag_list Historically, support for frag_list packets entering skb_segment() was limited to frag_list members terminating on exact same gso_size boundaries. This is verified with a BUG_ON since commit 89319d3801d1 ("net: Add frag_list support to skb_segment"), quote: As such we require all frag_list members terminate on exact MSS boundaries. This is checked using BUG_ON. As there should only be one producer in the kernel of such packets, namely GRO, this requirement should not be difficult to maintain. However, since commit 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper"), the "exact MSS boundaries" assumption no longer holds: An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but leaves the frag_list members as originally merged by GRO with the original 'gso_size'. Example of such programs are bpf-based NAT46 or NAT64. This lead to a kernel BUG_ON for flows involving: - GRO generating a frag_list skb - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room() - skb_segment() of the skb See example BUG_ON reports in [0]. In commit 13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb"), skb_segment() was modified to support the "gso_size mangling" case of a frag_list GRO'ed skb, but *only* for frag_list members having head_frag==true (having a page-fragment head). Alas, GRO packets having frag_list members with a linear kmalloced head (head_frag==false) still hit the BUG_ON. This commit adds support to skb_segment() for a 'head_skb' packet having a frag_list whose members are *non* head_frag, with gso_size mangled, by disabling SG and thus falling-back to copying the data from the given 'head_skb' into the generated segmented skbs - as suggested by Willem de Bruijn [1]. Since this approach involves the penalty of skb_copy_and_csum_bits() when building the segments, care was taken in order to enable this solution only when required: - untrusted gso_size, by testing SKB_GSO_DODGY is set (SKB_GSO_DODGY is set by any gso_size mangling functions in net/core/filter.c) - the frag_list is non empty, its item is a non head_frag, *and* the headlen of the given 'head_skb' does not match the gso_size. [0] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/ https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/ [1] https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com/ Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07Merge branch 'stmmac-next'David S. Miller
Jose Abreu says: ==================== net: stmmac: Improvements and fixes for -next Improvements and fixes for recently introduced features. All for -next tree. More info in commit logs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: stmmac: Limit max speeds of XGMAC if asked toJose Abreu
We may have some SoCs that can't achieve XGMAC max speed. Limit it if asked to. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: stmmac: selftests: Add Split Header testJose Abreu
Add a test to validate that Split Header feature is working correctly. It works by using the rececently introduced counter that increments each time a packet with split header is received. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: stmmac: dwmac4: Enable RX Jumbo frame supportJose Abreu
We are already doing it by default in the TX path so we can also enable Jumbo Frame support in the RX path independently of MTU value. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07net: stmmac: selftests: Set RX tail pointer in Flow Control testJose Abreu
We need to set the RX tail pointer so that RX engine starts working again after finishing the Flow Control test. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>