summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-05-28net: ks8851: Use 16-bit read of RXFC registerMarek Vasut
The RXFC register is the only one being read using 8-bit accessors. To make it easier to support the 16-bit accesses used by the parallel bus variant of KS8851, use 16-bit accessor to read RXFC register as well as neighboring RXFCTR register. Remove ks8851_rdreg8() as it is not used anywhere anymore. There should be no functional change. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Use 16-bit writes to program MAC addressMarek Vasut
On the SPI variant of KS8851, the MAC address can be programmed with either 8/16/32-bit writes. To make it easier to support the 16-bit parallel option of KS8851 too, switch both the MAC address programming and readout to 16-bit operations. Remove ks8851_wrreg8() as it is not used anywhere anymore. There should be no functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Remove ks8851_rdreg32()Marek Vasut
The ks8851_rdreg32() is used only in one place, to read two registers using a single read. To make it easier to support 16-bit accesses via parallel bus later on, replace this single read with two 16-bit reads from each of the registers and drop the ks8851_rdreg32() altogether. If this has noticeable performance impact on the SPI variant of KS8851, then we should consider using regmap to abstract the SPI and parallel bus options and in case of SPI, permit regmap to merge register reads of neighboring registers into single, longer, read. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Use dev_{get,set}_drvdata()Marek Vasut
Replace spi_{get,set}_drvdata() with dev_{get,set}_drvdata(), which works for both SPI and platform drivers. This is done in preparation for unifying the KS8851 SPI and parallel bus drivers. There should be no functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Use devm_alloc_etherdev()Marek Vasut
Use device managed version of alloc_etherdev() to simplify the code. No functional change intended. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Pass device node into ks8851_init_mac()Marek Vasut
Since the driver probe function already has a struct device *dev pointer and can easily derive of_node pointer from it, pass the of_node pointer as a parameter to ks8851_init_mac() to avoid fishing it out from ks->spidev. This is the only reference to spidev in the function, so get rid of it. This is done in preparation for unifying the KS8851 SPI and parallel bus drivers. No functional change. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Replace dev_err() with netdev_err() in IRQ handlerMarek Vasut
Use netdev_err() instead of dev_err() to avoid accessing the spidev->dev in the interrupt handler. This is the only place which uses the spidev in this function, so replace it with netdev_err() to get rid of it. This is done in preparation for unifying the KS8851 SPI and parallel drivers. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Rename ndev to netdev in probeMarek Vasut
Rename ndev variable to netdev for the sake of consistency. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28net: ks8851: Factor out spi->dev in probe()/remove()Marek Vasut
Pull out the spi->dev into one common place in the function instead of having it repeated over and over again. This is done in preparation for unifying ks8851 and ks8851-mll drivers. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge branch 'vmxnet3-upgrade-to-version-4'David S. Miller
Ronak Doshi says: ==================== vmxnet3: upgrade to version 4 vmxnet3 emulation has recently added several new features which includes offload support for tunnel packets, support for new commands the driver can issue to emulation, change in descriptor fields, etc. This patch series extends the vmxnet3 driver to leverage these new features. Compatibility is maintained using existing vmxnet3 versioning mechanism as follows: - new features added to vmxnet3 emulation are associated with new vmxnet3 version viz. vmxnet3 version 4. - emulation advertises all the versions it supports to the driver. - during initialization, vmxnet3 driver picks the highest version number supported by both the emulation and the driver and configures emulation to run at that version. In particular, following changes are introduced: Patch 1: This patch introduces utility macros for vmxnet3 version 4 comparison and updates Copyright information. Patch 2: This patch implements get_rss_hash_opts and set_rss_hash_opts methods to allow querying and configuring different Rx flow hash configurations which can be used to support UDP/ESP RSS. Patch 3: This patch introduces segmentation and checksum offload support for encapsulated packets. This avoids segmenting and calculating checksum for each segment and hence gives performance boost. Patch 4: With all vmxnet3 version 4 changes incorporated in the vmxnet3 driver, with this patch, the driver can configure emulation to run at vmxnet3 version 4. Changes in v3 -> v4: - Replaced BUG_ON() with WARN_ON_ONCE() Changes in v2 -> v3: - fixed get_rss_hash_opts to return correct values for udp rss Changes in v2: - Fixed compilation issue due to missing closed brace - added fallthrough comment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28vmxnet3: update to version 4Ronak Doshi
With all vmxnet3 version 4 changes incorporated in the vmxnet3 driver, the driver can configure emulation to run at vmxnet3 version 4, provided the emulation advertises support for version 4. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28vmxnet3: add geneve and vxlan tunnel offload supportRonak Doshi
Vmxnet3 version 3 device supports checksum/TSO offload. Thus, vNIC to pNIC traffic can leverage hardware checksum/TSO offloads. However, vmxnet3 does not support checksum/TSO offload for Geneve/VXLAN encapsulated packets. Thus, for a vNIC configured with an overlay, the guest stack must first segment the inner packet, compute the inner checksum for each segment and encapsulate each segment before transmitting the packet via the vNIC. This results in significant performance penalty. This patch will enhance vmxnet3 to support Geneve/VXLAN TSO as well as checksum offload. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28vmxnet3: add support to get/set rx flow hashRonak Doshi
With vmxnet3 version 4, the emulation supports multiqueue(RSS) for UDP and ESP traffic. A guest can enable/disable RSS for UDP/ESP over IPv4/IPv6 by issuing commands introduced in this patch. ESP ipv6 is not yet supported in this patch. This patch implements get_rss_hash_opts and set_rss_hash_opts methods to allow querying and configuring different Rx flow hash configurations. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28vmxnet3: prepare for version 4 changesRonak Doshi
vmxnet3 is currently at version 3 and this patch initiates the preparation to accommodate changes for version 4. Introduced utility macros for vmxnet3 version 4 comparison and update Copyright information. Signed-off-by: Ronak Doshi <doshir@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28ice: Refactor VF VSI release and setup functionsBrett Creeley
Currently when a VF VSI calls ice_vsi_release() and ice_vsi_setup() it subsequently clears/sets the VF cached variables for lan_vsi_idx and lan_vsi_num. This works fine, but can be improved by handling this in the VF specific VSI release and setup functions. Also, when a VF VSI is setup too many parameters are passed that can be derived from the VF. Fix this by only calling VF VSI setup with the bare minimum parameters. Also, add functionality to invalidate a VF's VSI when it's released and/or setup fails. This will make it so a VF VSI cannot be accessed via its cached vsi_idx/vsi_num in these cases. Finally when a VF's VSI is invalidated set the lan_vsi_idx and lan_vsi_num to ICE_NO_VSI to clearly show that there is no valid VSI associated with this VF. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Refactor VF resetBrett Creeley
Currently VF VSI are being reset twice during a PFR or greater. This is causing reset, specifically resetting all VFs, to take too long. This is causing various issues with VF drivers not being able to gracefully handle the VF reset timeout. Fix this by refactoring how VF reset is handled for the case mentioned previously and for the VFR/VFLR case. The refactor was done by doing the following: 1. Removing the call to ice_vsi_rebuild_by_type for ICE_VSI_VF VSI, which was causing the initial VSI rebuild. 2. Adding functions for pre/post VSI rebuild functions that can be called in both the reset all VFs case and reset individual VF case. 3. Adding VSI rebuild functions that are specific for the reset all VFs case and adding functions that are specific for the reset individual VF case. 4. Calling the pre-rebuild function, then the specific VSI rebuild function based on the reset type, and then calling the post-rebuild function to handle VF resets. This patch series makes some assumptions about how VSI are handling by FW during reset: 1. During a PFR or greater all VSI in FW will be cleared. 2. During a VFR/VFLR the VSI rebuild responsibility is in the hands of the PF software. 3. There is code in the ice_reset_all_vfs() case to amortize operations if possible. This was left intact. 4. PF software should not be replaying VSI based filters that were added other than host configured, PF software configured, or the VF's default/LAA MAC. This is the VF drivers job after it has been reset. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: remove VM/VF disable command on CORER/GLOBR resetPaul Greenwalt
Remove VM/VF disable AQC (opcode 0x0C31) when resetting all VFs. This is not required for CORER/GLOBR reset. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Add functions to rebuild host VLAN/MAC config for a VFBrett Creeley
When resetting a VF the VLAN and MAC filter configurations need to be replayed. Add helper functions for this purpose. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Add function to set trust mode bit on resetBrett Creeley
As the title says, use a function to set trust mode bit on reset. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Renaming and simplification in VF init pathBrett Creeley
Some function names weren't very clear and some portions of VF creation could be moved into functions for clarity. Fix this by renaming some functions and move pieces of code into clearly name functions. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Separate VF VSI initialization/creation from reset flowBrett Creeley
Currently the same flow is used for VF VSI initialization/creation and VF VSI reset. This makes the initialization/creation flow unnecessarily complicated. Fix this by separating the initialization/creation of the VF VSI from the reset flow. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Add helper function for clearing VPGEN_VFRTRIGBrett Creeley
Create a helper function for clearing VPGEN_VFRTRIG as this needs to be done on reset to notify the VF that we are done resetting it. Also, it needs to be done on SR-IOV initialization/creation in case it was left in a bad state after SR-IOV tear down. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Simplify ice_sriov_configureBrett Creeley
Add a new function for checking if SR-IOV can be configured based on the PF and/or device's state/capabilities. Also, simplify the flow in ice_sriov_configure(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Refactor ice_ena_vf_mappings to split MSIX and queue mappingsBrett Creeley
Currently ice_ena_vf_mappings() does all of the VF's MSIX and queue mapping in one function. This makes it hard to digest. Fix this by creating a new function for enabling MSIX mappings and one for enabling queue mappings. Also, rename some variables in the functions for clarity. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Declare functions staticTony Nguyen
ice_get_pfa_module_tlv() and ice_read_sr_word() are not being called outside of their file. Declare them as static. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: fix kernel BUG if register_netdev failsJacob Keller
If register_netdev() fails, the driver will attempt to cleanup the q_vectors and inadvertently trigger a kernel BUG due to a NULL pointer dereference. This occurs because cleaning up q_vectors attempts to call netif_napi_del on napi_structs which were never initialized. Resolve this by releasing the netdev in ice_cfg_netdev and setting vsi->netdev to NULL. This ensures that after ice_cfg_netdev fails the state is rewound to match as if ice_cfg_netdev was never called. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: fix potential double free in probe unrollingJacob Keller
If ice_init_interrupt_scheme fails, ice_probe will jump to clearing up the interrupts. This can lead to some static analysis tools such as the compiler sanitizers complaining about double free problems. Since ice_init_interrupt_scheme already unrolls internally on failure, there is no need to call ice_clear_interrupt_scheme when it fails. Add a new unroll label and use that instead. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: cleanup VSI context initializationJacob Keller
Remove an unnecessary copy of vsi->info into ctxt->info in ice_vsi_init. This line is essentially a no-op because ice_set_dflt_vsi_ctx performs a memset to clear the info from the context structure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28ice: Poll for reset completion when DDP load failsAnirudh Venkataramanan
There are certain cases where the DDP load fails and the FW issues a core reset. For these cases, wait for reset to complete before proceeding with reset of the driver init. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-28Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/asm-generic/topology.h: guard cpumask_of_node() macro argument fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() mm: remove VM_BUG_ON(PageSlab()) from page_mapcount() mm,thp: stop leaking unreleased file pages mm/z3fold: silence kmemleak false positives of slots
2020-05-28sfc: avoid an unused-variable warningArnd Bergmann
'nic_data' is no longer used outside of the #ifdef block in efx_ef10_set_mac_address: drivers/net/ethernet/sfc/ef10.c:3231:28: error: unused variable 'nic_data' [-Werror,-Wunused-variable] struct efx_ef10_nic_data *nic_data = efx->nic_data; Move the variable into a local scope. Fixes: dfcabb078847 ("sfc: move vport_id to struct efx_nic") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28sctp: check assoc before SCTP_ADDR_{MADE_PRIM, ADDED} eventJonas Falkevik
Make sure SCTP_ADDR_{MADE_PRIM,ADDED} are sent only for associations that have been established. These events are described in rfc6458#section-6.1 SCTP_PEER_ADDR_CHANGE: This tag indicates that an address that is part of an existing association has experienced a change of state (e.g., a failure or return to service of the reachability of an endpoint via a specific transport address). Signed-off-by: Jonas Falkevik <jonas.falkevik@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Just a few random driver fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics - add a second working PNP_ID for Lenovo T470s Input: applespi - replace zero-length array with flexible-array Input: axp20x-pek - always register interrupt handlers Input: lm8333 - update contact email Input: synaptics-rmi4 - fix error return code in rmi_driver_probe() Input: synaptics-rmi4 - really fix attn_data use-after-free Input: i8042 - add ThinkPad S230u to i8042 reset list Revert "Input: i8042 - add ThinkPad S230u to i8042 nomux list" Input: dlink-dir685-touchkeys - fix a typo in driver name Input: xpad - add custom init packet for Xbox One S controllers Input: evdev - call input_flush_device() on release(), not flush() Input: i8042 - add ThinkPad S230u to i8042 nomux list Input: usbtouchscreen - add support for BonXeon TP Input: cros_ec_keyb - use cros_ec_cmd_xfer_status helper Input: mms114 - fix handling of mms345l Input: elants_i2c - support palm detection
2020-05-28x86/ioperm: Prevent a memory leak when fork failsJay Lang
In the copy_process() routine called by _do_fork(), failure to allocate a PID (or further along in the function) will trigger an invocation to exit_thread(). This is done to clean up from an earlier call to copy_thread_tls(). Naturally, the child task is passed into exit_thread(), however during the process, io_bitmap_exit() nullifies the parent's io_bitmap rather than the child's. As copy_thread_tls() has been called ahead of the failure, the reference count on the calling thread's io_bitmap is incremented as we would expect. However, io_bitmap_exit() doesn't accept any arguments, and thus assumes it should trash the current thread's io_bitmap reference rather than the child's. This is pretty sneaky in practice, because in all instances but this one, exit_thread() is called with respect to the current task and everything works out. A determined attacker can issue an appropriate ioctl (i.e. KDENABIO) to get a bitmap allocated, and force a clone3() syscall to fail by passing in a zeroed clone_args structure. The kernel handles the erroneous struct and the buggy code path is followed, and even though the parent's reference to the io_bitmap is trashed, the child still holds a reference and thus the structure will never be freed. Fix this by tweaking io_bitmap_exit() and its subroutines to accept a task_struct argument which to operate on. Fixes: ea5f1cd7ab49 ("x86/ioperm: Remove bitmap if all permissions dropped") Signed-off-by: Jay Lang <jaytlang@mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable#@vger.kernel.org Link: https://lkml.kernel.org/r/20200524162742.253727-1-jaytlang@mit.edu
2020-05-28Merge tag 'csky-for-linus-5.7-rc8' of git://github.com/c-sky/csky-linuxLinus Torvalds
Pull csky fixes from Guo Ren: "Another four fixes for csky: - fix req_syscall debug - fix abiv2 syscall_trace - fix preempt enable - clean up regs usage in entry.S" * tag 'csky-for-linus-5.7-rc8' of git://github.com/c-sky/csky-linux: csky: Fixup CONFIG_DEBUG_RSEQ csky: Coding convention in entry.S csky: Fixup abiv2 syscall_trace break a4 & a5 csky: Fixup CONFIG_PREEMPT panic
2020-05-28Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"Jens Axboe
This reverts commit c58c1f83436b501d45d4050fd1296d71a9760bcb. io_uring does do the right thing for this case, and we're still returning -EAGAIN to userspace for the cases we don't support. Revert this change to avoid doing endless spins of resubmits. Cc: stable@vger.kernel.org # v5.6 Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-28include/asm-generic/topology.h: guard cpumask_of_node() macro argumentArnd Bergmann
drivers/hwmon/amd_energy.c:195:15: error: invalid operands to binary expression ('void' and 'int') (channel - data->nr_cpus)); ~~~~~~~~~^~~~~~~~~~~~~~~~~ include/asm-generic/topology.h:51:42: note: expanded from macro 'cpumask_of_node' #define cpumask_of_node(node) ((void)node, cpu_online_mask) ^~~~ include/linux/cpumask.h:618:72: note: expanded from macro 'cpumask_first_and' #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p)) ^~~~~ Fixes: f0b848ce6fe9 ("cpumask: Introduce cpumask_of_{node,pcibus} to replace {node,pcibus}_to_cpumask") Fixes: 8abee9566b7e ("hwmon: Add amd_energy driver to report energy counters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20200527134623.930247-1-arnd@arndb.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info()Alexander Potapenko
KMSAN reported uninitialized data being written to disk when dumping core. As a result, several kilobytes of kmalloc memory may be written to the core file and then read by a non-privileged user. Reported-by: sam <sunhaoyl@outlook.com> Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200419100848.63472-1-glider@google.com Link: https://github.com/google/kmsan/issues/76 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()Konstantin Khlebnikov
Replace superfluous VM_BUG_ON() with comment about correct usage. Technically reverts commit 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()"), but context lines have changed. Function isolate_migratepages_block() runs some checks out of lru_lock when choose pages for migration. After checking PageLRU() it checks extra page references by comparing page_count() and page_mapcount(). Between these two checks page could be removed from lru, freed and taken by slab. As a result this race triggers VM_BUG_ON(PageSlab()) in page_mapcount(). Race window is tiny. For certain workload this happens around once a year. page:ffffea0105ca9380 count:1 mapcount:0 mapping:ffff88ff7712c180 index:0x0 compound_mapcount: 0 flags: 0x500000000008100(slab|head) raw: 0500000000008100 dead000000000100 dead000000000200 ffff88ff7712c180 raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(PageSlab(page)) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:628! invalid opcode: 0000 [#1] SMP NOPTI CPU: 77 PID: 504 Comm: kcompactd1 Tainted: G W 4.19.109-27 #1 Hardware name: Yandex T175-N41-Y3N/MY81-EX0-Y3N, BIOS R05 06/20/2019 RIP: 0010:isolate_migratepages_block+0x986/0x9b0 The code in isolate_migratepages_block() was added in commit 119d6d59dcc0 ("mm, compaction: avoid isolating pinned pages") before adding VM_BUG_ON into page_mapcount(). This race has been predicted in 2015 by Vlastimil Babka (see link below). [akpm@linux-foundation.org: comment tweaks, per Hugh] Fixes: 1d148e218a0d ("mm: add VM_BUG_ON_PAGE() to page_mapcount()") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/159032779896.957378.7852761411265662220.stgit@buzz Link: https://lore.kernel.org/lkml/557710E1.6060103@suse.cz/ Link: https://lore.kernel.org/linux-mm/158937872515.474360.5066096871639561424.stgit@buzz/T/ (v1) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm,thp: stop leaking unreleased file pagesHugh Dickins
When collapse_file() calls try_to_release_page(), it has already isolated the page: so if releasing buffers happens to fail (as it sometimes does), remember to putback_lru_page(): otherwise that page is left unreclaimable and unfreeable, and the file extent uncollapsible. Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@surriel.com> Cc: <stable@vger.kernel.org> [5.4+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2005231837500.1766@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28mm/z3fold: silence kmemleak false positives of slotsQian Cai
Kmemleak reported many leaks while under memory pressue in, slots = alloc_slots(pool, gfp); which is referenced by "zhdr" in init_z3fold_page(), zhdr->slots = slots; However, "zhdr" could be gone without freeing slots as the later will be freed separately when the last "handle" off of "handles" array is freed. It will be within "slots" which is always aligned. unreferenced object 0xc000000fdadc1040 (size 104): comm "oom04", pid 140476, jiffies 4295359280 (age 3454.970s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: z3fold_zpool_malloc+0x7b0/0xe10 alloc_slots at mm/z3fold.c:214 (inlined by) init_z3fold_page at mm/z3fold.c:412 (inlined by) z3fold_alloc at mm/z3fold.c:1161 (inlined by) z3fold_zpool_malloc at mm/z3fold.c:1735 zpool_malloc+0x34/0x50 zswap_frontswap_store+0x60c/0xda0 zswap_frontswap_store at mm/zswap.c:1093 __frontswap_store+0x128/0x330 swap_writepage+0x58/0x110 pageout+0x16c/0xa40 shrink_page_list+0x1ac8/0x25c0 shrink_inactive_list+0x270/0x730 shrink_lruvec+0x444/0xf30 shrink_node+0x2a4/0x9c0 do_try_to_free_pages+0x158/0x640 try_to_free_pages+0x1bc/0x5f0 __alloc_pages_slowpath.constprop.60+0x4dc/0x15a0 __alloc_pages_nodemask+0x520/0x650 alloc_pages_vma+0xc0/0x420 handle_mm_fault+0x1174/0x1bf0 Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vitaly Wool <vitaly.wool@konsulko.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: http://lkml.kernel.org/r/20200522220052.2225-1-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-28x86/dma: Fix max PFN arithmetic overflow on 32 bit systemsAlexander Dahl
The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is 4 294 967 296 or 0x100000000 which is no problem on 64 bit systems. The patch does not change the later overall result of 0x100000 for MAX_DMA32_PFN (after it has been shifted by PAGE_SHIFT). The new calculation yields the same result, but does not require 64 bit arithmetic. On 32 bit systems the old calculation suffers from an arithmetic overflow in that intermediate term in braces: 4UL aka unsigned long int is 4 byte wide and an arithmetic overflow happens (the 0x100000000 does not fit in 4 bytes), the in braces result is truncated to zero, the following right shift does not alter that, so MAX_DMA32_PFN evaluates to 0 on 32 bit systems. That wrong value is a problem in a comparision against MAX_DMA32_PFN in the init code for swiotlb in pci_swiotlb_detect_4gb() to decide if swiotlb should be active. That comparison yields the opposite result, when compiling on 32 bit systems. This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too") when that MAX_DMA32_PFN was first made visible to x86_32 (and which landed in v3.0). In practice this wasn't a problem, unless CONFIG_SWIOTLB is active on x86-32. However if one has set CONFIG_IOMMU_INTEL, since c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page is used") there's a dependency on CONFIG_SWIOTLB, which was not necessarily active before. That landed in v5.4, where we noticed it in the fli4l Linux distribution. We have CONFIG_IOMMU_INTEL active on both 32 and 64 bit kernel configs there (I could not find out why, so let's just say historical reasons). The effect is at boot time 64 MiB (default size) were allocated for bounce buffers now, which is a noticeable amount of memory on small systems like pcengines ALIX 2D3 with 256 MiB memory, which are still frequently used as home routers. We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4 (LTS) in fli4l and got that kernel messages for example: Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018 … Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem) … PCI-DMA: Using software bounce buffering for IO (SWIOTLB) software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB) The initial analysis and the suggested fix was done by user 'sourcejedi' at stackoverflow and explicitly marked as GPLv2 for inclusion in the Linux kernel: https://unix.stackexchange.com/a/520525/50007 The new calculation, which does not suffer from that overflow, is the same as for arch/mips now as suggested by Robin Murphy. The fix was tested by fli4l users on round about two dozen different systems, including both 32 and 64 bit archs, bare metal and virtualized machines. [ bp: Massage commit message. ] Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too") Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Alexander Dahl <post@lespocky.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Link: https://unix.stackexchange.com/q/520065/50007 Link: https://web.nettworks.org/bugs/browse/FFL-2560 Link: https://lkml.kernel.org/r/20200526175749.20742-1-post@lespocky.de
2020-05-28Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-05-27 This series contains updates to the ice driver only. Jesse fixes a number of issues, starting with fixing the remaining signed versus unsigned comparison issues. Cleaned up an unused code define. Fixed the implementation of the manage MAC write command, to simplify it by using a simple array to represent the MAC address when writing it. Paul fixes the setting of the VF default LAN address, by removing a check that assumed that the address had been deleted and zeroed. Surabhi prevents a memory leak on filter management initialization failures and during queue initialization and buffer allocation failures. Brett adds additional receive error counters that are reported by ethtool. Fixed the enabling and disabling of VLAN stripping when the PVID has been set. Evan fixes a race condition between the firmware and software, which can occur between the admin queue setup and the first command sent. Marta fixes the driver when XDP transmit rings are destroyed, also make sure the XDP transmit queues are also destroyed. Update the statistics when XDP transmit programs are loaded and packets are sent. Changed the number of XDP transmit queues to match the number of receive queues, instead of matching the number of transmit queues. Bruce avoids undefined behavior by not writing the 8-bit element init_q_state with the associated internal-to-hardware field which is 122-bits. Anirudh (Ani) refactors the receive checksum checks. Krzysztof notifies the user if the fill queue is not long enough to prepare all buffers before packet processing starts and allocates the buffers during the NAPI poll. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28Merge branch 'remove-most-callers-of-kernel_setsockopt-v3'David S. Miller
Christoph Hellwig says: ==================== remove most callers of kernel_setsockopt v3 this series removes most callers of the kernel_setsockopt functions, and instead switches their users to small functions that implement setting a sockopt directly using a normal kernel function call with type safety and all the other benefits of not having a function call. In some cases these functions seem pretty heavy handed as they do a lock_sock even for just setting a single variable, but this mirrors the real setsockopt implementation unlike a few drivers that just set set the fields directly. Changes since v2: - drop the separately merged kernel_getopt_removal - drop the sctp patches, as there is conflicting cleanup going on - add an additional ACK for the rxrpc changes Changes since v1: - use ->getname for sctp sockets in dlm - add a new ->bind_add struct proto method for dlm/sctp - switch the ipv6 and remaining sctp helpers to inline function so that the ipv6 and sctp modules are not pulled in by any module that could potentially use ipv6 or sctp connections - remove arguments to various sock_* helpers that are always used with the same constant arguments ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28tipc: call tsk_set_importance from tipc_topsrv_create_listenerChristoph Hellwig
Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can just use the internal helper. The only change needed is to pass a struct sock instead of tipc_sock, which is private to socket.c Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28rxrpc: add rxrpc_sock_set_min_security_levelChristoph Hellwig
Add a helper to directly set the RXRPC_MIN_SECURITY_LEVEL sockopt from kernel space without going through a fake uaccess. Thanks to David Howells for the documentation updates. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28ipv6: add ip6_sock_set_recvpktinfoChristoph Hellwig
Add a helper to directly set the IPV6_RECVPKTINFO sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28ipv6: add ip6_sock_set_addr_preferencesChristoph Hellwig
Add a helper to directly set the IPV6_ADD_PREFERENCES sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28ipv6: add ip6_sock_set_recverrChristoph Hellwig
Add a helper to directly set the IPV6_RECVERR sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28ipv6: add ip6_sock_set_v6onlyChristoph Hellwig
Add a helper to directly set the IPV6_V6ONLY sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>