2019-05-23thermal: rcar_gen3_thermal: Update calculation formula of IRQTEMPYoshihiro Kaneko
Update the formula to calculate CTEMP: Currently, CTEMP is the average of val1 (calculated by formula 1) and val2 (calculated by formula 2). However, as described in the HWM (chapter 10A.3.1.1 Setting of Normal Mode): if (STEMP < Tj_T), the CTEMP value should be val1; if (STEMP > Tj_T), the CTEMP value should be val2. Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
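A minimal sketch of the selection rule from the commit message above. Formula 1 and formula 2 are not given in the message, so val1 and val2 are passed in already computed. The function names, the integer average, and the handling of STEMP == Tj_T are assumptions made for illustration, not the driver's code.

```python
def select_ctemp(stemp, tj_t, val1, val2):
    """Pick the CTEMP value by comparing STEMP with Tj_T."""
    # Assumption: the boundary case STEMP == Tj_T is grouped with val1;
    # the commit message only covers the two strict inequalities.
    if stemp <= tj_t:
        return val1
    return val2


def average_ctemp(val1, val2):
    """Model of the previous behaviour: average both results (hypothetical rounding)."""
    return (val1 + val2) // 2
```

With hypothetical inputs, `select_ctemp(50, 96, 10, 20)` takes the val1 branch, while `average_ctemp(10, 20)` would have mixed both results regardless of STEMP.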
2019-05-23thermal: rcar_gen3_thermal: Update value of Tj_1Yoshihiro Kaneko
According to the hardware team's evaluation, the temperature calculation formula of M3-W differs from that of all other SoCs, as below: - M3-W: Tj_1: 116 (so Tj_1 - Tj_3 = 157) - Others: Tj_1: 126 (so Tj_1 - Tj_3 = 167) Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-23thermal: tegra: Make tegra210_tsensor_thermtrips staticYueHaibing
Fix sparse warning: drivers/thermal/tegra/tegra210-soctherm.c:211:33: warning: symbol 'tegra210_tsensor_thermtrips' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-23Revert "thermal: rockchip: fix up the tsadc pinctrl setting error"Heiko Stuebner
This reverts commit 28694e009e512451ead5519dd801f9869acb1f60. The commit causes multiple issues in that: - the added call to ->control does potentially run unclocked causing a hang of the machine - the added pinctrl-states are undocumented in the binding - the added pinctrl-states are not backwards compatible, breaking old devicetrees. Fixes: 28694e009e51 ("thermal: rockchip: fix up the tsadc pinctrl setting error") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reported-by: kernelci.org bot <bot@kernelci.org> Reported-by: Enric Balletbo Serra <eballetbo@gmail.com> Reported-by: Vicente Bergas <vicencb@gmail.com> Reported-by: Jack Mitchell <ml@embed.me.uk> Reported-by: Douglas Anderson <dianders@chromium.org> Tested-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-05-24Merge tag 'drm-intel-fixes-2019-05-23' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixesDave Airlie
- Fix boosting of new client to be non-preemptive - Fix to actually bump ready tasks ahead of busywaits - Includes gvt-fixes-2019-05-21 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190523094221.GA26026@jlahtine-desk.ger.corp.intel.com
2019-05-23ext4: do not delete unlinked inode from orphan list on failed truncateJan Kara
It is possible that unlinked inode enters ext4_setattr() (e.g. if somebody calls ftruncate(2) on unlinked but still open file). In such case we should not delete the inode from the orphan list if truncate fails. Note that this is mostly a theoretical concern as filesystem is corrupted if we reach this path anyway but let's be consistent in our orphan handling. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2019-05-23ext4: wait for outstanding dio during truncate in nojournal modeJan Kara
We didn't wait for outstanding direct IO during truncate in nojournal mode (as we skip orphan handling in that case). This can lead to fs corruption or stale data exposure if truncate ends up freeing blocks and these get reallocated before direct IO finishes. Fix the condition determining whether the wait is necessary. CC: stable@vger.kernel.org Fixes: 1c9114f9c0f1 ("ext4: serialize unlocked dio reads with truncate") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-05-23Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-05-23 This series contains updates to ice driver only. Anirudh cleans up white space issues and other code formatting issues in the driver. Also implemented LLDP persistence across reboots and start/stop of the LLDP agent. Updated print statements for driver capabilities to include if it is a device or function capability. Bruce cleaned up variable declarations by removing unneeded assignment. Dave fixes a potential hang due to a couple of flows that recursively acquire the RTNL lock which results in a deadlock. Tony updates the driver to advertise what link modes we are capable of when the user does not request a specific link mode. Usha fixes up the LLDP MIB change event handling by cleaning up workarounds and printing the DCB configuration changes detected. Brett fixes the driver to handle failures in the VF reset path, which was failing to free resources upon an error. Richard fixed the reporting of stats via ethtool to align with our other Intel drivers. Jesse optimizes the transmit buffer and ring structures to have more efficient ordering to get hot cache lines to have packed data. Also optimized the VF structure to use less memory, since it is used hundreds of times throughout the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-24Merge branch 'bpf-explored-states'Daniel Borkmann
Alexei Starovoitov says: ==================== Convert explored_states array into hash table and use simple hash to reduce verifier peak memory consumption for programs with bpf2bpf calls. More details in patch 3. v1->v2: fixed Jakub's small nit in patch 1 ==================== Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-24bpf: convert explored_states to hash tableAlexei Starovoitov
All prune points inside a callee bpf function most likely will have different callsites. For example, if function foo() is called from two callsites, half of the explored states in all prune points in foo() will be useless for subsequent walking of one of those callsites. Fortunately explored_states pruning heuristics keeps the number of states per prune point small, but walking these states is still a waste of cpu time when the callsite of the current state is different from the callsite of the explored state. To improve pruning logic convert explored_states into hash table and use simple insn_idx ^ callsite hash to select hash bucket. This optimization has no effect on programs without bpf2bpf calls and drastically improves programs with calls. In the latter case it reduces total memory consumption in 1M scale tests by almost 3 times (peak_states drops from 5752 to 2016). Care should be taken when comparing the states for equivalency. Since the same hash bucket can now contain states with different indices the insn_idx has to be part of verifier_state and compared. Different hash table sizes and different hash functions were explored, but the results were not significantly better vs this patch. They can be improved in the future. Hit/miss heuristic is not counting index miscompare as a miss. Otherwise verifier stats become unstable when experimenting with different hash functions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
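The bucket selection described above can be sketched roughly as follows. This is an illustrative model only: the class and method names are hypothetical, reducing the hash modulo the table size is an assumption, and the verifier's full state-equivalence check is left out.

```python
class ExploredStates:
    """Toy hash table keyed by insn_idx ^ callsite (hypothetical model)."""

    def __init__(self, size):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, insn_idx, callsite):
        # Assumption: the XOR hash is reduced modulo the table size.
        return self.buckets[(insn_idx ^ callsite) % self.size]

    def add(self, insn_idx, callsite, state):
        # insn_idx is stored with the state so it can be compared on lookup.
        self._bucket(insn_idx, callsite).append((insn_idx, state))

    def candidates(self, insn_idx, callsite):
        # A bucket may hold states recorded at other instruction indices,
        # so entries whose insn_idx differs are skipped.
        return [s for (idx, s) in self._bucket(insn_idx, callsite) if idx == insn_idx]
```

States recorded for the same instruction but different callsites usually land in different buckets, so a lookup walks fewer irrelevant states; states from other instructions that collide into the same bucket are filtered out by the insn_idx comparison.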
2019-05-24bpf: split explored_statesAlexei Starovoitov
Split explored_states into a prune_point boolean mark and a linked list of explored states. This removes the STATE_LIST_MARK hack and allows marks to be separate from states. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-24bpf: cleanup explored_statesAlexei Starovoitov
Clean up explored_states to prepare for the introduction of a hash table. No functional changes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree:
1) Fix crash when dumping rules after conversion to RCU, from Florian Westphal.
2) Fix incorrect hook reinjection from nf_queue in case NF_REPEAT, from Jagdish Motwani.
3) Fix check for route existence in fib extension, from Phil Sutter.
4) Fix use after free in ip_vs_in() hook, from YueHaibing.
5) Check for veth existence from netfilter selftests, from Jeffrin Jose T.
6) Checksum corruption in UDP NAT helpers due to typo, from Florian Westphal.
7) Pass up packets to classic forwarding path regardless of IPv4 DF bit, patch for the flowtable infrastructure from Florian.
8) Set liberal TCP tracking for flows that are placed in the flowtable, in case they need to go back to classic forwarding path, also from Florian.
9) Don't add flow with sequence adjustment to flowtable, from Florian.
10) Skip IPv4 options from IPv6 datapath in flowtable, from Florian.
11) Add selftest for the flowtable infrastructure, from Florian.
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23Merge tag 'xfs-5.2-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Darrick Wong: "Fix an accounting mistake where we included the log space when calculating the reserve space for metadata expansion" * tag 'xfs-5.2-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't reserve per-AG space for an internal log
2019-05-23ice: Silence semantic parser warningsBruce Allan
Recent versions of sparse warn about casting pointers to/from restricted endian types in the Linux driver. Silence those with the compiler attribute __force macro from the Linux kernel to force casts to/from restricted endian types. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Fix couple of issues in ice_vsi_releaseBrett Creeley
Currently the driver is calling ice_napi_del() and then unregister_netdev(). The call to unregister_netdev() will result in a call to ice_stop() and then ice_vsi_close(). This is where we call napi_disable() for all the MSI-X vectors. This flow is reversed so make the changes to ensure napi_disable() happens prior to napi_del(). Before calling napi_del() and free_netdev() make sure unregister_netdev() was called. This is done by making sure the __ICE_DOWN bit is set in the vsi->state for the interested VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Reorganize ice_vf structJesse Brandeburg
The ice_vf struct can be used hundreds of times in our driver so it pays to use less memory per struct.
ice_vf prior to this commit:
/* size: 112, cachelines: 2, members: 25 */
/* sum members: 101, holes: 4, sum holes: 8 */
/* bit holes: 2, sum bit holes: 11 bits */
/* padding: 3 */
/* last cacheline: 48 bytes */
ice_vf after this commit:
/* size: 104, cachelines: 2, members: 25 */
/* sum members: 100, holes: 3, sum holes: 4 */
/* bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
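The effect of member ordering on struct size can be illustrated with a small sketch using Python's ctypes, which applies the platform's native C alignment rules. The two structs below are hypothetical and are not the real ice_vf layout; only the principle (fewer holes after reordering) is the same.

```python
import ctypes


class Before(ctypes.Structure):
    # Small members interleaved with a large one leave alignment holes.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("state", ctypes.c_uint8),
        ("vf_id", ctypes.c_uint16),
    ]


class After(ctypes.Structure):
    # Same members, ordered from largest to smallest alignment.
    _fields_ = [
        ("counter", ctypes.c_uint64),
        ("vf_id", ctypes.c_uint16),
        ("flag", ctypes.c_uint8),
        ("state", ctypes.c_uint8),
    ]


def sum_members(struct):
    """Bytes occupied by the members themselves (pahole's 'sum members')."""
    return sum(ctypes.sizeof(ftype) for _name, ftype in struct._fields_)


def wasted_bytes(struct):
    """Holes plus trailing padding: total size minus member bytes."""
    return ctypes.sizeof(struct) - sum_members(struct)
```

Comparing `ctypes.sizeof(Before)` with `ctypes.sizeof(After)` shows the saving; on x86-64 Linux the sizes are 24 and 16 bytes for the same 12 bytes of members.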
2019-05-23ice: Use bitfields when possibleJesse Brandeburg
We can use bit fields to store boolean values and when the bit fields are next to each other, the compiler will combine them (as long as the size holds enough). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
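As a rough illustration of the bit-field point above, the sketch below compares three one-byte boolean members with three adjacent 1-bit fields, again using ctypes, which follows the platform C compiler's bit-field layout. The struct and member names are hypothetical and are not taken from the ice driver.

```python
import ctypes


class PlainFlags(ctypes.Structure):
    # One full byte per boolean value.
    _fields_ = [
        ("link_up", ctypes.c_uint8),
        ("trusted", ctypes.c_uint8),
        ("spoofchk", ctypes.c_uint8),
    ]


class PackedFlags(ctypes.Structure):
    # Adjacent 1-bit fields of the same type share a storage unit.
    _fields_ = [
        ("link_up", ctypes.c_uint8, 1),
        ("trusted", ctypes.c_uint8, 1),
        ("spoofchk", ctypes.c_uint8, 1),
    ]
```

The packed variant still reads and writes each flag by name, for example `PackedFlags(link_up=1).link_up`, while `ctypes.sizeof` reports a smaller struct.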
2019-05-23ice: Reorganize tx_buf and ring structsJesse Brandeburg
Use more efficient structure ordering by using the pahole tool and a lot of code inspection to get hot cache lines to have packed data (no holes if possible) and adjacent warm data.
ice_ring prior to this change:
/* size: 192, cachelines: 3, members: 23 */
/* sum members: 158, holes: 4, sum holes: 12 */
/* padding: 22 */
ice_ring after this change:
/* size: 192, cachelines: 3, members: 25 */
/* sum members: 162, holes: 1, sum holes: 1 */
/* padding: 29 */
ice_tx_buf prior to this change:
/* size: 48, cachelines: 1, members: 7 */
/* sum members: 38, holes: 2, sum holes: 6 */
/* padding: 4 */
/* last cacheline: 48 bytes */
ice_tx_buf after this change:
/* size: 40, cachelines: 1, members: 7 */
/* sum members: 38, holes: 1, sum holes: 2 */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Format ethtool reported statsRichard Rodriguez
Fixes ethtool -S reported stats in ice driver to match format and nomenclature of the ixgbe driver. Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Gracefully handle reset failure in ice_alloc_vfs()Brett Creeley
Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to free some resources, reset variables, and return an error value. Fix this by adding another unroll case to free the pf->vf array, set the pf->num_alloc_vfs to 0, and return an error code. Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will not be able to do SRIOV without hard rebooting the system because rmmod'ing the driver does not work. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Refactor the LLDP MIB change event handlingUsha Ketineni
This patch fixes the LLDP MIB change event handling code by removing the workarounds in the current code. Added ice_dcb_need_recfg() to print the DCB configuration changes detected via MIB change event. Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Advertise supported link modes if none requestedTony Nguyen
User requested link modes affect what is returned as an advertised link mode. If no modes have been requested, we are not advertising any link modes. Advertise what we are capable of supporting if no link modes have been requested. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Fix hang when ethtool disables FW LLDPDave Ertman
When disabling and enabling VSIs, there are a couple of flows that recursively acquire the RTNL lock which causes a deadlock. Fix that. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Call out dev/func caps when printingAnirudh Venkataramanan
ice_parse_caps is used to parse both device and function capabilities. Currently, capabilities are printed with a cryptic "HW caps" prefix, which makes it difficult to distinguish whether the capabilities being printed are device or function capabilities. This patch makes a change to add a "func cap" prefix when printing function capabilities, and a "dev cap" prefix when printing device capabilities. This patch also changes some of the capability print strings for consistency. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Remove braces for single statement blocksAnirudh Venkataramanan
Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary for single statement blocks" Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Cleanup an unnecessary variable initializationBruce Allan
Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg") added an assignment of vsi making the assignment during declaration unnecessary. Also, cleanup the declaration and assignment of irqmap_info to not use two lines in the variable declaration section. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Implement LLDP persistenceAnirudh Venkataramanan
Implement LLDP persistence across reboots, start and stop of LLDP agent. Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp. Also change the ethtool private flag from "disable-fw-lldp" to "enable-fw-lldp". This change will flip the boolean logic of the functionality of the flag (on = enable, off = disable). The change in name and functionality is to differentiate between the pre-persistence and post-persistence states. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23ice: Fix double spacingAnirudh Venkataramanan
Fix double spacing in ice_napi_disable_all Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23net: qualcomm: rmnet: Move common struct definitions to includeSubash Abhinov Kasiviswanathan
Create if_rmnet.h and move the rmnet MAP packet structs to this common include file. To account for portability, add little and big endian bitfield definitions similar to the ip & tcp headers. The definitions in the headers can now be re-used by the upcoming ipa driver series as well as qmi_wwan. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23cxgb4: offload VLAN flows regardless of VLAN ethtypeRaju Rangoju
VLAN flows never get offloaded unless ivlan_vld is set in filter spec. It's not compulsory for vlan_ethtype to be set. So, always enable the ivlan_vld bit for offloading VLAN flows regardless of whether vlan_ethtype is set. Fixes: ad9af3e09c (cxgb4: add tc flower match support for vlan) Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23Revert "dpaa2-eth: configure the cache stashing amount on a queue"Ioana Radulescu
This reverts commit f8b995853444aba9c16c1ccdccdd397527fde96d. The reverted change instructed the QMan hardware block to fetch RX frame annotation and beginning of frame data to cache before the core would read them. It turns out that in rare cases, it's possible that a QMan stashing transaction is delayed long enough such that, by the time it gets executed, the frame in question had already been dequeued by the core and software processing began on it. If the core manages to unmap the frame buffer _before_ the stashing transaction is executed, an SMMU exception will be raised. Unfortunately there is no easy way to work around this while keeping the performance advantages brought by QMan stashing, so disable it altogether. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23cxgb4: use firmware API for validating filter specRaju Rangoju
Adds support for validating hardware filter spec configured in firmware before offloading exact match flows. Use the new fw api FW_PARAM_DEV_FILTER_MODE_MASK to read the filter mode and mask from firmware. If the api isn't supported, then fall back to the older way of reading just the mode from the indirect register. Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23Merge branch 'net-ll_temac-Fix-and-enable-multicast-support'David S. Miller
Esben Haabendal says: ==================== net: ll_temac: Fix and enable multicast support This patch series makes the necessary fixes to ll_temac driver to make multicast work, and enables support for it. The main change is the change from mutex to spinlock of the lock used to synchronize access to the shared indirect register access. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23net: ll_temac: Enable multicast supportEsben Haabendal
Multicast support has been tested and is working now. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23net: ll_temac: Cleanup multicast filter on changeEsben Haabendal
Avoid leaving old address table entries when using multicast. If more than one multicast address was removed, only the first removed address would actually be cleared. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
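The stale-entry problem can be modelled with a short sketch. The table here is a plain Python list standing in for the hardware address table, and both functions are hypothetical models of the behaviour described above rather than the driver's code. It assumes the number of addresses never exceeds the table size.

```python
def program_filter_stale(table, addresses):
    """Model of the old behaviour: only the first unused slot is cleared."""
    for slot, addr in enumerate(addresses):
        table[slot] = addr
    if len(addresses) < len(table):
        table[len(addresses)] = None
    return table


def program_filter(table, addresses):
    """Model of the fixed behaviour: every unused slot is cleared."""
    for slot in range(len(table)):
        table[slot] = addresses[slot] if slot < len(addresses) else None
    return table
```

Starting from a full table of four entries and shrinking the list to one address, the first function leaves two old entries behind, while the second clears all three unused slots.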
2019-05-23net: ll_temac: Prepare indirect register access for multicast supportEsben Haabendal
With .ndo_set_rx_mode/temac_set_multicast_list() being called in atomic context (holding addr_list_lock), and temac_set_multicast_list() needing to access temac indirect registers, the mutex used to synchronize indirect register access is a no-no. Replace it with a spinlock, and avoid sleeping in temac_indirect_busywait(). To avoid excessive holding of the lock, which is now a spinlock, the temac_device_reset() function is changed to only hold the lock for short periods. Otherwise, with timeouts, it could be holding the spinlock for more than 2 seconds. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23net: ll_temac: Do not make promiscuous mode sticky on multicastEsben Haabendal
When the user has requested IFF_ALLMULTI or has set more than 4 multicast addresses, we should just use promiscuous mode, but not set it in flags, as it causes the interface to stay in promiscuous mode even when the non-IFF_PROMISC condition that caused promiscuous mode to be enabled has gone away. Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
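A minimal sketch of why writing the decision back into the flags makes promiscuous mode sticky. IFF_PROMISC and IFF_ALLMULTI use the Linux flag values; the function names and the MAX_MULTICAST_ADDRS constant (taken from the "more than 4" wording above) are hypothetical.

```python
IFF_PROMISC = 0x100
IFF_ALLMULTI = 0x200
MAX_MULTICAST_ADDRS = 4  # hypothetical name for the limit mentioned above


def wants_promisc(flags, mc_count):
    """Decide the hardware mode without modifying the interface flags."""
    return bool(flags & (IFF_PROMISC | IFF_ALLMULTI)) or mc_count > MAX_MULTICAST_ADDRS


def sticky_update(flags, mc_count):
    """Model of the old behaviour: the decision is stored back into flags."""
    if wants_promisc(flags, mc_count):
        flags |= IFF_PROMISC
    return flags
```

After `sticky_update(0, 5)` the returned flags carry IFF_PROMISC, so a later call with zero multicast addresses still selects promiscuous mode. Calling `wants_promisc` on the unmodified flags re-evaluates the condition every time.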
2019-05-23hsr: fix don't prune the master node from the node_dbAndreas Oetken
Don't prune the master node in the hsr_prune_nodes function. Neither time_in[HSR_PT_SLAVE_A] nor time_in[HSR_PT_SLAVE_B] will ever be updated by hsr_register_frame_in for the master port. Thus, the master node will be repeatedly pruned, leading to repeated packet loss. This bug never appeared because the hsr_prune_nodes function was only called once. Commit 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry") fixed that, unveiling the issue described above. Fixes: 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry") Signed-off-by: Andreas Oetken <andreas.oetken@siemens.com> Tested-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
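The pruning rule can be sketched as below. Nodes are plain dictionaries, and the `is_master` key, the function name, and the timestamp units are assumptions for illustration; the kernel identifies the master node differently and works on its own node list.

```python
def prune_nodes(nodes, now, forget_time):
    """Return the nodes that survive pruning."""
    kept = []
    for node in nodes:
        # The master node's time_in entries are never refreshed, so it must
        # be exempt from the age check or it would be pruned on every pass.
        if node["is_master"]:
            kept.append(node)
            continue
        last_seen = max(node["time_in"])  # most recent frame on either slave port
        if now - last_seen <= forget_time:
            kept.append(node)
    return kept
```

A node whose timestamps were both last updated long ago is dropped, while the master node is kept even though its timestamps never change.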
2019-05-23Merge branch 'nvme-5.2-rc2' of git://git.infradead.org/nvme into for-linusJens Axboe
Pull NVMe changes from Keith.
* 'nvme-5.2-rc2' of git://git.infradead.org/nvme:
  nvme-pci: use blk-mq mapping for unmanaged irqs
  nvme: update MAINTAINERS
  nvme: copy MTFA field from identify controller
  nvme: fix memory leak for power latency tolerance
  nvme: release namespace SRCU protection before performing controller ioctls
  nvme: merge nvme_ns_ioctl into nvme_ioctl
  nvme: remove the ifdef around nvme_nvm_ioctl
  nvme: fix srcu locking on error return in nvme_get_ns_from_disk
  nvme: Fix known effects
  nvme-pci: Sync queues on reset
  nvme-pci: Unblock reset_work on IO failure
  nvme-pci: Don't disable on timeout in reset state
  nvme-pci: Fix controller freeze wait disabling
2019-05-23tools/io_uring: sync with liburingJens Axboe
Various fixes and changes have been applied to liburing since we copied some select bits to the kernel testing/examples part, sync up with liburing to get those changes. Most notable is the change that split the CQE reading into the peek and seen event, instead of being just a single function. Also fixes an unsigned wrap issue in io_uring_submit(), leak of 'fd' in setup if we fail, and various other little issues. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23tools/io_uring: fix Makefile for pthread library linkJens Axboe
Currently fails with:
io_uring-bench.o: In function `main':
/home/axboe/git/linux-block/tools/io_uring/io_uring-bench.c:560: undefined reference to `pthread_create'
/home/axboe/git/linux-block/tools/io_uring/io_uring-bench.c:588: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
Makefile:11: recipe for target 'io_uring-bench' failed
make: *** [io_uring-bench] Error 1
Move -lpthread to the end. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23blk-mq: fix hang caused by freeze/unfreeze sequenceBob Liu
The following is a description of a hang in blk_mq_freeze_queue_wait(). The hang happens on attempt to freeze a queue while another task does queue unfreeze. The root cause is an incorrect sequence of percpu_ref_resurrect() and percpu_ref_kill() and as a result those two can be swapped:

 CPU#0                                CPU#1
 ----------------                     -----------------
 q1 = blk_mq_init_queue(shared_tags)
                                      q2 = blk_mq_init_queue(shared_tags):
                                        blk_mq_add_queue_tag_set(shared_tags):
                                          blk_mq_update_tag_set_depth(shared_tags):
                                            list_for_each_entry()
                                              blk_mq_freeze_queue(q1)
                                               > percpu_ref_kill()
                                               > blk_mq_freeze_queue_wait()
 blk_cleanup_queue(q1)
   blk_mq_freeze_queue(q1)
    > percpu_ref_kill()
      ^^^^^^ freeze_depth can't guarantee the order
                                              blk_mq_unfreeze_queue()
                                               > percpu_ref_resurrect()
    > blk_mq_freeze_queue_wait()
      ^^^^^^ Hang here!!!!

This wrong sequence raises kernel warning:
percpu_ref_kill_and_confirm called more than once on blk_queue_usage_counter_release!
WARNING: CPU: 0 PID: 11854 at lib/percpu-refcount.c:336 percpu_ref_kill_and_confirm+0x99/0xb0

But the most unpleasant effect is a hang of a blk_mq_freeze_queue_wait(), which waits for a zero of a q_usage_counter, which never happens because percpu-ref was reinited (instead of being killed) and stays in PERCPU state forever.

How to reproduce:
- "insmod null_blk.ko shared_tags=1 nr_devices=0 queue_mode=2"
- cpu0: python Script.py 0; taskset the corresponding process running on cpu0
- cpu1: python Script.py 1; taskset the corresponding process running on cpu1

Script.py:
------
#!/usr/bin/python3
import os
import sys

while True:
    on = "echo 1 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    off = "echo 0 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
    os.system(on)
    os.system(off)
------

This bug was first reported and fixed by Roman, previous discussion:
[1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
[2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org
[3] https://patchwork.kernel.org/patch/9268199/
Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23block: remove the bi_seg_{front,back}_size fields in struct bioChristoph Hellwig
At this point these fields aren't used for anything, so we can remove them. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23block: remove the segment size check in bio_will_gapChristoph Hellwig
We fundamentally do not have a maximum segment size for devices with a virt boundary. So don't bother checking it, especially given that the existing checks didn't properly work to start with as we never fully update the front/back segment size and miss the bi_seg_front_size that would have been required for some cases. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23block: force an unlimited segment size on queues with a virt boundaryChristoph Hellwig
We currently fail to update the front/back segment size in the bio when deciding to allow an otherwise gappy segment to a device with a virt boundary. The reason why this did not cause problems is that devices with a virt boundary fundamentally don't use segments as we know them and thus don't care. Make that assumption formal by forcing an unlimited segment size in this case. Fixes: f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23block: don't decrement nr_phys_segments for physically contigous segmentsChristoph Hellwig
Currently ll_merge_requests_fn, unlike all other merge functions, reduces nr_phys_segments by one if the last segment of the previous, and the first segment of the next segment are contiguous. While this seems like a nice solution to avoid building smaller than possible requests it causes a mismatch between the segments actually present in the request and those iterated over by the bvec iterators, including __rq_for_each_bio. This can for example mistrigger the single segment optimization in the nvme-pci driver, and might lead to mismatching nr_phys_segments number when recalculating the number of request when inserting a cloned request. We could possibly work around this by making the bvec iterators take the front and back segment size into account, but that would require moving them from the bio to the bio_iter and spreading this mess over all users of bvecs. Or we could simply remove this optimization under the assumption that most users already build good enough bvecs, and that the bio merge patch never cared about this optimization either. The latter is what this patch does. dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests"). Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23sbitmap: fix improper use of smp_mb__before_atomic()Andrea Parri
This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic_set() primitive. Replace the barrier with an smp_mb(). Fixes: 6c0ca7ae292ad ("sbitmap: fix wakeup hang after sbq resize") Cc: stable@vger.kernel.org Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Omar Sandoval <osandov@fb.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: linux-block@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23bio: fix improper use of smp_mb__before_atomic()Andrea Parri
This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic_set() primitive. Replace the barrier with an smp_mb(). Fixes: dac56212e8127 ("bio: skip atomic inc/dec of ->bi_cnt for most use cases") Cc: stable@vger.kernel.org Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: linux-block@vger.kernel.org Cc: "Paul E. McKenney" <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-23aoe: list new maintainer for aoe driverEd Cashin
Justin Sanders, who has extensive experience with ATA over Ethernet in general and AoE SCSI and block-device drivers in particular, is ready to take on the role of aoe maintainer. The driver needs a more active maintainer. Signed-off-by: Ed Cashin <ed.cashin@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>