Update the formula to calculate CTEMP:
Currently, CTEMP is the average of val1 (calculated by formula 1)
and val2 (calculated by formula 2). But, as described in the HWM
(chapter 10A.3.1.1 Setting of Normal Mode):
If (STEMP < Tj_T) the CTEMP value should be val1.
If (STEMP > Tj_T) the CTEMP value should be val2.
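A minimal sketch of the intended selection (helper and variable names here
are illustrative, not the actual rcar_gen3_thermal code):
------
/* Pick CTEMP from val1/val2 per the HWM description instead of averaging. */
static int thermal_calc_ctemp(int stemp, int tj_t, int val1, int val2)
{
	if (stemp < tj_t)
		return val1;	/* formula 1 result */
	return val2;		/* formula 2 result */
}
------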
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
|
According to an evaluation by the hardware team, the temperature
calculation formula of M3-W differs from all other SoCs as below:
- M3-W: Tj_1: 116 (so Tj_1 - Tj_3 = 157)
- Others: Tj_1: 126 (so Tj_1 - Tj_3 = 167)
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
|
Fix sparse warning:
drivers/thermal/tegra/tegra210-soctherm.c:211:33: warning:
symbol 'tegra210_tsensor_thermtrips' was not declared. Should it be static?
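The usual fix for this class of sparse warning is to give the file-local
table internal linkage; roughly (the element type below is a placeholder,
only the added 'static' matters):
------
static const struct tsensor_thermtrip /* placeholder type */
tegra210_tsensor_thermtrips[] = {
	/* ... per-sensor-group trip temperatures ... */
};
------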
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
|
This reverts commit 28694e009e512451ead5519dd801f9869acb1f60.
The commit causes multiple issues in that:
- the added call to ->control does potentially run unclocked
causing a hang of the machine
- the added pinctrl-states are undocumented in the binding
- the added pinctrl-states are not backwards compatible, breaking
old devicetrees.
Fixes: 28694e009e51 ("thermal: rockchip: fix up the tsadc pinctrl setting error")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reported-by: kernelci.org bot <bot@kernelci.org>
Reported-by: Enric Balletbo Serra <eballetbo@gmail.com>
Reported-by: Vicente Bergas <vicencb@gmail.com>
Reported-by: Jack Mitchell <ml@embed.me.uk>
Reported-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix boosting of new client to be non-preemptive
- Fix to actually bump ready tasks ahead of busywaits
- Includes gvt-fixes-2019-05-21
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190523094221.GA26026@jlahtine-desk.ger.corp.intel.com
|
|
It is possible that an unlinked inode enters ext4_setattr() (e.g. if
somebody calls ftruncate(2) on an unlinked but still open file). In such
a case we should not delete the inode from the orphan list if truncate
fails. Note that this is mostly a theoretical concern as the filesystem
is corrupted if we reach this path anyway, but let's be consistent in
our orphan handling.
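A rough sketch of the idea (not the exact hunk): in the ext4_setattr()
error path after a failed truncate, only drop the inode from the orphan
list if it is still linked.
------
	if (orphan && inode->i_nlink)
		ext4_orphan_del(NULL, inode);
------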
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
We didn't wait for outstanding direct IO during truncate in nojournal
mode (as we skip orphan handling in that case). This can lead to fs
corruption or stale data exposure if truncate ends up freeing blocks
and these get reallocated before direct IO finishes. Fix the condition
determining whether the wait is necessary.
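Conceptually the fix makes the direct IO wait depend on whether the
truncate can free blocks rather than on whether orphan handling was set
up; a hedged sketch (the predicate name is hypothetical, the real check
lives in ext4_setattr()):
------
	if (shrinking_size)		/* hypothetical predicate */
		inode_dio_wait(inode);	/* wait out in-flight DIO first */
------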
CC: stable@vger.kernel.org
Fixes: 1c9114f9c0f1 ("ext4: serialize unlocked dio reads with truncate")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-05-23
This series contains updates to ice driver only.
Anirudh cleans up white space issues and other code formatting issues in the
driver. He also implements LLDP persistence across reboots and start/stop of
the LLDP agent, and updates the print statements for driver capabilities to
indicate whether a capability is a device or function capability.
Bruce cleaned up variable declarations by removing unneeded assignment.
Dave fixes a potential hang due to a couple of flows that recursively
acquire the RTNL lock which results in a deadlock.
Tony updates the driver to advertise what link modes we are capable of
when the user does not request a specific link mode.
Usha fixes up the LLDP MIB change event handling by cleaning up
workarounds and printing the DCB configuration changes detected.
Brett fixes the driver to handle failures in the VF reset path, which
was failing to free resources upon an error.
Richard fixed the reporting of stats via ethtool to align with our other
Intel drivers.
Jesse optimizes the transmit buffer and ring structures to have more
efficient ordering to get hot cache lines to have packed data. He also
optimizes the VF structure to use less memory, since it is used hundreds
of times throughout the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
Convert explored_states array into hash table and use simple hash
to reduce verifier peak memory consumption for programs with bpf2bpf
calls. More details in patch 3.
v1->v2: fixed Jakub's small nit in patch 1
====================
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
All prune points inside a callee bpf function will most likely have
different callsites. For example, if function foo() is called from
two callsites, half of the explored states in all prune points in foo()
will be useless for subsequent walking of one of those callsites.
Fortunately explored_states pruning heuristics keeps the number of states
per prune point small, but walking these states is still a waste of cpu
time when the callsite of the current state is different from the callsite
of the explored state.
To improve the pruning logic, convert explored_states into a hash table and
use a simple insn_idx ^ callsite hash to select the hash bucket.
This optimization has no effect on programs without bpf2bpf calls
and drastically improves programs with calls.
In the latter case it reduces total memory consumption in 1M scale tests
by almost 3 times (peak_states drops from 5752 to 2016).
Care should be taken when comparing the states for equivalency.
Since the same hash bucket can now contain states with different indices
the insn_idx has to be part of verifier_state and compared.
Different hash table sizes and different hash functions were explored,
but the results were not significantly better vs this patch.
They can be improved in the future.
Hit/miss heuristic is not counting index miscompare as a miss.
Otherwise verifier stats become unstable when experimenting
with different hash functions.
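A hedged sketch of the bucket selection described above (the function and
parameter names are illustrative, not necessarily the verifier's):
------
/* States from different callsites of the same callee land in different
 * buckets, so they are no longer compared against each other needlessly.
 */
static u32 explored_state_bucket(u32 insn_idx, u32 callsite, u32 table_size)
{
	return (insn_idx ^ callsite) % table_size;
}
------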
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Split explored_states into a prune_point boolean mark
and a linked list of explored states.
This removes the STATE_LIST_MARK hack and allows marks to be separate from states.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Clean up explored_states to prepare for the introduction of a hash table.
No functional changes.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree:
1) Fix crash when dumping rules after conversion to RCU,
from Florian Westphal.
2) Fix incorrect hook reinjection from nf_queue in case NF_REPEAT,
from Jagdish Motwani.
3) Fix check for route existence in fib extension, from Phil Sutter.
4) Fix use after free in ip_vs_in() hook, from YueHaibing.
5) Check for veth existence from netfilter selftests,
from Jeffrin Jose T.
6) Checksum corruption in UDP NAT helpers due to typo,
from Florian Westphal.
7) Pass up packets to classic forwarding path regardless of
IPv4 DF bit, patch for the flowtable infrastructure from Florian.
8) Set liberal TCP tracking for flows that are placed in the
flowtable, in case they need to go back to classic forwarding path,
also from Florian.
9) Don't add flow with sequence adjustment to flowtable, from Florian.
10) Skip IPv4 options from IPv6 datapath in flowtable, from Florian.
11) Add selftest for the flowtable infrastructure, from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull xfs fix from Darrick Wong:
"Fix an accounting mistake where we included the log space when
calculating the reserve space for metadata expansion"
* tag 'xfs-5.2-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't reserve per-AG space for an internal log
|
|
Recent versions of sparse warn about casting pointers to/from restricted
endian types in the Linux driver. Silence those with the compiler
attribute __force macro from the Linux kernel to force casts to/from
restricted endian types.
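A small example of the pattern (illustrative helper, not a specific hunk
from the ice driver):
------
#include <linux/types.h>

/* Sparse flags a plain cast between CPU-order and bitwise (__le16) types;
 * __force marks the conversion as intentional and silences the warning.
 */
static inline __le16 *example_le16_buf(u16 *buf)
{
	return (__force __le16 *)buf;
}
------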
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently the driver is calling ice_napi_del() and then
unregister_netdev(). The call to unregister_netdev() will result in a
call to ice_stop() and then ice_vsi_close(). This is where we call
napi_disable() for all the MSI-X vectors. This flow is reversed, so make
the changes needed to ensure napi_disable() happens prior to napi_del().
Before calling napi_del() and free_netdev() make sure
unregister_netdev() was called. This is done by making sure the
__ICE_DOWN bit is set in the vsi->state for the interested VSI.
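A sketch of the resulting teardown order (not the exact driver code;
ice_napi_del() is the driver helper named above):
------
	unregister_netdev(vsi->netdev);	/* ice_stop() -> napi_disable() */

	/* only delete NAPI and free the netdev once the VSI is down */
	if (test_bit(__ICE_DOWN, vsi->state)) {
		ice_napi_del(vsi);
		free_netdev(vsi->netdev);
		vsi->netdev = NULL;
	}
------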
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The ice_vf struct can be used hundreds of times in our
driver so it pays to use less memory per struct.
ice_vf prior to this commit:
/* size: 112, cachelines: 2, members: 25 */
/* sum members: 101, holes: 4, sum holes: 8 */
/* bit holes: 2, sum bit holes: 11 bits */
/* padding: 3 */
/* last cacheline: 48 bytes */
ice_vf after this commit:
/* size: 104, cachelines: 2, members: 25 */
/* sum members: 100, holes: 3, sum holes: 4 */
/* bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
We can use bit fields to store boolean values, and when the
bit fields are next to each other, the compiler will combine them
(as long as they fit within the underlying type's size).
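For example (illustrative struct, not the actual ice_vf layout):
------
/* Separate bools typically cost a byte each, plus padding. */
struct vf_flags_bools {
	bool trusted;
	bool link_forced;
	bool link_up;
	bool spoofchk;
};				/* 4 bytes of members */

/* Adjacent bit fields are combined into a single byte. */
struct vf_flags_bits {
	u8 trusted:1;
	u8 link_forced:1;
	u8 link_up:1;
	u8 spoofchk:1;
};				/* all four flags fit in one byte */
------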
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Use more efficient structure ordering by using the pahole tool
and a lot of code inspection to get hot cache lines to have
packed data (no holes if possible) and adjacent warm data.
ice_ring prior to this change:
/* size: 192, cachelines: 3, members: 23 */
/* sum members: 158, holes: 4, sum holes: 12 */
/* padding: 22 */
ice_ring after this change:
/* size: 192, cachelines: 3, members: 25 */
/* sum members: 162, holes: 1, sum holes: 1 */
/* padding: 29 */
ice_tx_buf prior to this change:
/* size: 48, cachelines: 1, members: 7 */
/* sum members: 38, holes: 2, sum holes: 6 */
/* padding: 4 */
/* last cacheline: 48 bytes */
ice_tx_buf after this change:
/* size: 40, cachelines: 1, members: 7 */
/* sum members: 38, holes: 1, sum holes: 2 */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Fix the stats reported by ethtool -S in the ice driver to match
the format and nomenclature of the ixgbe driver.
Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to
free some resources, reset variables, and return an error value.
Fix this by adding another unroll case to free the pf->vf array, set
the pf->num_alloc_vfs to 0, and return an error code.
Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will
not be able to do SRIOV without hard rebooting the system because
rmmod'ing the driver does not work.
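The unroll follows the usual goto-label pattern; roughly (labels, error
code, and the devm_kfree() call site are illustrative, not the exact ice
code):
------
	if (!ice_reset_all_vfs(pf, true)) {
		ret = -EIO;			/* hypothetical error code */
		goto err_unroll_sriov;
	}
	return 0;

err_unroll_sriov:
	devm_kfree(&pf->pdev->dev, pf->vf);	/* free the pf->vf array */
	pf->vf = NULL;
	pf->num_alloc_vfs = 0;
	return ret;
------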
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch fixes the LLDP MIB change event handling code by removing
the workarounds in the current code. It adds ice_dcb_need_recfg() to
print the DCB configuration changes detected via MIB change events.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
User requested link modes affect what is returned as an advertised
link mode. If no modes have been requested, we are not advertising
any link modes. Advertise what we are capable of supporting if no
link modes have been requested.
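A hedged sketch of the idea using the generic linkmode helpers (the real
driver logic is more involved):
------
	/* Nothing requested by the user: advertise everything we support. */
	if (linkmode_empty(ks->link_modes.advertising))
		linkmode_copy(ks->link_modes.advertising,
			      ks->link_modes.supported);
------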
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When disabling and enabling VSIs, there are a couple of flows
that recursively acquire the RTNL lock which causes a deadlock.
Fix that.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
ice_parse_caps is used to parse both device and function capabilities.
Currently, capabilities are printed with a cryptic "HW caps" prefix,
which makes it difficult to distinguish whether the capabilities being
printed are device or function capabilities.
This patch makes a change to add a "func cap" prefix when printing
function capabilities, and a "dev cap" prefix when printing device
capabilities.
This patch also changes some of the capability print strings for
consistency.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary
for single statement blocks"
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg")
added an assignment of vsi making the assignment during declaration
unnecessary.
Also, clean up the declaration and assignment of irqmap_info so it does not
use two lines in the variable declaration section.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Implement LLDP persistence across reboots and across start/stop of the LLDP agent.
Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp.
Also change the ethtool private flag from "disable-fw-lldp" to
"enable-fw-lldp". This change will flip the boolean logic of the
functionality of the flag (on = enable, off = disable). The change
in name and functionality is to differentiate between the
pre-persistence and post-persistence states.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Fix double spacing in ice_napi_disable_all
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Create if_rmnet.h and move the rmnet MAP packet structs to this
common include file. For portability, add little and big endian
bitfield definitions similar to the ip & tcp headers.
The definitions in the headers can now be re-used by the
upcoming ipa driver series as well as qmi_wwan.
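The pattern mirrors what the ip and tcp headers do; an illustrative struct
(field names here are examples, the real MAP header lives in if_rmnet.h):
------
#include <asm/byteorder.h>
#include <linux/types.h>

struct example_map_header {
#if defined(__LITTLE_ENDIAN_BITFIELD)
	u8  pad_len:6;
	u8  reserved_bit:1;
	u8  cd_bit:1;
#elif defined(__BIG_ENDIAN_BITFIELD)
	u8  cd_bit:1;
	u8  reserved_bit:1;
	u8  pad_len:6;
#else
#error "Please fix <asm/byteorder.h>"
#endif
	u8     mux_id;
	__be16 pkt_len;
};
------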
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
VLAN flows never get offloaded unless ivlan_vld is set in the filter spec,
and it is not compulsory for vlan_ethtype to be set.
So, always enable the ivlan_vld bit for offloading VLAN flows regardless of
whether vlan_ethtype is set or not.
Fixes: ad9af3e09c (cxgb4: add tc flower match support for vlan)
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit f8b995853444aba9c16c1ccdccdd397527fde96d.
The reverted change instructed the QMan hardware block to fetch
RX frame annotation and beginning of frame data to cache before
the core would read them.
It turns out that in rare cases, it's possible that a QMan
stashing transaction is delayed long enough such that, by the time
it gets executed, the frame in question had already been dequeued
by the core and software processing began on it. If the core
manages to unmap the frame buffer _before_ the stashing transaction
is executed, an SMMU exception will be raised.
Unfortunately there is no easy way to work around this while keeping
the performance advantages brought by QMan stashing, so disable
it altogether.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for validating the hardware filter spec configured in firmware
before offloading exact match flows.
Use the new fw api FW_PARAM_DEV_FILTER_MODE_MASK to read the filter mode
and mask from firmware. If the api isn't supported, then fall back to the
older way of reading just the mode from the indirect register.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Esben Haabendal says:
====================
net: ll_temac: Fix and enable multicast support
This patch series makes the necessary fixes to the ll_temac driver so that
multicast works, and then enables support for it.
The main change is the change from mutex to spinlock of the lock used to
synchronize access to the shared indirect register access.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Multicast support has been tested and is working now.
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Avoid leaving old address table entries when using multicast. If more than
one multicast address were removed, only the first removed address would
actually be cleared.
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With .ndo_set_rx_mode/temac_set_multicast_list() being called in atomic
context (holding addr_list_lock), and temac_set_multicast_list() needing
to access temac indirect registers, the mutex used to synchronize indirect
register access is a no-no.
Replace it with a spinlock, and avoid sleeping in
temac_indirect_busywait().
To avoid excessive holding of the lock, which is now a spinlock, the
temac_device_reset() function is changed to only hold the lock for short
periods; otherwise, with timeouts, it could be holding the spinlock for
more than 2 seconds.
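A sketch of the resulting locking (the lock field and register names are
taken as examples; the exact helpers differ in the driver):
------
	unsigned long flags;

	/* ndo_set_rx_mode runs in atomic context, so use a spinlock with
	 * IRQs saved instead of a mutex around indirect register access.
	 */
	spin_lock_irqsave(&lp->indirect_lock, flags);
	temac_indirect_out32(lp, XTE_AFM_OFFSET, val);
	spin_unlock_irqrestore(&lp->indirect_lock, flags);
------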
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the user has requested IFF_ALLMULTI or has set more than 4 multicast
addresses, we should just use promiscuous mode, but not set it in flags,
as it causes the interface to stay in promiscuous mode even when the
non-IFF_PROMISC condition that caused promiscuous mode to be enabled
has gone away.
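A sketch of the intent (temac_set_promiscuous() is a hypothetical helper;
the point is that ndev->flags is left untouched):
------
	bool promisc = (ndev->flags & (IFF_PROMISC | IFF_ALLMULTI)) ||
		       netdev_mc_count(ndev) > 4;

	/* program the hardware filter only; do not set IFF_PROMISC in
	 * ndev->flags, so the mode is dropped again once the condition
	 * goes away.
	 */
	temac_set_promiscuous(lp, promisc);
------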
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Don't prune the master node in the hsr_prune_nodes function.
Neither time_in[HSR_PT_SLAVE_A] nor time_in[HSR_PT_SLAVE_B]
will ever be updated by hsr_register_frame_in for the master port.
Thus, the master node will be repeatedly pruned leading to
repeated packet loss.
This bug never appeared before because the hsr_prune_nodes function
was only called once. Since commit 5150b45fd355
("net: hsr: Fix node prune function for forget time expiry") fixed that,
hsr_prune_nodes is now called as intended, unveiling the issue described above.
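A sketch of the kind of check this needs in hsr_prune_nodes()
(hsr_addr_is_self() is assumed from the existing hsr code):
------
	list_for_each_entry_rcu(node, &hsr->node_db, mac_list) {
		/* Never prune the entry describing ourselves: its
		 * time_in[] slots are never refreshed by
		 * hsr_register_frame_in().
		 */
		if (hsr_addr_is_self(hsr, node->macaddress_A))
			continue;
		/* ... normal forget-time pruning ... */
	}
------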
Fixes: 5150b45fd355 ("net: hsr: Fix node prune function for forget time expiry")
Signed-off-by: Andreas Oetken <andreas.oetken@siemens.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull NVMe changes from Keith.
* 'nvme-5.2-rc2' of git://git.infradead.org/nvme:
nvme-pci: use blk-mq mapping for unmanaged irqs
nvme: update MAINTAINERS
nvme: copy MTFA field from identify controller
nvme: fix memory leak for power latency tolerance
nvme: release namespace SRCU protection before performing controller ioctls
nvme: merge nvme_ns_ioctl into nvme_ioctl
nvme: remove the ifdef around nvme_nvm_ioctl
nvme: fix srcu locking on error return in nvme_get_ns_from_disk
nvme: Fix known effects
nvme-pci: Sync queues on reset
nvme-pci: Unblock reset_work on IO failure
nvme-pci: Don't disable on timeout in reset state
nvme-pci: Fix controller freeze wait disabling
|
|
Various fixes and changes have been applied to liburing since we
copied some select bits to the kernel testing/examples part, sync
up with liburing to get those changes.
Most notable is the change that split the CQE reading into the peek
and seen event, instead of being just a single function. Also fixes
an unsigned wrap issue in io_uring_submit(), leak of 'fd' in setup
if we fail, and various other little issues.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently fails with:
io_uring-bench.o: In function `main':
/home/axboe/git/linux-block/tools/io_uring/io_uring-bench.c:560: undefined reference to `pthread_create'
/home/axboe/git/linux-block/tools/io_uring/io_uring-bench.c:588: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
Makefile:11: recipe for target 'io_uring-bench' failed
make: *** [io_uring-bench] Error 1
Move -lpthread to the end.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The following is a description of a hang in blk_mq_freeze_queue_wait().
The hang happens on an attempt to freeze a queue while another task does
a queue unfreeze.
The root cause is an incorrect sequence of percpu_ref_resurrect() and
percpu_ref_kill(); as a result those two can be swapped:
CPU#0                                  CPU#1
----------------                       -----------------
q1 = blk_mq_init_queue(shared_tags)

                                       q2 = blk_mq_init_queue(shared_tags):
                                         blk_mq_add_queue_tag_set(shared_tags):
                                           blk_mq_update_tag_set_depth(shared_tags):
                                             list_for_each_entry()
                                               blk_mq_freeze_queue(q1)
                                                > percpu_ref_kill()
                                                > blk_mq_freeze_queue_wait()

blk_cleanup_queue(q1)
 blk_mq_freeze_queue(q1)
  > percpu_ref_kill()
     ^^^^^^ freeze_depth can't guarantee the order
                                               blk_mq_unfreeze_queue()
                                                > percpu_ref_resurrect()

 > blk_mq_freeze_queue_wait()
     ^^^^^^ Hang here!!!!
This wrong sequence raises a kernel warning:
percpu_ref_kill_and_confirm called more than once on blk_queue_usage_counter_release!
WARNING: CPU: 0 PID: 11854 at lib/percpu-refcount.c:336 percpu_ref_kill_and_confirm+0x99/0xb0
But the most unpleasant effect is a hang of blk_mq_freeze_queue_wait(),
which waits for q_usage_counter to reach zero, which never happens
because the percpu-ref was reinited (instead of being killed) and stays in
PERCPU state forever.
How to reproduce:
- "insmod null_blk.ko shared_tags=1 nr_devices=0 queue_mode=2"
- cpu0: python Script.py 0; taskset the corresponding process running on cpu0
- cpu1: python Script.py 1; taskset the corresponding process running on cpu1
Script.py:
------
#!/usr/bin/python3
import os
import sys
while True:
on = "echo 1 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
off = "echo 0 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
os.system(on)
os.system(off)
------
This bug was first reported and fixed by Roman, previous discussion:
[1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
[2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org
[3] https://patchwork.kernel.org/patch/9268199/
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
At this point these fields aren't used for anything, so we can remove
them.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We fundamentally do not have a maximum segment size for devices with a
virt boundary. So don't bother checking it, especially given that the
existing checks didn't properly work to start with as we never fully
update the front/back segment size and miss the bi_seg_front_size that
would have been required for some cases.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We currently fail to update the front/back segment size in the bio when
deciding to allow an otherwise gappy segment to a device with a
virt boundary. The reason why this did not cause problems is that
devices with a virt boundary fundamentally don't use segments as we
know them and thus don't care. Make that assumption formal by forcing
an unlimited segment size in this case.
Fixes: f6970f83ef79 ("block: don't check if adjacent bvecs in one bio can be mergeable")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently ll_merge_requests_fn, unlike all other merge functions,
reduces nr_phys_segments by one if the last segment of the previous
request and the first segment of the next request are contiguous. While
this seems like a nice solution to avoid building smaller than possible
requests, it causes a mismatch between the segments actually present
in the request and those iterated over by the bvec iterators, including
__rq_for_each_bio. This can for example mistrigger the single segment
optimization in the nvme-pci driver, and might lead to a mismatching
nr_phys_segments count when recalculating the number of segments
for a cloned request.
We could possibly work around this by making the bvec iterators take
the front and back segment size into account, but that would require
moving them from the bio to the bio_iter and spreading this mess
over all users of bvecs. Or we could simply remove this optimization
under the assumption that most users already build good enough bvecs,
and that the bio merge path never cared about this optimization
either. The latter is what this patch does.
(The single segment optimization mentioned above was added in commit
dff824b2aadb ("nvme-pci: optimize mapping of small single segment requests").)
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The smp_mb__before_atomic() barrier only applies to read-modify-write
atomic operations; in particular, it does not apply to the atomic_set()
primitive.
Replace the barrier with an smp_mb().
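A minimal illustration of why the full barrier is needed here:
------
#include <linux/atomic.h>

/* atomic_set() is a plain store, not a read-modify-write, so
 * smp_mb__before_atomic() does not order earlier stores against it;
 * a full smp_mb() does.
 */
static void publish_then_set(int *data, atomic_t *cnt, int val)
{
	*data = val;		/* store that must be visible first */
	smp_mb();		/* full barrier, not smp_mb__before_atomic() */
	atomic_set(cnt, val);
}
------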
Fixes: 6c0ca7ae292ad ("sbitmap: fix wakeup hang after sbq resize")
Cc: stable@vger.kernel.org
Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The smp_mb__before_atomic() barrier only applies to read-modify-write
atomic operations; in particular, it does not apply to the atomic_set()
primitive.
Replace the barrier with an smp_mb().
Fixes: dac56212e8127 ("bio: skip atomic inc/dec of ->bi_cnt for most use cases")
Cc: stable@vger.kernel.org
Reported-by: "Paul E. McKenney" <paulmck@linux.ibm.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Justin Sanders, who has extensive experience with ATA over Ethernet
in general and AoE SCSI and block-device drivers in particular, is
ready to take on the role of aoe maintainer. The driver needs a more
active maintainer.
Signed-off-by: Ed Cashin <ed.cashin@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|