summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-09-11Merge tag 'mlx5-updates-2019-09-10' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-09-10 Misc build warnings cleanup for mlx5: 1) Reduce stack usage in FW trace 2) Fix addr's type in mlx5dr_icm_dm 3) Fix rt's type in dr_action_create_reformat_action ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11Merge branch 'stmmac-next'David S. Miller
Jose Abreu says: ==================== net: stmmac: Improvements for -next Misc patches for -next. It includes: - Two fixes for features in -next only - New features support for GMAC cores (which includes GMAC4 and GMAC5) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: ARP Offload for GMAC4+ CoresJose Abreu
Implement the ARP Offload feature in GMAC4 and GMAC5 cores. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: Add support for VLAN Insertion Offload in GMAC4+Jose Abreu
Adds support for TX VLAN Offload using descriptors based features available in GMAC4/5. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: Add support for SA Insertion/Replacement in GMAC4+Jose Abreu
Add the support for Source Address Insertion and Replacement in GMAC4 and GMAC5 cores. Two methods are supported: Descriptor based and register based. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: xgmac: Reinitialize correctly a variableJose Abreu
'value' was being or'ed with a value from another register. This is a typo and could cause new written value to be wrong. Fix it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: Add VLAN HASH filtering support in GMAC4+Jose Abreu
Adds the support for VLAN HASH Filtering in GMAC4/5 cores. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: stmmac: Prevent divide-by-zeroJose Abreu
When RX Coalesce settings are set to all zero (which is a valid setting) we will currently get a divide-by-zero error. Fix it. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: sonic: replace dev_kfree_skb in sonic_send_packetMao Wenan
sonic_send_packet will be processed in irq or non-irq context, so it would better use dev_kfree_skb_any instead of dev_kfree_skb. Fixes: d9fb9f384292 ("*sonic/natsemi/ns83829: Move the National Semi-conductor drivers") Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11wimax: i2400: fix memory leakNavid Emamdoost
In i2400m_op_rfkill_sw_toggle cmd buffer should be released along with skb response. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11Merge branch 'hns3-next'David S. Miller
Huazhong Tan says: ==================== net: hns3: add a feature & bugfixes & cleanups This patch-set includes a VF feature, bugfixes and cleanups for the HNS3 ethernet controller driver. [patch 01/07] adds ethtool_ops.set_channels support for HNS3 VF driver [patch 02/07] adds a recovery for setting channel fail. [patch 03/07] fixes an error related to shaper parameter algorithm. [patch 04/07] fixes an error related to ksetting. [patch 05/07] adds cleanups for some log pinting. [patch 06/07] adds a NULL pointer check before function calling. [patch 07/07] adds some debugging information for reset issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: add some DFX info for reset issueHuazhong Tan
This patch adds more information for reset DFX. Also, adds some cleanups to reset info, move reset_fail_cnt into struct hclge_rst_stats, and modifies some print formats. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: check NULL pointer before useGuangbin Huang
This patch checks ops->set_default_reset_request whether is NULL before using it in function hns3_slot_reset. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: modify some logs formatGuangbin Huang
The pfc_en and pfc_map need to be displayed in hexadecimal notation, printing dma address should use %pad, and the end of printed string needs to be add "\n". This patch modifies them. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: fix port setting handle for fibre portGuangbin Huang
For hardware doesn't support use specified speed and duplex to negotiate, it's unnecessary to check and modify the port speed and duplex for fibre port when autoneg is on. Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: fix shaper parameter algorithmYonglong Liu
Currently when hns3 driver configures the tm shaper to limit bandwidth below 20Mbit using the parameters calculated by hclge_shaper_para_calc(), the actual bandwidth limited by tm hardware module is not accurate enough, for example, 1.28 Mbit when the user is configuring 1 Mbit. This patch adjusts the ir_calc to be closer to ir, and always calculate the ir_b parameter when user is configuring a small bandwidth. Also, removes an unnecessary parenthesis when calculating denominator. Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: revert to old channel when setting new channel num failPeng Li
After setting new channel num, it needs free old ring memory and allocate new ring memory. If there is no enough memory and allocate new ring memory fail, the ring may initialize fail. To make sure the network interface can work normally, driver should revert the channel to the old configuration. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11net: hns3: add ethtool_ops.set_channels support for HNS3 VF driverGuangbin Huang
This patch adds ethtool_ops.set_channels support for HNS3 VF driver, and updates related TQP information and RSS information, to support modification of VF TQP number, and uses current rss_size instead of max_rss_size to initialize RSS. Also, fixes a format error in hclgevf_get_rss(). Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11mac80211_hwsim: Register support for HE meshpointSven Eckelmann
Some features of 802.11ax without central organizing (AP) STA can also be used in mesh mode. hwsim can be used to assist initial development of these features without having access to HW. Signed-off-by: Sven Eckelmann <seckelmann@datto.com> Link: https://lore.kernel.org/r/20190813063657.7544-1-sven@narfation.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11nl80211: Fix possible Spectre-v1 for CQM RSSI thresholdsMasashi Honma
commit 1222a1601488 ("nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds") was incomplete and requires one more fix to prevent accessing to rssi_thresholds[n] because user can control rssi_thresholds[i] values to make i reach to n. For example, rssi_thresholds = {-400, -300, -200, -100} when last is -34. Cc: stable@vger.kernel.org Fixes: 1222a1601488 ("nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Masashi Honma <masashi.honma@gmail.com> Link: https://lore.kernel.org/r/20190908005653.17433-1-masashi.honma@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: allow drivers to set max MTUWen Gong
Make it possibly for drivers to adjust the default max_mtu by storing it in the hardware struct and using that value for all interfaces. Signed-off-by: Wen Gong <wgong@codeaurora.org> Link: https://lore.kernel.org/r/1567738137-31748-1-git-send-email-wgong@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11cfg80211: Do not compare with boolean in nl80211_common_reg_change_eventzhong jiang
With the help of boolinit.cocci, we use !nl80211_reg_change_event_fill instead of (nl80211_reg_change_event_fill == false). Meanwhile, Clean up the code. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Link: https://lore.kernel.org/r/1567657537-65472-1-git-send-email-zhongjiang@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: IBSS: send deauth when expiring inactive STAsJohannes Berg
When we expire an inactive station, try to send it a deauth. This helps if it's actually still around, and just has issues with beacon distribution (or we do), and it will not also remove us. Then, if we have shared state, this may not be reset properly, causing problems; for example, we saw a case where aggregation sessions weren't removed properly (due to the TX start being offloaded to firmware and it relying on deauth for stop), causing a lot of traffic to get lost due to the SN reset after remove/add of the peer. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-9-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: don't check if key is NULL in ieee80211_key_link()Luca Coelho
We already assume that key is not NULL and dereference it in a few other places before we check whether it is NULL, so the check is unnecessary. Remove it. Fixes: 96fc6efb9ad9 ("mac80211: IEEE 802.11 Extended Key ID support") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-8-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: clear crypto tx tailroom counter upon keys enableLior Cohen
In case we got a fw restart while roaming from encrypted AP to non-encrypted one, we might end up with hitting a warning on the pending counter crypto_tx_tailroom_pending_dec having a non-zero value. The following comment taken from net/mac80211/key.c explains the rational for the delayed tailroom needed: /* * The reason for the delayed tailroom needed decrementing is to * make roaming faster: during roaming, all keys are first deleted * and then new keys are installed. The first new key causes the * crypto_tx_tailroom_needed_cnt to go from 0 to 1, which invokes * the cost of synchronize_net() (which can be slow). Avoid this * by deferring the crypto_tx_tailroom_needed_cnt decrementing on * key removal for a while, so if we roam the value is larger than * zero and no 0->1 transition happens. * * The cost is that if the AP switching was from an AP with keys * to one without, we still allocate tailroom while it would no * longer be needed. However, in the typical (fast) roaming case * within an ESS this usually won't happen. */ The next flow lead to the warning eventually reported as a bug: 1. Disconnect from encrypted AP 2. Set crypto_tx_tailroom_pending_dec = 1 for the key 3. Schedule work 4. Reconnect to non-encrypted AP 5. Add a new key, setting the tailroom counter = 1 6. Got FW restart while pending counter is set ---> hit the warning While on it, the ieee80211_reset_crypto_tx_tailroom() func was merged into its single caller ieee80211_reenable_keys (previously called ieee80211_enable_keys). Also, we reset the crypto_tx_tailroom_pending_dec and remove the counters warning as we just reset both. Signed-off-by: Lior Cohen <lior2.cohen@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-7-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: remove unnecessary key conditionJohannes Berg
When we reach this point, the key cannot be NULL. Remove the condition that suggests otherwise. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-6-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: list features in WEP/TKIP disable in better orderJohannes Berg
"HE/HT/VHT" is a bit confusing since really the order of development (and possible support) is different - change this to "HT/VHT/HE". Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-4-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11cfg80211: always shut down on HW rfkillJohannes Berg
When the RFKILL subsystem isn't available, then rfkill_blocked() always returns false. In the case of hardware rfkill this will be wrong though, as if the hardware reported being killed then it cannot operate any longer. Since we only ever call the rfkill_sync work in this case, just rename it to rfkill_block and always pass "true" for the blocked parameter, rather than passing rfkill_blocked(). We rely on the underlying driver to still reject any new attempt to bring up the device by itself. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830112451.21655-2-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11mac80211: vht: add support VHT EXT NSS BW in parsing VHTMordechay Goodstein
This fixes was missed in parsing the vht capabilities max bw support. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Fixes: e80d642552a3 ("mac80211: copy VHT EXT NSS BW Support/Capable data to station") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/20190830114057.22197-1-luca@coelho.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11cfg80211: fix boundary value in ieee80211_frequency_to_channel()Arend van Spriel
The boundary value used for the 6G band was incorrect as it would result in invalid 6G channel number for certain frequencies. Reported-by: Amar Singhal <asinghal@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://lore.kernel.org/r/1567510772-24263-1-git-send-email-arend.vanspriel@broadcom.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11gpio: Initialize the irqchip valid_mask with a callbackLinus Walleij
After changing the valid_mask for the struct gpio_chip to detect the need and presence of a valid mask with the presence of a .init_valid_mask() callback to fill it in, we augment the gpio_irq_chip to use the same logic. Switch all driver using the gpio_irq_chio valid_mask over to this new method. This makes sure the valid_mask for the gpio_irq_chip gets filled in when we add the gpio_chip, which makes it a little easier to switch over drivers using the old way of setting up gpio_irq_chip over to the new method of passing the gpio_irq_chip along with the gpio_chip. (See drivers/gpio/TODO for details.) Cc: Joel Stanley <joel@jms.id.au> Cc: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Patrice Chotard <patrice.chotard@st.com> Link: https://lore.kernel.org/r/20190904140104.32426-1-linus.walleij@linaro.org
2019-09-10misc: mic: Use PTR_ERR_OR_ZERO rather than its implementationzhong jiang
PTR_ERR_OR_ZERO contains if(IS_ERR(...)) + PTR_ERR. It is better to use it directly. hence just replace it. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Link: https://lore.kernel.org/r/1567665795-5901-3-git-send-email-zhongjiang@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-10netfilter: nft_{fwd,dup}_netdev: add offload supportPablo Neira Ayuso
This patch adds support for packet mirroring and redirection. The nft_fwd_dup_netdev_offload() function configures the flow_action object for the fwd and the dup actions. Extend nft_flow_rule_destroy() to release the net_device object when the flow_rule object is released, since nft_fwd_dup_netdev_offload() bumps the net_device reference counter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: wenxu <wenxu@ucloud.cn>
2019-09-10net/mlx5: FWTrace, Reduce stack usageSaeed Mahameed
Mark mlx5_tracer_print_trace as noinline as the function only uses 512 bytes on the stack to avoid the following build warning: drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:660:13: error: stack frame size of 1032 bytes in function 'mlx5_fw_tracer_handle_traces' [-Werror,-Wframe-larger-than=] Fixes: 70dd6fdb8987 ("net/mlx5: FW tracer, parse traces and kernel tracing support") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-10net/mlx5: Fix addr's type in mlx5dr_icm_dmNathan Chancellor
clang errors when CONFIG_PHYS_ADDR_T_64BIT is not set: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c:121:8: error: incompatible pointer types passing 'u64 *' (aka 'unsigned long long *') to parameter of type 'phys_addr_t *' (aka 'unsigned int *') [-Werror,-Wincompatible-pointer-types] &icm_mr->dm.addr, &icm_mr->dm.obj_id); ^~~~~~~~~~~~~~~~ include/linux/mlx5/driver.h:1092:39: note: passing argument to parameter 'addr' here u64 length, u16 uid, phys_addr_t *addr, u32 *obj_id); ^ 1 error generated. Use phys_addr_t for addr's type in mlx5dr_icm_dm, which won't change anything with 64-bit builds because phys_addr_t is u64 when CONFIG_PHYS_ADDR_T_64BIT is set, which is always when CONFIG_64BIT is set. Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator") Link: https://github.com/ClangBuiltLinux/linux/issues/653 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-10net/mlx5: Fix rt's type in dr_action_create_reformat_actionNathan Chancellor
clang warns: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1080:9: warning: implicit conversion from enumeration type 'enum mlx5_reformat_ctx_type' to different enumeration type 'enum mlx5dr_action_type' [-Wenum-conversion] rt = MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL; ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1082:9: warning: implicit conversion from enumeration type 'enum mlx5_reformat_ctx_type' to different enumeration type 'enum mlx5dr_action_type' [-Wenum-conversion] rt = MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL; ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1084:51: warning: implicit conversion from enumeration type 'enum mlx5dr_action_type' to different enumeration type 'enum mlx5_reformat_ctx_type' [-Wenum-conversion] ret = mlx5dr_cmd_create_reformat_ctx(dmn->mdev, rt, data_sz, data, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^~ 3 warnings generated. Use the right type for rt, which is mlx5_reformat_ctx_type so there are no warnings about mismatched types. Fixes: 9db810ed2d37 ("net/mlx5: DR, Expose steering action functionality") Link: https://github.com/ClangBuiltLinux/linux/issues/652 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Austin Kim <austindh.kim@gmail.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-10netfilter: nft_synproxy: add synproxy stateful object supportFernando Fernandez Mancera
Register a new synproxy stateful object type into the stateful object infrastructure. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-10hwmon: (shtc1) add support for the SHTC3 sensorDan Robertson
Add support for the Sensirion SHTC3 humidity and temperature sensor to the shtc1 module. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Link: https://lore.kernel.org/r/20190905014554.21658-2-dan@dlrobertson.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-09-10hwmon: (shtc1) fix shtc1 and shtw1 id maskDan Robertson
Fix an error in the bitmaskfor the shtc1 and shtw1 bitmask used to retrieve the chip ID from the ID register. See section 5.7 of the shtw1 or shtc1 datasheet for details. Fixes: 1a539d372edd9832444e7a3daa710c444c014dc9 ("hwmon: add support for Sensirion SHTC1 sensor") Signed-off-by: Dan Robertson <dan@dlrobertson.com> Link: https://lore.kernel.org/r/20190905014554.21658-3-dan@dlrobertson.com [groeck: Reordered to be first in series and adjusted accordingly] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-09-10iocost_monitor: Report debtTejun Heo
Report debt and rename del_ms row to delay for consistency. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10iocost_monitor: Report more info with higher accuracyTejun Heo
When outputting json: * Don't truncate numbers. * Report address of iocg to ease drilling down further. When outputting table: * Use math.ceil() for delay_ms so that small delays don't read as 0. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10iocost_monitor: Always use strings for json valuesTejun Heo
Json has limited accuracy for numbers and can silently truncate 64bit values, which can be extremely confusing. Let's consistently use string encapsulated values for json output. While at it, convert an unnecesary f-string to str(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10blk-iocost: Don't let merges push vtime into the futureTejun Heo
Merges have the same problem that forced-bios had which is fixed by the previous patch. The cost of a merge is calculated at the time of issue and force-advances vtime into the future. Until global vtime catches up, how the cgroup's hweight changes in the meantime doesn't matter and it often leads to situations where the cost is calculated at one hweight and paid at a very different one. See the previous patch for more details. Fix it by never advancing vtime into the future for merges. If budget is available, vtime is advanced. Otherwise, the cost is charged as debt. This brings merge cost handling in line with issue cost handling in ioc_rqos_throttle(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10blk-iocost: Account force-charged overage in absolute vtimeTejun Heo
Currently, when a bio needs to be force-charged and there isn't enough budget, vtime is simply pushed into the future. This means that the cost of the whole bio is scaled using the current hweight and then charged immediately. Until the global vtime advances beyond this future vtime, the cgroup won't be allowed to issue normal IOs. This is incorrect and can lead to, for example, exploding vrate or extended stalls if vrate range is constrained. Consider the following scenario. 1. A cgroup with a very low hweight runs out of budget. 2. A storm of swap-out happens on it. All of them are scaled according to the current low hweight and charged to vtime pushing it to a far future. 3. All other cgroups go idle and now the above cgroup has access to the whole device. However, because vtime is already wound using the past low hweight, what its current hweight is doesn't matter until global vtime catches up to the local vtime. 4. As a result, either vrate gets ramped up extremely or the IOs stall while the underlying device is idle. This is because the hweight the overage is calculated at is different from the hweight that it's being paid at. Fix it by remembering the overage in absoulte vtime and continuously paying with the actual budget according to the current hweight at each period. Note that non-forced bios which wait already remembers the cost in absolute vtime. This brings forced-bio accounting in line. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10blk-iocost: Fix incorrect operation order during iocg freeTejun Heo
ioc_pd_free() first cancels the hrtimers and then deactivates the iocg. However, the iocg timer can run inbetween and reschedule the hrtimers which will end up running after the iocg is freed leading to crashes like the following. general protection fault: 0000 [#1] SMP ... RIP: 0010:iocg_kick_delay+0xbe/0x1b0 RSP: 0018:ffffc90003598ea0 EFLAGS: 00010046 RAX: 1cee00fd69512b54 RBX: ffff8881bba48400 RCX: 00000000000003e8 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881bba48400 RBP: 0000000000004e20 R08: 0000000000000002 R09: 00000000000003e8 R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90003598ef0 R13: 00979f3810ad461f R14: ffff8881bba4b400 R15: 25439f950d26e1d1 FS: 0000000000000000(0000) GS:ffff88885f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f64328c7e40 CR3: 0000000002409005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> iocg_delay_timer_fn+0x3d/0x60 __hrtimer_run_queues+0xfe/0x270 hrtimer_interrupt+0xf4/0x210 smp_apic_timer_interrupt+0x5e/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> Fix it by canceling hrtimers after deactivating the iocg. Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-10sctp: fix the missing put_user when dumping transport thresholdsXin Long
This issue causes SCTP_PEER_ADDR_THLDS sockopt not to be able to dump a transport thresholds info. Fix it by adding 'goto' put_user in sctp_getsockopt_paddr_thresholds. Fixes: 8add543e369d ("sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10sch_hhf: ensure quantum and hhf_non_hh_weight are non-zeroCong Wang
In case of TCA_HHF_NON_HH_WEIGHT or TCA_HHF_QUANTUM is zero, it would make no progress inside the loop in hhf_dequeue() thus kernel would get stuck. Fix this by checking this corner case in hhf_change(). Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Reported-by: syzbot+bc6297c11f19ee807dc2@syzkaller.appspotmail.com Reported-by: syzbot+041483004a7f45f1f20a@syzkaller.appspotmail.com Reported-by: syzbot+55be5f513bed37fc4367@syzkaller.appspotmail.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Terry Lam <vtlam@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10net_sched: check cops->tcf_block in tc_bind_tclass()Cong Wang
At least sch_red and sch_tbf don't implement ->tcf_block() while still have a non-zero tc "class". Instead of adding nop implementations to each of such qdisc's, we can just relax the check of cops->tcf_block() in tc_bind_tclass(). They don't support TC filter anyway. Reported-by: syzbot+21b29db13c065852f64b@syzkaller.appspotmail.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM codeSean Christopherson
Move RDMSR and WRMSR emulation into common x86 code to consolidate nearly identical SVM and VMX code. Note, consolidating RDMSR introduces an extra indirect call, i.e. retpoline, due to reaching {svm,vmx}_get_msr() via kvm_x86_ops, but a guest kernel likely has bigger problems if increasing the latency of RDMSR VM-Exits by ~70 cycles has a measurable impact on overall VM performance. E.g. the only recurring RDMSR VM-Exits (after booting) on my system running Linux 5.2 in the guest are for MSR_IA32_TSC_ADJUST via arch_cpu_idle_enter(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-10KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callersSean Christopherson
Refactor the top-level MSR accessors to take/return the index and value directly instead of requiring the caller to dump them into a msr_data struct. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>