summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-08-03atm: idt77252: avoid accessing the data mapped to streaming DMAJia-Ju Bai
In queue_skb(), skb->data is mapped to streaming DMA on line 850: dma_map_single(..., skb->data, ...); Then skb->data is accessed on lines 862 and 863: tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) | (skb->data[2] << 8) | (skb->data[3] << 0); and on lines 893 and 894: tbd->word_4 = (skb->data[0] << 24) | (skb->data[1] << 16) | (skb->data[2] << 8) | (skb->data[3] << 0); These accesses may cause data inconsistency between CPU cache and hardware. To fix this problem, the calculation result of skb->data is stored in a local variable before DMA mapping, and then the driver accesses this local variable instead of skb->data. Signed-off-by: Jia-Ju Bai <baijiaju@tsinghua.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03of: reserved-memory: remove duplicated call to of_get_flat_dt_prop() for ↵Yue Hu
no-map node Just use nomap instead of the second call to of_get_flat_dt_prop(). And change nomap as a bool type due to != NULL operator. Also, correct comment about node of 'align' -> 'alignment'. Signed-off-by: Yue Hu <huyue2@yulong.com> Link: https://lore.kernel.org/r/20200730092353.15644-1-zbestahu@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-03atm: eni: avoid accessing the data mapped to streaming DMAJia-Ju Bai
In do_tx(), skb->data is mapped to streaming DMA on line 1111: paddr = dma_map_single(...,skb->data,DMA_TO_DEVICE); Then skb->data is accessed on line 1153: (skb->data[3] & 0xf) This access may cause data inconsistency between CPU cache and hardware. To fix this problem, skb->data[3] is assigned to a local variable before DMA mapping, and then the driver accesses this local variable instead of skb->data[3]. Signed-off-by: Jia-Ju Bai <baijiaju@tsinghua.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: phy: mdio-mvusb: select MDIO_DEVRES in KconfigBartosz Golaszewski
PHYLIB is not selected by the mvusb driver but it uses mdio devres helpers. Explicitly select MDIO_DEVRES in this driver's Kconfig entry. Reported-by: kernel test robot <lkp@intel.com> Fixes: 1814cff26739 ("net: phy: add a Kconfig option for mdio_devres") Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03appletalk: Fix atalk_proc_init() return pathVincent Duvert
Add a missing return statement to atalk_proc_init so it doesn't return -ENOMEM when successful. This allows the appletalk module to load properly. Fixes: e2bcd8b0ce6e ("appletalk: use remove_proc_subtree to simplify procfs code") Link: https://www.downtowndougbrown.com/2020/08/hacking-up-a-fix-for-the-broken-appletalk-kernel-module-in-linux-5-1-and-newer/ Reported-by: Christopher KOBAYASHI <chris@disavowed.jp> Reported-by: Doug Brown <doug@downtowndougbrown.com> Signed-off-by: Vincent Duvert <vincent.ldev@duvert.net> [lukas: add missing tags] Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v5.1+ Cc: Yue Haibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: dsa: qca8k: Add 802.1q VLAN supportJonathan McDowell
This adds full 802.1q VLAN support to the qca8k, allowing the use of vlan_filtering and more complicated bridging setups than allowed by basic port VLAN support. Tested with a number of untagged ports with separate VLANs and then a trunk port with all the VLANs tagged on it. v3: - Pull QCA8K_PORT_VID_DEF changes into separate cleanup patch - Reverse Christmas tree notation for variable definitions - Use untagged instead of tagged for consistency v2: - Return sensible errnos on failure rather than -1 (rmk) - Style cleanups based on Florian's feedback - Silently allow VLAN 0 as device correctly treats this as no tag Signed-off-by: Jonathan McDowell <noodles@earth.li> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: dsa: qca8k: Add define for port VIDJonathan McDowell
Rather than using a magic value of 1 when configuring the port VIDs add a QCA8K_PORT_VID_DEF define and use that instead. Also fix up the bitmask in the process; the top 4 bits are reserved so this wasn't a problem, but only masking 12 bits is the correct approach. Signed-off-by: Jonathan McDowell <noodles@earth.li> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge branch '100GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-08-01 This series contains updates to the ice driver only. Wei Yongjun marks power management functions with __maybe_unused. Nick disables VLAN pruning in promiscuous mode and renames grst_delay to grst_timeout. Kiran modifies the check for linearization and corrects the vsi_id mask value. Vignesh replaces the use of flow profile locks to RSS profile locks for RSS rule removal. Destroys flow profile lock on clearing XLT table and clears extraction sequence entries. Jesse adds some statistics and removes an unreported one. Brett allows for 2 queue configuration for VFs. Surabhi adds a check for failed allocation of an extraction sequence table. Tony updates the PTYPE lookup table and makes other trivial fixes. Victor extends profile ID locks to be held until all references are completed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: Pass NULL to skb_network_protocol() when we don't care about vlan depthMiaohe Lin
When we don't care about vlan depth, we could pass NULL instead of the address of a unused local variable to skb_network_protocol() as a param. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: Use __skb_pagelen() directly in skb_cow_data()Miaohe Lin
In fact, skb_pagelen() - skb_headlen() is equal to __skb_pagelen(), use it directly to avoid unnecessary skb_headlen() call. Also fix the CHECK note of checkpatch.pl: Comparison to NULL could be written "!__pskb_pull_tail" Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: qed: use eth_zero_addr() to clear mac addressMiaohe Lin
Use eth_zero_addr() to clear mac address instead of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: qede: use eth_zero_addr() to clear mac addressMiaohe Lin
Use eth_zero_addr() to clear mac address instead of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03cxgb4: fix extracting IP addresses in TC-FLOWER rulesRahul Lakkireddy
commit c8729cac2a11 ("cxgb4: add ethtool n-tuple filter insertion") has removed checking control key for determining IP address types for TC-FLOWER rules, which causes all the rules being inserted to hardware to become IPv6 rule type always. So, add back the check to select the correct IP address type to extract and hence fix the correct rule type being inserted to hardware. Also, ethtool_rx_flow_key doesn't have any control key and instead directly sets the IPv4/IPv6 address keys. So, explicitly set the IP address type for ethtool n-tuple filters to reuse the same code. Fixes: c8729cac2a11 ("cxgb4: add ethtool n-tuple filter insertion") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03cxgb4: fix check for running offline ethtool selftestRahul Lakkireddy
The flag indicating the selftest to run is a bitmask. So, fix the check. Also, the selftests will fail if adapter initialization has not been completed yet. So, add appropriate check and bail sooner. Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge branch 'ionic-txrx-updates'David S. Miller
Shannon Nelson says: ==================== ionic txrx updates These are a few patches to do some cleanup in the packet handling and give us more flexibility in tuning performance by allowing us to put Tx handling on separate interrupts when it makes sense for particular traffic loads. v3: simplified queue count change logging, removed unnecessary check for no count change v2: dropped the original patch 2 for ringsize change changed the separated tx/rx interrupts to use ethtool -L ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03ionic: separate interrupt for Tx and RxShannon Nelson
Add the capability to split the Tx queues onto their own interrupts with their own napi contexts. This gives the opportunity for more direct control of Tx interrupt handling, such as CPU affinity and interrupt coalescing, useful for some traffic loads. v2: use ethtool -L, not a vendor specific priv-flag v3: simplify logging, drop unnecessary "no-change" tests Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03ionic: tx separate servicingShannon Nelson
We give the tx clean path its own budget and service routine in order to give a little more leeway to be more aggressive, and in preparation for coming changes. We've found this gives us a little better performance in some packet processing scenarios without hurting other scenarios. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03ionic: use fewer firmware doorbells on rx fillShannon Nelson
We really don't need to hit the Rx queue doorbell so many times, we can wait to the end and cause a little less thrash. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: gre: recompute gre csum for sctp over gre tunnelsLorenzo Bianconi
The GRE tunnel can be used to transport traffic that does not rely on a Internet checksum (e.g. SCTP). The issue can be triggered creating a GRE or GRETAP tunnel and transmitting SCTP traffic ontop of it where CRC offload has been disabled. In order to fix the issue we need to recompute the GRE csum in gre_gso_segment() not relying on the inner checksum. The issue is still present when we have the CRC offload enabled. In this case we need to disable the CRC offload if we require GRE checksum since otherwise skb_checksum() will report a wrong value. Fixes: 90017accff61 ("sctp: Add GSO support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: bridge: clear bridge's private skb space on xmitNikolay Aleksandrov
We need to clear all of the bridge private skb variables as they can be stale due to the packet being recirculated through the stack and then transmitted through the bridge device. Similar memset is already done on bridge's input. We've seen cases where proxyarp_replied was 1 on routed multicast packets transmitted through the bridge to ports with neigh suppress which were getting dropped. Same thing can in theory happen with the port isolation bit as well. Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03of: unittest: Use bigger address cells to catch parser regressionsNicolas Saenz Julienne
Getting address and size cells for dma-ranges/ranges parsing is tricky and shouldn't rely on the node's count_cells() method. The function starts looking for cells on the parent node, as its supposed to work with device nodes, which doesn't work when input with bus nodes, as generally done when parsing ranges. Add test to catch regressions on that specific quirk as developers will be tempted to edit it out in favor of the default method. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Link: https://lore.kernel.org/r/9200970a917a9cabdc5b17483b5a8725111eb9d0.camel@suse.de Signed-off-by: Rob Herring <robh@kernel.org>
2020-08-03ipv6/addrconf: use a boolean to choose between UNREGISTER/DOWNFlorent Fourcot
"how" was used as a boolean. Change the type to bool, and improve variable name Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03ipv6/addrconf: call addrconf_ifdown with consistent valuesFlorent Fourcot
Second parameter of addrconf_ifdown "how" is used as a boolean internally. It does not make sense to call it with something different of 0 or 1. This value is set to 2 in all git history. Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge branch 'net-openvswitch-masks-cache-enhancements'David S. Miller
Eelco Chaudron says: ==================== net: openvswitch: masks cache enhancements This patchset adds two enhancements to the Open vSwitch masks cache. Changes in v4 [patch 2/2 only]: - Remove null check before calling free_percpu() - Make ovs_dp_change() return appropriate error codes Changes in v3 [patch 2/2 only]: - Use is_power_of_2() function - Use array_size() function - Fix remaining sparse errors Changes in v2 [patch 2/2 only]: - Fix sparse warnings - Fix netlink policy items reported by Florian Westphal ==================== Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: openvswitch: make masks cache size configurableEelco Chaudron
This patch makes the masks cache size configurable, or with a size of 0, disable it. Reviewed-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: openvswitch: add masks cache hit counterEelco Chaudron
Add a counter that counts the number of masks cache hits, and export it through the megaflow netlink statistics. Reviewed-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: mvpp2: fix memory leak in mvpp2_rxLorenzo Bianconi
Release skb memory in mvpp2_rx() if mvpp2_rx_refill routine fails Fixes: b5015854674b ("net: mvpp2: fix refilling BM pools in RX path") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03ethtool: ethnl_set_linkmodes: remove redundant null checkGaurav Singh
info cannot be NULL here since its being accessed earlier in the function: nlmsg_parse(info->nlhdr...). Remove this redundant NULL check. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03openvswitch: Prevent kernel-infoleak in ovs_ct_put_key()Peilin Ye
ovs_ct_put_key() is potentially copying uninitialized kernel stack memory into socket buffers, since the compiler may leave a 3-byte hole at the end of `struct ovs_key_ct_tuple_ipv4` and `struct ovs_key_ct_tuple_ipv6`. Fix it by initializing `orig` with memset(). Fixes: 9dd7f8907c37 ("openvswitch: Add original direction conntrack tuple to sw_flow_key.") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net/sched: act_ct: fix miss set mru for ovs after defrag in act_ctwenxu
When openvswitch conntrack offload with act_ct action. Fragment packets defrag in the ingress tc act_ct action and miss the next chain. Then the packet pass to the openvswitch datapath without the mru. The over mtu packet will be dropped in output action in openvswitch for over mtu. "kernel: net2: dropped over-mtu packet: 1528 > 1500" This patch add mru in the tc_skb_ext for adefrag and miss next chain situation. And also add mru in the qdisc_skb_cb. The act_ct set the mru to the qdisc_skb_cb when the packet defrag. And When the chain miss, The mru is set to tc_skb_ext which can be got by ovs datapath. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge branch 'Improve-MDIO-Ethernet-PHY-reset'David S. Miller
Bruno Thomsen says: ==================== Improve MDIO Ethernet PHY reset This patch series is a result of trying to upstream a new device tree for a TQMa7D based board[1][2]. Initial this DTS used some deprecated PHY reset properties on the FEC device; NXP Ethernet MAC also known as Freescale Fast Ethernet Controller. When switching from FEC properties[3]: "phy-reset-gpios" "phy-reset-duration" "phy-reset-post-delay" To MDIO PHY properties[4]: "reset-gpios" "reset-assert-us" "reset-deassert-us" The result was that no Ethernet PHY device was detected on boot. This issue could be worked around by disabling PHY type ID auto- detection by using "ethernet-phy-id0022.1560" as compatible string and not "ethernet-phy-ieee802.3-c22". Upstreaming a DTS with this workaround was not accepted, so I digged into the MDIO reset flow and found that it had a few missing parts compared to the deprecated FEC reset function. After some more testing and logic analyzer traces it was revealed that the failed PHY communication was due to missing initial device reset. I was suggested[5] in a earlier mail thread to use MDIO bus reset as that was performed before auto-detection, but current device tree binding was limited to reset assert in usec. Microchip/Micrel Ethernet PHYs recommended reset circuit[8], figure 7-12, is a little "slow" after reset deassert as that is left to a RC circuit with a tau of ~100ms; using a 10k PU resistor together with a 10uF decoupling capacitor. The diode in serie of the reset signal converts the GPIO push-pull output into a open-drain output. So a post reset delay in the range of 500-1000ms is needed, depending on component tolerances and general hardware design margins. In the first version of this patch series[6] I reused the "reset-delay-us" property for reset deassert in usec as that would cause 50/50% duty-cycle, but that would always apply. The solution in this patch series is to add a new MDIO bus property, so post reset delay is optional and configured separately. MDIO bus properties[7]: "reset-delay-us" "reset-post-delay-us" (new) I have not marked this with "Fixes:" as no single commit is the cause and historically this code has only supported MDIO devices that need reset after auto-detection. The patch series also uses a new flexible sleep helper function that was introduced in 5.8-rc1, so the driver uses the optimal sleep function depending on value loaded from device tree. Future work in this area could add new properties on the MDIO device, so reset points are configurable, e.g. no reset, before/after auto-detection or both. [1] https://lore.kernel.org/linux-devicetree/20200629114927.17379-2-bruno.thomsen@gmail.com/ [2] https://lore.kernel.org/linux-devicetree/20200716172611.5349-2-bruno.thomsen@gmail.com/ [3] https://elixir.bootlin.com/linux/v5.7.8/source/Documentation/devicetree/bindings/net/fsl-fec.txt#L44 [4] https://elixir.bootlin.com/linux/v5.8-rc4/source/Documentation/devicetree/bindings/net/mdio.yaml#L78 [5] https://lore.kernel.org/netdev/CAOMZO5DtYDomD8FDCZDwYCSr2AwNT81Ay4==aDxXyBxtyvPiJA@mail.gmail.com/ [6] https://lore.kernel.org/netdev/20200728090203.17313-1-bruno.thomsen@gmail.com/ [7] https://elixir.bootlin.com/linux/v5.8-rc4/source/Documentation/devicetree/bindings/net/mdio.yaml#L36 [8] http://ww1.microchip.com/downloads/en/DeviceDoc/00002202C.pdf ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: mdio device: use flexible sleeping in reset functionBruno Thomsen
MDIO device reset assert and deassert length was created by usleep_range() but that does not ensure optimal handling of all the different values from device tree properties. By switching to the new flexible sleeping helper function, fsleep(), the correct delay function is called depending on delay length, e.g. udelay(), usleep_range() or msleep(). Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: mdiobus: add reset-post-delay-us handlingBruno Thomsen
Load new "reset-post-delay-us" value from MDIO properties, and if configured to a greater then zero delay do a flexible sleeping delay after MDIO bus reset deassert. This allows devices to exit reset state before start bus communication. Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03net: mdiobus: use flexible sleeping for reset-delay-usBruno Thomsen
MDIO bus reset pulse width is created by using udelay() and that function might not be optimal depending on device tree value. By switching to the new fsleep() helper the correct delay function is called depending on delay length, e.g. udelay(), usleep_range() or msleep(). Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03dt-bindings: net: mdio: add reset-post-delay-us propertyBruno Thomsen
Add "reset-post-delay-us" parameter to MDIO bus properties, so it's possible to add a delay after reset deassert. This is optional in case external hardware slows down release of the reset signal. Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03Merge tag 'sched-core-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve uclamp performance by using a static key for the fast path - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for better power efficiency of RT tasks on battery powered devices. (The default is to maximize performance & reduce RT latencies.) - Improve utime and stime tracking accuracy, which had a fixed boundary of error, which created larger and larger relative errors as the values become larger. This is now replaced with more precise arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h. - Improve the deadline scheduler, such as making it capacity aware - Improve frequency-invariant scheduling - Misc cleanups in energy/power aware scheduling - Add sched_update_nr_running tracepoint to track changes to nr_running - Documentation additions and updates - Misc cleanups and smaller fixes * tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst sched/doc: Document capacity aware scheduling sched: Document arch_scale_*_capacity() arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE Documentation/sysctl: Document uclamp sysctl knobs sched/uclamp: Add a new sysctl to control RT default boost value sched/uclamp: Fix a deadlock when enabling uclamp static key sched: Remove duplicated tick_nohz_full_enabled() check sched: Fix a typo in a comment sched/uclamp: Remove unnecessary mutex_init() arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry arch_topology, sched/core: Cleanup thermal pressure definition trace/events/sched.h: fix duplicated word linux/sched/mm.h: drop duplicated words in comments smp: Fix a potential usage of stale nr_cpus sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal sched: nohz: stop passing around unused "ticks" parameter. sched: Better document ttwu() sched: Add a tracepoint to track rq->nr_running ...
2020-08-03Merge tag 'perf-core-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event updates from Ingo Molnar: "HW support updates: - Add uncore support for Intel Comet Lake - Add RAPL support for Hygon Fam18h - Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs, which enumerates per device performance counters via sysfs and enables the perf stat --iiostat functionality - Add support for Intel "Architectural LBRs", which generalized the model specific LBR hardware tracing feature into a model-independent, architected performance monitoring feature. Usage is mostly seamless to tooling, as the pre-existing LBR features are kept, but there's a couple of advantages under the hood, such as faster context-switching, faster LBR reads, cleaner exposure of LBR features to guest kernels, etc. ( Since architectural LBRs are supported via XSAVE, there's related changes to the x86 FPU code as well. ) ftrace/perf updates: - Add support to add a text poke event to record changes to kernel text (i.e. self-modifying code) in order to support tracers like Intel PT decoding through jump labels, kprobes and ftrace trampolines. Misc cleanups, smaller fixes..." * tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) perf/x86/rapl: Add Hygon Fam18h RAPL support kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer() x86/perf: Fix a typo perf: <linux/perf_event.h>: drop a duplicated word perf/x86/intel/lbr: Support XSAVES for arch LBR read perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature x86/fpu/xstate: Support dynamic supervisor feature for LBR x86/fpu: Use proper mask to replace full instruction mask perf/x86: Remove task_ctx_size perf/x86/intel/lbr: Create kmem_cache for the LBR context data perf/core: Use kmem_cache to allocate the PMU specific data perf/core: Factor out functions to allocate/free the task_ctx_data perf/x86/intel/lbr: Support Architectural LBR perf/x86/intel/lbr: Factor out intel_pmu_store_lbr perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all() perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline perf/x86/intel/lbr: Unify the stored format of LBR information perf/x86/intel/lbr: Support LBR_CTL perf/x86: Expose CPUID enumeration bits for arch LBR ...
2020-08-03Merge tag 'objtool-core-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Add support for non-rela relocations, in preparation to merge 'recordmcount' functionality into objtool - Fix assumption that broke under --ffunction-sections (LTO) builds - Misc cleanups * tag 'objtool-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add support for relocations without addends objtool: Rename rela to reloc objtool: Use sh_info to find the base for .rela sections objtool: Do not assume order of parent/child functions
2020-08-03Merge tag 'locking-core-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops. - KCSAN updates: the most important change is that GCC 11 now has all fixes in place to support KCSAN, so GCC support can be enabled again. Also more annotations. - futex updates: minor cleanups and simplifications - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities. - lockdep updates: - simplify IRQ trace event handling - add various new debug checks - simplify header dependencies, split out <linux/lockdep_types.h>, decouple lockdep from other low level headers some more - fix NMI handling - misc cleanups and smaller fixes * tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) kcsan: Improve IRQ state trace reporting lockdep: Refactor IRQ trace events fields into struct seqlock: lockdep assert non-preemptibility on seqcount_t write lockdep: Add preemption enabled/disabled assertion APIs seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount() seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs seqlock: Reorder seqcount_t and seqlock_t API definitions seqlock: seqcount_t latch: End read sections with read_seqcount_retry() seqlock: Properly format kernel-doc code samples Documentation: locking: Describe seqlock design and usage locking/qspinlock: Do not include atomic.h from qspinlock_types.h locking/atomic: Move ATOMIC_INIT into linux/types.h lockdep: Move list.h inclusion into lockdep.h locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs futex: Remove unused or redundant includes futex: Consistently use fshared as boolean futex: Remove needless goto's futex: Remove put_futex_key() rwsem: fix commas in initialisation docs: locking: Replace HTTP links with HTTPS ones ...
2020-08-03bpf: Allow to specify ifindex for skb in bpf_prog_test_run_skbDmitry Yakunin
Now skb->dev is unconditionally set to the loopback device in current net namespace. But if we want to test bpf program which contains code branch based on ifindex condition (eg filters out localhost packets) it is useful to allow specifying of ifindex from userspace. This patch adds such option through ctx_in (__sk_buff) parameter. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200803090545.82046-3-zeil@yandex-team.ru
2020-08-03bpf: Setup socket family and addresses in bpf_prog_test_run_skbDmitry Yakunin
Now it's impossible to test all branches of cgroup_skb bpf program which accesses skb->family and skb->{local,remote}_ip{4,6} fields because they are zeroed during socket allocation. This commit fills socket family and addresses from related fields in constructed skb. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200803090545.82046-2-zeil@yandex-team.ru
2020-08-03Merge tag 'core-rcu-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: - kfree_rcu updates - RCU tasks updates - Read-side scalability tests - SRCU updates - Torture-test updates - Documentation updates - Miscellaneous fixes * tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (109 commits) torture: Remove obsolete "cd $KVM" torture: Avoid duplicate specification of qemu command torture: Dump ftrace at shutdown only if requested torture: Add kvm-tranform.sh script for qemu-cmd files torture: Add more tracing crib notes to kvm.sh torture: Improve diagnostic for KCSAN-incapable compilers torture: Correctly summarize build-only runs torture: Pass --kmake-arg to all make invocations rcutorture: Check for unwatched readers torture: Abstract out console-log error detection torture: Add a stop-run capability torture: Create qemu-cmd in --buildonly runs rcu/rcutorture: Replace 0 with false torture: Add --allcpus argument to the kvm.sh script torture: Remove whitespace from identify_qemu_vcpus output rcutorture: NULL rcu_torture_current earlier in cleanup code rcutorture: Handle non-statistic bang-string error messages torture: Set configfile variable to current scenario rcutorture: Add races with task-exit processing locktorture: Use true and false to assign to bool variables ...
2020-08-03Merge tag 'core-headers-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull header cleanup from Ingo Molnar: "Separate out the instrumentation_begin()/end() bits from compiler.h" * tag 'core-headers-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compiler.h: Move instrumentation_begin()/end() to new <linux/instrumentation.h> header
2020-08-03Merge tag 'core-debugobjects-2020-08-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects cleanup from Ingo Molnar: "A single commit which simplifies a debugfs attribute definition" * tag 'core-debugobjects-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: debugobjects: Convert to DEFINE_SHOW_ATTRIBUTE
2020-08-03Merge tag 'irq-urgent-2020-08-02' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Fix a recent IRQ affinities regression, add in a missing debugfs printout that helps the debugging of IRQ affinity logic bugs, and fix a memory leak" * tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/debugfs: Add missing irqchip flags genirq/affinity: Make affinity setting if activated opt-in irqdomain/treewide: Free firmware node after domain removal
2020-08-03Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 and cross-arch updates from Catalin Marinas: "Here's a slightly wider-spread set of updates for 5.9. Going outside the usual arch/arm64/ area is the removal of read_barrier_depends() series from Will and the MSI/IOMMU ID translation series from Lorenzo. The notable arm64 updates include ARMv8.4 TLBI range operations and translation level hint, time namespace support, and perf. Summary: - Removal of the tremendously unpopular read_barrier_depends() barrier, which is a NOP on all architectures apart from Alpha, in favour of allowing architectures to override READ_ONCE() and do whatever dance they need to do to ensure address dependencies provide LOAD -> LOAD/STORE ordering. This work also offers a potential solution if compilers are shown to convert LOAD -> LOAD address dependencies into control dependencies (e.g. under LTO), as weakly ordered architectures will effectively be able to upgrade READ_ONCE() to smp_load_acquire(). The latter case is not used yet, but will be discussed further at LPC. - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter and apply the resulting changes to the device ID space provided by the Freescale FSL bus. - arm64 support for TLBI range operations and translation table level hints (part of the ARMv8.4 architecture version). - Time namespace support for arm64. - Export the virtual and physical address sizes in vmcoreinfo for makedumpfile and crash utilities. - CPU feature handling cleanups and checks for programmer errors (overlapping bit-fields). - ACPI updates for arm64: disallow AML accesses to EFI code regions and kernel memory. - perf updates for arm64. - Miscellaneous fixes and cleanups, most notably PLT counting optimisation for module loading, recordmcount fix to ignore relocations other than R_AARCH64_CALL26, CMA areas reserved for gigantic pages on 16K and 64K configurations. - Trivial typos, duplicate words" Link: http://lkml.kernel.org/r/20200710165203.31284-1-will@kernel.org Link: http://lkml.kernel.org/r/20200619082013.13661-1-lorenzo.pieralisi@arm.com * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits) arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page ...
2020-08-03Merge tag 'm68k-for-v5.9-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - several Kbuild improvements - several Mac fixes - minor cleanups and fixes - defconfig updates * tag 'm68k-for-v5.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: defconfig: Update defconfigs for v5.8-rc3 m68k: Use CLEAN_FILES to clean up files m68k: mac: Improve IOP debug messages m68k: mac: Don't send uninitialized data in IOP message reply m68k: mac: Fix IOP status/control register writes m68k: mac: Don't send IOP message until channel is idle m68k: atari: Annotate dummy read in ROM port IO code as __maybe_unused m68k: Use sizeof_field() helper m68k: Pass -D options to KBUILD_CPPFLAGS instead of KBUILD_{A,C}FLAGS m68k: Optimize cc-option calls for cpuflags-y m68k: sun3: Descend to prom from arch/m68k/sun3 m68k: Add arch/m68k/Kbuild
2020-08-03Merge tag 'rm-unicore32' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux Pull unicore32 removal from Mike Rapoport: "Remove unicore32 support. The unicore32 port do not seem maintained for a long time now, there is no upstream toolchain that can create unicore32 binaries and all the links to prebuilt toolchains for unicore32 are dead. Even compilers that were available are not supported by the kernel anymore. Guenter Roeck says: "I have stopped building unicore32 images since v4.19 since there is no available compiler that is still supported by the kernel. I am surprised that support for it has not been removed from the kernel" However, it's worth pointing out two things: - Guan Xuetao is still listed as maintainer and asked for the port to be kept around the last time Arnd suggested removing it two years ago. He promised that there would be compiler sources (presumably llvm), but has not made those available since. - https://github.com/gxt has patches to linux-4.9 and qemu-2.7, both released in 2016, with patches dated early 2019. These patches mainly restore a syscall ABI that was never part of mainline Linux but apparently used in production. qemu-2.8 removed support for that ABI and newer kernels (4.19+) can no longer be built with the old toolchain, so apparently there will not be any future updates to that git tree" * tag 'rm-unicore32' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux: MAINTAINERS: remove "PKUNITY SOC DRIVERS" entry rtc: remove fb-puv3 driver video: fbdev: remove fb-puv3 driver pwm: remove pwm-puv3 driver input: i8042: remove support for 8042-unicore32io i2c/buses: remove i2c-puv3 driver cpufreq: remove unicore32 driver arch: remove unicore32 port
2020-08-03Merge tag 's390-5.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add support for function error injection. - Add support for custom exception handlers, as required by BPF_PROBE_MEM. - Add support for BPF_PROBE_MEM. - Add trace events for idle enter / exit for the s390 specific idle implementation. - Remove unused zcore memmmap device. - Remove unused "raw view" from s390 debug feature. - AP bus + zcrypt device driver code refactoring. - Provide cex4 cca sysfs attributes for cex3 for zcrypt device driver. - Expose only minimal interface to walk physmem for mm/memblock. This is a common code change and it has been agreed on with Mike Rapoport and Andrew Morton that this can go upstream via the s390 tree. - Rework of the s390 vmem/vmmemap code to allow for future memory hot remove. - Get rid of FORCE_MAX_ZONEORDER to finally allow for order-10 allocations again, instead of only order-8 allocations. - Various small improvements and fixes. * tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits) s390/vmemmap: coding style updates s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive sections s390/vmemmap: remember unused sub-pmd ranges s390/vmemmap: fallback to PTEs if mapping large PMD fails s390/vmem: cleanup empty page tables s390/vmemmap: take the vmem_mutex when populating/freeing s390/vmemmap: cleanup when vmemmap_populate() fails s390/vmemmap: extend modify_pagetable() to handle vmemmap s390/vmem: consolidate vmem_add_range() and vmem_remove_range() s390/vmem: rename vmem_add_mem() to vmem_add_range() s390: enable HAVE_FUNCTION_ERROR_INJECTION s390/pci: clarify comment in s390_mmio_read/write s390/time: improve comparison for tod steering s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE s390/time: use CLOCKSOURCE_MASK s390/bpf: implement BPF_PROBE_MEM s390/kernel: expand exception table logic to allow new handling options s390/kernel: unify EX_TABLE* implementations s390/mm: allow order 10 allocations s390/mm: avoid trimming to MAX_ORDER ...
2020-08-03Merge tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: "Lots of cleanups in here, hardening the code and/or making it easier to read and fixing bugs, but a core feature/change too adding support for real async buffered reads. With the latter in place, we just need buffered write async support and we're done relying on kthreads for the fast path. In detail: - Cleanup how memory accounting is done on ring setup/free (Bijan) - sq array offset calculation fixup (Dmitry) - Consistently handle blocking off O_DIRECT submission path (me) - Support proper async buffered reads, instead of relying on kthread offload for that. This uses the page waitqueue to drive retries from task_work, like we handle poll based retry. (me) - IO completion optimizations (me) - Fix race with accounting and ring fd install (me) - Support EPOLLEXCLUSIVE (Jiufei) - Get rid of the io_kiocb unionizing, made possible by shrinking other bits (Pavel) - Completion side cleanups (Pavel) - Cleanup REQ_F_ flags handling, and kill off many of them (Pavel) - Request environment grabbing cleanups (Pavel) - File and socket read/write cleanups (Pavel) - Improve kiocb_set_rw_flags() (Pavel) - Tons of fixes and cleanups (Pavel) - IORING_SQ_NEED_WAKEUP clear fix (Xiaoguang)" * tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block: (127 commits) io_uring: flip if handling after io_setup_async_rw fs: optimise kiocb_set_rw_flags() io_uring: don't touch 'ctx' after installing file descriptor io_uring: get rid of atomic FAA for cq_timeouts io_uring: consolidate *_check_overflow accounting io_uring: fix stalled deferred requests io_uring: fix racy overflow count reporting io_uring: deduplicate __io_complete_rw() io_uring: de-unionise io_kiocb io-wq: update hash bits io_uring: fix missing io_queue_linked_timeout() io_uring: mark ->work uninitialised after cleanup io_uring: deduplicate io_grab_files() calls io_uring: don't do opcode prep twice io_uring: clear IORING_SQ_NEED_WAKEUP after executing task works io_uring: batch put_task_struct() tasks: add put_task_struct_many() io_uring: return locked and pinned page accounting io_uring: don't miscount pinned memory io_uring: don't open-code recv kbuf managment ...