summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-09-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2020-09-29 The following pull-request contains BPF updates for your *net* tree. We've added 7 non-merge commits during the last 14 day(s) which contain a total of 7 files changed, 28 insertions(+), 8 deletions(-). The main changes are: 1) fix xdp loading regression in libbpf for old kernels, from Andrii. 2) Do not discard packet when NETDEV_TX_BUSY, from Magnus. 3) Fix corner cases in libbpf related to endianness and kconfig, from Tony. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: mscc: ocelot: automatically detect VCAP constantsVladimir Oltean
The numbers in struct vcap_props are not intuitive to derive, because they are not a straightforward copy-and-paste from the reference manual but instead rely on a fairly detailed level of understanding of the layout of an entry in the TCAM and in the action RAM. For this reason, bugs are very easy to introduce here. Ease the work of hardware porters and read from hardware the constants that were exported for this particular purpose. Note that this implies that struct vcap_props can no longer be const. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and targetVladimir Oltean
As a preparation step for the offloading to ES0, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and targetVladimir Oltean
As a preparation step for the offloading to IS1, let's create the infrastructure for talking with this hardware block. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: mscc: ocelot: generalize existing code for VCAPVladimir Oltean
In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which have the same configuration interface, but different sets of keys and actions. The driver currently only supports VCAP IS2. In preparation of VCAP IS1 and ES0 support, the existing code must be generalized to work with any VCAP. In that direction, we should move the structures that depend upon VCAP instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct ocelot and into struct vcap_props .keys and .actions, a structure that is replicated 3 times, once per VCAP. We'll pass that structure as an argument to each function that does the key and action packing - only the control logic needs to distinguish between ocelot->vcap[VCAP_IS2] or IS1 or ES0. Another change is to make use of the newly introduced ocelot_target_read and ocelot_target_write API, since the 3 VCAPs have the same registers but put at different addresses. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: mscc: ocelot: introduce a new ocelot_target_{read,write} APIVladimir Oltean
There are some targets (register blocks) in the Ocelot switch that are instantiated more than once. For example, the VCAP IS1, IS2 and ES0 blocks all share the same register layout for interacting with the cache for the TCAM and the action RAM. For the VCAPs, the procedure for servicing them is actually common. We just need an API specifying which VCAP we are talking to, and we do that via these raw ocelot_target_read and ocelot_target_write accessors. In plain ocelot_read, the target is encoded into the register enum itself: u16 target = reg >> TARGET_OFFSET; For the VCAPs, the registers are currently defined like this: enum ocelot_reg { [...] S2_CORE_UPDATE_CTRL = S2 << TARGET_OFFSET, S2_CORE_MV_CFG, S2_CACHE_ENTRY_DAT, S2_CACHE_MASK_DAT, S2_CACHE_ACTION_DAT, S2_CACHE_CNT_DAT, S2_CACHE_TG_DAT, [...] }; which is precisely what we want to avoid, because we'd have to duplicate the same register map for S1 and for S0, and then figure out how to pass VCAP instance-specific registers to the ocelot_read calls (basically another lookup table that undoes the effect of shifting with TARGET_OFFSET). So for some targets, propose a more raw API, similar to what is currently done with ocelot_port_readl and ocelot_port_writel. Those targets can only be accessed with ocelot_target_{read,write} and not with ocelot_{read,write} after the conversion, which is fine. The VCAP registers are not actually modified to use this new API as of this patch. They will be modified in the next one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: Add netif_rx_any_context()Sebastian Andrzej Siewior
Quite some drivers make conditional decisions based on in_interrupt() to invoke either netif_rx() or netif_rx_ni(). Conditionals based on in_interrupt() or other variants of preempt count checks in drivers should not exist for various reasons and Linus clearly requested to either split the code pathes or pass an argument to the common functions which provides the context. This is obviously the correct solution, but for some of the affected drivers this needs a major rewrite due to their convoluted structure. As in_interrupt() usage in drivers needs to be phased out, provide netif_rx_any_context() as a stop gap for these drivers. This confines the in_interrupt() conditional to core code which in turn allows to remove the access to this check for driver code and provides one central place to do further modifications once the driver maze is cleaned up. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29net: caif: Remove unused caif SPI driverThomas Gleixner
While chasing in_interrupt() (ab)use in drivers it turned out that the caif_spi driver has never been in use since the driver was merged 10 years ago. There never was any matching code which provides a platform device. The driver has not seen any update (asided of treewide changes and cleanups) since 8 years and the maintainers vanished from the planet. So analysing the potential contexts and the (in)correctness of in_interrupt() usage is just a pointless exercise. Remove the cruft. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29devlink: include <linux/const.h> for _BITULJacob Keller
Commit 5d5b4128c4ca ("devlink: introduce flash update overwrite mask") added a usage of _BITUL to the UAPI <linux/devlink.h> header, but failed to include the header file where it was defined. It happens that this does not break any existing kernel include chains because it gets included through other sources. However, when including the UAPI headers in a userspace application (such as devlink in iproute2), _BITUL is not defined. Fixes: 5d5b4128c4ca ("devlink: introduce flash update overwrite mask") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29l2tp: report rx cookie discards in netlink getTom Parkin
When an L2TPv3 session receives a data frame with an incorrect cookie l2tp_core logs a warning message and bumps a stats counter to reflect the fact that the packet has been dropped. However, the stats counter in question is missing from the l2tp_netlink get message for tunnel and session instances. Include the statistic in the netlink get response. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2020-09-29 Here's the main bluetooth-next pull request for 5.10: - Multiple fixes to suspend/resume handling - Added mgmt events for controller suspend/resume state - Improved extended advertising support - btintel: Enhanced support for next generation controllers - Added Qualcomm Bluetooth SoC WCN6855 support - Several other smaller fixes & improvements ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-29bpf: Support attaching freplace programs to multiple attach pointsToke Høiland-Jørgensen
This enables support for attaching freplace programs to multiple attach points. It does this by amending the UAPI for bpf_link_Create with a target btf ID that can be used to supply the new attachment point along with the target program fd. The target must be compatible with the target that was supplied at program load time. The implementation reuses the checks that were factored out of check_attach_btf_id() to ensure compatibility between the BTF types of the old and new attachment. If these match, a new bpf_tracing_link will be created for the new attach target, allowing multiple attachments to co-exist simultaneously. The code could theoretically support multiple-attach of other types of tracing programs as well, but since I don't have a use case for any of those, there is no API support for doing so. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355169.48470.17165680973640685368.stgit@toke.dk
2020-09-29bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attachToke Høiland-Jørgensen
In preparation for allowing multiple attachments of freplace programs, move the references to the target program and trampoline into the bpf_tracing_link structure when that is created. To do this atomically, introduce a new mutex in prog->aux to protect writing to the two pointers to target prog and trampoline, and rename the members to make it clear that they are related. With this change, it is no longer possible to attach the same tracing program multiple times (detaching in-between), since the reference from the tracing program to the target disappears on the first attach. However, since the next patch will let the caller supply an attach target, that will also make it possible to attach to the same place multiple times. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355059.48470.2503076992210324984.stgit@toke.dk
2020-09-28genetlink: add missing kdoc for validation flagsJakub Kicinski
Validation flags are missing kdoc, add it. Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28stmmac: intel: Adding ref clock 1us tic for LPI cntrRusaimi Amira Ruslan
Adding reference clock (1us tic) for all LPI timer on Intel platforms. The reference clock is derived from ptp clk. This also enables all LPI counter. Signed-off-by: Rusaimi Amira Ruslan <rusaimi.amira.rusaimi@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28bpf: Add bpf_seq_printf_btf helperAlan Maguire
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Add bpf_snprintf_btf helperAlan Maguire
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced - the BTF id of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static struct btf_ptr b = { }; b.ptr = skb; b.type_id = __builtin_btf_type_id(struct sk_buff, 1); bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT: no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Move to generic BTF show support, apply it to seq files/stringsAlan Maguire
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Provide function to get vmlinux BTF informationAlan Maguire
It will be used later for BPF structure display support Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-2-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: verifier: refactor check_attach_btf_id()Toke Høiland-Jørgensen
The check_attach_btf_id() function really does three things: 1. It performs a bunch of checks on the program to ensure that the attachment is valid. 2. It stores a bunch of state about the attachment being requested in the verifier environment and struct bpf_prog objects. 3. It allocates a trampoline for the attachment. This patch splits out (1.) and (3.) into separate functions which will perform the checks, but return the computed values instead of directly modifying the environment. This is done in preparation for reusing the checks when the actual attachment is happening, which will allow tracing programs to have multiple (compatible) attachments. This also fixes a bug where a bunch of checks were skipped if a trampoline already existed for the tracing target. Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN") Fixes: 1e6c62a88215 ("bpf: Introduce sleepable BPF programs") Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28bpf: change logging calls from verbose() to bpf_log() and use log pointerToke Høiland-Jørgensen
In preparation for moving code around, change a bunch of references to env->log (and the verbose() logging helper) to use bpf_log() and a direct pointer to struct bpf_verifier_log. While we're touching the function signature, mark the 'prog' argument to bpf_check_type_match() as const. Also enhance the bpf_verifier_log_needed() check to handle NULL pointers for the log struct so we can re-use the code with logging disabled. Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28net/smc: introduce CHID callback for ISM devicesUrsula Braun
With SMCD version 2 the CHIDs of ISM devices are needed for the CLC handshake. This patch provides the new callback to retrieve the CHID of an ISM device. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce System Enterprise ID (SEID)Ursula Braun
SMCD version 2 defines a System Enterprise ID (short SEID). This patch contains the SEID creation and adds the callback to retrieve the created SEID. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net: core: add nested_level variable in net_deviceTaehee Yoo
This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock. netif_addr_lock() can be called recursively so spin_lock_nested() is used instead of spin_lock() and dev->lower_level is used as a parameter of spin_lock_nested(). But, dev->lower_level value can be updated while it is being used. So, lockdep would warn a possible deadlock scenario. When a stacked interface is deleted, netif_{uc | mc}_sync() is called recursively. So, spin_lock_nested() is called recursively too. At this moment, the dev->lower_level variable is used as a parameter of it. dev->lower_level value is updated when interfaces are being unlinked/linked immediately. Thus, After unlinking, dev->lower_level shouldn't be a parameter of spin_lock_nested(). A (macvlan) | B (vlan) | C (bridge) | D (macvlan) | E (vlan) | F (bridge) A->lower_level : 6 B->lower_level : 5 C->lower_level : 4 D->lower_level : 3 E->lower_level : 2 F->lower_level : 1 When an interface 'A' is removed, it releases resources. At this moment, netif_addr_lock() would be called. Then, netdev_upper_dev_unlink() is called recursively. Then dev->lower_level is updated. There is no problem. But, when the bridge module is removed, 'C' and 'F' interfaces are removed at once. If 'F' is removed first, a lower_level value is like below. A->lower_level : 5 B->lower_level : 4 C->lower_level : 3 D->lower_level : 2 E->lower_level : 1 F->lower_level : 1 Then, 'C' is removed. at this moment, netif_addr_lock() is called recursively. The ordering is like this. C(3)->D(2)->E(1)->F(1) At this moment, the lower_level value of 'E' and 'F' are the same. So, lockdep warns a possible deadlock scenario. In order to avoid this problem, a new variable 'nested_level' is added. This value is the same as dev->lower_level - 1. But this value is updated in rtnl_unlock(). So, this variable can be used as a parameter of spin_lock_nested() safely in the rtnl context. Test commands: ip link add br0 type bridge vlan_filtering 1 ip link add vlan1 link br0 type vlan id 10 ip link add macvlan2 link vlan1 type macvlan ip link add br3 type bridge vlan_filtering 1 ip link set macvlan2 master br3 ip link add vlan4 link br3 type vlan id 10 ip link add macvlan5 link vlan4 type macvlan ip link add br6 type bridge vlan_filtering 1 ip link set macvlan5 master br6 ip link add vlan7 link br6 type vlan id 10 ip link add macvlan8 link vlan7 type macvlan ip link set br0 up ip link set vlan1 up ip link set macvlan2 up ip link set br3 up ip link set vlan4 up ip link set macvlan5 up ip link set br6 up ip link set vlan7 up ip link set macvlan8 up modprobe -rv bridge Splat looks like: [ 36.057436][ T744] WARNING: possible recursive locking detected [ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted [ 36.059959][ T744] -------------------------------------------- [ 36.061391][ T744] ip/744 is trying to acquire lock: [ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30 [ 36.064922][ T744] [ 36.064922][ T744] but task is already holding lock: [ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.068851][ T744] [ 36.068851][ T744] other info that might help us debug this: [ 36.070731][ T744] Possible unsafe locking scenario: [ 36.070731][ T744] [ 36.072497][ T744] CPU0 [ 36.073238][ T744] ---- [ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.076590][ T744] [ 36.076590][ T744] *** DEADLOCK *** [ 36.076590][ T744] [ 36.078515][ T744] May be due to missing lock nesting notation [ 36.078515][ T744] [ 36.080491][ T744] 3 locks held by ip/744: [ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490 [ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80 [ 36.088400][ T744] [ 36.088400][ T744] stack backtrace: [ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728 [ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 36.093630][ T744] Call Trace: [ 36.094416][ T744] dump_stack+0x77/0x9b [ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40 [ 36.096522][ T744] lock_acquire+0xb4/0x3b0 [ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30 [ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0 [ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20 [ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0 [ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70 [ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.105461][ T744] dev_set_rx_mode+0x19/0x30 [ 36.106532][ T744] dev_set_promiscuity+0x36/0x50 [ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50 [ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge] [ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge] [ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50 [ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90 [ 36.116249][ T744] dev_uc_sync+0x70/0x80 [ 36.117244][ T744] dev_uc_add+0x50/0x60 [ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan] [ 36.119470][ T744] __dev_open+0xd6/0x170 [ 36.120470][ T744] __dev_change_flags+0x181/0x1d0 [ 36.121644][ T744] dev_change_flags+0x23/0x60 [ 36.122741][ T744] do_setlink+0x30a/0x11e0 [ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0 [ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0 [ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850 [ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0 [ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60 [ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380 [ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.139164][ T744] rtnl_newlink+0x47/0x70 [ ... ] Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net: core: introduce struct netdev_nested_priv for nested interface ↵Taehee Yoo
infrastructure Functions related to nested interface infrastructure such as netdev_walk_all_{ upper | lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. In the following patch, a new member variable will be added into this struct to fix the lockdep issue. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28bpf: Enable BPF_PROG_TEST_RUN for raw_tracepointSong Liu
Add .test_run for raw_tracepoint. Also, introduce a new feature that runs the target program on a specific CPU. This is achieved by a new flag in bpf_attr.test, BPF_F_TEST_RUN_ON_CPU. When this flag is set, the program is triggered on cpu with id bpf_attr.test.cpu. This feature is needed for BPF programs that handle perf_event and other percpu resources, as the program can access these resource locally. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200925205432.1777-2-songliubraving@fb.com
2020-09-28udp_tunnel: add the ability to share port tablesJakub Kicinski
Unfortunately recent Intel NIC designs share the UDP port table across netdevs. So far the UDP tunnel port state was maintained per netdev, we need to extend that to cater to Intel NICs. Expect NICs to allocate the info structure dynamically and link to the state from there. All the shared NICs will record port offload information in the one instance of the table so we need to make sure that the use count can accommodate larger numbers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-09-28 1) Fix a build warning in ip_vti if CONFIG_IPV6 is not set. From YueHaibing. 2) Restore IPCB on espintcp before handing the packet to xfrm as the information there is still needed. From Sabrina Dubroca. 3) Fix pmtu updating for xfrm interfaces. From Sabrina Dubroca. 4) Some xfrm state information was not cloned with xfrm_do_migrate. Fixes to clone the full xfrm state, from Antony Antony. 5) Use the correct address family in xfrm_state_find. The struct flowi must always be interpreted along with the original address family. This got lost over the years. Fix from Herbert Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28Merge tag 'nfs-for-5.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: - NFSv4.2: copy_file_range needs to invalidate caches on success - NFSv4.2: Fix security label length not being reset - pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly on read - pNFS/flexfiles: Fix signed/unsigned type issues with mirror indices" * tag 'nfs-for-5.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: pNFS/flexfiles: Be consistent about mirror index types pNFS/flexfiles: Ensure we initialise the mirror bsizes correctly on read NFSv4.2: fix client's attribute cache management for copy_file_range nfs: Fix security label length not being reset
2020-09-28nl80211: extend support to config spatial reuse parameter setRajkumar Manoharan
Allow the user to configure below Spatial Reuse Parameter Set element. * Non-SRG OBSS PD Max Offset * SRG BSS Color Bitmap * SRG Partial BSSID Bitmap Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org> Link: https://lore.kernel.org/r/1601278091-20313-2-git-send-email-rmanohar@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: Support not iterating over not-sdata-in-driver ifacesBen Greear
Allow drivers to request that interface-iterator does NOT iterate over interfaces that are not sdata-in-driver. This will allow us to fix crashes in ath10k (and possibly other drivers). To summarize Johannes' explanation: Consider add interface wlan0 add interface wlan1 iterate active interfaces -> wlan0 wlan1 add interface wlan2 iterate active interfaces -> wlan0 wlan1 wlan2 If you apply this scenario to a restart, which ought to be functionally equivalent to the normal startup, just compressed in time, you're basically saying that today you get add interface wlan0 add interface wlan1 iterate active interfaces -> wlan0 wlan1 wlan2 << problem here add interface wlan2 iterate active interfaces -> wlan0 wlan1 wlan2 which yeah, totally seems wrong. But fixing that to be add interface wlan0 add interface wlan1 iterate active interfaces -> <nothing> add interface wlan2 iterate active interfaces -> <nothing> (or maybe -> wlan0 wlan1 wlan2 if the reconfig already completed) This is also at least somewhat wrong, but better to not iterate over something that exists in the driver than iterate over something that does not. Originally the first issue was causing crashes in testing with lots of station vdevs on an ath10k radio, combined with firmware crashing. I ran with a similar patch for years with no obvious bad results, including significant testing with ath9k and ath10k. Signed-off-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20200922191957.25257-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28nl80211: include frequency offset in survey infoThomas Pedersen
Recently channels gained a potential frequency offset, so include this in the per-channel survey info. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-16-thomas@adapt-ip.com [add the offset only if non-zero] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: support S1G associationThomas Pedersen
The changes required for associating in S1G are: - apply S1G BSS channel info before assoc - mark all S1G STAs as QoS STAs - include and parse AID request element - handle new Association Response format - don't fail assoc if supported rates element is missing Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-15-thomas@adapt-ip.com [pass skb to ieee80211_add_aid_request_ie(), remove unused variable 'bss'] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: handle S1G low ratesThomas Pedersen
S1G doesn't have legacy (sband->bitrates) rates, only MCS. For now, just send a frame at MCS 0 if a low rate is requested. Note we also redefine (since we're out of TX flags) TX_RC_VHT_MCS as TX_RC_S1G_MCS to indicate an S1G MCS. This is probably OK as VHT MCS is not valid on S1G band and vice versa. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-12-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28mac80211: encode listen interval for S1GThomas Pedersen
S1G allows listen interval up to 2^14 * 10000 beacon intervals. In order to do this listen interval needs a scaling factor applied to the lower 14 bits. Calculate this and properly encode the listen interval for S1G STAs. See IEEE802.11ah-2016 Table 9-44a for reference. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-10-thomas@adapt-ip.com [move listen_int_usf into function using it] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28cfg80211: handle Association Response from S1G STAThomas Pedersen
The sending STA type is implicit based on beacon or probe response content. If sending STA was an S1G STA, adjust the Information Element location accordingly. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-9-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28cfg80211: convert S1G beacon to scan resultsThomas Pedersen
The S1G beacon is an extension frame as opposed to management frame for the regular beacon. This means we may have to occasionally cast the frame buffer to a different header type. Luckily this isn't too bad as scan results mostly only care about the IEs. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-6-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28nl80211: support S1G capability overrides in assocThomas Pedersen
NL80211_ATTR_S1G_CAPABILITY can be passed along with NL80211_ATTR_S1G_CAPABILITY_MASK to NL80211_CMD_ASSOCIATE to indicate S1G capabilities which should override the hardware capabilities in eg. the association request. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-4-thomas@adapt-ip.com [johannes: always require both attributes together, commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28nl80211/cfg80211: support 6 GHz scanningTova Mussai
Support 6 GHz scanning, by * a new scan flag to scan for colocated BSSes advertised by (and found) APs on 2.4 & 5 GHz * doing the necessary reduced neighbor report parsing for this, to find them * adding the ability to split the scan request in case the device by itself cannot support this. Also add some necessary bits in mac80211 to not break with these changes. Signed-off-by: Tova Mussai <tova.mussai@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200918113313.232917c93af9.Ida22f0212f9122f47094d81659e879a50434a6a2@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28memstick: Skip allocating card when removing hostKai-Heng Feng
After commit 6827ca573c03 ("memstick: rtsx_usb_ms: Support runtime power management"), removing module rtsx_usb_ms will be stuck. The deadlock is caused by powering on and powering off at the same time, the former one is when memstick_check() is flushed, and the later is called by memstick_remove_host(). Soe let's skip allocating card to prevent this issue. Fixes: 6827ca573c03 ("memstick: rtsx_usb_ms: Support runtime power management") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20200925084952.13220-1-kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-27ptp: add stub function for ptp_get_msgtype()Yangbo Lu
Added the missing stub function for ptp_get_msgtype(). Fixes: 036c508ba95e ("ptp: Add generic ptp message type function") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27mm/fork: Pass new vma pointer into copy_page_range()Peter Xu
This prepares for the future work to trigger early cow on pinned pages during fork(). No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-27mm: Introduce mm_struct.has_pinnedPeter Xu
(Commit message majorly collected from Jason Gunthorpe) Reduce the chance of false positive from page_maybe_dma_pinned() by keeping track if the mm_struct has ever been used with pin_user_pages(). This allows cases that might drive up the page ref_count to avoid any penalty from handling dma_pinned pages. Future work is planned, to provide a more sophisticated solution, likely to turn it into a real counter. For now, make it atomic_t but use it as a boolean for simplicity. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26net: dsa: point out the tail taggersVladimir Oltean
The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and KSZ9893 switches also use tail tags. Tell that to the DSA core, since this makes a difference for the flow dissector. Most switches break the parsing of frame headers, but these ones don't, so no flow dissector adjustment needs to be done for them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: add a generic procedure for the flow dissectorVladimir Oltean
For all DSA formats that don't use tail tags, it looks like behind the obscure number crunching they're all doing the same thing: locating the real EtherType behind the DSA tag. Nonetheless, this is not immediately obvious, so create a generic helper for those DSA taggers that put the header before the EtherType. Another assumption for the generic function is that the DSA tags are of equal length on RX and on TX. Prior to the previous patch, this was not true for ocelot and for gswip. The problem was resolved for ocelot, but for gswip it still remains, so that can't use this helper yet. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: make the .flow_dissect tagger callback return voidVladimir Oltean
There is no tagger that returns anything other than zero, so just change the return type appropriately. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: tag_ocelot: use a short prefix on both ingress and egressVladimir Oltean
There are 2 goals that we follow: - Reduce the header size - Make the header size equal between RX and TX The issue that required long prefix on RX was the fact that the ocelot DSA tag, being put before Ethernet as it is, would overlap with the area that a DSA master uses for RX filtering (destination MAC address mainly). Now that we can ask DSA to put the master in promiscuous mode, in theory we could remove the prefix altogether and call it a day, but it looks like we can't. Using no prefix on ingress, some packets (such as ICMP) would be received, while others (such as PTP) would not be received. This is because the DSA master we use (enetc) triggers parse errors ("MAC rx frame errors") presumably because it sees Ethernet frames with a bad length. And indeed, when using no prefix, the EtherType (bytes 12-13 of the frame, bits 96-111) falls over the REW_VAL field from the extraction header, aka the PTP timestamp. When turning the short (32-bit) prefix on, the EtherType overlaps with bits 64-79 of the extraction header, which are a reserved area transmitted as zero by the switch. The packets are not dropped by the DSA master with a short prefix. Actually, the frames look like this in tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 - added by a downstream sja1105). 89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \ dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \ ctrl 0x0004: Information, send seq 2, rcv seq 0, \ Flags [Response], length 78 0x0000: 8880 000a 001d 890c a9f2 0100 0000 100f ................ 0x0010: 0400 0000 0180 c200 000e 001f 7b63 0248 ............{c.H 0x0020: dadb 0482 88f7 1202 0036 0000 0000 0000 .........6...... 0x0030: 0000 0000 0000 0000 0000 001f 7bff fe63 ............{..c 0x0040: 0248 0001 1f81 0500 0000 0000 0000 0000 .H.............. 0x0050: 0000 0000 0000 0000 0000 0000 ............ So the short prefix is our new default: we've shortened our RX frames by 12 octets, increased TX by 4, and headers are now equal between RX and TX. Note that we still need promiscuous mode for the DSA master to not drop it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: dsa: allow drivers to request promiscuous mode on masterVladimir Oltean
Currently DSA assumes that taggers don't mess with the destination MAC address of the frames on RX. That is not always the case. Some DSA headers are placed before the Ethernet header (ocelot), and others simply mangle random bytes from the destination MAC address (sja1105 with its incl_srcpt option). Currently the DSA master goes to promiscuous mode automatically when the slave devices go too (such as when enslaved to a bridge), but in standalone mode this is a problem that needs to be dealt with. So give drivers the possibility to signal that their tagging protocol will get randomly dropped otherwise, and let DSA deal with fixing that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26net: mscc: ocelot: move NPI port configuration to DSAVladimir Oltean
Remove the ocelot_configure_cpu() function, which was in fact bringing up 2 ports: the CPU port module, which both switchdev and DSA have, and the NPI port, which only DSA has. The (non-Ethernet) CPU port module is at a fixed index in the analyzer, whereas the NPI port is selected through the "ethernet" property in the device tree. Therefore, the function to set up an NPI port is DSA-specific, so we move it there, simplifying the ocelot switch library a little bit. Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: one in drivers (lpfc) and two for zoned block devices. The latter also impinges on the block layer but only to introduce a new block API for setting the zone model rather than fiddling with the queue directly in the zoned block driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: sd_zbc: Fix ZBC disk initialization scsi: sd: sd_zbc: Fix handling of host-aware ZBC disks scsi: lpfc: Fix initial FLOGI failure due to BBSCN not supported