summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2020-09-23net: bridge: mcast: add support for blocked port groupsNikolay Aleksandrov
When excluding S,G entries we need a way to block a particular S,G,port. The new port group flag is managed based on the source's timer as per RFCs 3376 and 3810. When a source expires and its port group is in EXCLUDE mode, it will be blocked. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: handle port group filter modesNikolay Aleksandrov
We need to handle group filter mode transitions and initial state. To change a port group's INCLUDE -> EXCLUDE mode (or when we have added a new port group in EXCLUDE mode) we need to add that port to all of *,G ports' S,G entries for proper replication. When the EXCLUDE state is changed from IGMPv3 report, br_multicast_fwd_filter_exclude() must be called after the source list processing because the assumption is that all of the group's S,G entries will be created before transitioning to EXCLUDE mode, i.e. most importantly its blocked entries will already be added so it will not get automatically added to them. The transition EXCLUDE -> INCLUDE happens only when a port group timer expires, it requires us to remove that port from all of *,G ports' S,G entries where it was automatically added previously. Finally when we are adding a new S,G entry we must add all of *,G's EXCLUDE ports to it. In order to distinguish automatically added *,G EXCLUDE ports we have a new port group flag - MDB_PG_FLAGS_STAR_EXCL. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mcast: add rt_protocol field to the port group structNikolay Aleksandrov
We need to be able to differentiate between pg entries created by user-space and the kernel when we start generating S,G entries for IGMPv3/MLDv2's fast path. User-space entries are created by default as RTPROT_STATIC and the kernel entries are RTPROT_KERNEL. Later we can allow user-space to provide the entry rt_protocol so we can differentiate between who added the entries specifically (e.g. clag, admin, frr etc). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: add support for add/del/dump of entries with sourceNikolay Aleksandrov
Add new mdb attributes (MDBE_ATTR_SOURCE for setting, MDBA_MDB_EATTR_SOURCE for dumping) to allow add/del and dump of mdb entries with a source address (S,G). New S,G entries are created with filter mode of MCAST_INCLUDE. The same attributes are used for IPv4 and IPv6, they're validated and parsed based on their protocol. S,G host joined entries which are added by user are not allowed yet. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23net: bridge: mdb: add support to extend add/del commandsNikolay Aleksandrov
Since the MDB add/del code expects an exact struct br_mdb_entry we can't really add any extensions, thus add a new nested attribute at the level of MDBA_SET_ENTRY called MDBA_SET_ENTRY_ATTRS which will be used to pass all new options via netlink attributes. This patch doesn't change anything functionally since the new attribute is not used yet, only parsed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-09-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 95 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 4211 insertions(+), 2040 deletions(-). The main changes are: 1) Full multi function support in libbpf, from Andrii. 2) Refactoring of function argument checks, from Lorenz. 3) Make bpf_tail_call compatible with functions (subprograms), from Maciej. 4) Program metadata support, from YiFei. 5) bpf iterator optimizations, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from Jakub Kicinski: - fix failure to add bond interfaces to a bridge, the offload-handling code was too defensive there and recent refactoring unearthed that. Users complained (Ido) - fix unnecessarily reflecting ECN bits within TOS values / QoS marking in TCP ACK and reset packets (Wei) - fix a deadlock with bpf iterator. Hopefully we're in the clear on this front now... (Yonghong) - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel) - fix AQL on mt76 devices with FW rate control and add a couple of AQL issues in mac80211 code (Felix) - fix authentication issue with mwifiex (Maximilian) - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro) - fix exception handling for multipath routes via same device (David Ahern) - revert back to a BH spin lock flavor for nsid_lock: there are paths which do require the BH context protection (Taehee) - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke) - fix ife module load deadlock (Cong) - make an adjustment to netlink reply message type for code added in this release (the sole change touching uAPI here) (Michal) - a number of fixes for small NXP and Microchip switches (Vladimir) [ Pull request acked by David: "you can expect more of this in the future as I try to delegate more things to Jakub" ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits) net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU net: Update MAINTAINERS for MediaTek switch driver net/mlx5e: mlx5e_fec_in_caps() returns a boolean net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock net/mlx5e: kTLS, Fix leak on resync error flow net/mlx5e: kTLS, Add missing dma_unmap in RX resync net/mlx5e: kTLS, Fix napi sync and possible use-after-free net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats() net/mlx5e: Fix multicast counter not up-to-date in "ip -s" net/mlx5e: Fix endianness when calculating pedit mask first bit net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported net/mlx5e: CT: Fix freeing ct_label mapping net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready net/mlx5e: Use synchronize_rcu to sync with NAPI net/mlx5e: Use RCU to protect rq->xdp_prog ...
2020-09-22Merge branches 'v5.10/vfio/bardirty', 'v5.10/vfio/dma_avail', ↵Alex Williamson
'v5.10/vfio/misc', 'v5.10/vfio/no-cmd-mem' and 'v5.10/vfio/yan_zhao_fixes' into v5.10/vfio/next
2020-09-22fscrypt: make "#define fscrypt_policy" user-onlyEric Biggers
The fscrypt UAPI header defines fscrypt_policy to fscrypt_policy_v1, for source compatibility with old userspace programs. Internally, the kernel doesn't want that compatibility definition. Instead, fscrypt_private.h #undefs it and re-defines it to a union. That works for now. However, in order to add fscrypt_operations::get_dummy_policy(), we'll need to forward declare 'union fscrypt_policy' in include/linux/fscrypt.h. That would cause build errors because "fscrypt_policy" is used in ioctl numbers. To avoid this, modify the UAPI header to make the fscrypt_policy compatibility definition conditional on !__KERNEL__, and make the ioctls use fscrypt_policy_v1 instead of fscrypt_policy. Note that this doesn't change the actual ioctl numbers. Acked-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200917041136.178600-11-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-09-22nitro_enclaves: Add ioctl interface definitionAndra Paraschiv
The Nitro Enclaves driver handles the enclave lifetime management. This includes enclave creation, termination and setting up its resources such as memory and CPU. An enclave runs alongside the VM that spawned it. It is abstracted as a process running in the VM that launched it. The process interacts with the NE driver, that exposes an ioctl interface for creating an enclave and setting up its resources. Changelog v9 -> v10 * Update commit message to include the changelog before the SoB tag(s). v8 -> v9 * No changes. v7 -> v8 * Add NE custom error codes for user space memory regions not backed by pages multiple of 2 MiB, invalid flags and enclave CID. * Add max flag value for enclave image load info. v6 -> v7 * Clarify in the ioctls documentation that the return value is -1 and errno is set on failure. * Update the error code value for NE_ERR_INVALID_MEM_REGION_SIZE as it gets in user space as value 25 (ENOTTY) instead of 515. Update the NE custom error codes values range to not be the same as the ones defined in include/linux/errno.h, although these are not propagated to user space. v5 -> v6 * Fix typo in the description about the NE CPU pool. * Update documentation to kernel-doc format. * Remove the ioctl to query API version. v4 -> v5 * Add more details about the ioctl calls usage e.g. error codes, file descriptors used. * Update the ioctl to set an enclave vCPU to not return a file descriptor. * Add specific NE error codes. v3 -> v4 * Decouple NE ioctl interface from KVM API. * Add NE API version and the corresponding ioctl call. * Add enclave / image load flags options. v2 -> v3 * Remove the GPL additional wording as SPDX-License-Identifier is already in place. v1 -> v2 * Add ioctl for getting enclave image load metadata. * Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE. * Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE ioctls. * Update NE ioctls definition based on the updated ioctl range for major and minor. Reviewed-by: Alexander Graf <graf@amazon.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Alexandru Vasile <lexnv@amazon.com> Signed-off-by: Andra Paraschiv <andraprs@amazon.com> Link: https://lore.kernel.org/r/20200921121732.44291-2-andraprs@amazon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-22Merge branch 'x86-seves-for-paolo' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD
2020-09-21Merge tag 'mac80211-next-for-net-next-2020-09-21' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== This time we have: * some AP-side infrastructure for FILS discovery and unsolicited probe resonses * a major rework of the encapsulation/header conversion offload from Felix, to fit better with e.g. AP_VLAN interfaces * performance fix for VHT A-MPDU size, don't limit to HT * some initial patches for S1G (sub 1 GHz) support, more will come in this area * minor cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21vfio iommu: Add dma available capabilityMatthew Rosato
Commit 492855939bdb ("vfio/type1: Limit DMA mappings per container") added the ability to limit the number of memory backed DMA mappings. However on s390x, when lazy mapping is in use, we use a very large number of concurrent mappings. Let's provide the current allowable number of DMA mappings to userspace via the IOMMU info chain so that userspace can take appropriate mitigation. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-09-21vfio: Fix typo of the device_stateZenghui Yu
A typo fix ("_RUNNNG" => "_RUNNING") in comment block of the uapi header. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-09-19ethtool: Add 100base-FX link mode entriesDan Murphy
Add entries for the 100base-FX full and half duplex supported modes. $ ethtool eth0 Supported ports: [ FIBRE ] Supported link modes: 100baseFX/Half 100baseFX/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: 100baseFX/Half 100baseFX/Full Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 100Mb/s Duplex: Full Auto-negotiation: off Port: MII PHYAD: 1 Transceiver: external Supports Wake-on: gs Wake-on: d SecureOn password: 00:00:00:00:00:00 Current message level: 0x00000000 (0) Link detected: yes Signed-off-by: Dan Murphy <dmurphy@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18tipc: add automatic rekeying for encryption keyTuong Lien
Rekeying is required for security since a key is less secure when using for a long time. Also, key will be detached when its nonce value (or seqno ...) is exhausted. We now make the rekeying process automatic and configurable by user. Basically, TIPC will at a specific interval generate a new key by using the kernel 'Random Number Generator' cipher, then attach it as the node TX key and securely distribute to others in the cluster as RX keys (- the key exchange). The automatic key switching will then take over, and make the new key active shortly. Afterwards, the traffic from this node will be encrypted with the new session key. The same can happen in peer nodes but not necessarily at the same time. For simplicity, the automatically generated key will be initiated as a per node key. It is not too hard to also support a cluster key rekeying (e.g. a given node will generate a unique cluster key and update to the others in the cluster...), but that doesn't bring much benefit, while a per-node key is even more secure. We also enable user to force a rekeying or change the rekeying interval via netlink, the new 'set key' command option: 'TIPC_NLA_NODE_REKEYING' is added for these purposes as follows: - A value >= 1 will be set as the rekeying interval (in minutes); - A value of 0 will disable the rekeying; - A value of 'TIPC_REKEYING_NOW' (~0) will force an immediate rekeying; The default rekeying interval is (60 * 24) minutes i.e. done every day. There isn't any restriction for the value but user shouldn't set it too small or too large which results in an "ineffective" rekeying (thats ok for testing though). Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18tipc: introduce encryption master keyTuong Lien
In addition to the supported cluster & per-node encryption keys for the en/decryption of TIPC messages, we now introduce one option for user to set a cluster key as 'master key', which is simply a symmetric key like the former but has a longer life cycle. It has two purposes: - Authentication of new member nodes in the cluster. New nodes, having no knowledge of current session keys in the cluster will still be able to join the cluster as long as they know the master key. This is because all neighbor discovery (LINK_CONFIG) messages must be encrypted with this key. - Encryption of session encryption keys during automatic exchange and update of those.This is a feature we will introduce in a later commit in this series. We insert the new key into the currently unused slot 0 in the key array and start using it immediately once the user has set it. After joining, a node only knowing the master key should be fully communicable to existing nodes in the cluster, although those nodes may have their own session keys activated (i.e. not the master one). To support this, we define a 'grace period', starting from the time a node itself reports having no RX keys, so the existing nodes will use the master key for encryption instead. The grace period can be extended but will automatically stop after e.g. 5 seconds without a new report. This is also the basis for later key exchanging feature as the new node will be impossible to decrypt anything without the support from master key. For user to set a master key, we define a new netlink flag - 'TIPC_NLA_NODE_KEY_MASTER', so it can be added to the current 'set key' netlink command to specify the setting key to be a master key. Above all, the traditional cluster/per-node key mechanism is guaranteed to work when user comes not to use this master key option. This is also compatible to legacy nodes without the feature supported. Even this master key can be updated without any interruption of cluster connectivity but is so is needed, this has to be coordinated and set by the user. Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18devlink: add timeout information to status_notifyShannon Nelson
Add a timeout element to the DEVLINK_CMD_FLASH_UPDATE_STATUS netlink message for use by a userland utility to show that a particular firmware flash activity may take a long but bounded time to finish. Also add a handy helper for drivers to make use of the new timeout value. UI usage hints: - if non-zero, add timeout display to the end of the status line [component] status_msg ( Xm Ys : Am Bs ) using the timeout value for Am Bs and updating the Xm Ys every second - if the timeout expires while awaiting the next update, display something like [component] status_msg ( timeout reached : Am Bs ) - if new status notify messages are received, remove the timeout and start over Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18fuse: add submount support to <uapi/linux/fuse.h>Max Reitz
- Add fuse_attr.flags - Add FUSE_ATTR_SUBMOUNT This is a flag for fuse_attr.flags that indicates that the given entry resides on a different filesystem than the parent, and as such should have a different st_dev. - Add FUSE_SUBMOUNTS The client sets this flag if it supports automounting directories. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18nl80211: Unsolicited broadcast probe response supportAloka Dixit
This patch adds new attributes to support unsolicited broadcast probe response transmission used for in-band discovery in 6GHz band (IEEE P802.11ax/D6.0 26.17.2.3.2, AP behavior for fast passive scanning). The new attribute, NL80211_ATTR_UNSOL_BCAST_PROBE_RESP, is nested which supports following parameters: (1) NL80211_UNSOL_BCAST_PROBE_RESP_ATTR_INT - Packet interval (2) NL80211_UNSOL_BCAST_PROBE_RESP_ATTR_TMPL - Template data Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/010101747a946698-aac263ae-2ed3-4dab-9590-0bc7131214e1-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: Add FILS discovery supportAloka Dixit
FILS discovery attribute, NL80211_ATTR_FILS_DISCOVERY, is nested which supports following parameters as given in IEEE Std 802.11ai-2016, Annex C.3 MIB detail: (1) NL80211_FILS_DISCOVERY_ATTR_INT_MIN - Minimum packet interval (2) NL80211_FILS_DISCOVERY_ATTR_INT_MAX - Maximum packet interval (3) NL80211_FILS_DISCOVERY_ATTR_TMPL - Template data Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/20200805011838.28166-2-alokad@codeaurora.org [fix attribute and other names, use NLA_RANGE(), use policy only once] Link: https://lore.kernel.org/r/010101747a7b38a8-306f06b2-9061-4baf-81c1-054a42a18e22-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: advertise supported channel width in S1GThomas Pedersen
S1G supports 5 channel widths: 1, 2, 4, 8, and 16. One channel width is allowed per frequency in each operating class, so it makes more sense to advertise the specific channel width allowed. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200908190323.15814-3-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-17net: remove comments on struct rtnl_link_statsJakub Kicinski
We removed the misleading comments from struct rtnl_link_stats64 when we added proper kdoc. struct rtnl_link_stats has the same inline comments, so remove them, too. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17ethtool: add and use message type for tunnel info replyMichal Kubecek
Tunnel offload info code uses ETHTOOL_MSG_TUNNEL_INFO_GET message type (cmd field in genetlink header) for replies to tunnel info netlink request, i.e. the same value as the request have. This is a problem because we are using two separate enums for userspace to kernel and kernel to userspace message types so that this ETHTOOL_MSG_TUNNEL_INFO_GET (28) collides with ETHTOOL_MSG_CABLE_TEST_TDR_NTF which is what message type 28 means for kernel to userspace messages. As the tunnel info request reached mainline in 5.9 merge window, we should still be able to fix the reply message type without breaking backward compatibility. Fixes: c7d759eb7b12 ("ethtool: add tunnel info interface") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17coresight: stm: Support marked packetTingwei Zhang
STP_PACKET_MARKED is not supported by STM currently. Add STM_FLAG_MARKED to support marked packet in STM. Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20200916191737.4001561-3-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-16Merge branch 'virtio-shm' of ↵Maxime Ripard
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into drm-misc-next Topic pull request for core virtio changes that will be required by the DRM driver. Signed-off-by: Maxime Ripard <maxime@cerno.tech> From: Gurchetan Singh <gurchetansingh@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/CAAfnVBn2BzXWFY3hhjDxd5q0P2_JWn-HdkVxgS94x9keAUZiow@mail.gmail.com
2020-09-15bpf: Add BPF_PROG_BIND_MAP syscallYiFei Zhu
This syscall binds a map to a program. Returns success if the map is already bound to the program. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-3-sdf@google.com
2020-09-15devlink: introduce the health reporter test commandJiri Pirko
Introduce a test command for health reporters. User might use this command to trigger test event on a reporter if the reporter supports it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15ethtool: add standard pause statsJakub Kicinski
Currently drivers have to report their pause frames statistics via ethtool -S, and there is a wide variety of names used for these statistics. Add the two statistics defined in IEEE 802.3x to the standard API. Create a new ethtool request header flag for including statistics in the response to GET commands. Always create the ETHTOOL_A_PAUSE_STATS nest in replies when flag is set. Testing if driver declares the op is not a reliable way of checking if any stats will actually be included and therefore we don't want to give the impression that presence of ETHTOOL_A_PAUSE_STATS indicates driver support. Note that this patch does not include PFC counters, which may fit better in dcbnl? But mostly I don't need them/have a setup to test them so I haven't looked deeply into exposing them :) v3: - add a helper for "uninitializing" stats, rather than a cryptic memset() (Andrew) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15ipmi:msghandler: retry to get device id on an errorXianting Tian
We fail to get the BMCS's device id with low probability when loading the ipmi driver and it causes BMC device registration failed. When this issue occurs we got below kernel prints: [Wed Sep 9 19:52:03 2020] ipmi_si IPI0001:00: IPMI message handler: device id demangle failed: -22 [Wed Sep 9 19:52:03 2020] IPMI BT: using default values [Wed Sep 9 19:52:03 2020] IPMI BT: req2rsp=5 secs retries=2 [Wed Sep 9 19:52:03 2020] ipmi_si IPI0001:00: Unable to get the device id: -5 [Wed Sep 9 19:52:04 2020] ipmi_si IPI0001:00: Unable to register device: error -5 When this issue happens, we want to manually unload the driver and try to load it again, but it can't be unloaded by 'rmmod' as it is already 'in use'. We add a print in handle_one_recv_msg(), when this issue happens, the msg we received is "Recv: 1c 01 d5", which means the data_len is 1, data[0] is 0xd5 (completion code), which means "bmc cannot execute command. Command, or request parameter(s), not supported in present state". Debug code: static int handle_one_recv_msg(struct ipmi_smi *intf, struct ipmi_smi_msg *msg) { printk("Recv: %*ph\n", msg->rsp_size, msg->rsp); ... ... } Then in ipmi_demangle_device_id(), it returned '-EINVAL' as 'data_len < 7' and 'data[0] != 0'. We created this patch to retry the get device id when this error happens. We reproduced this issue again and the retry succeed on the first retry, we finally got the correct msg and then all is ok: Recv: 1c 01 00 01 81 05 84 02 af db 07 00 01 00 b9 00 10 00 So use a retry machanism in this patch to give bmc more opportunity to correctly response kernel when we received specific completion codes. Signed-off-by: Xianting Tian <tian.xianting@h3c.com> Message-Id: <20200915071817.4484-1-tian.xianting@h3c.com> [Cleaned up the verbage a bit in the header and prints.] Signed-off-by: Corey Minyard <cminyard@mvista.com>
2020-09-15powerpc/8xx: Support 16k hugepages with 4k pagesChristophe Leroy
The 8xx has 4 page sizes: 4k, 16k, 512k and 8M 4k and 16k can be selected at build time as standard page sizes, and 512k and 8M are hugepages. When 4k standard pages are selected, 16k pages are not available. Allow 16k pages as hugepages when 4k pages are used. To allow that, implement arch_make_huge_pte() which receives the necessary arguments to allow setting the PTE in accordance with the page size: - 512 k pages must have _PAGE_HUGE and _PAGE_SPS. They are set by pte_mkhuge(). arch_make_huge_pte() does nothing. - 16 k pages must have only _PAGE_SPS. arch_make_huge_pte() clears _PAGE_HUGE. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a518abc29266a708dfbccc8fce9ae6694fe4c2c6.1598862623.git.christophe.leroy@csgroup.eu
2020-09-14Merge v5.9-rc5 into drm-nextDaniel Vetter
Paul needs 1a21e5b930e8 ("drm/ingenic: Fix leak of device_node pointer") and 3b5b005ef7d9 ("drm/ingenic: Fix driver not probing when IPU port is missing") from -fixes to be able to merge further ingenic patches into -next. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2020-09-14media: v4l2-ctrl: Add VP9 codec levelsStanimir Varbanov
Add menu control for VP9 codec levels. A total of 14 levels are defined for Profile 0 (8bit) and Profile 2 (10bit). Each level is a set of constrained bitstreams coded with targeted resolutions, frame rates, and bitrates. The definitions have been taken from webm project [1]. [1] https://www.webmproject.org/vp9/levels/ Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-14media: media/v4l2: remove V4L2_FLAG_MEMORY_NON_CONSISTENT flagSergey Senozhatsky
The patch partially reverts some of the UAPI bits of the buffer cache management hints. Namely, the queue consistency (memory coherency) user-space hint because, as it turned out, the kernel implementation of this feature was misusing DMA_ATTR_NON_CONSISTENT. The patch reverts both kernel and user space parts: removes the DMA consistency attr functions, rolls back changes to v4l2_requestbuffers, v4l2_create_buffers structures and corresponding UAPI functions (plus compat32 layer) and cleans up the documentation. [hverkuil: fixed a few typos in the commit log] [hverkuil: fixed vb2_core_reqbufs call in drivers/media/dvb-core/dvb_vb2.c] [mchehab: fixed a typo in the commit log: revers->reverts] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-14Merge 5.9-rc5 into char-misc-nextGreg Kroah-Hartman
We want the char/misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-14Merge 5.9-rc5 into staging-nextGreg Kroah-Hartman
We want the staging/iio changes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "A bit on the bigger side, mostly due to me being on vacation, then busy, then on parental leave, but there's nothing worrisome. ARM: - Multiple stolen time fixes, with a new capability to match x86 - Fix for hugetlbfs mappings when PUD and PMD are the same level - Fix for hugetlbfs mappings when PTE mappings are enforced (dirty logging, for example) - Fix tracing output of 64bit values x86: - nSVM state restore fixes - Async page fault fixes - Lots of small fixes everywhere" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: emulator: more strict rsm checks. KVM: nSVM: more strict SMM checks when returning to nested guest SVM: nSVM: setup nested msr permission bitmap on nested state load SVM: nSVM: correctly restore GIF on vmexit from nesting after migration x86/kvm: don't forget to ACK async PF IRQ x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit KVM: SVM: avoid emulation with stale next_rip KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN KVM: SVM: Periodically schedule when unregistering regions on destroy KVM: MIPS: Change the definition of kvm type kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control KVM: fix memory leak in kvm_io_bus_unregister_dev() KVM: Check the allocation of pv cpu mask KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected KVM: arm64: Update page shift if stage 2 block mapping not supported KVM: arm64: Fix address truncation in traces KVM: arm64: Do not try to map PUDs when they are folded into PMD arm64/x86: KVM: Introduce steal-time cap ...
2020-09-11KVM: MIPS: Change the definition of kvm typeHuacai Chen
MIPS defines two kvm types: #define KVM_VM_MIPS_TE 0 #define KVM_VM_MIPS_VZ 1 In Documentation/virt/kvm/api.rst it is said that "You probably want to use 0 as machine type", which implies that type 0 be the "automatic" or "default" type. And, in user-space libvirt use the null-machine (with type 0) to detect the kvm capability, which returns "KVM not supported" on a VZ platform. I try to fix it in QEMU but it is ugly: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html And Thomas Huth suggests me to change the definition of kvm type: https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html So I define like this: #define KVM_VM_MIPS_AUTO 0 #define KVM_VM_MIPS_VZ 1 #define KVM_VM_MIPS_TE 2 Since VZ and TE cannot co-exists, using type 0 on a TE platform will still return success (so old user-space tools have no problems on new kernels); the advantage is that using type 0 on a VZ platform will not return failure. So, the only problem is "new user-space tools use type 2 on old kernels", but if we treat this as a kernel bug, we can backport this patch to old stable kernels. Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-10bpf: Fix comment for helper bpf_current_task_under_cgroup()Song Liu
This should be "current" not "skb". Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/bpf/20200910203314.70018-1-songliubraving@fb.com
2020-09-10ipmr: Add high byte of VIF ID to igmpmsgPaul Davey
Use the unused3 byte in struct igmpmsg to hold the high 8 bits of the VIF ID. If using more than 255 IPv4 multicast interfaces it is necessary to have access to a VIF ID for cache reports that is wider than 8 bits, the VIF ID present in the igmpmsg reports sent to mroute_sk was only 8 bits wide in the igmpmsg header. Adding the high 8 bits of the 16 bit VIF ID in the unused byte allows use of more than 255 IPv4 multicast interfaces. Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10ipmr: Add route table ID to netlink cache reportsPaul Davey
Insert the multicast route table ID as a Netlink attribute to Netlink cache report notifications. When multiple route tables are in use it is necessary to have a way to determine which route table a given cache report belongs to when receiving the cache report. Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10virtiofs: implement dax read/write operationsVivek Goyal
This patch implements basic DAX support. mmap() is not implemented yet and will come in later patches. This patch looks into implemeting read/write. We make use of interval tree to keep track of per inode dax mappings. Do not use dax for file extending writes, instead just send WRITE message to daemon (like we do for direct I/O path). This will keep write and i_size change atomic w.r.t crash. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: introduce setupmapping/removemapping commandsVivek Goyal
Introduce two new fuse commands to setup/remove memory mappings. This will be used to setup/tear down file mapping in dax window. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: implement FUSE_INIT map_alignment fieldStefan Hajnoczi
The device communicates FUSE_SETUPMAPPING/FUSE_REMOVMAPPING alignment constraints via the FUST_INIT map_alignment field. Parse this field and ensure our DAX mappings meet the alignment constraints. We don't actually align anything differently since our mappings are already 2MB aligned. Just check the value when the connection is established. If it becomes necessary to honor arbitrary alignments in the future we'll have to adjust how mappings are sized. The upshot of this commit is that we can be confident that mappings will work even when emulating x86 on Power and similar combinations where the host page sizes are different. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: set up virtio_fs dax_deviceStefan Hajnoczi
Setup a dax device. Use the shm capability to find the cache entry and map it. The DAX window is accessed by the fs/dax.c infrastructure and must have struct pages (at least on x86). Use devm_memremap_pages() to map the DAX window PCI BAR and allocate struct page. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10Merge branch 'virtio-shm' into for-nextMiklos Szeredi
Pull virtio shared memory region patches for virtiofs shared memory (DAX) support.
2020-09-10virtio: Implement get_shm_region for MMIO transportSebastien Boeuf
On MMIO a new set of registers is defined for finding SHM regions. Add their definitions and use them to find the region. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtio: Implement get_shm_region for PCI transportSebastien Boeuf
On PCI the shm regions are found using capability entries; find a region by searching for the capability. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: kbuild test robot <lkp@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10quota: Expand comment describing d_itimerJan Kara
Expand comment describing d_itimer in struct fs_disk_quota. Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz>