summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2021-04-21KVM: x86: Support KVM VMs sharing SEV contextNathan Tempelman
Add a capability for userspace to mirror SEV encryption context from one vm to another. On our side, this is intended to support a Migration Helper vCPU, but it can also be used generically to support other in-guest workloads scheduled by the host. The intention is for the primary guest and the mirror to have nearly identical memslots. The primary benefits of this are that: 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they can't accidentally clobber each other. 2) The VMs can have different memory-views, which is necessary for post-copy migration (the migration vCPUs on the target need to read and write to pages, when the primary guest would VMEXIT). This does not change the threat model for AMD SEV. Any memory involved is still owned by the primary guest and its initial state is still attested to through the normal SEV_LAUNCH_* flows. If userspace wanted to circumvent SEV, they could achieve the same effect by simply attaching a vCPU to the primary VM. This patch deliberately leaves userspace in charge of the memslots for the mirror, as it already has the power to mess with them in the primary guest. This patch does not support SEV-ES (much less SNP), as it does not handle handing off attested VMSAs to the mirror. For additional context, we need a Migration Helper because SEV PSP migration is far too slow for our live migration on its own. Using an in-guest migrator lets us speed this up significantly. Signed-off-by: Nathan Tempelman <natet@google.com> Message-Id: <20210408223214.2582277-1-natet@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21of: linux/of.h: fix kernel-doc warningsRandy Dunlap
Correct kernel-doc notation warnings: ../include/linux/of.h:1211: warning: Function parameter or member 'output' not described in 'of_property_read_string_index' ../include/linux/of.h:1211: warning: Excess function parameter 'out_string' description in 'of_property_read_string_index' ../include/linux/of.h:1477: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Overlay support Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210417061244.2262-1-rdunlap@infradead.org
2021-04-21gpio: omap: Save and restore sysconfigTony Lindgren
As we are using cpu_pm to save and restore context, we must also save and restore the GPIO sysconfig register. This is needed because we are not calling PM runtime functions at all with cpu_pm. We need to save the sysconfig on idle as it's value can get reconfigured by PM runtime and can be different from the init time value. Device specific flags like "ti,no-idle-on-init" can affect the init value. Fixes: b764a5863fd8 ("gpio: omap: Remove custom PM calls and use cpu_pm instead") Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Adam Ford <aford173@gmail.com> Cc: Andreas Kemnade <andreas@kemnade.info> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-04-21sched: Warn on long periods of pending need_reschedPaul Turner
CPU scheduler marks need_resched flag to signal a schedule() on a particular CPU. But, schedule() may not happen immediately in cases where the current task is executing in the kernel mode (no preemption state) for extended periods of time. This patch adds a warn_on if need_resched is pending for more than the time specified in sysctl resched_latency_warn_ms. If it goes off, it is likely that there is a missing cond_resched() somewhere. Monitoring is done via the tick and the accuracy is hence limited to jiffy scale. This also means that we won't trigger the warning if the tick is disabled. This feature (LATENCY_WARN) is default disabled. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210416212936.390566-1-joshdon@google.com
2021-04-21NFS: The 'fattr_valid' field in struct nfs_server should be unsigned intTrond Myklebust
Fix up a static compiler warning: "fs/nfs/nfs4proc.c:3882 _nfs4_server_capabilities() warn: was expecting a 64 bit value instead of '(1 << 11)'" The fix is to convert the fattr_valid field to match the type of the 'valid' field in struct nfs_fattr. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-04-21div64: Correct inline documentation for `do_div'Maciej W. Rozycki
Correct inline documentation for `do_div', which is a function-like macro the `n' parameter of which has the semantics of a C++ reference: it is both read and written in the context of the caller without an explicit dereference such as with a pointer. In the C programming language it has no equivalent for proper functions, in terms of which the documentation expresses the semantics of `do_div', but substituting a pointer in documentation is misleading, and using the C++ notation should at least raise the reader's attention and encourage to seek explanation even if the C++ semantics is not readily understood. While at it observe that "semantics" is an uncountable noun, so refer to it with a singular rather than plural verb. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-04-21drivers: hv: Create a consistent pattern for checking Hyper-V hypercall statusJoseph Salisbury
There is not a consistent pattern for checking Hyper-V hypercall status. Existing code uses a number of variants. The variants work, but a consistent pattern would improve the readability of the code, and be more conformant to what the Hyper-V TLFS says about hypercall status. Implemented new helper functions hv_result(), hv_result_success(), and hv_repcomp(). Changed the places where hv_do_hypercall() and related variants are used to use the helper functions. Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1618620183-9967-2-git-send-email-joseph.salisbury@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-21x86/hyperv: Move hv_do_rep_hypercall to asm-genericJoseph Salisbury
This patch makes no functional changes. It simply moves hv_do_rep_hypercall() out of arch/x86/include/asm/mshyperv.h and into asm-generic/mshyperv.h hv_do_rep_hypercall() is architecture independent, so it makes sense that it should be in the architecture independent mshyperv.h, not in the x86-specific mshyperv.h. This is done in preperation for a follow up patch which creates a consistent pattern for checking Hyper-V hypercall status. Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1618620183-9967-1-git-send-email-joseph.salisbury@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-04-20drm/amdkfd: add ioctl to configure and query xnack retriesAlex Sierra
Xnack retries are used for page fault recovery. Some AMD chip families support continuously retry while page table entries are invalid. The driver must handle the page fault interrupt and fill in a valid entry for the GPU to continue. This ioctl allows to enable/disable XNACK retries per KFD process. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20drm/amdkfd: add svm ioctl APIPhilip Yang
Add svm (shared virtual memory) ioctl data structure and API definition. The svm ioctl API is designed to be extensible in the future. All operations are provided by a single IOCTL to preserve ioctl number space. The arguments structure ends with a variable size array of attributes that can be used to set or get one or multiple attributes. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20net: dsa: Allow default tag protocol to be overridden from DTTobias Waldekranz
Some combinations of tag protocols and Ethernet controllers are incompatible, and it is hard for the driver to keep track of these. Therefore, allow the device tree author (typically the board vendor) to inform the driver of this fact by selecting an alternate protocol that is known to work. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20Merge tag 'mac80211-next-for-net-next-2021-04-20' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Another set of updates, all over the map: * set sk_pacing_shift for 802.3->802.11 encap offload * some monitor support for 802.11->802.3 decap offload * HE (802.11ax) spec updates * userspace API for TDLS HE support * along with various other small features, cleanups and fixups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20net: phy: marvell: add support for Amethyst internal PHYMarek Behún
Add support for Amethyst internal PHY. The only difference from Peridot is HWMON. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20Merge tag 'mlx5-updates-2021-04-19' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-04-19 This patchset provides some updates to mlx5e and mlx5 SW steering drivers: 1) Tariq and Vladyslav they both provide some trivial update to mlx5e netdev. The next 12 patches in the patchset are focused toward mlx5 SW steering: 2) 3 trivial cleanup patches 3) Dynamic Flex parser support: Flex parser is a HW parser that can support protocols that are not natively supported by the HCA, such as Geneve (TLV options) and GTP-U. There are 8 such parsers, and each of them can be assigned to parse a specific set of protocols. 4) Enable matching on Geneve TLV options 5) Use Flex parser for MPLS over UDP/GRE 6) Enable matching on tunnel GTP-U and GTP-U first extension header using 7) Improved QoS for SW steering internal QPair for a better insertion rate ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20net: dsa: enable selftest support for all switches by defaultOleksij Rempel
Most of generic selftest should be able to work with probably all ethernet controllers. The DSA switches are not exception, so enable it by default at least for DSA. This patch was tested with SJA1105 and AR9331. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20net: add generic selftest supportOleksij Rempel
Port some parts of the stmmac selftest and reuse it as basic generic selftest library. This patch was tested with following combinations: - iMX6DL FEC -> AT8035 - iMX6DL FEC -> SJA1105Q switch -> KSZ8081 - iMX6DL FEC -> SJA1105Q switch -> KSZ9031 - AR9331 ag71xx -> AR9331 PHY - AR9331 ag71xx -> AR9331 switch -> AR9331 PHY Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20net: phy: genphy_loopback: add link speed configurationOleksij Rempel
In case of loopback, in most cases we need to disable autoneg support and force some speed configuration. Otherwise, depending on currently active auto negotiated link speed, the loopback may or may not work. This patch was tested with following PHYs: TJA1102, KSZ8081, KSZ9031, AT8035, AR9331. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20capabilities: require CAP_SETFCAP to map uid 0Serge E. Hallyn
cap_setfcap is required to create file capabilities. Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"), a process running as uid 0 but without cap_setfcap is able to work around this as follows: unshare a new user namespace which maps parent uid 0 into the child namespace. While this task will not have new capabilities against the parent namespace, there is a loophole due to the way namespaced file capabilities are represented as xattrs. File capabilities valid in userns 1 are distinguished from file capabilities valid in userns 2 by the kuid which underlies uid 0. Therefore the restricted root process can unshare a new self-mapping namespace, add a namespaced file capability onto a file, then use that file capability in the parent namespace. To prevent that, do not allow mapping parent uid 0 if the process which opened the uid_map file does not have CAP_SETFCAP, which is the capability for setting file capabilities. As a further wrinkle: a task can unshare its user namespace, then open its uid_map file itself, and map (only) its own uid. In this case we do not have the credential from before unshare, which was potentially more restricted. So, when creating a user namespace, we record whether the creator had CAP_SETFCAP. Then we can use that during map_write(). With this patch: 1. Unprivileged user can still unshare -Ur ubuntu@caps:~$ unshare -Ur root@caps:~# logout 2. Root user can still unshare -Ur ubuntu@caps:~$ sudo bash root@caps:/home/ubuntu# unshare -Ur root@caps:/home/ubuntu# logout 3. Root user without CAP_SETFCAP cannot unshare -Ur: root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap -- root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap unable to set CAP_SETFCAP effective capability: Operation not permitted root@caps:/home/ubuntu# unshare -Ur unshare: write failed /proc/self/uid_map: Operation not permitted Note: an alternative solution would be to allow uid 0 mappings by processes without CAP_SETFCAP, but to prevent such a namespace from writing any file capabilities. This approach can be seen at [1]. Background history: commit 95ebabde382 ("capabilities: Don't allow writing ambiguous v3 file capabilities") tried to fix the issue by preventing v3 fscaps to be written to disk when the root uid would map to the same uid in nested user namespaces. This led to regressions for various workloads. For example, see [2]. Ultimately this is a valid use-case we have to support meaning we had to revert this change in 3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")"). Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1] Link: https://github.com/containers/buildah/issues/3071 [2] Signed-off-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Tested-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-20RDMA/mlx5: Expose private query portMark Bloch
Expose a non standard query port via IOCTL that will be used to expose port attributes that are specific to mlx5 devices. The new interface receives a port number to query and returns a structure that contains the available attributes for that port. This will be used to fill the gap between pure DEVX use cases and use cases where a kernel needs to inform userspace about various kernel driver configurations that userspace must use in order to work correctly. Flags is used to indicate which fields are valid on return. MLX5_IB_UAPI_QUERY_PORT_VPORT: The vport number of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID: The VHCA ID of the vport of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX: The vport's RX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX: The vport's TX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0: The metadata used to tag egress packets of the vport. MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID: The E-Switch owner vhca id of the vport. Link: https://lore.kernel.org/r/6e2ef13e5a266a6c037eb0105eb1564c7bb52f23.1618743394.git.leonro@nvidia.com Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-20btrfs: zoned: automatically reclaim zonesJohannes Thumshirn
When a file gets deleted on a zoned file system, the space freed is not returned back into the block group's free space, but is migrated to zone_unusable. As this zone_unusable space is behind the current write pointer it is not possible to use it for new allocations. In the current implementation a zone is reset once all of the block group's space is accounted as zone unusable. This behaviour can lead to premature ENOSPC errors on a busy file system. Instead of only reclaiming the zone once it is completely unusable, kick off a reclaim job once the amount of unusable bytes exceeds a user configurable threshold between 51% and 100%. It can be set per mounted filesystem via the sysfs tunable bg_reclaim_threshold which is set to 75% by default. Similar to reclaiming unused block groups, these dirty block groups are added to a to_reclaim list and then on a transaction commit, the reclaim process is triggered but after we deleted unused block groups, which will free space for the relocation process. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-20platform/chrome: cros_ec: Add Type C hard resetPrashant Malani
Update the EC command header to include the new event bit. This bit is included in the latest version of the Chrome EC headers[1]. [1] https://chromium.googlesource.com/chromiumos/platform/ec/+/refs/heads/main/include/ec_commands.h Signed-off-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20210420171617.3830902-1-pmalani@chromium.org
2021-04-20spi: altera: separate core code from platform codeMatthew Gerlach
In preparation of adding support for a new bus type, separate the core spi-altera code from the platform driver code. Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Link: https://lore.kernel.org/r/20210416165720.554144-2-matthew.gerlach@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-20ASoC: audio-graph: move audio_graph_remove() to simple-card-utils.cKuninori Morimoto
audio-graph-card2 can reuse audio_graph_remove() / asoc_simple_remove(). This patch moves it to simple-card-utils.c. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87y2df3uby.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-20ASoC: audio-graph: move audio_graph_card_probe() to simple-card-utils.cKuninori Morimoto
audio-graph-card2 can reuse audio_graph_card_probe(). This patch moves it to simple-card-utils.c. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://lore.kernel.org/r/87zgxv3uc4.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-20drm/bridge/synopsys: dw-hdmi: Add an option to suppress loading CEC driverJernej Skrabec
This adds DW-HDMI driver a glue option to disable loading of the CEC sub-driver. On some SoCs, the CEC functionality is enabled in the IP config bits, but the CEC bus is non-functional like on Amlogic SoCs, where the CEC config bit is set but the DW-HDMI CEC signal is not connected to a physical pin, leading to some confusion when the DW-HDMI CEC controller can't communicate on the bus. Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Link: https://patchwork.freedesktop.org/patch/msgid/20210416092737.1971876-2-narmstrong@baylibre.com
2021-04-20floppy: cleanups: remove trailing whitespacesDenis Efremov
Cleanup trailing whitespaces as checkpatch.pl suggests. Signed-off-by: Denis Efremov <efremov@linux.com> Link: https://lore.kernel.org/r/20210416083449.72700-2-efremov@linux.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20PCI/MSI: Let PCI host bridges declare their reliance on MSI domainsMarc Zyngier
There is a whole class of host bridges that cannot know whether MSIs will be provided or not, as they rely on other blocks to provide the MSI functionnality, using MSI domains. This is the case for example on systems that use the ARM GIC architecture. Introduce a new attribute ('msi_domain') indicating that implicit dependency, and use this property to set the NO_MSI flag when no MSI domain is found at probe time. Link: https://lore.kernel.org/r/20210330151145.997953-11-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-20PCI/MSI: Kill default_teardown_msi_irqs()Marc Zyngier
It doesn't have any caller left. Link: https://lore.kernel.org/r/20210330151145.997953-10-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-20PCI/MSI: Kill msi_controller structureMarc Zyngier
msi_controller had a good, long life as the abstraction for a driver providing MSIs to PCI devices. But it has been replaced in all drivers by the more expressive generic MSI framework. Farewell, struct msi_controller. Link: https://lore.kernel.org/r/20210330151145.997953-9-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-20PCI/MSI: Drop use of msi_controller from core codeMarc Zyngier
As there is no driver using msi_controller, we can now safely remove its use from the PCI probe code. Link: https://lore.kernel.org/r/20210330151145.997953-8-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-20Merge branch 'fixes' into nextVinod Koul
2021-04-20KVM: x86: Add capability to grant VM access to privileged SGX attributeSean Christopherson
Add a capability, KVM_CAP_SGX_ATTRIBUTE, that can be used by userspace to grant a VM access to a priveleged attribute, with args[0] holding a file handle to a valid SGX attribute file. The SGX subsystem restricts access to a subset of enclave attributes to provide additional security for an uncompromised kernel, e.g. to prevent malware from using the PROVISIONKEY to ensure its nodes are running inside a geniune SGX enclave and/or to obtain a stable fingerprint. To prevent userspace from circumventing such restrictions by running an enclave in a VM, KVM restricts guest access to privileged attributes by default. Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <0b099d65e933e068e3ea934b0523bab070cb8cea.1618196135.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-20KVM: Stop looking for coalesced MMIO zones if the bus is destroyedSean Christopherson
Abort the walk of coalesced MMIO zones if kvm_io_bus_unregister_dev() fails to allocate memory for the new instance of the bus. If it can't instantiate a new bus, unregister_dev() destroys all devices _except_ the target device. But, it doesn't tell the caller that it obliterated the bus and invoked the destructor for all devices that were on the bus. In the coalesced MMIO case, this can result in a deleted list entry dereference due to attempting to continue iterating on coalesced_zones after future entries (in the walk) have been deleted. Opportunistically add curly braces to the for-loop, which encompasses many lines but sneaks by without braces due to the guts being a single if statement. Fixes: f65886606c2d ("KVM: fix memory leak in kvm_io_bus_unregister_dev()") Cc: stable@vger.kernel.org Reported-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210412222050.876100-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-20Merge tag 'v5.12-rc8' into sched/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-04-19net/mlx5: DR, Add support for isolate_vl_tc QPYevgeny Kliteynik
When using SW steering, rule insertion rate depends on the RDMA RC QP performance used for writing to the ICM. During stress this QP is competing on the HW resources with all the other QPs that are used to send data. To protect SW steering QP's performance in such cases, we set this QP to use isolated VL. The VL number is reserved by FW and is not exposed to the driver. Support for this QP on isolated VL exists only when both force-loopback and isolate_vl_tc capabilities are set. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19net/mlx5: DR, Add support for force-loopback QPYevgeny Kliteynik
When supported by the device, SW steering RoCE RC QP that is used to write/read to/from ICM will be created with force-loopback attribute. Such QP doesn't require GID index upon creation. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19net/mlx5: mlx5_ifc updates for flex parserYevgeny Kliteynik
Added the required definitions for supporting more protocols by flex parsers (GTP-U, Geneve TLV options), and for using the right flex parser that was configured for this protocol. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19net/mlx5e: RX, Add checks for calculated Striding RQ attributesTariq Toukan
Striding RQ attributes below are mutually dependent. An unaware change to one might take the others out of the valid range derived by the HW caps: - The MPWQE size in bytes - The number of strides in a MPWQE - The stride size Add checks to verify they are valid and comply to the HW spec and SW assumptions/requirements. This is not a fix, no particular issue exists today. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19net: phy: add genphy_c45_pma_suspend/resumeRadu Pirea (NXP OSS)
Add generic PMA suspend and resume callback functions for C45 PHYs. Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Add vlan match and pop actions to the flowtable offload, patches from wenxu. 2) Reduce size of the netns_ct structure, which itself is embedded in struct net Make netns_ct a read-mostly structure. Patches from Florian Westphal. 3) Add FLOW_OFFLOAD_XMIT_UNSPEC to skip dst check from garbage collector path, as required by the tc CT action. From Roi Dayan. 4) VLAN offload fixes for nftables: Allow for matching on both s-vlan and c-vlan selectors. Fix match of VLAN id due to incorrect byteorder. Add a new routine to properly populate flow dissector ethertypes. 5) Missing keys in ip{6}_route_me_harder() results in incorrect routes. This includes an update for selftest infra. Patches from Ido Schimmel. 6) Add counter hardware offload support through FLOW_CLS_STATS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19netlink: simplify nl_set_extack_cookie_u64(), nl_set_extack_cookie_u32()Alexey Dobriyan
Taking address of a function argument directly works just fine. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19bpf: Add a bpf_snprintf helperFlorent Revest
The implementation takes inspiration from the existing bpf_trace_printk helper but there are a few differences: To allow for a large number of format-specifiers, parameters are provided in an array, like in bpf_seq_printf. Because the output string takes two arguments and the array of parameters also takes two arguments, the format string needs to fit in one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to a zero-terminated read-only map so we don't need a format string length arg. Because the format-string is known at verification time, we also do a first pass of format string validation in the verifier logic. This makes debugging easier. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
2021-04-19bpf: Add a ARG_PTR_TO_CONST_STR argument typeFlorent Revest
This type provides the guarantee that an argument is going to be a const pointer to somewhere in a read-only map value. It also checks that this pointer is followed by a zero character before the end of the map value. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-3-revest@chromium.org
2021-04-19bpf: Factorize bpf_trace_printk and bpf_seq_printfFlorent Revest
Two helpers (trace_printk and seq_printf) have very similar implementations of format string parsing and a third one is coming (snprintf). To avoid code duplication and make the code easier to maintain, this moves the operations associated with format string parsing (validation and argument sanitization) into one generic function. The implementation of the two existing helpers already drifted quite a bit so unifying them entailed a lot of changes: - bpf_trace_printk always expected fmt[fmt_size] to be the terminating NULL character, this is no longer true, the first 0 is terminating. - bpf_trace_printk now supports %% (which produces the percentage char). - bpf_trace_printk now skips width formating fields. - bpf_trace_printk now supports the X modifier (capital hexadecimal). - bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6 - argument casting on 32 bit has been simplified into one macro and using an enum instead of obscure int increments. - bpf_seq_printf now uses bpf_trace_copy_string instead of strncpy_from_kernel_nofault and handles the %pks %pus specifiers. - bpf_seq_printf now prints longs correctly on 32 bit architectures. - both were changed to use a global per-cpu tmp buffer instead of one stack buffer for trace_printk and 6 small buffers for seq_printf. - to avoid per-cpu buffer usage conflict, these helpers disable preemption while the per-cpu buffer is in use. - both helpers now support the %ps and %pS specifiers to print symbols. The implementation is also moved from bpf_trace.c to helpers.c because the upcoming bpf_snprintf helper will be made available to all BPF programs and will need it. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-2-revest@chromium.org
2021-04-19perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHEKan Liang
Current Hardware events and Hardware cache events have special perf types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't pass the PMU type in the user interface. For a hybrid system, the perf subsystem doesn't know which PMU the events belong to. The first capable PMU will always be assigned to the events. The events never get a chance to run on the other capable PMUs. Extend the two types to become PMU aware types. The PMU type ID is stored at attr.config[63:32]. Add a new PMU capability, PERF_PMU_CAP_EXTENDED_HW_TYPE, to indicate a PMU which supports the extended PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The PMU type is only required when searching a specific PMU. The PMU specific codes will only be interested in the 'real' config value, which is stored in the low 32 bit of the event->attr.config. Update the event->attr.config in the generic code, so the PMU specific codes don't need to calculate it separately. If a user specifies a PMU type, but the PMU doesn't support the extended type, error out. If an event cannot be initialized in a PMU specified by a user, error out immediately. Perf should not try to open it on other PMUs. The new PMU capability is only set for the X86 hybrid PMUs for now. Other architectures, e.g., ARM, may need it as well. The support on ARM may be implemented later separately. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1618237865-33448-22-git-send-email-kan.liang@linux.intel.com
2021-04-19perf/x86: Add structures for the attributes of Hybrid PMUsKan Liang
Hybrid PMUs have different events and formats. In theory, Hybrid PMU specific attributes should be maintained in the dedicated struct x86_hybrid_pmu, but it wastes space because the events and formats are similar among Hybrid PMUs. To reduce duplication, all hybrid PMUs will share a group of attributes in the following patch. To distinguish an attribute from different Hybrid PMUs, a PMU aware attribute structure is introduced. A PMU type is required for the attribute structure. The type is internal usage. It is not visible in the sysfs API. Hybrid PMUs may support the same event name, but with different event encoding, e.g., the mem-loads event on an Atom PMU has different event encoding from a Core PMU. It brings issue if two attributes are created for them. Current sysfs_update_group finds an attribute by searching the attr name (aka event name). If two attributes have the same event name, the first attribute will be replaced. To address the issue, only one attribute is created for the event. The event_str is extended and stores event encodings from all Hybrid PMUs. Each event encoding is divided by ";". The order of the event encodings must follow the order of the hybrid PMU index. The event_str is internal usage as well. When a user wants to show the attribute of a Hybrid PMU, only the corresponding part of the string is displayed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lkml.kernel.org/r/1618237865-33448-18-git-send-email-kan.liang@linux.intel.com
2021-04-19dm: replace dm_vcalloc()Matthew Wilcox (Oracle)
Use kvcalloc or kvmalloc_array instead (depending whether zeroing is useful). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-19btrfs: add and use readahead_batch_lengthMatthew Wilcox (Oracle)
Implement readahead_batch_length() to determine the number of bytes in the current batch of readahead pages and use it in btrfs. Also use the readahead_pos to get the offset. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-04-19fs: introduce a wrapper uuid_to_fsid()Amir Goldstein
Some filesystem's use a digest of their uuid for f_fsid. Create a simple wrapper for this open coded folding. Filesystems that have a non null uuid but use the block device number for f_fsid may also consider using this helper. [JK: Added missing asm/byteorder.h include] Link: https://lore.kernel.org/r/20210322173944.449469-2-amir73il@gmail.com Acked-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-04-19KVM: x86/mmu: Re-add const qualifier in kvm_tdp_mmu_zap_collapsible_sptesBen Gardon
kvm_tdp_mmu_zap_collapsible_sptes unnecessarily removes the const qualifier from its memlsot argument, leading to a compiler warning. Add the const annotation and pass it to subsequent functions. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>