Age | Commit message | Author |
|
- support arbitrary PCI host bridge offsets on sparc (Yinghai Lu)
- remove System and Video ROM reservations on sparc (Bjorn Helgaas)
* pci/sparc:
sparc/PCI: Stop reserving System ROM and Video ROM in PCI space
sparc/PCI: Support arbitrary host bridge address offset
|
|
- use generic pci_mmap_resource_range() instead of powerpc and xtensa
arch-specific versions (David Woodhouse)
* pci/resource-mmap:
xtensa/PCI: Use generic pci_mmap_resource_range()
powerpc/pci: Use generic pci_mmap_resource_range()
|
|
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick
Lawler)
- merge pcieport_if.h into portdrv.h (Bjorn Helgaas)
- move workaround for BIOS PME issue from portdrv to PCI core (Bjorn
Helgaas)
- completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas)
- remove portdrv link order dependency (Bjorn Helgaas)
- remove support for unused VC portdrv service (Bjorn Helgaas)
- simplify portdrv feature permission checking (Bjorn Helgaas)
- remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn
Helgaas)
- remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas)
- use cached AER capability offset (Frederick Lawler)
- don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg)
- rename pcie-dpc.c to dpc.c (Bjorn Helgaas)
* pci/portdrv:
PCI/DPC: Rename from pcie-dpc.c to dpc.c
PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS
PCI/AER: Use cached AER Capability offset
PCI/portdrv: Rename and reverse sense of pcie_ports_auto
PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h>
PCI/portdrv: Simplify PCIe feature permission checking
PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
PCI/portdrv: Remove pcie_port_bus_type link order dependency
PCI/portdrv: Disable port driver in compat mode
PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
PCI/PM: Move pcie_clear_root_pme_status() to core
PCI/portdrv: Merge pcieport_if.h into portdrv.h
PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/
Conflicts:
drivers/pci/pcie/Makefile
drivers/pci/pcie/portdrv.h
|
|
- don't set up INTx if MSI or MSI-X is enabled to align cris, frv, ia64,
and mn10300 with x86 (Bjorn Helgaas)
* pci/msi:
PCI/MSI: Don't set up INTx if MSI or MSI-X is enabled
|
|
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas)
- remove possible NULL pointer dereference in of_pci_bus_find_domain_nr()
(Shawn Lin)
- report quirk timings with dev_info (Bjorn Helgaas)
- report quirks that take longer than 10ms (Bjorn Helgaas)
- add and use Altera Vendor ID (Johannes Thumshirn)
- tidy Makefiles and comments (Bjorn Helgaas)
* pci/misc:
PCI: Always define the of_node helpers
PCI: Tidy comments
PCI: Tidy Makefiles
mcb: Add Altera PCI ID to mcb-pci
PCI: Add Altera vendor ID
PCI: Report quirks that take more than 10ms
PCI: Report quirk timings with pci_info() instead of pr_debug()
PCI: Fix NULL pointer dereference in of_pci_bus_find_domain_nr()
rapidio/tsi721: use PCI_EXP_DEVCTL2_COMP_TIMEOUT macro
|
|
- add support for PCI I/O port space that's neither directly accessible
via CPU in/out instructions nor directly mapped into CPU physical
memory space (Zhichang Yuan)
- add support for HiSilicon Hip06/Hip07 LPC I/O space (Zhichang Yuan,
John Garry)
* pci/lpc:
MAINTAINERS: Add John Garry as maintainer for HiSilicon LPC driver
HISI LPC: Add ACPI support
ACPI / scan: Do not enumerate Indirect IO host children
ACPI / scan: Rename acpi_is_serial_bus_slave() for more general use
HISI LPC: Support the LPC host on Hip06/Hip07 with DT bindings
of: Add missing I/O range exception for indirect-IO devices
PCI: Apply the new generic I/O management on PCI IO hosts
PCI: Add fwnode handler as input param of pci_register_io_range()
PCI: Remove __weak tag from pci_register_io_range()
lib: Add generic PIO mapping method
|
|
- fix possible cpqphp NULL pointer dereference (Shawn Lin)
- rescan more of the hierarchy on ACPI hotplug to fix Thunderbolt/xHCI
hotplug (Mika Westerberg)
* pci/hotplug:
ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()
PCI: cpqphp: Fix possible NULL pointer dereference
|
|
- add decoding for 16 GT/s link speed (Jay Fang)
- add interfaces to get max link speed and width (Tal Gilboa)
- add pcie_bandwidth_capable() to compute max supported link bandwidth
(Tal Gilboa)
- add pcie_bandwidth_available() to compute bandwidth available to device
(Tal Gilboa)
- add pcie_print_link_status() to log link speed and whether it's limited
(Tal Gilboa)
- use PCI core interfaces to report when device performance may be
limited by its slot instead of doing it in each driver (Tal Gilboa)
* pci/enumeration:
fm10k: Report PCIe link properties with pcie_print_link_status()
net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth
net/mlx5: Report PCIe link properties with pcie_print_link_status()
net/mlx4_core: Report PCIe link properties with pcie_print_link_status()
PCI: Add pcie_print_link_status() to log link speed and whether it's limited
PCI: Add pcie_bandwidth_available() to compute bandwidth available to device
PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth
PCI: Add pcie_get_width_cap() to find max supported link width
PCI: Add pcie_get_speed_cap() to find max supported link speed
PCI: Add decoding for 16 GT/s link speed
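
As a rough illustration of the arithmetic behind the bandwidth helpers listed above, the sketch below computes usable link bandwidth from a per-lane speed and a lane count, accounting for line-encoding overhead (8b/10b at 2.5 and 5 GT/s, 128b/130b at 8 and 16 GT/s). The enum and function names are hypothetical and are not the kernel's interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical link-speed codes for this sketch (not kernel definitions). */
enum link_speed { SPEED_2_5GT, SPEED_5GT, SPEED_8GT, SPEED_16GT };

/* Usable per-lane bandwidth in Mb/s after line-encoding overhead. */
static uint32_t lane_mbps(enum link_speed s)
{
	switch (s) {
	case SPEED_2_5GT: return 2500u * 8 / 10;     /* 8b/10b    */
	case SPEED_5GT:   return 5000u * 8 / 10;     /* 8b/10b    */
	case SPEED_8GT:   return 8000u * 128 / 130;  /* 128b/130b */
	case SPEED_16GT:  return 16000u * 128 / 130; /* 128b/130b */
	}
	return 0;
}

/* Total link bandwidth in Mb/s: per-lane bandwidth times lane count. */
static uint32_t link_mbps(enum link_speed s, unsigned int lanes)
{
	assert(lanes >= 1 && lanes <= 32);
	return lane_mbps(s) * lanes;
}
```

The bandwidth available to a device would then be the minimum of this value over every link between the device and the root, which is why a fast device in a narrow slot can be reported as limited.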
|
|
- remove last user of pci_get_bus_and_slot() and the function itself
(Sinan Kaya)
* pci/deprecate-get-bus-and-slot:
PCI: Remove pci_get_bus_and_slot() function
drm/i915: Deprecate pci_get_bus_and_slot()
|
|
- skip ASPM common clock warning if BIOS already configured it (Sinan
Kaya)
- fix ASPM Coverity warning about threshold_ns (Gustavo A. R. Silva)
* pci/aspm:
PCI/ASPM: Don't warn if already in common clock mode
PCI/ASPM: Declare threshold_ns as u32, not u64
|
|
- move pci_uevent_ers() out of pci.h (Michael Ellerman)
* pci/aer:
PCI/AER: Move pci_uevent_ers() out of pci.h
|
|
Add a binding for the I2C controller that can be found in the Socionext
SynQuacer SoC.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
When a Raw Ethernet QP is created, we actually create a few objects.
One of these objects is a TIR. Currently, a TIR could hash (and spread
the traffic) by IP or port only. Add hashing by IPSec SPI to TIR
creation, along with the required UAPI bit.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Users should be able to query for IPSec support. Adding a few
capabilities bits as part of the driver specific part in
alloc_ucontext:
MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA
Payload's header is returned with metadata representing the
IPSec decryption state.
MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_RX
Support ESP_AES_GCM in ingress path.
MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX
Support ESP_AES_GCM in egress path.
MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_RSS_ONLY
Hardware doesn't support matching SPI in flow steering rules
but just hashing and spreading the traffic accordingly.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
This commit introduces support for the esp_aes_gcm flow
specification for the Innova device. To that end we add
support for egress steering and some validations that an
IPsec rule is indeed valid.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Simple wrapper to understand if we are dealing with IPsec flow.
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Adding implementation in mlx5 driver to modify action_xfrm object. This
merely calls the accel layer. Currently a user can modify only the
ESN parameters.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Adding implementation in mlx5 driver to create and destroy action_xfrm
object. This merely calls the accel layer.
A user may pass MLX5_IB_XFRM_FLAGS_REQUIRE_METADATA flag which states
that [s]he expects a metadata header to be added to the payload. This
header represents information regarding the transformation's state.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Adding a new ESP steering match filter that could match against
spi and seq used in IPSec protocol.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
flow_actions of ESP type could be modified during runtime. This could be
common for example when ESN should be changed. Adding a new
UVERBS_FLOW_ACTION_ESP_MODIFY method for changing ESP parameters of an
existing ESP flow_action.
The new method uses the UVERBS_FLOW_ACTION_ESP_CREATE attributes, but
adds a new IB_FLOW_ACTION_ESP_FLAGS_MOD_ESP_ATTRS which means ESP_ATTRS
should be changed.
In addition, we add a new FLOW_ACTION_ESP_REPLAY_NONE replay type that
could be used when one wants to disable a replay protection over a
specific flow_action.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The egress flag indicates that this flow steering rule is for egress
traffic. The scope of an egress rule is port-wide, meaning all packets
originated from that port, which match the steering rule specification
will be affected by this steering rule's action.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Binding a flow_action to flow steering rule requires using a new
specification. Therefore, adding such an IB_FLOW_SPEC_ACTION_HANDLE flow
specification.
Flow steering rules could use flow_action(s), and because of that we need to
avoid deleting flow_action(s) as long as they're being used.
Moreover, when the attached rules are deleted, action_handle reference
count should be decremented. Introducing a new mechanism of flow
resources to keep track of the attached action_handle(s). Later on, this
mechanism should be extended to other attached flow steering resources
like flow counters.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
A verbs application may receive and transmit packets using a data
path pipeline. Sometimes, the first stage in the receive pipeline or
the last stage in the transmit pipeline involves transforming a
packet, either in order to make it easier for later stages to process
it or to prepare it for transmission over the wire. Such transformation
could be stripping/encapsulating the packet (e.g. vxlan),
decrypting/encrypting it (e.g. ipsec), altering headers, doing some
complex FPGA changes, etc.
Some hardware could do such transformations without software data path
intervention at all. The flow steering API supports steering a
packet (either to a QP or dropping it) and some simple packet
immutable actions (e.g. tagging a packet). Complex actions, that may
change the packet, could bloat the flow steering API extensively.
Sometimes the same action should be applied to several flows.
In this case, it's easier to bind several flows to the same action and
modify it than change all matching flows.
Introducing a new flow_action object that abstracts any packet
transformation (out of a standard and well defined set of actions).
This flow_action object could be tied to a flow steering rule via a
new specification.
Currently, we support esp flow_action, which encrypts or decrypts a
packet according to the given parameters. However, we present a
flexible schema that could be used for other transformation actions tied
to flow rules.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The current implementation of kern_spec_to_ib_spec_filter, which takes
a uAPI based flow steering specification and creates the respective kernel
API flow steering structure, takes an ib_uverbs_flow_spec structure.
The new flow_action uAPI gets a match mask and filter from user-space
which aren't encoded in the flow steering's ib_uverbs_flow_spec structure.
Exporting the logic out of kern_spec_to_ib_spec_filter to get user-space
blobs rather than ib_uverbs_flow_spec structure.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
ConnectX3 doesn't support egress flow steering. Return an EOPNOTSUPP
error when such a flow is being created.
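
A minimal sketch of this kind of early rejection, assuming a hypothetical flag bit and capability argument (the real driver's flag names and checks are not shown in this log):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define FLOW_FLAG_EGRESS 0x1u /* hypothetical flag bit for this sketch */

/* Reject egress rules up front on hardware that cannot steer egress traffic. */
static int create_flow(uint32_t flags, bool hw_supports_egress)
{
	if ((flags & FLOW_FLAG_EGRESS) && !hw_supports_egress)
		return -EOPNOTSUPP;
	return 0; /* rule creation would continue here */
}
```

Returning a distinct error code lets the caller tell "this hardware cannot do it" apart from a malformed request.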
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Methods sometimes need to get one attribute out of a group of
pre-defined attributes. This is an enum-like behavior. Since
this is a common requirement, we add a new ENUM attribute to the
generic uverbs ioctl() layer. This attribute is embedded in methods,
like any other attributes we currently have. ENUM attributes point to
an array of standard UVERBS_ATTR_PTR_IN. The user-space encodes the
enum's attribute id in the id field and the internal PTR_IN attr id in
the enum_data.elem_id field. This ENUM attribute could be shared by
several attributes and it can get UVERBS_ATTR_SPEC_F_MANDATORY flag,
stating this attribute must be supported by the kernel, like any other
attribute.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
In order to have a custom parsing tree, a provider driver needs to
assign its parsing tree to ib_device specs_root field. Otherwise, the
uverbs client assigns a common default parsing tree for it.
In downstream patches, the mlx5_ib driver gains a custom parsing tree,
which contains both the common objects and a new flags field for the
UVERBS_FLOW_ACTION_ESP_CREATE command.
This patch makes mlx5_ib assign its own tree to specs_root, which
later on will be extended.
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
This includes the infrastructure to map the test into the guest and
run code from the test program inside a VM.
Signed-off-by: Ken Hofsass <hofsass@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Testsuite contributed by Google and cleaned up by myself for
inclusion in Linux.
Signed-off-by: Ken Hofsass <hofsass@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fix a "warning: no previous prototype".
Cc: stable@vger.kernel.org
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There is no easy way to force KVM to run an instruction through the emulator
(by design as that will expose the x86 emulator as a significant attack-surface).
However, we do wish to expose the x86 emulator in case we are testing it
(e.g. via kvm-unit-tests). Therefore, this patch adds a "force emulation prefix"
that is designed to raise #UD, which KVM will trap; its #UD exit-handler will
match the "force emulation prefix" and run the instruction after the prefix through the x86 emulator.
To not expose the x86 emulator by default, we add a module parameter that should
be off by default.
A simple testcase here:
#include <stdio.h>
#include <string.h>
#define HYPERVISOR_INFO 0x40000000
/* "=a" is listed first so that the "0" input constraint loads idx into eax. */
#define CPUID(idx, eax, ebx, ecx, edx) \
	asm volatile (\
	"ud2a; .ascii \"kvm\"; cpuid" \
	:"=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx) \
	:"0"(idx) );
int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	char string[13];
	CPUID(HYPERVISOR_INFO, &eax, &ebx, &ecx, &edx);
	/* The 12-byte hypervisor signature is returned in ebx, ecx, edx. */
	*(unsigned int *)(string + 0) = ebx;
	*(unsigned int *)(string + 4) = ecx;
	*(unsigned int *)(string + 8) = edx;
	string[12] = 0;
	if (strncmp(string, "KVMKVMKVM\0\0\0", 12) == 0)
		printf("kvm guest\n");
	else
		printf("bare hardware\n");
	return 0;
}
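
For the trap side, the matching step can be sketched as a byte comparison against the five-byte sequence the testcase above emits: ud2 (0x0f 0x0b) followed by the ASCII bytes "kvm". The function name and the idea of returning a skip length are assumptions made for illustration, not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ud2 (0x0f 0x0b) followed by the ASCII bytes "kvm". */
static const unsigned char force_emul_sig[] = { 0x0f, 0x0b, 'k', 'v', 'm' };

/*
 * Return the number of prefix bytes to skip before emulating, or 0 if the
 * instruction stream does not begin with the signature.
 */
static size_t match_force_emul_prefix(const unsigned char *insn, size_t len)
{
	assert(insn != NULL);
	if (len < sizeof(force_emul_sig))
		return 0;
	if (memcmp(insn, force_emul_sig, sizeof(force_emul_sig)) != 0)
		return 0;
	return sizeof(force_emul_sig);
}
```

A handler built this way would advance the instruction pointer by the returned length and hand the following instruction (cpuid in the testcase) to the emulator, but only when the module parameter enables it.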
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
[Correctly handle usermode exits. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduce handle_ud() to handle invalid opcode, this function will be
used by later patches.
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
vmx_save_host_state has multiple ifdefs for CONFIG_X86_64 that have
no other code between them. Simplify by reducing them to a single
conditional.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The local variable was newly introduced but is only accessed in one
place on x86_64, but not on 32-bit:
arch/x86/kvm/vmx.c: In function 'vmx_save_host_state':
arch/x86/kvm/vmx.c:2175:6: error: unused variable 'cpu' [-Werror=unused-variable]
This puts it into another #ifdef.
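
The pattern can be sketched in isolation: a variable that only one configuration uses is declared inside the same conditional block as its use, so other configurations never see it. SKETCH_64BIT and the helper below are hypothetical stand-ins for CONFIG_X86_64 and the real per-CPU lookup.

```c
#include <assert.h>

#define SKETCH_64BIT 1 /* hypothetical stand-in for CONFIG_X86_64 */

/* Hypothetical stand-in for a per-CPU lookup that only 64-bit builds need. */
static unsigned long read_base_for_cpu(int cpu)
{
	return 0x1000ul + (unsigned long)cpu;
}

static unsigned long save_state(int current_cpu)
{
	unsigned long base = 0;
#ifdef SKETCH_64BIT
	/* Declared inside the #ifdef so other builds never see an unused variable. */
	int cpu = current_cpu;

	base = read_base_for_cpu(cpu);
#else
	(void)current_cpu; /* intentionally unused on this path */
#endif
	return base;
}
```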
Fixes: 35060ed6a1ff ("x86/kvm/vmx: avoid expensive rdmsr for MSR_GS_BASE")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The __net_initdata section cannot currently be used for structures that
get cleaned up in an exitcall using unregister_pernet_operations:
WARNING: vmlinux.o(.text+0x868c34): Section mismatch in reference from the function nsim_devlink_exit() to the (unknown reference) .init.data:(unknown)
The function nsim_devlink_exit() references
the (unknown reference) __initdata (unknown).
This is often because nsim_devlink_exit lacks a __initdata
annotation or the annotation of (unknown) is wrong.
WARNING: vmlinux.o(.text+0x868c64): Section mismatch in reference from the function nsim_devlink_init() to the (unknown reference) .init.data:(unknown)
WARNING: vmlinux.o(.text+0x8692bc): Section mismatch in reference from the function nsim_fib_exit() to the (unknown reference) .init.data:(unknown)
WARNING: vmlinux.o(.text+0x869300): Section mismatch in reference from the function nsim_fib_init() to the (unknown reference) .init.data:(unknown)
As that warning tells us, discarding the structure after a module is
loaded would lead to undefined behavior when that module is removed.
It might be possible to change that annotation so it has no effect for
loadable modules, but I have not figured out exactly how to do that, and
we want this to be fixed in -rc1.
This just removes the annotations, just like we do for all other such
modules.
Fixes: 37923ed6b8ce ("netdevsim: Add simple FIB resource controller via devlink")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the fmode_t that is passed to dm_blk_ioctl() rather than
inconsistently (varies across targets) drop it on the floor by
overriding it with the fmode_t stored in 'struct dm_dev'.
All the persistent reservation functions weren't using the fmode_t they
got back from .prepare_ioctl so remove them.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Commit 519049afead ("dm: use blkdev_get rather than bdgrab when issuing
pass-through ioctl") inadvertently introduced a regression relative to
users of device cgroups that issue ioctls (e.g. libvirt). Using
blkdev_get() in DM's passthrough ioctl support implicitly introduced a
cgroup permissions check that would fail unless care were taken to add
all devices in the IO stack to the device cgroup. E.g. rather than just
adding the top-level DM multipath device to the cgroup all the
underlying devices would need to be allowed.
Fix this, to no longer require allowing all underlying devices, by
simply holding the live DM table (which includes the table's original
blkdev_get() reference on the blockdevice that the ioctl will be issued
to) for the duration of the ioctl.
Also, bump the DM ioctl version so a user can know that their device
cgroup allow workaround is no longer needed.
Reported-by: Michal Privoznik <mprivozn@redhat.com>
Suggested-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 519049afead ("dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl")
Cc: stable@vger.kernel.org # 4.16
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
parse_raid_params() compares variable "int value" with INT_MAX.
E.g. related Coverity report excerpt:
CID 1364818 (#2 of 3): Operands don't affect result (CONSTANT_EXPRESSION_RESULT) [select issue]
1433 if (value > INT_MAX) {
Fix by changing checks to avoid INT_MAX.
Whilst on it, avoid unnecessary checks against constants
and add check for sane recovery speed min/max.
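
To illustrate why the flagged comparison is a constant expression: an int can never hold a value above INT_MAX, so "value > INT_MAX" is always false. A range check against the actual bounds is what carries information. The bounds and helper names below are hypothetical and are not the limits dm-raid uses; the min/max helper is one plausible reading of a "sane min/max" check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical bounds for this sketch; not the limits dm-raid uses. */
#define SKETCH_MIN 1
#define SKETCH_MAX 1024

/* Compare against the real bounds rather than against INT_MAX. */
static bool param_in_range(int value)
{
	return value >= SKETCH_MIN && value <= SKETCH_MAX;
}

/* A minimum above the maximum is rejected as not sane. */
static bool min_max_sane(int min, int max)
{
	return min <= max;
}
```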
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
Fixes the following sparse warning:
drivers/md/dm-verity-target.c:375:6: warning:
symbol 'verity_for_io_block' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
The ctpio_dmabuf_start entry is not actually a stat and shouldn't
be exposed to ethtool.
Fixes: 2c0b6ee837db ("sfc: expose CTPIO stats on NICs that support them")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Giving an integer to proc_doulongvec_minmax() is dangerous on 64bit arches,
since linker might place next to it a non zero value preventing a change
to ip6frag_low_thresh.
ip6frag_low_thresh is not used anymore in the kernel, but we do not
want to prematurely break user scripts wanting to change it.
Since specifying a minimal value of 0 for proc_doulongvec_minmax()
is moot, let's remove these zero values in all defrag units.
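
The hazard can be sketched without invoking undefined behavior by copying 8 bytes out of two adjacent 32-bit objects: a reader that expects a 64-bit long picks up the neighbor's bits, so a bound that is 0 as an int no longer reads as 0. The struct layout is an assumption made for illustration; in the kernel the adjacency depends on where the linker places the variables.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit objects: an int bound and whatever sits next to it. */
struct adjacent {
	int32_t zero;
	int32_t neighbor;
};

/* Read 64 bits starting at 'zero', as a handler expecting a 64-bit long would. */
static uint64_t read_as_long(const struct adjacent *a)
{
	uint64_t v;

	assert(a != NULL);
	memcpy(&v, a, sizeof(v)); /* memcpy keeps the sketch free of aliasing issues */
	return v;
}
```

With a non-zero neighbor the value read is non-zero on either byte order, which is how a bogus bound can end up rejecting otherwise valid writes.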
Fixes: 6e00f7dd5e4e ("ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove the WARN_ON in handle_ept_misconfig() as it is unnecessary
and causes false positives. Return the unmodified result of
kvm_mmu_page_fault() instead of converting a system error code to
KVM_EXIT_UNKNOWN so that userspace sees the error code of the
actual failure, not a generic "we don't know what went wrong".
* kvm_mmu_page_fault() will WARN if reserved bits are set in the
SPTEs, i.e. it covers the case where an EPT misconfig occurred
because of a KVM bug.
* The WARN_ON will fire on any system error code that is hit while
handling the fault, e.g. -ENOMEM from mmu_topup_memory_caches()
while handling a legitimate MMIO EPT misconfig or -EFAULT from
kvm_handle_bad_page() if the corresponding HVA is invalid. In
either case, userspace should receive the original error code
and firing a warning is incorrect behavior as KVM is operating
as designed.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The bug that led to commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5
was a benign warning (no adverse effects other than the warning
itself) that was detected by syzkaller. Further inspection shows
that the WARN_ON in question, in handle_ept_misconfig(), is
unnecessary and flawed (this was also briefly discussed in the
original patch: https://patchwork.kernel.org/patch/10204649).
* The WARN_ON is unnecessary as kvm_mmu_page_fault() will WARN
if reserved bits are set in the SPTEs, i.e. it covers the case
where an EPT misconfig occurred because of a KVM bug.
* The WARN_ON is flawed because it will fire on any system error
code that is hit while handling the fault, e.g. -ENOMEM can be
returned by mmu_topup_memory_caches() while handling a legitimate
MMIO EPT misconfig.
The original behavior of returning -EFAULT when userspace munmaps
an HVA without first removing the memslot is correct and desirable,
i.e. KVM is letting userspace know it has generated a bad address.
Returning RET_PF_EMULATE masks the WARN_ON in the EPT misconfig path,
but does not fix the underlying bug, i.e. the WARN_ON is bogus.
Furthermore, returning RET_PF_EMULATE has the unwanted side effect of
causing KVM to attempt to emulate an instruction on any page fault
with an invalid HVA translation, e.g. a not-present EPT violation
on a VM_PFNMAP VMA whose fault handler failed to insert a PFN.
* There is no guarantee that the fault is directly related to the
instruction, i.e. the fault could have been triggered by a side
effect memory access in the guest, e.g. while vectoring a #DB or
writing a tracing record. This could cause KVM to effectively
mask the fault if KVM doesn't model the behavior leading to the
fault, i.e. emulation could succeed and resume the guest.
* If emulation does fail, KVM will return EMULATION_FAILED instead
of -EFAULT, which is a red herring as the user will either debug
a bogus emulation attempt or scratch their head wondering why we
were attempting emulation in the first place.
TL;DR: revert to returning -EFAULT and remove the bogus WARN_ON in
handle_ept_misconfig in a future patch.
This reverts commit 95e057e25892eaa48cad1e2d637b80d0f1a4fac5.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
To fetch UID info for socket diagnostics, we determine the
namespace of user context using tipc socket instance. This
may cause namespace violation, as the kernel will remap based
on UID.
We fix this by fetching namespace info using the calling userspace
netlink socket.
Fixes: c30b70deb5f4 (tipc: implement socket diagnostics for AF_TIPC)
Reported-by: syzbot+326e587eff1074657718@syzkaller.appspotmail.com
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates
done by __ip_append_data()") and commit 1f4c6eb24029 ("ipv6:
factorize sk_wmem_alloc updates done by __ip6_append_data()"),
when transmitting a sub-MTU datagram, an additional, unneeded atomic
operation is performed in ip*_append_data() to update wmem_alloc:
in the above condition the delta is 0.
This causes a small but measurable performance regression in the UDP
xmit tput test with packet sizes below the MTU.
This change avoids such overhead by updating wmem_alloc only if
wmem_alloc_delta is non zero.
The error path is left intentionally unmodified: it's a slow path
and simplicity is preferred over performance.
Fixes: 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()")
Fixes: 1f4c6eb24029 ("ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is very similar to the aligned versions movaps/movapd.
We have seen the corresponding emulation failures with openbsd as guest
and with Windows 10 with intel HD graphics pass through.
Signed-off-by: Christian Ehrhardt <christian_ehrhardt@genua.de>
Signed-off-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter
an exception in Protected Mode while emulating guest due to invalid
guest state. Unlike Big RM, KVM doesn't support emulating exceptions
in PM, i.e. PM exceptions are always injected via the VMCS. Because
we will never do VMRESUME due to emulation_required, the exception is
never realized and we'll keep emulating the faulting instruction over
and over until we receive a signal.
Exit to userspace iff there is a pending exception, i.e. don't exit
simply on a requested event. The purpose of this check and exit is to
aid in debugging a guest that is in all likelihood already doomed.
Invalid guest state in PM is extremely limited in normal operation,
e.g. it generally only occurs for a few instructions early in BIOS,
and any exception at this time is all but guaranteed to be fatal.
Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly
handled/emulated, while checking for vectored interrupts, e.g. INTR
and NMI, without hitting false positives would add a fair amount of
complexity for almost no benefit (getting hit by lightning seems
more likely than encountering this specific scenario).
Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an
exception via the VMCS and emulation_required is true.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
The new of_get_nvmem_mac_address() helper function causes a link error
with CONFIG_NVMEM=m:
drivers/of/of_net.o: In function `of_get_nvmem_mac_address':
of_net.c:(.text+0x168): undefined reference to `of_nvmem_cell_get'
of_net.c:(.text+0x19c): undefined reference to `nvmem_cell_read'
of_net.c:(.text+0x1a8): undefined reference to `nvmem_cell_put'
I could not come up with a good solution for this, as the code is always
built-in. Using an #if IS_REACHABLE() check around it would solve the
link time issue but then stop it from working in that configuration.
Making of_nvmem_cell_get() an inline function could also solve that, but
seems a bit ugly since it's somewhat larger than most inline functions,
and it would just bring that problem into the callers. Splitting the
function into a separate file might be an alternative.
This uses the big hammer by making CONFIG_NVMEM itself a 'bool' symbol,
which avoids the problem entirely but makes the vmlinux larger for anyone
that might use NVMEM support but doesn't need it built-in otherwise.
Fixes: 9217e566bdee ("of_net: Implement of_get_nvmem_mac_address helper")
Cc: Mike Looijmans <mike.looijmans@topic.nl>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mike Looijmans
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the config item "CONFIG_ARM64_64K_PAGES" is enabled, PAGE_SIZE
is 65536 (64K). The length field is a u16, which cannot hold that value
and overflows. Change it to u32.
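
The truncation is easy to reproduce in isolation: 65536 is one more than the largest 16-bit value, so only the low 16 bits (all zero) survive the conversion. The helper names below are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_64K 65536u /* page size when 64K pages are configured */

/* Storing the page size in a 16-bit field keeps only the low 16 bits. */
static uint16_t as_u16(uint32_t length)
{
	return (uint16_t)length;
}

/* A 32-bit field holds the value unchanged. */
static uint32_t as_u32(uint32_t length)
{
	return length;
}
```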
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|