Age | Commit message (Collapse) | Author |
|
At boot time, KVM stashes the host MDCR_EL2 value, but only does this
when the kernel is not running in hyp mode (i.e. is non-VHE). In these
cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can
lead to CONSTRAINED UNPREDICTABLE behaviour.
Since we use this value to derive the MDCR_EL2 value when switching
to/from a guest, after a guest have been run, the performance counters
do not behave as expected. This has been observed to result in accesses
via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant
counters, resulting in events not being counted. In these cases, only
the fixed-purpose cycle counter appears to work as expected.
Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.
Cc: Christopher Dall <christoffer.dall@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Fixes: 1e947bad0b63b351 ("arm64: KVM: Skip HYP setup when already running in HYP")
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
The find_ref_head shouldn't return the first entry even if no exact match
is found. So move the hidden behavior to higher level.
Besides, remove the useless local variables in the btrfs_select_ref_head.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
[ reformat comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Create a configfs attribute in each nvme-fabrics namespace to enable P2P
memory use. The attribute may be enabled (with a boolean) or a specific
P2P device may be given (with the device's PCI name).
When enabled, the namespace will ensure the underlying block device
supports P2P and is compatible with any specified P2P device. If no device
was specified it will ensure there is compatible P2P memory somewhere in
the system. Enabling a namespace with P2P memory will fail with EINVAL
(and an appropriate dmesg error) if any of these conditions are not met.
Once a controller is set up on a specific port, the P2P device to use for
each namespace will be found and stored in a radix tree by namespace ID.
When memory is allocated for a request, the tree is used to look up the P2P
device to allocate memory against. If no device is in the tree (because no
appropriate device was found), or if allocation of P2P memory fails, fall
back to using regular memory.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
[hch: partial rewrite of the initial code]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Add helpers to allocate and free the SGL in a struct nvmet_req:
int nvmet_req_alloc_sgl(struct nvmet_req *req)
void nvmet_req_free_sgl(struct nvmet_req *req)
This will be expanded in a future patch to implement peer-to-peer memory
DMAs and should be common with all target drivers.
The new helpers are used in nvmet-rdma. Seeing we use req.transfer_len as
the length of the SGL it is set earlier and cleared on any error. It also
seems to be unnecessary to accumulate the length as the map_sgl functions
should only ever be called once per request.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
|
|
For P2P requests, we must use the pci_p2pmem_map_sg() function instead of
the dma_map_sg functions.
With that, we can then indicate PCI_P2P support in the request queue. For
this, we create an NVME_F_PCI_P2P flag which tells the core to set
QUEUE_FLAG_PCI_P2P in the request queue.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
|
|
Register the CMB buffer as p2pmem and use the appropriate allocation
functions to create and destroy the IO submission queues.
If the CMB supports WDS and RDS, publish it for use as P2P memory by other
devices.
Kernels without CONFIG_PCI_P2PDMA will also no longer support NVMe CMB.
However, seeing the main use-cases for the CMB is P2P operations, this
seems like a reasonable dependency.
We drop the __iomem safety on the buffer seeing that, by convention, it's
safe to directly access memory mapped by memremap()/devm_memremap_pages().
Architectures where this is not safe will not be supported by memremap()
and therefore will not support PCI P2P and have no support for CMB.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
In order to use PCI P2P memory the pci_p2pmem_map_sg() function must be
called to map the correct PCI bus address.
To do this, check the first page in the scatter list to see if it is P2P
memory or not. At the moment, scatter lists that contain P2P memory must
be homogeneous so if the first page is P2P the entire SGL should be P2P.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
|
Add QUEUE_FLAG_PCI_P2P, meaning a driver's request queue supports targeting
P2P memory. This will be used by P2P providers and orchestrators (in
subsequent patches) to ensure block devices can support P2P memory before
submitting P2P-backed pages to submit_bio().
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a restructured text file describing how to write drivers with support
for P2P DMA transactions. The document describes how to use the APIs that
were added in the previous few commits.
Also adds an index for the PCI documentation tree even though this is the
only PCI document that has been converted to restructured text at this
time.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
|
|
Add a new directory in the driver API guide for PCI-specific documentation.
This is in preparation for adding a new PCI P2P DMA driver writers guide
which will go in this directory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Sanyog Kale <sanyog.r.kale@intel.com>
Cc: Sagar Dharia <sdharia@codeaurora.org>
|
|
Users of the P2PDMA infrastructure will typically need a way for the user
to tell the kernel to use P2P resources. Typically this will be a simple
on/off boolean operation but sometimes it may be desirable for the user to
specify the exact device to use for the P2P operation.
Add new helpers for attributes which take a boolean or a PCI device. Any
boolean as accepted by strtobool() turn P2P on or off (such as 'y', 'n',
'1', '0', etc). Specifying a full PCI device name/BDF will select the
specific device.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
The DMA address used when mapping PCI P2P memory must be the PCI bus
address. Thus, introduce pci_p2pmem_map_sg() to map the correct addresses
when using P2P memory.
Memory mapped in this way does not need to be unmapped and thus if we
provided pci_p2pmem_unmap_sg() it would be empty. This breaks the expected
balance between map/unmap but was left out as an empty function doesn't
really provide any benefit. In the future, if this call becomes necessary
it can be added without much difficulty.
For this, we assume that an SGL passed to these functions contain all P2P
memory or no P2P memory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Add a sysfs group to display statistics about P2P memory that is registered
in each PCI device.
Attributes in the group display the total amount of P2P memory, the amount
available and whether it is published or not.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
With live migration support and finally a good solution for exception
event injection, nested VMX should be ready for having a stable userspace
ABI. The results of syzkaller fuzzing are not perfect but not horrible
either (and might be partially due to running on GCE, so that effectively
we're testing three-level nesting on a fork of upstream KVM!). Enabling
it by default seems like a nice way to conclude the 4.20 pull request. :)
Unfortunately, enabling nested SVM in 2009 (commit 4b6e4dca701) was a
bit premature. However, until live migration support is in place we can
reasonably expect that it does not offer much in terms of ABI guarantees.
Therefore we are still in time to break things and conform as much as
possible to the interface used for VMX.
Suggested-by: Jim Mattson <jmattson@google.com>
Suggested-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Celebrated-by: Liran Alon <liran.alon@oracle.com>
Celebrated-by: Wanpeng Li <kernellwp@gmail.com>
Celebrated-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
x86_64 zero-extends 32bit xor operation to a full 64bit register.
Also add a comment and remove unnecessary instruction suffix in vmx.c
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is a per-VM capability which can be enabled by userspace so that
the faulting linear address will be included with the information
about a pending #PF in L2, and the "new DR6 bits" will be included
with the information about a pending #DB in L2. With this capability
enabled, the L1 hypervisor can now intercept #PF before CR2 is
modified. Under VMX, the L1 hypervisor can now intercept #DB before
DR6 and DR7 are modified.
When userspace has enabled KVM_CAP_EXCEPTION_PAYLOAD, it should
generally provide an appropriate payload when injecting a #PF or #DB
exception via KVM_SET_VCPU_EVENTS. However, to support restoring old
checkpoints, this payload is not required.
Note that bit 16 of the "new DR6 bits" is set to indicate that a debug
exception (#DB) or a breakpoint exception (#BP) occurred inside an RTM
region while advanced debugging of RTM transactional regions was
enabled. This is the reverse of DR6.RTM, which is cleared in this
scenario.
This capability also enables exception.pending in struct
kvm_vcpu_events, which allows userspace to distinguish between pending
and injected exceptions.
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When exception payloads are enabled by userspace (which is not yet
possible) and a #DB is raised in L2, defer the setting of DR6 until
later. Under VMX, this allows the L1 hypervisor to intercept the fault
before DR6 is modified. Under SVM, DR6 is modified before L1 can
intercept the fault (as has always been the case with DR7).
Note that the payload associated with a #DB exception includes only
the "new DR6 bits." When the payload is delievered, DR6.B0-B3 will be
cleared and DR6.RTM will be set prior to merging in the new DR6 bits.
Also note that bit 16 in the "new DR6 bits" is set to indicate that a
debug exception (#DB) or a breakpoint exception (#BP) occurred inside
an RTM region while advanced debugging of RTM transactional regions
was enabled. Though the reverse of DR6.RTM, this makes the #DB payload
field compatible with both the pending debug exceptions field under
VMX and the exit qualification for #DB exceptions under VMX.
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When exception payloads are enabled by userspace (which is not yet
possible) and a #PF is raised in L2, defer the setting of CR2 until
the #PF is delivered. This allows the L1 hypervisor to intercept the
fault before CR2 is modified.
For backwards compatibility, when exception payloads are not enabled
by userspace, kvm_multiple_exception modifies CR2 when the #PF
exception is raised.
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
kvm_multiple_exception now takes two additional operands: has_payload
and payload, so that updates to CR2 (and DR6 under VMX) can be delayed
until the exception is delivered. This is necessary to properly
emulate VMX or SVM hardware behavior for nested virtualization.
The new behavior is triggered by
vcpu->kvm->arch.exception_payload_enabled, which will (later) be set
by a new per-VM capability, KVM_CAP_EXCEPTION_PAYLOAD.
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The per-VM capability KVM_CAP_EXCEPTION_PAYLOAD (to be introduced in a
later commit) adds the following fields to struct kvm_vcpu_events:
exception_has_payload, exception_payload, and exception.pending.
With this capability set, all of the details of vcpu->arch.exception,
including the payload for a pending exception, are reported to
userspace in response to KVM_GET_VCPU_EVENTS.
With this capability clear, the original ABI is preserved, and the
exception.injected field is set for either pending or injected
exceptions.
When userspace calls KVM_SET_VCPU_EVENTS with
KVM_CAP_EXCEPTION_PAYLOAD clear, exception.injected is no longer
translated to exception.pending. KVM_SET_VCPU_EVENTS can now only
establish a pending exception when KVM_CAP_EXCEPTION_PAYLOAD is set.
Reported-by: Jim Mattson <jmattson@google.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Separate registration out of of_sama5d4_clk_h32mx_setup to allow other code
to use it.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
As the SSCs don't have gclk, don't check for their ID to allow them to set
the audio pll rate.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The allocation for the audio pmc is using the size of struct clk_audio_pad
instead of struct clk_audio_pmc. This works fine because the former is
larger than the latter but it is safer to be correct.
Fixes: ("0865805d82d4 clk: at91: add audio pll clock drivers")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
A typo that makes it impossible to get the correct clocks for
MMP2_CLK_SDH2 and MMP2_CLK_SDH3.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Fixes: 1ec770d92a62 ("clk: mmp: add mmp2 DT support for clock driver")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Extended atomic operations cmp&swp and fetch&add is a Mellanox
feature extending the standard atomic operation to use, varied
operand sizes, as apposed to normal atomic operation that use
an 8 byte operand only.
Extended atomics allows masking the results and arguments.
This patch configures QP to support extended atomic operation
with the maximum size possible, as exposed by HCA capabilities.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When writing out a block group free space cache we can end deadlocking
with ourselves on an extent buffer lock resulting in a warning like the
following:
[245043.379979] WARNING: CPU: 4 PID: 2608 at fs/btrfs/locking.c:251 btrfs_tree_lock+0x1be/0x1d0 [btrfs]
[245043.392792] CPU: 4 PID: 2608 Comm: btrfs-transacti Tainted: G
W I 4.16.8 #1
[245043.395489] RIP: 0010:btrfs_tree_lock+0x1be/0x1d0 [btrfs]
[245043.396791] RSP: 0018:ffffc9000424b840 EFLAGS: 00010246
[245043.398093] RAX: 0000000000000a30 RBX: ffff8807e20a3d20 RCX: 0000000000000001
[245043.399414] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff8807e20a3d20
[245043.400732] RBP: 0000000000000001 R08: ffff88041f39a700 R09: ffff880000000000
[245043.402021] R10: 0000000000000040 R11: ffff8807e20a3d20 R12: ffff8807cb220630
[245043.403296] R13: 0000000000000001 R14: ffff8807cb220628 R15: ffff88041fbdf000
[245043.404780] FS: 0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
[245043.406050] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[245043.407321] CR2: 00007fffdbdb9f10 CR3: 0000000001c09005 CR4: 00000000000206e0
[245043.408670] Call Trace:
[245043.409977] btrfs_search_slot+0x761/0xa60 [btrfs]
[245043.411278] btrfs_insert_empty_items+0x62/0xb0 [btrfs]
[245043.412572] btrfs_insert_item+0x5b/0xc0 [btrfs]
[245043.413922] btrfs_create_pending_block_groups+0xfb/0x1e0 [btrfs]
[245043.415216] do_chunk_alloc+0x1e5/0x2a0 [btrfs]
[245043.416487] find_free_extent+0xcd0/0xf60 [btrfs]
[245043.417813] btrfs_reserve_extent+0x96/0x1e0 [btrfs]
[245043.419105] btrfs_alloc_tree_block+0xfb/0x4a0 [btrfs]
[245043.420378] __btrfs_cow_block+0x127/0x550 [btrfs]
[245043.421652] btrfs_cow_block+0xee/0x190 [btrfs]
[245043.422979] btrfs_search_slot+0x227/0xa60 [btrfs]
[245043.424279] ? btrfs_update_inode_item+0x59/0x100 [btrfs]
[245043.425538] ? iput+0x72/0x1e0
[245043.426798] write_one_cache_group.isra.49+0x20/0x90 [btrfs]
[245043.428131] btrfs_start_dirty_block_groups+0x102/0x420 [btrfs]
[245043.429419] btrfs_commit_transaction+0x11b/0x880 [btrfs]
[245043.430712] ? start_transaction+0x8e/0x410 [btrfs]
[245043.432006] transaction_kthread+0x184/0x1a0 [btrfs]
[245043.433341] kthread+0xf0/0x130
[245043.434628] ? btrfs_cleanup_transaction+0x4e0/0x4e0 [btrfs]
[245043.435928] ? kthread_create_worker_on_cpu+0x40/0x40
[245043.437236] ret_from_fork+0x1f/0x30
[245043.441054] ---[ end trace 15abaa2aaf36827f ]---
This is because at write_one_cache_group() when we are COWing a leaf from
the extent tree we end up allocating a new block group (chunk) and,
because we have hit a threshold on the number of bytes reserved for system
chunks, we attempt to finalize the creation of new block groups from the
current transaction, by calling btrfs_create_pending_block_groups().
However here we also need to modify the extent tree in order to insert
a block group item, and if the location for this new block group item
happens to be in the same leaf that we were COWing earlier, we deadlock
since btrfs_search_slot() tries to write lock the extent buffer that we
locked before at write_one_cache_group().
We have already hit similar cases in the past and commit d9a0540a79f8
("Btrfs: fix deadlock when finalizing block group creation") fixed some
of those cases by delaying the creation of pending block groups at the
known specific spots that could lead to a deadlock. This change reworks
that commit to be more generic so that we don't have to add similar logic
to every possible path that can lead to a deadlock. This is done by
making __btrfs_cow_block() disallowing the creation of new block groups
(setting the transaction's can_flush_pending_bgs to false) before it
attempts to allocate a new extent buffer for either the extent, chunk or
device trees, since those are the trees that pending block creation
modifies. Once the new extent buffer is allocated, it allows creation of
pending block groups to happen again.
This change depends on a recent patch from Josef which is not yet in
Linus' tree, named "btrfs: make sure we create all new block groups" in
order to avoid occasional warnings at btrfs_trans_release_chunk_metadata().
Fixes: d9a0540a79f8 ("Btrfs: fix deadlock when finalizing block group creation")
CC: stable@vger.kernel.org # 4.4+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199753
Link: https://lore.kernel.org/linux-btrfs/CAJtFHUTHna09ST-_EEiyWmDH6gAqS6wa=zMNMBsifj8ABu99cw@mail.gmail.com/
Reported-by: E V <eliventer@gmail.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Pull single NVMe fix from Christoph.
* 'nvme-4.19' of git://git.infradead.org/nvme:
nvme: remove ns sibling before clearing path
|
|
When using the NO_HOLES feature and logging a regular file, we were
expecting that if we find an inline extent, that either its size in RAM
(uncompressed and unenconded) matches the size of the file or if it does
not, that it matches the sector size and it represents compressed data.
This assertion does not cover a case where the length of the inline extent
is smaller than the sector size and also smaller the file's size, such
case is possible through fallocate. Example:
$ mkfs.btrfs -f -O no-holes /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0xb60 0 21" /mnt/foobar
$ xfs_io -c "falloc 40 40" /mnt/foobar
$ xfs_io -c "fsync" /mnt/foobar
In the above example we trigger the assertion because the inline extent's
length is 21 bytes while the file size is 80 bytes. The fallocate() call
merely updated the file's size and did not touch the existing inline
extent, as expected.
So fix this by adjusting the assertion so that an inline extent length
smaller than the file size is valid if the file size is smaller than the
filesystem's sector size.
A test case for fstests follows soon.
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Fixes: a89ca6f24ffe ("Btrfs: fix fsync after truncate when no_holes feature is enabled")
CC: stable@vger.kernel.org # 4.14+
Link: https://lore.kernel.org/linux-btrfs/CAE5jQCfRSBC7n4pUTFJcmHh109=gwyT9mFkCOL+NKfzswmR=_Q@mail.gmail.com/
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When add_port() is done for port == 0, it indicates that ports hardware
counters initialization should be skipped. Reflect so in the comment.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ib_register_device() does several allocation and initialization
steps. Split it into smaller more readable functions for easy
review and maintenance.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
If port pkey list initialization fails, free the port_immutable memory
during cleanup path. Currently it is missed out.
If cache setup fails, free the pkey list during cleanup path.
Fixes: d291f1a65 ("IB/core: Enforce PKey security on QPs")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The WARN_ON() is pointless as the rport is placed in SDEV_TRANSPORT_OFFLINE
at that time, so no new commands can be submitted via srp_queuecommand()
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
At inode.c:compress_file_range(), under the "free_pages_out" label, we can
end up dereferencing the "pages" pointer when it has a NULL value. This
case happens when "start" has a value of 0 and we fail to allocate memory
for the "pages" pointer. When that happens we jump to the "cont" label and
then enter the "if (start == 0)" branch where we immediately call the
cow_file_range_inline() function. If that function returns 0 (success
creating an inline extent) or an error (like -ENOMEM for example) we jump
to the "free_pages_out" label and then access "pages[i]" leading to a NULL
pointer dereference, since "nr_pages" has a value greater than zero at
that point.
Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
the "pages" pointer.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
Fixes: 771ed689d2cd ("Btrfs: Optimize compressed writeback and reads")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The DAC960 driver has been obsoleted by the myrb/myrs drivers,
so it can be dropped.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
i.MX7D uses virtual cpu clock of "arm" clock to be child clock
of "arm_a7_root_clk" and it is with CLK_IS_CRITICAL flag set, so
no need to add CLK_IS_CRITICAL flag for keeping "arm_a7_root_clk"
use count correct, latest clock tree is as below in clk_summary:
pll_arm_main 1 1 0 792000000 0
pll_arm_main_bypass 1 1 0 792000000 0
pll_arm_main_clk 1 1 0 792000000 0
arm_a7_src 1 1 0 792000000 0
arm_a7_cg 1 1 0 792000000 0
arm_a7_div 1 1 0 792000000 0
arm_a7_root_clk 1 1 0 792000000 0
arm 1 1 0 792000000
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Add CLK_IS_CRITICAL flag for cpu clock type to
make cpu clock use count correct, as cpu clock
should be always critical.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Requester scatter to CQE is restricted to QPs configured to signal
all WRs.
This patch adds ability to enable scatter to cqe (force enable)
in the requester without sig_all, for users who do not want all WRs
signaled but rather just the ones whose data found in the CQE.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Flags sent down from user might not be supported by
running driver.
This might lead to unwanted bugs.
To solve this, added macro to test for unsupported flags.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Scatter to CQE is a HW offload that saves PCI writes by scattering the
payload to the CQE.
This patch extends already existing functionality to support DC
transport type.
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Pick up changes to mlx5_ifc.h needed for direct Scatter to CQE support
series to come next.
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch adds the necessary code to patch a running kernel at runtime
to improve performance.
The current implementation offers a few optimizations variants:
- When running a SMP kernel on a single UP processor, unwanted assembler
statements like locking functions are overwritten with NOPs. When
multiple instructions shall be skipped, one branch instruction is used
instead of multiple nop instructions.
- In the UP case, some pdtlb and pitlb instructions are patched to
become pdtlb,l and pitlb,l which only flushes the CPU-local tlb
entries instead of broadcasting the flush to other CPUs in the system
and thus may improve performance.
- fic and fdc instructions are skipped if no I- or D-caches are
installed. This should speed up qemu emulation and cacheless systems.
- If no cache coherence is needed for IO operations, the relevant fdc
and sync instructions in the sba and ccio drivers are replaced by
nops.
- On systems which share I- and D-TLBs and thus don't have a seperate
instruction TLB, the pitlb instruction is replaced by a nop.
Live-patching is done early in the boot process, just after having run
the system inventory. No drivers are running and thus no external
interrupts should arrive. So the hope is that no TLB exceptions will
occur during the patching. If this turns out to be wrong we will
probably need to do the patching in real-mode.
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Implement remove() callback function for the Mediatek PCIe controller
driver to add loadable kernel module support.
Save the PCIe's GIC IRQ at probe so that it can be retrieved to
call dispose_irq() to tear down the IRQ upon module removal.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
|
|
In order to reduce the PCIe power consumption in system suspend,
the PCI bus physical layer should be gated. On system resume, the PCIe
link should be re-established and the related control register values
should be restored.
Define suspend_noirq & resume_noirq callback functions to implement
PM system syspend hooks for the PCI host controller.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
|
|
Commit 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and
MT7622") added MSI support but enabled MSI in the wrong place, at a step
in the probe sequence where clocks were not still enabled.
Fix this issue by calling mtk_pcie_enable_msi() in mtk_pcie_startup_port_v2()
since clocks are enabled when mtk_pcie_startup_port_v2() is called.
To avoid forward declaration of mtk_pcie_enable_msi(), move the
mtk_pcie_startup_port_v2() function definition in the file.
Fixes: 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and MT7622")
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: squashed commit and adapted log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
|
|
Part of mtk_pcie_register_host() is an open-coded version of
pci_host_probe(). So instead of duplicating this code, use
pci_host_probe() directly and remove mtk_pcie_register_host().
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log changes]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
|
|
There is no need to check whether device have a PM domain attached before
calling the PM runtime methods. Remove it.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log changes]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
|
|
Fixes the signedness bug returning '(-22)' on the return type by removing the
sanity checker in rockchip_ddrclk_get_parent(). The function should return
and unsigned value only and it's safe to remove the sanity checker as the
core functions that call get_parent like clk_core_get_parent_by_index already
ensures the validity of the clk index returned (index >= core->num_parents).
Fixes: a4f182bf81f18 ("clk: rockchip: add new clock-type for the ddrclk")
Cc: stable@vger.kernel.org
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
The Acer One 10 uses a clamshell design with a detachable keyboard.
As such in normal operating mode, with the keyboard attach the device
is in landscape mode (and the Acer logo at boot also shows in landscape
mode).
But the device uses a portrait screen rotated 90 degrees (sigh). This
commit adds a quirk for this device so that we shown the fbcon the
right way up and that we hint userspace to also show e.g. plymouth and
gdm the right way up.
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181012101610.29100-1-hdegoede@redhat.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Helge writes:
"parisc fix:
Fix an unitialized variable usage in the parisc unwind code."
* 'parisc-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix uninitialized variable usage in unwind.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Stephen writes:
"clk fixes for v4.19-rc8
One fix for the Allwinner A10 SoC's audio PLL that wasn't properly
set and generating noise."
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi-ng: sun4i: Set VCO and PLL bias current to lowest setting
|