Age | Commit message (Collapse) | Author |
|
In order to use the helper macros that perform type mangling with the
EFI PCI I/O protocol struct typedefs, align their Linux typenames with
the convention we use for definitionns that originate in the UEFI spec,
and add the trailing _t to each.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-14-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Recognize the IA32/X64 Processor Error Section.
Do the section decoding in a new "cper-x86.c" file and add this to the
Makefile depending on a new "UEFI_CPER_X86" config option.
Print the Local APIC ID and CPUID info from the Processor Error Record.
The "Processor Error Info" and "Processor Context" fields will be
decoded in following patches.
Based on UEFI 2.7 Table 252. Processor Error Record.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-5-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Based on UEFI 2.7 Table 255. Processor Error Record, the "Local APIC_ID"
field is 8 bytes but Linux defines this field as 1 byte.
Fix this in the struct cper_sec_proc_ia definition.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-4-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
definition for mixed mode
Mixed mode allows a kernel built for x86_64 to interact with 32-bit
EFI firmware, but requires us to define all struct definitions carefully
when it comes to pointer sizes.
'struct efi_pci_io_protocol_32' currently uses a 'void *' for the
'romimage' field, which will be interpreted as a 64-bit field
on such kernels, potentially resulting in bogus memory references
and subsequent crashes.
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-13-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Needed by ext4 to test frozen fs before updating s_last_mounted.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/pti updates from Thomas Gleixner:
"A mixed bag of fixes and updates for the ghosts which are hunting us.
The scheduler fixes have been pulled into that branch to avoid
conflicts.
- A set of fixes to address a khread_parkme() race which caused lost
wakeups and loss of state.
- A deadlock fix for stop_machine() solved by moving the wakeups
outside of the stopper_lock held region.
- A set of Spectre V1 array access restrictions. The possible
problematic spots were discuvered by Dan Carpenters new checks in
smatch.
- Removal of an unused file which was forgotten when the rest of that
functionality was removed"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Remove unused file
perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr
perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver
perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map()
perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_*
perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[]
sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]
sched/core: Introduce set_special_state()
kthread, sched/wait: Fix kthread_parkme() completion issue
kthread, sched/wait: Fix kthread_parkme() wait-loop
sched/fair: Fix the update of blocked load when newly idle
stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
|
|
GICv3 offers the possibility to signal SPIs using a pair of doorbells
(SETPI, CLRSPI) under the name of Message Based Interrupts (MBI).
They can be used as either traditional (edge) MSIs, or the more exotic
level-triggered flavour.
Let's implement support for platform MSI, which is the original intent
for this feature.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-8-marc.zyngier@arm.com
|
|
At the beginning of times, irq_find_host() was simple. Each device node
implemented at most one irq domain, and we were happy. Over time, things
have become more complex, and we now have nodes implementing a plurality
of domains, tagged by "bus_token".
Crutially, users of irq_find_host() all expect the most basic domain
to be returned, and not any other domain such as a bus-specific MSI
domain.
So let's change irq_find_host() to first look for a DOMAIN_BUS_WIRED
domain, and only if this fails fallback to DOMAIN_BUS_ANY. Note that
this is consistent with what irq_create_fwspec_mapping is already
doing, see 530cbe100ef7 ("irqdomain: Allow domain lookup with
DOMAIN_BUS_WIRED token").
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-6-marc.zyngier@arm.com
|
|
Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected
results in the following splat:
In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0:
./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’
static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
^~~~~~~~~~
./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration
static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
^~~~~~~~~
scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed
Fix it by including linux/types.h.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com
|
|
Nobody would be insane enough to try and use level triggered
MSIs on PCI, but let's make sure it doesn't happen. Also,
let's mandate that the irqchip backing the platform MSI domain
is providing the IRQCHIP_SUPPORTS_LEVEL_MSI flag.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-3-marc.zyngier@arm.com
|
|
So far, MSIs have been used to signal edge-triggered interrupts, as
a write is a good model for an edge (you can't "unwrite" something).
On the other hand, routing zillions of wires in an SoC because you
need level interrupts is a bit extreme.
People have come up with a variety of schemes to support this, which
involves sending two messages: one to signal the interrupt, and one
to clear it. Since the kernel cannot represent this, we've ended up
with side-band mechanisms that are pretty awful.
Instead, let's acknoledge the requirement, and ensure that, under the
right circumstances, the irq_compose_msg and irq_write_msg can take
as a parameter an array of two messages instead of a pointer to a
single one. We also add some checking that the compose method only
clobbers the second message if the MSI domain has been created with
the MSI_FLAG_LEVEL_CAPABLE flags.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next
Vinod writes:
soundwire streaming
This contains:
- Support for SoundWire Streaming
- Documentation updates for streaming
- Cadence and Intel driver updates for streaming
- ASoC API for programming soundwire stream
|
|
In commit e7ff3a47630d (x86/amd: Check for the C1E bug post ACPI subsystem
init) a new function arch_post_acpi_subsys_init() was introduced. This
weak function can potentially be overridden on a per arch basis, introduce
the prototype for clarity.
Silence the following gcc warning (W=1):
init/main.c:484:20: warning: no previous prototype for ‘arch_post_acpi_subsys_init’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Remove redunadant description of label in struct mlxreg_hotplug_device.
Change location of access_mode in struct mlxreg_hotplug_device.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
|
|
Move the tsl2772 driver out of staging and into mainline.
Signed-off-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Merge misc fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
rbtree: include rcu.h
scripts/faddr2line: fix error when addr2line output contains discriminator
ocfs2: take inode cluster lock before moving reflinked inode from orphan dir
mm, oom: fix concurrent munlock and oom reaper unmap, v3
mm: migrate: fix double call of radix_tree_replace_slot()
proc/kcore: don't bounds check against address 0
mm: don't show nr_indirectly_reclaimable in /proc/vmstat
mm: sections are not offlined during memory hotremove
z3fold: fix reclaim lock-ups
init: fix false positives in W+X checking
lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit()
KASAN: prohibit KASAN+STRUCTLEAK combination
MAINTAINERS: update Shuah's email address
|
|
The bpf syscall and selftests conflicts were trivial
overlapping changes.
The r8169 change involved moving the added mdelay from 'net' into a
different function.
A TLS close bug fix overlapped with the splitting of the TLS state
into separate TX and RX parts. I just expanded the tests in the bug
fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf
== X".
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since commit c1adf20052d8 ("Introduce rb_replace_node_rcu()")
rbtree_augmented.h uses RCU related data structures but does not include
the header file. It works as long as it gets somehow included before
that and fails otherwise.
Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.
This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma. Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.
This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a
kernel oops.
Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap(). It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.
This issue fixes CVE-2018-1000200.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull networking fixes from David Miller:
1) Verify lengths of keys provided by the user is AF_KEY, from Kevin
Easton.
2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka.
3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R.
Silva.
4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most
grateful for this fix.
5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we
do appreciate.
6) Use after free in TLS code. Once again we are blessed by the
honorable Eric Dumazet with this fix.
7) Fix regression in TLS code causing stalls on partial TLS records.
This fix is bestowed upon us by Andrew Tomt.
8) Deal with too small MTUs properly in LLC code, another great gift
from Eric Dumazet.
9) Handle cached route flushing properly wrt. MTU locking in ipv4, to
Hangbin Liu we give thanks for this.
10) Fix regression in SO_BINDTODEVIC handling wrt. UDP socket demux.
Paolo Abeni, he gave us this.
11) Range check coalescing parameters in mlx4 driver, thank you Moshe
Shemesh.
12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother
David Howells.
13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens,
you're the best!
14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata
Benerjee saved us!
15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship
is now water tight, thanks to Andrey Ignatov.
16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes
everywhere!
17) Fix error path in tcf_proto_create, Jiri Pirko what would we do
without you!
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
net sched actions: fix refcnt leak in skbmod
net: sched: fix error path in tcf_proto_create() when modules are not configured
net sched actions: fix invalid pointer dereferencing if skbedit flags missing
ixgbe: fix memory leak on ipsec allocation
ixgbevf: fix ixgbevf_xmit_frame()'s return type
ixgbe: return error on unsupported SFP module when resetting
ice: Set rq_last_status when cleaning rq
ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()'
bonding: send learning packets for vlans on slave
bonding: do not allow rlb updates to invalid mac
net/mlx5e: Err if asked to offload TC match on frag being first
net/mlx5: E-Switch, Include VF RDMA stats in vport statistics
net/mlx5: Free IRQs in shutdown path
rxrpc: Trace UDP transmission failure
rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages
rxrpc: Fix the min security level for kernel calls
rxrpc: Fix error reception on AF_INET6 sockets
rxrpc: Fix missing start of call timeout
qed: fix spelling mistake: "taskelt" -> "tasklet"
...
|
|
Bump the internal tag to 32, instead of stealing the last tag in
our regular command space. This works just fine, since we don't
actually need a separate hardware tag for this. Internal commands
cannot coexist with NCQ commands.
As a bonus, we get rid of the special casing of what tag to use
for the internal command.
This is in preparation for utilizing all 32 commands for normal IO.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Some check for the value directly, use the provided helper instead.
Also make it return a bool, since that's what it does.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
This is in preparation for allowing full usage of the tag space,
which means that our reserved error handling command will be
using an internal tag value of 32. This doesn't fit in a u32, so
move to a u64.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Rigth now these are the same, but drivers should be using ->hw_tag
for their command setup and issue.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Clean up: Eliminate a structure that is no longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
While sending each RPC Reply, svc_rdma_sendto allocates and DMA-
maps a separate buffer where the RPC/RDMA transport header is
constructed. The buffer is unmapped and released in the Send
completion handler. This is significant per-RPC overhead,
especially for small RPCs.
Instead, allocate and DMA-map a buffer, and cache it in each
svc_rdma_send_ctxt. This buffer and its mapping can be re-used
for each RPC, saving the cost of memory allocation and DMA
mapping.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Clean up: Now that the send_wr is part of the svc_rdma_send_ctxt,
svc_rdma_post_send_wr is nearly empty.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Receive buffers are always the same size, but each Send WR has a
variable number of SGEs, based on the contents of the xdr_buf being
sent.
While assembling a Send WR, keep track of the number of SGEs so that
we don't exceed the device's maximum, or walk off the end of the
Send SGE array.
For now the Send path just fails if it exceeds the maximum.
The current logic in svc_rdma_accept bases the maximum number of
Send SGEs on the largest NFS request that can be sent or received.
In the transport layer, the limit is actually based on the
capabilities of the underlying device, not on properties of the
Upper Layer Protocol.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt
free list. This eliminates the overhead of calling kmalloc / kfree,
both of which grab a globally shared lock that disables interrupts.
Introduce a replacement to svc_rdma_op_ctxt's that is built
especially for the svcrdma Send path.
Subsequent patches will take advantage of this new structure by
allocating real resources which are then cached in these objects.
The allocations are freed when the transport is torn down.
I've renamed the structure so that static type checking can be used
to ensure that uses of op_ctxt and send_ctxt are not confused. As an
additional clean up, structure fields are renamed to conform with
kernel coding conventions.
Additional clean ups:
- Handle svc_rdma_send_ctxt_get allocation failure at each call
site, rather than pre-allocating and hoping we guessed correctly
- All send_ctxt_put call-sites request page freeing, so remove
the @free_pages argument
- All send_ctxt_put call-sites unmap SGEs, so fold that into
svc_rdma_send_ctxt_put
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Clean up: Since there's already a svc_rdma_op_ctxt being passed
around with the running count of mapped SGEs, drop unneeded
parameters to svc_rdma_post_send_wr().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Clean up: svc_rdma_dma_map_buf does mostly the same thing as
svc_rdma_dma_map_page, so let's fold these together.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
There is a significant latency penalty when processing an ingress
Receive if the Receive buffer resides in memory that is not on the
same NUMA node as the the CPU handling completions for a CQ.
The system administrator and the device driver determine which CPU
handles completions. This CPU does not change during life of the CQ.
Further the Upper Layer does not have any visibility of which CPU it
is.
Allocating Receive buffers in the Receive completion handler
guarantees that Receive buffers are allocated on the preferred NUMA
node for that CQ.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
The current Receive path uses an array of pages which are allocated
and DMA mapped when each Receive WR is posted, and then handed off
to the upper layer in rqstp::rq_arg. The page flip releases unused
pages in the rq_pages pagelist. This mechanism introduces a
significant amount of overhead.
So instead, kmalloc the Receive buffer, and leave it DMA-mapped
while the transport remains connected. This confers a number of
benefits:
* Each Receive WR requires only one receive SGE, no matter how large
the inline threshold is. This helps the server-side NFS/RDMA
transport operate on less capable RDMA devices.
* The Receive buffer is left allocated and mapped all the time. This
relieves svc_rdma_post_recv from the overhead of allocating and
DMA-mapping a fresh buffer.
* svc_rdma_wc_receive no longer has to DMA unmap the Receive buffer.
It has to DMA sync only the number of bytes that were received.
* svc_rdma_build_arg_xdr no longer has to free a page in rq_pages
for each page in the Receive buffer, making it a constant-time
function.
* The Receive buffer is now plugged directly into the rq_arg's
head[0].iov_vec, and can be larger than a page without spilling
over into rq_arg's page list. This enables simplification of
the RDMA Read path in subsequent patches.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Currently svc_rdma_recv_ctxt_put's callers have to know whether they
want to free the ctxt's pages or not. This means the human
developers have to know when and why to set that free_pages
argument.
Instead, the ctxt should carry that information with it so that
svc_rdma_recv_ctxt_put does the right thing no matter who is
calling.
We want to keep track of the number of pages in the Receive buffer
separately from the number of pages pulled over by RDMA Read. This
is so that the correct number of pages can be freed properly and
that number is well-documented.
So now, rc_hdr_count is the number of pages consumed by head[0]
(ie., the page index where the Read chunk should start); and
rc_page_count is always the number of pages that need to be released
when the ctxt is put.
The @free_pages argument is no longer needed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Clean up: No need to retain rq_depth in struct svcrdma_xprt, it is
used only in svc_rdma_accept().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
svc_rdma_op_ctxt's are pre-allocated and maintained on a per-xprt
free list. This eliminates the overhead of calling kmalloc / kfree,
both of which grab a globally shared lock that disables interrupts.
To reduce contention further, separate the use of these objects in
the Receive and Send paths in svcrdma.
Subsequent patches will take advantage of this separation by
allocating real resources which are then cached in these objects.
The allocations are freed when the transport is torn down.
I've renamed the structure so that static type checking can be used
to ensure that uses of op_ctxt and recv_ctxt are not confused. As an
additional clean up, structure fields are renamed to conform with
kernel coding conventions.
As a final clean up, helpers related to recv_ctxt are moved closer
to the functions that use them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.
Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.
Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall <hubcap@omnibond.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add DAI registration and DAI ops for the Intel driver along with
callback for topology configuration.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add APIs for prepare, enable, disable and de-prepare stream.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
SoundWire supports two registers banks. So, program the alternate bank
with new configuration and then performs bank switch.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add helpers to configure, prepare, enable, disable and
de-prepare ports.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Master and Slave port registers need to be programmed for each port
used in a stream. Add the helpers for port register programming.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add Soundwire port data structures and APIS for initialization
and release of ports.
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
This patch adds APIs and relevant stream data structures
for initialization and release of stream.
Signed-off-by: Hardik T Shah <hardik.t.shah@intel.com>
Signed-off-by: Sanyog Kale <sanyog.r.kale@intel.com>
Signed-off-by: Shreyas NC <shreyas.nc@intel.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
drm-misc-next is still based on v4.16-rc7, and was getting a bit stale.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
Our virtual machines make use of device assignment by configuring
12 NVMe disks for high I/O performance. Each NVMe device has 129
MSI-X Table entries:
Capabilities: [50] MSI-X: Enable+ Count=129 Masked-Vector table: BAR=0 offset=00002000
The windows virtual machines fail to boot since they will map the number of
MSI-table entries that the NVMe hardware reported to the bus to msi routing
table, this will exceed the 1024. This patch extends MAX_IRQ_ROUTES to 4096
for all archs, in the future this might be extended again if needed.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Tonny Lu <tonnylu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
1st round of IIO new device support, features and cleanup for the 4.18 cycle
A nice mix this time of excellent cleanups (many to send drivers
speeding toward staging graduations) and new drivers / device support.
A good part of this is Brian Masney's never ending task on the tsl2x7x
driver. The end is in sight so hopefully we'll get that one out of
staging very soon!
New device support
* AD5686
- Support AD5685R (was wrongly present as AD5685)
- Support AD5672R, AD5676, AD5676, AD5684R and AD5686R 4 and 8 channel
SPI DACs with various precisions.
- Support AD5671R, AD5675R, AD5694, AD5694R, AD5695R, AD5696 and AD5696R
I2C DACs with various percisions and numbers of channels.
* Analog front end rescale driver - New driver.
- Support current sensing usings a shunt resistor.
- Support simple voltage dividers.
- support simple current sense amplifiers.
* TI dac5571
- New driver and device bindings supporting:
dac5571, dac6571, dac7571, dac5574, dac6574, dac7574,
dac5573, dac6573 and dac7573
* Meson-adc
- Support for Meson AXG with DT bindings.
* mpu6050
- Support the mpu9255 which only requires additional WHOAMI entry and
compatible string.
* st_lsm6dsx
- Support for lsm330dlc combinded accelerometer and gyro sensors with
DT bindings.
* stm32_adc
- Add support for STM32MP1 with bindings.
Staging graduations
* adis16201 after some excelent cleanup by Himanshu Jha.
* adis16029 after some excelent cleanup by Shreeya Patel.
New features:
* ABI docs
- Add core ABI docs for angle channels.
* inv_mpu6050
- Provide support for the full range of interrupts the device
supports.
* st_accel
- Add SMO8840 ACPI ID seen in the wild on some Lenovo machines.
* stx104
- Provide a multiple gpio get function.
Cleanups / Minor fixes
* core
- Use new nested structure support to improve kernel-doc.
* ad2s1200
- Use be16_to_cpup instead of opencoding.
* ad5686
- Indentation tidy up.
- Switch to SPDX
- Refactor to allow various numbers of channels.
- Refactor to separate core and SPI specific support, prior to
addition of i2c equivalent devices.
* ad7606
- Use drvdata directly from device rather than boucing via the
platform_device structure.
* ad7746
- Replace opencoded byte swapped i2c calls with _swapped variants.
- White space and line break readability improvements.
- Reorder includes and variable declarations where appropriate.
* ad7791
- Changes to the AD ADC library used by this driver took in the
sampling frequency. This lead to be the wrong path being the one
tied to the resulting attribute, so it didn't work, and a warning
to be printed.
* ad7780
- Remove apparent support for sampling frequency control on devices
that don't support changing the sampling attributes.
* ade7854
- Fix a read of the wrong number of bits.
- Improve error handling on i2c read/write errors.
- Rework i2c and spi code to reduce duplication.
* adis16201 (staging)
- Improve meaning inherent in some macro names by adding units etc
where relevant.
- Adjust comments to improve detail and drop the irrelevant.
- Rename register address definitions definitions to add a _REG
postfix, clearly separating them from field definitions. Reorganize
the definitions to group register address and fields.
- Use sign_extend32 rather than open coding.
- Reverse Xmas tree ordering where appropriate and align function args.
- Remove unused headers.
- Use GENMASK where appropriate instead of open coding.
* adis16209 (staging)
- Indent field definitions to visually separate them from
register address definitions.
- Use reverse xmas tree ordering where appropriate.
- Add some whitespace where it will help readability.
- Drop some unused headers.
- Use GENMASK where appropriate.
* ad2s1200
- Drop unnecessary includes and reorder alphabetically.
- Reverse xmas tree and blank line cleanups.
* atlas-ph-sensor
- Use msleep instead of usleep_range where the precise value doesn't
matter and the delays are long.
* bcm150
- Drop transaction splitting as core now handles it.
* cros_ec
- Move the shared header to the include/iio/common directory.
This brings it inline with the other multiple type devices.
- Use drvdata directly from device rather than boucing via the
platform_device structure.
* hid-sensors
- Use drvdata directly from device rather than boucing via the
platform_device structure.
* inv_mpu6050
- Clear out a second function definition for the same function.
- Don't flush fifo when the iio buffer is full but just drop excess
data.
- Tidy up set_power_itg and ensure it is used in the right places.
- Use set_power_itg rather than opencoding it again in the i2c mux
control.
- Make sure error paths disable the power if undoing power on.
- Used managed devm_ functions during probe. Delete remove function.
- Refactor to pull raw data read out of read_raw function.
- Simplify data reading error paths.
- Only enable the i2c mux for chips with the i2c aux bus (not icm20608)
- Fix a potential deadlock due to varying lock ordering.
- Fix an issue where first sample from gyro after enabling is unstable
by dropping the first sample.
- Fix an issue where the user_ctrl register is incorrectly overwritten.
- Tidy up some grammar and spelling minor issus.
* mcp320x
- Use vendor compatible strings.
* mcp4018
- Switch to using i2c .probe_new.
* mcp4351
- switch to using i2c .probe_new.
* meson-adc
- rework handing on common ADC platform data so it can be shared
across multiple families of SoCs.
* sca3000
- Fix an error handling path if the ring configure fails.
* st_lsm6dsx
- Fix a wrong fifo threshold mask (no actual effect)
* stm32-dfsdm
- Style fixes and cleanups.
- Check filter ID is in range and check spi-max-frequency.
* tsl2x7x (staging)
- Drop some unnecessary function calls, unused variables and
unnecessary local variables.
- Fix wrong interrupt type.
- Avoid unnecessary double clear of interrupt.
- Simplify proximity calibration call which did various things
unrelated to actually calibrating.
- Separate control of the proximity and ALS interrupts.
- Improve consistency of logging.
- Separate ALS and proximity persistence settings as they have
separate hardware controls.
- Tidy up variable ordering.
- Add Brian to copyright notice given consider work on this driver.
- Take advantage of hardware support for I2C address auto increment.
- Combine individuaal enable and period attributes for the two
directions on the threshold events into a single value as the
hardware doesn't separate them.
- Move integration_time* attributes from light channel to
intensity value as they effect the intensity readings directly
and the light reading only indirectly. Hence this better
reflects reality. Also move the calibscale_available.
- Avoid returning an error in the IRQ handler.
- Hard code the reg value in _clear_interrupts as it only takes
one value in the code. Result is the function has little
purpose so opencode the two remaining i2c_smbus_write_byte
calls.
- Drop some unnecessary checking of the chip status register.
- Tidy up return path in _write_interrupt_config.
- Tidy up the ID verification code.
- Move the power and diode settings defines into the header as these
are needed for platform data configuration.
- Various renames and comment cleanups for consistency and clarity.
- Use actual device defaults for default startup settings.
- SPDX
- Add some range sanity checking to sysfs attribute writes.
- Don't provide event interfaces if the interrupt line isn't available.
- Use IIO_CONST_ATTR macro for calibscale_available as it's a constant
string.
- Fix the integration time and lux equations.
- Make device IDs explicit index values in the device_channel_config array.
|
|
This patch adds support to APR bus (Asynchronous Packet Router) driver.
APR driver is made as a bus driver so that the apr devices can added removed
more dynamically depending on the state of the services on the dsp.
APR is used for communication between application processor and QDSP to
use services on QDSP like Audio and others.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-and-tested-by: Rohit kumar <rohitkr@codeaurora.org>
Acked-by: Andy Gross <andy.gross@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
This API has been replaced by the spi_mem_xx() one, its only user
(spi-nor) has been converted to spi_mem_xx() and all SPI controller
drivers that were implementing the ->spi_flash_xxx() hooks are also
implementing the spi_mem ones. So we can safely get rid of this API.
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@exceet.de>
Tested-by: Frieder Schrempf <frieder.schrempf@exceet.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
|