Age | Commit message (Collapse) | Author |
|
This patch is to remove the typedef sctp_datahdr_t, and replace with
struct sctp_datahdr in the places where it's using this typedef.
It is also to use izeof(variable) instead of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove this typedef, there is even no places using it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to remove the typedef sctp_param_t, and replace with
struct sctp_paramhdr in the places where it's using this typedef.
It is also to remove the useless declaration sctp_addip_addr_config
and fix the lack of params for some other functions' declaration.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to remove the typedef sctp_paramhdr_t, and replace
with struct sctp_paramhdr in the places where it's using this
typedef.
It is also to fix some indents and use sizeof(variable) instead
of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove this typedef, there is even no places using it.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to remove the typedef sctp_cid_t, and replace
with struct sctp_cid in the places where it's using this
typedef.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to remove the typedef sctp_chunkhdr_t, and replace
with struct sctp_chunkhdr in the places where it's using this
typedef.
It is also to fix some indents and use sizeof(variable) instead
of sizeof(type)., especially in sctp_new.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to remove the typedef sctp_sctphdr_t, and replace
with struct sctphdr in the places where it's using this typedef.
It is also to fix some indents and use sizeof(variable) instead
of sizeof(type).
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a helper to allow switchdev ops to be set if NET_SWITCHDEV is configured
and do nothing otherwise. This allows for slightly cleaner code which
uses switchdev but does not select NET_SWITCHDEV.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ACPI 6.2 added new NVDIMM root DSM functions. Define their
data structures.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add a bus level dsm_mask to nvdimm_bus_descriptor to allow the passthru
calling mechanism to specify a different mask from the cmd_mask.
Populate bus_dsm_mask and use it to filter dsm calls that user can
make through the pass thru interface.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
[djbw: use command number constants instead of a magic mask value]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Set ND_CMD_CALL in the cmd_mask to enable calling root
functions via the pass thru mechanism.
Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
This conversion requires overall +1 on the whole
refcounting scheme.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
uapi/linux/a.out.h uses a number of predefined macros that are
deprecated because they're in the application namespace
(e.g. '#ifdef linux' instead of '#ifdef __linux__').
This patch either corrects or just removes them if they are not
applicable to Linux.
The primary reason this is worth bothering to fix, considering how
obsolete a.out binary support is, is that the GCC build process
considers this such a severe error that it will copy the header into a
private directory and change the macro names, which causes future
updates to the header to be masked. This header probably doesn't get
updated very often anymore, but it is the _only_ uapi header that gets
this treatment, so IMHO it is worth patching just to drive that number
all the way to zero.
Signed-off-by: Zack Weinberg <zackw@panix.com>
[hch: removed dead conditionals]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
"in a rcu enabled hashtable" is repeated twice in a comment.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This marks most of the layout of task_struct as randomizable, but leaves
thread_info and scheduler state untouched at the start, and thread_struct
untouched at the end.
Other parts of the kernel use unnamed structures, but the 0-day builder
using gcc-4.4 blows up on static initializers. Officially, it's documented
as only working on gcc 4.6 and later, which further confuses me:
https://gcc.gnu.org/wiki/C11Status
The structure layout randomization already requires gcc 4.7, but instead
of depending on the plugin being enabled, just check the gcc versions
for wider build testing. At Linus's suggestion, the marking is hidden
in a macro to reduce how ugly it looks. Additionally, indenting is left
unchanged since it would make things harder to read.
Randomization of task_struct is modified from Brad Spengler/PaX Team's
code in the last public patch of grsecurity/PaX based on my understanding
of the code. Changes or omissions from the original code are mine and
don't reflect the original grsecurity/PaX code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
This marks many critical kernel structures for randomization. These are
structures that have been targeted in the past in security exploits, or
contain functions pointers, pointers to function pointer tables, lists,
workqueues, ref-counters, credentials, permissions, or are otherwise
sensitive. This initial list was extracted from Brad Spengler/PaX Team's
code in the last public patch of grsecurity/PaX based on my understanding
of the code. Changes or omissions from the original code are mine and
don't reflect the original grsecurity/PaX code.
Left out of this list is task_struct, which requires special handling
and will be covered in a subsequent patch.
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A set of overlapping changes in macvlan and the rocker
driver, nothing serious.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree. This batch contains connection tracking updates for the cleanup
iteration path, patches from Florian Westphal:
X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set
dying bit to let the CPU release them.
X) Add nf_ct_iterate_destroy() to be used on module removal, to kill
conntrack from all namespace.
X) Restart iteration on hashtable resizing, since both may occur at
the same time.
X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT
mapping on module removal.
X) Use nf_ct_iterate_destroy() to remove conntrack entries helper
module removal, from Liping Zhang.
X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension
if user requests this, also from Liping.
X) Add net_ns_barrier() and use it from FTP helper, so make sure
no concurrent namespace removal happens at the same time while
the helper module is being removed.
X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce
module size. Same thing in nf_tables.
Updates for the nf_tables infrastructure:
X) Prepare usage of the extended ACK reporting infrastructure for
nf_tables.
X) Remove unnecessary forward declaration in nf_tables hash set.
X) Skip set size estimation if number of element is not specified.
X) Changes to accomodate a (faster) unresizable hash set implementation,
for anonymous sets and dynamic size fixed sets with no timeouts.
X) Faster lookup function for unresizable hash table for 2 and 4
bytes key.
And, finally, a bunch of asorted small updates and cleanups:
X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe
to device events and look up for index from the packet path, this
is fixing an issue that is present since the very beginning, patch
from Xin Long.
X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal.
X) Use ebt_invalid_target() whenever possible in the ebtables tree,
from Gao Feng.
X) Calm down compilation warning in nf_dup infrastructure, patch from
stephen hemminger.
X) Statify functions in nftables rt expression, also from stephen.
X) Update Makefile to use canonical method to specify nf_tables-objs.
From Jike Song.
X) Use nf_conntrack_helpers_register() in amanda and H323.
X) Space cleanup for ctnetlink, from linzhang.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add some DAPM widget types to better support the construction of DAPM
graphs within DSPs.
Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM updates for 4.13
- vcpu request overhaul
- allow timer and PMU to have their interrupt number
selected from userspace
- workaround for Cavium erratum 30115
- handling of memory poisonning
- the usual crop of fixes and cleanups
Conflicts:
arch/s390/include/asm/kvm_host.h
|
|
Usage of these apis and their compat versions makes
the syscalls: clock_nanosleep and nanosleep and
their compat implementations simpler.
This is a preparatory patch to isolate data conversions to
struct timespec64 at userspace boundaries. This helps contain
the changes needed to transition to new y2038 safe types.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Update Public Action field values as updated in IEEE Std 802.11-2016,
so that modules/drivers can refer it.
Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
If NAN interface is created with NL80211_ATTR_SOCKET_OWNER, the socket
that is used to create the interface is used for all NAN operations and
reporting NAN events.
However, it turns out that sending commands and receiving events on
the same socket is not possible in a completely race-free way:
If the socket buffer is overflowed by the events, the command response
will not be sent. In that case the caller will block forever on recv.
Using non-blocking socket for commands is more complicated and still
the command response or ack may not be received.
So, keep unicasting NAN events to the interface creator, but allow
using a different socket for commands.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
|
|
There we actually have useful information about object sizes.
Note: this patch has them done for all iov_iter flavours.
Right now we do them twice in iovec case, but that'll change
very shortly.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and move them into thread_info.h, next to check_object_size()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
De-dupliate some code and allow for passing the flags argument to
vfs_iter_write. Additionally it now properly updates timestamps.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
De-dupliate some code and allow for passing the flags argument to
vfs_iter_read. Additional it properly updates atime now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull networking fixes from David Miller:
1) Need to access netdev->num_rx_queues behind an accessor in netvsc
driver otherwise the build breaks with some configs, from Arnd
Bergmann.
2) Add dummy xfrm_dev_event() so that build doesn't fail when
CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu.
3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan
Carpenter.
4) Fix MCDI command size for filter operations in sfc driver, from
Martin Habets.
5) Fix UFO segmenting so that we don't calculate incorrect checksums,
from Michal Kubecek.
6) When ipv6 datagram connects fail, reset destination address and
port. From Wei Wang.
7) TCP disconnect must reset the cached receive DST, from WANG Cong.
8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric
Dumazet.
9) fman driver has to depend on HAS_DMA, from Madalin Bucur.
10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann.
11) Fix negative page counts with GFO, from Michal Kubecek.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
sfc: fix attempt to translate invalid filter ID
net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish()
bpf: prevent leaking pointer via xadd on unpriviledged
arcnet: com20020-pci: add missing pdev setup in netdev structure
arcnet: com20020-pci: fix dev_id calculation
arcnet: com20020: remove needless base_addr assignment
Trivial fix to spelling mistake in arc_printk message
arcnet: change irq handler to lock irqsave
rocker: move dereference before free
mlxsw: spectrum_router: Fix NULL pointer dereference
net: sched: Fix one possible panic when no destroy callback
virtio-net: serialize tx routine during reset
net: usb: asix88179_178a: Add support for the Belkin B2B128
fsl/fman: add dependency on HAS_DMA
net: prevent sign extension in dev_get_stats()
tcp: reset sk_rx_dst in tcp_disconnect()
net: ipv6: reset daddr and dport in sk if connect() fails
bnx2x: Don't log mc removal needlessly
bnxt_en: Fix netpoll handling.
bnxt_en: Add missing logic to handle TPA end error conditions.
...
|
|
The UEFI 2.7 specification defines an updated BTT metadata format,
bumping the revision to 2.0. Add support for the new format, while
retaining compatibility for the old 1.1 format.
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS.
The lookup returns a prog-id or map-id to the userspace.
The userspace can then use the BPF_PROG_GET_FD_BY_ID
or BPF_MAP_GET_FD_BY_ID to get a fd.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Newer asics with 4 SEs are not able to fit the entire bitmask in the
original field, use an array instead.
v2: keep cu_ao_mask for backward compatibility.
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2017-06-27 (Innova IPsec offload support)
This patchset adds support for Innova IPSec network interface card.
About Innova device:
--------------------
Innova is a network card with a ConnectX chip and an FPGA chip as a
bump-on-the-wire.
Internal
+----------+ Link +-----------------+
| +--------------+ FPGA | +------+
| ConnectX | | Shell +--+ QSFP |
| +--------------+ +-------+ | | Port |
+----------+ I2C | | SBU | | +------+
| +-------+ |
+--+----------+---+
| |
+--+--+ +---+---+
| DDR | | Flash |
+-----+ +-------+
The FPGA synthesized logic is loaded from dedicated flash storage and has
access to its own dedicated DDR RAM.
The ConnectX chip firmware programs the FPGA by accessing its configuration
space over either the slow internal I2C link or the high-speed internal link.
The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU).
mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality,
while other components may handle the various SBU functionalities.
The driver opens high-speed reliable communication channels with the shell and
the SBU over the internal link.
These channels may be used for high-bandwidth configuration or for SBU-specific
out-of-band data paths.
About Innova IPSec device:
--------------------------
Innova IPSec is a network card that allows offloading IPSec cryptography operations
from the host CPU to the NIC. It is an Innova card with an IPSec SBU.
The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's
DDR memory.
Internal
+----------+ Link +-----------------+
| +--------------+ FPGA | +------+
| ConnectX | | Shell +--+ QSFP |
| +--------------+ +-------+ | | Port |
+----------+ Internal I2C | | IPSec | | +------+
| | SBU | |
| +-------+ |
+--+----------+---+
| |
+--+--+ +---+---+
| DDR | | |
| | | Flash |
|SADB | | |
+-----+ +-------+
Modes and ciphers:
Currently the following modes and ciphers are supported:
IPv4 and IPv6
ESP tunnel and transport modes
AES 128 and 256 bit encryption, with GCM authentication (RFC4106)
IV is generated using seqiv, in sync with Linux's geniv.
More modes and ciphers may be added later.
Notes:
In the future similar functionality will be included in a single-chip NIC.
About the driver:
-----------------
Patches 1-4 prepare some existing driver code for the new feature:
* Add support for reserved GIDs in the hardware GID table
* Allow multiple modules to enable hardware RoCE support independently
Patches 5-6 define structs and helper functions for QP work-queues.
Patches 7-11 add various FPGA-related features required for Innova.
IPSec.
Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices.
atches 13-16 add IPSec offload support to the mlx5 netdevice.
This driver services the new IPSec offload API introduced in commit
d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Configuration Path:
If Innova IPSec device is detected, the mlx5e netdevice gets the new
NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload
capabilities, and also the matching TX checksum and GSO features.
The driver configures offloaded Security Associations (SAs) by sending
an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR.
These messages and their responses are sent over a high-speed channel.
Counters for ethtool are retrieved by the driver from the SBU.
Data path:
On receive path, the SBU decrypts ESP packets which match the offloaded SADB,
but keeps them encapsulated.
The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload
has taken place, the SA with which it was done, and the authentication result.
The ConnectX chip performs RX checksum offload on the packet, and RSS using the
ESP SPI value. The driver detects the special ethertype, and attaches a struct
secpath to the RX SKB, including flags to indicate that crypto offload took place,
the authentication result, and which xfrm_state was used for decryption, in the
olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate
patchset will add support for that in the xfrm stack.
On transmit path, the stack encapsulates the packet but does not encrypt it, and
indicates in the SKB's secpath that crypto offload is to be performed and the SA
to use to do so.
The driver avoids performing crypto-offload for ESP fragments, and packets with
IP options, as the SBU cannot currently do that. For eligible packets, the driver
prepends a special ethertype with metadata instructing the hardware to perform crypto offload.
The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer.
The driver trims it off, because the SBU automatically appends the trailer for offloaded packets.
The ConnectX chip performs TX checksum offload on inner UDP or TCP packets,
and GSO for TCP packets (duplicating the prepended metadata).
The segmented packets then undergo encryption in the SBU before going on the wire.
Performance:
We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
Using AES-NI with ESP GSO we get constant 4.1 Gbps.
Using crypto offload we get constant 18 Gbps.
Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately.
- Ilan Tayari
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The pmem driver attaches to both persistent and volatile memory ranges
advertised by the ACPI NFIT. When the region is volatile it is redundant
to spend cycles flushing caches at fsync(). Check if the hosting region
is volatile and do not set dax_write_cache() if it is.
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|