Age | Commit message (Collapse) | Author |
|
ndisc_notify is the ipv6 equivalent to arp_notify. When arp_notify is
set to 1, gratuitous arp requests are sent when the device is brought up.
The same is expected when ndisc_notify is set to 1 (per ndisc_notify in
Documentation/networking/ip-sysctl.txt). The NA is not sent on NETDEV_UP
event; add it.
Fixes: 5cb04436eef6 ("ipv6: add knob to send unsolicited ND on link-layer address change")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Saeed Mahameed says:
====================
Mellanox, mlx5 RDMA net device support
This series provides the lower level mlx5 support of RDMA netdevice
creation API [1] suggested and introduced by Intel's HFI OPA VNIC
netdevice driver [2], to enable IPoIB mlx5 RDMA netdevice creation.
mlx5 IPoIB RDMA netdev will serve as an acceleration netdevice for the current
IPoIB ULP generic netdevice, providing:
- mlx5 RSS support.
- mlx5 HW RX,TX offloads (checksum, TSO, LRO, etc ..).
- Full mlx5 HW features transparent to the ULP itself.
The idea here is to reuse and benefit from the already implemented mlx5e netdevice
management and channels API for both etherent and RDMA netdevices, since both IPoIB
and Ethernet netdevices share same common mlx5 HW resources (with some small
exceptions) and share most of the control/data path logic, it is more natural to
have them share the same code.
The differences between IPoIB and Ethernet netdevices can be summarized to:
Steering:
In mlx5, IPoIB traffic is sent and received from an underlay special QP, and in Ethernet
the traffic is handled by vports and vport steering is managed by e-switch or FW.
For IPoIB traffic to get steered correctly the only thing we need to do is to create RSS
HW contexts for RX and TX HW contexts for TX (similar to mlx5e) with the underlay QP attached to
them (underlay QP will be 0 in case of Ethernet).
RX,TX:
Since IPoIB traffic is different, slightly modified RX and TX handlers are required,
still we do some code reuse in data path via common helper functions.
All of the other generic netdevice and mlx5 aspects will be shared between mlx5 Ethernet
and IPoIB netdevices, e.g.
- Channels creation and handling (RQs,SQs,CQs, NAPI, interrupt moderation, etc..)
- Offloads, checksum, GRO, LRO, TSO, and more.
- netdevice logic and non Ethernet specific ndos (open/close, etc..)
In order to achieve what we want:
In patchet 1 to 3, Erez added the supported for underlay QP in mlx5_ifc and refactored
the mlx5 steering code to accept the underlay QP as a parameter for creating steering
objects and enabled flow steering for IB link.
Then we are going to use the mlx5e netdevice profile, which is already used to separate between
NIC and VF representors netdevices, to create new type of IPoIB netdevice profile.
For that, one small refactoring is required to make mlx5e netdevice profile management
more genetic and agnostic to link type which is done in patch #4.
In patch #5, we introduce ipoib.c to host all of mlx5 IPoIB (mlx5i) specific logic and a
skeleton for the IPoIB mlx5 netdevice profile, and we will start filling it in next patches,
using mlx5e already existing APIs.
Patch #6 and #7, Implement init/cleanup RX mlx5i netdev profile handlers to create mlx5 RSS
resources, same as mlx5e but without vlan and L2 steering tables.
Patch #8, Implement init/cleanup TX mlx5i netdev profile handlers, to create TX resources
same as mlx5e but with one TC (tc = 0) support.
Patch #9, Implement mlx5i open/close ndos, where we reuese the mlx5e channels API, to start/stop TX/RX channels.
Patch #10, Create the underlay QP and attach it to mlx5i RSS and TX HW contexts.
Patch #11 and #12, Break down the mlx5e xmit flow into smaller helper function and implement the
mlx5i IPoIB xmit routine.
Patch #13 and #14, Have an RX handler per netdevice profile. We already do this before this series
in a non clean way to separate between NIC netdev and VF representor RX handlers, in patch 13 we make
the RX handler generic and bound to a profile and in patch 14 we implement the IPoIB RX handlers.
Patch #15, Small cleanup to avoid e-switch with IPoIB netdev.
In order to enable mlx5 IPoIB, a merge between the IPoIB RDMA netdev offolad support [3]
- which was alread submitted to the rdma mailing list - and this series is required
plus an extra small patch [4] which will connect between both sides and actually enables the offload.
Once both patch-sets are merged into linux we will have to submit the extra small patch [4], to enable
the feature.
Thanks,
Saeed.
[1] https://patchwork.kernel.org/patch/9676637/
[2] https://lwn.net/Articles/715453/
https://patchwork.kernel.org/patch/9587815/
[3] https://patchwork.kernel.org/patch/9672069/
[4] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/commit/?id=0141db6a686e32294dee015b7d07706162ba48d8
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add check for bit IB_QP_CREATE_NETIF_QP while creating QP.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the driver support only ethernet eswitch, and we want to
protect downstream IPoIB netdev from trying to access it in IB link.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement IPoIB RX SKB handler.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to have different RX handler per profile, fix and refactor the
current code to take the rx handler directly from the netdevice profile
rather than computing it on runtime as it was done with the switchdev
mode representor rx handler.
This will also remove the current wrong assumption in mlx5e_alloc_rq
code that mlx5e_priv->ppriv is of the type vport_rep.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement mlx5e's IPoIB SKB transmit using the helper functions provided
by mlx5e ethernet tx flow, the only difference in the code between
mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill
(UD datagram segment) in the TX descriptor (WQE) and it doesn't need to
have any vlan handling.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Break current mlx5e xmit flow into smaller blocks (helper functions)
in order to reuse them for IPoIB SKB transmission.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create IPoIB underlay QP needed by the IPoIB netdevice profile for RSS
and TX HW context to perform on IPoIB traffic.
Reset the underlay QP on dev_uninit ndo to stop IPoIB traffic going
through this QP when the ULP IPoIB decides to cleanup.
Implement attach/detach mcast RDMA netdev callbacks for later RDMA
netdev use.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement open/close of IPoIB netdevice ndos using mlx5e's
channels API to manage data path resources (RQs/SQs/CQs).
Set IPoIB netdev address on dev_init ndo.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Modify mlx5e tis creation function to accept underlay qp number, which
will be needed by IPoIB.
Implement mlx5i (IPoIB) tx init/cleanup netdevice profile flows to
create one TIS with the IPoIB underlay qp, for IPoIB TX SQs.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Like the mlx5e ethernet mode, on IPoIB mode we need to create RX steering
tables, but IPoIB do not require MAC and VLAN steering tables so the
only tables we create in here are:
1. TTC Table (Traffic Type Classifier table for RSS steering)
2. ARFS Table (for accelerated RFS support)
Creation of those tables is identical to mlx5e ethernet mode, hence the
use of mlx5e_create_ttc_table and mlx5e_arfs_create_tables.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement IPoIB RX RSS (RQTs and TIRs) HW objects creation,
All we do here is simply reuse the mlx5e implementation to create
direct and indirect (RSS) steering HW objects.
For that we just expose
mlx5e_{create,destroy}_{direct,indirect}_{rqt,tir} functions into en.h
and call them from ipoib.c in init/cleanup_rx IPoIB netdevice profile
callbacks.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create mlx5e IPoIB netdevice profile skeleton in the new ipoib.c
file with empty implementation.
Downstream patches will provide the full mlx5 rdma netdevice acceleration
support for IPoIB into this new file, by using the mlx5e netdevice
profile and new mlx5_channels APIs and infrastructures.
Same as already done in mlx5e NIC netdevice and switchdev mode VF
representors.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for mlx5e RDMA net_device support, here we generalize
mlx5e_attach/detach in a way that those functions will be agnostic
to link type. For that we move ethernet specific NIC net device logic out
of those functions into {nic,rep}_{enable/disable} mlx5e NIC and
representor profiles callbacks.
Also some of the logic was moved only to NIC profile since it is not right
to have this logic for representor net device (e.g. set port MTU).
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Get the relevant capabilities if supports ipoib_enhanced_offloads and
init the flow steering table accordingly.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IB flow tables need the underlay qp to perform flow steering.
Here we change the API of the flow tables creation to accept the
underlay QP number as a parameter in order to support IB (IPoIB) flow
steering.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
New capability bit: ipoib_enhanced_offloads, indicates new ability for UD
QP to do RSS and enhanced IPoIB offloads and acceleration.
Add underlay_qpn to the TIS and flow_table objects In order to support
SET_ROOT command, to connect between IPoIB QPs and flow steering tables.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Azure hosts are not supporting non-TCP port numbers in vRSS hashing for
now. For example, UDP packet loss rate will be high if port numbers are
also included in vRSS hash.
So, we created this patch to use only IP numbers for hashing in non-TCP
traffic.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the outgoing skb has a RX queue mapping available, we use the queue
number directly, other than put it through Send Indirection Table.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch moves as is the legacy DSA code from dsa.c to legacy.c,
except the few shared symbols which remain in dsa.c.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The number of rx queues is determined by the rss_cpus parameter
or the cpu topology. If that is higher than EFX_MAX_RX_QUEUES the
driver can corrupt state.
Fixes: 8ceee660aacb ("New driver "sfc" for Solarstorm SFC4000 controller.")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* if a local variable of type uint16_t is unaligned, your compiler is FUBAR
* the whole point of get_unaligned_... is to avoid memcpy + ..._to_cpu().
Using it *after* memcpy() (into aligned object, no less) is pointless.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
fix for WARN:
usb 3-2.4.1: NFC: Exchanging data failed (error 0x13)
llcp: nfc_llcp_recv: err -5
llcp: nfc_llcp_symm_timer: SYMM timeout
------------[ cut here ]------------
WARNING: CPU: 1 PID: 26397 at .../drivers/usb/core/hcd.c:1584 usb_hcd_map_urb_for_dma+0x370/0x550
transfer buffer not dma capable
[...]
Workqueue: events nfc_llcp_timeout_work [nfc]
Call Trace:
? dump_stack+0x46/0x5a
? __warn+0xb9/0xe0
? warn_slowpath_fmt+0x5a/0x80
? usb_hcd_map_urb_for_dma+0x370/0x550
? usb_hcd_submit_urb+0x2fb/0xa60
? dequeue_entity+0x3f2/0xc30
? pn533_usb_send_ack+0x5d/0x80 [pn533_usb]
? pn533_usb_abort_cmd+0x13/0x20 [pn533_usb]
? pn533_dep_link_down+0x32/0x70 [pn533]
? nfc_dep_link_down+0x87/0xd0 [nfc]
[...]
usb 3-2.4.1: NFC: Exchanging data failed (error 0x13)
llcp: nfc_llcp_recv: err -5
llcp: nfc_llcp_symm_timer: SYMM timeout
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Again, a batch that's been sitting a couple of weeks, mostly because
I anticipated a bit more material but it didn't show up -- which is
good.
These are all your garden variety fixes for ARM platforms.
The most visible issue fixed here is probably the SMP reset issue on
OMAP, the rest are minor stuff"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: add pmu0 regs for USB PHY
ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer
reset: add exported __reset_control_get, return NULL if optional
ARM: orion5x: only call into phylib when available
ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot
ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
ARM: dts: ti: fix PCI bus dtc warnings
ARM: dts: am335x-baltos: disable EEE for Atheros 8035 PHY
ARM: dts: OMAP3: Fix MFG ID EEPROM
ARM: sun8i: a33: add operating-points-v2 property to all nodes
ARM: sun8i: a33: remove highest OPP to fix CPU crashes
|
|
Pull block fixes from Jens Axboe:
"Four small fixes.
Three of them fix the same error in NVMe, in loop, fc, and rdma
respectively. The last fix from Ming fixes a regression in this
series, where our bvec gap logic was wrong and causes an oops on
NVMe for certain conditions"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix bio_will_gap() for first bvec with offset
nvme-fc: Fix sqsize wrong assignment based on ctrl MQES capability
nvme-rdma: Fix sqsize wrong assignment based on ctrl MQES capability
nvme-loop: Fix sqsize wrong assignment based on ctrl MQES capability
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Regression fix for omap interconnect code for deferred probe.
Without this fix we can get PM related warnings for devices that
use deferred probe. If necessary, this fix can wait for the
v4.12 merge window no problem.
* tag 'omap-for-v4.11/fixes-rc6-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer
ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot
ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
ARM: dts: ti: fix PCI bus dtc warnings
ARM: dts: am335x-baltos: disable EEE for Atheros 8035 PHY
ARM: dts: OMAP3: Fix MFG ID EEPROM
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"Unfortunately, the commit to fix the cgroup mount race in the previous
pull request can lead to hangs.
The original bug has been around for a while and isn't too likely to
be triggered in usual use cases. Revert the commit for now"
* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
Revert "cgroup: avoid attaching a cgroup root to two different superblocks"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty fix from Greg KH:
"Here is a single tty core revert for a patch that was reported to
cause problems.
The original issue is one that we have lived with for decades, so
trying to scramble to fix the fix in time for 4.11-final does not make
sense due to the fragility of the tty ldisc layer. Just reverting it
makes sense for now"
* tag 'tty-4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "tty: don't panic on OOM in tty_set_ldisc()"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
"While rewriting the function probe code, I stumbled over a long
standing bug. This bug has been there sinc function tracing was added
way back when. But my new development depends on this bug being fixed,
and it should be fixed regardless as it causes ftrace to disable
itself when triggered, and a reboot is required to enable it again.
The bug is that the function probe does not disable itself properly if
there's another probe of its type still enabled. For example:
# cd /sys/kernel/debug/tracing
# echo schedule:traceoff > set_ftrace_filter
# echo do_IRQ:traceoff > set_ftrace_filter
# echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
# echo do_IRQ:traceoff > set_ftrace_filter
The above registers two traceoff probes (one for schedule and one for
do_IRQ, and then removes do_IRQ.
But since there still exists one for schedule, it is not done
properly. When adding do_IRQ back, the breakage in the accounting is
noticed by the ftrace self tests, and it causes a warning and disables
ftrace"
* tag 'trace-v4.11-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix removing of second function probe
|
|
This reverts commit bfb0b80db5f9dca5ac0a5fd0edb765ee555e5a8e.
Andrei reports CRIU test hangs with the patch applied. The bug fixed
by the patch isn't too likely to trigger in actual uses. Revert the
patch for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Link: http://lkml.kernel.org/r/20170414232737.GC20350@outlook.office365.com
|
|
This fixes a bug in which the upper 32-bits of a 64-bit value which is
read by get_user() was lost on a 32-bit kernel.
While touching this code, split out pre-loading of %sr2 space register
and clean up code indent.
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Conflicts were simply overlapping changes. In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.
In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm fixes from Dan Williams:
"A small crop of lockdep, sleeping while atomic, and other fixes /
band-aids in advance of the full-blown reworks targeting the next
merge window. The largest change here is "libnvdimm: fix blk free
space accounting" which deletes a pile of buggy code that better
testing would have caught before merging. The next change that is
borderline too big for a late rc is switching the device-dax locking
from rcu to srcu, I couldn't think of a smaller way to make that fix.
The __copy_user_nocache fix will have a full replacement in 4.12 to
move those pmem special case considerations into the pmem driver. The
"libnvdimm: band aid btt vs clear poison locking" commit admits that
our error clearing support for btt went in broken, so we just disable
it in 4.11 and -stable. A replacement / full fix is in the pipeline
for 4.12
Some of these would have been caught earlier had DEBUG_ATOMIC_SLEEP
been enabled on my development station. I wonder if we should have:
config DEBUG_ATOMIC_SLEEP
default PROVE_LOCKING
...since I mistakenly thought I got both with PROVE_LOCKING=y.
These have received a build success notification from the 0day robot,
and some have appeared in a -next release with no reported issues"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions
device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
libnvdimm: band aid btt vs clear poison locking
libnvdimm: fix reconfig_mutex, mmap_sem, and jbd2_handle lockdep splat
libnvdimm: fix blk free space accounting
acpi, nfit, libnvdimm: fix interleave set cookie calculation (64-bit comparison)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is seven small fixes which are all for user visible issues that
fortunately only occur in rare circumstances.
The most serious is the sr one in which QEMU can cause us to read
beyond the end of a buffer (I don't think it's exploitable, but just
in case).
The next is the sd capacity fix which means all non 512 byte sector
drives greater than 2TB fail to be correctly sized.
The rest are either in new drivers (qedf) or on error legs"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION
scsi: aacraid: fix PCI error recovery path
scsi: sd: Fix capacity calculation with 32-bit sector_t
scsi: qla2xxx: Add fix to read correct register value for ISP82xx.
scsi: qedf: Fix crash due to unsolicited FIP VLAN response.
scsi: sr: Sanity check returned mode data
scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
"Mikulas Patocka fixed a few bugs in our new pa_memcpy() assembler
function, e.g. one bug made the kernel unbootable if source and
destination address are the same"
* 'parisc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix bugs in pa_memcpy
|
|
Otherwise lockdep says:
[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017] #0: (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520
Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax. Then xfstests generic/422 ran sync which deadlocks.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
Normal pathname lookup doesn't allow empty pathnames, but using
AT_EMPTY_PATH (with name_to_handle_at() or fstatat(), for example) you
can trigger an empty pathname lookup.
And not only is the RCU lookup in that case entirely unnecessary
(because we'll obviously immediately finalize the end result), it is
actively wrong.
Why? An empth path is a special case that will return the original
'dirfd' dentry - and that dentry may not actually be RCU-free'd,
resulting in a potential use-after-free if we were to initialize the
path lazily under the RCU read lock and depend on complete_walk()
finalizing the dentry.
Found by syzkaller and KASAN.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The patch 554bfeceb8a22d448cd986fc9efce25e833278a1 ("parisc: Fix access
fault handling in pa_memcpy()") reimplements the pa_memcpy function.
Unfortunatelly, it makes the kernel unbootable. The crash happens in the
function ide_complete_cmd where memcpy is called with the same source
and destination address.
This patch fixes a few bugs in pa_memcpy:
* When jumping to .Lcopy_loop_16 for the first time, don't skip the
instruction "ldi 31,t0" (this bug made the kernel unbootable)
* Use the COND macro when comparing length, so that the comparison is
64-bit (a theoretical issue, in case the length is greater than
0xffffffff)
* Don't use the COND macro after the "extru" instruction (the PA-RISC
specification says that the upper 32-bits of extru result are undefined,
although they are set to zero in practice)
* Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
* Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
.Lcopy8_fault)
Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 554bfeceb8a2 ("parisc: Fix access fault handling in pa_memcpy()")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
This function is now obsolete and always returns false.
This change has no effect on generated code.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
resurrect an old patch from Pablo Neira to remove the untracked objects.
Currently, there are four possible states of an skb wrt. conntrack.
1. No conntrack attached, ct is NULL.
2. Normal (kmem cache allocated) ct attached.
3. a template (kmalloc'd), not in any hash tables at any point in time
4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
IPS_UNTRACKED_BIT in ct->status.
Untracked is supposed to be identical to case 1. It exists only
so users can check
-m conntrack --ctstate UNTRACKED vs.
-m conntrack --ctstate INVALID
e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
supposed to be a no-op.
Thus currently we need to check
ct == NULL || nf_ct_is_untracked(ct)
in a lot of places in order to avoid altering untracked objects.
The other consequence of the percpu untracked object is that all
-j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
(inc/dec the untracked conntracks refcount).
This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
make the distinction instead.
The (few) places that care about packet invalid (ct is NULL) vs.
packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
but all other places can omit the nf_ct_is_untracked() check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
1. Remove single !events condition check to deliver the missed event
even though there is no new event happened.
Consider this case:
1) nf_ct_deliver_cached_events is invoked at the first time, the
event is failed to deliver, then the missed is set.
2) nf_ct_deliver_cached_events is invoked again, but there is no
any new event happened.
The missed event is lost really.
It would try to send the missed event again after remove this check.
And it is ok if there is no missed event because the latter check
!((events | missed) & e->ctmask) could avoid it.
2. Correct the return value check of notify->fcn.
When send the event successfully, it returns 0, not postive value.
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The __nf_nat_alloc_null_binding invokes nf_nat_setup_info which may
return NF_DROP when memory is exhausted, so convert NF_DROP to -ENOMEM
to make ctnetlink happy. Or ctnetlink_setup_nat treats it as a success
when one error NF_DROP happens actully.
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Simon Horman says:
====================
Second Round of IPVS Updates for v4.12
please consider these clean-ups and enhancements to IPVS for v4.12.
* Removal unused variable
* Use kzalloc where appropriate
* More efficient detection of presence of NAT extension
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Conflicts:
net/netfilter/ipvs/ip_vs_ftp.c
|
|
There are no in-tree callers.
Signed-off-by: Aaron Conole <aconole@bytheb.org>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Just a small update to xpad driver to recognize yet another gamepad,
and another change making sure userio.h is exported"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xpad - add support for Razer Wildcat gamepad
uapi: add missing install of userio.h
|
|
Pull networking fixes from David Miller:
"Things seem to be settling down as far as networking is concerned,
let's hope this trend continues...
1) Add iov_iter_revert() and use it to fix the behavior of
skb_copy_datagram_msg() et al., from Al Viro.
2) Fix the protocol used in the synthetic SKB we cons up for the
purposes of doing a simulated route lookup for RTM_GETROUTE
requests. From Florian Larysch.
3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong
Wang.
4) Don't call netdev_change_features with the team lock held, from
Xin Long.
5) Revert TCP F-RTO extension to catch more spurious timeouts because
it interacts very badly with some middle-boxes. From Yuchung
Cheng.
6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from
Guillaume Nault.
7) ctnetlink uses bit positions where it should be using bit masks,
fix from Liping Zhang.
8) Missing RCU locking in netfilter helper code, from Gao Feng.
9) Avoid double frees and use-after-frees in tcp_disconnect(), from
Eric Dumazet.
10) Don't do a changelink before we register the netdevice in
bridging, from Ido Schimmel.
11) Lock the ipv6 device address list properly, from Rabin Vincent"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
netfilter: nft_hash: do not dump the auto generated seed
drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
ipv6: Fix idev->addr_list corruption
net: xdp: don't export dev_change_xdp_fd()
bridge: netlink: register netdevice before executing changelink
bridge: implement missing ndo_uninit()
bpf: reference may_access_skb() from __bpf_prog_run()
tcp: clear saved_syn in tcp_disconnect()
netfilter: nf_ct_expect: use proper RCU list traversal/update APIs
netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL
netfilter: make it safer during the inet6_dev->addr_list traversal
netfilter: ctnetlink: make it safer when checking the ct helper name
netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find
netfilter: ctnetlink: using bit to represent the ct event
netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb
l2tp: don't mask errors in pppol2tp_getsockopt()
l2tp: don't mask errors in pppol2tp_setsockopt()
tcp: restrict F-RTO to work-around broken middle-boxes
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of small fixes for x86:
- fix locking in RDT to prevent memory leaks and freeing in use
memory
- prevent setting invalid values for vdso32_enabled which cause
inconsistencies for user space resulting in application crashes.
- plug a race in the vdso32 code between fork and sysctl which causes
inconsistencies for user space resulting in application crashes.
- make MPX signal delivery work in compat mode
- make the dmesg output of traps and faults readable again"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Fix locking in rdtgroup_schemata_write()
x86/debug: Fix the printk() debug output of signal_fault(), do_trap() and do_general_protection()
x86/vdso: Plug race between mapping and ELF header setup
x86/vdso: Ensure vdso32_enabled gets set to valid values only
x86/signals: Fix lower/upper bound reporting in compat siginfo
|