With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For the virtio driver we need to adjust XDP_PASS handling by recalculating
the length of the packet if it was passed to the TCP/IP stack.
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For the tun driver we need to adjust XDP_PASS handling by recalculating
the length of the packet if it was passed to the TCP/IP stack
(in case the data_end pointer was adjusted after the XDP program ran).
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
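The XDP_PASS adjustment described in the virtio and tun patches above can be sketched in plain C. This is a minimal user-space model under stated assumptions, not the driver code: `struct xdp_buff_model`, `adjust_tail_model()` and `xdp_pass_len()` are hypothetical names, only the `data`/`data_end` pair is modelled, and the real helper performs additional checks.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the kernel's struct xdp_buff. */
struct xdp_buff_model {
	unsigned char *data;     /* start of packet data */
	unsigned char *data_end; /* one past the last packet byte */
};

/*
 * Models the effect of a tail adjustment on data_end: only shrinking is
 * accepted, and the packet may not shrink to zero bytes or below.
 * Returns 0 on success, -1 if the request is rejected.
 */
int adjust_tail_model(struct xdp_buff_model *xdp, int offset)
{
	long long shrink = -(long long)offset;

	if (offset > 0)
		return -1; /* growing the tail is not supported */
	if (shrink >= (long long)(xdp->data_end - xdp->data))
		return -1; /* would leave no packet data */
	xdp->data_end -= shrink;
	return 0;
}

/*
 * Length handed to the stack on XDP_PASS: derived from the pointers after
 * the program ran, not from the length seen before it ran.
 */
size_t xdp_pass_len(const struct xdp_buff_model *xdp)
{
	return (size_t)(xdp->data_end - xdp->data);
}
```

Deriving the length from the pointers covers both the adjusted and the unadjusted case, which is the "unconditional" variant the nfp, thunder, bnxt and mlx4 patches describe.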
|
|
With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For the nfp driver we will just calculate the packet's length unconditionally.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For cavium's thunder driver we will just calculate the packet's length
unconditionally.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For the bnxt driver we will just calculate the packet's length unconditionally.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
With the bpf_xdp_adjust_tail helper, XDP's data_end pointer can be changed as
well (only decreasing the pointer's location is going to be supported).
Changing this pointer changes the packet's size.
For the mlx4 driver we will just calculate the packet's length unconditionally
(the same way as it's already being done in mlx5).
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Add support to display pause settings
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Like tos inherit, ttl inherit should also mean inheriting the inner protocol's
TTL value, which is not actually implemented in vxlan yet.
But we cannot treat ttl == 0 as "use the inner TTL", because that value is
also used when the "ttl" option is not specified, so it would be a behavior
change that breaks real use cases.
So add a different attribute, IFLA_VXLAN_TTL_INHERIT, which is set when
"ttl inherit" is specified with the ip command.
Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
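The reasoning in the vxlan patch above, a separate opt-in flag rather than overloading ttl == 0, can be illustrated with a small model. This is a sketch only: `struct vxlan_cfg_model`, `vxlan_outer_ttl()` and the `route_default_ttl` parameter are hypothetical and do not mirror the driver's actual data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced tunnel configuration. */
struct vxlan_cfg_model {
	uint8_t ttl;         /* configured outer TTL; 0 means "not specified" */
	bool    ttl_inherit; /* set only when "ttl inherit" was requested */
};

/*
 * Pick the outer TTL for an encapsulated packet.
 * route_default_ttl stands in for whatever hop limit the stack would use
 * when no TTL is configured (an assumption of this sketch).
 */
uint8_t vxlan_outer_ttl(const struct vxlan_cfg_model *cfg,
			uint8_t inner_ttl, uint8_t route_default_ttl)
{
	if (cfg->ttl_inherit)
		return inner_ttl;         /* explicit opt-in only */
	if (cfg->ttl != 0)
		return cfg->ttl;          /* explicitly configured value */
	return route_default_ttl;         /* unchanged behaviour for ttl == 0 */
}
```

Because the inherit decision lives in its own flag, a configuration that never mentions "ttl" keeps producing the same outer TTL as before.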
|
|
Change the ndo_xdp_xmit API to take a struct xdp_frame instead of a struct
xdp_buff. This brings xdp_return_frame and ndo_xdp_xmit in sync.
This builds towards changing the API further to become a bulk API,
because xdp_buff is not a queue-able object while xdp_frame is.
V4: Adjust for commit 59655a5b6c83 ("tuntap: XDP_TX can use native XDP")
V7: Adjust for commit d9314c474d4f ("i40e: add support for XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
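Why an xdp_frame is queue-able while an xdp_buff is not can be sketched as follows. The model assumes the frame descriptor is written into the packet's own headroom, so that one pointer describes the whole packet. `struct buff_model`, `struct frame_model` and `buff_to_frame()` are hypothetical, reduced stand-ins, and the buffer start is assumed to be suitably aligned for the descriptor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-packet, stack-allocated descriptor. */
struct buff_model {
	unsigned char *data_hard_start; /* start of the buffer (headroom begins here) */
	unsigned char *data;            /* start of packet data */
	unsigned char *data_end;        /* one past the last packet byte */
};

/* Hypothetical stand-in for the descriptor that travels with the packet. */
struct frame_model {
	unsigned char *data;
	uint16_t len;
	uint16_t headroom;
};

/*
 * Store the frame descriptor at the start of the packet's headroom.
 * Returns NULL when the headroom is too small to hold the descriptor.
 */
struct frame_model *buff_to_frame(const struct buff_model *b)
{
	size_t headroom = (size_t)(b->data - b->data_hard_start);
	struct frame_model *f;

	if (headroom < sizeof(*f))
		return NULL;

	f = (struct frame_model *)b->data_hard_start;
	f->data = b->data;
	f->len = (uint16_t)(b->data_end - b->data);
	f->headroom = (uint16_t)(headroom - sizeof(*f));
	return f;
}
```

The returned pointer stays valid after the receive function's local descriptor goes out of scope, which is what lets it sit on a ring or queue until transmit completion.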
|
|
Changing the xdp_return_frame() API to take struct xdp_frame as argument
seems like a natural choice. But there are some subtle performance
details here that need extra care, which is a deliberate choice.
When de-referencing xdp_frame on a remote CPU during DMA-TX
completion, the cache-line changes to the "Shared"
state. Later, when the page is reused for RX, this xdp_frame
cache-line is written, which changes the state to "Modified".
This situation already happens (naturally) for virtio_net, tun and
cpumap, as the xdp_frame pointer is the queued object. In tun and
cpumap, the ptr_ring is used for efficiently transferring cache-lines
(with pointers) between CPUs. Thus, the only option is to
de-reference xdp_frame.
Only the ixgbe driver had an optimization in which it could
avoid de-referencing xdp_frame. The driver already has a
TX-ring queue, which (in case of remote DMA-TX completion) has to be
transferred between CPUs anyhow. In this data area, we stored a
struct xdp_mem_info and a data pointer, which allowed us to avoid
de-referencing xdp_frame.
To compensate for this, a prefetchw is used to tell the cache
coherency protocol about our access pattern. My benchmarks show that
this prefetchw is enough to compensate in the ixgbe driver.
V7: Adjust for commit d9314c474d4f ("i40e: add support for XDP_REDIRECT")
V8: Adjust for commit bd658dda4237 ("net/mlx5e: Separate dma base address
and offset in dma_sync call")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch shows how it is possible to have both the driver-local page
cache, which uses an elevated refcnt for "catching"/avoiding that SKB
put_page returns the page through the page allocator, and at the
same time have pages returned to the page_pool from
ndo_xdp_xmit DMA completion.
The performance improvement for XDP_REDIRECT in this patch is really
good, especially considering that (currently) the xdp_return_frame
API and page_pool_put_page() do per-frame operations of both
rhashtable ID-lookup and locked return into the (page_pool) ptr_ring.
(The plan is to remove these per-frame operations in a followup
patchset.)
The benchmark performed was RX on mlx5 and XDP_REDIRECT out ixgbe,
with xdp_redirect_map (using devmap). The target/maximum
capability of ixgbe is 13Mpps (on this HW setup).
Before this patch for mlx5, XDP redirected frames were returned via
the page allocator. The single flow performance was 6Mpps, and if I
started two flows the collective performance dropped to 4Mpps, because we
hit the page allocator lock (further negative scaling occurs).
Two test scenarios need to be covered for the xdp_return_frame API:
DMA-TX completion running on the same CPU, or cross-CPU free/return.
Results were same-CPU=10Mpps, and cross-CPU=12Mpps. This is very
close to our 13Mpps max target.
The reason the max target isn't reached in the cross-CPU test is likely
the RX-ring DMA unmap/map overhead (which doesn't occur in ixgbe to
ixgbe testing). It is also planned to remove this unnecessary DMA
unmap in a later patchset.
V2: Adjustments requested by Tariq
- Changed page_pool_create return codes to not return NULL, only
ERR_PTR, as this simplifies err handling in drivers.
- Save a branch in mlx5e_page_release
- Correct page_pool size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
V5: Updated patch desc
V8: Adjust for b0cedc844c00 ("net/mlx5e: Remove rq_headroom field from params")
V9:
- Adjust for 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication")
- Adjust for 73281b78a37a ("net/mlx5e: Derive Striding RQ size from MTU")
- Correct handling if page_pool_create fails for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
V10: Req from Tariq
- Change pool_size calc for MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
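The V2 note above (return only ERR_PTR, never NULL) follows a common kernel convention that can be imitated in user space. The sketch below is an illustration under assumptions: `err_ptr()`, `is_err()`, `ptr_err()`, `struct pool_model` and `pool_create_model()` are hypothetical names that imitate, but are not, the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() and page_pool_create().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_MODEL 4095

/* User-space imitations of the kernel's error-pointer helpers. */
void *err_ptr(long err) { return (void *)(intptr_t)err; }
int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_MODEL; }
long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical pool object. */
struct pool_model {
	unsigned int size;
};

/*
 * Hypothetical constructor: it never returns NULL. The caller gets either
 * a valid pool or a negative errno encoded in the pointer value.
 */
struct pool_model *pool_create_model(unsigned int size)
{
	struct pool_model *pool;

	if (size == 0)
		return err_ptr(-EINVAL);

	pool = malloc(sizeof(*pool));
	if (!pool)
		return err_ptr(-ENOMEM);

	pool->size = size;
	return pool;
}
```

With a single failure convention, a caller needs one `is_err()` check and can propagate `ptr_err()` directly, instead of testing for both NULL and an encoded error.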
|
|
Need a fast page recycle mechanism for the ndo_xdp_xmit API for returning
pages at DMA-TX completion time, one that has good cross-CPU
performance, given that DMA-TX completion can happen on a remote CPU.
Refurbish my page_pool code, which was presented[1] at MM-summit 2016.
Adapted the page_pool code to not depend on the page allocator and
integration into struct page. The DMA mapping feature is kept,
even though it will not be activated/used in this patchset.
[1] http://people.netfilter.org/hawk/presentations/MM-summit2016/generic_page_pool_mm_summit2016.pdf
V2: Adjustments requested by Tariq
- Changed page_pool_create return codes, don't return NULL, only
ERR_PTR, as this simplifies err handling in drivers.
V4: many small improvements and cleanups
- Add DOC comment section, that can be used by kernel-doc
- Improve fallback mode, to work better with refcnt based recycling
e.g. remove a WARN as pointed out by Tariq
e.g. quicker fallback if ptr_ring is empty.
V5: Fixed SPDX license as pointed out by Alexei
V6: Adjustments requested by Eric Dumazet
- Adjust ____cacheline_aligned_in_smp usage/placement
- Move rcu_head in struct page_pool
- Free pages quicker on destroy, minimize resources delayed an RCU period
- Remove code for forward/backward compat ABI interface
V8: Issues found by kbuild test robot
- Address sparse should be static warnings
- Only compile+link when a driver uses/selects page_pool;
mlx5 selects CONFIG_PAGE_POOL, although it's first used in two patches
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
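The recycling idea behind the page_pool commit above, including the fallback when the cache is empty, can be sketched as a tiny user-space pool. This is a deliberately simplified model: `struct page_pool_model` and its functions are hypothetical, a plain array stands in for the ptr_ring, `malloc()`/`free()` stand in for the page allocator, and no cross-CPU synchronisation is modelled.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_RING_SIZE 4    /* illustrative; real pools are far larger */
#define PAGE_BYTES     4096

/* Hypothetical pool: a small cache of recycled pages plus a counter. */
struct page_pool_model {
	void  *ring[POOL_RING_SIZE];
	size_t count;    /* pages currently cached */
	size_t fallback; /* how often the slow allocator was used */
};

/* Fast path: reuse a cached page. Slow path: fall back to the allocator. */
void *pool_get_page(struct page_pool_model *pool)
{
	if (pool->count > 0)
		return pool->ring[--pool->count];
	pool->fallback++;
	return malloc(PAGE_BYTES);
}

/* Return a page to the pool; release it for real when the cache is full. */
void pool_put_page(struct page_pool_model *pool, void *page)
{
	if (pool->count < POOL_RING_SIZE)
		pool->ring[pool->count++] = page;
	else
		free(page);
}

/* Release every cached page, e.g. when the owning queue is torn down. */
void pool_destroy(struct page_pool_model *pool)
{
	while (pool->count > 0)
		free(pool->ring[--pool->count]);
}
```

In steady state, every page returned on transmit completion is handed out again on receive, so the slow allocator (and its lock) is only touched when the cache runs dry or overflows.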
|
|
Use the IDA infrastructure for getting a cyclic increasing ID number
that is used for keeping track of each registered allocator per
RX-queue xdp_rxq_info. Instead of using the IDR infrastructure, which
uses a radix tree, use a dynamic rhashtable for creating the ID to
pointer lookup table, because this is faster.
The problem being solved here is that the xdp_rxq_info
pointer (stored in xdp_buff) cannot be used directly, as the
guaranteed lifetime is too short. The info is needed on a
(potentially) remote CPU during DMA-TX completion time. In an
xdp_frame the xdp_mem_info is stored when it is converted from an
xdp_buff, which is sufficient for the simple page refcnt based recycle
schemes.
For more advanced allocators there is a need to store a pointer to the
registered allocator. Thus, there is a need to guard the lifetime or
validity of the allocator pointer, which is done through this
rhashtable ID map to pointer. The removal and validity of the
allocator and helper struct xdp_mem_allocator is guarded by RCU. The
allocator will be created by the driver, and registered with
xdp_rxq_info_reg_mem_model().
It is up for debate who is responsible for freeing the allocator
pointer or invoking the allocator destructor function. In any case,
this must happen via RCU freeing.
V4: Per req of Jason Wang
- Use xdp_rxq_info_reg_mem_model() in all drivers implementing
XDP_REDIRECT, even-though it's not strictly necessary when
allocator==NULL for type MEM_TYPE_PAGE_SHARED (given it's zero).
V6: Per req of Alex Duyck
- Introduce rhashtable_lookup() call in later patch
V8: Address sparse should be static warnings (from kbuild test robot)
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
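The indirection described in the commit above, storing an ID in the frame and resolving it to an allocator pointer only at return time, can be sketched as follows. This is a model under assumptions: all names are hypothetical, a linear scan over a small array stands in for the rhashtable, a simple counter stands in for the IDA, and the RCU protection the commit relies on is not modelled.

```c
#include <stddef.h>

#define MEM_ID_SLOTS 8

/* Hypothetical registry entry: an ID and the allocator it refers to. */
struct mem_entry_model {
	int   id;        /* 0 means "slot unused" */
	void *allocator;
};

struct mem_registry_model {
	struct mem_entry_model slots[MEM_ID_SLOTS];
	int next_id;     /* last ID handed out; IDs are never reused here */
};

/* Register an allocator; returns its ID, or -1 when the table is full. */
int mem_register(struct mem_registry_model *reg, void *allocator)
{
	for (size_t i = 0; i < MEM_ID_SLOTS; i++) {
		if (reg->slots[i].id == 0) {
			reg->slots[i].id = ++reg->next_id;
			reg->slots[i].allocator = allocator;
			return reg->slots[i].id;
		}
	}
	return -1;
}

/* Drop the mapping for an ID; later lookups of that ID fail cleanly. */
void mem_unregister(struct mem_registry_model *reg, int id)
{
	if (id <= 0)
		return;
	for (size_t i = 0; i < MEM_ID_SLOTS; i++) {
		if (reg->slots[i].id == id) {
			reg->slots[i].id = 0;
			reg->slots[i].allocator = NULL;
		}
	}
}

/* Resolve an ID; a stale ID yields NULL instead of a dangling pointer. */
void *mem_lookup(const struct mem_registry_model *reg, int id)
{
	if (id <= 0)
		return NULL;
	for (size_t i = 0; i < MEM_ID_SLOTS; i++) {
		if (reg->slots[i].id == id)
			return reg->slots[i].allocator;
	}
	return NULL;
}
```

A frame that outlives its RX queue carries only the ID, so the late lookup can detect that the allocator is gone rather than following a pointer whose lifetime has ended.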
|
|
Now all the users of ndo_xdp_xmit have been converted to use xdp_return_frame.
This enables a different memory model, thus activating another code path
in the xdp_return_frame API.
V2: Fixed issues pointed out by Tariq.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Also convert driver i40e, which very recently got XDP_REDIRECT support
in commit d9314c474d4f ("i40e: add support for XDP_REDIRECT").
V7: This patch got added in V7 of this patchset.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The virtio_net driver assumes XDP frames are always released based on
page refcnt (via put_page). Thus, it only queues the XDP data pointer
address and uses virt_to_head_page() to retrieve struct page.
Use the XDP return API to get away from such assumptions. Instead
queue an xdp_frame, which allows us to use the xdp_return_frame API
when releasing the frame.
V8: Avoid endianness issues (found by kbuild test robot)
V9: Change __virtnet_xdp_xmit from bool to int return value (found by Dan Carpenter)
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The tuntap driver invented its own driver-specific way of queuing
XDP packets, by storing the xdp_buff information in the top of
the XDP frame data.
Convert it over to use the more generic xdp_frame structure. The
main problem with the in-driver method is that the xdp_rxq_info pointer
cannot be trusted/used when dequeueing the frame.
V3: Remove check based on feedback from Jason
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend struct ixgbe_tx_buffer to store the xdp_mem_info.
Notice that this could be optimized further by putting this into
a union in the struct ixgbe_tx_buffer, but this patchset
works towards removing this again. Thus, this is not done.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This implements basic XDP redirect support in the mlx5 driver.
Notice that ndo_xdp_xmit() is NOT implemented, because that API
needs some changes that this patchset is working towards.
The main purpose of this patch is to have different drivers doing
XDP_REDIRECT to show how different memory models behave in a cross
driver world.
Update(pre-RFCv2 Tariq): Need to DMA unmap page before xdp_do_redirect,
as the return API does not exist yet to keep this mapped.
Update(pre-RFCv3 Saeed): Don't mix XDP_TX and XDP_REDIRECT flushing,
introduce xdpsq.db.redirect_flush boolean.
V9: Adjust for commit 121e89275471 ("net/mlx5e: Refactor RQ XDP_TX indication")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
1. Added red_drops stats. Inbound packets dropped by RED, buffer exhaustion
2. Included fcs_err, jabber_err, l2_err and frame_err errors under
rx_errors
3. Included fifo_err, dmac_drop, red_drops, fw_err_pko, fw_err_link and
fw_err_drop under rx_dropped
4. Included max_collision_fail, max_deferral_fail, total_collisions,
fw_err_pko, fw_err_link, fw_err_drop and fw_err_pki under tx_dropped
5. Counting dma mapping errors
6. Added descriptions for some firmware stats and removed them for others
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Acked-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace magic number "0x5 << MAX_READ_REQUEST_SHIFT" with the
appropriate constant as defined in PCI core.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
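What the magic number in the commit above encodes can be shown with a short sketch. It assumes the PCIe Device Control register layout, where Max_Read_Request_Size occupies bits 14:12 and is encoded as 128 bytes shifted left by the field value, so that 0x5 means 4096 bytes. The shift value of 12 and all macro and function names below are assumptions of this sketch, not the identifiers used by the driver or the PCI core.

```c
#include <stdint.h>

/*
 * Illustrative names for the Max_Read_Request_Size field of the PCIe
 * Device Control register (bits 14:12).
 */
#define DEVCTL_READRQ_SHIFT 12
#define DEVCTL_READRQ_MASK  (0x7u << DEVCTL_READRQ_SHIFT)
#define DEVCTL_READRQ_4096B (0x5u << DEVCTL_READRQ_SHIFT)

/* Decode the read request size in bytes from a Device Control value. */
unsigned int devctl_readrq_bytes(uint16_t devctl)
{
	unsigned int field = (devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT;

	return 128u << field;
}

/* Replace only the read request field, leaving the other bits untouched. */
uint16_t devctl_set_readrq(uint16_t devctl, uint16_t readrq)
{
	return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
			  (readrq & DEVCTL_READRQ_MASK));
}
```

A named constant states the intent (a 4096-byte maximum read request) at the call site, where the shifted literal requires the reader to know the register layout.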
|
|
Switch stmmac_mode_ops to generic Hardware Interface Helpers instead of
using hard-coded callbacks. This makes the code more readable and more
flexible.
No functional change.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Switch stmmac_hwtimestamp to generic Hardware Interface Helpers instead
of using hard-coded callbacks. This makes the code more readable and
more flexible.
No functional change.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Switch stmmac_ops to generic Hardware Interface Helpers instead of using
hard-coded callbacks. This makes the code more readable and more
flexible.
No functional change.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Switch stmmac_dma_ops to generic Hardware Interface Helpers instead of
using hard-coded callbacks. This makes the code more readable and more
flexible.
No functional change.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Switch stmmac_desc_ops to generic Hardware Interface Helpers instead of
using hard-coded callbacks. This makes the code more readable and more
flexible.
No functional change.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
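The "generic helper instead of hard-coded callback" pattern that the five stmmac commits above apply can be sketched like this. It is a minimal illustration, not the stmmac implementation: the ops table, `struct hw_model`, the `hw_do_void_callback()` macro and the wrapper macros are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; a given core may leave some callbacks unset. */
struct dma_ops_model {
	void (*start_tx)(int *regs, int chan);
	void (*stop_tx)(int *regs, int chan);
};

struct hw_model {
	const struct dma_ops_model *dma;
	int regs[4];
};

/*
 * Generic helper: invoke the callback when both the table and the entry
 * exist, otherwise evaluate to -EINVAL instead of dereferencing NULL.
 */
#define hw_do_void_callback(hw, module, cb, ...)		\
	(((hw)->module && (hw)->module->cb)			\
		? ((hw)->module->cb(__VA_ARGS__), 0)		\
		: -EINVAL)

/* Call sites use short wrappers rather than spelling out the pointer chain. */
#define hw_start_tx(hw, chan) \
	hw_do_void_callback(hw, dma, start_tx, (hw)->regs, chan)
#define hw_stop_tx(hw, chan) \
	hw_do_void_callback(hw, dma, stop_tx, (hw)->regs, chan)

/* Example callback for one hypothetical core. */
void demo_start_tx(int *regs, int chan)
{
	regs[chan] = 1;
}
```

Centralising the NULL checks in one macro keeps call sites short and lets a core omit a callback without every caller having to test for it.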
|
|
When the interface is down, the head/tail of the descriptor
ring address is set to 0 in netsec_netdev_stop().
But the netsec hardware still keeps the previous descriptor
ring address, so there is an inconsistency between driver
and hardware after the interface is brought up at a later time.
To address this inconsistency, add a netsec_reset_hardware() call
when the interface is down.
In addition, to minimize the reset process,
add a flag to decide whether the driver loads the netsec microcode.
Even if the driver resets the netsec hardware, the netsec microcode
stays resident in RAM, so it is OK to load the microcode only
at initialization.
This patch is critical for installation over network.
Signed-off-by: Masahisa KOJIMA <masahisa.kojima@linaro.org>
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enable TX-irq as well during ndo_open() as we can not count upon
RX to arrive early enough to trigger the napi. This patch is critical
for installation over network.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The usage of of_device_get_match_data() reduces the code size a bit.
Also, the only way to call mtk_probe() is to match an entry in
of_mtk_match[], so match cannot be NULL.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) In ip_gre tunnel, handle the conflict between TUNNEL_{SEQ,CSUM} and
GSO/LLTX properly. From Sabrina Dubroca.
2) Stop properly on error in lan78xx_read_otp(), from Phil Elwell.
3) Don't uncompress in slip before rstate is initialized, from Tejaswi
Tanikella.
4) When using 1.x firmware on aquantia, issue a deinit before we
hardware reset the chip, otherwise we break dirty wake WOL. From
Igor Russkikh.
5) Correct log check in vhost_vq_access_ok(), from Stefan Hajnoczi.
6) Fix ethtool -x crashes in bnxt_en, from Michael Chan.
7) Fix races in l2tp tunnel creation and duplicate tunnel detection,
from Guillaume Nault.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits)
l2tp: fix race in duplicate tunnel detection
l2tp: fix races in tunnel creation
tun: send netlink notification when the device is modified
tun: set the flags before registering the netdevice
lan78xx: Don't reset the interface on open
bnxt_en: Fix NULL pointer dereference at bnxt_free_irq().
bnxt_en: Need to include RDMA rings in bnxt_check_rings().
bnxt_en: Support max-mtu with VF-reps
bnxt_en: Ignore src port field in decap filter nodes
bnxt_en: do not allow wildcard matches for L2 flows
bnxt_en: Fix ethtool -x crash when device is down.
vhost: return bool from *_access_ok() functions
vhost: fix vhost_vq_access_ok() log check
vhost: Fix vhost_copy_to_user()
net: aquantia: oops when shutdown on already stopped device
net: aquantia: Regression on reset with 1.x firmware
cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
slip: Check if rstate is initialized before uncompressing
lan78xx: Avoid spurious kevent 4 "error"
lan78xx: Correctly indicate invalid OTP
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"A few fixes of Xen related core code and drivers"
* tag 'for-linus-4.17-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen
xen/acpi: off by one in read_acpi_id()
xen/acpi: upload _PSD info for non Dom0 CPUs too
x86/xen: Delay get_cpu_cap until stack canary is established
xen: xenbus_dev_frontend: Verify body of XS_TRANSACTION_END
xen: xenbus: Catch closing of non existent transactions
xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Prevent bus reference leak in mmc_blk_init()
MMC host:
- tmio: Fix error handling when issuing CMD23
- jz4740: Fix race condition in IRQ mask update"
* tag 'mmc-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: tmio: Fix error handling when issuing CMD23
mmc: core: Prevent bus reference leak in mmc_blk_init()
mmc: jz4740: Fix race condition in IRQ mask update
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb
Pull kdb updates from Jason Wessel:
- fix 2032 time access issues and new compiler warnings
- minor regression test cleanup
- formatting fixes for end user use of kdb
* tag 'for_linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
kdb: use memmove instead of overlapping memcpy
kdb: use ktime_get_mono_fast_ns() instead of ktime_get_ts()
kdb: bl: don't use tab character in output
kdb: drop newline in unknown command output
kdb: make "mdr" command repeat
kdb: use __ktime_get_real_seconds instead of __current_kernel_time
misc: kgdbts: Display progress of asynchronous tests
|
|
Pull virtio update from Michael Tsirkin:
"This adds reporting hugepage stats to virtio-balloon"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: export hugetlb page allocation counts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- OF_IOMMU support for the Rockchip iommu driver so that it can use
generic DT bindings
- rework of locking in the AMD IOMMU interrupt remapping code to make
it work better in RT kernels
- support for improved iotlb flushing in the AMD IOMMU driver
- support for 52-bit physical and virtual addressing in the ARM-SMMU
- various other small fixes and cleanups
* tag 'iommu-updates-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
iommu/io-pgtable-arm: Avoid warning with 32-bit phys_addr_t
iommu/rockchip: Support sharing IOMMU between masters
iommu/rockchip: Add runtime PM support
iommu/rockchip: Fix error handling in init
iommu/rockchip: Use OF_IOMMU to attach devices automatically
iommu/rockchip: Use IOMMU device for dma mapping operations
dt-bindings: iommu/rockchip: Add clock property
iommu/rockchip: Control clocks needed to access the IOMMU
iommu/rockchip: Fix TLB flush of secondary IOMMUs
iommu/rockchip: Use iopoll helpers to wait for hardware
iommu/rockchip: Fix error handling in attach
iommu/rockchip: Request irqs in rk_iommu_probe()
iommu/rockchip: Fix error handling in probe
iommu/rockchip: Prohibit unbind and remove
iommu/amd: Return proper error code in irq_remapping_alloc()
iommu/amd: Make amd_iommu_devtable_lock a spin_lock
iommu/amd: Drop the lock while allocating new irq remap table
iommu/amd: Factor out setting the remap table for a devid
iommu/amd: Use `table' instead `irt' as variable name in amd_iommu_update_ga()
iommu/amd: Remove the special case from alloc_irq_table()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These include one big-ticket item which is the rework of the idle loop
in order to prevent CPUs from spending too much time in shallow idle
states. It reduces idle power on some systems by 10% or more and may
improve performance of workloads in which the idle loop overhead
matters. This has been in the works for several weeks and it has been
tested and reviewed quite thoroughly.
Also included are changes that finalize the cpufreq cleanup moving
frequency table validation from drivers to the core, a few fixes and
cleanups of cpufreq drivers, a cpuidle documentation update and a PM
QoS core update to mark the expected switch fall-throughs in it.
Specifics:
- Rework the idle loop in order to prevent CPUs from spending too
much time in shallow idle states by making it stop the scheduler
tick before putting the CPU into an idle state only if the idle
duration predicted by the idle governor is long enough.
That required the code to be reordered to invoke the idle governor
before stopping the tick, among other things (Rafael Wysocki,
Frederic Weisbecker, Arnd Bergmann).
- Add the missing description of the residency sysfs attribute to the
cpuidle documentation (Prashanth Prakash).
- Finalize the cpufreq cleanup moving frequency table validation from
drivers to the core (Viresh Kumar).
- Fix a clock leak regression in the armada-37xx cpufreq driver
(Gregory Clement).
- Fix the initialization of the CPU performance data structures for
shared policies in the CPPC cpufreq driver (Shunyong Yang).
- Clean up the ti-cpufreq, intel_pstate and CPPC cpufreq drivers a
bit (Viresh Kumar, Rafael Wysocki).
- Mark the expected switch fall-throughs in the PM QoS core (Gustavo
Silva)"
* tag 'pm-4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
tick-sched: avoid a maybe-uninitialized warning
cpufreq: Drop cpufreq_table_validate_and_show()
cpufreq: SCMI: Don't validate the frequency table twice
cpufreq: CPPC: Initialize shared perf capabilities of CPUs
cpufreq: armada-37xx: Fix clock leak
cpufreq: CPPC: Don't set transition_latency
cpufreq: ti-cpufreq: Use builtin_platform_driver()
cpufreq: intel_pstate: Do not include debugfs.h
PM / QoS: mark expected switch fall-throughs
cpuidle: Add definition of residency to sysfs documentation
time: hrtimer: Use timerqueue_iterate_next() to get to the next timer
nohz: Avoid duplication of code related to got_idle_tick
nohz: Gather tick_sched booleans under a common flag field
cpuidle: menu: Avoid selecting shallow states with stopped tick
cpuidle: menu: Refine idle state selection for running tick
sched: idle: Select idle state before stopping the tick
time: hrtimer: Introduce hrtimer_next_event_without()
time: tick-sched: Split tick_nohz_stop_sched_tick()
cpuidle: Return nohz hint from cpuidle_select()
jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC
...
|
|
Pull UBI and UBIFS updates from Richard Weinberger:
"Minor bug fixes and improvements"
* tag 'tags/upstream-4.17-rc1' of git://git.infradead.org/linux-ubifs:
ubi: Reject MLC NAND
ubifs: Remove useless parameter of lpt_heap_replace
ubifs: Constify struct ubifs_lprops in scan_for_leb_for_idx
ubifs: remove unnecessary assignment
ubi: Fix error for write access
ubi: fastmap: Don't flush fastmap work on detach
ubifs: Check ubifs_wbuf_sync() return code
|
|
I added dumping of link information about tun devices over netlink in
commit 1ec010e70593 ("tun: export flags, uid, gid, queue information
over netlink"), but didn't add the missing netlink notifications when
the device's exported properties change.
This patch adds notifications when owner/group or flags are modified,
when queues are attached/detached, and when a tun fd is closed.
Reported-by: Thomas Haller <thaller@redhat.com>
Fixes: 1ec010e70593 ("tun: export flags, uid, gid, queue information over netlink")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Otherwise, register_netdevice advertises the creation of the device with
the default flags, instead of what the user requested.
Reported-by: Thomas Haller <thaller@redhat.com>
Fixes: 1ec010e70593 ("tun: export flags, uid, gid, queue information over netlink")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 92571a1aae40 ("lan78xx: Connect phy early") moves the PHY
initialisation into lan78xx_probe, but lan78xx_open subsequently calls
lan78xx_reset. As well as forcing a second round of link negotiation,
this reset frequently prevents the phy interrupt from being generated
(even though the link is up), rendering the interface unusable.
Fix this issue by removing the lan78xx_reset call from lan78xx_open.
Fixes: 92571a1aae40 ("lan78xx: Connect phy early")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When open fails during ethtool -L ring change, for example, the driver
may crash at bnxt_free_irq() because bp->bnapi is NULL.
If we fail to allocate all the new rings, bnxt_open_nic() will free
all the memory including bp->bnapi. Subsequent call to bnxt_close_nic()
will try to dereference bp->bnapi in bnxt_free_irq().
Fix it by checking for !bp->bnapi in bnxt_free_irq().
Fixes: e5811b8c09df ("bnxt_en: Add IRQ remapping logic.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
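The failure sequence described in the commit above (a failed open frees the per-ring array, then the close path walks it) and the guard that fixes it can be modelled in a few lines. The structures and `free_irq_model()` below are hypothetical and only mirror the shape of the problem, not the bnxt_en code.

```c
#include <stddef.h>

/* Hypothetical per-ring state. */
struct napi_model {
	int irq_requested;
};

/* Hypothetical device state. */
struct bp_model {
	struct napi_model **bnapi; /* NULL once ring memory has been freed */
	int nr_rings;
};

/*
 * Release every requested IRQ and return how many were released.
 * Safe to call after a failed open, when bnapi has already been freed.
 */
int free_irq_model(struct bp_model *bp)
{
	int freed = 0;

	if (!bp->bnapi)
		return 0; /* nothing allocated, so nothing to release */

	for (int i = 0; i < bp->nr_rings; i++) {
		if (bp->bnapi[i]->irq_requested) {
			bp->bnapi[i]->irq_requested = 0;
			freed++;
		}
	}
	return freed;
}
```

Placing the check inside the release function makes it safe on every path that can reach it, rather than relying on each caller to know whether the open succeeded.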
|
|
With recent changes to reserve both L2 and RDMA rings, we need to include
the RDMA rings in bnxt_check_rings(). Otherwise we will under-estimate
the rings we need during ethtool -L and may lead to failure.
Fixes: fbcfc8e46741 ("bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a VF is configured with a bigger MTU (> 1500), any packets that
are punted to the VF-rep (slow-path) get dropped by OVS kernel-datapath
with the following message: "dropped over-mtu packet". Fix this by
returning the max-mtu value for a VF-rep derived from its corresponding VF.
VF-rep's mtu can be changed using 'ip' command as shown in this example:
$ ip link set bnxt0_pf0vf0 mtu 9000
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver currently uses src port field (along with other fields) in the
decap tunnel key, while looking up and adding tunnel nodes. This leads to
redundant cfa_decap_filter_alloc() requests to the FW and flow-miss in the
flow engine. Fix this by ignoring the src port field in decap tunnel nodes.
Fixes: f484f6782e01 ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
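A minimal sketch of the idea, assuming a much smaller key than the driver really uses: the lookup key is normalised by clearing the source port, so two decap flows that differ only in source port resolve to the same tunnel node. struct tunnel_key, decap_lookup_key() and decap_keys_match() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunnel key; the real driver's key carries more fields. */
struct tunnel_key {
    uint32_t tun_id;
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Build the key used for decap lookups: identical to the input, minus src_port. */
struct tunnel_key decap_lookup_key(const struct tunnel_key *in)
{
    struct tunnel_key k;

    assert(in != NULL);
    k = *in;
    k.src_port = 0; /* ignored for decap tunnel nodes */
    return k;
}

/* Return 1 if both keys map to the same decap tunnel node, 0 otherwise. */
int decap_keys_match(const struct tunnel_key *a, const struct tunnel_key *b)
{
    struct tunnel_key ka = decap_lookup_key(a);
    struct tunnel_key kb = decap_lookup_key(b);

    return ka.tun_id == kb.tun_id &&
           ka.src_ip == kb.src_ip &&
           ka.dst_ip == kb.dst_ip &&
           ka.src_port == kb.src_port &&
           ka.dst_port == kb.dst_port;
}
```

Because encapsulating senders typically vary the UDP source port per flow, keeping it in the key would create one node, and one firmware request, per flow instead of per tunnel.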
|
|
Before this patch the following commands would succeed as far as the
user was concerned:
$ tc qdisc add dev p1p1 ingress
$ tc filter add dev p1p1 parent ffff: protocol all \
flower skip_sw action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:01/44 action drop
The flow offload infrastructure currently used does not support wildcard
matching for ethernet headers, so do not allow the second or third
command to succeed. If a user wants to drop traffic on that interface,
the protocol and MAC addresses need to be specified explicitly:
$ tc qdisc add dev p1p1 ingress
$ tc filter add dev p1p1 parent ffff: protocol arp \
flower skip_sw action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw action drop
...
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:01 action drop
$ tc filter add dev p1p1 parent ffff: protocol ipv4 \
flower skip_sw src_mac 00:02:00:00:00:02 action drop
...
There are also checks for VLAN parameters in this patch as other callers
may wildcard those parameters even if tc does not. Using different
flow infrastructure could allow this to work in the future for L2 flows,
but for now it does not.
Fixes: 2ae7408fedfe ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
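The kind of check described above can be sketched as follows. This is a simplified assumption about the rule, not the driver's exact logic: the ethertype must be matched exactly, and each MAC mask must be either entirely unset or entirely set, so a partial mask such as /44 is refused. struct l2_mask and l2_mask_offloadable() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical L2 match mask: a set bit means "this bit must match". */
struct l2_mask {
    uint8_t smac[6];
    uint8_t dmac[6];
    uint16_t ether_type;
};

/* Return 1 if every byte in p[0..len) equals v, 0 otherwise. */
static int all_bytes_equal(const uint8_t *p, size_t len, uint8_t v)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (p[i] != v)
            return 0;
    return 1;
}

/* A MAC mask is acceptable if the field is unused (all 0x00) or exact (all 0xff). */
static int mac_mask_ok(const uint8_t mac[6])
{
    return all_bytes_equal(mac, 6, 0x00) || all_bytes_equal(mac, 6, 0xff);
}

/* Return 1 if the mask can be offloaded under the simplified rule, 0 otherwise. */
int l2_mask_offloadable(const struct l2_mask *m)
{
    assert(m != NULL);

    if (m->ether_type != 0xffff) /* "protocol all" leaves this wildcarded */
        return 0;
    return mac_mask_ok(m->smac) && mac_mask_ok(m->dmac);
}
```

Under this rule the "protocol all" filter and the /44 source MAC filter are both rejected, while the per-protocol and exact-MAC filters are accepted.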
|
|
Fix ethtool .get_rxfh() crash by checking for a valid indirection table
address before copying the data.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge more updates from Andrew Morton:
- almost all of the rest of MM
- kasan updates
- lots of procfs work
- misc things
- lib/ updates
- checkpatch
- rapidio
- ipc/shm updates
- the start of willy's XArray conversion
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (140 commits)
page cache: use xa_lock
xarray: add the xa_lock to the radix_tree_root
fscache: use appropriate radix tree accessors
export __set_page_dirty
unicore32: turn flush_dcache_mmap_lock into a no-op
arm64: turn flush_dcache_mmap_lock into a no-op
mac80211_hwsim: use DEFINE_IDA
radix tree: use GFP_ZONEMASK bits of gfp_t for flags
linux/const.h: refactor _BITUL and _BITULL a bit
linux/const.h: move UL() macro to include/linux/const.h
linux/const.h: prefix include guard of uapi/linux/const.h with _UAPI
xen, mm: allow deferred page initialization for xen pv domains
elf: enforce MAP_FIXED on overlaying elf segments
fs, elf: drop MAP_FIXED usage from elf_map
mm: introduce MAP_FIXED_NOREPLACE
MAINTAINERS: update bouncing aacraid@adaptec.com addresses
fs/dcache.c: add cond_resched() in shrink_dentry_list()
include/linux/kfifo.h: fix comment
ipc/shm.c: shm_split(): remove unneeded test for NULL shm_file_data.vm_ops
kernel/sysctl.c: add kdoc comments to do_proc_do{u}intvec_minmax_conv_param
...
|
|
Remove the address_space ->tree_lock and use the xa_lock newly added to
the radix_tree_root. Rename the address_space ->page_tree to ->i_pages,
since we don't really care that it's a tree.
[willy@infradead.org: fix nds32, fs/dax.c]
Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.org
Link: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is preferred to opencoding an IDA_INIT.
Link: http://lkml.kernel.org/r/20180313132639.17387-2-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Once the dma request is passed to the DMA engine, the DMA subsystem
would hold a pointer to this structure and could call the completion
callback after do_dma_request() has timed out.
The current code deals with this by putting timed-out SYNC requests on a
pending list and freeing them later, when the mport cdev device is
released. This still does not guarantee that the DMA subsystem is
really done with those transfers, so in theory
dma_xfer_callback/dma_req_free could be called after
mport_cdev_release_dma and could potentially access already freed
memory.
This patch simplifies the current handling by using a kref in the mport
dma request structure, so that it gets freed only when nobody uses it
anymore.
This also simplifies the code a bit, as FAF transfers are now handled in
the same way as SYNC and ASYNC transfers. There is no need anymore for
the pending list and for the dma workqueue which was used in case of FAF
transfers, so we remove them both.
Link: http://lkml.kernel.org/r/20180405203342.GA16191@nokia.com
Signed-off-by: Ioan Nicu <ioan.nicu.ext@nokia.com>
Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Frank Kunz <frank.kunz@nokia.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
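The reference-counting scheme described above can be sketched in userspace C. This is an illustration of the pattern only: it uses C11 atomics instead of the kernel's kref, and struct dma_req, dma_req_alloc(), dma_req_get(), dma_req_put() and the freed_flag member are hypothetical names, not the rapidio driver's API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical request object shared by the submitter and the DMA callback. */
struct dma_req {
    atomic_int refcount;
    int *freed_flag; /* hypothetical: lets a caller observe the release */
};

/* Allocate a request holding one reference, owned by the submitter. */
struct dma_req *dma_req_alloc(int *freed_flag)
{
    struct dma_req *r = malloc(sizeof(*r));

    if (!r)
        return NULL;
    atomic_init(&r->refcount, 1);
    r->freed_flag = freed_flag;
    return r;
}

/* Take an extra reference, e.g. before handing the request to the DMA engine. */
void dma_req_get(struct dma_req *r)
{
    assert(r != NULL);
    atomic_fetch_add(&r->refcount, 1);
}

/* Drop a reference; returns 1 if this call freed the request, 0 otherwise. */
int dma_req_put(struct dma_req *r)
{
    assert(r != NULL);
    if (atomic_fetch_sub(&r->refcount, 1) == 1) {
        if (r->freed_flag)
            *r->freed_flag = 1;
        free(r);
        return 1;
    }
    return 0;
}
```

The submitter can drop its reference on timeout while the completion callback still holds its own, so the memory is released by whichever side finishes last and no pending list is needed.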
|