Age | Commit message (Collapse) | Author |
|
Pull networking fixes from David Miller:
"Let's begin the holiday weekend with some networking fixes:
1) Whoops need to restrict cfg80211 wiphy names even more to 64
bytes. From Eric Biggers.
2) Fix flags being ignored when using kernel_connect() with SCTP,
from Xin Long.
3) Use after free in DCCP, from Alexey Kodanev.
4) Need to check rhltable_init() return value in ipmr code, from Eric
Dumazet.
5) XDP handling fixes in virtio_net from Jason Wang.
6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.
7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
Morgenstein.
8) Prevent out-of-bounds speculation using indexes in BPF, from
Daniel Borkmann.
9) Fix regression added by AF_PACKET link layer cure, from Willem de
Bruijn.
10) Correct ENIC dma mask, from Govindarajulu Varadarajan.
11) Missing config options for PMTU tests, from Stefano Brivio"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
ibmvnic: Fix partial success login retries
selftests/net: Add missing config options for PMTU tests
mlx4_core: allocate ICM memory in page size chunks
enic: set DMA mask to 47 bit
ppp: remove the PPPIOCDETACH ioctl
ipv4: remove warning in ip_recv_error
net : sched: cls_api: deal with egdev path only if needed
vhost: synchronize IOTLB message with dev cleanup
packet: fix reserve calculation
net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
bpf: properly enforce index mask to prevent out-of-bounds speculation
net/mlx4: Fix irq-unsafe spinlock usage
net: phy: broadcom: Fix bcm_write_exp()
net: phy: broadcom: Fix auxiliary control register reads
net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message
ibmvnic: Only do H_EOI for mobility events
tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
virtio-net: fix leaking page for gso packet during mergeable XDP
...
|
|
There is no point in testing .set_mmss versus .set_mmss64 as there are both
taking the exact same argument (truncated for set_mmss though).
Also, this allows to constify struct rtc_ops.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
The case of a new numa node got missed in avoiding using the node info
from page_struct during hotplug. In this path we have a call to
register_mem_sect_under_node (which allows us to specify it is hotplug
so don't change the node), via link_mem_sections which unfortunately
does not.
Fix is to pass check_nid through link_mem_sections as well and disable
it in the new numa node path.
Note the bug only 'sometimes' manifests depending on what happens to be
in the struct page structures - there are lots of them and it only needs
to match one of them.
The result of the bug is that (with a new memory only node) we never
successfully call register_mem_sect_under_node so don't get the memory
associated with the node in sysfs and meminfo for the node doesn't
report it.
It came up whilst testing some arm64 hotplug patches, but appears to be
universal. Whilst I'm triggering it by removing then reinserting memory
to a node with no other elements (thus making the node disappear then
appear again), it appears it would happen on hotplugging memory where
there was none before and it doesn't seem to be related the arm64
patches.
These patches call __add_pages (where most of the issue was fixed by
Pavel's patch). If there is a node at the time of the __add_pages call
then all is well as it calls register_mem_sect_under_node from there
with check_nid set to false. Without a node that function returns
having not done the sysfs related stuff as there is no node to use.
This is expected but it is the resulting path that fails...
Exact path to the problem is as follows:
mm/memory_hotplug.c: add_memory_resource()
The node is not online so we enter the 'if (new_node)' twice, on the
second such block there is a call to link_mem_sections which calls
into
drivers/node.c: link_mem_sections() which calls
drivers/node.c: register_mem_sect_under_node() which calls
get_nid_for_pfn and keeps trying until the output of that matches
the expected node (passed all the way down from
add_memory_resource)
It is effectively the same fix as the one referred to in the fixes tag
just in the code path for a new node where the comments point out we
have to rerun the link creation because it will have failed in
register_new_memory (as there was no node at the time). (actually that
comment is wrong now as we don't have register_new_memory any more it
got renamed to hotplug_memory_register in Pavel's patch).
Link: http://lkml.kernel.org/r/20180504085311.1240-1-Jonathan.Cameron@huawei.com
Fixes: fc44f7f9231a ("mm/memory_hotplug: don't read nid from struct page during hotplug")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Move all RQ, SQ and channel counters from the channel objects into the
priv structure. With this change, counters will not be reset upon
channel configuration changes.
Channel's statistics for SQs which are associated with TCs higher than
zero will be presented in ethtool -S, only for SQs which were opened at
least once since the module was loaded (regardless of their open/close
current status). This is done in order to decrease the total amount of
statistics presented and calculated for the common out of box use (no
QoS).
mlx5e_channel_stats is a compound of CH,RQ,SQs stats in order to
create locality for the NAPI when handling TX and RX of the same
channel.
Align the new statistics struct per ring to avoid several channels
update to the same cache line at the same time.
Packet rate was tested, no degradation sensed.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
CC: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We are now able to configure a pipeline directly into a local display
list body. Take advantage of this fact, and create a cacheable body to
store the configuration of the pipeline in the pipeline object.
vsp1_video_pipeline_run() is now the last user of the pipe->dl object.
Convert this function to use the cached pipe->stream_config body and
obtain a local display list reference.
Attach the pipe->stream_config body to the display list when needed
before committing to hardware.
Use a flag 'configured' to know when we should attach our stream_config
to the next outgoing display list to reconfigure the hardware in the
event of our first frame, or the first frame following a suspend/resume
cycle.
Our video DL usage now looks like the below output:
dl->body0 contains our disposable runtime configuration. Max 41.
dl_child->body0 is our partition specific configuration. Max 12.
dl->bodies shows our constant configuration and LUTs.
These two are LUT/CLU:
* dl->bodies[x]->num_entries 256 / max 256
* dl->bodies[x]->num_entries 4914 / max 4914
Which shows that our 'constant' configuration cache is currently
utilised to a maximum of 64 entries.
trace-cmd report | \
dl->body0->num_entries 13 / max 128
dl->body0->num_entries 14 / max 128
dl->body0->num_entries 16 / max 128
dl->body0->num_entries 20 / max 128
dl->body0->num_entries 27 / max 128
dl->body0->num_entries 34 / max 128
dl->body0->num_entries 41 / max 128
dl_child->body0->num_entries 10 / max 128
dl_child->body0->num_entries 12 / max 128
dl->bodies[x]->num_entries 15 / max 128
dl->bodies[x]->num_entries 16 / max 128
dl->bodies[x]->num_entries 17 / max 128
dl->bodies[x]->num_entries 18 / max 128
dl->bodies[x]->num_entries 20 / max 128
dl->bodies[x]->num_entries 21 / max 128
dl->bodies[x]->num_entries 256 / max 256
dl->bodies[x]->num_entries 31 / max 128
dl->bodies[x]->num_entries 32 / max 128
dl->bodies[x]->num_entries 39 / max 128
dl->bodies[x]->num_entries 40 / max 128
dl->bodies[x]->num_entries 47 / max 128
dl->bodies[x]->num_entries 48 / max 128
dl->bodies[x]->num_entries 4914 / max 4914
dl->bodies[x]->num_entries 55 / max 128
dl->bodies[x]->num_entries 56 / max 128
dl->bodies[x]->num_entries 63 / max 128
dl->bodies[x]->num_entries 64 / max 128
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Currently the entities store their configurations into a display list.
Adapt this such that the code can be configured into a body directly,
allowing greater flexibility and control of the content.
All users of vsp1_dl_list_write() are removed in this process, thus it
too is removed.
A helper, vsp1_dl_list_get_body0() is provided to access the internal body0
from the display list.
[laurent.pinchart+renesas@ideasonboard.com: Don't remove blank line unnecessarily]
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
The entities provide a single .configure operation which configures the
object into the target display list, based on the vsp1_entity_params
selection.
Split the configure function into three parts, '.configure_stream()',
'.configure_frame()', and '.configure_partition()' to facilitate
splitting the configuration of each parameter class into separate
display list bodies.
[laurent.pinchart+renesas@ideasonboard.com: Blank line reformatting, remote unneeded local variable initialization]
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Extend the display list body with a reference count, allowing bodies to
be kept as long as a reference is maintained. This provides the ability
to keep a cached copy of bodies which will not change, so that they can
be re-applied to multiple display lists.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Adapt the dl->body0 object to use an object from the body pool. This
greatly reduces the pressure on the TLB for IPMMU use cases, as all of
the lists use a single allocation for the main body.
The CLU and LUT objects pre-allocate a pool containing three bodies,
allowing a userspace update before the hardware has committed a previous
set of tables.
Bodies are no longer 'freed' in interrupt context, but instead released
back to their respective pools. This allows us to remove the garbage
collector in the DLM.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Each display list allocates a body to store register values in a dma
accessible buffer from a dma_alloc_wc() allocation. Each of these
results in an entry in the IOMMU TLB, and a large number of display list
allocations adds pressure to this resource.
Reduce TLB pressure on the IPMMUs by allocating multiple display list
bodies in a single allocation, and providing these to the display list
through a 'body pool'. A pool can be allocated by the display list
manager or entities which require their own body allocations.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
The body write function relies on the code never asking it to write more
than the entries available in the list.
Currently with each list body containing 256 entries, this is fine, but
we can reduce this number greatly saving memory. In preparation of this
add a level of protection to catch any buffer overflows.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Throughout the codebase, the term 'fragment' is used to represent a
display list body. This term duplicates the 'body' which is already in
use.
The datasheet references these objects as a body, therefore replace all
mentions of a fragment with a body, along with the corresponding
pluralised terms.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
The suspend and resume handlers are only utilised by video pipelines,
yet the functions currently reside in the vsp1_pipe object.
This causes an issue with resume, as the functions incorrectly call
vsp1_pipeline_run() directly instead of processing the video object
through vsp1_video_pipeline_run().
Move the functions to the video object, renaming accordingly and update
the resume handler to call vsp1_video_pipeline_run() as appropriate.
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Commit 372b2b0399fc ("media: v4l: vsp1: Release buffers in
start_streaming error path") introduced a helper to clean up buffers on
error paths, but inadvertently changed the code such that only the
output WPF buffers were cleaned, rather than the video node being
operated on.
Since then vsp1_video_cleanup_pipeline() has grown to perform both video
node cleanup, as well as pipeline cleanup. Split the implementation into
two distinct functions that perform the required work, so that each
video node can release its buffers correctly on streamoff. The pipe
cleanup that was performed in the vsp1_video_stop_streaming() (releasing
the pipe->dl) is moved to the function for clarity.
Fixes: 372b2b0399fc ("media: v4l: vsp1: Release buffers in start_streaming error path")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
VIDEO_RENESAS_VSP1 depends on ARCH_RENESAS && OF.
As ARCH_RENESAS implies OF, the latter can be dropped.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
The CEC receive buffer was not always cleared correctly. The
datasheet was a bit confusing since sometimes it mentioned that the
bit in CEC register 0x4a had to be toggled, and sometimes it suggested
it was a 'Clear-on-write' bit. But it really needs to be toggled.
The patch also enables/disables the CEC irqs after the other irq are
enabled/disabled instead of doing it before. It may not matter, but it
feels more logical to do it in that order, and the implementation that
we (Cisco) have used until now and that is known to be reliable also
did it in that order.
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
|
|
Subsequent patches in the series convert inode timestamps
to use struct timespec64 instead of struct timespec as
part of solving the y2038 problem.
Convert these print formats to use long long types to
avoid warnings and errors on conversion.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
CC: andreas.dilger@intel.com
|
|
In some cases pcie_get_minimum_link() returned misleading information
because it found the slowest link and the narrowest link without
considering the total bandwidth of the link.
For example, consider a path with these two links:
- 16.0 GT/s x1 link (16.0 * 10^9 * 128 / 130) * 1 / 8 = 1969 MB/s
- 2.5 GT/s x16 link ( 2.5 * 10^9 * 8 / 10) * 16 / 8 = 4000 MB/s
The available bandwidth of the path is limited by the 16 GT/s link to about
1969 MB/s, but pcie_get_minimum_link() returned 2.5 GT/s x1, which
corresponds to only 250 MB/s.
Callers should use pcie_print_link_status() instead, or
pcie_bandwidth_available() if they need more detailed information.
Remove pcie_get_minimum_link() since there are no callers left.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
The dmesg change is:
- PCI Express bandwidth of %dGT/s available
- (Speed:%s, Width: x%d, Encoding Loss:%s)
+ %u.%03u Gb/s available PCIe bandwidth (%s x%d link)
or, if the device is capable of better performance than is available in the
current slot:
- This is not sufficient for optimal performance of this card.
- For optimal performance, at least %dGT/s of bandwidth is required.
- A slot with more lanes and/or higher speed is suggested.
+ %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)
Note that the driver previously used dev_warn() to suggest using a
different slot, but pcie_print_link_status() uses dev_info() because if the
platform has no faster slot available, the user can't do anything about the
warning and may not want to be bothered with it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
The dmesg change is:
- PCIe link speed is %s, device supports %s
- PCIe link width is x%d, device supports x%d
+ %u.%03u Gb/s available PCIe bandwidth (%s x%d link)
or, if the device is capable of better performance than is available in the
current slot:
- A slot with more lanes and/or higher speed is suggested for optimal performance.
+ %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
The dmesg change is:
- PCIe: Speed %s Width x%d
+ %u.%03u Gb/s available PCIe bandwidth (%s x%d link)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Previously the driver used pcie_get_minimum_link() to warn when the NIC
is in a slot that can't supply as much bandwidth as the NIC could use.
pcie_get_minimum_link() can be misleading because it finds the slowest link
and the narrowest link (which may be different links) without considering
the total bandwidth of each link. For a path with a 16 GT/s x1 link and a
2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of
bandwidth, not the true available bandwidth of about 1969 MB/s for a
16 GT/s x1 link.
Use pcie_print_link_status() to report PCIe link speed and possible
limitations instead of implementing this in the driver itself. This finds
the slowest link in the path to the device by computing the total bandwidth
of each link and compares that with the capabilities of the device.
The dmesg change is:
- %s (%c%d) PCI-E x%d %s found at mem %lx, IRQ %d, node addr %pM
+ %s (%c%d) PCI-E found at mem %lx, IRQ %d, node addr %pM
+ %u.%03u Gb/s available PCIe bandwidth (%s x%d link)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Manipulating the enable_cnt behind the back of the driver will wreak
complete havoc with the kernel state, so disallow it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Keith Busch <keith.busch@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into next/drivers
Power-domain support for Rockchip socs px30, rk3128, rk3228 and rk3036.
* tag 'v4.18-rockchip-drivers-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
soc: rockchip: power-domain: add power domain support for px30
dt-bindings: power: add binding for px30 power domains
dt-bindings: power: add PX30 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3228
dt-bindings: power: add binding for rk3228 power domains
dt-bindings: power: add RK3228 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3128
dt-bindings: power: add binding for rk3128 power domains
dt-bindings: power: add RK3128 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3036
dt-bindings: power: add binding for rk3036 power domains
dt-bindings: power: add RK3036 SoCs header for power-domain
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/soc
One ti-sysc fix for v4.18 merge window
This fixes an array access errors if there are more optional clocks
than one.
* tag 'omap-for-v4.18/ti-sysc-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
bus: ti-sysc: Fix optional clocks array access
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Introduce a new read/write lock that will protect statistics gathering from
netdev channels configuration changes.
e.g. when channels are being replaced (increase/decrease number of rings)
prevent statistic gathering (ndo_get_stats64) to read the statistics of
in-active channels (channels that are being closed).
Plus update channels software statistics on the fly when calling
ndo_get_stats64, and remove it from stats periodic work.
Fixes: 9218b44dcc05 ("net/mlx5e: Statistics handling refactoring")
Signed-off-by: Shalom Lagziel <shaloml@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
PHY link down events counter belongs to phy_counters group.
although it has special handling, it doesn't mean it can't be there.
Move it to phy_counters_grp handler.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Complete the transition of all WQ types to use fragmented
order-0 coherent memory instead of high-order allocations.
CQ-WQ already uses order-0.
Here we do the same for cyclic and linked-list WQs.
This allows the driver to load cleanly on systems with a highly
fragmented coherent memory.
Performance tests:
ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet rate of 64B packets, single transmit ring, size 8K.
No degradation is sensed.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If CONFIG_MLX5_CORE_IPOIB is not set, compile-out the
IPOIB related headers.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We fill SQ edge with NOPs to avoid WQEs wrap.
Here, instead of doing that in advance for the maximum possible
WQE size, we do it on-demand using the actual WQE size.
We re-order some parts in mlx5e_sq_xmit to finish the calculation
of WQE size (ds_cnt) before doing any writes to the WQE buffer.
When SQ work queue is fragmented (introduced in an downstream patch),
dealing with WQE wraps becomes more frequent. This change would drastically
reduce the overhead in this case.
Performance tests:
ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet rate of 64B packets, single transmit ring, size 8K.
Before: 14.9 Mpps
After: 15.8 Mpps
Improvement of 6%.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Use the WQ API to get the WQ size, and to map a counter
into a WQ entry index.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If a TC rule needs to be split for mirroring, create two HW rules,
in the first level and the second level flow tables accordingly.
In the first level flow table, forward the packet to the mirror
port and forward the packet to the second level flow table for
further processing, eg. encap, vlan push or header re-write.
Currently the matching is repeated in both stages.
While here, simplify the setup of the vhca id valid indicator also
in the existing code.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Currently, we only support the mirred redirect TC sub-action. In order
to support flow based vport mirroring, add support to parse the mirred
mirror sub-action.
For mirroring, user-space will typically set the action order such that
the mirror port (mirror VF) sees packets as the original port (VF under
mirroring) sent them or as it will receive them.
In the general case, it means that packets are potentially sent to the
mirror port before or after some actions were applied on them. To
properly do that, we should follow on the exact action order as set for
the flow and make sure this will also be the case when we program the HW
offload.
We introduce a counter for the output ports (attr->out_count), which we
increase when parsing each mirred redirect/mirror sub-action and when
dealing with encap.
We introduce a counter (attr->mirror_count) telling us if split is
needed. If no split is needed and mirroring is just multicasting to
vport, the mirror count is zero, all the actions of the TC flow should
apply on that single HW flow.
If split is needed, the mirror count tells where to do the split, all
non-mirred tc actions should apply only after the split.
The mirror count is set while parsing the following actions encap/decap,
header re-write, vlan push/pop.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
If firmware supports the forward action with a destination list
that includes a flow table, create a second level FDB flow table.
This is going to be used for flow based mirroring under the switchdev
offloads mode.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
We have several fdb flow tables for each of the legacy and switchdev
modes. In the switchdev mode, there are fast path and slow path flow
tables. Towards adding more flow tables in upcoming patches, reorganize
and rename the various existing ones to reflect their functionality.
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
|
|
Create function qcom_smem_virt_to_phys(), which returns the physical
address corresponding to a given SMEM item's virtual address. This
feature is required for a driver that will soon be out for review.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
In qmi_handle_init(), a buffer is allocated for to hold messages
received through the handle's socket. Any "normal" messages
(expected by the caller) will have a header prepended, so the
buffer size is adjusted to accomodate that.
The buffer must also be of sufficient size to receive control
messages, so the size is increased if necessary to ensure these
will fit.
Unfortunately the calculation is done wrong, making it possible
for the calculated buffer size to be too small to hold a "normal"
message. Specifically, if:
recv_buf_size > sizeof(struct qrtr_ctrl_pkt) - sizeof(struct qmi_header)
AND
recv_buf_size < sizeof(struct qrtr_ctrl_pkt)
the current logic will use sizeof(struct qrtr_ctrl_pkt) as the
receive buffer size, which is not enough to hold the maximum
"normal" message plus its header. Currently this problem occurs
for (13 < recv_buf_size < 20).
This patch corrects this.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
Incoming Qualcomm changes for GENI, i2c [1], and cmd-db [2] are enabled
with COMPILE_TEST in drivers/soc/qcom. For this to work, the Makefile
in that directory has to be included unconditionally, rather than only
if ARCH_QCOM is enabled.
Example of the errors seen on allmodconfig with the GENI, i2c, and
cmd-db patches applied:
Kernel: arch/x86/boot/bzImage is ready (#1)
ERROR: "geni_se_select_mode" [drivers/tty/serial/qcom_geni_serial.ko] undefined!
ERROR: "geni_se_init" [drivers/tty/serial/qcom_geni_serial.ko] undefined!
ERROR: "geni_se_config_packing" [drivers/tty/serial/qcom_geni_serial.ko] undefined!
ERROR: "geni_se_resources_on" [drivers/tty/serial/qcom_geni_serial.ko] undefined!
ERROR: "geni_se_resources_off" [drivers/tty/serial/qcom_geni_serial.ko] undefined!
ERROR: "geni_se_tx_dma_unprep" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_tx_dma_prep" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_rx_dma_unprep" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_rx_dma_prep" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_select_mode" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_config_packing" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_init" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_resources_off" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
ERROR: "geni_se_resources_on" [drivers/i2c/busses/i2c-qcom-geni.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1
make: *** [Makefile:1237: modules] Error 2
[1] https://patchwork.ozlabs.org/cover/893437/
[2] https://lkml.org/lkml/2018/4/10/714
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
There's no sense in scanning the partition table again if we know
the global partition has already been discovered. Check for a
non-null global_partition pointer in qcom_smem_set_global_partition()
immediately.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
If there is at least one entry in the partition table, but no global
entry, the qcom_smem_set_global_partition() should return an error
just like it does if there are no partition table entries.
It turns out the function still returns an error in this case, but
it waits to do so until it has mistakenly treated the last entry in
the table as if it were the global entry found.
Fix the function to return immediately if no global entry is found
in the table.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
It's OK if the space for a newly-allocated uncached entry actually
touches the free cached space boundary. It's only a problem if it
would cross it.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
Two places report an error when a partition header is found to
not contain the right canary value. The error messages do not
properly byte swap the host ids. Fix this, and adjust the format
specificier to match the 16-bit unsigned data type.
Move the error handling for a bad canary value to the end of
qcom_smem_alloc_private(). This avoids some long lines, and
reduces the distraction of handling this unexpected problem.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
What phdr_to_last_uncached_entry() returns is the address of the
start of the free space following all allocated uncached entries.
It really doesn't refer to an actual (initialized) private entry
structure. Similarly phdr_to_last_cached_entry() returns the
address of the end of free space, preceding the last allocated cache
entry. Change both functions' return type to be pointer to void
to reflect this.
Meanwhile, phdr_to_first_cached_entry() really *does* point to a
private entry structure, so change its return type to reflect
this fact.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
Cached items are found at the high end of an smem partition. A
cached item's shared memory precedes the private entry structure
that describes it.
The address of the structure describing the first cached item should
be returned by phdr_to_first_cached_entry(). However the function
calculates the start address using the wrong structure size.
Fix this by computing the first item's entry structure address by
subtracting the size of a private entry structure rather than a
partition header structure.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
This driver deals with memory that is stored in little-endian format.
Update the structures with the proper little-endian types and then
do the proper conversions when reading the fields. Note that we compare
the ids with a memcmp() because we already pad out the string 'id' field
to exactly 8 bytes with the strncpy() onto the stack.
Cc: Mahesh Sivasubramanian <msivasub@codeaurora.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Evan Green <evgreen@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
Command DB is a simple database in the shared memory of QCOM SoCs, that
provides information regarding shared resources. Some shared resources
in the SoC have properties that are probed dynamically at boot by the
remote processor. The information pertaining to the SoC and the platform
are made available in the shared memory. Drivers can query this
information using predefined strings.
Signed-off-by: Mahesh Sivasubramanian <msivasub@codeaurora.org>
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/drivers
i.MX drivers update for 4.18:
- Use platform_device_add_data() instead of a pointer to a static
memory in gpc/gpcv2 driver for platform data passing, so that we
can avoid a BUG() when calling platform_device_put().
* tag 'imx-drivers-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: gpc: Do not pass static memory as platform data
soc: imx: gpcv2: Do not pass static memory as platform data
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Remove redundant debug prints from phy_read/write since we can trace those
calls through trace events. Enhance dynamic debug prints to print arguments
which helps figuring how what is going on at the driver level with higher level
configuration interfaces.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-05-19
This series contains updates for mlx5e netdevice driver with one subject,
DSCP to priority mapping, in the first patch Huy adds the needed API in
dcbnl, the second patch adds the needed mlx5 core capability bits for the
feature, and all other patches are mlx5e (netdev) only changes to add
support for the feature.
From: Huy Nguyen
Dscp to priority mapping for Ethernet packet:
These patches enable differentiated services code point (dscp) to
priority mapping for Ethernet packet. Once this feature is
enabled, the packet is routed to the corresponding priority based on its
dscp. User can combine this feature with priority flow control (pfc)
feature to have priority flow control based on the dscp.
Firmware interface:
Mellanox firmware provides two control knobs for this feature:
QPTS register allow changing the trust state between dscp and
pcp mode. The default is pcp mode. Once in dscp mode, firmware will
route the packet based on its dscp value if the dscp field exists.
QPDPM register allow mapping a specific dscp (0 to 63) to a
specific priority (0 to 7). By default, all the dscps are mapped to
priority zero.
Software interface:
This feature is controlled via application priority TLV. IEEE
specification P802.1Qcd/D2.1 defines priority selector id 5 for
application priority TLV. This APP TLV selector defines DSCP to priority
map. This APP TLV can be sent by the switch or can be set locally using
software such as lldptool. In mlx5 drivers, we add the support for net
dcb's getapp and setapp call back. Mlx5 driver only handles the selector
id 5 application entry (dscp application priority application entry).
If user sends multiple dscp to priority APP TLV entries on the same
dscp, the last sent one will take effect. All the previous sent will be
deleted.
This attribute combined with pfc attribute allows advanced user to
fine tune the qos setting for specific priority queue. For example,
user can give dedicated buffer for one or more priorities or user
can give large buffer to certain priorities.
The dcb buffer configuration will be controlled by lldptool.
>> lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6
maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6
>> lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0
sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively
After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to
choose dcbnl over devlink interface since this feature is intended to set
port attributes which are governed by the netdev instance of that port, where
devlink API is more suitable for global ASIC configurations.
The firmware trust state (in QPTS register) is changed based on the
number of dscp to priority application entries. When the first dscp to
priority application entry is added by the user, the trust state is
changed to dscp. When the last dscp to priority application entry is
deleted by the user, the trust state is changed to pcp.
When the port is in DSCP trust state, the transmit queue is selected
based on the dscp of the skb.
When the port is in DSCP trust state and vport inline mode is not NONE,
firmware requires mlx5 driver to copy the IP header to the
wqe ethernet segment inline header if the skb has it.
This is done by changing the transmit queue sq's min inline mode to L3.
Note that the min inline mode of sqs that belong to other features
such as xdpsq, icosq are not modified.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The call to free_netdev() in __rtl8139_cleanup_dev() clears the network device
napi list, and explicit calls to netif_napi_del() are unnecessary.
Signed-off-by: Bo Chen <chenbo@pdx.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|