Age | Commit message (Collapse) | Author |
|
The CXL memory device send interface will have a number of supported
commands. The raw command is not such a command. Raw commands allow
userspace to send a specified opcode to the underlying hardware and
bypass all driver checks on the command. The primary use for this
command is to [begrudgingly] allow undocumented vendor specific hardware
commands.
While not the main motivation, it also allows prototyping new hardware
commands without a driver patch and rebuild.
While this all sounds very powerful it comes with a couple of caveats:
1. Bug reports using raw commands will not get the same level of
attention as bug reports using supported commands (via taint).
2. Supported commands will be rejected by the RAW command.
With this comes new debugfs knob to allow full access to your toes with
your weapon of choice.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ariel Sibley <Ariel.Sibley@microchip.com>
Link: https://lore.kernel.org/r/20210217040958.1354670-6-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add a straightforward IOCTL that provides a mechanism for userspace to
query the supported memory device commands. CXL commands as they appear
to userspace are described as part of the UAPI kerneldoc. The command
list returned via this IOCTL will contain the full set of commands that
the driver supports, however, some of those commands may not be
available for use by userspace.
Memory device commands first appear in the CXL 2.0 specification. They
are submitted through a mailbox mechanism specified in the CXL 2.0
specification.
The send command allows userspace to issue mailbox commands directly to
the hardware. The list of available commands to send are the output of
the query command. The driver verifies basic properties of the command
and possibly inspect the input (or output) payload to determine whether
or not the command is allowed (or might taint the kernel).
Reported-by: kernel test robot <lkp@intel.com> # bug in earlier revision
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20210217040958.1354670-5-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Create the /sys/bus/cxl hierarchy to enumerate:
* Memory Devices (per-endpoint control devices)
* Memory Address Space Devices (platform address ranges with
interleaving, performance, and persistence attributes)
* Memory Regions (active provisioned memory from an address space device
that is in use as System RAM or delegated to libnvdimm as Persistent
Memory regions).
For now, only the per-endpoint control devices are registered on the
'cxl' bus. However, going forward it will provide a mechanism to
coordinate cross-device interleave.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> (v2)
Link: https://lore.kernel.org/r/20210217040958.1354670-4-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Provide enough functionality to utilize the mailbox of a memory device.
The mailbox is used to interact with the firmware running on the memory
device. The flow is proven with one implemented command, "identify".
Because the class code has already told the driver this is a memory
device and the identify command is mandatory.
CXL devices contain an array of capabilities that describe the
interactions software can have with the device or firmware running on
the device. A CXL compliant device must implement the device status and
the mailbox capability. Additionally, a CXL compliant memory device must
implement the memory device capability. Each of the capabilities can
[will] provide an offset within the MMIO region for interacting with the
CXL device.
The capabilities tell the driver how to find and map the register space
for CXL Memory Devices. The registers are required to utilize the CXL
spec defined mailbox interface. The spec outlines two mailboxes, primary
and secondary. The secondary mailbox is earmarked for system firmware,
and not handled in this driver.
Primary mailboxes are capable of generating an interrupt when submitting
a background command. That implementation is saved for a later time.
Reported-by: Colin Ian King <colin.king@canonical.com> (coverity)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> (smatch)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com> (v2)
Link: https://www.computeexpresslink.org/download-the-specification
Link: https://lore.kernel.org/r/20210217040958.1354670-3-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The CXL.mem protocol allows a device to act as a provider of "System
RAM" and/or "Persistent Memory" that is fully coherent as if the memory
was attached to the typical CPU memory controller.
With the CXL-2.0 specification a PCI endpoint can implement a "Type-3"
device interface and give the operating system control over "Host
Managed Device Memory". See section 2.3 Type 3 CXL Device.
The memory range exported by the device may optionally be described by
the platform firmware memory map, or by infrastructure like LIBNVDIMM to
provision persistent memory capacity from one, or more, CXL.mem devices.
A pre-requisite for Linux-managed memory-capacity provisioning is this
cxl_mem driver that can speak the mailbox protocol defined in section
8.2.8.4 Mailbox Registers.
For now just land the initial driver boiler-plate and Documentation/
infrastructure.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: David Rientjes <rientjes@google.com> (v1)
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://www.computeexpresslink.org/download-the-specification
Link: https://lore.kernel.org/r/20210217040958.1354670-2-ben.widawsky@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The driver core ignores the return value of struct bus_type::remove()
because there is only little that can be done. To simplify the quest to
make this function return void, let struct dax_device_driver::remove()
return void, too. All users already unconditionally return 0, this commit
makes it obvious that returning an error code isn't intended.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-6-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The dax core properly handles a dax driver not having a remove callback.
So drop it without changing the effective behaviour.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-5-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The static variable match_always_count is supposed to track if there is
a driver registered that has .match_always set. If driver_register()
fails, the previous increment must be undone.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-4-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
If all resources are allocated in .probe() using devm_ functions it
might make sense to not provide a .remove() callback. Then the right
thing is to just return success.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-3-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The bus probe function dax_bus_probe() calls the probe callback without
checking it to be non-NULL. Prevent a NULL pointer exception if a driver
without a probe function is registered by refusing to register this
driver.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Link: https://lore.kernel.org/r/20210205222842.34896-2-uwe@kleine-koenig.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
All drivers return 0 in their remove callback and the driver core ignores
the return value of nvdimm_bus_remove() anyhow. So simplify by changing
the driver remove callback to return void and return 0 unconditionally
to the upper layer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210212171043.2136580-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
nvdimm_remove is only ever called after nvdimm_probe() returned
successfully. In this case driver data is always set to a non-NULL value
so the check for driver data being NULL can go away as it's always false.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210212171043.2136580-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
|
|
Support also transmitting frames using the custom "8899 A"
4 byte tag.
Qingfang came up with the solution: we need to pad the
ethernet frame to 60 bytes using eth_skb_pad(), then the
switch will happily accept frames with custom tags.
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reported-by: DENG Qingfang <dqfext@gmail.com>
Fixes: efd7fe68f0c6 ("net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
smatch gives the warning:
drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR'
Which is trying to say smatch has shown that srv is not an error pointer
and thus cannot be passed to PTR_ERR.
The solution is to move the list_add() down after full initilization of
rtrs_srv. To avoid holding the srv_mutex too long, only hold it during the
list operation as suggested by Leon.
Fixes: 03e9b33a0fd6 ("RDMA/rtrs: Only allow addition of path to an already established session")
Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The current code computes a number of channels per SRP target and spreads
them equally across all online NUMA nodes. Each channel is then assigned
a CPU within this node.
In the case of unbalanced, or even unpopulated nodes, some channels do not
get a CPU associated and thus do not get connected. This causes the SRP
connection to fail.
This patch solves the issue by rewriting channel computation and
allocation:
- Drop channel to node/CPU association as it had no real effect on
locality but added unnecessary complexity.
- Tweak the number of channels allocated to reduce CPU contention when
possible:
- Up to one channel per CPU (instead of up to 4 by node)
- At least 4 channels per node, unless ch_count module parameter is
used.
Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Robert Hancock says:
====================
Broadcom PHY driver updates
Updates to the Broadcom PHY driver related to use with copper SFP modules.
Changed since v3:
-fixed kerneldoc error
Changed since v2:
-Create flag for PHY on SFP module and use that rather than accessing
attached_dev directly in PHY driver
Changed since v1:
-Reversed conditional to reduce indentation
-Added missing setting of MII_BCM54XX_AUXCTL_MISC_WREN in
MII_BCM54XX_AUXCTL_SHDWSEL_MISC register
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bcm54xx_config_init was modifying the PHY LED configuration to enable link
and activity indications. However, some SFP modules (such as Bel-Fuse
SFP-1GBT-06) have no LEDs but use the LED outputs to control the SFP LOS
signal, and modifying the LED settings will cause the LOS output to
malfunction. Skip this configuration for PHYs which are bound to an SFP
bus.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a flag and helper function to indicate that a PHY device is part of
an SFP module, which is set on attach. This can be used by PHY drivers
to handle SFP-specific quirks or behavior.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The default configuration for the BCM54616S PHY may not match the desired
mode when using 1000BaseX or SGMII interface modes, such as when it is on
an SFP module. Add code to explicitly set the correct mode using
programming sequences provided by Bel-Fuse:
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On cpu architectures w/o dma cache snooping, dma_unmap() is a
is a very expensive operation, because its resulting sync
needs to invalidate cpu caches.
Increase efficiency/performance by syncing only those sections
of the lan743x's rx ring buffers that are actually in use.
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.
Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.
Optimize by keeping the rx ring buffer size as close as possible
to the mtu. This limits the amount of cache that requires
invalidation.
This optimization would normally force us to re-allocate all
ring buffers when the mtu is changed - a disruptive event,
because it can only happen when the network interface is down.
Remove the need to re-allocate all ring buffers by adding support
for multi-buffer frames. Now any combination of mtu and ring
buffer size will work. When the mtu changes from mtu1 to mtu2,
consumed buffers of size mtu1 are lazily replaced by newly
allocated buffers of size mtu2.
These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.
Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:
Before:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 550 MBytes 231 Mbits/sec 0
After:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-20.00 sec 1.33 GBytes 570 Mbits/sec 0
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following call path suggests that calling unregister_netdev on an
interface that is up will first bring it down.
enetc_pf_remove
-> unregister_netdev
-> unregister_netdevice_queue
-> unregister_netdevice_many
-> dev_close_many
-> __dev_close_many
-> enetc_close
-> enetc_stop
-> phylink_stop
However, enetc first destroys the phylink instance, then calls
unregister_netdev. This is already dissimilar to the setup (and error
path teardown path) from enetc_pf_probe, but more than that, it is buggy
because it is invalid to call phylink_stop after phylink_destroy.
So let's first unregister the netdev (and let the .ndo_stop events
consume themselves), then destroy the phylink instance, then free the
netdev.
Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Maxime Chevallier says:
====================
net: mvneta: implement basic MQPrio support
This is V2 for the MQPrio support in mvneta.
This small series adds basic support for mqprio offloading, by having
the rx queueing mirroring the TCs based on VLAN prio fields.
This was tested on Armada 3700, and proves useful to make sure
high-priority traffic has a better chance not getting dropped when
there's lots of packets incoming.
The first patch of the series deals with the per-cpu interrupts on the
armada 3700. Since they don't work, there were already some patches
applied to keep all queue mappings to CPU0, but there still were some
remaining mappings left to be dealt with.
The second patch implements the MQPrio offloading for the receive path.
Changes in V2 :
- Add a Fixes tag for the first patch
- Fix some warnings and the xmas tree in the second patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement a basic MQPrio support, inserting rules in RX that translate
the TC to prio mapping into vlan prio to queues.
The TX logic stays the same as when we don't offload the qdisc.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to Errata #23 "The per-CPU GbE interrupt is limited to Core
0", we can't use the per-cpu interrupt mechanism on the Armada 3700
familly.
This is correctly checked for RSS configuration, but the initial queue
mapping is still done by having the queues spread across all the CPUs in
the system, both in the init path and in the cpu_hotplug path.
Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When police action is created by cls API tcf_exts_validate() first
conditional that calls tcf_action_init_1() directly, the action idr is not
updated according to latest changes in action API that require caller to
commit newly created action to idr with tcf_idr_insert_many(). This results
such action not being accessible through act API and causes crash reported
by syzbot:
==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline]
BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204
CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
__kasan_report mm/kasan/report.c:400 [inline]
kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
check_memory_region_inline mm/kasan/generic.c:179 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
instrument_atomic_read include/linux/instrumented.h:71 [inline]
atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
__tcf_idr_release net/sched/act_api.c:178 [inline]
tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
tc_action_net_exit include/net/act_api.h:151 [inline]
police_exit_net+0x168/0x360 net/sched/act_police.c:390
ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G B 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
panic+0x306/0x73d kernel/panic.c:231
end_report+0x58/0x5e mm/kasan/report.c:100
__kasan_report mm/kasan/report.c:403 [inline]
kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413
check_memory_region_inline mm/kasan/generic.c:179 [inline]
check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
instrument_atomic_read include/linux/instrumented.h:71 [inline]
atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
__tcf_idr_release net/sched/act_api.c:178 [inline]
tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
tc_action_net_exit include/net/act_api.h:151 [inline]
police_exit_net+0x168/0x360 net/sched/act_police.c:390
ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Kernel Offset: disabled
Fix the issue by calling tcf_idr_insert_many() after successful action
initialization.
Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together")
Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
pull-request: mlx5-next 2021-02-16
The patches in this pr are already submitted and reviewed through the
netdev and rdma mailing lists.
The series includes mlx5 HW bits and definitions for mlx5 real time clock
translation and handling in the mlx5 driver clock module to enable and
support such mode [1]
[1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The changes made in eccd540 is enough for xilinx_emaclite to run
without problem on 64-bit systems. I have tested it on a Xilinx
FPGA with RV64 softcore. The architecture limitation in Kconfig
seems no longer necessary.
A small change is included to print address with %lx instead of
casting to int and print with %x.
Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Horatiu Vulturv says:
====================
bridge: mrp: Extend br_mrp_switchdev_*
This patch series extends MRP switchdev to allow the SW to have a better
understanding if the HW can implement the MRP functionality or it needs
to help the HW to run it. There are 3 cases:
- when HW can't implement at all the functionality.
- when HW can implement a part of the functionality but needs the SW
implement the rest. For example if it can't detect when it stops
receiving MRP Test frames but it can copy the MRP frames to CPU to
allow the SW to determine this. Another example is generating the MRP
Test frames. If HW can't do that then the SW is used as backup.
- when HW can implement completely the functionality.
So, initially the SW tries to offload the entire functionality in HW, if
that fails it tries offload parts of the functionality in HW and use the
SW as helper and if also this fails then MRP can't run on this HW.
Based on these new calls, implement the switchdev for Ocelot driver. This
is an example where the HW can't run completely the functionality but it
can help the SW to run it, by trapping all MRP frames to CPU.
Also this patch series adds MRP support to DSA and implements the Felix
driver which just reuse the Ocelot functions. This part was just compiled
tested because I don't have any HW on which to do the actual tests.
v4:
- remove ifdef MRP from include/net/switchdev.h
- move MRP implementation for Ocelot in a different file such that
Felix driver can use it.
- extend DSA with MRP support
- implement MRP support for Felix.
v3:
- implement the switchdev calls needed by Ocelot driver.
v2:
- fix typos in comments and in commit messages
- remove some of the comments
- move repeated code in helper function
- fix issue when deleting a node when sw_backup was true
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement functions 'port_mrp_add', 'port_mrp_del',
'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the mrp
functions from ocelot.
Also all MRP frames that arrive to CPU on queue number OCELOT_MRP_CPUQ
will be forward by the SW.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for offloading MRP in HW. Currently implement the switchdev
calls 'SWITCHDEV_OBJ_ID_MRP', 'SWITCHDEV_OBJ_ID_RING_ROLE_MRP',
to allow to create MRP instances and to set the role of these instances.
Add DSA_NOTIFIER_MRP_ADD/DEL and DSA_NOTIFIER_MRP_ADD/DEL_RING_ROLE
which calls to .port_mrp_add/del and .port_mrp_add/del_ring_role in the
DSA driver for the switch.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add basic support for MRP. The HW will just trap all MRP frames on the
ring ports to CPU and allow the SW to process them. In this way it is
possible to for this node to behave both as MRM and MRC.
Current limitations are:
- it doesn't support Interconnect roles.
- it supports only a single ring.
- the HW should be able to do forwarding of MRP Test frames so the SW
will not need to do this. So it would be able to have the role MRC
without SW support.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Check the return values of the br_mrp_switchdev function.
In case of:
- BR_MRP_NONE, return the error to userspace,
- BR_MRP_SW, continue with SW implementation,
- BR_MRP_HW, continue without SW implementation,
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch extends the br_mrp_switchdev functions to be able to have a
better understanding what cause the issue and if the SW needs to be used
as a backup.
There are the following cases:
- when the code is compiled without CONFIG_NET_SWITCHDEV. In this case
return success so the SW can continue with the protocol. Depending
on the function, it returns 0 or BR_MRP_SW.
- when code is compiled with CONFIG_NET_SWITCHDEV and the driver doesn't
implement any MRP callbacks. In this case the HW can't run MRP so it
just returns -EOPNOTSUPP. So the SW will stop further to configure the
node.
- when code is compiled with CONFIG_NET_SWITCHDEV and the driver fully
supports any MRP functionality. In this case the SW doesn't need to do
anything. The functions will return 0 or BR_MRP_HW.
- when code is compiled with CONFIG_NET_SWITCHDEV and the HW can't run
completely the protocol but it can help the SW to run it. For
example, the HW can't support completely MRM role(can't detect when it
stops receiving MRP Test frames) but it can redirect these frames to
CPU. In this case it is possible to have a SW fallback. The SW will
try initially to call the driver with sw_backup set to false, meaning
that the HW should implement completely the role. If the driver returns
-EOPNOTSUPP, the SW will try again with sw_backup set to false,
meaning that the SW will detect when it stops receiving the frames but
it needs HW support to redirect the frames to CPU. In case the driver
returns 0 then the SW will continue to configure the node accordingly.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the enum br_mrp_hw_support that is used by the br_mrp_switchdev
functions to allow the SW to detect the cases where HW can't implement
the functionality or when SW is used as a backup.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the member sw_backup to the structures switchdev_obj_ring_role_mrp
and switchdev_obj_in_role_mrp. In this way the SW can call the driver in
2 ways, once when sw_backup is set to false, meaning that the driver
should implement this completely in HW. And if that is not supported the
SW will call again but with sw_backup set to true, meaning that the
HW should help or allow the SW to run the protocol.
For example when role is MRM, if the HW can't detect when it stops
receiving MRP Test frames but it can trap these frames to CPU, then it
needs to return -EOPNOTSUPP when sw_backup is false and return 0 when
sw_backup is true.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove #IS_ENABLED(CONFIG_BRIDGE_MRP) from switchdev.h. This will
simplify the code implements MRP callbacks and will be similar with the
vlan filtering.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The return value of range_parse() indicates the size when it is
positive. The error code should be negative.
Signed-off-by: Shiyang Ruan <ruansy.fnst@cn.fujitsu.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20210126021331.1059933-1-ruansy.fnst@cn.fujitsu.com
Reported-by: Zhang Qilong <zhangqilong3@huawei.com>
Fixes: 8490e2e25b5a ("device-dax: add a range mapping allocation attribute")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
When 88E1111 is operating in SGMII mode, auto-negotiation should be enabled
on the SGMII side so that the link will come up properly with PCSes which
normally have auto-negotiation enabled. This is normally the case when the
PHY defaults to SGMII mode at power-up, however if we switched it from some
other mode like 1000Base-X, as may happen in some SFP module situations,
it may not be, particularly for modules which have 1000Base-X
auto-negotiation defaulting to disabled.
Call genphy_check_and_restart_aneg on the fiber page to ensure that auto-
negotiation is properly enabled on the SGMII interface.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Marek Behún says:
====================-
Add 5gbase-r PHY interface mode
there is still some testing needed for Amethyst patches, so I have
split the part adding support for 5gbase-r interface mode and am sending
it alone.
The first two patches are already reviewed.
Changes since last patches (Amethyst v16):
- added phylink 5gbase-r handler
- added SFP support for 5gbase-r mode
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The sfp_parse_support() function is setting 5000baseT_Full in some cases.
Now that we have PHY_INTERFACE_MODE_5GBASER interface mode available,
change sfp_select_interface() to return PHY_INTERFACE_MODE_5GBASER if
5000baseT_Full is set in the link mode mask.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add 5GBASER interface type and speed to phylink.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add 5GBASE-R phy interface mode
Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add 5gbase-r PHY interface mode.
Signed-off-by: Pavana Sharma <pavana.sharma@digi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The tg3 driver tried to communicate towards the PHY driver whether it
wanted RGMII in-band signaling enabled or disabled however there is
nothing that looks at those flags in drivers/net/phy/broadcom.c so this
does do not anything.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Shyam Sundar S K says:
====================
Bug fixes to amd-xgbe driver
General fixes on amd-xgbe driver are addressed in this series, mostly
on the mailbox communication failures and improving the link stability
of the amd-xgbe device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Frequent link up/down events can happen when a Bel Fuse SFP part is
connected to the amd-xgbe device. Try to avoid the frequent link
issues by resetting the PHY as documented in Bel Fuse SFP datasheets.
Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Normally, auto negotiation and reconnect should be automatically done by
the hardware. But there seems to be an issue where auto negotiation has
to be restarted manually. This happens because of link training and so
even though still connected to the partner the link never "comes back".
This needs an auto-negotiation restart.
Also, a change in xgbe-mdio is needed to get ethtool to recognize the
link down and get the link change message. This change is only
required in a backplane connection mode.
Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current driver calls netif_carrier_off() late in the link tear down
which can result in a netdev watchdog timeout.
Calling netif_carrier_off() immediately after netif_tx_stop_all_queues()
avoids the warning.
------------[ cut here ]------------
NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out
WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220
Modules linked in: amd_xgbe(E) amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E
Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019
RIP: 0010:dev_watchdog+0x20d/0x220
Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48
RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0
RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018
R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000
R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010
FS: 0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0
Call Trace:
<IRQ>
? pfifo_fast_reset+0x100/0x100
call_timer_fn+0x2b/0x130
run_timer_softirq+0x3e8/0x440
? enqueue_hrtimer+0x39/0x90
Fixes: e722ec82374b ("amd-xgbe: Update the BelFuse quirk to support SGMII")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|