Age | Commit message | Author |
|
The entire MPTCP receive path is protected by the msk socket
spinlock. As a consequence, the tx path has to play a few tricks to
allocate the forward memory without acquiring the spinlock multiple
times, making the overall tx path quite complex.
This patch tries to clean up the tx path a bit, using completely
separate fwd memory allocation for the rx and the tx paths.
The forward memory allocated in the rx path is now accounted in
msk->rmem_fwd_alloc and is (still) protected by the msk socket spinlock.
To cope with the above we provide a few MPTCP-specific variants for
the helpers to charge, uncharge, reclaim and free the forward memory
in the receive path.
msk->sk_forward_alloc now accounts only the forward memory for the tx
path, so we can use the plain core sock helper to manipulate it and drop
quite a bit of complexity.
On memory pressure, both rx and tx fwd memories are reclaimed.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A later patch will change the MPTCP memory accounting schema
in such a way that MPTCP sockets will encode the total amount of
forward allocated memory in two separate fields (one for tx and
one for rx).
MPTCP sockets will use their own helper to provide the accurate
amount of fwd allocated memory.
To allow the above, this patch adds a new, optional, sk method to
fetch the fwd memory, wrap the call in a new helper and use it
where it is appropriate.
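
The pattern described above (an optional per-protocol method, wrapped in a helper that falls back to the plain field) can be sketched in userspace C. This is only an illustration: `toy_sock`, `toy_proto` and the `toy_*` functions are hypothetical stand-ins, not the kernel's definitions, and summing the two counters is an assumption about what an "accurate amount" means.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock;

/* Hypothetical protocol ops table; the callback is optional (may be NULL). */
struct toy_proto {
	int (*forward_alloc_get)(const struct toy_sock *sk);
};

struct toy_sock {
	const struct toy_proto *prot;
	int forward_alloc;	/* tx (or the only) forward-allocated memory */
	int rmem_fwd_alloc;	/* rx forward-allocated memory, MPTCP-like */
};

/* Helper: use the protocol's method when present, else the plain field. */
static int toy_forward_alloc_get(const struct toy_sock *sk)
{
	assert(sk != NULL && sk->prot != NULL);
	if (!sk->prot->forward_alloc_get)
		return sk->forward_alloc;
	return sk->prot->forward_alloc_get(sk);
}

/* Assumption: an MPTCP-like protocol reports the sum of both counters. */
static int toy_mptcp_forward_alloc_get(const struct toy_sock *sk)
{
	return sk->forward_alloc + sk->rmem_fwd_alloc;
}

static const struct toy_proto toy_plain_proto = { .forward_alloc_get = NULL };
static const struct toy_proto toy_mptcp_proto = {
	.forward_alloc_get = toy_mptcp_forward_alloc_get,
};
```

Callers that only go through the helper do not need to know whether a protocol splits its accounting.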
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A following patch is going to implement a similar reclaim schema
for the MPTCP protocol, with different locking.
Let's define a couple of macros for the used thresholds, so
that the latter code will be more easily maintainable.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported data-races in inet_getname() multiple times,
it is time we fix this instead of pretending applications
should not trigger them.
getsockname() and getpeername() are not really considered fast path.
v2: added the missing BPF_CGROUP_RUN_SA_PROG() declaration
needed when CONFIG_CGROUP_BPF=n, as reported by
kernel test robot <lkp@intel.com>
syzbot typical report:
BUG: KCSAN: data-race in __inet_hash_connect / inet_getname
write to 0xffff888136d66cf8 of 2 bytes by task 14374 on cpu 1:
__inet_hash_connect+0x7ec/0x950 net/ipv4/inet_hashtables.c:831
inet_hash_connect+0x85/0x90 net/ipv4/inet_hashtables.c:853
tcp_v4_connect+0x782/0xbb0 net/ipv4/tcp_ipv4.c:275
__inet_stream_connect+0x156/0x6e0 net/ipv4/af_inet.c:664
inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:728
__sys_connect_file net/socket.c:1896 [inline]
__sys_connect+0x254/0x290 net/socket.c:1913
__do_sys_connect net/socket.c:1923 [inline]
__se_sys_connect net/socket.c:1920 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:1920
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xa0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff888136d66cf8 of 2 bytes by task 14408 on cpu 0:
inet_getname+0x11f/0x170 net/ipv4/af_inet.c:790
__sys_getsockname+0x11d/0x1b0 net/socket.c:1946
__do_sys_getsockname net/socket.c:1961 [inline]
__se_sys_getsockname net/socket.c:1958 [inline]
__x64_sys_getsockname+0x3e/0x50 net/socket.c:1958
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xa0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000 -> 0xdee0
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 14408 Comm: syz-executor.3 Not tainted 5.15.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211026213014.3026708-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a warning in xdp_rxq_info_unreg_mem_model() when reg_state isn't
equal to REG_STATE_REGISTERED, so the warning in xdp_rxq_info_unreg() is
redundant.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/20211027013856.1866-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://lore.kernel.org/r/20211026175547.3198242-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the new of_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, np;
@@
- of_get_mac_address(np, dev->dev_addr)
+ of_get_ethdev_address(np, dev)
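
As a rough userspace sketch of why such a wrapper helps: the address is read into a local buffer and then stored through a single setter, so no caller writes the device address field directly. All `toy_*` names are hypothetical, and the update counter merely stands in for whatever bookkeeping a real setter performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_ETH_ALEN 6

struct toy_netdev {
	unsigned char dev_addr[TOY_ETH_ALEN];
	unsigned int addr_updates;	/* stand-in for setter-side bookkeeping */
};

/* The single place allowed to write dev_addr in this sketch. */
static void toy_hw_addr_set(struct toy_netdev *dev, const unsigned char *addr)
{
	assert(dev != NULL && addr != NULL);
	memcpy(dev->dev_addr, addr, TOY_ETH_ALEN);
	dev->addr_updates++;
}

/* Stand-in for a firmware lookup: src == NULL means "no address found". */
static int toy_get_mac_address(const unsigned char *src, unsigned char *out)
{
	if (!src)
		return -ENODEV;
	memcpy(out, src, TOY_ETH_ALEN);
	return 0;
}

/* Wrapper: read into a local buffer, then store through the setter. */
static int toy_get_ethdev_address(const unsigned char *src,
				  struct toy_netdev *dev)
{
	unsigned char addr[TOY_ETH_ALEN];
	int ret = toy_get_mac_address(src, addr);

	if (!ret)
		toy_hw_addr_set(dev, addr);
	return ret;
}
```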
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20211026175038.3197397-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 4d98bb0d7ec2 ("net: macb: Use mdio child node for MDIO bus if it
exists") added code to detect if a 'mdio' child node exists to the macb
driver. This added code does, however, not actually check if the child node
exists, but if the parent node exists. This results in errors such as
macb 10090000.ethernet eth0: Could not attach PHY (-19)
if there is no 'mdio' child node. Fix the code to actually check for
the child node.
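
The difference between the two checks can be illustrated with a small userspace model of a node tree. The `toy_node` type and the `toy_*` functions are hypothetical and only mimic the shape of the problem; they are not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_node {
	const char *name;
	const struct toy_node *children;
	size_t n_children;
};

/* Look up a direct child by name; returns NULL if there is none. */
static const struct toy_node *
toy_get_child_by_name(const struct toy_node *parent, const char *name)
{
	assert(name != NULL);
	if (!parent)
		return NULL;
	for (size_t i = 0; i < parent->n_children; i++)
		if (strcmp(parent->children[i].name, name) == 0)
			return &parent->children[i];
	return NULL;
}

/* Mistaken test: true whenever the parent node exists. */
static bool toy_has_mdio_buggy(const struct toy_node *np)
{
	return np != NULL;
}

/* Intended test: true only when a child named "mdio" exists. */
static bool toy_has_mdio_fixed(const struct toy_node *np)
{
	return toy_get_child_by_name(np, "mdio") != NULL;
}
```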
Fixes: 4d98bb0d7ec2 ("net: macb: Use mdio child node for MDIO bus if it exists")
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Link: https://lore.kernel.org/r/20211026173950.353636-1-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The only valid values for a miniq pointer are NULL or a pointer to
miniq1 or miniq2, so testing for miniq_old != &miniq1 is functionally
equivalent to testing that it is NULL or equal to &miniq2.
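
The equivalence can be checked exhaustively in a few lines of userspace C. The `toy_*` types below are hypothetical; the only property carried over from the text is that the pointer is NULL or points at one of the two buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_miniq { int dummy; };

struct toy_pair {
	struct toy_miniq miniq1;
	struct toy_miniq miniq2;
};

/* Longer form: NULL or the second buffer. */
static bool toy_is_null_or_miniq2(const struct toy_pair *p,
				  const struct toy_miniq *old)
{
	return old == NULL || old == &p->miniq2;
}

/* Shorter form, valid only under the three-value invariant. */
static bool toy_is_not_miniq1(const struct toy_pair *p,
			      const struct toy_miniq *old)
{
	assert(old == NULL || old == &p->miniq1 || old == &p->miniq2);
	return old != &p->miniq1;
}
```

If a fourth value were ever possible, the two forms would diverge, which is why the sketch asserts the invariant.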
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Link: https://lore.kernel.org/r/20211026183721.137930-1-seth@forshee.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently rcu_barrier() is used to ensure that no readers of the
inactive mini_Qdisc buffer remain before it is reused. This waits for
any pending RCU callbacks to complete, when all that is actually
required is to wait for one RCU grace period to elapse after the buffer
was made inactive. This means that using rcu_barrier() may result in
unnecessary waits.
To improve this, store the current RCU state when a buffer is made
inactive and use poll_state_synchronize_rcu() to check whether a full
grace period has elapsed before reusing it. If a full grace period has
not elapsed, wait for a grace period to elapse, and in the non-RT case
use synchronize_rcu_expedited() to hasten it.
Since this approach eliminates the RCU callback it is no longer
necessary to synchronize_rcu() in the tp_head==NULL case. However, the
RCU state should still be saved for the previously active buffer.
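
The save-then-poll idea can be modelled with a toy, single-threaded counter. This is not the RCU implementation: the `toy_*` functions are hypothetical stand-ins for taking a grace-period snapshot, polling it, and waiting, and counter wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the number of completed grace periods. */
static unsigned long toy_gp_completed;

/* Snapshot naming the grace period that must finish before reuse. */
static unsigned long toy_get_state(void)
{
	return toy_gp_completed + 1;
}

/* True once a full grace period has elapsed since the snapshot. */
static bool toy_poll_state(unsigned long snap)
{
	return toy_gp_completed >= snap;
}

/* Stand-in for waiting: here a grace period simply "elapses". */
static void toy_synchronize(void)
{
	toy_gp_completed++;
}

struct toy_buf {
	unsigned long rcu_state;	/* saved when the buffer went inactive */
};

static void toy_buf_deactivate(struct toy_buf *b)
{
	b->rcu_state = toy_get_state();
}

/* Returns true if the caller had to wait before reusing the buffer. */
static bool toy_buf_prepare_reuse(struct toy_buf *b)
{
	if (toy_poll_state(b->rcu_state))
		return false;
	toy_synchronize();
	assert(toy_poll_state(b->rcu_state));
	return true;
}
```

In this model the wait is skipped whenever enough time has already passed between deactivation and reuse, which is the case the commit message aims to make cheap.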
Before this change I would typically see mini_qdisc_pair_swap() take
tens of milliseconds to complete. After this change it typically
finishes in less than 1 ms, and often it takes just a few microseconds.
Thanks to Paul for walking me through the options for improving this.
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Link: https://lore.kernel.org/r/20211026130700.121189-1-seth@forshee.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch makes the r8169 driver pick up device Realtek Semiconductor Co.,
Ltd. Device [10ec:8162].
Signed-off-by: Janghyub Seo <jhyub06@gmail.com>
Suggested-by: Rushab Shah <rushabshah32@gmail.com>
Link: https://lore.kernel.org/r/1635231849296.1489250046.441294000@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add PTP_CLK_MAGIC to the userspace-api/ioctl/ioctl-number.rst
documentation file.
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20211024163831.10200-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull virtio fixes from Michael Tsirkin:
"A couple of fixes that seem important enough to pick at the last
moment"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-ring: fix DMA metadata flags
vduse: Fix race condition between resetting and irq injecting
vduse: Disallow injecting interrupt before DRIVER_OK is set
|
|
The flags are currently overwritten, leading to the wrong direction
being passed to the DMA unmap functions.
Fixes: 72b5e8958738aaa4 ("virtio-ring: store DMA metadata in desc_extra for split virtqueue")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20211026133100.17541-1-vincent.whitchurch@axis.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
The tc_gred_qopt_offload structure has grown too big to be on the
stack for 32-bit architectures after recent changes.
net/sched/sch_gred.c:903:13: error: stack frame size (1180) exceeds limit (1024) in 'gred_destroy' [-Werror,-Wframe-larger-than]
net/sched/sch_gred.c:310:13: error: stack frame size (1212) exceeds limit (1024) in 'gred_offload' [-Werror,-Wframe-larger-than]
Use dynamic allocation per qdisc to avoid this.
Fixes: 50dc9a8572aa ("net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types")
Fixes: 67c9e6270f30 ("net: sched: Protect Qdisc::bstats with u64_stats")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20211026100711.nalhttf6mbe6sudx@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Return error code if usb_maxpacket() returns 0 in usbnet_probe()
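
A minimal sketch of this kind of fix, assuming a probe function with a shared exit label: the status has to be set to an error before jumping out, otherwise the failure path returns 0. `toy_probe` is hypothetical, and the choice of `-ENODEV` is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

static int toy_probe(unsigned int maxpacket)
{
	int status = 0;

	if (maxpacket == 0) {
		/* Broken device: record an error before bailing out. */
		status = -ENODEV;
		goto out;
	}

	/* ... remaining setup would go here ... */
	assert(maxpacket > 0);
out:
	return status;
}
```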
Fixes: 397430b50a36 ("usbnet: sanity check for maxpacket")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211026124015.3025136-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Leon Romanovsky says:
====================
Two reverts to calm down devlink discussion
Two reverts, as discussed in [1]: a fast, easy and, in the long run,
wrong solution to the syzkaller bug [2].
[1] https://lore.kernel.org/all/20211026120234.3408fbcc@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
[2] https://lore.kernel.org/netdev/000000000000af277405cf0a7ef0@google.com/
====================
Link: https://lore.kernel.org/r/cover.1635276828.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 22849b5ea5952d853547cc5e0651f34a246b2a4f as it
revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregister
devlink objects during another user-triggered devlink command.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 8bbeed4858239ac956a78e5cbaf778bd6f3baef8 as it
revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregister
devlink objects during another user-triggered devlink command.
Fixes: 22849b5ea595 ("devlink: Remove not-executed trap policer notifications")
Reported-by: syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull nds32 tracing fix from Steven Rostedt:
"Fix nds32le build when DYNAMIC_FTRACE is disabled
A randconfig found that nds32le architecture fails to build due to a
prototype mismatch between a ftrace function pointer and the function
it was to be assigned to. That function pointer prototype missed being
updated when all the ftrace callbacks were updated"
* tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fix from Dinh Nguyen:
"Fix a build error for allmodconfig"
* tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST
|
|
Pull rdma fixes from Jason Gunthorpe:
"Nothing very exciting here, it has been a quiet cycle overall. Usual
collection of small bug fixes:
- irdma issues with CQ entries, VLAN completions and a mutex deadlock
- Incorrect DCT packets in mlx5
- Userspace triggered overflows in qib
- Locking error in hfi
- Typo in errno value in qib/hfi1
- Double free in qedr
- Leak of random kernel memory to userspace with a netlink callback"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
RDMA/irdma: Do not hold qos mutex twice on QP resume
RDMA/irdma: Set VLAN in UD work completion correctly
RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
rdma/qedr: Fix crash due to redundant release of device's qp memory
RDMA/rdmavt: Fix error code in rvt_create_qp()
IB/hfi1: Fix abba locking issue with sc_disable()
IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
RDMA/mlx5: Set user priority for DCT
RDMA/irdma: Process extended CQ entries correctly
|
|
The ftrace callback prototype was changed to pass a special ftrace_regs
instead of pt_regs as the last parameter, but the static ftrace for nds32
missed updating ftrace_trace_function and this caused a warning when
compared to ftrace_stub:
../arch/nds32/kernel/ftrace.c: In function '_mcount':
../arch/nds32/kernel/ftrace.c:24:35: error: comparison of distinct pointer types lacks a cast [-Werror]
24 | if (ftrace_trace_function != ftrace_stub)
| ^~
Link: https://lore.kernel.org/all/20211027055554.19372-1-rdunlap@infradead.org/
Link: https://lkml.kernel.org/r/20211027125101.33449969@gandalf.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Fixes: d19ad0775dcd6 ("ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
mt76 patches for 5.16
* fix a compile error with !CONFIG_PM
* cleanups
* MT7915 DBDC fixes
* endian warning fixes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Two fixes:
* bridge vs. 4-addr mode check was wrong
* management frame registrations locking was
wrong, causing list corruption/crashes
====================
Link: https://lore.kernel.org/r/20211027143756.91711-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
nios2:allmodconfig builds fail with
make[1]: *** No rule to make target 'arch/nios2/boot/dts/""',
needed by 'arch/nios2/boot/dts/built-in.a'. Stop.
make: [Makefile:1868: arch/nios2/boot/dts] Error 2 (ignored)
This is seen with compile tests since those enable NIOS2_DTB_SOURCE_BOOL,
which in turn enables NIOS2_DTB_SOURCE. This causes the build error
because the default value for NIOS2_DTB_SOURCE is an empty string.
Disable NIOS2_DTB_SOURCE_BOOL for compile tests to avoid the error.
Fixes: 2fc8483fdcde ("nios2: Build infrastructure")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
|
|
Vladimir Oltean says:
====================
Bridge FDB refactoring
This series refactors the br_fdb.c, br_switchdev.c and switchdev.c files
to offer the same level of functionality with a bit less code, and to
clarify the purpose of some functions.
No functional change intended.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To reduce code churn, the same patch makes multiple changes, since they
all touch the same lines:
1. The implementations for these two are identical, just with different
function pointers. Reduce duplications and name the function pointers
"mod_cb" instead of "add_cb" and "del_cb". Pass the event as argument.
2. Drop the "const" attribute from "orig_dev". If the driver needs to
check whether orig_dev belongs to itself and then
call_switchdev_notifiers(orig_dev, SWITCHDEV_FDB_OFFLOADED), it
can't, because call_switchdev_notifiers takes a non-const struct
net_device *.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are two places where a switchdev FDB entry is constructed, one is
br_switchdev_fdb_notify() and the other is br_fdb_replay(). One uses a
struct initializer, and the other declares the structure as
uninitialized and populates the elements one by one.
One problem when introducing new members of struct
switchdev_notifier_fdb_info is that there is a risk for one of these
functions to run with an uninitialized value.
So centralize the logic of populating such structure into a dedicated
function. Being the primary location where these structures are created,
using an uninitialized variable and populating the members one by one
should be fine, since this one function is supposed to assign values to
all its members.
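
A sketch of the centralization idea, with hypothetical `toy_*` types: both call sites obtain their notifier payload from one populate function, so a newly added member only has to be assigned in one place. The member list below is illustrative, not the real structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fdb_entry {
	unsigned char addr[6];
	unsigned short vid;
	bool is_local;
	bool added_by_user;
	bool offloaded;
};

struct toy_fdb_info {
	const unsigned char *addr;
	unsigned short vid;
	bool is_local;
	bool added_by_user;
	bool offloaded;
};

/* The one function that assigns every member of the payload. */
static void toy_populate_fdb_info(struct toy_fdb_info *item,
				  const struct toy_fdb_entry *fdb)
{
	assert(item != NULL && fdb != NULL);
	item->addr = fdb->addr;
	item->vid = fdb->vid;
	item->is_local = fdb->is_local;
	item->added_by_user = fdb->added_by_user;
	item->offloaded = fdb->offloaded;
}

/* Two call sites (notify and replay) share the same population logic. */
static struct toy_fdb_info toy_notify_info(const struct toy_fdb_entry *fdb)
{
	struct toy_fdb_info item;

	toy_populate_fdb_info(&item, fdb);
	return item;
}

static struct toy_fdb_info toy_replay_info(const struct toy_fdb_entry *fdb)
{
	struct toy_fdb_info item;

	toy_populate_fdb_info(&item, fdb);
	return item;
}
```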
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
br_fdb_replay is only called from switchdev code paths, so it makes
sense to be disabled if switchdev is not enabled in the first place.
As opposed to br_mdb_replay and br_vlan_replay which might be turned off
depending on bridge support for multicast and VLANs, FDB support is
always on. So moving br_mdb_replay and br_vlan_replay inside
br_switchdev.c would mean adding some #ifdef's in br_switchdev.c, so we
keep those where they are.
The reason for the movement is that in future changes there will be some
code reuse between br_switchdev_fdb_notify and br_fdb_replay.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can express the same logic without an "if" condition as big as the
function, just return early if the kmem_cache_alloc() call fails.
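
The early-return shape looks roughly like this in userspace C, with `malloc()` standing in for the slab allocation. `toy_fdb_entry` and `toy_fdb_create` are hypothetical names used only for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ETH_ALEN 6

struct toy_fdb_entry {
	unsigned char addr[TOY_ETH_ALEN];
	unsigned short vid;
	unsigned long flags;
};

/* Allocate and fill an entry; bail out early if allocation fails. */
static struct toy_fdb_entry *toy_fdb_create(const unsigned char *addr,
					    unsigned short vid,
					    unsigned long flags)
{
	struct toy_fdb_entry *fdb;

	assert(addr != NULL);

	fdb = malloc(sizeof(*fdb));
	if (!fdb)
		return NULL;

	/* From here on the allocation is known to be valid. */
	memcpy(fdb->addr, addr, TOY_ETH_ALEN);
	fdb->vid = vid;
	fdb->flags = flags;
	return fdb;
}
```

The body of the function stays at one indentation level instead of being nested inside an `if (fdb) { ... }` block.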
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
br_fdb_insert() is a wrapper over fdb_insert() that also takes the
bridge hash_lock.
With fdb_insert() being renamed to fdb_add_local(), rename
br_fdb_insert() to br_fdb_add_local().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fdb_insert() is not a descriptive name for this function, and also easy
to confuse with __br_fdb_add(), fdb_add_entry(), br_fdb_update().
Even more confusingly, it is not even related in any way with those
functions, neither one calls the other.
Since fdb_insert() basically deals with the creation of a BR_FDB_LOCAL
entry and is called only from functions where that is the intention:
- br_fdb_changeaddr
- br_fdb_change_mac_address
- br_fdb_insert
then rename it to fdb_add_local(), because its removal counterpart is
called fdb_delete_local().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fdb_insert() has a forward declaration because its first caller,
br_fdb_changeaddr(), is declared before fdb_create(), a function which
fdb_insert() needs.
This patch moves the 2 functions above br_fdb_changeaddr() and deletes
the forward declaration for fdb_insert().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
fdb_notify() has a forward declaration because its first caller,
fdb_delete(), is declared before 3 functions that fdb_notify() needs:
fdb_to_nud(), fdb_fill_info() and fdb_nlmsg_size().
This patch moves the aforementioned 4 functions above fdb_delete() and
deletes the forward declaration.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Russell King says:
====================
Convert mvneta to phylink supported_interfaces
This patch series converts mvneta to use phylink's supported_interfaces
bitmap to simplify the validate() implementation. The patches:
1) Add the supported interface modes to the supported_interfaces bitmap.
2) Remove the checks for the interface type being supported from
the validate callback.
3) Remove the now unnecessary checks and call to
phylink_helper_basex_speed() to support switching between
1000base-X and 2500base-X for SFPs.
(3) becomes possible because when asking the MAC for its complete
support, we walk all supported interfaces which will include 1000base-X
and 2500base-X only if the comphy is present.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that we have a better method to select SFP interface modes, we
no longer need to use phylink_helper_basex_speed() in a driver's
validation function, and we can also get rid of our hack to indicate
both 1000base-X and 2500base-X if the comphy is present to make that
work. Remove this hack and use of phylink_helper_basex_speed().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode in the
validation function. Remove this to simplify it.
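
The division of labour can be sketched as follows: a core-side check consults the bitmap first, so the driver callback never sees an unsupported mode and needs no mode check of its own. Everything here (`toy_iface`, `toy_mac_config`, the `toy_*` functions) is hypothetical and much simpler than phylink.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_iface {
	TOY_IFACE_SGMII,
	TOY_IFACE_1000BASEX,
	TOY_IFACE_2500BASEX,
	TOY_IFACE_MAX
};

struct toy_mac_config {
	unsigned long supported_interfaces;	/* one bit per toy_iface */
};

static void toy_set_iface(struct toy_mac_config *cfg, enum toy_iface mode)
{
	assert(mode < TOY_IFACE_MAX);
	cfg->supported_interfaces |= 1UL << mode;
}

static bool toy_iface_supported(const struct toy_mac_config *cfg,
				enum toy_iface mode)
{
	return mode < TOY_IFACE_MAX &&
	       (cfg->supported_interfaces & (1UL << mode)) != 0;
}

/* Driver callback: no longer checks the interface mode itself. */
static int toy_drv_validate(enum toy_iface mode)
{
	(void)mode;
	return 0;
}

/* Core-side check: reject unsupported modes before calling the driver. */
static int toy_core_validate(const struct toy_mac_config *cfg,
			     enum toy_iface mode,
			     int (*drv_validate)(enum toy_iface))
{
	if (!toy_iface_supported(cfg, mode))
		return -EINVAL;
	return drv_validate(mode);
}
```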
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Populate the phy_interface_t bitmap for the Marvell mvneta driver with
interface modes supported by the MAC.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Guangbin Huang says:
====================
net: hns3: add some fixes for -net
This series adds some fixes for the HNS3 ethernet driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adjusts the string space of some tx bd info parameters in
debugfs according to their maximum needs.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The specified buffer length for three debugfs files fd_tcam, uc and tqp
is not enough for their maximum needs, so this patch fixes them.
Fixes: b5a0b70d77b9 ("net: hns3: refactor dump fd tcam of debugfs")
Fixes: 1556ea9120ff ("net: hns3: refactor dump mac list of debugfs")
Fixes: d96b0e59468d ("net: hns3: refactor dump reg of debugfs")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
in debugfs
As the packet number registers are 32 bits wide, they need at most
10 characters when printed in decimal, but the current string space is
not enough, so this patch fixes it.
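
The 10-character figure follows from the largest 32-bit value, 4294967295, which has ten decimal digits. A small userspace check (the helper name `toy_decimal_width` is hypothetical):

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Number of characters needed to print v in decimal (no terminator). */
static int toy_decimal_width(uint32_t v)
{
	char buf[16];
	int n = snprintf(buf, sizeof(buf), "%" PRIu32, v);

	assert(n > 0 && (size_t)n < sizeof(buf));
	return n;
}
```

A column that must hold any such value therefore needs room for ten characters plus whatever separator or terminator the format uses.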
Fixes: e44c495d95e ("net: hns3: refactor queue info of debugfs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The data member of struct hclge_desc is of type __le32, so it needs endian
conversion before being used. Some debugfs functions didn't do that,
so this patch fixes it.
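
What the conversion does can be shown with a portable userspace sketch that decodes little-endian bytes independently of host byte order. The `toy_*` helpers are hypothetical and are not the kernel's conversion macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as four little-endian bytes, whatever the host byte order. */
static void toy_cpu_to_le32(uint32_t v, unsigned char *out)
{
	assert(out != NULL);
	out[0] = (unsigned char)(v & 0xff);
	out[1] = (unsigned char)((v >> 8) & 0xff);
	out[2] = (unsigned char)((v >> 16) & 0xff);
	out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Decode four little-endian bytes into a host-order value. */
static uint32_t toy_le32_to_cpu(const unsigned char *raw)
{
	assert(raw != NULL);
	return (uint32_t)raw[0] |
	       ((uint32_t)raw[1] << 8) |
	       ((uint32_t)raw[2] << 16) |
	       ((uint32_t)raw[3] << 24);
}
```

On a little-endian host, skipping the conversion happens to work, which is why this class of bug tends to show up only on big-endian machines or in static checkers.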
Fixes: c0ebebb9ccc1 ("net: hns3: Add "dcb register" status information query function")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, if a reset event is triggered by RAS while the device is
still in its initialization process, the driver may run the reset
process concurrently with the initialization process, which can cause
problems. For example, memory for the RSS indirection table may not
have been allocated by the initialization process yet when the reset
process uses it, causing a call trace like this:
[61228.744836] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[61228.897677] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[61228.911390] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[61228.918670] pc : hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61228.927812] lr : hclgevf_set_rss_indir_table+0x90/0x190 [hclgevf]
[61228.937248] sp : ffff8000162ebb50
[61228.941087] x29: ffff8000162ebb50 x28: ffffb77add72dbc0 x27: ffff0820c7dc8080
[61228.949516] x26: 0000000000000000 x25: ffff0820ad4fc880 x24: ffff0820c7dc8080
[61228.958220] x23: ffff0820c7dc8090 x22: 00000000ffffffff x21: 0000000000000040
[61228.966360] x20: ffffb77add72b9c0 x19: 0000000000000000 x18: 0000000000000030
[61228.974646] x17: 0000000000000000 x16: ffffb77ae713feb0 x15: ffff0820ad4fcce8
[61228.982808] x14: ffffffffffffffff x13: ffff8000962eb7f7 x12: 00003834ec70c960
[61228.991990] x11: 00e0fafa8c206982 x10: 9670facc78a8f9a8 x9 : ffffb77add717530
[61229.001123] x8 : ffff0820ad4fd6b8 x7 : 0000000000000000 x6 : 0000000000000011
[61229.010249] x5 : 00000000000cb1b0 x4 : 0000000000002adb x3 : 0000000000000049
[61229.018662] x2 : ffff8000162ebbb8 x1 : 0000000000000000 x0 : 0000000000000480
[61229.027002] Call trace:
[61229.030177] hclgevf_set_rss_indir_table+0xb4/0x190 [hclgevf]
[61229.039009] hclgevf_rss_init_hw+0x128/0x1b4 [hclgevf]
[61229.046809] hclgevf_reset_rebuild+0x17c/0x69c [hclgevf]
[61229.053862] hclgevf_reset_service_task+0x4cc/0xa80 [hclgevf]
[61229.061306] hclgevf_service_task+0x6c/0x630 [hclgevf]
[61229.068491] process_one_work+0x1dc/0x48c
[61229.074121] worker_thread+0x15c/0x464
[61229.078562] kthread+0x168/0x16c
[61229.082873] ret_from_fork+0x10/0x18
[61229.088221] Code: 7900e7f6 f904a683 d503201f 9101a3e2 (38616b43)
[61229.095357] ---[ end trace 153661a538f6768c ]---
To fix this problem, don't schedule the reset task before the
initialization process is done.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the workqueue of hclge/hclgevf is executed on
the CPU that initiates scheduling requests by default. In
stress scenarios, that CPU may be busy, so the queued work
only completes after a long delay. To avoid this
situation and get proper scheduling, use the WQ_UNBOUND
mode instead. This way, the work can run on
a relatively idle CPU.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a TP port is configured with the following steps:
1. ethtool -s ethx autoneg off speed 100 duplex full
2. ethtool -A ethx rx on tx on
3. ethtool -s ethx autoneg on (rx and tx negotiated pause results are off)
4. ethtool -s ethx autoneg off speed 100 duplex full
In step 3, the driver sets the rx and tx pause parameters of the hardware
to off, as the pause parameters negotiated with the link partner are off.
After step 4, the "ethtool -a ethx" command shows that both the rx and tx
pause parameters are on. However, the pause parameters of the hardware are
still off, so the port actually has no flow control.
To fix this problem, if autoneg is disabled, the driver uses its saved
parameters to restore the hardware pause settings. If the speed is not
changed in this case, the phy sees no link state change and the pause
parameters do not take effect, so we need to force the phy to go down and up.
Fixes: aacbe27e82f0 ("net: hns3: modify how pause options is displayed")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-10-26
HW-GRO support in mlx5
Besides HW-GRO, this series includes two trivial non-mlx5 patches:
- net: Prevent HW-GRO and LRO features operate together
- lib: bitmap: Introduce node-aware alloc API
Khalid Manaa Says:
==================
This series implements the HW-GRO offload using the HW feature SHAMPO.
HW-GRO: Hardware offload for the Generic Receive Offload feature.
SHAMPO: Split Headers And Merge Payload Offload.
This feature performs header/data split for each received packet and
merges the payloads of the packets of the same session.
There are new HW components for this feature:
The headers buffer:
– cyclic buffer where the packets headers will be located
Reservation buffer:
– capability to divide RQ WQEs to reservations, a definite size in
granularity of 4KB, the reservation is used to define the largest segment
that we can create by packets stitching.
Each reservation will have a session and the new received packet can be merged
to the session, terminate it, or open a new one according to the match criteria.
When a new packet is received the headers will be written to the headers buffer
and the data will be written to the reservation, in case the packet matches
the session the data will be written continuously otherwise it will be written
after performing an alignment.
SHAMPO RQ, WQ and CQE changes:
-----------------------------
RQ (receive queue) new params:
-shampo_no_match_alignment_granularity: the HW alignment granularity in case
the received packet doesn't match the current session.
-shampo_match_criteria_type: the type of match criteria.
-reservation_timeout: the maximum time that the HW will hold the reservation.
-Each RQ has SKB that represents the current opened flow.
WQ (work queue) new params:
-headers_mkey: mkey that represents the headers buffer, where the packets
headers will be written by the HW.
-shampo_enable: flag to verify if the WQ supports SHAMPO feature.
-log_reservation_size: the log of the reservation size where the data of
the packet will be written by the HW.
-log_max_num_of_packets_per_reservation: log of the maximum number of packets
that can be written to the same reservation.
-log_headers_entry_size: log of the header entry size of the headers buffer.
-log_headers_buffer_entry_num: log of the entries number of the headers buffer.
CQEs (Completion queue entry) SHAMPO fields:
-match: in case it is set, then the current packet matches the opened session.
-flush: in case it is set, the opened session must be flushed.
-header_size: the size of the packet’s headers.
-header_entry_index: the entry index in the headers buffer of the received
packet headers.
-data_offset: the offset of the received packet data in the WQE.
HW-GRO works as follows:
-----------------------
The feature can be enabled on the interface with ethtool by
turning rx-gro-hw on. When the feature is on, the mlx5 driver will reopen
the RQ to support the SHAMPO feature: it will allocate the headers buffer
and fill in the parameters regarding the reservation and the match criteria.
Receive packet flow:
Each RQ will hold an SKB that represents the currently opened GRO session.
The driver has a new CQE handler mlx5e_handle_rx_cqe_mpwrq_shampo which will
use the CQE SHAMPO params to extract the location of the packet’s headers
in the headers buffer and the location of the packets data in the RQ.
Also, the CQE has two flags flush and match that indicate if the current
packet matches the current session or not and if we need to close the session.
If there is an opened session and we receive a matching packet, the
handler merges the packet's payload into the current SKB. If the packet
does not match, the handler flushes the SKB and creates a new one for the new packet.
If the flush flag is set, the driver closes the session and passes the SKB
to the network stack.
In case the driver merges packets in the SKB, before passing the SKB to the network
stack the driver will update the checksum of the packet’s headers.
SKB build:
---------
The driver will build a new SKB in the following situations:
- There is no currently opened session.
- The current packet doesn’t match the current session.
- There is no room to add the packet’s data to the SKB that represents the
current session.
Otherwise, the driver will add the packet’s data to the SKB.
When the driver builds a new SKB, the linear area will contain only the packet headers
and the data will be added to the SKB fragments.
In case the entry size of the headers buffer is sufficient to build the SKB
it will be used, otherwise the driver will allocate new memory to build the SKB.
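
The receive flow and SKB build rules described above can be condensed into a toy state machine. This is a sketch under stated assumptions, not the mlx5 handler: `toy_cqe` and `toy_rq` are hypothetical reductions of the CQE fields and RQ state, an "SKB" is modelled as a byte count, and the out-of-room case and checksum update are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of the SHAMPO fields of a CQE. */
struct toy_cqe {
	bool match;		/* packet matches the opened session */
	bool flush;		/* opened session must be flushed */
	unsigned int data_len;	/* payload bytes of this packet */
};

/* Hypothetical per-RQ state: at most one open session ("SKB"). */
struct toy_rq {
	bool session_open;
	unsigned int session_bytes;	/* payload merged so far */
	unsigned int delivered;		/* SKBs handed to the stack */
	unsigned int last_delivered;	/* payload bytes of the last one */
};

/* Close the session, if any, and hand its SKB to the stack. */
static void toy_flush(struct toy_rq *rq)
{
	if (!rq->session_open)
		return;
	rq->delivered++;
	rq->last_delivered = rq->session_bytes;
	rq->session_open = false;
	rq->session_bytes = 0;
}

static void toy_handle_cqe(struct toy_rq *rq, const struct toy_cqe *cqe)
{
	assert(rq != NULL && cqe != NULL);

	if (rq->session_open && cqe->match) {
		/* Matching packet: merge payload into the current SKB. */
		rq->session_bytes += cqe->data_len;
	} else {
		/* No session or no match: flush, then start a new SKB. */
		toy_flush(rq);
		rq->session_open = true;
		rq->session_bytes = cqe->data_len;
	}

	if (cqe->flush)
		toy_flush(rq);
}
```

Feeding it a non-matching packet, a matching one, another non-matching one, and finally a matching packet with the flush flag set delivers two SKBs, the first carrying the merged payload of the first two packets.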
==================
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Firmware link offload monitoring can be made to work in 3/4 cases by
switching on firmware feature bit WLANACTIVE_OFFLOAD
- Secure power-save on
- Secure power-save off
- Open power-save on
However, with an open AP, if we switch off power-saving (thus never
entering Beacon Mode Power Save, BMPS), firmware never forwards loss
of beacon upwards.
We had hoped that WLANACTIVE_OFFLOAD and some fixes for sequence numbers
would unblock this, but it hasn't, and further investigation is required.
It's possible to have a complete set of Secure power-save on/off and Open
power-save on/off provided we use Linux' link monitoring mechanism.
While we debug the Open AP failure, we need to fix upstream.
This reverts commit c973fdad79f6eaf247d48b5fc77733e989eb01e1.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025093037.3966022-2-bryan.odonoghue@linaro.org
|
|
If the system is resumed because of an incoming packet, the wcn36xx RX
interrupt fires before the wireless/mac80211 stack has actually resumed,
causing any received packets to be simply dropped. E.g. a ping
request causes a system resume, but is dropped and so never forwarded
to the IP stack.
This change fixes that by disabling DMA interrupts on suspend, so that no
packets are passed up until mac80211 is resumed and ready to handle them.
Note that this is not incompatible with RX irq wake.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150496-19290-1-git-send-email-loic.poulain@linaro.org
|