Age | Commit message (Collapse) | Author |
|
SCTP is not universally deployed, allow hiding its bit
from the skb.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Datacenter kernel builds will very likely not include WIRELESS,
so let them shave 2 bits off the skb by hiding the wifi fields.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Christian Marangi says:
====================
net: Add basic LED support for switch/phy
This is a continue of [1]. It was decided to take a more gradual
approach to implement LEDs support for switch and phy starting with
basic support and then implementing the hw control part when we have all
the prereq done.
This series implements only the brightness_set() and blink_set() ops.
An example of switch implementation is done with qca8k.
For PHY a more generic approach is used with implementing the LED
support in PHY core and with the user (in this case marvell) adding all
the required functions.
Currently we set the default-state as "keep" to not change the default
configuration of the declared LEDs since almost every switch have a
default configuration.
[1] https://lore.kernel.org/lkml/20230216013230.22978-1-ansuelsmth@gmail.com/
Changes in new series v7:
- Drop ethernet-leds schema and add unevaluatedProperties to
ethernet-controller and ethernet-phy schema
- Drop function-enumerator binding from schema example and DT
- Set devname_mandatory for qca8k leds and assign better name to LEDs
using the format {slave_mii_bus id}:0{port number}:{color}:{function}
- Add Documentation patch for Correct LEDs naming from Andrew
- Changes in Andrew patch:
- net: phy: Add a binding for PHY LEDs
- Convert index from u32 to u8
- net: phy: phy_device: Call into the PHY driver to set LED brightness
- Fixup kernel doc
- Convert index from u32 to u8
- net: phy: marvell: Add software control of the LEDs
- Convert index from u32 to u8
- net: phy: phy_device: Call into the PHY driver to set LED blinking
- Kernel doc fix
- Convert index from u32 to u8
- net: phy: marvell: Implement led_blink_set()
- Convert index from u32 to u8
Changes in new series v6:
- Add leds-ethernet.yaml to document reg in led node
- Update ethernet-controller and ethernet-phy to follow new leds-ethernet schema
- Fix comments in qca8k-leds.c (at least -> at most)
(wrong GENMASK for led phy 0 and 4)
- Add review and ack tag from Pavel Machek
- Changes in Andrew patch:
- leds: Provide stubs for when CLASS_LED & NEW_LEDS are disabled
- Change LED_CLASS to NEW_LEDS for led_init_default_state_get()
- net: phy: Add a binding for PHY LEDs
- Add dependency on LED_CLASS
- Drop review tag from Michal Kubiak (patch modified)
Changes in new series v5:
- Rebase everything on top of net-next/main
- Add more info on LED probe fail for qca8k
- Drop some additional raw number and move to define in qca8k header
- Add additional info on LED mapping on qca8k regs
- Checks port number in qca8k switch port parse
- Changes in Andrew patch:
- Add additional patch for stubs when CLASS_LED disabled
- Drop CLASS_LED dependency for PHYLIB (to fix kbot errors reported)
Changes in new series v4:
- Changes in Andrew patch:
- net: phy: Add a binding for PHY LEDs:
- Rename phy_led: led_list to list
- Rename phy_device: led_list to leds
- Remove phy_leds_remove() since devm_ should do what is needed
- Fixup documentation for struct phy_led
- Fail probe on LED errors
- net: phy: phy_device: Call into the PHY driver to set LED brightness
- Moved phy_led::phydev from previous patch to here since it is first
used here.
- net: phy: marvell: Implement led_blink_set()
- Use int instead of unsigned
- net: phy: marvell: Add software control of the LEDs
- Use int instead of unsigned
- Add depends on LED_CLASS for qca8k Kconfig
- Fix Makefile for qca8k as suggested
- Move qca8k_setup_led_ctrl to separate header
- Move Documentation from dsa-port to ethernet-controller
- Drop trailing . from Andrew patch fro consistency
Changes in new series v3:
- Move QCA8K_LEDS Kconfig option from tristate to bool
- Use new helper led_init_default_state_get for default-state in qca8k
- Drop cled_qca8k_brightness_get() as there isn't a good way to describe
the mode the led is currently in
- Rework qca8k_led_brightness_get() to return true only when LED is set
to always ON
Changes in new series v2:
- Add LEDs node for rb3011
- Fix rb3011 switch node unevaluated properties while running
make dtbs_check
- Fix a copypaste error in qca8k-leds.c for port 4 required shift
- Drop phy-handle usage for qca8k and use qca8k_port_to_phy()
- Add review tag from Andrew
- Add Christian Marangi SOB in each Andrew patch
- Add extra description for dsa-port stressing that PHY have no access
and LED are controlled by the related MAC
- Add missing additionalProperties for dsa-port.yaml and ethernet-phy.yaml
Changes from the old v8 series:
- Drop linux,default-trigger set to netdev.
- Dropped every hw control related patch and implement only
blink_set and brightness_set
- Add default-state to "keep" for each LED node example
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Network LEDs can exist in both the MAC and the PHY. Naming is
difficult because the netdev name is neither stable or unique, do to
commands like ip link set name eth42 dev eth0, and network
namesspaces.
Give some example names where the MAC and the PHY have unique names
based on device tree nodes, or PCI bus addresses.
Since the LED can be used for anything which Linux supports for LEDs,
avoid using names like activity or link, rather describe the location
on the RJ-45, of what the RJ-45 is expected to be used for, WAN/LAN
etc.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The WAN port of the 370-RD has a Marvell PHY, with one LED on
the front panel.y List this LED in the device tree.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Document support for LEDs node in phy and add an example for it.
PHY LED will have to match led pattern and should be treated as a
generic led.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add Switch LED for each port for MikroTik RB3011UiAS-RM.
MikroTik RB3011UiAS-RM is a 10 port device with 2 qca8337 switch chips
connected.
It was discovered that in the hardware design all 3 Switch LED trace of
the related port is connected to the same LED. This was discovered by
setting to 'always on' the related led in the switch regs and noticing
that all 3 LED for the specific port (for example for port 1) cause the
connected LED for port 1 to turn on. As an extra test we tried enabling
2 different LED for the port resulting in the LED turned off only if
every led in the reg was off.
Aside from this funny and strange hardware implementation, the device
itself have one green LED for each port, resulting in 10 green LED one
for each of the 10 supported port.
Cc: Jonathan McDowell <noodles@earth.li>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPQ8064 MikroTik RB3011UiAS-RM DT have currently unevaluted properties
in the 2 switch nodes. The bindings #address-cells and #size-cells are
redundant and cause warning for 'Unevaluated properties are not
allowed'.
Drop these bindings to mute these warning as they should not be there
from the start.
Cc: Jonathan McDowell <noodles@earth.li>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Jonathan McDowell <noodles@earth.li>
Tested-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add LEDs definition example for qca8k Switch Family to describe how they
should be defined for a correct usage.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Document support for LEDs node in ethernet-controller.
Ethernet Controller may support different LEDs that can be configured
for different operation like blinking on traffic event or port link.
Also add some Documentation to describe the difference of these nodes
compared to PHY LEDs, since ethernet-controller LEDs are controllable
by the ethernet controller regs and the possible intergated PHY doesn't
have control on them.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Marvell PHY can blink the LEDs, simple on/off. All LEDs blink at
the same rate, and the reset default is 84ms per blink, which is
around 12Hz.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Linux LEDs can be requested to perform hardware accelerated
blinking. Pass this to the PHY driver, if it implements the op.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a brightness function, so the LEDs can be controlled from
software using the standard Linux LED infrastructure.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Linux LEDs can be software controlled via the brightness file in /sys.
LED drivers need to implement a brightness_set function which the core
will call. Implement an intermediary in phy_device, which will call
into the phy driver if it implements the necessary function.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Define common binding parsing for all PHY drivers with LEDs using
phylib. Parse the DT as part of the phy_probe and add LEDs to the
linux LED class infrastructure. For the moment, provide a dummy
brightness function, which will later be replaced with a call into the
PHY driver. This allows testing since the LED core might otherwise
reject an LED whose brightness cannot be set.
Add a dependency on LED_CLASS. It either needs to be built in, or not
enabled, since a modular build can result in linker errors.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Provide stubs for devm_led_classdev_register_ext() and
led_init_default_state_get() so that LED drivers embedded within other
drivers such as PHYs and Ethernet switches still build when LEDS_CLASS
or NEW_LEDS are disabled. This also helps with Kconfig dependencies,
which are somewhat hairy for phylib and mdio and only get worse when
adding a dependency on LED_CLASS.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add LEDs blink_set() support to qca8k Switch Family.
These LEDs support hw accellerated blinking at a fixed rate
of 4Hz.
Reject any other value since not supported by the LEDs switch.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add LEDs basic support for qca8k Switch Family by adding basic
brightness_set() support.
Since these LEDs refelect port status, the default label is set to
":port". DT binding should describe the color and function of the
LEDs using standard LEDs api.
Each LED always have the device name as prefix. The device name is
composed from the mii bus id and the PHY addr resulting in example
names like:
- qca8k-0.0:00:amber:lan
- qca8k-0.0:00:white:lan
- qca8k-0.0:01:amber:lan
- qca8k-0.0:01:white:lan
These LEDs supports only blocking variant of the brightness_set()
function since they can sleep during access of the switch leds to set
the brightness.
While at it add to the qca8k header file each mode defined by the Switch
Documentation for future use.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move qca8k_port_to_phy() to qca8k header as it's useful for future
reference in Switch LEDs module since the same logic is applied to get
the right index of the switch port.
Make it inline as it's simple function that just decrease the port.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Our use-case needs two AT ports available:
One for running a ppp daemon, and another one for management
This patch enables a second AT port on DATA1
Signed-off-by: Jaime Breva <jbreva@nayarsystems.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The hardware SDO has issue to fill txd for the moment, so fallback to
driver filling method.
Fixes: 98686cd21624 (wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices)
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
Drivers that do not generate IV/PN in software can safely rekey PTK0
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
Improves performance by using bulk allocation
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
mt7663 efuse has 0x600 bytes instead of 0x400. Increase the size in order
to fix issues with incomplete calibration data
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
This enables HW offloading amsdu/de-amsdu support for 802.11s mesh
interface.
Co-developed-by: Bo Jiao <bo.jiao@mediatek.com>
Signed-off-by: Bo Jiao <bo.jiao@mediatek.com>
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
The user is allowed to change beacon tx rate (HT/VHT/HE) from hostapd.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
Similar to BSS_CHANGED_BASIC_RATES, this enables mcast rate
configuration through fixed rate tables.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Change-Id: Ifc305e8c7de9a7df4ad5f856e2097d721a886aaa
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
The connac3 removes fixed rate fields to reduce txd size and introduces
global rate tables (64 entries) for rate setting. Driver needs to fill
the corresponding idx in MT_TXD6_TX_RATE while tx, and push mt76_rate
into predifined table at bootup stage so that mvif->basic_rates_idx
can immediately switch out once setting changes.
spe_idx is also needed for fixed rate frames, and will be updated by
future patches.
Note that all table entries are shared across driver and firmware
(i.e.TxBF), hence adding MT7996_BASIC_RATES_TBL to reflect mapping
status.
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
Matthieu Baerts says:
====================
mptcp: fixes around listening sockets and the MPTCP worker
Christoph Paasch reported a couple of issues found by syzkaller and
linked to operations done by the MPTCP worker on (un)accepted sockets.
Fixing these issues was not obvious and rather complex but Paolo Abeni
nicely managed to propose these excellent patches that seem to satisfy
syzkaller.
Patch 1 partially reverts a recent fix but while still providing a
solution for the previous issue, it also prevents the MPTCP worker from
running concurrently with inet_csk_listen_stop(). A warning is then
avoided. The partially reverted patch has been introduced in v6.3-rc3,
backported up to v6.1 and fixing an issue visible from v5.18.
Patch 2 prevents the MPTCP worker to race with mptcp_accept() causing a
UaF when a fallback to TCP is done while in parallel, the socket is
being accepted by the userspace. This is also a fix of a previous fix
introduced in v6.3-rc3, backported up to v6.1 but here fixing an issue
that is in theory there from v5.7. There is no need to backport it up
to here as it looks like it is only visible later, around v5.18, see the
previous cover-letter linked to this original fix.
====================
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
|
|
The mptcp worker and mptcp_accept() can race, as reported by Christoph:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 14351 at lib/refcount.c:25 refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25
Modules linked in:
CPU: 1 PID: 14351 Comm: syz-executor.2 Not tainted 6.3.0-rc1-gde5e8fd0123c #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25
Code: 02 31 ff 89 de e8 1b f0 a7 ff 84 db 0f 85 6e ff ff ff e8 3e f5 a7 ff 48 c7 c7 d8 c7 34 83 c6 05 6d 2d 0f 02 01 e8 cb 3d 90 ff <0f> 0b e9 4f ff ff ff e8 1f f5 a7 ff 0f b6 1d 54 2d 0f 02 31 ff 89
RSP: 0018:ffffc90000a47bf8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88802eae98c0 RSI: ffffffff81097d4f RDI: 0000000000000001
RBP: ffff88802e712180 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: ffff88802eaea148 R12: ffff88802e712100
R13: ffff88802e712a88 R14: ffff888005cb93a8 R15: ffff88802e712a88
FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f277fd89120 CR3: 0000000035486002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_add include/linux/refcount.h:199 [inline]
__refcount_inc include/linux/refcount.h:250 [inline]
refcount_inc include/linux/refcount.h:267 [inline]
sock_hold include/net/sock.h:775 [inline]
__mptcp_close+0x4c6/0x4d0 net/mptcp/protocol.c:3051
mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072
inet_release+0x56/0xa0 net/ipv4/af_inet.c:429
__sock_release+0x51/0xf0 net/socket.c:653
sock_close+0x18/0x20 net/socket.c:1395
__fput+0x113/0x430 fs/file_table.c:321
task_work_run+0x96/0x100 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x4fc/0x10c0 kernel/exit.c:869
do_group_exit+0x51/0xf0 kernel/exit.c:1019
get_signal+0x12b0/0x1390 kernel/signal.c:2859
arch_do_signal_or_restart+0x25/0x260 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
exit_to_user_mode_prepare+0x131/0x1a0 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x19/0x40 kernel/entry/common.c:296
do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fec4b4926a9
Code: Unable to access opcode bytes at 0x7fec4b49267f.
RSP: 002b:00007fec49f9dd78 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006bc058 RCX: 00007fec4b4926a9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006bc058
RBP: 00000000006bc050 R08: 00000000007df998 R09: 00000000007df998
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c
R13: fffffffffffffea8 R14: 000000000000000b R15: 000000000001fe40
</TASK>
The root cause is that the worker can force fallback to TCP the first
mptcp subflow, actually deleting the unaccepted msk socket.
We can explicitly prevent the race delaying the unaccepted msk deletion
at listener shutdown time. In case the closed subflow is later accepted,
just drop the mptcp context and let the user-space deal with the
paired mptcp socket.
Fixes: b6985b9b8295 ("mptcp: use the workqueue to destroy unaccepted sockets")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/375
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a partial revert of the blamed commit, with a relevant
change: mptcp_subflow_queue_clean() now just change the msk
socket status and stop the worker, so that the UaF issue addressed
by the blamed commit is not re-introduced.
The above prevents the mptcp worker from running concurrently with
inet_csk_listen_stop(), as such race would trigger a warning, as
reported by Christoph:
RSP: 002b:00007f784fe09cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
WARNING: CPU: 0 PID: 25807 at net/ipv4/inet_connection_sock.c:1387 inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387
RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f7850afd6a9
RDX: 0000000000000000 RSI: 0000000020000340 RDI: 0000000000000004
Modules linked in:
RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c
R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40
</TASK>
CPU: 0 PID: 25807 Comm: syz-executor.7 Not tainted 6.2.0-g778e54711659 #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387
RAX: 0000000000000000 RBX: ffff888100dfbd40 RCX: 0000000000000000
RDX: ffff8881363aab80 RSI: ffffffff81c494f4 RDI: 0000000000000005
RBP: ffff888126dad080 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff888100dfe040
R13: 0000000000000001 R14: 0000000000000000 R15: ffff888100dfbdd8
FS: 00007f7850a2c800(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32d26000 CR3: 000000012fdd8006 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
__tcp_close+0x5b2/0x620 net/ipv4/tcp.c:2875
__mptcp_close_ssk+0x145/0x3d0 net/mptcp/protocol.c:2427
mptcp_destroy_common+0x8a/0x1c0 net/mptcp/protocol.c:3277
mptcp_destroy+0x41/0x60 net/mptcp/protocol.c:3304
__mptcp_destroy_sock+0x56/0x140 net/mptcp/protocol.c:2965
__mptcp_close+0x38f/0x4a0 net/mptcp/protocol.c:3057
mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072
inet_release+0x53/0xa0 net/ipv4/af_inet.c:429
__sock_release+0x4e/0xf0 net/socket.c:651
sock_close+0x15/0x20 net/socket.c:1393
__fput+0xff/0x420 fs/file_table.c:321
task_work_run+0x8b/0xe0 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:296
do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f7850af70dc
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7850af70dc
RDX: 00007f7850a2c800 RSI: 0000000000000002 RDI: 0000000000000003
RBP: 00000000006bd980 R08: 0000000000000000 R09: 00000000000018a0
R10: 00000000316338a4 R11: 0000000000000293 R12: 0000000000211e31
R13: 00000000006bc05c R14: 00007f785062c000 R15: 0000000000211af0
Fixes: 0a3f4f1f9c27 ("mptcp: fix UaF in listener shutdown")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/371
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes a missing 8 byte for the header size calculation. The
ipv6_rpl_srh_size() is used to check a skb_pull() on skb->data which
points to skb_transport_header(). Currently we only check on the
calculated addresses fields using CmprI and CmprE fields, see:
https://www.rfc-editor.org/rfc/rfc6554#section-3
there is however a missing 8 byte inside the calculation which stands
for the fields before the addresses field. Those 8 bytes are represented
by sizeof(struct ipv6_rpl_sr_hdr) expression.
Fixes: 8610c7c6e3bd ("net: ipv6: add support for rpl sr exthdr")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Reported-by: maxpl0it <maxpl0it@protonmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When vmxnet3_rq_create() fails to allocate rq->data_ring.base due to page
allocation failure, subsequent call to vmxnet3_rq_rx_complete() can result in
NULL pointer dereference.
To fix this bug, check not only that rxDataRingUsed is true but also that
adapter->rxdataring_enabled is true before calling memcpy() in
vmxnet3_rq_rx_complete().
[1728352.477993] ethtool: page allocation failure: order:9, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
...
[1728352.478009] Call Trace:
[1728352.478028] dump_stack+0x41/0x60
[1728352.478035] warn_alloc.cold.120+0x7b/0x11b
[1728352.478038] ? _cond_resched+0x15/0x30
[1728352.478042] ? __alloc_pages_direct_compact+0x15f/0x170
[1728352.478043] __alloc_pages_slowpath+0xcd3/0xd10
[1728352.478047] __alloc_pages_nodemask+0x2e2/0x320
[1728352.478049] __dma_direct_alloc_pages.constprop.25+0x8a/0x120
[1728352.478053] dma_direct_alloc+0x5a/0x2a0
[1728352.478056] vmxnet3_rq_create.part.57+0x17c/0x1f0 [vmxnet3]
...
[1728352.478188] vmxnet3 0000:0b:00.0 ens192: rx data ring will be disabled
...
[1728352.515347] BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
...
[1728352.515440] RIP: 0010:memcpy_orig+0x54/0x130
...
[1728352.515655] Call Trace:
[1728352.515665] <IRQ>
[1728352.515672] vmxnet3_rq_rx_complete+0x419/0xef0 [vmxnet3]
[1728352.515690] vmxnet3_poll_rx_only+0x31/0xa0 [vmxnet3]
...
Signed-off-by: Seiji Nishikawa <snishika@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tariq Toukan says:
====================
net/mlx5e: Extend XDP multi-buffer capabilities
This series extends the XDP multi-buffer support in the mlx5e driver.
Patchset breakdown:
- Infrastructural changes and preparations.
- Add XDP multi-buffer support for XDP redirect-in.
- Use TX MPWQE (multi-packet WQE) HW feature for non-linear
single-segmented XDP frames.
- Add XDP multi-buffer support for striding RQ.
In Striding RQ, we overcome the lack of headroom and tailroom between
the RQ strides by allocating a side page per packet and using it for the
xdp_buff descriptor. We structure the xdp_buff so that it contains
nothing in the linear part, and the whole packet resides in the
fragments.
Performance highlight:
Packet rate test, 64 bytes, 32 channels, MTU 9000 bytes.
CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz.
NIC: ConnectX-6 Dx, at 100 Gbps.
+----------+-------------+-------------+---------+
| Test | Legacy RQ | Striding RQ | Speedup |
+----------+-------------+-------------+---------+
| XDP_DROP | 101,615,544 | 117,191,020 | +15% |
+----------+-------------+-------------+---------+
| XDP_TX | 95,608,169 | 117,043,422 | +22% |
+----------+-------------+-------------+---------+
Series generated against net commit:
e61caf04b9f8 Merge branch 'page_pool-allow-caching-from-safely-localized-napi'
I'm submitting this directly as Saeed is traveling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Here we add support for multi-buffer XDP handling in Striding RQ, which
is our default out-of-the-box RQ type. Before this series, loading such
an XDP program would fail, until you switch to the legacy RQ (by
unsetting the rx_striding_rq priv-flag).
To overcome the lack of headroom and tailroom between the strides, we
allocate a side page to be used for the descriptor (xdp_buff / skb) and
the linear part. When an XDP program is attached, we structure the
xdp_buff so that it contains no data in the linear part, and the whole
packet resides in the fragments.
In case of XDP_PASS, where an SKB still needs to be created, we copy up
to 256 bytes to its linear part, to match the current behavior, and
satisfy functions that assume finding the packet headers in the SKB
linear part (like eth_type_trans).
Performance testing:
Packet rate test, 64 bytes, 32 channels, MTU 9000 bytes.
CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz.
NIC: ConnectX-6 Dx, at 100 Gbps.
+----------+-------------+-------------+---------+
| Test | Legacy RQ | Striding RQ | Speedup |
+----------+-------------+-------------+---------+
| XDP_DROP | 101,615,544 | 117,191,020 | +15% |
+----------+-------------+-------------+---------+
| XDP_TX | 95,608,169 | 117,043,422 | +22% |
+----------+-------------+-------------+---------+
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for supporting XDP multi-buffer in striding RQ, use
xdp_buff struct to describe the packet. Make its skb_shared_info collide
the one of the allocated SKB, then add the fragments using the xdp_buff
API.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make the function more generic. Let it get an additional frame_sz
parameter instead of deriving it from the RQ struct.
No functional change here, just a preparation for a downstream patch.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce mlx5e_add_skb_shared_info_frag(), a function dedicated for
adding a fragment into a struct skb_shared_info object.
Use it in the Legacy RQ flow. Similar usage will be added in a
downstream patch by the corresponding Striding RQ flow.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Under a few restrictions, TX MPWQE feature can serve multiple TX packets
in a single TX descriptor. It requires each of the packets to have a
single scatter entry / segment.
Today we allow only linear frames to use this feature, although there's
no real problem with non-linear ones where the whole packet reside in
the first fragment.
Expand the XDP TX MPWQE feature support to include such frames. This is
in preparation for the downstream patch, in which we will generate such
non-linear frames.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove the assumption of non-zero linear length in the XDP xmit
function, used to serve both internal XDP_TX operations as well as
redirected-in requests.
Do not apply the MLX5E_XDP_MIN_INLINE check unless necessary.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
calculations
Function mlx5e_rx_get_linear_stride_sz() returns PAGE_SIZE immediately
in case an XDP program is attached. The more accurate formula is
ALIGN(sz, PAGE_SIZE), to prevent two packets from residing on the same
page.
The assumption behind the current code is that sz <= PAGE_SIZE holds for
all cases with XDP program set.
This is true because it is being called from:
- 3 times from Striding RQ flows, in which XDP is not supported for such
large packets.
- 1 time from Legacy RQ flow, under the condition
mlx5e_rx_is_linear_skb().
No functional change here, just removing the implied assumption in
preparation for supporting XDP multi-buffer in Striding RQ.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Change mlx5e_xdp_allowed() so it gets the params structure with the
xdp_prog applied, rather than creating a local copy based on the current
params in priv.
This reduces the amount of memory on the stack, and acts on the exact
params instance that's about to be applied.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Non-linear mem scheme of Striding RQ does not yet support XDP at this
point. Take the check where it belongs, inside the params validation
function mlx5e_params_validate_xdp().
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Handle multi-buffer XDP redirect-in requests coming through
mlx5e_xdp_xmit.
Extend struct mlx5e_xmit_data_frags with an additional dma_arr field, to
point to the fragments dma mapping, as they cannot be retrieved via the
page_pool_get_dma_addr() function.
Push a dma_addr xdpi instance per each fragment, and use them in the
completion flow to dma_unmap the frags.
Finally, remove the restriction in mlx5e_open_xdpsq, and set the flag in
xdp_features.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Here we fix the current wi->num_pkts abuse, as it was used to indicate
multiple xdpi entries in the xdpi_fifo.
Instead, reduce mlx5e_xdp_info to the size of a single field, making it
a union of unions. Per packet, use as many instances as needed to
provide the information needed at the time of completion.
The sequence of xdpi instances pushed is well defined, derived by the
xmit_mode.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is not likely nor unlikely that the xdp buff has fragments, it
depends on the program loaded and size of the packet received.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce struct mlx5e_xmit_data_frags to be used for non-linear xmit
buffers. Let it include sinfo pointer.
Take one bit from the len field to indicate if the descriptor has
fragments and can be casted-up into the extended version.
Zero-init to make sure has_frags, and potentially future fields, are
zero when not explicitly assigned.
Another field will be added in a downstream patch to indicate and point
to dma addresses of the different frags, for redirect-in requests.
This simplifies the mlx5e_xmit_xdp_frame/mlx5e_xmit_xdp_frame_mpwqe
functions params.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move TX datapath struct from the generic en.h to the datapath txrx.h
header, where it belongs.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move struct mlx5e_xdp_info and enum mlx5e_xdp_xmit_mode from the generic
en.h to the XDP header, where they belong.
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a net device is put administratively up, its 'IFF_UP' flag is set
(if not set already) and a 'NETDEV_UP' notification is emitted, which
causes the 8021q driver to add VLAN ID 0 on the device. The reverse
happens when a net device is put administratively down.
When changing the type of a bond to Ethernet, its 'IFF_UP' flag is
incorrectly cleared, resulting in the kernel skipping the above process
and VLAN ID 0 being leaked [1].
Fix by restoring the flag when changing the type to Ethernet, in a
similar fashion to the restoration of the 'IFF_SLAVE' flag.
The issue can be reproduced using the script in [2], with example out
before and after the fix in [3].
[1]
unreferenced object 0xffff888103479900 (size 256):
comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
hex dump (first 32 bytes):
00 a0 0c 15 81 88 ff ff 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
[<ffffffff8406426c>] vlan_vid_add+0x30c/0x790
[<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
[<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
[<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
[<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
[<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
[<ffffffff8379af36>] do_setlink+0xb96/0x4060
[<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
[<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
[<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
[<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
[<ffffffff839a738f>] netlink_unicast+0x53f/0x810
[<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
[<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
[<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
unreferenced object 0xffff88810f6a83e0 (size 32):
comm "ip", pid 329, jiffies 4294775225 (age 28.561s)
hex dump (first 32 bytes):
a0 99 47 03 81 88 ff ff a0 99 47 03 81 88 ff ff ..G.......G.....
81 00 00 00 01 00 00 00 cc cc cc cc cc cc cc cc ................
backtrace:
[<ffffffff81a6051a>] kmalloc_trace+0x2a/0xe0
[<ffffffff84064369>] vlan_vid_add+0x409/0x790
[<ffffffff84068e21>] vlan_device_event+0x1491/0x21a0
[<ffffffff81440c8e>] notifier_call_chain+0xbe/0x1f0
[<ffffffff8372383a>] call_netdevice_notifiers_info+0xba/0x150
[<ffffffff837590f2>] __dev_notify_flags+0x132/0x2e0
[<ffffffff8375ad9f>] dev_change_flags+0x11f/0x180
[<ffffffff8379af36>] do_setlink+0xb96/0x4060
[<ffffffff837adf6a>] __rtnl_newlink+0xc0a/0x18a0
[<ffffffff837aec6c>] rtnl_newlink+0x6c/0xa0
[<ffffffff837ac64e>] rtnetlink_rcv_msg+0x43e/0xe00
[<ffffffff839a99e0>] netlink_rcv_skb+0x170/0x440
[<ffffffff839a738f>] netlink_unicast+0x53f/0x810
[<ffffffff839a7fcb>] netlink_sendmsg+0x96b/0xe90
[<ffffffff8369d12f>] ____sys_sendmsg+0x30f/0xa70
[<ffffffff836a6d7a>] ___sys_sendmsg+0x13a/0x1e0
[2]
ip link add name t-nlmon type nlmon
ip link add name t-dummy type dummy
ip link add name t-bond type bond mode active-backup
ip link set dev t-bond up
ip link set dev t-nlmon master t-bond
ip link set dev t-nlmon nomaster
ip link show dev t-bond
ip link set dev t-dummy master t-bond
ip link show dev t-bond
ip link del dev t-bond
ip link del dev t-dummy
ip link del dev t-nlmon
[3]
Before:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 46:57:39:a4:46:a2 brd ff:ff:ff:ff:ff:ff
After:
12: t-bond: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/netlink
12: t-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 66:48:7b:74:b6:8a brd ff:ff:ff:ff:ff:ff
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Fixes: 75c78500ddad ("bonding: remap muticast addresses without using dev_close() and dev_open()")
Fixes: 9ec7eb60dcbc ("bonding: restore IFF_MASTER/SLAVE flags on bond enslave ether type change")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/netdev/78a8a03b-6070-3e6b-5042-f848dab16fb8@alu.unizg.hr/
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|