Age | Commit message (Collapse) | Author |
|
There is no need to read USR2 twice at every loop iteration: get rid of the
second read.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Link: https://lore.kernel.org/r/20230201142700.4346-6-sorganov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There is no reason to prematurely break out of FIFO reading loop, and it
might cause needless reenters into ISR, so keep reading until FIFO is
empty.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Link: https://lore.kernel.org/r/20230201142700.4346-5-sorganov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Do not call uart_handle_sysrq_char() if we got any receive error along with
the character, as we don't want random junk to be considered a sysrq.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Link: https://lore.kernel.org/r/20230201142700.4346-4-sorganov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Check if hardware Rx flood is in progress, and issue soft reset to UART to
stop the flood.
A way to reproduce the flood (checked on iMX6SX) is: open iMX UART at 9600
8N1, and from external source send 0xf0 char at 115200 8N1. In about 90% of
cases this starts a flood of "receiving" of 0xff characters by the iMX UART
that is terminated by any activity on RxD line, or could be stopped by
issuing soft reset to the UART (just stop/start of RX does not help). Note
that in essence what we did here is sending isolated start bit about 2.4
times shorter than it is to be if issued on the UART configured baud rate.
There was earlier attempt to fix similar issue in: 'commit
b38cb7d25711 ("serial: imx: Disable new features of autobaud detection")',
but apparently it only gets harder to reproduce the issue after that
commit.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Link: https://lore.kernel.org/r/20230201142700.4346-3-sorganov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We perform soft reset in 2 places, slightly differently for no sufficient
reasons, so move more generic variant to a function, and re-use the code.
Out of 2 repeat counters, 10 and 100, select 10, as the code works at
interrupts disabled, and in practice the reset happens immediately.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Link: https://lore.kernel.org/r/20230201142700.4346-2-sorganov@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
pci1xxxx's quad-uart function has the capability to wake up UART
from suspend state. Enable wakeup before entering into suspend and
disable wakeup on resume.
Co-developed-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Link: https://lore.kernel.org/r/20230207164814.3104605-5-kumaravel.thiagarajan@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
pci1xxxx uart supports RS485 mode of operation in the hardware with
auto-direction control with configurable delay for releasing RTS after
the transmission. This patch adds support for the RS485 mode.
Co-developed-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230207164814.3104605-4-kumaravel.thiagarajan@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
pci1xxxx is a PCIe switch with a multi-function endpoint on one of
its downstream ports. Quad-uart is one of the functions in the
multi-function endpoint. This driver loads for the quad-uart and
enumerates single or multiple instances of uart based on the PCIe
subsystem device ID.
Co-developed-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230207164814.3104605-3-kumaravel.thiagarajan@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Move implementation of setup_port func() to serial8250_pci_setup_port.
Co-developed-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Tharun Kumar P <tharunkumar.pasumarthi@microchip.com>
Signed-off-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230207164814.3104605-2-kumaravel.thiagarajan@microchip.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20230202141221.2293012-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Synchronizes hotplug remove with the freeing of the port.
This ensures we have freed all the memory associated with
this port and are not leaking memory.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20230203155802.404324-6-brking@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When hotplug removing an hvcs device, we need to ensure the
hangup processing is done prior to exiting the remove function,
so use tty_vhangup to do the hangup processing directly
rather than using tty_hangup which simply schedules the hangup
work for later execution.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20230203155802.404324-5-brking@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Grab a reference to the tty when removing the hvcs to ensure
it does not get freed unexpectedly.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20230203155802.404324-4-brking@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Rather than manually creating attributes for the hvcs driver,
let the driver core do this for us. This also fixes some hotplug
remove issues and ensures that cleanup of these attributes
is done in the right order.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20230203155802.404324-3-brking@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the dev_groups functionality to manage the attribute groups
for hvcs devices. This simplifies the code and also eliminates
errors coming from kernfs when attempting to remove a console
device that is in use.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20230203155802.404324-2-brking@linux.vnet.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There maybe pending USR interrupt before requesting irq, however
uart_add_one_port has not executed, so there will be kernel panic:
[ 0.795668] Unable to handle kernel NULL pointer dereference at virtual addre
ss 0000000000000080
[ 0.802701] Mem abort info:
[ 0.805367] ESR = 0x0000000096000004
[ 0.808950] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.814033] SET = 0, FnV = 0
[ 0.816950] EA = 0, S1PTW = 0
[ 0.819950] FSC = 0x04: level 0 translation fault
[ 0.824617] Data abort info:
[ 0.827367] ISV = 0, ISS = 0x00000004
[ 0.831033] CM = 0, WnR = 0
[ 0.833866] [0000000000000080] user address but active_mm is swapper
[ 0.839951] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 0.845953] Modules linked in:
[ 0.848869] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.1+g56321e101aca #1
[ 0.855617] Hardware name: Freescale i.MX8MP EVK (DT)
[ 0.860452] pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.867117] pc : __imx_uart_rxint.constprop.0+0x11c/0x2c0
[ 0.872283] lr : imx_uart_int+0xf8/0x1ec
The issue only happends in the inmate linux when Jailhouse hypervisor
enabled. The test procedure is:
while true; do
jailhouse enable imx8mp.cell
jailhouse cell linux xxxx
sleep 10
jailhouse cell destroy 1
jailhouse disable
sleep 5
done
And during the upper test, press keys to the 2nd linux console.
When `jailhouse cell destroy 1`, the 2nd linux has no chance to put
the uart to a quiese state, so USR1/2 may has pending interrupts. Then
when `jailhosue cell linux xx` to start 2nd linux again, the issue
trigger.
In order to disable irqs before requesting them, both UCR1 and UCR2 irqs
should be disabled, so here fix that, disable the Ageing Timer interrupt
in UCR2 as UCR1 does.
Fixes: 8a61f0c70ae6 ("serial: imx: Disable irqs before requesting them")
Suggested-by: Sherry Sun <sherry.sun@nxp.com>
Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Acked-by: Jason Liu <jason.hui.liu@nxp.com>
Link: https://lore.kernel.org/r/20230206013016.29352-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The previous 'commit 846651eca073 ("serial: fsl_lpuart: RS485 RTS
polariy is inverse")' only fixed the inverse issue on lpuart 8bit
platforms.
This is a follow-up patch to fix the RS485 polarity inverse
issue on lpuart 32bit platforms.
Fixes: 03895cf41d18 ("tty: serial: fsl_lpuart: Add support for RS-485")
Reported-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Link: https://lore.kernel.org/r/20230207162420.3647904-1-shenwei.wang@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add the missing properties as compatible, reg, sound-dai-cells.
And then check this file using 'make dt_binding_check'.
Signed-off-by: Kiseok Jo <kiseok.jo@irondevice.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230208092420.5037-8-kiseok.jo@irondevice.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
It seems correct that the user changes the TDM slot needed after
device probe.
Signed-off-by: Kiseok Jo <kiseok.jo@irondevice.com>
Link: https://lore.kernel.org/r/20230208092420.5037-6-kiseok.jo@irondevice.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
It's necessary to set the value for each device, so remove that.
Signed-off-by: Kiseok Jo <kiseok.jo@irondevice.com>
Link: https://lore.kernel.org/r/20230208092420.5037-5-kiseok.jo@irondevice.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Tidyup reg/reg-name "maxItems".
Pointed by Krzysztof, and corrected by Rob.
Link: https://lore.kernel.org/r/46974ae7-5f7f-8fc1-4ea8-fe77b58f5bfb@linaro.org
Link: https://lore.kernel.org/r/20230207211621.GA4158591-robh@kernel.org
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/87pmalt01x.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
SOF driver calls snd_sof_dsp_update8 with parameters mask and value but
the snd_sof_dsp_update8 declares these two parameters in reverse order.
This causes some issues such as d0i3 register can't be set correctly
Now change function definition according to common SOF usage.
Fixes: c28a36b012f1 ("ASoC: SOF: ops: add snd_sof_dsp_updateb() helper")
Signed-off-by: Rander Wang <rander.wang@intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Chao Song <chao.song@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20230208104404.20554-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next
Mika writes:
thunderbolt: Changes for v6.3 merge window
This includes following Thunderbolt/USB4 changes for the v6.3 merge
window:
- Add support for DisplayPort bandwidth allocation mode
- Debug logging improvements
- Minor cleanups.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Add missing kernel-doc comment to tb_tunnel_maximum_bandwidth()
thunderbolt: Handle bandwidth allocation mode enablement notification
thunderbolt: Add support for DisplayPort bandwidth allocation mode
thunderbolt: Include the additional DP IN double word in debugfs dump
thunderbolt: Add functions to support DisplayPort bandwidth allocation mode
thunderbolt: Increase timeout of DP OUT adapter handshake
thunderbolt: Take CL states into account when waiting for link to come up
thunderbolt: Improve debug logging in tb_available_bandwidth()
thunderbolt: Log DP adapter type
thunderbolt: Use decimal port number in control and tunnel logs too
thunderbolt: Refactor tb_acpi_add_link()
thunderbolt: Use correct type in tb_port_is_clx_enabled() prototype
|
|
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230202141919.2298821-1-gregkh@linuxfoundation.org
|
|
This looks like it came across from x86, but x86 uses TLB_FLUSH_ALL as
a parameter to internal functions. Powerpc never sets it anywhere.
Remove the associated logic and leave a warning for now.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230203111718.1149852-4-npiggin@gmail.com
|
|
The MMU_NO_CONTEXT checks are an unnecessary complication. Make
these warn to prepare to remove them in future.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230203111718.1149852-3-npiggin@gmail.com
|
|
need_flush_all is only set by arch code to instruct generic tlb_flush
to flush all. It is never set by powerpc, so it can be removed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230203111718.1149852-2-npiggin@gmail.com
|
|
CLANG only knows the following CPUs:
generic, 440, 450, 601, 602, 603, 603e, 603ev, 604, 604e, 620, 630,
g3, 7400, g4, 7450, g4+, 750, 8548, 970, g5, a2, e500, e500mc, e5500,
power3, pwr3, power4, pwr4, power5, pwr5, power5x, pwr5x, power6,
pwr6, power6x, pwr6x, power7, pwr7, power8, pwr8, power9, pwr9,
power10, pwr10, powerpc, ppc, ppc32, powerpc64, ppc64, powerpc64le,
ppc64le, futur
Disable other ones when CC_IS_CLANG.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e62892e32c14a7a5738c597e39e0082cb0abf21c.1675335659.git.christophe.leroy@csgroup.eu
|
|
Add device tree file for sam9x60 curiosity board.
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-9-durai.manickamkr@microchip.com
|
|
Adding the sam9x60 curiosity board from Microchip into the atmel AT91 board
description yaml file.
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-8-durai.manickamkr@microchip.com
|
|
Added the missing flexcom functions for all the flexcom nodes.
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
[durai.manickamkr@microchip.com: added missing UART compatibles]
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-7-durai.manickamkr@microchip.com
|
|
Add dma bindings for flexcom nodes in the soc dtsi file. Users those who
don't wish to use the DMA function for their flexcom functions can
overwrite the dma bindings in the board device tree file.
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
[durai.manickamkr@microchip.com: fixed code indentation and updated commit log]
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-6-durai.manickamkr@microchip.com
|
|
The UART submodule in Flexcom has 16-byte Transmit and Receive FIFOs.
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-5-durai.manickamkr@microchip.com
|
|
The ranges, #address-cells and #size-cells properties are not required,
remove them from the spi4 node.
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-4-durai.manickamkr@microchip.com
|
|
Move the flexcom definitions from board specific DTS file
to the SoC specific DTSI file for sam9x60ek.
[durai.manickamkr@microchip.com: Logical split-up of this patch and added
missing UART5 compatibles]
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Hari Prasath Gujulan Elango <Hari.PrasathGE@microchip.com>
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-3-durai.manickamkr@microchip.com
|
|
Fixed the label numbering of the flexcom functions so that all
13 flexcom functions of sam9x60 are in the following order when the missing
flexcom functions are added:
flx0: uart0, spi0, i2c0
flx1: uart1, spi1, i2c1
flx2: uart2, spi2, i2c2
flx3: uart3, spi3, i2c3
flx4: uart4, spi4, i2c4
flx5: uart5, spi5, i2c5
flx6: uart6, i2c6
flx7: uart7, i2c7
flx8: uart8, i2c8
flx9: uart9, i2c9
flx10: uart10, i2c10
flx11: uart11, i2c11
flx12: uart12, i2c12
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20230207110651.197268-2-durai.manickamkr@microchip.com
|
|
Vladimir Oltean says:
====================
taprio automatic queueMaxSDU and new TXQ selection procedure
This patch set addresses 2 design limitations in the taprio software scheduler:
1. Software scheduling fundamentally prioritizes traffic incorrectly,
in a way which was inspired from Intel igb/igc drivers and does not
follow the inputs user space gives (traffic classes and TC to TXQ
mapping). Patch 05/15 handles this, 01/15 - 04/15 are preparations
for this work.
2. Software scheduling assumes that the gate for a traffic class closes
as soon as the next interval begins. But this isn't true.
If consecutive schedule entries have that traffic class gate open,
there is no "gate close" event and taprio should keep dequeuing from
that TC without interruptions. Patches 06/15 - 15/15 handle this.
Patch 10/15 is a generic Qdisc change required for this to work.
Future development directions which depend on this patch set are:
- Propagating the automatic queueMaxSDU calculation down to offloading
device drivers, instead of letting them calculate this, as
vsc9959_tas_guard_bands_update() does today.
- A software data path for tc-taprio with preemptible traffic and
Hold/Release events.
v1 at:
https://patchwork.kernel.org/project/netdevbpf/cover/20230128010719.2182346-1-vladimir.oltean@nxp.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Improve commit 497cc00224cf ("taprio: Handle short intervals and large
packets") to only perform segmentation when skb->len exceeds what
taprio_dequeue() expects.
In practice, this will make the biggest difference when a traffic class
gate is always open in the schedule. This is because the max_frm_len
will be U32_MAX, and such large skb->len values as Kurt reported will be
sent just fine unsegmented.
What I don't seem to know how to handle is how to make sure that the
segmented skbs themselves are smaller than the maximum frame size given
by the current queueMaxSDU[tc]. Nonetheless, we still need to drop
those, otherwise the Qdisc will hang.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The majority of the taprio_enqueue()'s function is spent doing TCP
segmentation, which doesn't look right to me. Compilers shouldn't have a
problem in inlining code no matter how we write it, so move the
segmentation logic to a separate function.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
durations
taprio today has a huge problem with small TC gate durations, because it
might accept packets in taprio_enqueue() which will never be sent by
taprio_dequeue().
Since not much infrastructure was available, a kludge was added in
commit 497cc00224cf ("taprio: Handle short intervals and large
packets"), which segmented large TCP segments, but the fact of the
matter is that the issue isn't specific to large TCP segments (and even
worse, the performance penalty in segmenting those is absolutely huge).
In commit a54fc09e4cba ("net/sched: taprio: allow user input of per-tc
max SDU"), taprio gained support for queueMaxSDU, which is precisely the
mechanism through which packets should be dropped at qdisc_enqueue() if
they cannot be sent.
After that patch, it was necessary for the user to manually limit the
maximum MTU per TC. This change adds the necessary logic for taprio to
further limit the values specified (or not specified) by the user to
some minimum values which never allow oversized packets to be sent.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I have one practical reason for doing this and one concerning correctness.
The practical reason has to do with a follow-up patch, which aims to mix
2 sources of max_sdu (one coming from the user and the other automatically
calculated based on TC gate durations @current link speed). Among those
2 sources of input, we must always select the smaller max_sdu value, but
this can change at various link speeds. So the max_sdu coming from the
user must be kept separated from the value that is operationally used
(the minimum of the 2), because otherwise we overwrite it and forget
what the user asked us to do.
To solve that, this patch proposes that struct sched_gate_list contains
the operationally active max_frm_len, and q->max_sdu contains just what
was requested by the user.
The reason having to do with correctness is based on the following
observation: the admin sched_gate_list becomes operational at a given
base_time in the future. Until then, it is inactive and applies no
shaping, all gates are open, etc. So the queueMaxSDU dropping shouldn't
apply either (this is a mechanism to ensure that packets smaller than
the largest gate duration for that TC don't hang the port; clearly it
makes little sense if the gates are always open).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Vinicius intended taprio to take the L1 overhead into account when
estimating packet transmission time through user input, specifically
through the qdisc size table (man tc-stab).
Something like this:
tc qdisc replace dev $eth root stab overhead 24 taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 \
sched-entry S 0x7e 9000000 \
sched-entry S 0x82 1000000 \
max-sdu 0 0 0 0 0 0 0 200 \
flags 0x0 clockid CLOCK_TAI
Without the overhead being specified, transmission times will be
underestimated and will cause late transmissions. For an offloading
driver, it might even cause TX hangs if there is no open gate large
enough to send the maximum sized packets for that TC (including L1
overhead). Properly knowing the L1 overhead will ensure that we are able
to auto-calculate the queueMaxSDU per traffic class just right, and
avoid these hangs due to head-of-line blocking.
We can't make the stab mandatory due to existing setups, but we can warn
the user that it's important with a warning netlink extack.
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220505160357.298794-1-vladimir.oltean@nxp.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some qdiscs like taprio turn out to be actually pretty reliant on a well
configured stab, to not underestimate the skb transmission time (by
properly accounting for L1 overhead).
In a future change, taprio will need the stab, if configured by the
user, to be available at ops->init() time. It will become even more
important in upcoming work, when the overhead will be used for the
queueMaxSDU calculation that is passed to an offloading driver.
However, rcu_assign_pointer(sch->stab, stab) is called right after
ops->init(), making it unavailable, and I don't really see a good reason
for that.
Move it earlier, which nicely seems to simplify the error handling path
as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
taprio_dequeue_from_txq() looks at the entry->end_time to determine
whether the skb will overrun its traffic class gate, as if at the end of
the schedule entry there surely is a "gate close" event for it. Hint:
maybe there isn't.
For each schedule entry, introduce an array of kernel times which
actually tracks when in the future will there be an *actual* gate close
event for that traffic class, and use that in the guard band overrun
calculation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently taprio assumes that the budget for a traffic class expires at
the end of the current interval as if the next interval contains a "gate
close" event for this traffic class.
This is, however, an unfounded assumption. Allow schedule entry
intervals to be fused together for a particular traffic class by
calculating the budget until the gate *actually* closes.
This means we need to keep budgets per traffic class, and we also need
to update the budget consumption procedure.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a confusion in terms in taprio which makes what is called
"close_time" to be actually used for 2 things:
1. determining when an entry "closes" such that transmitted skbs are
never allowed to overrun that time (?!)
2. an aid for determining when to advance and/or restart the schedule
using the hrtimer
It makes more sense to call this so-called "close_time" "end_time",
because it's not clear at all to me what "closes". Future patches will
hopefully make better use of the term "to close".
This is an absolutely mechanical change.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current taprio code operates on a very simplistic (and incorrect)
assumption: that egress scheduling for a traffic class can only take
place for the duration of the current interval, or i.o.w., it assumes
that at the end of each schedule entry, there is a "gate close" event
for all traffic classes.
As an example, traffic sent with the schedule below will be jumpy, even
though all 8 TC gates are open, so there is absolutely no "gate close"
event (effectively a transition from BIT(tc)==1 to BIT(tc)==0 in
consecutive schedule entries):
tc qdisc replace dev veth0 parent root taprio \
num_tc 2 \
map 0 1 \
queues 1@0 1@1 \
base-time 0 \
sched-entry S 0xff 4000000000 \
clockid CLOCK_TAI \
flags 0x0
This qdisc simply does not have what it takes in terms of logic to
*actually* compute the durations of traffic classes. Also, it does not
recognize the need to use this information on a per-traffic-class basis:
it always looks at entry->interval and entry->close_time.
This change proposes that each schedule entry has an array called
tc_gate_duration[tc]. This holds the information: "for how long will
this traffic class gate remain open, starting from *this* schedule
entry". If the traffic class gate is always open, that value is equal to
the cycle time of the schedule.
We'll also need to keep track, for the purpose of queueMaxSDU[tc]
calculation, what is the maximum time duration for a traffic class
having an open gate. This gives us directly what is the maximum sized
packet that this traffic class will have to accept. For everything else
it has to qdisc_drop() it in qdisc_enqueue().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Current taprio software implementation is haunted by the shadow of the
igb/igc hardware model. It iterates over child qdiscs in increasing
order of TXQ index, therefore giving higher xmit priority to TXQ 0 and
lower to TXQ N. According to discussions with Vinicius, that is the
default (perhaps even unchangeable) prioritization scheme used for the
NICs that taprio was first written for (igb, igc), and we have a case of
two bugs canceling out, resulting in a functional setup on igb/igc, but
a less sane one on other NICs.
To the best of my understanding, taprio should prioritize based on the
traffic class, so it should really dequeue starting with the highest
traffic class and going down from there. We get to the TXQ using the
tc_to_txq[] netdev property.
TXQs within the same TC have the same (strict) priority, so we should
pick from them as fairly as we can. We can achieve that by implementing
something very similar to q->curband from multiq_dequeue().
Since igb/igc really do have TXQ 0 of higher hardware priority than
TXQ 1 etc, we need to preserve the behavior for them as well. We really
have no choice, because in txtime-assist mode, taprio is essentially a
software scheduler towards offloaded child tc-etf qdiscs, so the TXQ
selection really does matter (not all igb TXQs support ETF/SO_TXTIME,
says Kurt Kanzenbach).
To preserve the behavior, we need a capability bit so that taprio can
determine if it's running on igb/igc, or on something else. Because igb
doesn't offload taprio at all, we can't piggyback on the
qdisc_offload_query_caps() call from taprio_enable_offload(), but
instead we need a separate call which is also made for software
scheduling.
Introduce two static keys to minimize the performance penalty on systems
which only have igb/igc NICs, and on systems which only have other NICs.
For mixed systems, taprio will have to dynamically check whether to
dequeue using one prioritization algorithm or using the other.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Simplify taprio_dequeue_from_txq() by noticing that we can goto one call
earlier than the previous skb_found label. This is possible because
we've unified the treatment of the child->ops->dequeue(child) return
call, we always try other TXQs now, instead of abandoning the root
dequeue completely if we failed in the peek() case.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Future changes will refactor the TXQ selection procedure, and a lot of
stuff will become messy, the indentation of the bulk of the dequeue
procedure would increase, etc.
Break out the bulk of the function into a new one, which knows the TXQ
(child qdisc) we should perform a dequeue from.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|