Age | Commit message (Collapse) | Author |
|
Where there is no other basis on which to order declarations
let us prefer reverse xmas tree. Also reduce scope where
sensible.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-16-jic23@kernel.org
|
|
Whilst not important, it's nice to have the general headers in
alphabetical order.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-15-jic23@kernel.org
|
|
Add _REG postfix to register addresses to avoid confusion with
fields. Also add additional field defines and use throughout the
driver in place of magic numbers.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-14-jic23@kernel.org
|
|
Note this doesn't support everything the chip can do as we ignore
window mode for now (in window / out of window).
* Given the chip doesn't have any way of disabling the threshold
pins, use disable_irq() etc to mask them except when we actually
want them enabled (previously events were always enabled).
Note there are race conditions, but using the current state from
the status register and disabling interrupts across changes in
type of event should mean those races result in interrupts,
but no events to userspace.
* Correctly reflect that there is one threshold line per channel.
* Only take notice of rising edge. If anyone wants the other edge
then they should set the other threshold (they are available for
rising and falling directions). This isn't perfect but it makes
it a lot simpler.
* If insufficient interrupts are specified in firnware, don't support
events.
* Adaptive events use the same pos/neg values of thrMD as non adaptive
ones.
Tested against qemu based emulation.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
Link: https://lore.kernel.org/r/20210314181511.531414-13-jic23@kernel.org
|
|
Now we have core support for timeouts related to adaptive events, let us
use it. Note the units of that attribute are seconds, so we also need
to scale the cycles value by the period of each sample.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-12-jic23@kernel.org
|
|
For adaptive threshold events, the current value is compared with a
(typically) low pass filtered version of the same signal that slowly
tracks large scale changes. However, sometimes a step change can
result in a large lag before the low pass filtered version begins
to track the signal again. Timeouts can be used to made an
instantaneous 'correction'. Documentation of this attribute
is added in a later patch.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-11-jic23@kernel.org
|
|
Device uses a fixed sampling frequency. Let us expose it to userspace.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-10-jic23@kernel.org
|
|
Also
* drop i2c_set_client_data() as now unused.
* white space cleanups
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-9-jic23@kernel.org
|
|
The event line is active high and not maskable within the device.
It indicates current state directly.
The device supports separate rising and falling thresholds so rather
than trying to using each bound to detect in both directions just use
IRQF_TRIGGER_RISING. If a user wants to detect the value falling
back below the threshold, then set the falling threshold appropriately.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-8-jic23@kernel.org
|
|
present or not
The driver supports devices with different numbers of channels and
also can function without provision of an IRQ (with reduced features),
so this patch handles this cleanly by having multiple chan_spec
arrays and iio_info structures to pick between depending on what we
have.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-7-jic23@kernel.org
|
|
There are no mainline board files using this driver so lets drop
the platform_data support in favour of devicetree and similar.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-6-jic23@kernel.org
|
|
updating
The timeout is treated as one single value, but the datasheet describes
it as two 4 bit values, one for each direction of event.
As such change the driver to support the separate directions.
Also add limit checking to ensure it fits within the 4 bits.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-5-jic23@kernel.org
|
|
Original code was ordered in a fairly unituitive fashion with
the non adaptive threshold handling returning from the switch
statement, whilst the adapative path did the actual writes outside
the switch. Make it more readable by bringing everything within
the switch statement cases and reducing scope of local variables
as appropriate.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-4-jic23@kernel.org
|
|
The devices support window detection, but that corresponds to
being outside of a range defined by a lower an uppper bound rather
than being related to magnitude as such. Hence drop this interface
in the interests of making the driver ABI compliant.
We may bring back support for the window mode at somepoint in the future
but it will be in an ABI compliant fashion.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Link: https://lore.kernel.org/r/20210314181511.531414-3-jic23@kernel.org
|
|
Reduces boilerplate and chances of getting the error handling wrong.
No functional change.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon>
Link: https://lore.kernel.org/r/20210314181511.531414-2-jic23@kernel.org
|
|
The inv_mpu6050 driver requires an interrupt for buffered capture. But non
buffered reading for measurements works just fine without an interrupt
connected.
Make the interrupt optional to support this case.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
Link: https://lore.kernel.org/r/20210325131046.13383-2-lars@metafoo.de
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
The inv_mpu6050 driver manually assigns the indio_dev->modes property. But
this is not necessary since it will be setup in iio_trigger_buffer_setup().
Remove the manual assignment.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
Link: https://lore.kernel.org/r/20210325131046.13383-1-lars@metafoo.de
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
As Lars pointed out, we could either return the FD vs memcpy-ing it to the
userspace data object.
However, this comment exposed a bug. We should return 0 or negative from
these ioctl() handlers. Because an ioctl() handler can also return
IIO_IOCTL_UNHANDLED (which is positive 1), which means that the ioctl()
handler doesn't support this ioctl number. Positive 1 could also be a valid
FD number in some corner cases.
The reason we did this is to be able to differentiate between an error
code and an unsupported ioctl number; for unsupported ioctl numbers, the
main loop should keep going.
Maybe we should change this to a higher negative number, to avoid such
cases when/if we add more ioctl() handlers.
Cc: Lars-Peter Clausen <lars@metafoo.de>
Fixes: f73f7f4da5818 ("iio: buffer: add ioctl() to support opening extra buffers for IIO device")
Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
Link: https://lore.kernel.org/r/20210322084135.17536-1-aardelean@deviqon.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
The code was checking if (ret) from the processed
channel readout, not smart, we need to check if (ret < 0)
as this will likely be something like IIO_VAL_INT.
Fixes: 635ef601b238 ("iio: Provide iio_read_channel_processed_scale() API")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210323122705.1326362-1-linus.walleij@linaro.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
The mhi_prepare_for_transfer() and mhi_unprepare_from_transfer()
APIs could use better explanation. Add details on what MHI does
when these APIs are used.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-10-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
The __mhi_unprepare_channel() API does not require the __ prefix.
Get rid of it and make the internal function consistent with the
other function names.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Link: https://lore.kernel.org/r/1617311778-1254-9-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
A client can attempt to unprepare certain channels for transfer even
after the execution environment they are supposed to run in has changed.
In the event that happens, the device need not be notified of the reset
and the host can proceed with clean up for the channel context and
memory allocated for it on the host as the device will no longer be able
to respond to such a request.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-8-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
When clearing up the channel context after client drivers are
done using channels, the configuration is currently not being
reset entirely. Ensure this is done to appropriately handle
issues where clients unaware of the context state end up calling
functions which expect a context.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-7-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
MHI host can fail early if device is in a bad state by attempting
to assert device wake and holding the runtime PM vote before
sending a channel update command instead of performing a wake
toggle and waiting for a timeout if the send were to fail. This
can help improve the design and enable shorter wait periods for
device to respond as votes are already held.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-6-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Debug messages dealing with client devices use the generic MHI
controller or parent device along with a channel number. It would
be better to instead use the client device directly and enable
better log messages for channel updates.
Suggested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-5-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Improve the channel handling state machine such that all commands
go through a common function and a validation process to ensure
that the state machine is not violated in any way and adheres to
the MHI specification. Using this common function allows MHI to
eliminate some unnecessary debug messages and code duplication.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-4-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
If a channel was explicitly stopped but not reset and a driver
remove is issued, clean up the channel context such that it is
reflected on the device. This move is useful if a client driver
module is unloaded or a device crash occurs with the host having
placed the channel in a stopped state.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-3-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Add support to allow sending the STOP channel command. If a
client driver would like to STOP a channel and have the device
retain the channel context instead of issuing a RESET to it and
clearing the context, this would provide support for it after
the ability to send this command is exposed to clients.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617311778-1254-2-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Add generic info for SDX65 based modems.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1617399199-35172-1-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Some controllers can choose to skip preparation for power up.
In that case, device context is initialized based on the pre_init
flag not being set during mhi_prepare_for_power_up(). There is no
reason MHI host driver should maintain and provide controllers
with two separate paths for preparing MHI.
Going forward, all controllers will be required to call the
mhi_prepare_for_power_up() API followed by their choice of sync
or async power up. This allows MHI host driver to get rid of the
pre_init flag and sets up a common way for all controllers to use
MHI. This also helps controllers fail early on during preparation
phase in some failure cases.
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Hemant Kumar <hemantk@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617313309-24035-1-git-send-email-bbhatt@codeaurora.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Since M3 can be entered/exited quite a lot when used for runtime PM,
keep the mhi suspend/resume transitions quiet.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617700315-12492-2-git-send-email-loic.poulain@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
This change ensures that PM reference is always get during packet
queueing and released either after queuing completion (RX) or once
the buffer has been consumed (TX). This guarantees proper update for
underlying MHI controller runtime status (e.g. last_busy timestamp)
and prevents suspend to be triggered while TX packets are flying,
or before we completed update of the RX ring.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/1617700315-12492-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
|
|
Memory allocated by kvzalloc() should be freed by kvfree().
Fixes: 34ca65352ddf2 ("net/mlx5: E-Switch, Indirect table infrastructur")
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add reserved mapping to cover all the register in order to avoid setting
arbitrary values to newer FW which implements the reserved fields.
Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add reserved mapping to cover all the register in order to avoid
setting arbitrary values to newer FW which implements the reserved
fields.
Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The cited commit wrongly placed log_max_flow_counter field of
mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended
placement.
Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Make sure to modify uplink port to follow only if the uplink_follow
capability is set as required by the HW spec. Failure to do so causes
traffic to the uplink representor net device to cease after switching to
switchdev mode.
Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
That (and traversals in case of umount .) should be done before
complete_walk(). Either a braino or mismerge damage on queue
reorders - either way, I should've spotted that much earlier.
Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
X-Paperbag: Brown
Fixes: 161aff1d93ab "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()"
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fix incorrect documentation. Mostly referring to other objects,
likely because the text was copied and not adjusted.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When changing type with TUNSETLINK ioctl command, set tun->dev->addr_len
to match the appropriate type, using new tun_get_addr_len utility function
which returns appropriate address length for given type. Fixes a
KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=0766d38c656abeace60621896d705743aeefed51
Reported-by: syzbot+001516d86dbe88862cec@syzkaller.appspotmail.com
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The member 'tx_lpi_timer' is defined with __u32 datatype in the ethtool
header file. Hence, we should use ethnl_update_u32() in set_eee ops.
Fixes: fd77be7bd43c ("ethtool: set EEE settings with EEE_SET request")
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the VF down state bit is cleared after VF sending
link status request command. There is problem that when VF gets
link status replied from PF, the down state bit may still set
as 1. In this case, the link status replied from PF will be
ignored and always set VF link status to down.
To fix this problem, clear VF down state bit before VF requests
link status.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2021-04-06
this is a pull request of 1 patch for net/master.
The patch is by me and fixes the SPI half duplex support in the
mcp251x CAN driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pci_resource_start() is not a good indicator to determine if a PCI
resource exists or not, since the resource may start at address 0.
This is seen when trying to instantiate the driver in qemu for riscv32
or riscv64.
pci 0000:00:01.0: reg 0x10: [io 0x0000-0x001f]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000001f]
...
pcnet32: card has no PCI IO resources, aborting
Use pci_resouce_len() instead.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Incorrect accounting fwd_alloc can result in a warning when the socket
is torn down,
[18455.319240] WARNING: CPU: 0 PID: 24075 at net/core/stream.c:208 sk_stream_kill_queues+0x21f/0x230
[...]
[18455.319543] Call Trace:
[18455.319556] inet_csk_destroy_sock+0xba/0x1f0
[18455.319577] tcp_rcv_state_process+0x1b4e/0x2380
[18455.319593] ? lock_downgrade+0x3a0/0x3a0
[18455.319617] ? tcp_finish_connect+0x1e0/0x1e0
[18455.319631] ? sk_reset_timer+0x15/0x70
[18455.319646] ? tcp_schedule_loss_probe+0x1b2/0x240
[18455.319663] ? lock_release+0xb2/0x3f0
[18455.319676] ? __release_sock+0x8a/0x1b0
[18455.319690] ? lock_downgrade+0x3a0/0x3a0
[18455.319704] ? lock_release+0x3f0/0x3f0
[18455.319717] ? __tcp_close+0x2c6/0x790
[18455.319736] ? tcp_v4_do_rcv+0x168/0x370
[18455.319750] tcp_v4_do_rcv+0x168/0x370
[18455.319767] __release_sock+0xbc/0x1b0
[18455.319785] __tcp_close+0x2ee/0x790
[18455.319805] tcp_close+0x20/0x80
This currently happens because on redirect case we do skb_set_owner_r()
with the original sock. This increments the fwd_alloc memory accounting
on the original sock. Then on redirect we may push this into the queue
of the psock we are redirecting to. When the skb is flushed from the
queue we give the memory back to the original sock. The problem is if
the original sock is destroyed/closed with skbs on another psocks queue
then the original sock will not have a way to reclaim the memory before
being destroyed. Then above warning will be thrown
sockA sockB
sk_psock_strp_read()
sk_psock_verdict_apply()
-- SK_REDIRECT --
sk_psock_skb_redirect()
skb_queue_tail(psock_other->ingress_skb..)
sk_close()
sock_map_unref()
sk_psock_put()
sk_psock_drop()
sk_psock_zap_ingress()
At this point we have torn down our own psock, but have the outstanding
skb in psock_other. Note that SK_PASS doesn't have this problem because
the sk_psock_drop() logic releases the skb, its still associated with
our psock.
To resolve lets only account for sockets on the ingress queue that are
still associated with the current socket. On the redirect case we will
check memory limits per 6fa9201a89898, but will omit fwd_alloc accounting
until skb is actually enqueued. When the skb is sent via skb_send_sock_locked
or received with sk_psock_skb_ingress memory will be claimed on psock_other.
Fixes: 6fa9201a89898 ("bpf, sockmap: Avoid returning unneeded EAGAIN when redirecting to self")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/161731444013.68884.4021114312848535993.stgit@john-XPS-13-9370
|
|
In '4da6a196f93b1' we fixed a potential unhash loop caused when
a TLS socket in a sockmap was removed from the sockmap. This
happened because the unhash operation on the TLS ctx continued
to point at the sockmap implementation of unhash even though the
psock has already been removed. The sockmap unhash handler when a
psock is removed does the following,
void sock_map_unhash(struct sock *sk)
{
void (*saved_unhash)(struct sock *sk);
struct sk_psock *psock;
rcu_read_lock();
psock = sk_psock(sk);
if (unlikely(!psock)) {
rcu_read_unlock();
if (sk->sk_prot->unhash)
sk->sk_prot->unhash(sk);
return;
}
[...]
}
The unlikely() case is there to handle the case where psock is detached
but the proto ops have not been updated yet. But, in the above case
with TLS and removed psock we never fixed sk_prot->unhash() and unhash()
points back to sock_map_unhash resulting in a loop. To fix this we added
this bit of code,
static inline void sk_psock_restore_proto(struct sock *sk,
struct sk_psock *psock)
{
sk->sk_prot->unhash = psock->saved_unhash;
This will set the sk_prot->unhash back to its saved value. This is the
correct callback for a TLS socket that has been removed from the sock_map.
Unfortunately, this also overwrites the unhash pointer for all psocks.
We effectively break sockmap unhash handling for any future socks.
Omitting the unhash operation will leave stale entries in the map if
a socket transition through unhash, but does not do close() op.
To fix set unhash correctly before calling into tls_update. This way the
TLS enabled socket will point to the saved unhash() handler.
Fixes: 4da6a196f93b1 ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop")
Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@john-XPS-13-9370
|
|
Li Shuang found a NULL pointer dereference crash in her testing:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc]
[] Call Trace:
[] <IRQ>
[] tipc_crypto_rcv+0x2d9/0x8f0 [tipc]
[] tipc_rcv+0x2fc/0x1120 [tipc]
[] tipc_udp_recv+0xc6/0x1e0 [tipc]
[] udpv6_queue_rcv_one_skb+0x16a/0x460
[] udp6_unicast_rcv_skb.isra.35+0x41/0xa0
[] ip6_protocol_deliver_rcu+0x23b/0x4c0
[] ip6_input+0x3d/0xb0
[] ipv6_rcv+0x395/0x510
[] __netif_receive_skb_core+0x5fc/0xc40
This is caused by NULL returned by tipc_aead_get(), and then crashed when
dereferencing it later in tipc_crypto_rcv_complete(). This might happen
when tipc_crypto_rcv_complete() is called by two threads at the same time:
the tmp attached by tipc_crypto_key_attach() in one thread may be released
by the one attached by that in the other thread.
This patch is to fix it by incrementing the tmp's refcnt before attaching
it instead of calling tipc_aead_get() after attaching it.
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") brought a ~10% performance drop.
The reason for the performance drop was that GRO was forced
to chain sk_buff (using skb_shinfo(skb)->frag_list), which
uses more memory but also cause packet consumers to go over
a lot of overhead handling all the tiny skbs.
It turns out that virtio_net page_to_skb() has a wrong strategy :
It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then
copies 128 bytes from the page, before feeding the packet to GRO stack.
This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize
under-estimation for tiny skbs") because GRO was using 2 frags per MSS,
meaning we were not packing MSS with 100% efficiency.
Fix is to pull only the ethernet header in page_to_skb()
Then, we change virtio_net_hdr_to_skb() to pull the missing
headers, instead of assuming they were already pulled by callers.
This fixes the performance regression, but could also allow virtio_net
to accept packets with more than 128bytes of headers.
Many thanks to Xuan Zhuo for his report, and his tests/help.
Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs")
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://www.spinics.net/lists/netdev/msg731397.html
Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In bcm4908_enet_dma_alloc, if callee bcm4908_dma_alloc_buf_descs() failed,
it will free the ring->cpu_addr by dma_free_coherent() and return error.
Then bcm4908_enet_dma_free() will be called, and free the same cpu_addr
by dma_free_coherent() again.
My patch set ring->cpu_addr to NULL after it is freed in
bcm4908_dma_alloc_buf_descs() to avoid the double free.
Fixes: 4feffeadbcb2e ("net: broadcom: bcm4908enet: add BCM4908 controller driver")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into arm/fixes
mvebu fixes for 5.12 (part 1)
2 fixes on on turris-omnia (Armada 38x based:)
- Fix storm interrupt
- Enable hardware buffer management as it should be
Unbreak AHCI on all Marvell Armada 7k8k / CN913x platforms
* tag 'mvebu-fixes-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
ARM: dts: turris-omnia: configure LED[2]/INTn pin as interrupt pin
ARM: dts: turris-omnia: fix hardware buffer management
Revert "arm64: dts: marvell: armada-cp110: Switch to per-port SATA interrupts"
Link: https://lore.kernel.org/r/87a6qgctit.fsf@BL-laptop
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|