Age | Commit message | Author |
|
There's no need to print a message on every change in battery percentage
on regular log levels.
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Prepared by checking the datasheets of max17042, max17047/50
and max17055 for differences in register maps.
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
This register is the same as in MAX17047 and MAX17050, so there's no need
to special-case it.
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Add the rockchip serial flash controller (SFC) driver.
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Signed-off-by: Jon Lin <jon.lin@rock-chips.com>
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://lore.kernel.org/r/20210812134546.31340-3-jon.lin@rock-chips.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Moved drivers/platform/x86/intel_menlow.c to drivers/thermal/intel.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20210816035356.1955982-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
|
devm_phy_create can return -EPROBE_DEFER if the vbus-supply is not ready
yet. Silence this warning as the driver framework will re-attempt
registering the PHY. Use dev_err_probe() for phy resources to indicate
the deferral reason when waiting for the resource to come up.
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Link: https://lore.kernel.org/r/20210817041548.1276-7-linux.amoon@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
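
The logging behaviour described in this commit can be modelled in a few lines. This is only a sketch of the idea behind dev_err_probe(), not kernel code: `err_probe`, the `dev` dictionary and the logger name are hypothetical stand-ins, and 517 is the kernel's value for EPROBE_DEFER.

```python
import logging

EPROBE_DEFER = 517  # kernel errno value for "probe deferred"

log = logging.getLogger("phy-sketch")  # hypothetical logger name


def err_probe(dev, err, msg):
    """Model of the dev_err_probe() idea: stay quiet when the probe is only deferred."""
    if err == -EPROBE_DEFER:
        # Deferral is expected: record the reason and log at debug level only.
        dev["deferred_reason"] = msg
        log.debug("%s: deferred: %s", dev["name"], msg)
    else:
        # Any other error is a real failure and is reported as such.
        log.error("%s: error %d: %s", dev["name"], err, msg)
    return err  # the caller can return this value directly
```

A caller would write `return err_probe(dev, err, "failed to create PHY")`, so the deferral case costs no extra branch in the probe path.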
|
|
Power off the PHY by putting it into reset mode.
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Link: https://lore.kernel.org/r/20210817041548.1276-6-linux.amoon@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use devm_platform_ioremap_resource to simplify code
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-9-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Return the error number directly without assignment
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-8-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use devm_platform_ioremap_resource to simplify code
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-7-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use clock bulk helpers to get/enable/disable clocks
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-6-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
devm_ioremap_resource() already prints an error message on failure.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-5-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Print error log using child devices instead of parent device.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-4-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add support for switching the PHY type between USB3, PCIe, SATA and
SGMII through the pericfg register; this takes the place of an efuse
or jumper.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-3-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Use clock bulk helpers to get/enable/disable clocks
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629191987-20774-2-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
PIPE PHY status is used to communicate the completion of several PHY
functions. Check if PHY is ready for operation while configured for
PIPE mode during startup.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-10-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add debug information in probe regarding PHY configuration parameters
like single link or multilink protocol along with number of lanes
used for each protocol link.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-9-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Torrent PHY driver currently supports single link DP configuration.
Prepare driver to support multilink DP configurations by adding
separate functions for common initialization sequence.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210728145454.15945-8-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add PHY configuration registers for single link DP with 100MHz reference
clock and NO_SSC.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-7-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Add PHY registers for single link DP in array format to simplify
code and to improve readability. This supports already supported
frequencies for DP of 19.2MHz and 25MHz.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-6-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Torrent PHY supports multiple serdes standards with different input
reference clock frequencies. PHY register values differ based on the
reference clock rate. Add PHY input reference clock frequency as a
new dimension to select proper register configuration. No functional
change is intended.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-5-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Torrent PHY supports different input reference clock frequencies.
Register configurations will be different based on reference clock value.
Prepare driver to support such multiple reference clock frequencies.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210728145454.15945-4-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Reorder some functions to avoid function declarations.
No functional change.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Link: https://lore.kernel.org/r/20210728145454.15945-3-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Script checkpatch with --strict option gives message:
CHECK: Avoid CamelCase: <REF_CLK_19_2MHz>
CHECK: Avoid CamelCase: <REF_CLK_25MHz>
Fix this by removing CamelCase usage. No functional change.
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210728145454.15945-2-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
|
|
Commit a02e8964eaf92 ("virtio-net: ethtool configurable LRO")
maps LRO to virtio guest offloading features and allows the
administrator to enable and disable those features via ethtool.
This leads to several issues:
- For a device that doesn't support control guest offloads, "LRO"
can't be disabled, which triggers a WARN in dev_disable_lro() when
turning off LRO or when enabling forwarding, bridging, etc.
- For a device that supports control guest offloads, the guest
offloads are disabled in cases of bridging, forwarding, etc., slowing
down the traffic.
Fix this by using NETIF_F_GRO_HW instead. Though the spec does not
guarantee packets to be re-segmented as the original ones,
we can add that to the spec, possibly with a flag for devices to
differentiate between GRO and LRO.
Further, we never advertised LRO historically before a02e8964eaf92
("virtio-net: ethtool configurable LRO") and so bridged/forwarded
configs effectively always relied on virtio receive offloads behaving
like GRO - thus even if this breaks any configs it is at least not
a regression.
Fixes: a02e8964eaf92 ("virtio-net: ethtool configurable LRO")
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Ivan <ivan@prestigetransportation.com>
Tested-by: Ivan <ivan@prestigetransportation.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On CN10K, the higher bits in the channel number represent the CPT
channel number. Mask out these higher bits in the NPC configuration
to allow packets from CPT to be parsed.
Signed-off-by: Vidya <vvelumuri@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The way SW can identify the number of NPC counters supported by silicon
has changed for CN10K. This patch addresses this by reading the
appropriate registers to find out the number of counters available.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the mcam entry allocation request is from PF
and NOT a priority allocation request then allocate
low priority entries so that PF entries always have
lower priority than its VFs. This is required so
that entries with (base) MCAM match criteria have lower
priority compared to entries with (base + additional)
match criteria. This patch considers only best case
scenario where PF entries are allocated from low
priority zone if low priority zone has free space.
There are worst case scenarios like:
1. VFs allocating hundreds of MCAM entries leading to VFs
using all mid priority zone and low priority zone entries
hence no entries free from low priority zone for PF.
2. All the PFs and VFs in the system allocating and freeing
entries causing fragmentation in MCAM space and all the
entries requested by PF could not fit in low priority
zone for allocation.
This patch does not handle the worst case scenarios.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
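
The best-case policy described in this commit can be sketched as a toy allocator. Everything here is an assumption made for illustration: the function name, the boolean free list, the convention that a higher index means lower priority, and the fallback to a plain first-fit search when the low priority zone is too full. It is not the driver's allocator.

```python
def alloc_entries(free, count, lprio_start, from_pf, priority_request):
    """Pick `count` free entries; non-priority PF requests prefer the low priority zone.

    free        -- list of bools, one per MCAM entry (True = free)
    lprio_start -- first index of the (hypothetical) low priority zone
    """
    picked = []
    if from_pf and not priority_request:
        zone = [i for i in range(lprio_start, len(free)) if free[i]]
        if len(zone) >= count:
            picked = zone[-count:]  # lowest-priority free entries
    if not picked:
        # Fallback (assumption): plain first-fit over the whole table.
        candidates = [i for i in range(len(free)) if free[i]]
        if len(candidates) < count:
            return []  # not enough entries anywhere
        picked = candidates[:count]
    for i in picked:
        free[i] = False  # mark as allocated
    return picked
```

With a 16-entry table whose low priority zone starts at index 12, a PF request for two entries lands at the end of the table while a VF request is served from the start, so the PF entries end up with lower priority.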
|
|
Added support for setting or modifying MCAM entry count at
runtime via devlink params.
commands:
devlink dev param show
pci/0002:02:00.0:
name mcam_count type driver-specific
values:
cmode runtime value 16
devlink dev param set pci/0002:02:00.0 name mcam_count
value 64 cmode runtime
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variables used for TC flow management like maximum number
of flows, number of flows installed etc are a copy of ntuple
flow management variables. Since both TC and NTUPLE are not
supported at the same time, it's better to unify these with
common variables.
This patch addresses this unification and also does cleanup of
other minor stuff wrt TC.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Per single mailbox request a maximum of 256 MCAM entries
can be allocated. If more than 256 are being allocated, then
the mcam indices in the final list could get jumbled. Hence
sort the indices.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
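
A minimal sketch of the batching-and-sorting idea follows. Only the limit of 256 entries per request comes from the commit message; `alloc_mcam_entries` and the `alloc_request` callback are hypothetical names standing in for the mailbox call.

```python
MAX_ENTRIES_PER_REQUEST = 256  # per-request limit stated in the commit message


def alloc_mcam_entries(count, alloc_request):
    """Allocate `count` entries in batches and return their indices in sorted order.

    alloc_request(n) is a hypothetical callback that returns a list of at most
    n newly allocated indices, or an empty list when nothing is left.
    """
    indices = []
    while len(indices) < count:
        want = min(count - len(indices), MAX_ENTRIES_PER_REQUEST)
        got = alloc_request(want)
        if not got:
            break  # allocator exhausted; return what was obtained
        indices.extend(got)
    # Batches may arrive in any relative order, so sort the combined list.
    indices.sort()
    return indices
```

Sorting once at the end keeps the batching loop simple and does not depend on the order in which individual requests are served.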
|
|
Add packet flow classification support for both LMAC mapped virtual
functions and loopback VFs. This patch adds support for the ntuple
offload feature.
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enabled NETIF_F_RXALL support for VF driver.
Also removed MTU range comments which are no longer valid.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Added debug messages for various failures during probe.
This will help in quickly identifying the API where the failure
is happening.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add appropriate error codes to be used when returning from AF
mailbox handlers due to some error condition.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When installing a flow using the npc_install_flow
mailbox there are a number of reasons to reject
the request: the caller is not permitted, an
invalid channel is specified in the request, the flow
is not supported in the extraction profile, and so on.
Hence define new error codes for npc flows and use
them instead of generic error codes.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-08-16
The following patchset provides two separate mlx5 updates
1) Ethtool RSS context and MQPRIO channel mode support:
1.1) enable mlx5e netdev driver to allow creating Transport Interface RX
(TIRs) objects on the fly to be used for ethtool RSS contexts and
TX MQPRIO channel mode
1.2) Introduce mlx5e_rss object to manage such TIRs.
1.3) Ethtool support for RSS context
1.4) Support MQPRIO channel mode
2) Bridge offloads Lag support:
to allow adding bond net devices to mlx5 bridge
2.1) Address bridge port by (vport_num, esw_owner_vhca_id) pair
since vport_num is only unique per eswitch and in lag mode we
need to manage ports from both eswitches.
2.2) Allow connectivity between representors of different eswitch
instances that are attached to same bridge
2.3) Bridge LAG, Require representors to be in shared FDB mode and
introduce local and peer ports representors,
match on paired eswitch metadata in peer FDB entries,
And finally support addition/deletion and aging of peer flows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the "SPINAND_HAS_QE_BIT" flag for Quad mode support on Macronix
serial flash.
Validated in normal (default) and Quad mode by read, erase and read back
on a Xilinx Zynq PicoZed FPGA board that includes the Macronix
SPI host (drivers/spi/spi-mxic.c).
Signed-off-by: Jaime Liao <jaimeliao@mxic.com.tw>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/1628472472-32008-1-git-send-email-jaimeliao@mxic.com.tw
|
|
The check mixes pages (vm_pgoff) with bytes (vm_start, vm_end) on one
side of the comparison, and uses resource address (rather than just the
resource size) on the other side of the comparison.
This can allow malicious userspace to easily bypass the boundary check and
map pages that are located outside the memory region reserved by the driver.
Fixes: 01c60dcea9f7 ("drivers/misc: Add Aspeed P2A control driver")
Cc: stable@vger.kernel.org
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
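
The unit mix-up described in this commit is easy to see with concrete numbers. The sketch below is a Python model written from the description only, not the driver source: both function names are hypothetical and 4 KiB pages are assumed.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def buggy_range_ok(vm_pgoff, vm_start, vm_end, mem_base, mem_size):
    """Flawed check: adds pages to bytes and compares against an address."""
    vsize = vm_end - vm_start  # bytes
    return vm_pgoff + vsize <= mem_base + mem_size


def fixed_range_ok(vm_pgoff, vm_start, vm_end, mem_size):
    """Consistent check: everything in pages, bounded by the region size only."""
    npages = (vm_end - vm_start) >> PAGE_SHIFT
    return vm_pgoff + npages <= (mem_size >> PAGE_SHIFT)


# Hypothetical 64 KiB (16-page) region at a high physical address.
MEM_BASE, MEM_SIZE = 0x98000000, 0x10000
# One-page mapping at page offset 1000, far beyond the 16-page region.
START, END, PGOFF = 0x10000000, 0x10001000, 1000

print(buggy_range_ok(PGOFF, START, END, MEM_BASE, MEM_SIZE))  # accepted
print(fixed_range_ok(PGOFF, START, END, MEM_SIZE))            # rejected
```

Because the right-hand side of the flawed comparison includes the resource address, almost any offset passes; comparing page counts against the region size in pages closes that gap.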
|
|
The check mixes pages (vm_pgoff) with bytes (vm_start, vm_end) on one
side of the comparison, and uses resource address (rather than just the
resource size) on the other side of the comparison.
This can allow malicious userspace to easily bypass the boundary check and
map pages that are located outside the memory region reserved by the driver.
Fixes: 6c4e97678501 ("drivers/misc: Add Aspeed LPC control driver")
Cc: stable@vger.kernel.org
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
|
|
These values are unused now that the lightnvm support is gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-next
This is the main pull for v5.15, after the early pull request with
drm/scheduler conversion:
* New a6xx GPU support: a680 and 7c3
* dsi: 7nm phy, sc7280 support, test pattern generator support
* mdp4 fixes for older hw like the nexus7
* displayport fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs_tyanTeDGMH1X+Uf4wdyy7jYj-CinGXXVETiYOESahw@mail.gmail.com
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next
Mediatek DRM Next for Linux 5.15
1. MT8133 AAL support, adjust rdma fifo threshold formula.
2. Implement mmap as GEM object function.
3. Add support for MT8167.
4. Test component initialization earlier in the function mtk_drm_crtc_create.
5. CMDQ refinement.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210816232427.13368-1-chunkuang.hu@kernel.org
|
|
NET doesn't imply NET_DEVLINK. Select this separately, so that
random config combinations don't complain.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If ptp_ocp_device_init() fails, pci_disable_device() is skipped.
Fix the error handling so this case is covered. Update ptp_ocp_remove()
so the normal exit path is identical.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If attempting to flash the firmware with a blob of size 0,
the entire write loop is skipped and the uninitialized err
is returned. Fix by setting to 0 first.
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
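
The pattern behind this fix can be sketched as follows. This is an illustrative model, not the driver code: `flash_update`, `write_chunk` and the chunk size are hypothetical. In C the unset variable holds an arbitrary value; in this Python model, dropping the initialisation would instead raise UnboundLocalError for an empty blob.

```python
def flash_update(blob, write_chunk, chunk_size=256):
    """Write `blob` in chunks; return 0 on success or the first error code.

    write_chunk(offset, data) is a hypothetical callback returning 0 or a
    negative error code.
    """
    err = 0  # initialised up front so a zero-length blob returns success
    offset = 0
    while offset < len(blob):
        err = write_chunk(offset, blob[offset:offset + chunk_size])
        if err:
            break  # stop at the first failing chunk
        offset += chunk_size
    return err
```

With the initialisation in place, `flash_update(b"", write_chunk)` skips the loop and returns 0.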
|
|
To fix the "reverse-NAT" for replies.
When a packet is sent over a VRF, the POST_ROUTING hooks are called
twice: Once from the VRF interface, and once from the "actual"
interface the packet will be sent from:
1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct()
This causes the POST_ROUTING hooks to run.
2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again.
Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and
second vrf_l3_rcv() calls them again.
As an example, consider the following SNAT rule:
> iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1
In this case sending over a VRF will create 2 conntrack entries.
The first is from the VRF interface, which performs the IP SNAT.
The second will run the SNAT, but since the "expected reply" will remain
the same, conntrack randomizes the source port of the packet:
E.g., with a socket bound to 1.1.1.1:10000 sending to 3.3.3.3:53, the conntrack
entries are:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1
udp 17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is
SNAT-ed from 10000 --> 61033.
But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later
conntrack entry is matched:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1
udp 17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
And a "port 61033 unreachable" ICMP packet is sent back.
The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(),
the skb already has a conntrack flow attached to it, which means
nf_conntrack_in() will not resolve the flow again.
This means only the dest port is "reverse-NATed" (61033 -> 10000) but
the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's
not received.
This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'.
The fix is then to reset the flow when skb is received on a VRF, to let
conntrack resolve the flow again (which now will hit the earlier flow).
To reproduce (without the fix, "Got pkt_to_nat_port" will not be printed by
running 'bash ./repro.sh'):
$ cat run_in_A1.py
import logging
logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
from scapy.all import *
import argparse
def get_packet_to_send(udp_dst_port, msg_name):
    return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \
        IP(src='3.3.3.3', dst='2.2.2.2')/ \
        UDP(sport=53, dport=udp_dst_port)/ \
        Raw(f'{msg_name}\x0012345678901234567890')
parser = argparse.ArgumentParser()
parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True,
                    help="From run_in_A2.py")
parser.add_argument('-socket_port', dest="socket_port", type=str,
                    required=True, help="From run_in_A2.py")
parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True,
                    help="From script")
args, _ = parser.parse_known_args()
iface_mac = args.iface_mac
socket_port = int(args.socket_port)
v1_mac = args.v1_mac
print(f'Source port before NAT: {socket_port}')
while True:
    pkts = sniff(iface='_v0', store=True, count=1, timeout=10)
    if 0 == len(pkts):
        print('Something failed, rerun the script :(', flush=True)
        break
    pkt = pkts[0]
    if not pkt.haslayer('UDP'):
        continue
    pkt_sport = pkt.getlayer('UDP').sport
    print(f'Source port after NAT: {pkt_sport}', flush=True)
    pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port')
    sendp(pkt_to_send, '_v0', verbose=False) # Will not be received
    pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port')
    sendp(pkt_to_send, '_v0', verbose=False)
    break
$ cat run_in_A2.py
import socket
import netifaces
print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}",
      flush=True)
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
             str('vrf_1' + '\0').encode('utf-8'))
s.connect(('3.3.3.3', 53))
print(f'{s.getsockname()[1]}', flush=True)
s.settimeout(5)
while True:
    try:
        # Periodically send in order to keep the conntrack entry alive.
        s.send(b'a'*40)
        resp = s.recvfrom(1024)
        msg_name = resp[0].decode('utf-8').split('\0')[0]
        print(f"Got {msg_name}", flush=True)
    except Exception as e:
        pass
$ cat repro.sh
ip netns del A1 2> /dev/null
ip netns del A2 2> /dev/null
ip netns add A1
ip netns add A2
ip -n A1 link add _v0 type veth peer name _v1 netns A2
ip -n A1 link set _v0 up
ip -n A2 link add e00000 type bond
ip -n A2 link add lo0 type dummy
ip -n A2 link add vrf_1 type vrf table 10001
ip -n A2 link set vrf_1 up
ip -n A2 link set e00000 master vrf_1
ip -n A2 addr add 1.1.1.1/24 dev e00000
ip -n A2 link set e00000 up
ip -n A2 link set _v1 master e00000
ip -n A2 link set _v1 up
ip -n A2 link set lo0 up
ip -n A2 addr add 2.2.2.2/32 dev lo0
ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000
ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001
ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \
SNAT --to-source 2.2.2.2 -o vrf_1
sleep 5
ip netns exec A2 python3 run_in_A2.py > x &
XPID=$!
sleep 5
IFACE_MAC=`sed -n 1p x`
SOCKET_PORT=`sed -n 2p x`
V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2}'`
ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \
${SOCKET_PORT} -v1_mac ${V1_MAC}
sleep 5
kill -9 $XPID
wait $XPID 2> /dev/null
ip netns del A1
ip netns del A2
tail x -n 2
rm x
set +x
Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
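
The double reverse-NAT described in this commit can be traced with a toy model of the two conntrack entries. This is a sketch for following the addresses only: the class and function names are hypothetical, the lookup is a linear scan, and the flag `reset_flow_on_vrf` stands in for the behaviour the fix introduces.

```python
from typing import List, NamedTuple


class FlowTuple(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int


class CtEntry(NamedTuple):
    orig: FlowTuple   # direction that created the entry
    reply: FlowTuple  # expected reply after SNAT


def reverse_nat(pkt: FlowTuple, table: List[CtEntry]) -> FlowTuple:
    """Rewrite the destination of a reply back to the entry's original source."""
    for entry in table:
        if entry.reply == pkt:
            return pkt._replace(dst=entry.orig.src, dport=entry.orig.sport)
    return pkt


def receive_reply(pkt: FlowTuple, table: List[CtEntry],
                  reset_flow_on_vrf: bool) -> FlowTuple:
    pkt = reverse_nat(pkt, table)      # PRE_ROUTING hooks run from ip_rcv()
    if reset_flow_on_vrf:
        # With the fix, the flow is resolved again on the VRF device.
        pkt = reverse_nat(pkt, table)  # PRE_ROUTING hooks run from vrf_l3_rcv()
    return pkt


# The two entries from the example in the commit message.
TABLE = [
    CtEntry(FlowTuple("1.1.1.1", 10000, "3.3.3.3", 53),
            FlowTuple("3.3.3.3", 53, "2.2.2.2", 10000)),
    CtEntry(FlowTuple("2.2.2.2", 10000, "3.3.3.3", 53),
            FlowTuple("3.3.3.3", 53, "2.2.2.2", 61033)),
]
REPLY = FlowTuple("3.3.3.3", 53, "2.2.2.2", 61033)

print(receive_reply(REPLY, TABLE, reset_flow_on_vrf=False))
print(receive_reply(REPLY, TABLE, reset_flow_on_vrf=True))
```

Without the second lookup the reply still carries destination 2.2.2.2:10000 and misses the socket bound to 1.1.1.1; with it, the earlier entry matches and the destination becomes 1.1.1.1:10000.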
|
|
Allow adding bond net devices to mlx5 bridge with following changes:
- Modify bridge representor code to obtain uplink representor that belongs
to eswitch that is registered for notification. Require representor to be
in shared FDB mode. If representor is the lag master, then consider its
port as local, otherwise treat it as peer.
- Use devcom to match on paired eswitch metadata in peer FDB entries. This
is necessary for shared FDB LAG to function since packets are always
received on active eswitch instance as opposed to parent eswitch of port.
- Support for deleting peer flows when receiving
SWITCHDEV_FDB_DEL_TO_BRIDGE notification was implemented in one of previous
patches in series. Now also implement support for handling
SWITCHDEV_FDB_ADD_TO_BRIDGE which can be generated on peer by bridge update
workqueue task in LAG configuration. Refresh the flow 'lastuse' timestamp
to current jiffies when receiving such notification on eswitch that manages
the local FDB entry. This allows peer entries to prevent ageing of the FDB.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Allow connectivity between representors of different eswitch instances that
are attached to same bridge when merged_eswitch capability is enabled. Add
ports of peer eswitch to bridge instance and mark them with
MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer ports with
MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged out on their
local eswitch instance, which then sends SWITCHDEV_FDB_DEL_TO_BRIDGE event.
Listen to the event on mlx5 bridge implementation and delete peer FDBs in
event handler.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|