Age | Commit message (Collapse) | Author |
|
When destroying all sets, we are either in pernet exit phase or
are executing a "destroy all sets command" from userspace. The latter
was taken into account in ip_set_dereference() (nfnetlink mutex is held),
but the former was not. The patch adds the required check to
rcu_dereference_protected() in ip_set_dereference().
Fixes: 4e7aaa6b82d6 ("netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type")
Reported-by: syzbot+b62c37cdd58103293a5a@syzkaller.appspotmail.com
Reported-by: syzbot+cfbe1da5fdfc39efc293@syzkaller.appspotmail.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406141556.e0b6f17e-lkp@intel.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The patch 15a6af94a277 ("spi: Increase imx51 ecspi burst length based
on transfer length") increased the burst length calculation in
mx51_ecspi_prepare_transfer() to be based on the transfer length.
This breaks HW CS + SPI_CS_WORD support which was added in
6e95b23a5b2d ("spi: imx: Implement support for CS_WORD") and transfers
with bits-per-word != 8, 16, 32.
SPI_CS_WORD means the CS should be toggled after each word. The
implementation in the imx-spi driver relies on the fact that the HW CS
is toggled automatically by the controller after each burst length
number of bits. Setting the burst length to the number of bits of the
_whole_ message breaks this use case.
Further the patch 15a6af94a277 ("spi: Increase imx51 ecspi burst
length based on transfer length") claims to optimize the transfers.
But even without this patch, on modern spi-imx controllers with
"dynamic_burst = true" (imx51, imx6 and newer), the transfers are
already optimized, i.e. the burst length is dynamically adjusted in
spi_imx_push() to avoid the pause between the SPI bursts. This has
been confirmed by a scope measurement on an imx6d.
Subsequent Patches tried to fix these and other problems:
- 5f66db08cbd3 ("spi: imx: Take in account bits per word instead of assuming 8-bits")
- e9b220aeacf1 ("spi: spi-imx: correctly configure burst length when using dma")
- c712c05e46c8 ("spi: imx: fix the burst length at DMA mode and CPU mode")
- cf6d79a0f576 ("spi: spi-imx: fix off-by-one in mx51 CPU mode burst length")
but the HW CS + SPI_CS_WORD use case is still broken.
To fix the problems revert the burst size calculation in
mx51_ecspi_prepare_transfer() back to the original form, before
15a6af94a277 ("spi: Increase imx51 ecspi burst length based on
transfer length") was applied.
Cc: Stefan Moring <stefan.moring@technolution.nl>
Cc: Stefan Bigler <linux@bigler.io>
Cc: Clark Wang <xiaoning.wang@nxp.com>
Cc: Carlos Song <carlos.song@nxp.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
Fixes: 15a6af94a277 ("spi: Increase imx51 ecspi burst length based on transfer length")
Fixes: 5f66db08cbd3 ("spi: imx: Take in account bits per word instead of assuming 8-bits")
Fixes: e9b220aeacf1 ("spi: spi-imx: correctly configure burst length when using dma")
Fixes: c712c05e46c8 ("spi: imx: fix the burst length at DMA mode and CPU mode")
Fixes: cf6d79a0f576 ("spi: spi-imx: fix off-by-one in mx51 CPU mode burst length")
Link: https://lore.kernel.org/all/20240618-oxpecker-of-ideal-mastery-db59f8-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
Link: https://msgid.link/r/20240618-spi-imx-fix-bustlength-v1-1-2053dd5fdf87@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Previously pgattr_change_is_safe() was overly-strict and complained
(e.g. "[ 116.262743] __check_safe_pte_update: unsafe attribute change:
0x0560000043768fc3 -> 0x0160000043768fc3") if it saw any SW bits change
in a live PTE. There is no such restriction on SW bits in the Arm ARM.
Until now, no SW bits have been updated in live mappings via the
set_ptes() route. PTE_DIRTY would be updated live, but this is handled
by ptep_set_access_flags() which does not call pgattr_change_is_safe().
However, with the introduction of uffd-wp for arm64, there is core-mm
code that does ptep_get(); pte_clear_uffd_wp(); set_ptes(); which
triggers this false warning.
Silence this warning by masking out the SW bits during checks.
The bug isn't technically in the highlighted commit below, but that's
where bisecting would likely lead as its what made the bug user-visible.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: 5b32510af77b ("arm64/mm: Add uffd write-protect support")
Link: https://lore.kernel.org/r/20240619121859.4153966-1-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Netlink flags, although they don't have payload at the netlink level,
are represented as having "True" as value in pyroute2.
Without it, trying to add a flow with a flag-type action (e.g: pop_vlan)
fails with the following traceback:
Traceback (most recent call last):
File "[...]/ovs-dpctl.py", line 2498, in <module>
sys.exit(main(sys.argv))
^^^^^^^^^^^^^^
File "[...]/ovs-dpctl.py", line 2487, in main
ovsflow.add_flow(rep["dpifindex"], flow)
File "[...]/ovs-dpctl.py", line 2136, in add_flow
reply = self.nlm_request(
^^^^^^^^^^^^^^^^^
File "[...]/pyroute2/netlink/nlsocket.py", line 822, in nlm_request
return tuple(self._genlm_request(*argv, **kwarg))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "[...]/pyroute2/netlink/generic/__init__.py", line 126, in
nlm_request
return tuple(super().nlm_request(*argv, **kwarg))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "[...]/pyroute2/netlink/nlsocket.py", line 1124, in nlm_request
self.put(msg, msg_type, msg_flags, msg_seq=msg_seq)
File "[...]/pyroute2/netlink/nlsocket.py", line 389, in put
self.sendto_gate(msg, addr)
File "[...]/pyroute2/netlink/nlsocket.py", line 1056, in sendto_gate
msg.encode()
File "[...]/pyroute2/netlink/__init__.py", line 1245, in encode
offset = self.encode_nlas(offset)
^^^^^^^^^^^^^^^^^^^^^^^^
File "[...]/pyroute2/netlink/__init__.py", line 1560, in encode_nlas
nla_instance.setvalue(cell[1])
File "[...]/pyroute2/netlink/__init__.py", line 1265, in setvalue
nlv.setvalue(nla_tuple[1])
~~~~~~~~~^^^
IndexError: list index out of range
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove a pair of ports from the port matrix when both ports have the
isolated flag set.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As preparation for implementing bridge port isolation, move the logic to
add and remove bits in the port matrix into a new helper
mt7530_update_port_member(), which is called from
mt7530_port_bridge_join() and mt7530_port_bridge_leave().
Another part of the preparation is using dsa_port_offloads_bridge_dev()
instead of dsa_port_offloads_bridge() to check for bridge membership, as
we don't have a struct dsa_bridge in mt7530_port_bridge_flags().
The port matrix setting is slightly streamlined, now always first setting
the mt7530_port's pm field and then writing the port matrix from that
field into the hardware register, instead of duplicating the bit
manipulation for both the struct field and the register.
mt7530_port_bridge_join() was previously using |= to update the port
matrix with the port bitmap, which was unnecessary, as pm would only
have the CPU port set before joining a bridge; a simple assignment can
be used for both joining and leaving (and will also work when individual
bits are added/removed in port_bitmap with regard to the previous port
matrix, which is what happens with port isolation).
No functional change intended.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes the below build warning messages that are
caused due to linking same files to multiple modules by
exporting the required symbols.
"scripts/Makefile.build:244: drivers/net/ethernet/marvell/octeontx2/nic/Makefile:
otx2_devlink.o is added to multiple modules: rvu_nicpf rvu_nicvf
scripts/Makefile.build:244: drivers/net/ethernet/marvell/octeontx2/nic/Makefile:
otx2_dcbnl.o is added to multiple modules: rvu_nicpf rvu_nicvf"
Fixes: 8e67558177f8 ("octeontx2-pf: PFC config support with DCBx").
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Yan Zhai says:
====================
net: pass receive socket to drop tracepoint
We set up our production packet drop monitoring around the kfree_skb
tracepoint. While this tracepoint is extremely valuable for diagnosing
critical problems, it also has some limitation with drops on the local
receive path: this tracepoint can only inspect the dropped skb itself,
but such skb might not carry enough information to:
1. determine in which netns/container this skb gets dropped
2. determine by which socket/service this skb oughts to be received
The 1st issue is because skb->dev is the only member field with valid
netns reference. But skb->dev can get cleared or reused. For example,
tcp_v4_rcv will clear skb->dev and in later processing it might be reused
for OFO tree.
The 2nd issue is because there is no reference on an skb that reliably
points to a receiving socket. skb->sk usually points to the local
sending socket, and it only points to a receive socket briefly after
early demux stage, yet the socket can get stolen later. For certain drop
reason like TCP OFO_MERGE, Zerowindow, UDP at PROTO_MEM error, etc, it
is hard to infer which receiving socket is impacted. This cannot be
overcome by simply looking at the packet header, because of
complications like sk lookup programs. In the past, single purpose
tracepoints like trace_udp_fail_queue_rcv_skb, trace_sock_rcvqueue_full,
etc are added as needed to provide more visibility. This could be
handled in a more generic way.
In this change set we propose a new 'sk_skb_reason_drop' call as a drop-in
replacement for kfree_skb_reason at various local input path. It accepts
an extra receiving socket argument. Both issues above can be resolved
via this new argument.
V4->V5: rename rx_skaddr to rx_sk to be more clear visually, suggested
by Jesper Dangaard Brouer.
V3->V4: adjusted the TP_STRUCT field order to align better, suggested by
Steven Rostedt.
V2->V3: fixed drop_monitor function signatures; fixed a few uninitialized sks;
Added a few missing report tags from test bots (also noticed by Dan
Carpenter and Simon Horman).
V1->V2: instead of using skb->cb, directly add the needed argument to
trace_kfree_skb tracepoint. Also renamed functions as Eric Dumazet
suggested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
socket to the tracepoint.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202406011859.Aacus8GV-lkp@intel.com/
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
socket to the tracepoint.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202406011751.NpVN0sSk-lkp@intel.com/
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
socket to the tracepoint.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202406011539.jhwBd7DX-lkp@intel.com/
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
socket to the tracepoint.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving
socket to the tracepoint.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Long used destructors kfree_skb and kfree_skb_reason do not pass
receiving socket to packet drop tracepoints trace_kfree_skb.
This makes it hard to track packet drops of a certain netns (container)
or a socket (user application).
The naming of these destructors are also not consistent with most sk/skb
operating functions, i.e. functions named "sk_xxx" or "skb_xxx".
Introduce a new functions sk_skb_reason_drop as drop-in replacement for
kfree_skb_reason on local receiving path. Callers can now pass receiving
sockets to the tracepoints.
kfree_skb and kfree_skb_reason are still usable but they are now just
inline helpers that call sk_skb_reason_drop.
Note it is not feasible to do the same to consume_skb. Packets not
dropped can flow through multiple receive handlers, and have multiple
receiving sockets. Leave it untouched for now.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb does not include enough information to find out receiving
sockets/services and netns/containers on packet drops. In theory
skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
stack for OOO packet lookup. Similarly, skb->sk often identifies a local
sender, and tells nothing about a receiver.
Allow passing an extra receiving socket to the tracepoint to improve
the visibility on receiving drops.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
LPM consists of HIPM (host initiated power management) and DIPM
(device initiated power management).
ata_eh_set_lpm() will only enable HIPM if both the HBA and the device
supports it.
However, DIPM will be enabled as long as the device supports it.
The HBA will later reject the device's request to enter a power state
that it does not support (Slumber/Partial/DevSleep) (DevSleep is never
initiated by the device).
For a HBA that doesn't support any LPM states, simply don't set a LPM
policy such that all the HIPM/DIPM probing/enabling will be skipped.
Not enabling HIPM or DIPM in the first place is safer than relying on
the device following the AHCI specification and respecting the NAK.
(There are comments in the code that some devices misbehave when
receiving a NAK.)
Performing this check in ahci_update_initial_lpm_policy() also has the
advantage that a HBA that doesn't support any LPM states will take the
exact same code paths as a port that is external/hot plug capable.
Side note: the port in ata_port_dbg() has not been given a unique id yet,
but this is not overly important as the debug print is disabled unless
explicitly enabled using dynamic debug. A follow-up series will make sure
that the unique id assignment will be done earlier. For now, the important
thing is that the function returns before setting the LPM policy.
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Cc: stable@vger.kernel.org
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240618152828.2686771-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Currently, for JH7110 boards with EMMC slot, vqmmc voltage for EMMC is
fixed to 1.8V, while the spec needs it to be 3.3V on low speed mode and
should support switching to 1.8V when using higher speed mode. Since
there are no other peripherals using the same voltage source of EMMC's
vqmmc(ALDO4) on every board currently supported by mainline kernel,
regulator-max-microvolt of ALDO4 should be set to 3.3V.
Cc: stable@vger.kernel.org
Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
Fixes: 7dafcfa79cc9 ("riscv: dts: starfive: enable DCDC1&ALDO4 node in axp15060")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
otx2_sq_append_skb makes used of __vlan_hwaccel_push_inside()
to unoffload VLANs - push them from skb meta data into skb data.
However, it omitts a check for __vlan_hwaccel_push_inside()
returning NULL.
Found by inspection based on [1] and [2].
Compile tested only.
[1] Re: [PATCH net-next v1] net: stmmac: Enable TSO on VLANs
https://lore.kernel.org/all/ZmrN2W8Fye450TKs@shell.armlinux.org.uk/
[2] Re: [PATCH net-next v2] net: stmmac: Enable TSO on VLANs
https://lore.kernel.org/all/CANn89i+11L5=tKsa7V7Aeyxaj6nYGRwy35PAbCRYJ73G+b25sg@mail.gmail.com/
Fixes: fd9d7859db6c ("octeontx2-pf: Implement ingress/egress VLAN offload")
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Diogo Ivo says
====================
Enable PTP timestamping/PPS for AM65x SR1.0 devices
This patch series enables support for PTP in AM65x SR1.0 devices.
This feature relies heavily on the Industrial Ethernet Peripheral
(IEP) hardware module, which implements a hardware counter through
which time is kept. This hardware block is the basis for exposing
a PTP hardware clock to userspace and for issuing timestamps for
incoming/outgoing packets, allowing for time synchronization.
The IEP also has compare registers that fire an interrupt when the
counter reaches the value stored in a compare register. This feature
allows us to support PPS events in the kernel.
The changes are separated into five patches:
- PATCH 01/05: Register SR1.0 devices with the IEP infrastructure to
expose a PHC clock to userspace, allowing time to be
adjusted using standard PTP tools. The code for issuing/
collecting packet timestamps is already present in the
current state of the driver, so only this needs to be
done.
- PATCH 02/05: Remove unnecessary spinlock synchronization.
- PATCH 03/05: Document IEP interrupt in DT binding.
- PATCH 04/05: Add support for IEP compare event/interrupt handling
to enable PPS events.
- PATCH 05/05: Add the interrupts to the IOT2050 device tree.
Currently every compare event generates two interrupts, the first
corresponding to the actual event and the second being a spurious
but otherwise harmless interrupt. The root cause of this has been
identified and has been solved in the platform's SDK. A forward port
of the SDK's patches also fixes the problem in upstream but is not
included here since it's upstreaming is out of the scope of this
series. If someone from TI would be willing to chime in and help
get the interrupt changes upstream that would be great!
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
---
Changes in v4:
- Remove unused 'flags' variables in patch 02/05
- Add patch 03/05 describing IEP interrupt in DT binding
- Link to v3: https://lore.kernel.org/r/20240607-iep-v3-0-4824224105bc@siemens.com
Changes in v3:
- Collect Reviewed-by tags
- Add patch 02/04 removing spinlocks from IEP driver
- Use mutex-based synchronization when accessing HW registers
- Link to v2: https://lore.kernel.org/r/20240604-iep-v2-0-ea8e1c0a5686@siemens.com
Changes in v2:
- Collect Reviewed-by tags
- PATCH 01/03: Limit line length to 80 characters
- PATCH 02/03: Proceed with limited functionality if getting IRQ fails,
limit line length to 80 characters
- Link to v1: https://lore.kernel.org/r/20240529-iep-v1-0-7273c07592d3@siemens.com
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the interrupts needed for PTP Hardware Clock support via IEP
in SR1.0 devices.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The IEP module supports compare events, in which a value is written to a
hardware register and when the IEP counter reaches the written value an
interrupt is generated. Add handling for this interrupt in order to
support PPS events.
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The IEP interrupt is used in order to support both capture events, where
an incoming external signal gets timestamped on arrival, and compare
events, where an interrupt is generated internally when the IEP counter
reaches a programmed value.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As all sources of concurrency in hardware register access occur in
non-interrupt context eliminate spinlock-based synchronization and
rely on the mutex-based synchronization that is already present.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enable PTP support for AM65x SR1.0 devices by registering with the IEP
infrastructure in order to expose a PTP clock to userspace.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Heng Qi says:
====================
virtio_net: fixes for checksum offloading and XDP handling
This series of patches aim to address two specific issues identified
in the virtio_net driver related to checksum offloading and XDP
processing of fully checksummed packets.
The first patch corrects the handling of checksum offloading in the
driver. The second patch addresses an issue where the XDP program had
no trouble with fully checksummed packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The XDP program can't correctly handle partially checksummed
packets, but works fine with fully checksummed packets. If the
device has already validated fully checksummed packets, then
the driver doesn't need to re-validate them, saving CPU resources.
Additionally, the driver does not drop all partially checksummed
packets when VIRTIO_NET_F_GUEST_CSUM is not negotiated. This is
not a bug, as the driver has always done this.
Fixes: 436c9453a1ac ("virtio-net: keep vnet header zeroed after processing XDP")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In virtio spec 0.95, VIRTIO_NET_F_GUEST_CSUM was designed to handle
partially checksummed packets, and the validation of fully checksummed
packets by the device is independent of VIRTIO_NET_F_GUEST_CSUM
negotiation. However, the specification erroneously stated:
"If VIRTIO_NET_F_GUEST_CSUM is not negotiated, the device MUST set flags
to zero and SHOULD supply a fully checksummed packet to the driver."
This statement is inaccurate because even without VIRTIO_NET_F_GUEST_CSUM
negotiation, the device can still set the VIRTIO_NET_HDR_F_DATA_VALID flag.
Essentially, the device can facilitate the validation of these packets'
checksums - a process known as RX checksum offloading - removing the need
for the driver to do so.
This scenario is currently not implemented in the driver and requires
correction. The necessary specification correction[1] has been made and
approved in the virtio TC vote.
[1] https://lists.oasis-open.org/archives/virtio-comment/202401/msg00011.html
Fixes: 4f49129be6fa ("virtio-net: Set RXCSUM feature if GUEST_CSUM is available")
Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After ecf848eb934b ("net: usb: ax88179_178a: fix link status when link is
set to down/up") to not reset from usbnet_open after the reset from
usbnet_probe at initialization stage to speed up this, some issues have
been reported.
It seems to happen that if the initialization is slower, and some time
passes between the probe operation and the open operation, the second reset
from open is necessary too to have the device working. The reason is that
if there is no activity with the phy, this is "disconnected".
In order to improve this, the solution is to detect when the phy is
"disconnected", and we can use the phy status register for this. So we will
only reset the device from reset operation in this situation, that is, only
if necessary.
The same bahavior is happening when the device is stopped (link set to
down) and later is restarted (link set to up), so if the phy keeps working
we only need to enable the mac again, but if enough time passes between the
device stop and restart, reset is necessary, and we can detect the
situation checking the phy status register too.
cc: stable@vger.kernel.org # 6.6+
Fixes: ecf848eb934b ("net: usb: ax88179_178a: fix link status when link is set to down/up")
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Reported-by: Antje Miederhöfer <a.miederhoefer@gmx.de>
Reported-by: Arne Fitzenreiter <arne_f@ipfire.org>
Tested-by: Yongqin Liu <yongqin.liu@linaro.org>
Tested-by: Antje Miederhöfer <a.miederhoefer@gmx.de>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit
6791e0ea3071 ("x86/resctrl: Access per-rmid structures by index")
adds logic to map individual monitoring groups into a global index space used
for tracking allocated RMIDs.
Attempts to free the default RMID are ignored in free_rmid(), and this works
fine on x86.
With arm64 MPAM, there is a latent bug here however: on platforms with no
monitors exposed through resctrl, each control group still gets a different
monitoring group ID as seen by the hardware, since the CLOSID always forms part
of the monitoring group ID.
This means that when removing a control group, the code may try to free this
group's default monitoring group RMID for real. If there are no monitors
however, the RMID tracking table rmid_ptrs[] would be a waste of memory and is
never allocated, leading to a splat when free_rmid() tries to dereference the
table.
One option would be to treat RMID 0 as special for every CLOSID, but this would
be ugly since bookkeeping still needs to be done for these monitoring group IDs
when there are monitors present in the hardware.
Instead, add a gating check of resctrl_arch_mon_capable() in free_rmid(), and
just do nothing if the hardware doesn't have monitors.
This fix mirrors the gating checks already present in
mkdir_rdt_prepare_rmid_alloc() and elsewhere.
No functional change on x86.
[ bp: Massage commit message. ]
Fixes: 6791e0ea3071 ("x86/resctrl: Access per-rmid structures by index")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20240618140152.83154-1-Dave.Martin@arm.com
|
|
Since commit 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS
swap-space"), we can plug multiple pages then unplug them all together.
That means iov_iter_count(iter) could be way bigger than PAGE_SIZE, it
actually equals the size of iov_iter_npages(iter, INT_MAX).
Note this issue has nothing to do with large folios as we don't support
THP_SWPOUT to non-block devices.
Fixes: 2282679fb20b ("mm: submit multipage write for SWP_FS_OPS swap-space")
Reported-by: Christoph Hellwig <hch@lst.de>
Closes: https://lore.kernel.org/linux-mm/20240614100329.1203579-1-hch@lst.de/
Cc: NeilBrown <neilb@suse.de>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Steve French <sfrench@samba.org>
Cc: Trond Myklebust <trondmy@kernel.org>
Cc: Chuanhua Han <hanchuanhua@oppo.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Paulo Alcantara <pc@manguebit.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Bharath SM <bharathsm@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
As defined by the MANA Hardware spec, the queue size for DMA is 4KB
minimal, and power of 2. And, the HWC queue size has to be exactly
4KB.
To support page sizes other than 4KB on ARM64, define the minimal
queue size as a macro separately from the PAGE_SIZE, which we always
assumed it to be 4KB before supporting ARM64.
Also, add MANA specific macros and update code related to size
alignment, DMA region calculations, etc.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1718655446-6576-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kamal Heib says:
====================
net/mlx4_en: Use ethtool_puts/sprintf
This patchset updates the mlx4_en driver to use the ethtool_puts and
ethtool_sprintf helper functions.
Signed-off-by: Kamal Heib <kheib@redhat.com>
====================
Link: https://lore.kernel.org/r/20240617172329.239819-1-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the ethtool_puts/ethtool_sprintf helper to print the stats strings
into the ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-4-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the ethtool_puts helper to print the selftest strings into the
ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-3-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the ethtool_puts helper to print the priv flags strings into the
ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-2-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
commit be27b8965297 ("net: stmmac: replace priv->speed with
the portTransmitRate from the tc-cbs parameters") introduced
a problem. When deleting, it prompts "Invalid portTransmitRate
0 (idleSlope - sendSlope)" and exits. Add judgment on cbs.enable.
Only when offload is enabled, speed divider needs to be calculated.
Fixes: be27b8965297 ("net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617013922.1035854-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Get master/slave configuration for initial system start with the link in
down state. This ensures ethtool shows current configuration. Also
fixes link reconfiguration with ethtool while in down state, preventing
ethtool from displaying outdated configuration.
Even though dp83tg720_config_init() is executed periodically as long as
the link is in admin up state but no carrier is detected, this is not
sufficient for the link in admin down state where
dp83tg720_read_status() is not periodically executed. To cover this
case, we need an extra read role configuration in
dp83tg720_config_aneg().
Fixes: cb80ee2f9bee1 ("net: phy: Add support for the DP83TG720S Ethernet PHY")
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240614094516.1481231-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In case this PHY is bootstrapped for managed mode, we need to manually
wake it. Otherwise no link will be detected.
Cc: stable@vger.kernel.org
Fixes: cb80ee2f9bee1 ("net: phy: Add support for the DP83TG720S Ethernet PHY")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240614094516.1481231-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Milkv Duo does not have a write-protect pin, so disable write protect
to prevent SDcards misdetected as read-only.
Fixes: 89a7056ed4f7 ("riscv: dts: sophgo: add sdcard support for milkv duo")
Signed-off-by: Haylen Chu <heylenay@outlook.com>
Link: https://lore.kernel.org/r/SEYPR01MB4221943C7B101DD2318DA0D3D7CE2@SEYPR01MB4221.apcprd01.prod.exchangelabs.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
|
|
When CXL subsystem is auto-assembling a pmem region during cxl
endpoint port probing, always hit below calltrace.
BUG: kernel NULL pointer dereference, address: 0000000000000078
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
Call Trace:
<TASK>
? __die+0x24/0x70
? page_fault_oops+0x82/0x160
? do_user_addr_fault+0x65/0x6b0
? exc_page_fault+0x7d/0x170
? asm_exc_page_fault+0x26/0x30
? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
cxl_bus_probe+0x1b/0x60 [cxl_core]
really_probe+0x173/0x410
? __pfx___device_attach_driver+0x10/0x10
__driver_probe_device+0x80/0x170
driver_probe_device+0x1e/0x90
__device_attach_driver+0x90/0x120
bus_for_each_drv+0x84/0xe0
__device_attach+0xbc/0x1f0
bus_probe_device+0x90/0xa0
device_add+0x51c/0x710
devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
cxl_bus_probe+0x1b/0x60 [cxl_core]
The cxl_nvd of the memdev needs to be available during the pmem region
probe. Currently the cxl_nvd is registered after the endpoint port probe.
The endpoint probe, in the case of autoassembly of regions, can cause a
pmem region probe requiring the not yet available cxl_nvd. Adjust the
sequence so this dependency is met.
This requires adding a port parameter to cxl_find_nvdimm_bridge() that
can be used to query the ancestor root port. The endpoint port is not
yet available, but will share a common ancestor with its parent, so
start the query from there instead.
Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20240612064423.2567625-1-ming4.li@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
- filesystems: warn_unused_result warnings
- seccomp: format-zero-length warnings
- fchmodat2: clang build warnings due to-static-libasan
- openat2: clang build warnings due to static-libasan, LOCAL_HDRS
* tag 'linux_kselftest-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/fchmodat2: fix clang build failure due to -static-libasan
selftests/openat2: fix clang build failures: -static-libasan, LOCAL_HDRS
selftests: seccomp: fix format-zero-length warnings
selftests: filesystems: fix warn_unused_result build warnings
|
|
openvswitch.sh makes use of substitutions of the form ${ns:0:1}, to
obtain the first character of $ns. Empirically, this is works with bash
but not dash. When run with dash these evaluate to an empty string and
printing an error to stdout.
# dash -c 'ns=client; echo "${ns:0:1}"' 2>error
# cat error
dash: 1: Bad substitution
# bash -c 'ns=client; echo "${ns:0:1}"' 2>error
c
# cat error
This leads to tests that neither pass nor fail.
F.e.
TEST: arp_ping [START]
adding sandbox 'test_arp_ping'
Adding DP/Bridge IF: sbx:test_arp_ping dp:arpping {, , }
create namespaces
./openvswitch.sh: 282: eval: Bad substitution
TEST: ct_connect_v4 [START]
adding sandbox 'test_ct_connect_v4'
Adding DP/Bridge IF: sbx:test_ct_connect_v4 dp:ct4 {, , }
./openvswitch.sh: 322: eval: Bad substitution
create namespaces
Resolve this by making openvswitch.sh a bash script.
Fixes: 918423fda910 ("selftests: openvswitch: add an initial flow programming case")
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240617-ovs-selftest-bash-v1-1-7ae6ccd3617b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On 32bit systems, the "4 * max" multiply can overflow. Use kcalloc()
to do the allocation to prevent this.
Fixes: 44c494c8e30e ("ptp: track available ptp vclocks information")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://lore.kernel.org/r/ee8110ed-6619-4bd7-9024-28c1f2ac24f4@moroto.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While adding a SPI device, the SPI core ensures that multiple logical CS
doesn't map to the same physical CS. For example, spi->chip_select[0] !=
spi->chip_select[1] and so forth. However, unlike the SPI master, the SPI
slave doesn't have the list of chip selects, this leads to probe failure
when the SPI controller is configured as slave. Update the
__spi_add_device() function to perform this check only if the SPI
controller is configured as master.
Fixes: 4d8ff6b0991d ("spi: Add multi-cs memories support in SPI core")
Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@amd.com>
Link: https://msgid.link/r/20240617153052.26636-1-amit.kumar-mahapatra@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Dell SKU 0C64 has a single rt1318 amplifier.
The prefix name of control still needs to be set rt1318-1 corresponding to UCM config.
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://msgid.link/r/20240612075740.1678082-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Hardcoding the number of CPUs at compile time does improve code
generation, but if you get it wrong the result will be confusion.
We already limited this earlier to only "experts" (see commit
fe5759d5bfda "cpumask: limit visibility of FORCE_NR_CPUS"), but with
distro kernel configs often having EXPERT enabled, that turns out to not
be much of a limit.
To quote the philosophers at Disney: "Everyone can be an expert. And
when everyone's an expert, no one will be".
There's a runtime warning if you then set nr_cpus to anything but the
forced number, but apparently that can be ignored too [1] and by then
it's pretty much too late anyway.
If we had some real way to limit this to "embedded only", maybe it would
be worth it, but let's see if anybody even notices that the option is
gone. We need to simplify kernel configuration anyway.
Link: https://lore.kernel.org/all/20240618105036.208a8860@rorschach.local.home/ [1]
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Bail from outer address space loop, not just the inner memslot loop, when
a "null" handler is encountered by __kvm_handle_hva_range(), which is the
intended behavior. On x86, which has multiple address spaces thanks to
SMM emulation, breaking from just the memslot loop results in undefined
behavior due to assigning the non-existent return value from kvm_null_fn()
to a bool.
In practice, the bug is benign as kvm_mmu_notifier_invalidate_range_end()
is the only caller that passes handler=kvm_null_fn, and it doesn't set
flush_on_ret, i.e. assigning garbage to r.ret is ultimately ignored. And
for most configuration the compiler elides the entire sequence, i.e. there
is no undefined behavior at runtime.
------------[ cut here ]------------
UBSAN: invalid-load in arch/x86/kvm/../../../virt/kvm/kvm_main.c:655:10
load of value 160 is not a valid value for type '_Bool'
CPU: 370 PID: 8246 Comm: CPU 0/KVM Not tainted 6.8.2-amdsos-build58-ubuntu-22.04+ #1
Hardware name: AMD Corporation Sh54p/Sh54p, BIOS WPC4429N 04/25/2024
Call Trace:
<TASK>
dump_stack_lvl+0x48/0x60
ubsan_epilogue+0x5/0x30
__ubsan_handle_load_invalid_value+0x79/0x80
kvm_mmu_notifier_invalidate_range_end.cold+0x18/0x4f [kvm]
__mmu_notifier_invalidate_range_end+0x63/0xe0
__split_huge_pmd+0x367/0xfc0
do_huge_pmd_wp_page+0x1cc/0x380
__handle_mm_fault+0x8ee/0xe50
handle_mm_fault+0xe4/0x4a0
__get_user_pages+0x190/0x840
get_user_pages_unlocked+0xe0/0x590
hva_to_pfn+0x114/0x550 [kvm]
kvm_faultin_pfn+0xed/0x5b0 [kvm]
kvm_tdp_page_fault+0x123/0x170 [kvm]
kvm_mmu_page_fault+0x244/0xaa0 [kvm]
vcpu_enter_guest+0x592/0x1070 [kvm]
kvm_arch_vcpu_ioctl_run+0x145/0x8a0 [kvm]
kvm_vcpu_ioctl+0x288/0x6d0 [kvm]
__x64_sys_ioctl+0x8f/0xd0
do_syscall_64+0x77/0x120
entry_SYSCALL_64_after_hwframe+0x6e/0x76
</TASK>
---[ end trace ]---
Fixes: 071064f14d87 ("KVM: Don't take mmu_lock for range invalidation unless necessary")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Link: https://lore.kernel.org/r/b8723d39903b64c241c50f5513f804390c7b5eec.1718203311.git.babu.moger@amd.com
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
After catching up with KP recently, we discussed that I will be now be
responsible for co-maintaining the BPF LSM. Adding myself as
designated maintainer of the BPF LSM, and specifying more files in
which the BPF LSM maintenance responsibilities should now extend out
to. This is at the back of all the BPF kfuncs that have been added
recently, which are fundamentally restricted to being used only from
BPF LSM program types.
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ZnA-1qdtXS1TayD7@google.com
|
|
The bpf arena logic didn't account for mremap operation. Add a refcnt for
multiple mmap events to prevent use-after-free in arena_vm_close.
Fixes: 317460317a02 ("bpf: Introduce bpf_arena.")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Barret Rhoden <brho@google.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Closes: https://lore.kernel.org/bpf/Zmuw29IhgyPNKnIM@xpf.sh.intel.com
Link: https://lore.kernel.org/bpf/20240617171812.76634-1-alexei.starovoitov@gmail.com
|