Age | Commit message (Collapse) | Author |
|
Packet sockets, like tap, can be used as the backend for kernel vhost.
In packet sockets, virtio net header size is currently hardcoded to be
the size of struct virtio_net_hdr, which is 10 bytes; however, it is not
always the case: some virtio features, such as mrg_rxbuf, need virtio
net header to be 12-byte long.
Mergeable buffers, as a virtio feature, is worthy of supporting: packets
that are larger than one-mbuf size will be dropped in vhost worker's
handle_rx if mrg_rxbuf feature is not used, but large packets
cannot be avoided and increasing mbuf's size is not economical.
With this virtio feature enabled by virtio-user, packet sockets with
hardcoded 10-byte virtio net header will parse mac head incorrectly in
packet_snd by taking the last two bytes of virtio net header as part of
mac header.
This incorrect mac header parsing will cause packet to be dropped due to
invalid ether head checking in later under-layer device packet receiving.
By adding extra field vnet_hdr_sz with utilizing holes in struct
packet_sock to record currently used virtio net header size and supporting
extra sockopt PACKET_VNET_HDR_SZ to set specified vnet_hdr_sz, packet
sockets can know the exact length of virtio net header that virtio user
gives.
In packet_snd, tpacket_snd and packet_recvmsg, instead of using
hardcoded virtio net header size, it can get the exact vnet_hdr_sz from
corresponding packet_sock, and parse mac header correctly based on this
information to avoid the packets being mistakenly dropped.
Signed-off-by: Jianfeng Tan <henry.tjf@antgroup.com>
Co-developed-by: Anqi Shen <amy.saq@antgroup.com>
Signed-off-by: Anqi Shen <amy.saq@antgroup.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The auto-silencer supports two modes: "thresholded" to fill up "just
enough", and "top-up" to fill up "as much as possible". The two modes
used rather distinct code paths, which this patch unifies. The only
remaining distinction is how much we actually want to fill.
This fixes a bug in thresholded mode, where we failed to use new_hw_ptr,
resulting in under-fill.
Top-up mode is now more well-behaved and much easier to understand in
corner cases.
This also updates comments in the proximity of silencing-related data
structures.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20230420113324.877164-1-oswald.buddenhagen@gmx.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
* kvm-arm64/smccc-filtering:
: .
: SMCCC call filtering and forwarding to userspace, courtesy of
: Oliver Upton. From the cover letter:
:
: "The Arm SMCCC is rather prescriptive in regards to the allocation of
: SMCCC function ID ranges. Many of the hypercall ranges have an
: associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
: room for vendor-specific implementations.
:
: The ever-expanding SMCCC surface leaves a lot of work within KVM for
: providing new features. Furthermore, KVM implements its own
: vendor-specific ABI, with little room for other implementations (like
: Hyper-V, for example). Rather than cramming it all into the kernel we
: should provide a way for userspace to handle hypercalls."
: .
KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC"
KVM: arm64: Test that SMC64 arch calls are reserved
KVM: arm64: Prevent userspace from handling SMC64 arch range
KVM: arm64: Expose SMC/HVC width to userspace
KVM: selftests: Add test for SMCCC filter
KVM: selftests: Add a helper for SMCCC calls with SMC instruction
KVM: arm64: Let errors from SMCCC emulation to reach userspace
KVM: arm64: Return NOT_SUPPORTED to guest for unknown PSCI version
KVM: arm64: Introduce support for userspace SMCCC filtering
KVM: arm64: Add support for KVM_EXIT_HYPERCALL
KVM: arm64: Use a maple tree to represent the SMCCC filter
KVM: arm64: Refactor hvc filtering to support different actions
KVM: arm64: Start handling SMCs from EL1
KVM: arm64: Rename SMC/HVC call handler to reflect reality
KVM: arm64: Add vm fd device attribute accessors
KVM: arm64: Add a helper to check if a VM has ran once
KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/timer-vm-offsets: (21 commits)
: .
: This series aims at satisfying multiple goals:
:
: - allow a VMM to atomically restore a timer offset for a whole VM
: instead of updating the offset each time a vcpu get its counter
: written
:
: - allow a VMM to save/restore the physical timer context, something
: that we cannot do at the moment due to the lack of offsetting
:
: - provide a framework that is suitable for NV support, where we get
: both global and per timer, per vcpu offsetting, and manage
: interrupts in a less braindead way.
:
: Conflict resolution involves using the new per-vcpu config lock instead
: of the home-grown timer lock.
: .
KVM: arm64: Handle 32bit CNTPCTSS traps
KVM: arm64: selftests: Augment existing timer test to handle variable offset
KVM: arm64: selftests: Deal with spurious timer interrupts
KVM: arm64: selftests: Add physical timer registers to the sysreg list
KVM: arm64: nv: timers: Support hyp timer emulation
KVM: arm64: nv: timers: Add a per-timer, per-vcpu offset
KVM: arm64: Document KVM_ARM_SET_CNT_OFFSETS and co
KVM: arm64: timers: Abstract the number of valid timers per vcpu
KVM: arm64: timers: Fast-track CNTPCT_EL0 trap handling
KVM: arm64: Elide kern_hyp_va() in VHE-specific parts of the hypervisor
KVM: arm64: timers: Move the timer IRQs into arch_timer_vm_data
KVM: arm64: timers: Abstract per-timer IRQ access
KVM: arm64: timers: Rationalise per-vcpu timer init
KVM: arm64: timers: Allow save/restoring of the physical timer
KVM: arm64: timers: Allow userspace to set the global counter offset
KVM: arm64: Expose {un,}lock_all_vcpus() to the rest of KVM
KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2
KVM: arm64: timers: Use CNTPOFF_EL2 to offset the physical timer
arm64: Add HAS_ECV_CNTPOFF capability
arm64: Add CNTPOFF_EL2 register definition
...
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When the Core device gets an event from the device, or notices
the device FW to be up or down, it needs to send those events
on to the clients that have an event handler. Add the code to
pass along the events to the clients.
The entry points pdsc_register_notify() and pdsc_unregister_notify()
are EXPORTed for other drivers that want to listen for these events.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the client API operations for running adminq commands.
The core registers the client with the FW, then the client
has a context for requesting adminq services. We expect
to add additional operations for other clients, including
requesting additional private adminqs and IRQs, but don't have
the need yet.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
An auxiliary_bus device is created for each vDPA type VF at VF
probe and destroyed at VF remove. The aux device name comes
from the driver name + VIF type + the unique id assigned at PCI
probe. The VFs are always removed on PF remove, so there should
be no issues with VFs trying to access missing PF structures.
The auxiliary_device names will look like "pds_core.vDPA.nn"
where 'nn' is the VF's uid.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Virtual Interfaces (VIFs) supported by the DSC's
configuration (vDPA, Eth, RDMA, etc) are reported in the
dev_ident struct and made visible in debugfs. At this point
only vDPA is supported in this driver so we only setup
devices for that feature.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the service routines for submitting and processing
the adminq messages and for handling notifyq events.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Set up the basic adminq and notifyq queue structures. These are
used mostly by the client drivers for feature configuration.
These are essentially the same adminq and notifyq as in the
ionic driver.
Part of this includes querying for device identity and FW
information, so we can make that available to devlink dev info.
$ devlink dev info pci/0000:b5:00.0
pci/0000:b5:00.0:
driver pds_core
serial_number FLM18420073
versions:
fixed:
asic.id 0x0
asic.rev 0x0
running:
fw 1.51.0-73
stored:
fw.goldfw 1.15.9-C-22
fw.mainfwa 1.60.0-73
fw.mainfwb 1.60.0-57
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The devcmd interface is the basic connection to the device through the
PCI BAR for low level identification and command services. This does
the early device initialization and finds the identity data, and adds
devcmd routines to be used by later driver bits.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the initial PCI driver framework for the new pds_core device
driver and its family of devices. This does the very basics of
registering for the new PF PCI device 1dd8:100c, setting up debugfs
entries, and registering with devlink.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new bridge port attribute that allows user space to enable
per-{Port, VLAN} neighbor suppression. Example:
# bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
false
# bridge link set dev swp1 neigh_vlan_suppress on
# bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
true
# bridge link set dev swp1 neigh_vlan_suppress off
# bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]'
false
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new VLAN attribute that allows user space to set the neighbor
suppression state of the port VLAN. Example:
# bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
false
# bridge vlan set vid 10 dev swp1 neigh_suppress on
# bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
true
# bridge vlan set vid 10 dev swp1 neigh_suppress off
# bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]'
false
# bridge vlan set vid 10 dev br0 neigh_suppress on
Error: bridge: Can't set neigh_suppress for non-port vlans.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add two internal flags that will be used to enable / disable per-{Port,
VLAN} neighbor suppression:
1. 'BR_NEIGH_VLAN_SUPPRESS': A per-port flag used to indicate that
per-{Port, VLAN} neighbor suppression is enabled on the bridge port.
When set, 'BR_NEIGH_SUPPRESS' has no effect.
2. 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED': A per-VLAN flag used to indicate
that neighbor suppression is enabled on the given VLAN.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array payload[] from the structure
sctp_datahdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/socket.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:230:29: warning: nested flexible array
This member is not even used anywhere.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array hmac[] from the structure
sctp_authhdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/auth.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:735:29: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array peer_init[] from the structure
sctp_cookie to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/sm_make_chunk.c: note: in included file (through include/net/sctp/sctp.h):
./include/net/sctp/structs.h:1588:28: warning: nested flexible array
./include/net/sctp/structs.h:343:28: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array variable[] from the structure
sctp_sackhdr and sctp_errhdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/sm_statefuns.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:451:28: warning: nested flexible array
./include/linux/sctp.h:393:29: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array skip[] from the structure
sctp_ifwdtsn/fwdtsn_hdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/stream_interleave.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:611:32: warning: nested flexible array
./include/linux/sctp.h:628:33: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch deletes the flexible-array params[] from the structure
sctp_inithdr, sctp_addiphdr and sctp_reconf_chunk to avoid some
sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/input.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:278:29: warning: nested flexible array
./include/linux/sctp.h:675:30: warning: nested flexible array
This warning is reported if a structure having a flexible array
member is included by other structures.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The virtio_ring header file uses the struct device without a forward
declaration.
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230417022037.917668-1-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add VIRTIO_F_NOTIFICATION_DATA support for vDPA transport.
If this feature is negotiated, the driver passes extra data when kicking
a virtqueue.
A device that offers this feature needs to implement the
kick_vq_with_data callback.
kick_vq_with_data receives the vDPA device and data.
data includes:
16 bits vqn and 16 bits next available index for split virtqueues.
16 bits vqs, 15 least significant bits of next available index
and 1 bit next_wrap for packed virtqueues.
This patch follows a patch [1] by Viktor Prutyanov which adds support
for the MMIO, channel I/O and modern PCI transports.
Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
Message-Id: <20230413081855.36643-3-alvaro.karsz@solid-run.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
According to VirtIO spec v1.2, VIRTIO_F_NOTIFICATION_DATA feature
indicates that the driver passes extra data along with the queue
notifications.
In a split queue case, the extra data is 16-bit available index. In a
packed queue case, the extra data is 1-bit wrap counter and 15-bit
available index.
Add support for this feature for MMIO, channel I/O and modern PCI
transports.
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230413081855.36643-2-alvaro.karsz@solid-run.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Address some minor kdoc warnings in vring.h.
* Place kdoc for 'struct vringh_config_ops' immediately before the structure
* Add missing documentation of members of 'vringh_iov' and 'vringh_kiov'
Warnings flagged by:
$ ./scripts/kernel-doc -none include/linux/vringh.h
include/linux/vringh.h:68: error: Cannot parse struct or union!
include/linux/vringh.h:92: warning: Function parameter or member 'iov' not described in 'vringh_iov'
include/linux/vringh.h:92: warning: Function parameter or member 'consumed' not described in 'vringh_iov'
include/linux/vringh.h:92: warning: Function parameter or member 'i' not described in 'vringh_iov'
include/linux/vringh.h:92: warning: Function parameter or member 'used' not described in 'vringh_iov'
include/linux/vringh.h:92: warning: Function parameter or member 'max_num' not described in 'vringh_iov'
include/linux/vringh.h:104: warning: Function parameter or member 'iov' not described in 'vringh_kiov'
include/linux/vringh.h:104: warning: Function parameter or member 'consumed' not described in 'vringh_kiov'
include/linux/vringh.h:104: warning: Function parameter or member 'i' not described in 'vringh_kiov'
include/linux/vringh.h:104: warning: Function parameter or member 'used' not described in 'vringh_kiov'
include/linux/vringh.h:104: warning: Function parameter or member 'max_num' not described in 'vringh_kiov'
Signed-off-by: Simon Horman <horms@kernel.org>
Message-Id: <20230331-vhost-fixes-v1-2-1f046e735b9e@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This patch addresses the following minor kdoc problems.
* Incorrect spelling of 'callback' and 'notification'
* Unrecognised kdoc format for 'struct vdpa_map_file'
* Missing documentation of 'get_vendor_vq_stats' member of
'struct vdpa_config_ops'
* Missing documentation of 'max_supported_vqs' and 'supported_features'
members of 'struct vdpa_mgmt_dev'
Most of these problems were flagged by:
$ ./scripts/kernel-doc -Werror -none include/linux/vdpa.h
include/linux/vdpa.h:20: warning: expecting prototype for struct vdpa_calllback. Prototype was for struct vdpa_callback instead
include/linux/vdpa.h:117: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Corresponding file area for device memory mapping
include/linux/vdpa.h:357: warning: Function parameter or member 'get_vendor_vq_stats' not described in 'vdpa_config_ops'
include/linux/vdpa.h:518: warning: Function parameter or member 'supported_features' not described in 'vdpa_mgmt_dev'
include/linux/vdpa.h:518: warning: Function parameter or member 'max_supported_vqs' not described in 'vdpa_mgmt_dev'
The misspelling of 'notification' was flagged by:
$ ./scripts/checkpatch.pl --codespell --showfile --strict -f include/linux/vdpa.h
include/linux/vdpa.h:171: CHECK: 'notifcation' may be misspelled - perhaps 'notification'?
...
Signed-off-by: Simon Horman <horms@kernel.org>
Message-Id: <20230331-vhost-fixes-v1-1-1f046e735b9e@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
vDPA supports the possibility to use user VA in the iotlb messages.
So, let's add support for user VA in vringh to use it in the vDPA
simulators.
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20230404131716.45855-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
These new optional callbacks is used to bind/unbind the device to
a specific address space so the vDPA framework can use VA when
these callbacks are implemented.
Suggested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20230404131326.44403-2-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add eventfd for the vdpa callback so that user
can signal it directly instead of triggering the
callback. It will be used for vhost-vdpa case.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20230323053043.35-9-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
This introduces set/get_vq_affinity callbacks in
vdpa_config_ops to support virtqueue affinity
management for vdpa device drivers.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230323053043.35-3-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Add const to make the read-only pointer parameters clear, similar to
many existing functions.
To implement this change, the commit also introduces the use of
`container_of_const` to implement `to_vvq`, which ensures the const-ness
of read-only parameters and avoids accidental modification of their
members.
Signed-off-by: Feng Liu <feliu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Message-Id: <20230310053428.3376-4-feliu@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Group some variables based on their sizes to reduce hole and avoid padding.
On x86_64, this shrinks the size of 'struct virtqueue'
from 72 to 68 bytes.
It saves a few bytes of memory.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Message-Id: <8f3d2e49270a2158717e15008e7ed7228196ba02.1676707807.git.christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Lafreniere <peter@n8pjl.ca>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
|
|
It can be really hard to analyse or debug why packets are
going missing in mac80211, so add the needed infrastructure
to use use the new per-subsystem drop reasons.
We actually use two drop reason subsystems here because of
the different handling of frames that are dropped but still
go to monitor for old versions of hostapd, and those that
are just completely unusable (e.g. crypto failed.)
Annotate a few reasons here just to illustrate this, we'll
need to go through and annotate more of them later.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend drop reasons to make them usable by subsystems
other than core by reserving the high 16 bits for a
new subsystem ID, of which 0 of course is used for the
existing reasons immediately.
To still be able to have string reasons, restructure
that code a bit to make the loopup under RCU, the only
user of this (right now) is drop_monitor.
Link: https://lore.kernel.org/netdev/00659771ed54353f92027702c5bbb84702da62ce.camel@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This will, after the next patch, hold only the core
drop reasons and minimal infrastructure. Fix a small
kernel-doc issue while at it, to avoid the move
triggering a checker.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ICMPv6 error packets are not sent to the anycast destinations and this
prevents things like traceroute from working. So create a setting similar
to ECHO when dealing with Anycast sources (icmpv6_echo_ignore_anycast).
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20230419013238.2691167-1-maheshb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
lookups by descriptor are better off closer to syscall surface...
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pass the dentry of a source file and the dentry of a destination directory
to lock parent inodes for rename. As soon as this function returns,
->d_parent of the source file dentry is stable and inodes are properly
locked for calling vfs-rename. This helper is needed for ksmbd server.
rename request of SMB protocol has to rename an opened file, no matter
which directory it's in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Since vfs_path_lookup is exported, It should not be internal.
Move vfs_path_lookup prototype in internal.h to linux/namei.h.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
__kfree_skb_defer() uses the old naming where "defer" meant
slab bulk free/alloc APIs. In the meantime we also made
__kfree_skb_defer() feed the per-NAPI skb cache, which
implies bulk APIs. So take away the 'defer' and add 'napi'.
While at it add a drop reason. This only matters on the
tx_action path, if the skb has a frag_list. But getting
rid of a SKB_DROP_REASON_NOT_SPECIFIED seems like a net
benefit so why not.
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230420020005.815854-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Address a number of warnings flagged by
./scripts/kernel-doc -none include/net/flow_dissector.h
include/net/flow_dissector.h:23: warning: Function parameter or member 'addr_type' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:23: warning: Function parameter or member 'flags' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:46: warning: Function parameter or member 'padding' not described in 'flow_dissector_key_basic'
include/net/flow_dissector.h:145: warning: Function parameter or member 'tipckey' not described in 'flow_dissector_key_addrs'
include/net/flow_dissector.h:157: warning: cannot understand function prototype: 'struct flow_dissector_key_arp '
include/net/flow_dissector.h:171: warning: cannot understand function prototype: 'struct flow_dissector_key_ports '
include/net/flow_dissector.h:203: warning: cannot understand function prototype: 'struct flow_dissector_key_icmp '
Also improve indentation on adjacent lines to those changed
to address the above.
No functional changes intended.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230419-flow-dissector-kdoc-v1-1-1aa0cca1118b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jesper points out that we must prevent recycling into cache
after page_pool_destroy() is called, because page_pool_destroy()
is not synchronized with recycling (some pages may still be
outstanding when destroy() gets called).
I assumed this will not happen because NAPI can't be scheduled
if its page pool is being destroyed. But I missed the fact that
NAPI may get reused. For instance when user changes ring configuration
driver may allocate a new page pool, stop NAPI, swap, start NAPI,
and then destroy the old pool. The NAPI is running so old page
pool will think it can recycle to the cache, but the consumer
at that point is the destroy() path, not NAPI.
To avoid extra synchronization let the drivers do "unlinking"
during the "swap" stage while NAPI is indeed disabled.
Fixes: 8c48eea3adf3 ("page_pool: allow caching from safely localized NAPI")
Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Link: https://lore.kernel.org/all/e8df2654-6a5b-3c92-489d-2fe5e444135f@redhat.com/
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/20230419182006.719923-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To be consistent with the other enum keys use OP_MOD
instead of OP_MODE.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0
Moreover, add documentation on the reporter functionality and the
counters description.
While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Adjacent changes:
net/mptcp/protocol.h
63740448a32e ("mptcp: fix accept vs worker race")
2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter and bpf.
There are a few fixes for new code bugs, including the Mellanox one
noted in the last networking pull. No known regressions outstanding.
Current release - regressions:
- sched: clear actions pointer in miss cookie init fail
- mptcp: fix accept vs worker race
- bpf: fix bpf_arch_text_poke() with new_addr == NULL on s390
- eth: bnxt_en: fix a possible NULL pointer dereference in unload
path
- eth: veth: take into account peer device for
NETDEV_XDP_ACT_NDO_XMIT xdp_features flag
Current release - new code bugs:
- eth: revert "net/mlx5: Enable management PF initialization"
Previous releases - regressions:
- netfilter: fix recent physdev match breakage
- bpf: fix incorrect verifier pruning due to missing register
precision taints
- eth: virtio_net: fix overflow inside xdp_linearize_page()
- eth: cxgb4: fix use after free bugs caused by circular dependency
problem
- eth: mlxsw: pci: fix possible crash during initialization
Previous releases - always broken:
- sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg
- netfilter: validate catch-all set elements
- bridge: don't notify FDB entries with "master dynamic"
- eth: bonding: fix memory leak when changing bond type to ethernet
- eth: i40e: fix accessing vsi->active_filters without holding lock
Misc:
- Mat is back as MPTCP co-maintainer"
* tag 'net-6.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
net: bridge: switchdev: don't notify FDB entries with "master dynamic"
Revert "net/mlx5: Enable management PF initialization"
MAINTAINERS: Resume MPTCP co-maintainer role
mailmap: add entries for Mat Martineau
e1000e: Disable TSO on i219-LM card to increase speed
bnxt_en: fix free-runnig PHC mode
net: dsa: microchip: ksz8795: Correctly handle huge frame configuration
bpf: Fix incorrect verifier pruning due to missing register precision taints
hamradio: drop ISA_DMA_API dependency
mlxsw: pci: Fix possible crash during initialization
mptcp: fix accept vs worker race
mptcp: stops worker on unaccepted sockets at listener close
net: rpl: fix rpl header size calculation
net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()
bonding: Fix memory leak when changing bond type to Ethernet
veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features flag
mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next()
bnxt_en: Fix a possible NULL pointer dereference in unload path
bnxt_en: Do not initialize PTP on older P3/P4 chips
netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements
...
|
|
Userspace can't easily discover how much of a sleep cycle was spent in a
hardware sleep state without using kernel tracing and vendor specific sysfs
or debugfs files.
To make this information more discoverable, introduce 3 new sysfs files:
1) The time spent in a hw sleep state for last cycle.
2) The time spent in a hw sleep state since the kernel booted
3) The maximum time that the hardware can report for a sleep cycle.
All of these files will be present only if the system supports s2idle.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
* for-next/perf: (24 commits)
KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege
drivers/perf: hisi: add NULL check for name
drivers/perf: hisi: Remove redundant initialized of pmu->name
perf/arm-cmn: Fix port detection for CMN-700
arm64: pmuv3: dynamically map PERF_COUNT_HW_BRANCH_INSTRUCTIONS
perf/arm-cmn: Validate cycles events fully
Revert "ARM: mach-virt: Select PMUv3 driver by default"
drivers/perf: apple_m1: Add Apple M2 support
dt-bindings: arm-pmu: Add PMU compatible strings for Apple M2 cores
perf: arm_cspmu: Fix variable dereference warning
perf/amlogic: Fix config1/config2 parsing issue
drivers/perf: Use devm_platform_get_and_ioremap_resource()
kbuild, drivers/perf: remove MODULE_LICENSE in non-modules
perf: qcom: Use devm_platform_get_and_ioremap_resource()
perf: arm: Use devm_platform_get_and_ioremap_resource()
perf/arm-cmn: Move overlapping wp_combine field
ARM: mach-virt: Select PMUv3 driver by default
ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM
ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations
perf: pmuv3: Change GENMASK to GENMASK_ULL
...
|
|
The latest version of your favorite fork of the PE/COFF spec includes a
new type of header flag that is intended to be used in the context of
EFI firmware to indicate to the image loader that the executable regions
of an image can be mapped with BTI/IBT enforcement enabled.
So let's import these definitions so we can use them in subsequent
patches.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The tracking of used_hiwater adds an atomic operation to the hot
path. This is acceptable only when debugging the kernel. To make
sure that the fields can never be used by mistake, do not even
include them in struct io_tlb_mem if CONFIG_DEBUG_FS is not set.
The build fails after doing that. To fix it, it is necessary to
remove all code specific to debugfs and instead provide a stub
implementation of swiotlb_create_debugfs_files(). As a bonus, this
change allows to remove one __maybe_unused attribute.
Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|