Age | Commit message (Collapse) | Author |
|
The debugfs_create_dir() function never returns NULL. It does return
error pointers but in normal situations like this there is no need to
check for errors.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217071037.GE26548@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is already an "int err" declared at the start of the function so
re-use that instead of declaring a shadow err variable.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211217070735.GC26548@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following warning is fixed with additional config dependencies:
s390-linux-ld: drivers/net/ethernet/engleder/tsnep_main.o: in function `tsnep_probe':
tsnep_main.c:(.text+0x1de6): undefined reference to `devm_ioremap_resource'
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20211216200154.1520-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Avoid double free in tun_free_netdev() by moving the
dev->tstats and tun->security allocs to a new ndo_init routine
(tun_net_init()) that will be called by register_netdevice().
ndo_init is paired with the desctructor (tun_free_netdev()),
so if there's an error in register_netdevice() the destructor
will handle the frees.
BUG: KASAN: double-free or invalid-free in selinux_tun_dev_free_security+0x1a/0x20 security/selinux/hooks.c:5605
CPU: 0 PID: 25750 Comm: syz-executor416 Not tainted 5.16.0-rc2-syzk #1
Hardware name: Red Hat KVM, BIOS
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x89/0xb5 lib/dump_stack.c:106
print_address_description.constprop.9+0x28/0x160 mm/kasan/report.c:247
kasan_report_invalid_free+0x55/0x80 mm/kasan/report.c:372
____kasan_slab_free mm/kasan/common.c:346 [inline]
__kasan_slab_free+0x107/0x120 mm/kasan/common.c:374
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:1723 [inline]
slab_free_freelist_hook mm/slub.c:1749 [inline]
slab_free mm/slub.c:3513 [inline]
kfree+0xac/0x2d0 mm/slub.c:4561
selinux_tun_dev_free_security+0x1a/0x20 security/selinux/hooks.c:5605
security_tun_dev_free_security+0x4f/0x90 security/security.c:2342
tun_free_netdev+0xe6/0x150 drivers/net/tun.c:2215
netdev_run_todo+0x4df/0x840 net/core/dev.c:10627
rtnl_unlock+0x13/0x20 net/core/rtnetlink.c:112
__tun_chr_ioctl+0x80c/0x2870 drivers/net/tun.c:3302
tun_chr_ioctl+0x2f/0x40 drivers/net/tun.c:3311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x19d/0x220 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3a/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/1639679132-19884-1-git-send-email-george.kennedy@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In line:
upper = info->upper_dev;
We access upper_dev field, which is related only for particular events
(e.g. event == NETDEV_CHANGEUPPER). So, this line cause invalid memory
access for another events,
when ptr is not netdev_notifier_changeupper_info.
The KASAN logs are as follows:
[ 30.123165] BUG: KASAN: stack-out-of-bounds in prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
[ 30.133336] Read of size 8 at addr ffff80000cf772b0 by task udevd/778
[ 30.139866]
[ 30.141398] CPU: 0 PID: 778 Comm: udevd Not tainted 5.16.0-rc3 #6
[ 30.147588] Hardware name: DNI AmazonGo1 A7040 board (DT)
[ 30.153056] Call trace:
[ 30.155547] dump_backtrace+0x0/0x2c0
[ 30.159320] show_stack+0x18/0x30
[ 30.162729] dump_stack_lvl+0x68/0x84
[ 30.166491] print_address_description.constprop.0+0x74/0x2b8
[ 30.172346] kasan_report+0x1e8/0x250
[ 30.176102] __asan_load8+0x98/0xe0
[ 30.179682] prestera_netdev_port_event.constprop.0+0x68/0x538 [prestera]
[ 30.186847] prestera_netdev_event_handler+0x1b4/0x1c0 [prestera]
[ 30.193313] raw_notifier_call_chain+0x74/0xa0
[ 30.197860] call_netdevice_notifiers_info+0x68/0xc0
[ 30.202924] register_netdevice+0x3cc/0x760
[ 30.207190] register_netdev+0x24/0x50
[ 30.211015] prestera_device_register+0x8a0/0xba0 [prestera]
Fixes: 3d5048cc54bd ("net: marvell: prestera: move netdev topology validation to prestera_main")
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Link: https://lore.kernel.org/r/20211216171714.11341-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In case, when some ports is in list and we don't find requested - we
return last iterator state and not return NULL as expected.
Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Link: https://lore.kernel.org/r/20211216170736.8851-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 86c3a3e964d910a62eeb277d60b2a60ebefa9feb.
The tipc_aead_init() function can be calling from an interrupt routine.
This allocation might sleep with GFP_KERNEL flag, hence the following BUG
is reported.
[ 17.657509] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230
[ 17.660916] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/3
[ 17.664093] preempt_count: 302, expected: 0
[ 17.665619] RCU nest depth: 2, expected: 0
[ 17.667163] Preemption disabled at:
[ 17.667165] [<0000000000000000>] 0x0
[ 17.669753] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G W 5.16.0-rc4+ #1
[ 17.673006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 17.675540] Call Trace:
[ 17.676285] <IRQ>
[ 17.676913] dump_stack_lvl+0x34/0x44
[ 17.678033] __might_resched.cold+0xd6/0x10f
[ 17.679311] kmem_cache_alloc_trace+0x14d/0x220
[ 17.680663] tipc_crypto_start+0x4a/0x2b0 [tipc]
[ 17.682146] ? kmem_cache_alloc_trace+0xd3/0x220
[ 17.683545] tipc_node_create+0x2f0/0x790 [tipc]
[ 17.684956] tipc_node_check_dest+0x72/0x680 [tipc]
[ 17.686706] ? ___cache_free+0x31/0x350
[ 17.688008] ? skb_release_data+0x128/0x140
[ 17.689431] tipc_disc_rcv+0x479/0x510 [tipc]
[ 17.690904] tipc_rcv+0x71c/0x730 [tipc]
[ 17.692219] ? __netif_receive_skb_core+0xb7/0xf60
[ 17.693856] tipc_l2_rcv_msg+0x5e/0x90 [tipc]
[ 17.695333] __netif_receive_skb_list_core+0x20b/0x260
[ 17.697072] netif_receive_skb_list_internal+0x1bf/0x2e0
[ 17.698870] ? dev_gro_receive+0x4c2/0x680
[ 17.700255] napi_complete_done+0x6f/0x180
[ 17.701657] virtnet_poll+0x29c/0x42e [virtio_net]
[ 17.703262] __napi_poll+0x2c/0x170
[ 17.704429] net_rx_action+0x22f/0x280
[ 17.705706] __do_softirq+0xfd/0x30a
[ 17.706921] common_interrupt+0xa4/0xc0
[ 17.708206] </IRQ>
[ 17.708922] <TASK>
[ 17.709651] asm_common_interrupt+0x1e/0x40
[ 17.711078] RIP: 0010:default_idle+0x18/0x20
Fixes: 86c3a3e964d9 ("tipc: use consistent GFP flags")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20211217030059.5947-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the user sets a lower mtu on the CPU port than on the switch,
then DMA inserts a few more bytes into the buffer than expected.
In the worst case, it may exceed the size of the buffer. The
experiments showed that the buffer should be a multiple of the
burst length value. This patch rounds the length of the rx buffer
upwards and fixes this bug. The reservation of FCS space in the
buffer has been removed as PMAC strips the FCS.
Fixes: 998ac358019e ("net: lantiq: add support for jumbo frames")
Reported-by: Thomas Nixon <tom@tomn.co.uk>
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Paul Blakey says:
====================
net/sched: Fix ct zone matching for invalid conntrack state
Currently, when a packet is marked as invalid conntrack_in in act_ct,
post_ct will be set, and connection info (nf_conn) will be removed
from the skb. Later openvswitch and flower matching will parse this
as ct_state=+trk+inv. But because the connection info is missing,
there is also no zone info to match against even though the packet
is tracked.
This series fixes that, by passing the last executed zone by act_ct.
The zone info is passed along from act_ct to the ct flow dissector
(used by flower to extract zone info) and to ovs, the same way as post_ct
is passed, via qdisc layer skb cb to dissector, and via skb extension
to OVS.
Since adding any more data to qdisc skb cb, there will be no room
for BPF skb cb to extend it and stay under skb->cb size, this series
moves the tc related info from within qdisc skb cb to a tc specific cb
that also extends it.
====================
Link: https://lore.kernel.org/r/20211214172435.24207-1-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zone id is not restored if we passed ct and ct rejected the connection,
as there is no ct info on the skb.
Save the zone from tc skb cb to tc skb extension and pass it on to
ovs, use that info to restore the zone id for invalid connections.
Fixes: d29334c15d33 ("net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If ct rejects a flow, it removes the conntrack info from the skb.
act_ct sets the post_ct variable so the dissector will see this case
as an +tracked +invalid state, but the zone id is lost with the
conntrack info.
To restore the zone id on such cases, set the last executed zone,
via the tc control block, when passing ct, and read it back in the
dissector if there is no ct info on the skb (invalid connection).
Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
BPF layer extends the qdisc control block via struct bpf_skb_data_end
and because of that there is no more room to add variables to the
qdisc layer control block without going over the skb->cb size.
Extend the qdisc control block with a tc control block,
and move all tc related variables to there as a pre-step for
extending the tc control block with additional members.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull libata fix from Damien Le Moal:
"A single fix for this cycle:
- Check that ATA16 passthrough commands that do not transfer any data
have a DMA direction set to DMA_NONE (From George)"
* tag 'libata-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
libata: if T_LENGTH is zero, dma direction should be DMA_NONE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fixes from Damien Le Moal:
"One fix and one trivial update for rc6:
- Add MODULE_ALIAS_FS to get automatic module loading on mount
(Naohiro)
- Update Damien's email address in the MAINTAINERS file (me)"
* tag 'zonefs-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
MAITAINERS: Change zonefs maintainer email address
zonefs: add MODULE_ALIAS_FS
|
|
According to the official Microsoft MS-SMB2 document section 3.3.5.4, this
flag should be used only for 3.0 and 3.0.2 dialects. Setting it for 3.1.1
is a violation of the specification.
This causes my Windows 10 client to detect an anomaly in the negotiation,
and disable encryption entirely despite being explicitly enabled in ksmbd,
causing all data transfers to go in plain text.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org # v5.15
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Marcos Del Sol Vives <marcos@orca.pet>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
mount.cifs can pass a device with multiple delimiters in it. This will
cause rename(2) to fail with ENOENT.
V2:
- Make sanitize_path more readable.
- Fix multiple delimiters between UNC and prepath.
- Avoid a memory leak if a bad user starts putting a lot of delimiters
in the path on purpose.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2031200
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api")
Cc: stable@vger.kernel.org # 5.11+
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Thiago Rafael Becker <trbecker@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
We have a cyclic dependency between fscache super cookie
and root inode cookie. The super cookie relies on
tcon->resource_id, which gets populated from the root inode
number. However, fetching the root inode initializes inode
cookie as a child of super cookie, which is yet to be populated.
resource_id is only used as auxdata to check the validity of
super cookie. We can completely avoid setting resource_id to
remove the circular dependency. Since vol creation time and
vol serial numbers are used for auxdata, we should be fine.
Additionally, there will be auxiliary data check for each
inode cookie as well.
Fixes: 5bf91ef03d98 ("cifs: wait for tcon resource_id before getting fscache super")
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
A CP_PROTECT entry for SMMU registers is missing for A540. According to
downstream sources its length is same as on A530 - 0x20000 bytes.
On all other revisions SMMU region length is 0x10000 bytes. Despite
this, we setup region of length 0x20000 on all revisions. This doesn't
cause any issues on those GPUs. As for preventing accesses to the region
from protected mode it was tested to work the same.
This patch drops the "if" condition in setup of CP_PROTECT entry because
it already includes all supported revisions except A540.
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Link: https://lore.kernel.org/r/20211212160333.980343-2-vladimir.lypak@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
This GPU is found on SoCs such as MSM8953 (650 MHz), SDM450 (600 MHz),
SDM632 (725 MHz).
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Link: https://lore.kernel.org/r/20211212160333.980343-1-vladimir.lypak@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
There appears to be a spelling mistake in a bpf test message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211217182400.39296-1-colin.i.king@gmail.com
|
|
Reimplement bpf_probe_large_insn_limit() in bpftool, as that libbpf API
is scheduled for deprecation in v0.8.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211217171202.3352835-4-andrii@kernel.org
|
|
Add selftests for prog/map/prog+helper feature probing APIs. Prog and
map selftests are designed in such a way that they will always test all
the possible prog/map types, based on running kernel's vmlinux BTF enum
definition. This way we'll always be sure that when adding new BPF
program types or map types, libbpf will be always updated accordingly to
be able to feature-detect them.
BPF prog_helper selftest will have to be manually extended with
interesting and important prog+helper combinations, it's easy, but can't
be completely automated.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211217171202.3352835-3-andrii@kernel.org
|
|
Create three extensible alternatives to inconsistently named
feature-probing APIs:
- libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type();
- libbpf_probe_bpf_map_type() instead of bpf_probe_map_type();
- libbpf_probe_bpf_helper() instead of bpf_probe_helper().
Set up return values such that libbpf can report errors (e.g., if some
combination of input arguments isn't possible to validate, etc), in
addition to whether the feature is supported (return value 1) or not
supported (return value 0).
Also schedule deprecation of those three APIs. Also schedule deprecation
of bpf_probe_large_insn_limit().
Also fix all the existing detection logic for various program and map
types that never worked:
- BPF_PROG_TYPE_LIRC_MODE2;
- BPF_PROG_TYPE_TRACING;
- BPF_PROG_TYPE_LSM;
- BPF_PROG_TYPE_EXT;
- BPF_PROG_TYPE_SYSCALL;
- BPF_PROG_TYPE_STRUCT_OPS;
- BPF_MAP_TYPE_STRUCT_OPS;
- BPF_MAP_TYPE_BLOOM_FILTER.
Above prog/map types needed special setups and detection logic to work.
Subsequent patch adds selftests that will make sure that all the
detection logic keeps working for all current and future program and map
types, avoiding otherwise inevitable bit rot.
[0] Closes: https://github.com/libbpf/libbpf/issues/312
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Julia Kartseva <hex@fb.com>
Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org
|
|
This reverts commit bd0687c18e635b63233dc87f38058cd728802ab4.
This patch causes a Tx only workload to go to sleep even when it does
not have to, leading to misserable performance in skb mode. It fixed
one rare problem but created a much worse one, so this need to be
reverted while I try to craft a proper solution to the original
problem.
Fixes: bd0687c18e63 ("xsk: Do not sleep in poll() when need_wakeup set")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211217145646.26449-1-magnus.karlsson@gmail.com
|
|
Even after commit e1d7ba873555 ("time: Always make sure wall_to_monotonic
isn't positive") it is still possible to make wall_to_monotonic positive
by running the following code:
int main(void)
{
struct timespec time;
clock_gettime(CLOCK_MONOTONIC, &time);
time.tv_nsec = 0;
clock_settime(CLOCK_REALTIME, &time);
return 0;
}
The reason is that the second parameter of timespec64_compare(), ts_delta,
may be unnormalized because the delta is calculated with an open coded
substraction which causes the comparison of tv_sec to yield the wrong
result:
wall_to_monotonic = { .tv_sec = -10, .tv_nsec = 900000000 }
ts_delta = { .tv_sec = -9, .tv_nsec = -900000000 }
That makes timespec64_compare() claim that wall_to_monotonic < ts_delta,
but actually the result should be wall_to_monotonic > ts_delta.
After normalization, the result of timespec64_compare() is correct because
the tv_sec comparison is not longer misleading:
wall_to_monotonic = { .tv_sec = -10, .tv_nsec = 900000000 }
ts_delta = { .tv_sec = -10, .tv_nsec = 100000000 }
Use timespec64_sub() to ensure that ts_delta is normalized, which fixes the
issue.
Fixes: e1d7ba873555 ("time: Always make sure wall_to_monotonic isn't positive")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211213135727.1656662-1-liaoyu15@huawei.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One driver fix: the pm8001 has never actually worked on a system with
an IOMMU and this fixes that use case"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pm8001: Fix phys_to_virt() usage on dma_addr_t
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes, almost all error handling one-liners and for stable.
- regression fix in directory logging items
- regression fix of extent buffer status bits handling after an error
- fix memory leak in error handling path in tree-log
- fix freeing invalid anon device number when handling errors during
subvolume creation
- fix warning when freeing leaf after subvolume creation failure
- fix missing blkdev put in device scan error handling
- fix invalid delayed ref after subvolume creation failure"
* tag 'for-5.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix missing blkdev_put() call in btrfs_scan_one_device()
btrfs: fix warning when freeing leaf after subvolume creation failure
btrfs: fix invalid delayed ref after subvolume creation failure
btrfs: check WRITE_ERR when trying to read an extent buffer
btrfs: fix missing last dir item offset update when logging directory
btrfs: fix double free of anon_dev after failure to create subvolume
btrfs: fix memory leak in __add_inode_ref()
|
|
If the workqueue allocation fails, the driver is marked as not initialized,
and timer and panic_notifier will be left registered.
Instead of removing those when workqueue allocation fails, do the workqueue
initialization before doing it, and cleanup srcu_struct if it fails.
Fixes: 1d49eb91e86e ("ipmi: Move remove_work to dedicated workqueue")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
Cc: stable@vger.kernel.org
Message-Id: <20211217154410.1228673-2-cascardo@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
In case, init_srcu_struct fails (because of memory allocation failure), we
might proceed with the driver initialization despite srcu_struct not being
entirely initialized.
Fixes: 913a89f009d9 ("ipmi: Don't initialize anything in the core until something uses it")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Message-Id: <20211217154410.1228673-1-cascardo@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
For VIRTCHNL_VF_OFFLOAD_VLAN, PF's would limit the number of VLAN
filters a VF was allowed to add. However, by the time the opcode failed,
the VLAN netdev had already been added. VIRTCHNL_VF_OFFLOAD_VLAN_V2
added the ability for a PF to tell the VF how many VLAN filters it's
allowed to add. Make changes to support that functionality.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows
the VF to support 802.1Q and 802.1ad VLAN insertion and stripping if
successfully negotiated via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS.
Multiple changes were needed to support this new functionality.
1. Added new aq_required flags to support any kind of VLAN stripping and
insertion offload requests via virtchnl.
2. Added the new method iavf_set_vlan_offload_features() that's
used during VF initialization, VF reset, and iavf_set_features() to
set the aq_required bits based on the current VLAN offload
configuration of the VF's netdev.
3. Added virtchnl handling for VIRTCHNL_OP_ENABLE_STRIPPING_V2,
VIRTCHNL_OP_DISABLE_STRIPPING_V2, VIRTCHNL_OP_ENABLE_INSERTION_V2,
and VIRTCHNL_OP_ENABLE_INSERTION_V2.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows
the PF to set the location of the Tx and Rx VLAN tag for insertion and
stripping offloads. In order to support this functionality a few changes
are needed.
1. Add a new method to cache the VLAN tag location based on negotiated
capabilities for the Tx and Rx ring flags. This needs to be called in
the initialization and reset paths.
2. Refactor the transmit hotpath to account for the new Tx ring flags.
When IAVF_TXR_FLAGS_VLAN_LOC_L2TAG2 is set, then the driver needs to
insert the VLAN tag in the L2TAG2 field of the transmit descriptor.
When the IAVF_TXRX_FLAGS_VLAN_LOC_L2TAG1 is set, then the driver needs
to use the l2tag1 field of the data descriptor (same behavior as
before).
3. Refactor the iavf_tx_prepare_vlan_flags() function to simplify
transmit hardware VLAN offload functionality by only depending on the
skb_vlan_tag_present() function. This can be done because the OS
won't request transmit offload for a VLAN unless the driver told the
OS it's supported and enabled.
4. Refactor the receive hotpath to account for the new Rx ring flags and
VLAN ethertypes. This requires checking the Rx ring flags and
descriptor status bits to determine the location of the VLAN tag.
Also, since only a single ethertype can be supported at a time, check
the enabled netdev features before specifying a VLAN ethertype in
__vlan_hwaccel_put_tag().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Based on VIRTCHNL_VF_OFFLOAD_VLAN_V2, the VF can now support more VLAN
capabilities (i.e. 802.1AD offloads and filtering). In order to
communicate these capabilities to the netdev layer, the VF needs to
parse its VLAN capabilities based on whether it was able to negotiation
VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither of
these.
In order to support this, add the following functionality:
iavf_get_netdev_vlan_hw_features() - This is used to determine the VLAN
features that the underlying hardware supports and that can be toggled
off/on based on the negotiated capabiltiies. For example, if
VIRTCHNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any capability marked
with VIRTCHNL_VLAN_TOGGLE can be toggled on/off by the VF. If
VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then only VLAN insertion and/or
stripping can be toggled on/off.
iavf_get_netdev_vlan_features() - This is used to determine the VLAN
features that the underlying hardware supports and that should be
enabled by default. For example, if VIRTHCNL_VF_OFFLOAD_VLAN_V2 was
negotiated, then any supported capability that has its ethertype_init
filed set should be enabled by default. If VIRTCHNL_VF_OFFLOAD_VLAN was
negotiated, then filtering, stripping, and insertion should be enabled
by default.
Also, refactor iavf_fix_features() to take into account the new
capabilities. To do this, query all the supported features (enabled by
default and toggleable) and make sure the requested change is supported.
If VIRTCHNL_VF_OFFLOAD_VLAN_V2 is successfully negotiated, there is no
need to check VIRTCHNL_VLAN_TOGGLE here because the driver already told
the netdev layer which features can be toggled via netdev->hw_features
during iavf_process_config(), so only those features will be requested
to change.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In order to support the new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability the
VF driver needs to rework it's initialization state machine and reset
flow. This has to be done because successful negotiation of
VIRTCHNL_VF_OFFLOAD_VLAN_V2 requires the VF driver to perform a second
capability request via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS before
configuring the adapter and its netdev.
Add the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit when sending the
VIRTHCNL_OP_GET_VF_RESOURECES message. The underlying PF will either
support VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or
neither. Both of these offloads should never be supported together.
Based on this, add 2 new states to the initialization state machine:
__IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS
__IAVF_INIT_CONFIG_ADAPTER
The __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state is used to request/store
the new VLAN capabilities if and only if VIRTCHNL_VLAN_OFFLOAD_VLAN_V2
was successfully negotiated in the __IAVF_INIT_GET_RESOURCES state.
The __IAVF_INIT_CONFIG_ADAPTER state is used to configure the
adapter/netdev after the resource requests have finished. The VF will
move into this state regardless of whether it successfully negotiated
VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2.
Also, add a the new flag IAVF_FLAG_AQ_GET_OFFLOAD_VLAN_V2_CAPS and set
it during VF reset. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 was successfully
negotiated then the VF will request its VLAN capabilities via
VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS during the reset. This is needed
because the PF may change/modify the VF's configuration during VF reset
(i.e. modifying the VF's port VLAN configuration).
This also, required the VF to call netdev_update_features() since its
VLAN features may change during VF reset. Make sure to call this under
rtnl_lock().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently VIRTCHNL only allows for VLAN filtering and offloads to happen
on a single 802.1Q VLAN. Add support to filter and offload on inner,
outer, and/or inner + outer VLANs.
This is done by introducing the new capability
VIRTCHNL_VF_OFFLOAD_VLAN_V2. The flow to negotiate this new capability
is shown below.
1. VF - sets the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit in the
virtchnl_vf_resource.vf_caps_flags during the
VIRTCHNL_OP_GET_VF_RESOURCES request message. The VF should also set
the VIRTCHNL_VF_OFFLOAD_VLAN bit in case the PF driver doesn't support
the new capability.
2. PF - sets the VLAN capability bit it supports in the
VIRTCHNL_OP_GET_VF_RESOURCES response message. This will either be
VIRTCHNL_VF_OFFLOAD_VLAN_V2, VIRTCHNL_VF_OFFLOAD_VLAN, or none.
3. VF - If the VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability was ACK'd by the
PF, then the VF needs to request the VLAN capabilities of the
PF/Device by issuing a VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS request.
If the VIRTCHNL_VF_OFFLOAD_VLAN capability was ACK'd then the VF
knows only single 802.1Q VLAN filtering/offloads are supported. If no
VLAN capability is ACK'd then the PF/Device doesn't support hardware
VLAN filtering/offloads for this VF.
4. PF - Populates the virtchnl_vlan_caps structure based on what it
allows/supports for that VF and sends that response via
VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS.
After VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is successfully negotiated
the VF driver needs to interpret the capabilities supported by the
underlying PF/Device. The VF will be allowed to filter/offload the
inner 802.1Q, outer (various ethertype), inner 802.1Q + outer
(various ethertypes), or none based on which fields are set.
The VF will also need to interpret where the VLAN tag should be inserted
and/or stripped based on the negotiated capabilities.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"Another small SELinux fix for v5.16 to ensure that we don't block on
memory allocations while holding a spinlock.
This passes all our tests without problem"
* tag 'selinux-pr-20211217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix sleeping function called from invalid context
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A handful of DT updates for the SiFive HiFive Unmatched, that fix the
regulator handling. These should stop some warning spew.
- A pair of fixes for both the SiFive Hifive Unleashed and Unmatched,
that correctly hook up the MMC card detect signal.
* tag 'riscv-for-linus-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: sifive unmatched: Link the tmp451 with its power supply
riscv: dts: sifive unmatched: Fix regulator for board rev3
riscv: dts: sifive unmatched: Expose the PMIC sub-functions
riscv: dts: sifive unmatched: Expose the board ID eeprom
riscv: dts: sifive unmatched: Name gpio lines
riscv: dts: unmatched: Add gpio card detect to mmc-spi-slot
riscv: dts: unleashed: Add gpio card detect to mmc-spi-slot
|
|
Pull block fixes from Jens Axboe:
- Fix for hammering on the delayed run queue timer (me)
- bcache regression fix for this merge window (Lin)
- Fix a divide-by-zero in the blk-iocost code (Tejun)
* tag 'block-5.16-2021-12-17' of git://git.kernel.dk/linux-block:
bcache: fix NULL pointer reference in cached_dev_detach_finish
block: reduce kblockd_mod_delayed_work_on() CPU consumption
iocost: Fix divide-by-zero on donation from low hweight cgroup
|
|
Pull io_uring fix from Jens Axboe:
"Just a single fix, fixing an issue with the worker creation change
that was merged last week"
* tag 'io_uring-5.16-2021-12-17' of git://git.kernel.dk/linux-block:
io-wq: drop wqe lock before creating new worker
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"A bunch of driver fixes, notably:
- uninit variable fix for dw-axi-dmac driver
- return value check dw-edma driver
- calling wq quiesce inside spinlock and missed completion for idxd
driver
- mod alias fix for st_fdma driver"
* tag 'dmaengine-fix-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: st_fdma: fix MODULE_ALIAS
dmaengine: idxd: fix missed completion on abort path
dmaengine: ti: k3-udma: Fix smatch warnings
dmaengine: idxd: fix calling wq quiesce inside spinlock
dmaengine: dw-edma: Fix return value check for dma_set_mask_and_coherent()
dmaengine: dw-axi-dmac: Fix uninitialized variable in axi_chan_block_xfer_start()
|
|
Currently cleaned_count is initialized to ICE_DESC_UNUSED(rx_ring) and
later on during the Rx processing it is incremented per each frame that
driver consumed. This can result in excessive buffers requested from xsk
pool based on that value.
To address this, just drop cleaned_count and pass
ICE_DESC_UNUSED(rx_ring) directly as a function argument to
ice_alloc_rx_bufs_zc(). Idea is to ask for buffers as many as consumed.
Let us also call ice_alloc_rx_bufs_zc unconditionally at the end of
ice_clean_rx_irq_zc. This has been changed in that way for corresponding
ice_clean_rx_irq, but not here.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Commit ac6f733a7bd5 ("ice: allow empty Rx descriptors") stated that ice
HW can produce empty descriptors that are valid and they should be
processed.
Add this support to xsk ZC path to avoid potential processing problems.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The descriptor that ntu is pointing at when we exit
ice_alloc_rx_bufs_zc() should not have its corresponding DD bit cleared
as descriptor is not allocated in there and it is not valid for HW
usage.
The allocation routine at the entry will fill the descriptor that ntu
points to after it was set to ntu + nb_buffs on previous call.
Even the spec says:
"The tail pointer should be set to one descriptor beyond the last empty
descriptor in host descriptor ring."
Therefore, step away from clearing the status_error0 on ntu + nb_buffs
descriptor.
Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface")
Reported-by: Elza Mathew <elza.mathew@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The 'if (ntu == rx_ring->count)' block in ice_alloc_rx_buffers_zc()
was previously residing in the loop, but after introducing the
batched interface it is used only to wrap-around the NTU descriptor,
thus no more need to assign 'xdp'.
Fixes: db804cfc21e9 ("ice: Use the xsk batched rx allocation interface")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Currently, the zero-copy data path is reusing the memory region that was
initially allocated for an array of struct ice_rx_buf for its own
purposes. This is error prone as it is based on the ice_rx_buf struct
always being the same size or bigger than what the zero-copy path needs.
There can also be old values present in that array giving rise to errors
when the zero-copy path uses it.
Fix this by freeing the ice_rx_buf region and allocating a new array for
the zero-copy path that has the right length and is initialized to zero.
Fixes: 57f7f8b6bc0b ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Pull drm fixes from Dave Airlie:
"Mostly amdgpu fixes this week scattered around the driver, otherwise
one i915, one ast, one simpledrm. There is a revert in the fb-helper
for places userspace was using a string that we tried to change.
i915:
- Fix a bound check in the DMC fw load.
ast:
- NULL ptr deref fix
simpledrm:
- pixel clock units fix
fb-helper:
- userspace regression revert
amdgpu:
- Fix RLC register offset
- GMC fix
- Properly cache SMU FW version on Yellow Carp
- Fix missing callback on DCN3.1
- Reset DMCUB before HW init
- Fix for GMC powergating on PCO
- Fix a possible memory leak in GPU metrics table handling on RN"
* tag 'drm-fixes-2021-12-17-1' of git://anongit.freedesktop.org/drm/drm:
drm/amd/pm: fix a potential gpu_metrics_table memory leak
drm/amdgpu: correct the wrong cached state for GMC on PICASSO
drm/amd/display: Reset DMCUB before HW init
drm/amd/display: Set exit_optimized_pwr_state for DCN31
drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
drm/amdgpu: don't override default ECO_BITs setting
drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE
drm/i915/display: Fix an unsigned subtraction which can never be negative.
drm/ast: potential dereference of null pointer
drm: simpledrm: fix wrong unit with pixel clock
Revert "drm/fb-helper: improve DRM fbdev emulation device names"
|
|
Currently we only NULL the xdp_buff pointer in the internal SW ring but
we never give it back to the xsk buffer pool. This means that buffers
can be leaked out of the buff pool and never be used again.
Add missing xsk_buff_free() call to the routine that is supposed to
clean the entries that are left in the ring so that these buffers in the
umem can be used by other sockets.
Also, only go through the space that is actually left to be cleaned
instead of a whole ring.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Some systems (e.g. Hyper-V guests) have all their memory marked as
hotpluggable in SRAT. acpi_numa_memory_affinity_init(), however,
ignores all such regions when !CONFIG_MEMORY_HOTPLUG and this is
unfortunate as memory affinity (NUMA) information gets lost.
'Hot Pluggable' flag in SRAT only means that "system hardware supports
hot-add and hot-remove of this memory region", it doesn't prevent
memory from being cold-plugged there.
Ignore 'Hot Pluggable' bit instead of skipping the whole memory
affinity information when !CONFIG_MEMORY_HOTPLUG.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
ACPICA code takes care about cache flushing on S1/S2/S3 in
acpi_hw_extended_sleep() and acpi_hw_legacy_sleep().
acpi_suspend_enter() calls into ACPICA code via acpi_enter_sleep_state()
for S1 or x86_acpi_suspend_lowlevel() for S3.
acpi_sleep_prepare() call tree:
__acpi_pm_prepare()
acpi_pm_prepare()
acpi_suspend_ops::prepare_late()
acpi_hibernation_ops::pre_snapshot()
acpi_hibernation_ops::prepare()
acpi_suspend_begin_old()
acpi_suspend_begin_old::begin()
acpi_hibernation_begin_old()
acpi_hibernation_ops_old::acpi_hibernation_begin_old()
acpi_power_off_prepare()
pm_power_off_prepare()
Hibernation (S4) and Power Off (S5) don't require cache flushing, so
the only interesting callsites are acpi_suspend_ops::prepare_late()
and acpi_suspend_begin_old::begin(). Both of them have cache flush
on ->enter() operation in acpi_suspend_enter().
Remove redundant ACPI_FLUSH_CPU_CACHE() in acpi_sleep_prepare() and
acpi_suspend_enter().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
According to ACPI 6.4, Section 8.2, CPU cache flushing required on
entering the C3 power state.
Avoid flushing the cache on entering other C-states.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|