Age | Commit message (Collapse) | Author |
|
Now that the bundle->channel_lock has been eliminated, we don't need the
dummy service bundle anymore. It's purpose was purely to provide the
channel_lock for service connections.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
Change the rx_packet tracepoint to display the securityIndex from the
packet header instead of displaying the type in numeric form. There's no
need for the latter, as the display of the type in symbolic form will fall
back automatically to displaying the hex value if no symbol is available.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
Set a reason when freeing a packet that has been consumed such that
dropwatch doesn't complain that it has been dropped.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
We no longer need local->defrag_sem as all DATA packet transmission is now
done from one thread, so remove it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
call->tx_buffer is now only accessed within the I/O thread (->tx_sendmsg is
the way sendmsg passes packets to the I/O thread) so there's no need to
lock around it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
Now that general ACK transmission is done from the same thread as incoming
DATA packet wrangling, there's no possibility that the SACK table will be
being updated by the latter whilst the former is trying to copy it to an
ACK.
This means that we can safely rotate the SACK table whilst updating it
without having to take a lock, rather than keeping all the bits inside it
in fixed place and copying and then rotating it in the transmitter.
Therefore, simplify SACK handing by keeping track of starting point in the
ring and rotate slots down as we consume them.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
call->ackr_window doesn't need to be atomic as ACK generation and ACK
transmission are now done in the same thread, so drop the atomic64 handling
and split it into two separate members.
Similarly, call->ackr_nr_unacked doesn't need to be atomic now either.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
When doing a call that has a single transmitted data packet and a massive
amount of received data packets, we only ping for one RTT sample, which
means we don't get a good reading on it.
Fix this by converting occasional IDLE ACKs into PING ACKs to elicit a
response.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
If CONFIG_AF_RXRPC_DEBUG_RX_DELAY=y, then a delay is injected between
packets and errors being received and them being made available to the
processing code, thereby allowing the RTT to be artificially increased.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
Convert call->recvmsg_lock to a spinlock as it's only ever write-locked.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
|
|
Shrink the tabulation in the rxrpc trace header a bit to allow for fields
with long type names that have been removed.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Work around checkpatch warnings in the rxrpc trace header by removing
whitespace before ')' on lines defining the trace record struct.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Looks like kunit_test_init_section_suites(...) was messed up in a merge
conflict. This fixes it.
kunit_test_init_section_suites(...) was not updated to avoid the extra
level of indirection when .kunit_test_suites was flattened. Given no-one
was actively using it, this went unnoticed for a long period of time.
Fixes: e5857d396f35 ("kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites")
Signed-off-by: Brendan Higgins <brendan.higgins@linux.dev>
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Martin Fernandez <martin.fernandez@eclypsium.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
The mailing list at librecores.org is being shut down due to
infrastructure issues. Update the the newly created list on
vger.kernel.org.
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
When validating drafted SPDK ublk target, in a case that
assigning large queue depth to multiqueue ublk device,
ublk target would run into a weird incorrect state. During
rounds of review and debug, An overflow bug was found
in ublk driver.
In ublk_cmd.h, UBLK_MAX_QUEUE_DEPTH is 4096 which means
each ublk queue depth can be set as large as 4096. But
when setting qd for a ublk device,
sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io)
will be larger than 65535 if qd is larger than 2728.
Then queue_size is overflowed, and ublk_get_queue()
references a wrong pointer position. The wrong content of
ublk_queue elements will lead to out-of-bounds memory
access.
Extend queue_size in ublk_device as "unsigned int".
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131070552.115067-1-xiaodong.liu@intel.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
After a call to console_unlock() in vcs_read() the vc_data struct can be
freed by vc_deallocate(). Because of that, the struct vc_data pointer
load must be done at the top of while loop in vcs_read() to avoid a UAF
when vcs_size() is called.
Syzkaller reported a UAF in vcs_size().
BUG: KASAN: use-after-free in vcs_size (drivers/tty/vt/vc_screen.c:215)
Read of size 4 at addr ffff8881137479a8 by task 4a005ed81e27e65/1537
CPU: 0 PID: 1537 Comm: 4a005ed81e27e65 Not tainted 6.2.0-rc5 #1
Hardware name: Red Hat KVM, BIOS 1.15.0-2.module
Call Trace:
<TASK>
__asan_report_load4_noabort (mm/kasan/report_generic.c:350)
vcs_size (drivers/tty/vt/vc_screen.c:215)
vcs_read (drivers/tty/vt/vc_screen.c:415)
vfs_read (fs/read_write.c:468 fs/read_write.c:450)
...
</TASK>
Allocated by task 1191:
...
kmalloc_trace (mm/slab_common.c:1069)
vc_allocate (./include/linux/slab.h:580 ./include/linux/slab.h:720
drivers/tty/vt/vt.c:1128 drivers/tty/vt/vt.c:1108)
con_install (drivers/tty/vt/vt.c:3383)
tty_init_dev (drivers/tty/tty_io.c:1301 drivers/tty/tty_io.c:1413
drivers/tty/tty_io.c:1390)
tty_open (drivers/tty/tty_io.c:2080 drivers/tty/tty_io.c:2126)
chrdev_open (fs/char_dev.c:415)
do_dentry_open (fs/open.c:883)
vfs_open (fs/open.c:1014)
...
Freed by task 1548:
...
kfree (mm/slab_common.c:1021)
vc_port_destruct (drivers/tty/vt/vt.c:1094)
tty_port_destructor (drivers/tty/tty_port.c:296)
tty_port_put (drivers/tty/tty_port.c:312)
vt_disallocate_all (drivers/tty/vt/vt_ioctl.c:662 (discriminator 2))
vt_ioctl (drivers/tty/vt/vt_ioctl.c:903)
tty_ioctl (drivers/tty/tty_io.c:2776)
...
The buggy address belongs to the object at ffff888113747800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 424 bytes inside of
1024-byte region [ffff888113747800, ffff888113747c00)
The buggy address belongs to the physical page:
page:00000000b3fe6c7c refcount:1 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x113740
head:00000000b3fe6c7c order:3 compound_mapcount:0 subpages_mapcount:0
compound_pincount:0
anon flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0010200 ffff888100042dc0 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888113747880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888113747900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888113747980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888113747a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888113747a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Disabling lock debugging due to kernel taint
Fixes: ac751efa6a0d ("console: rename acquire/release_console_sem() to console_lock/unlock()")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Suggested-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Link: https://lore.kernel.org/r/1674577014-12374-1-git-send-email-george.kennedy@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Replace the enable_irq_wake() call with one to dev_pm_set_wake_irq()
instead. This will let the dev PM framework automatically manage the
the wakeup capability of the ipa IRQ and ensure that userspace requests
to enable/disable wakeup for the IPA via sysfs are respected.
Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20230127202758.2913612-1-caleb.connolly@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There is no bug. If sch->length == 0, this would result in an infinite
loop, but first caller, do_basic_checks(), errors out in this case.
After this change, packets with bogus zero-length chunks are no longer
detected as invalid, so revert & add comment wrt. 0 length check.
Fixes: 98ee00774525 ("netfilter: conntrack: fix bug in for_each_sctp_chunk")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When using a xfrm interface in a bridged setup (the outgoing device is
bridged), the incoming packets in the xfrm interface are only tracked
in the outgoing direction.
$ brctl show
bridge name interfaces
br_eth1 eth1
$ conntrack -L
tcp 115 SYN_SENT src=192... dst=192... [UNREPLIED] ...
If br_netfilter is enabled, the first (encrypted) packet is received onR
eth1, conntrack hooks are called from br_netfilter emulation which
allocates nf_bridge info for this skb.
If the packet is for local machine, skb gets passed up the ip stack.
The skb passes through ip prerouting a second time. br_netfilter
ip_sabotage_in supresses the re-invocation of the hooks.
After this, skb gets decrypted in xfrm layer and appears in
network stack a second time (after decryption).
Then, ip_sabotage_in is called again and suppresses netfilter
hook invocation, even though the bridge layer never called them
for the plaintext incarnation of the packet.
Free the bridge info after the first suppression to avoid this.
I was unable to figure out where the regression comes from, as far as i
can see br_netfilter always had this problem; i did not expect that skb
is looped again with different headers.
Fixes: c4b0e771f906 ("netfilter: avoid using skb->nf_bridge directly")
Reported-and-tested-by: Wolfgang Nothdurft <wolfgang@linogate.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When NET_DSA_MICROCHIP_KSZ_COMMON is built-in but PTP is a loadable
module, the ksz_ptp support still causes a link failure:
ld.lld-16: error: undefined symbol: ptp_clock_index
>>> referenced by ksz_ptp.c
>>> drivers/net/dsa/microchip/ksz_ptp.o:(ksz_get_ts_info) in archive vmlinux.a
This can happen if NET_DSA_MICROCHIP_KSZ8863_SMI is enabled, or
even if none of the KSZ9477_I2C/KSZ_SPI/KSZ8863_SMI ones are active
but only the common module is.
The most straightforward way to address this is to move the
dependency to NET_DSA_MICROCHIP_KSZ_PTP itself, which can now
only be enabled if both PTP_1588_CLOCK support is reachable
from NET_DSA_MICROCHIP_KSZ_COMMON. Alternatively, one could make
NET_DSA_MICROCHIP_KSZ_COMMON a hidden Kconfig symbol and extend the
PTP_1588_CLOCK_OPTIONAL dependency to NET_DSA_MICROCHIP_KSZ8863_SMI as
well, but that is a little more fragile.
Fixes: eac1ea20261e ("net: dsa: microchip: ptp: add the posix clock support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230130131808.1084796-1-arnd@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Correct spelling problems for Documentation/networking/ as reported
by codespell.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Jiri Pirko <jiri@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20230129231053.20863-5-rdunlap@infradead.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In kernels compiled with CONFIG_PARAVIRT=n, the compiler re-orders the
DR7 read in exc_nmi() to happen before the call to sev_es_ist_enter().
This is problematic when running as an SEV-ES guest because in this
environment the DR7 read might cause a #VC exception, and taking #VC
exceptions is not safe in exc_nmi() before sev_es_ist_enter() has run.
The result is stack recursion if the NMI was caused on the #VC IST
stack, because a subsequent #VC exception in the NMI handler will
overwrite the stack frame of the interrupted #VC handler.
As there are no compiler barriers affecting the ordering of DR7
reads/writes, make the accesses to this register volatile, forbidding
the compiler to re-order them.
[ bp: Massage text, make them volatile too, to make sure some
aggressive compiler optimization pass doesn't discard them. ]
Fixes: 315562c9af3d ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler")
Reported-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230127035616.508966-1-aik@amd.com
|
|
If a relocatable kernel is loaded at a non-zero address and told not to
relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the
interrupt code at zero is left with RWX permissions.
That is a security weakness, and leads to a warning at boot if
CONFIG_DEBUG_WX is enabled:
powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000
WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries
NIP: c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000
REGS: c000000003503770 TRAP: 0700 Not tainted (6.2.0-rc1-00001-g8ae8e98aea82-dirty)
MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24000220 XER: 00000000
CFAR: c000000000545a58 IRQMASK: 0
...
NIP note_page+0x484/0x4c0
LR note_page+0x480/0x4c0
Call Trace:
note_page+0x480/0x4c0 (unreliable)
ptdump_pmd_entry+0xc8/0x100
walk_pgd_range+0x618/0xab0
walk_page_range_novma+0x74/0xc0
ptdump_walk_pgd+0x98/0x170
ptdump_check_wx+0x94/0x100
mark_rodata_ro+0x30/0x70
kernel_init+0x78/0x1a0
ret_from_kernel_thread+0x5c/0x64
The fix has two parts. Firstly the pages from zero up to the end of
interrupts need to be marked read-only, so that they are left with R-X
permissions. Secondly the mapping logic needs to be taught to ensure
there is a page boundary at the end of the interrupt region, so that the
permission change only applies to the interrupt text, and not the region
following it.
Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-2-mpe@ellerman.id.au
|
|
If a relocatable kernel is loaded at an address that is not 2MB aligned
and told not to relocate to zero, the kernel can crash due to
mark_rodata_ro() incorrectly changing some read-write data to read-only.
Scenarios where the misalignment can occur are when the kernel is
loaded by kdump or using the RELOCATABLE_TEST config option.
Example crash with the kernel loaded at 5MB:
Run /sbin/init as init process
BUG: Unable to handle kernel data access on write at 0xc000000000452000
Faulting instruction address: 0xc0000000005b6730
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
NIP: c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
REGS: c000000004503250 TRAP: 0300 Not tainted (6.2.0-rc1-00011-g349188be4841)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 44288480 XER: 00000000
CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
...
NIP memset+0x68/0x104
LR zero_user_segments.constprop.0+0xa8/0xf0
Call Trace:
ext4_mpage_readpages+0x7f8/0x830
ext4_readahead+0x48/0x60
read_pages+0xb8/0x380
page_cache_ra_unbounded+0x19c/0x250
filemap_fault+0x58c/0xae0
__do_fault+0x60/0x100
__handle_mm_fault+0x1230/0x1a40
handle_mm_fault+0x120/0x300
___do_page_fault+0x20c/0xa80
do_page_fault+0x30/0xc0
data_access_common_virt+0x210/0x220
This happens because mark_rodata_ro() tries to change permissions on the
range _stext..__end_rodata, but _stext sits in the middle of the 2MB
page from 4MB to 6MB:
radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)
The logic that changes the permissions assumes the linear mapping was
split correctly at boot, so it marks the entire 2MB page read-only. That
leads to the write fault above.
To fix it, the boot time mapping logic needs to consider that if the
kernel is running at a non-zero address then _stext is a boundary where
it must split the mapping.
That leads to the mapping being split correctly, allowing the rodata
permission change to take happen correctly, with no spillover:
radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)
If the kernel is loaded at a 2MB aligned address, the mapping continues
to use 2MB pages as before:
radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages
Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au
|
|
In kexec_extra_fdt_size_ppc64() there's logic to estimate how much
extra space will be needed in the device tree for some memory related
properties.
That logic uses the size of RAM divided by drmem_lmb_size() to do the
estimation. However drmem_lmb_size() can be zero if the machine has no
hotpluggable memory configured, which is the case when booting with qemu
and no maxmem=x parameter is passed (the default).
The division by zero is reported by UBSAN, and can also lead to an
overflow and a warning from kvmalloc, and kdump kernel loading fails:
WARNING: CPU: 0 PID: 133 at mm/util.c:596 kvmalloc_node+0x15c/0x160
Modules linked in:
CPU: 0 PID: 133 Comm: kexec Not tainted 6.2.0-rc5-03455-g07358bd97810 #223
Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,git-dd0dca pSeries
NIP: c00000000041ff4c LR: c00000000041fe58 CTR: 0000000000000000
REGS: c0000000096ef750 TRAP: 0700 Not tainted (6.2.0-rc5-03455-g07358bd97810)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24248242 XER: 2004011e
CFAR: c00000000041fed0 IRQMASK: 0
...
NIP kvmalloc_node+0x15c/0x160
LR kvmalloc_node+0x68/0x160
Call Trace:
kvmalloc_node+0x68/0x160 (unreliable)
of_kexec_alloc_and_setup_fdt+0xb8/0x7d0
elf64_load+0x25c/0x4a0
kexec_image_load_default+0x58/0x80
sys_kexec_file_load+0x5c0/0x920
system_call_exception+0x128/0x330
system_call_vectored_common+0x15c/0x2ec
To fix it, skip the calculation if drmem_lmb_size() is zero.
Fixes: 2377c92e37fe ("powerpc/kexec_file: fix FDT size estimation for kdump kernel")
Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230130014707.541110-1-mpe@ellerman.id.au
|
|
As DMA Rx can be completed from two places, it is possible that DMA Rx
completes before DMA completion callback had a chance to complete it.
Once the previous DMA Rx has been completed, a new one can be started
on the next UART interrupt. The following race is possible
(uart_unlock_and_check_sysrq_irqrestore() replaced with
spin_unlock_irqrestore() for simplicity/clarity):
CPU0 CPU1
dma_rx_complete()
serial8250_handle_irq()
spin_lock_irqsave(&port->lock)
handle_rx_dma()
serial8250_rx_dma_flush()
__dma_rx_complete()
dma->rx_running = 0
// Complete DMA Rx
spin_unlock_irqrestore(&port->lock)
serial8250_handle_irq()
spin_lock_irqsave(&port->lock)
handle_rx_dma()
serial8250_rx_dma()
dma->rx_running = 1
// Setup a new DMA Rx
spin_unlock_irqrestore(&port->lock)
spin_lock_irqsave(&port->lock)
// sees dma->rx_running = 1
__dma_rx_complete()
dma->rx_running = 0
// Incorrectly complete
// running DMA Rx
This race seems somewhat theoretical to occur for real but handle it
correctly regardless. Check what is the DMA status before complething
anything in __dma_rx_complete().
Reported-by: Gilles BULOZ <gilles.buloz@kontron.com>
Tested-by: Gilles BULOZ <gilles.buloz@kontron.com>
Fixes: 9ee4b83e51f7 ("serial: 8250: Add support for dmaengine")
Cc: stable@vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230130114841.25749-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
__dma_rx_complete() is called from two places:
- Through the DMA completion callback dma_rx_complete()
- From serial8250_rx_dma_flush() after IIR_RLSI or IIR_RX_TIMEOUT
The former does not hold port's lock during __dma_rx_complete() which
allows these two to race and potentially insert the same data twice.
Extend port's lock coverage in dma_rx_complete() to prevent the race
and check if the DMA Rx is still pending completion before calling
into __dma_rx_complete().
Reported-by: Gilles BULOZ <gilles.buloz@kontron.com>
Tested-by: Gilles BULOZ <gilles.buloz@kontron.com>
Fixes: 9ee4b83e51f7 ("serial: 8250: Add support for dmaengine")
Cc: stable@vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230130114841.25749-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Requesting an interrupt with IRQF_ONESHOT will run the primary handler
in the hard-IRQ context even in the force-threaded mode. The
force-threaded mode is used by PREEMPT_RT in order to avoid acquiring
sleeping locks (spinlock_t) in hard-IRQ context. This combination
makes it impossible and leads to "sleeping while atomic" warnings.
Use one interrupt handler for both handlers (primary and secondary)
and drop the IRQF_ONESHOT flag which is not needed.
Fixes: e359b4411c283 ("serial: stm32: fix threaded interrupt handling")
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Valentin Caron <valentin.caron@foss.st.com> # V3
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230120160332.57930-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
"1st set of IIO fixes for the 6.2 cycle.
The usual mixed bag - with a bunch of issues found by Carlos Song
in the fxos8700 IMU driver dominating.
hid-accel,gyro
- Fix wrong returned value when read succeeds.
marvell,berlin-adc
- Missing of_node_put() in an error path.
nxp,fxos8700 (freescale)
- Wrong channel type match.
- Swapped channel read back.
- Incomplete channel read back (not enough bytes).
- Missing shift of acceleration data.
- Range selection didn't work (datasheet bug)
- Wrong ODR mode read back due to wrong field offset.
- Drop unused, but wrong define.
- Fix issue with magnetometer scale an units.
nxp,imx8qxp
- Fix an irq flood due to not reading data early enough.
st,lsm6dsx
- Add CONFIG_IIO_TRIGGERED_BUFFER select.
st,stm32-adc
- Fix missing MODULE_DEVICE_TABLE() needed for module aliases.
ti,twl6030
- Fix missing enable of some channels.
- Fix a typo in previous patch that meant one channel still wasn't enabled.
xilinx,xadc
- Carrying on incorrectly after allocation error."
* tag 'iio-fixes-for-6.2a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: imu: fxos8700: fix MAGN sensor scale and unit
iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN
iio: imu: fxos8700: fix failed initialization ODR mode assignment
iio: imu: fxos8700: fix incorrect ODR mode readback
iio: light: cm32181: Fix PM support on system with 2 I2C resources
iio: hid: fix the retval in gyro_3d_capture_sample
iio: hid: fix the retval in accel_3d_capture_sample
iio: imu: st_lsm6dsx: fix build when CONFIG_IIO_TRIGGERED_BUFFER=m
iio:adc:twl6030: Enable measurement of VAC
iio: imu: fxos8700: fix ACCEL measurement range selection
iio: imu: fxos8700: fix IMU data bits returned to user space
iio: imu: fxos8700: fix incomplete ACCEL and MAGN channels readback
iio: imu: fxos8700: fix swapped ACCEL and MAGN channels readback
iio: imu: fxos8700: fix map label of channel type to MAGN sensor
iio:adc:twl6030: Enable measurements of VUSB, VBAT and others
iio: imx8qxp-adc: fix irq flood when call imx8qxp_adc_read_raw()
iio: adc: xilinx-ams: fix devm_krealloc() return value check
iio: adc: berlin2-adc: Add missing of_node_put() in error path
iio: adc: stm32-dfsdm: fill module aliases
|
|
Nothing was explicitly bounds checking the priority index used to access
clpriop[]. WARN and bail out early if it's pathological. Seen with GCC 13:
../net/sched/sch_htb.c: In function 'htb_activate_prios':
../net/sched/sch_htb.c:437:44: warning: array subscript [0, 31] is outside array bounds of 'struct htb_prio[8]' [-Warray-bounds=]
437 | if (p->inner.clprio[prio].feed.rb_node)
| ~~~~~~~~~~~~~~~^~~~~~
../net/sched/sch_htb.c:131:41: note: while referencing 'clprio'
131 | struct htb_prio clprio[TC_HTB_NUMPRIO];
| ^~~~~~
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Link: https://lore.kernel.org/r/20230127224036.never.561-kees@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There doesn't appear to be a reason to truncate the allocation used for
flow_info, so do a full allocation and remove the unused empty struct.
GCC does not like having a reference to an object that has been
partially allocated, as bounds checking may become impossible when
such an object is passed to other code. Seen with GCC 13:
../drivers/net/ethernet/mediatek/mtk_ppe.c: In function 'mtk_foe_entry_commit_subflow':
../drivers/net/ethernet/mediatek/mtk_ppe.c:623:18: warning: array subscript 'struct mtk_flow_entry[0]' is partly outside array bounds of 'unsigned char[48]' [-Warray-bounds=]
623 | flow_info->l2_data.base_flow = entry;
| ^~
Cc: Felix Fietkau <nbd@nbd.name>
Cc: John Crispin <john@phrozen.org>
Cc: Sean Wang <sean.wang@mediatek.com>
Cc: Mark Lee <Mark-MC.Lee@mediatek.com>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mediatek@lists.infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230127223853.never.014-kees@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Previously, ibmvnic IRQs were assigned to CPU numbers by assigning all
the IRQs for transmit queues then assigning all the IRQs for receive
queues. With multi-threaded processors, in a heavy RX or TX environment,
physical cores would either be overloaded or underutilized (due to the
IRQ assignment algorithm). This approach is sub-optimal because IRQs for
the same subprocess (RX or TX) would be bound to adjacent CPU numbers,
meaning they were more likely to be contending for the same core.
For example, in a system with 64 CPU's and 32 queues, the IRQs would
be bound to CPU in the following pattern:
IRQ type | CPU number
-----------------------
TX0 | 0-1
TX1 | 2-3
<etc>
RX0 | 32-33
RX1 | 34-35
<etc>
Observe that in SMT-8, the first 4 tx queues would be sharing the
same core.
A more optimal algorithm would balance the number RX and TX IRQ's across
the physical cores. Therefore, to increase performance, distribute RX and
TX IRQs across cores by alternating between assigning IRQs for RX and TX
queues to CPUs.
With a system with 64 CPUs and 32 queues, this results in the following
pattern:
IRQ type | CPU number
-----------------------
TX0 | 0-1
RX0 | 2-3
TX1 | 4-5
RX1 | 6-7
<etc>
Observe that in SMT-8, there is equal distribution of RX and TX IRQs
per core. In the above case, each core handles 2 TX and 2 RX IRQ's.
Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Reviewed-by: Haren Myneni <haren@linux.ibm.com>
Link: https://lore.kernel.org/r/20230127214358.318152-1-nnac123@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When CONFIG_MODULE_SIG_KEY is PKCS#11 URI (pkcs11:*), signing of modules
fails:
scripts/sign-file sha256 /.../linux/pkcs11:token=foo;object=bar;pin-value=1111 certs/signing_key.x509 /.../kernel/crypto/tcrypt.ko
Usage: scripts/sign-file [-dp] <hash algo> <key> <x509> <module> [<dest>]
scripts/sign-file -s <raw sig> <hash algo> <x509> <module> [<dest>]
First, we need to avoid adding the $(srctree)/ prefix to the URL.
Second, since the kconfig string values no longer include quotes, we need to add
them again when passing a PKCS#11 URI to sign-file. This avoids
splitting by the shell if the URI contains semicolons.
Fixes: 4db9c2e3d055 ("kbuild: stop using config_filename in scripts/Makefile.modsign")
Fixes: 129ab0d2d9f3 ("kbuild: do not quote string values in include/config/auto.conf")
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
When CONFIG_MODULE_SIG_KEY is PKCS#11 URI (pkcs11:*) and contains a
semicolon, signing_key.x509 fails to build:
certs/extract-cert pkcs11:token=foo;object=bar;pin-value=1111 certs/signing_key.x509
Usage: extract-cert <source> <dest>
Add quotes to the extract-cert argument to avoid splitting by the shell.
This approach was suggested by Masahiro Yamada <masahiroy@kernel.org>.
Fixes: 129ab0d2d9f3 ("kbuild: do not quote string values in include/config/auto.conf")
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Smatch static analysis tool detects that acquired lock is not released
in hwdep device when condition branch is passed due to no event. It is
unlikely to occur, while fulfilling is preferable for better coding.
Reported-by: Dan Carpenter <error27@gmail.com>
Fixes: 634ec0b2906e ("ALSA: firewire-motu: notify event for parameter change in register DSP model")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20230130141540.102854-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:
====================
ieee802154 for net 2023-01-30
Only one fix this time around.
Miquel Raynal fixed a potential double free spotted by Dan Carpenter.
* tag 'ieee802154-for-net-2023-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
mac802154: Fix possible double free upon parsing error
====================
Link: https://lore.kernel.org/r/20230130095646.301448-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Colin Foster says:
====================
add support for the the vsc7512 internal copper phys
This patch series is a continuation to add support for the VSC7512:
https://patchwork.kernel.org/project/netdevbpf/list/?series=674168&state=*
That series added the framework and initial functionality for the
VSC7512 chip. Several of these patches grew during the initial
development of the framework, which is why v1 will include changelogs.
It was during v9 of that original MFD patch set that these were dropped.
With that out of the way, the VSC7512 is mainly a subset of the VSC7514
chip. The 7512 lacks an internal MIPS processor, but otherwise many of
the register definitions are identical. That is why several of these
patches are simply to expose common resources from
drivers/net/ethernet/mscc/*.
This patch only adds support for the first four ports (swp0-swp3). The
remaining ports require more significant changes to the felix driver,
and will be handled in the future.
====================
Link: https://lore.kernel.org/r/20230127193559.1001051-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Utilize the existing ocelot MFD interface to add switch functionality to
the Microsemi VSC7512 chip.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Acked-for-MFD-by: Lee Jones <lee@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add control of an external VSC7512 chip.
Currently the four copper phy ports are fully functional. Communication to
external phys is also functional, but the SGMII / QSGMII interfaces are
currently non-functional.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The main purpose of the Ocelot chips are the Ethernet switching
functionalities. Document the support for these features.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VSC7511, VSC7512, VSC7513 and VSC7514 all have the ability to be
controlled either internally by a memory-mapped CPU, or externally via
interfaces like SPI and PCIe. The internal CPU of the VSC7511 and 7512
don't have the resources to run Linux, so must be controlled via these
external interfaces in a DSA configuration.
Add mscc,vsc7512-switch compatible string to indicate that the chips are
being controlled externally in a DSA configuration.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The *_RES_SIZE macros are initally <= 0x100. Future resource sizes will be
upwards of 0x200000 in size.
To keep things clean, fully align the RES_SIZE macros to 32-bit to do
nothing more than make the code more consistent.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Acked-for-MFD-by: Lee Jones <lee@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the Felix driver would probe the ports and verify functionality, it
would fail if it hit single port mode that wasn't supported by the driver.
The initial case for the VSC7512 driver will have physical ports that
exist, but aren't supported by the driver implementation. Add the
OCELOT_PORT_MODE_NONE macro to handle this scenario, and allow the Felix
driver to continue with all the ports that are currently functional.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The architecture around the VSC7512 differs from existing felix drivers. In
order to add support for all the chip's features (pinctrl, MDIO, gpio) the
device had to be laid out as a multi-function device (MFD).
One difference between an MFD and a standard platform device is that the
regmaps are allocated to the parent device before the child devices are
probed. As such, there is no need for felix to initialize new regmaps in
these configurations, they can simply be requested from the parent device.
Add support for MFD configurations by performing this request from the
parent device.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The define FELIX_MAC_QUIRKS was used directly in the felix.c shared driver.
Other devices (VSC7512 for example) don't require the same quirks, so they
need to be configured on a per-device basis.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VSC7514 target regmap is identical for ones shared with similar
hardware, specifically the VSC7512. Share this resource, and change the
name to match the pattern of other exported resources.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Resetting the switch core is the same whether it is done internally or
externally. Move this routine to the ocelot library so it can be used by
other drivers.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The vcap_props structure is common to other devices, specifically the
VSC7512 chip that can only be controlled externally. Export this structure
so it doesn't need to be recreated.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ocelot_regfields struct is common between several different chips, some
of which can only be controlled externally. Export this structure so it
doesn't have to be duplicated in these other drivers.
Rename the structure as well, to follow the conventions of other shared
resources.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Expose ocelot_wm functions so they can be shared with other drivers.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|