Age | Commit message (Collapse) | Author |
|
The CHANNEL_NOT_RUNNING error condition has been generalized, so
rename it to be INCORRECT_CHANNEL_STATE.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In gsi_channel_update(), a reference count is taken on the last
completed transaction "to keep it from completing" before we give
the event back to the hardware. Completion processing for that
transaction (and any other "new" ones) will not occur until after
this function returns, so there's no risk it completing early. So
there's no need to take and drop the additional transaction
reference.
Use local variables in the call to gsi_evt_ring_doorbell().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit
47f33de4aafb ("x86/sev: Mark the code returning to user space as syscall gap")
added a bunch of text references without annotating them, resulting in a
spree of objtool complaints:
vmlinux.o: warning: objtool: vc_switch_off_ist+0x77: relocation to !ENDBR: entry_SYSCALL_64+0x15c
vmlinux.o: warning: objtool: vc_switch_off_ist+0x8f: relocation to !ENDBR: entry_SYSCALL_compat+0xa5
vmlinux.o: warning: objtool: vc_switch_off_ist+0x97: relocation to !ENDBR: .entry.text+0x21ea
vmlinux.o: warning: objtool: vc_switch_off_ist+0xef: relocation to !ENDBR: .entry.text+0x162
vmlinux.o: warning: objtool: __sev_es_ist_enter+0x60: relocation to !ENDBR: entry_SYSCALL_64+0x15c
vmlinux.o: warning: objtool: __sev_es_ist_enter+0x6c: relocation to !ENDBR: .entry.text+0x162
vmlinux.o: warning: objtool: __sev_es_ist_enter+0x8a: relocation to !ENDBR: entry_SYSCALL_compat+0xa5
vmlinux.o: warning: objtool: __sev_es_ist_enter+0xc1: relocation to !ENDBR: .entry.text+0x21ea
Since these text references are used to compare against IP, and are not
an indirect call target, they don't need ENDBR so annotate them away.
Fixes: 47f33de4aafb ("x86/sev: Mark the code returning to user space as syscall gap")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220520082604.GQ2578@worktop.programming.kicks-ass.net
|
|
Add an explicit dependency to the respective CPU vendor so that the
respective microcode support for it gets built only when that support is
enabled.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/8ead0da9-9545-b10d-e3db-7df1a1f219e4@infradead.org
|
|
Remove the superfluous judgment since the function is
never called for a root cgroup, as suggested by Tejun.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
https://gitlab.freedesktop.org/abhinavk/msm into drm-next
5.19 fixes for msm-next
- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a big-time
hitter in the CI-runs.
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
A device tree binding change for Rockchip VOP2
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220519080556.42p52cya4u6y3kps@houat
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"Fix a regression in a recent fix to qcom-rng"
* tag 'v5.18-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: qcom-rng - fix infinite loop on requests not multiple of WORD_SZ
|
|
omap_rom_rng_runtime_resume()
'ddata->clk' is enabled by clk_prepare_enable(), it should be disabled
by clk_disable_unprepare().
Fixes: 8d9d4bdc495f ("hwrng: omap3-rom - Use runtime PM instead of custom functions")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Should not to uses the CRYPTO_ALG_ALLOCATES_MEMORY in SEC2. The SEC2
driver uses the pre-allocated buffers, including the src sgl pool, dst
sgl pool and other qp ctx resources. (e.g. IV buffer, mac buffer, key
buffer). The SEC2 driver doesn't allocate memory during request processing.
The driver only maps software sgl to allocated hardware sgl during I/O. So
here is fix it.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
QAT_401xx is a derivative of 4xxx. Add support for that device in the
qat_4xxx driver by including the DIDs (both PF and VF), extending the
probe and the firmware loader.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Srinivas Kerekare <srinivas.kerekare@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Re-enable the registration of algorithms after fixes to (1) use
pre-allocated buffers in the datapath and (2) support the
CRYPTO_TFM_REQ_MAY_BACKLOG flag.
This reverts commit 8893d27ffcaf6ec6267038a177cb87bcde4dd3de.
Cc: stable@vger.kernel.org
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
If a request has the flag CRYPTO_TFM_REQ_MAY_SLEEP set, allocate memory
using the flag GFP_KERNEL otherwise use GFP_ATOMIC.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Reject requests with a source buffer that is bigger than the size of the
key. This is to prevent a possible integer underflow that might happen
when copying the source scatterlist into a linear buffer.
Cc: stable@vger.kernel.org
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Reject requests with a source buffer that is bigger than the size of the
key. This is to prevent a possible integer underflow that might happen
when copying the source scatterlist into a linear buffer.
Cc: stable@vger.kernel.org
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The functions qat_dh_compute_value() allocates memory with
dma_alloc_coherent() if the source or the destination buffers are made
of multiple flat buffers or of a size that is not compatible with the
hardware.
This memory is then freed with dma_free_coherent() in the context of a
tasklet invoked to handle the response for the corresponding request.
According to Documentation/core-api/dma-api-howto.rst, the function
dma_free_coherent() cannot be called in an interrupt context.
Replace allocations with dma_alloc_coherent() in the function
qat_dh_compute_value() with kmalloc() + dma_map_single().
Cc: stable@vger.kernel.org
Fixes: c9839143ebbf ("crypto: qat - Add DH support")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
After commit f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP"), if
the algorithms are enabled, the driver crashes with a BUG_ON while
executing vunmap() in the context of a tasklet. This is due to the fact
that the function dma_free_coherent() cannot be called in an interrupt
context (see Documentation/core-api/dma-api-howto.rst).
The functions qat_rsa_enc() and qat_rsa_dec() allocate memory with
dma_alloc_coherent() if the source or the destination buffers are made
of multiple flat buffers or of a size that is not compatible with the
hardware.
This memory is then freed with dma_free_coherent() in the context of a
tasklet invoked to handle the response for the corresponding request.
Replace allocations with dma_alloc_coherent() in the functions
qat_rsa_enc() and qat_rsa_dec() with kmalloc() + dma_map_single().
Cc: stable@vger.kernel.org
Fixes: a990532023b9 ("crypto: qat - Add support for RSA algorithm")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When an RSA key represented in form 2 (as defined in PKCS #1 V2.1) is
used, some components of the private key persist even after the TFM is
released.
Replace the explicit calls to free the buffers in qat_rsa_exit_tfm()
with a call to qat_rsa_clear_ctx() which frees all buffers referenced in
the TFM context.
Cc: stable@vger.kernel.org
Fixes: 879f77e9071f ("crypto: qat - Add RSA CRT mode")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The implementations of the crypto algorithms (aead, skcipher, etc) in
the QAT driver do not properly support requests with the
CRYPTO_TFM_REQ_MAY_BACKLOG flag set. If the HW queue is full, the driver
returns -EBUSY but does not enqueue the request. This can result in
applications like dm-crypt waiting indefinitely for the completion of a
request that was never submitted to the hardware.
Fix this by adding a software backlog queue: if the ring buffer is more
than eighty percent full, then the request is enqueued to a backlog
list and the error code -EBUSY is returned back to the caller.
Requests in the backlog queue are resubmitted at a later time, in the
context of the callback of a previously submitted request.
The request for which -EBUSY is returned is then marked as -EINPROGRESS
once submitted to the HW queues.
The submission loop inside the function qat_alg_send_message() has been
modified to decide which submission policy to use based on the request
flags. If the request does not have the CRYPTO_TFM_REQ_MAY_BACKLOG set,
the previous behaviour has been preserved.
Based on a patch by
Vishnu Das Ramachandran <vishnu.dasx.ramachandran@intel.com>
Cc: stable@vger.kernel.org
Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Kyle Sanderson <kyle.leet@gmail.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
All the algorithms in qat_algs.c and qat_asym_algs.c use the same
pattern to submit messages to the HW queues. Move the submission loop
to a new function, qat_alg_send_message(), and share it between the
symmetric and the asymmetric algorithms.
As part of this rework, since the number of retries before returning an
error is inconsistent between the symmetric and asymmetric
implementations, set it to a value that works for both (i.e. 20, was 10
in qat_algs.c and 100 in qat_asym_algs.c)
In addition fix the return code reported when the HW queues are full.
In that case return -ENOSPC instead of -EBUSY.
Including stable in CC since (1) the error code returned if the HW queues
are full is incorrect and (2) to facilitate the backport of the next fix
"crypto: qat - add backlog mechanism".
Cc: stable@vger.kernel.org
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
In order to do DMAs, the QAT device requires that the scatterlist
structures are mapped and translated into a format that the firmware can
understand. This is defined as the composition of a scatter gather list
(SGL) descriptor header, the struct qat_alg_buf_list, plus a variable
number of flat buffer descriptors, the struct qat_alg_buf.
The allocation and mapping of these data structures is done each time a
request is received from the skcipher and aead APIs.
In an OOM situation, this behaviour might lead to a dead-lock if an
allocation fails.
Based on the conversation in [1], increase the size of the aead and
skcipher request contexts to include an SGL descriptor that can handle
a maximum of 4 flat buffers.
If requests exceed 4 entries buffers, memory is allocated dynamically.
[1] https://lore.kernel.org/linux-crypto/20200722072932.GA27544@gondor.apana.org.au/
Cc: stable@vger.kernel.org
Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Set to zero the context buffers containing the DH key before they are
freed.
This is a defense in depth measure that avoids keys to be recovered from
memory in case the system is compromised between the free of the buffer
and when that area of memory (containing keys) gets overwritten.
Cc: stable@vger.kernel.org
Fixes: c9839143ebbf ("crypto: qat - Add DH support")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next, misc
updates and fallout fixes from recent Florian's code rewritting (from
last pull request):
1) Use new flowi4_l3mdev field in ip_route_me_harder(), from Martin Willi.
2) Avoid unnecessary GC with a timestamp in conncount, from William Tu
and Yifeng Sun.
3) Remove TCP conntrack debugging, from Florian Westphal.
4) Fix compilation warning in ctnetlink, from Florian.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: ctnetlink: fix up for "netfilter: conntrack: remove unconfirmed list"
netfilter: conntrack: remove pr_debug callsites from tcp tracker
netfilter: nf_conncount: reduce unnecessary GC
netfilter: Use l3mdev flow key when re-routing mangled packets
====================
Link: https://lore.kernel.org/r/20220519220206.722153-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I missed this in the barrage of GCC 12 warnings. Commit cf2df74e202d
("net: fix dev_fill_forward_path with pppoe + bridge") changed
the pointer into an array.
Fixes: d7e6f5836038 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
Link: https://lore.kernel.org/r/20220520012555.2262461-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We update KVM RISC-V maintainers entry to include appropriate KVM
selftests directories so that RISC-V related KVM selftests patches
are CC'ed to KVM RISC-V mailing list.
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Currently, there is no provision for vmm (qemu-kvm or kvmtool) to
query about multiple-letter ISA extensions. The config register
is only used for base single letter ISA extensions.
A new ISA extension register is added that will allow the vmm
to query about any ISA extension one at a time. It is enabled for
both single letter or multi-letter ISA extensions. The ISA extension
register is useful to if the vmm requires to retrieve/set single
extension while the config register should be used if all the base
ISA extension required to retrieve or set.
For any multi-letter ISA extensions, the new register interface
must be used.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
On RISC-V platforms with hardware VMID support, we share same
VMID for all VCPUs of a particular Guest/VM. This means we might
have stale G-stage TLB entries on the current Host CPU due to
some other VCPU of the same Guest which ran previously on the
current Host CPU.
To cleanup stale TLB entries, we simply flush all G-stage TLB
entries by VMID whenever underlying Host CPU changes for a VCPU.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The generic KVM has support for VCPU requests which can be used
to do arch-specific work in the run-loop. We introduce remote
HFENCE functions which will internally use VCPU requests instead
of host SBI calls.
Advantages of doing remote HFENCEs as VCPU requests are:
1) Multiple VCPUs of a Guest may be running on different Host CPUs
so it is not always possible to determine the Host CPU mask for
doing Host SBI call. For example, when VCPU X wants to do HFENCE
on VCPU Y, it is possible that VCPU Y is blocked or in user-space
(i.e. vcpu->cpu < 0).
2) To support nested virtualization, we will be having a separate
shadow G-stage for each VCPU and a common host G-stage for the
entire Guest/VM. The VCPU requests based remote HFENCEs helps
us easily synchronize the common host G-stage and shadow G-stage
of each VCPU without any additional IPI calls.
This is also a preparatory patch for upcoming nested virtualization
support where we will be having a shadow G-stage page table for
each Guest VCPU.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Currently, the KVM_MAX_VCPUS value is 16384 for RV64 and 128
for RV32.
The KVM_MAX_VCPUS value is too high for RV64 and too low for
RV32 compared to other architectures (e.g. x86 sets it to 1024
and ARM64 sets it to 512). The too high value of KVM_MAX_VCPUS
on RV64 also leads to VCPU mask on stack consuming 2KB.
We set KVM_MAX_VCPUS to 1024 for both RV64 and RV32 to be
aligned other architectures.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Various __kvm_riscv_hfence_xyz() functions implemented in the
kvm/tlb.S are equivalent to corresponding HFENCE.GVMA instructions
and we don't have range based local HFENCE functions.
This patch provides complete set of local HFENCE functions which
supports range based TLB invalidation and supports HFENCE.VVMA
based functions. This is also a preparatory patch for upcoming
Svinval support in KVM RISC-V.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
We should treat SBI HFENCE calls as NOPs until nested virtualization
is supported by KVM RISC-V. This will help us test booting a hypervisor
under KVM RISC-V.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Latest QEMU supports G-stage Sv57x4 mode so this patch extends KVM
RISC-V G-stage handling to detect and use Sv57x4 mode when available.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The two-stage address translation defined by the RISC-V privileged
specification defines: VS-stage (guest virtual address to guest
physical address) programmed by the Guest OS and G-stage (guest
physical addree to host physical address) programmed by the
hypervisor.
To align with above terminology, we replace "stage2" with "gstage"
and "Stage2" with "G-stage" name everywhere in KVM RISC-V sources.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Fix the following coccicheck warnings:
./tools/testing/selftests/kvm/lib/riscv/processor.c:353:3-4: Unneeded
semicolon.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Currently, we simply hang using "while (1) ;" upon any unexpected
guest traps because the default guest trap handler is guest_hang().
The above approach is not useful to anyone because KVM selftests
users will only see a hung application upon any unexpected guest
trap.
This patch improves unexpected guest trap handling for KVM RISC-V
selftests by doing the following:
1) Return to host user-space
2) Dump VCPU registers
3) Die using TEST_ASSERT(0, ...)
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Mat Martineau says:
====================
mptcp: Miscellaneous fixes and a new test case
Patches 1 and 3 remove helpers that were iterating over the subflow
connection list without proper locking. Iteration was not needed in
either case.
Patch 2 fixes handling of MP_FAIL timeout, checking for orphaned
subflows instead of using the MPTCP socket data lock and connection
state.
Patch 4 adds a test for MP_FAIL timeout using tc pedit to induce checksum
failures.
====================
Link: https://lore.kernel.org/r/20220518220446.209750-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the multiple subflows test case for MP_FAIL, to test the MP_FAIL
reset case. Use the test_linkfail value to make 1024KB test files.
Invoke reset_with_fail() to use 'iptables' and 'tc action pedit' rules
to produce the bit flips to trigger the checksum failures on ns2eth2.
Add delays on ns2eth1 to make sure more data can translate on ns2eth2.
The check_invert flag is enabled in reset_with_fail(), so this test
prints out the inverted bytes, instead of the file mismatch errors.
Invoke pedit_action_pkts() to get the numbers of the packets edited
by the tc pedit actions, and print this numbers to the output.
Co-developed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MPTCP socket's conn_list (list of subflows) requires the socket lock
to access. The MP_FAIL timeout code added such an access, where it would
check the list of subflows both in timer context and (later) in workqueue
context where the socket lock is held.
Rather than check the list twice, remove the check in the timeout
handler and only depend on the check in the workqueue. Also remove the
MPTCP_FAIL_NO_RESPONSE flag, since mptcp_mp_fail_no_response() has
insignificant overhead and can be checked on each worker run.
Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MP_FAIL timeout (waiting for a peer to respond to an MP_FAIL with
another MP_FAIL) is implemented using the MPTCP socket's sk_timer. That
timer is also used at MPTCP socket close, so it's important to not have
the two timer users interfere with each other.
At MPTCP socket close, all subflows are orphaned before sk_timer is
manipulated. By checking the SOCK_DEAD flag on the subflows, each
subflow can determine if the timer is safe to alter without acquiring
any MPTCP-level lock. This replaces code that was using the
mptcp_data_lock and MPTCP-level socket state checks that did not
correctly protect the timer.
Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond")
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The mentioned helper requires the msk socket lock, and the
current callers don't own it nor can't acquire it, so the
access is racy.
All the current callers are really checking for infinite mapping
fallback, and the latter condition is explicitly tracked by
the relevant msk variable: we can safely remove the caller usage
- and the caller itself.
The issue is present since MP_FAIL implementation, but the
fix only applies since the infinite fallback support, ence the
somewhat unexpected fixes tag.
Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status")
Acked-and-tested-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch improves TCP PRR loss recovery behavior for a corner
case. Previously during PRR conservation-bound mode, it strictly
sends the amount equals to the amount newly acked or s/acked.
The patch changes s.t. PRR may send additional amount that was banked
previously (e.g. application-limited) in the conservation-bound
mode, similar to the slow-start mode. This unifies and simplifies the
algorithm further and may improve the recovery latency. This change
still follow the general packet conservation design principle and
always keep inflight/cwnd below the slow start threshold set
by the congestion control module.
PRR is described in RFC 6937. We'll include this change in the
latest revision rfc6937-bis as well.
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220519003410.2531936-1-ycheng@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When removing the rcu_read_lock in bond_ethtool_get_ts_info() as
discussed [1], I didn't notice it could be called via setsockopt,
which doesn't hold rcu lock, as syzbot pointed:
stack backtrace:
CPU: 0 PID: 3599 Comm: syz-executor317 Not tainted 5.18.0-rc5-syzkaller-01392-g01f4685797a5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
bond_option_active_slave_get_rcu include/net/bonding.h:353 [inline]
bond_ethtool_get_ts_info+0x32c/0x3a0 drivers/net/bonding/bond_main.c:5595
__ethtool_get_ts_info+0x173/0x240 net/ethtool/common.c:554
ethtool_get_phc_vclocks+0x99/0x110 net/ethtool/common.c:568
sock_timestamping_bind_phc net/core/sock.c:869 [inline]
sock_set_timestamping+0x3a3/0x7e0 net/core/sock.c:916
sock_setsockopt+0x543/0x2ec0 net/core/sock.c:1221
__sys_setsockopt+0x55e/0x6a0 net/socket.c:2223
__do_sys_setsockopt net/socket.c:2238 [inline]
__se_sys_setsockopt net/socket.c:2235 [inline]
__x64_sys_setsockopt+0xba/0x150 net/socket.c:2235
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f8902c8eb39
Fix it by adding rcu_read_lock and take a ref on the real_dev.
Since dev_hold() and dev_put() can take NULL these days, we can
skip checking if real_dev exist.
[1] https://lore.kernel.org/netdev/27565.1642742439@famine/
Reported-by: syzbot+92beb3d46aab498710fa@syzkaller.appspotmail.com
Fixes: aa6034678e87 ("bonding: use rcu_dereference_rtnl when get bonding active slave")
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220519020148.1058344-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current title of our section of the documentation is
Linux Networking Documentation. Since we're describing
a section of Linux Documentation repeating those two
words seems redundant.
Link: https://lore.kernel.org/r/20220518234346.2088436-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
GCC 12 seems upset that we check ipa_irq against array bound
but then proceed, anyway:
drivers/net/ipa/ipa_interrupt.c: In function ‘ipa_interrupt_add’:
drivers/net/ipa/ipa_interrupt.c:196:27: warning: array subscript 30 is above array bounds of ‘void (*[30])(struct ipa *, enum ipa_irq_id)’ [-Warray-bounds]
196 | interrupt->handler[ipa_irq] = handler;
| ~~~~~~~~~~~~~~~~~~^~~~~~~~~
drivers/net/ipa/ipa_interrupt.c:42:27: note: while referencing ‘handler’
42 | ipa_irq_handler_t handler[IPA_IRQ_COUNT];
| ^~~~~~~
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220519004417.2109886-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
GCC 12 warns:
drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c: In function ‘ipc_protocol_dl_td_process’:
drivers/net/wwan/iosm/iosm_ipc_protocol_ops.c:406:13: warning: the comparison will always evaluate as ‘true’ for the address of ‘cb’ will never be NULL [-Waddress]
406 | if (!IPC_CB(skb)) {
| ^
Indeed the check seems entirely pointless. Hopefully the other
validation checks will catch if the cb is bad, but it can't be
NULL.
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Link: https://lore.kernel.org/r/20220519004342.2109832-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Martin Blumenstingl says:
====================
lantiq_gswip: Two small fixes
While updating the Lantiq target in OpenWrt to Linux 5.15 I came across
an FDB related error message. While that still needs to be solved I
found two other small issues on the way.
This series fixes the two minor issues found while revisiting the FDB
code in the lantiq_gswip driver:
- The first patch fixes the start index used in gswip_port_fdb() to
find the entry with the matching bridge. The updated logic is now
consistent with the rest of the driver.
- The second patch fixes a typo in a dev_err() message.
[0] https://lore.kernel.org/netdev/20220517194015.1081632-1-martin.blumenstingl@googlemail.com/
====================
Link: https://lore.kernel.org/r/20220518220051.1520023-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
gswip_port_fdb_dump() reads the MAC bridge entries. The error message
should say "failed to read mac bridge entry". While here, also add the
index to the error print so humans can get to the cause of the problem
easier.
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The first N entries in priv->vlans are reserved for managing ports which
are not part of a bridge. Use priv->hw_info->max_ports to consistently
access per-bridge entries at index 7. Starting at
priv->hw_info->cpu_port (6) is harmless in this case because
priv->vlan[6].bridge is always NULL so the comparison result is always
false (which results in this entry being skipped).
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
t7xx_request_irq() error: uninitialized symbol 'ret'.
t7xx_core_hk_handler() error: potentially dereferencing uninitialized 'event'.
If the condition to enter the loop that waits for the handshake event
is false on the first iteration then the uninitialized 'event' will be
dereferenced, fix this by initializing 'event' to NULL.
t7xx_port_proxy_recv_skb() warn: variable dereferenced before check 'skb'.
No need to check skb at t7xx_port_proxy_recv_skb() since we know it
is always called with a valid skb by t7xx_cldma_gpd_rx_from_q().
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ricardo Martinez <ricardo.martinez@linux.intel.com>
Link: https://lore.kernel.org/r/20220518195529.126246-1-ricardo.martinez@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Russell King says:
====================
mtk_eth_soc phylink updates
This series ultimately updates mtk_eth_soc to use phylink_pcs, with some
fixes along the way.
Previous attempts to update this driver (which is now marked as legacy)
have failed due to lack of testing. I am hoping that this time will be
different; Marek can test RGMII modes, but not SGMII. So all that we
know is that this patch series probably doesn't break RGMII.
1) remove unused mac_mode and sgmii flags members from structures.
2) remove unnecessary interpretation of speed when configuring 1000
and 2500 Base-X
3) move configuration of SGMII duplex setting from mac_config() to
link_up()
4) only pass in interface mode to mtk_sgmii_setup_mode_force()
5) move decision about which mtk_sgmii_setup_mode_*() function to call
into mtk_sgmii.c
6) add a fixme comment for RGMII explaning why the call to
mtk_gmac0_rgmii_adjust() is completely wrong - this needs to be
addressed by someone who has the hardware and can test an appropriate
fix. This fixme means that the driver still can't become non-legacy.
7) move gmac setup from mac_config() to mac_finish() - this preserves
the order that we write to the hardware when we eventually convert to
phylink_pcs()
8) move configuration of syscfg0 in SGMII/802.3z mode to mac_finish()
for the same reasons as (7).
9) convert mtk_sgmii.c code structure and the mtk_sgmii structure to
suit conversion to phylink_pcs
10) finally convert to phylink_pcs
As there has been no feedback from mtk_eth_soc maintainers to my RFC
on April 6th, not my reminder on April 11th, so it's now time to merge
this anyway. Mediatek code seems to be submitted to the kernel and
then the maintainers scarper...
====================
Link: https://lore.kernel.org/r/YoUIX+BN/ZbyXzTT@shell.armlinux.org.uk
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|