|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dave Jiang:
"Check for RCH dport before accessing pci_host_bridge and a fix to
address a KASAN warning for the cxl regression test suite cxl-test"
* tag 'cxl-fixes-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/test: Skip cxl_setup_parent_dport() for emulated dports
cxl/pci: Get AER capability address from RCRB only for RCH dport
|
|
Remove the global PCI/MSI irqdomain implementation and provide the
required MSI parent functionality by filling in msi_parent_ops, so the
PCI/MSI code can detect the new parent and set up per-device MSI domains.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240815112608.26925-2-zhangtianyang@loongson.cn
|
|
When soft interrupt actions are called, they are passed a pointer to the
struct softirq_action which contains the action's function pointer.
This pointer isn't useful, as the action callback already knows what
function it is. And since each callback handles a specific soft interrupt,
the callback also knows which soft interrupt number is running.
No soft interrupt action callback actually uses this parameter, so remove
it from the function pointer signature. This clarifies that soft interrupt
actions are global routines and makes it slightly cheaper to call them.
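
A minimal userspace sketch of the resulting callback shape, under stated assumptions: the table `demo_vec`, the numbers `DEMO_TIMER` and `DEMO_NET_RX`, and the helper `demo_run_pending()` are hypothetical names for illustration, not kernel identifiers.

```c
#include <assert.h>

/* Hypothetical soft interrupt numbers, for illustration only. */
enum { DEMO_TIMER, DEMO_NET_RX, DEMO_NR };

/* After the change the callback takes no argument at all. */
struct demo_action {
	void (*action)(void);
};

static int timer_runs, net_rx_runs;

/* Each callback already knows which soft interrupt it serves. */
static void demo_timer_action(void) { timer_runs++; }
static void demo_net_rx_action(void) { net_rx_runs++; }

static struct demo_action demo_vec[DEMO_NR] = {
	[DEMO_TIMER]  = { demo_timer_action },
	[DEMO_NET_RX] = { demo_net_rx_action },
};

/* Run every action whose bit is set in 'pending'. */
static void demo_run_pending(unsigned int pending)
{
	assert(pending < (1u << DEMO_NR));

	for (int nr = 0; nr < DEMO_NR; nr++)
		if (pending & (1u << nr))
			demo_vec[nr].action();
}
```

Calling `demo_run_pending(1u << DEMO_NET_RX)` runs only the second callback; no pointer into the table has to be passed along.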
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/20240815171549.3260003-1-csander@purestorage.com
|
|
The unification of irq_domain_create_legacy() missed the fact that
interrupts must be associated even when the Linux interrupt number provided
in the first_irq argument is 0.
This breaks all call sites of irq_domain_create_legacy() which supply 0 as
the first_irq argument.
Enforce the association for legacy domains in __irq_domain_instantiate() to
cure this.
[ tglx: Massaged it slightly. ]
Fixes: 70114e7f7585 ("irqdomain: Simplify simple and legacy domain creation")
Reported-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://lore.kernel.org/all/c3379142-10bc-4f14-b8ac-a46927aeac38@gmail.com
|
|
gicv2m_of_init() fails to perform an of_node_put() when
of_address_to_resource() fails, leading to a refcount leak.
Address this by moving the error handling path outside of the loop and
making it common to all failure modes.
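
The pattern can be sketched in userspace as follows. This is a simplified illustration, not the driver code: `struct demo_node`, `demo_node_get()`, `demo_node_put()` and `demo_init_all()` are hypothetical stand-ins for a refcounted device node and its iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device node. */
struct demo_node {
	int refcount;
	int resource_ok;	/* 0 simulates a failing resource lookup */
	int init_ok;		/* 0 simulates a failing init step */
};

static void demo_node_get(struct demo_node *n) { n->refcount++; }

static void demo_node_put(struct demo_node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Walk 'count' nodes while holding a reference on the current one.
 * Every failure breaks out of the loop with 'cur' still referenced,
 * so the single put after the loop covers all failure modes.
 */
static int demo_init_all(struct demo_node *nodes, size_t count)
{
	struct demo_node *cur = NULL;
	int ret = 0;

	for (size_t i = 0; i < count; i++) {
		cur = &nodes[i];
		demo_node_get(cur);

		if (!cur->resource_ok) {
			ret = -1;
			break;
		}
		if (!cur->init_ok) {
			ret = -2;
			break;
		}

		demo_node_put(cur);	/* the iterator drops the reference */
		cur = NULL;
	}

	if (ret && cur)
		demo_node_put(cur);	/* common error path */

	return ret;
}
```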
Fixes: 4266ab1a8ff5 ("irqchip/gic-v2m: Refactor to prepare for ACPI support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240820092843.1219933-1-make24@iscas.ac.cn
|
|
The devm_platform_ioremap_resource() function doesn't return NULL, it
returns error pointers. Fix the error handling to match.
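
A userspace sketch of why a NULL check misses such failures. The `demo_*` helpers below only imitate the error-pointer convention (a small negative errno encoded in the pointer value); they are assumptions for illustration and not the kernel's own macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO	4095
#define DEMO_ENOMEM	12

/* Encode a negative errno in a pointer value. */
static void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static char demo_regs[16];

/* Hypothetical mapping helper: returns an error pointer, never NULL. */
static void *demo_ioremap(int fail)
{
	return fail ? demo_err_ptr(-DEMO_ENOMEM) : (void *)demo_regs;
}

static long demo_probe(int fail)
{
	void *base = demo_ioremap(fail);

	/* A "!base" test would not catch this: the error pointer is non-NULL. */
	if (demo_is_err(base))
		return demo_ptr_err(base);

	return 0;
}
```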
Fixes: 2333df5ae51e ("irqchip: Add RISC-V advanced PLIC driver for direct-mode")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/a5a628d6-81d8-4933-81a8-64aad4743ec4@stanley.mountain
|
|
The recently established lock ordering mandates that the per-VM
vmapp_lock is acquired before taking the per-VPE lock.
As it turns out, its_vpe_set_affinity() takes the VPE lock, and
then calls into its_send_vmovp(), which itself takes the vmapp
lock. Obviously, this is a lock order violation.
As its_send_vmovp() is only called from its_vpe_set_affinity(),
hoist the vmapp locking from the former into the latter, restoring
the expected order.
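
The ordering problem and its cure can be illustrated with a toy rank checker. Everything here is hypothetical (`RANK_VMAPP`, `RANK_VPE`, `demo_lock()`, `demo_unlock()`); no real locking takes place, the sketch only records whether locks were taken out of order.

```c
#include <assert.h>

/* Hypothetical ranks: a lower rank must be taken before a higher one. */
enum { RANK_VMAPP = 1, RANK_VPE = 2 };

static int held_rank;	/* rank of the innermost lock held, 0 if none */
static int violations;	/* number of out-of-order acquisitions seen */

/* "Take" a lock; returns the previous rank for the matching unlock. */
static int demo_lock(int rank)
{
	int prev = held_rank;

	assert(rank > 0);
	if (rank < held_rank)
		violations++;
	held_rank = rank;
	return prev;
}

static void demo_unlock(int prev) { held_rank = prev; }

/* Before: VPE lock first, then the callee takes the vmapp lock. */
static void demo_set_affinity_old(void)
{
	int vpe = demo_lock(RANK_VPE);
	int vmapp = demo_lock(RANK_VMAPP);	/* inside the callee */

	demo_unlock(vmapp);
	demo_unlock(vpe);
}

/* After: the vmapp lock is hoisted into the caller and taken first. */
static void demo_set_affinity_new(void)
{
	int vmapp = demo_lock(RANK_VMAPP);
	int vpe = demo_lock(RANK_VPE);

	demo_unlock(vpe);
	demo_unlock(vmapp);
}
```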
Fixes: f0eb154c39471 ("irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock")
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240818171625.3030584-1-maz@kernel.org
|
|
The latest Linux RISC-V no longer boots on the Allwinner D1 platform
because the sun4i_timer driver fails to get an interrupt from PLIC due to
the recent conversion of the PLIC to a platform driver. Converting the
sun4i timer to a platform driver does not work either because the D1 does
not have an SBI timer available, so early boot hangs. See the 'Closes:'
link for deeper analysis.
The real fix requires enabling the SBI time extension in the platform
firmware (OpenSBI) and converting sun4i_timer into a platform driver.
Unfortunately, the real fix involves changes in multiple places, cannot
be achieved in a short time and, on top of that, requires users to update
firmware.
As a work-around, retrofit PLIC probing such that the PLIC is probed early
only for the Allwinner D1 platform and probed as a regular platform driver
for the rest of the RISC-V platforms. In the process, partially revert some of
the previous changes because the PLIC device pointer is not available in
all probing paths.
Fixes: e306a894bd51 ("irqchip/sifive-plic: Chain to parent IRQ after handlers are ready")
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240820034850.3189912-1-apatel@ventanamicro.com
Closes: https://lore.kernel.org/lkml/20240814145642.344485-1-emil.renner.berthing@canonical.com/
|
|
Add AMDI0015 to the ACPI APD support list to ensure correct clock settings
for the I3C device on the latest AMD platforms.
Co-developed-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Sanket Goswami <Sanket.Goswami@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20240812144018.360847-1-Shyam-sundar.S-k@amd.com
[ rjw: Added missing tag ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Nikolay Aleksandrov says:
====================
bonding: fix xfrm offload bugs
I noticed these problems while reviewing a bond xfrm patch recently.
The fixes are straightforward; please review the last one carefully
because it has side-effects. This set has passed bond's selftests
and my custom bond stress tests which crash without these fixes.
Note the first patch is not critical, but it simplifies the next fix.
====================
Link: https://patch.msgid.link/20240816114813.326645-1-razor@blackwall.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If the active slave is cleared manually the xfrm state is not flushed.
This leads to xfrm add/del imbalance and adding the same state multiple
times. For example, when the device cannot handle any more states we get:
[ 1169.884811] bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
because it's filled with the same state after multiple active slave
clearings. This change also has a few nice side effects: user-space
gets a notification for the change, the old device gets its mac address
and promisc/mcast adjusted properly.
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We shouldn't set real_dev to NULL because packets can be in transit and
xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
real_dev is set.
Example trace:
kernel: BUG: unable to handle page fault for address: 0000000000001030
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: #PF: supervisor write access in kernel mode
kernel: #PF: error_code(0x0002) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0002 [#1] PREEMPT SMP
kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: Code: e0 0f 0b 48 83 7f 38 00 74 de 0f 0b 48 8b 47 08 48 8b 37 48 8b 78 40 e9 b2 e5 9a d7 66 90 0f 1f 44 00 00 48 8b 86 80 02 00 00 <83> 80 30 10 00 00 01 b8 01 00 00 00 c3 0f 1f 80 00 00 00 00 0f 1f
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: RSP: 0018:ffffabde81553b98 EFLAGS: 00010246
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel:
kernel: RAX: 0000000000000000 RBX: ffff9eb404e74900 RCX: ffff9eb403d97c60
kernel: RDX: ffffffffc090de10 RSI: ffff9eb404e74900 RDI: ffff9eb3c5de9e00
kernel: RBP: ffff9eb3c0a42000 R08: 0000000000000010 R09: 0000000000000014
kernel: R10: 7974203030303030 R11: 3030303030303030 R12: 0000000000000000
kernel: R13: ffff9eb3c5de9e00 R14: ffffabde81553cc8 R15: ffff9eb404c53000
kernel: FS: 00007f2a77a3ad00(0000) GS:ffff9eb43bd00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000001030 CR3: 00000001122ab000 CR4: 0000000000350ef0
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: Call Trace:
kernel: <TASK>
kernel: ? __die+0x1f/0x60
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ? page_fault_oops+0x142/0x4c0
kernel: ? do_user_addr_fault+0x65/0x670
kernel: ? kvm_read_and_reset_apf_flags+0x3b/0x50
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: ? exc_page_fault+0x7b/0x180
kernel: ? asm_exc_page_fault+0x22/0x30
kernel: ? nsim_bpf_uninit+0x50/0x50 [netdevsim]
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ? nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: bond_ipsec_offload_ok+0x7b/0x90 [bonding]
kernel: xfrm_output+0x61/0x3b0
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ip_push_pending_frames+0x56/0x80
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We must check if there is an active slave before dereferencing the pointer.
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix the return type which should be bool.
Fixes: 955b785ec6b3 ("bonding: fix suspicious RCU usage in bond_ipsec_offload_ok()")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
for_each_child_of_node() can iterate through the children of a device_node,
so there is no need for an open-coded while loop. No functional change with
this conversion.
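
As a rough userspace analogy of such a conversion (the `demo_*` names are hypothetical and no reference counting is modelled), an iteration macro hides the "fetch next child" step that a while loop spells out by hand:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node with a first-child and next-sibling link. */
struct demo_node {
	struct demo_node *child;
	struct demo_node *sibling;
};

/* Return the child after 'prev', or the first child if 'prev' is NULL. */
static struct demo_node *demo_next_child(const struct demo_node *parent,
					 struct demo_node *prev)
{
	assert(parent);
	return prev ? prev->sibling : parent->child;
}

#define demo_for_each_child(parent, c) \
	for ((c) = demo_next_child((parent), NULL); (c); \
	     (c) = demo_next_child((parent), (c)))

/* Open-coded variant. */
static int demo_count_while(const struct demo_node *parent)
{
	struct demo_node *c = NULL;
	int n = 0;

	while ((c = demo_next_child(parent, c)) != NULL)
		n++;
	return n;
}

/* Same walk expressed with the iteration macro. */
static int demo_count_macro(const struct demo_node *parent)
{
	struct demo_node *c;
	int n = 0;

	demo_for_each_child(parent, c)
		n++;
	return n;
}
```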
Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816015837.109627-1-zhangzekun11@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
HiSilicon HIP10/11 platforms use the same SMMU PMCG as HIP09 and thus
suffer from the same erratum. List them in the PMCG platform
information list without introducing a new SMMU PMCG Model.
Update the silicon-errata.rst as well.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240731092658.11012-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Ido Schimmel says:
====================
Preparations for FIB rule DSCP selector
This patchset moves the masking of the upper DSCP bits in 'flowi4_tos'
to the core instead of relying on callers of the FIB lookup API to do
it.
This will allow us to start changing users of the API to initialize the
'flowi4_tos' field with all six bits of the DSCP field. In turn, this
will allow us to extend FIB rules with a new DSCP selector.
By masking the upper DSCP bits in the core we are able to maintain the
behavior of the TOS selector in FIB rules and routes to only match on
the lower DSCP bits.
While working on this I found two users of the API that do not mask the
upper DSCP bits before performing the lookup. The first is an ancient
netlink family that is unlikely to be used. It is adjusted in patch #1
to mask both the upper DSCP bits and the ECN bits before calling the
API.
The second user is a nftables module that differs in this regard from
its equivalent iptables module. It is adjusted in patch #2 to invoke the
API with the upper DSCP bits masked, like all other callers. The
relevant selftest passed, but in the unlikely case that regressions are
reported because of this change, we can restore the existing behavior
using a new flow information flag as discussed here [1].
The last patch moves the masking of the upper DSCP bits to the core,
making the first two patches redundant, but I wanted to post them
separately to call attention to the behavior change for these two users
of the FIB lookup API.
Future patchsets (around 3) will start unmasking the upper DSCP bits
throughout the networking stack before adding support for the new FIB
rule DSCP selector.
Changes from v1 [2]:
Patch #3: Include <linux/ip.h> in <linux/in_route.h> instead of
including it in net/ip_fib.h
[1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
[2] https://lore.kernel.org/netdev/20240725131729.1729103-1-idosch@nvidia.com/
====================
Link: https://patch.msgid.link/20240814125224.972815-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The TOS field in the IPv4 flow information structure ('flowi4_tos') is
matched by the kernel against the TOS selector in IPv4 rules and routes.
The field is initialized differently by different call sites. Some treat
it as DSCP (RFC 2474) and initialize all six DSCP bits, some treat it as
RFC 1349 TOS and initialize it using RT_TOS() and some treat it as RFC
791 TOS and initialize it using IPTOS_RT_MASK.
What is common to all these call sites is that they all initialize the
lower three DSCP bits, which fits the TOS definition in the initial IPv4
specification (RFC 791).
Therefore, the kernel only allows configuring IPv4 FIB rules that match
on the lower three DSCP bits which are always guaranteed to be
initialized by all call sites:
# ip -4 rule add tos 0x1c table 100
# ip -4 rule add tos 0x3c table 100
Error: Invalid tos.
While this works, it is unlikely to be very useful. RFC 791, which
initially defined the TOS and IP precedence fields, was updated by RFC
2474 over twenty-five years ago, replacing these fields with a
single six-bit DSCP field.
Extending FIB rules to match on DSCP can be done by adding a new DSCP
selector while maintaining the existing semantics of the TOS selector
for applications that rely on that.
A prerequisite for allowing FIB rules to match on DSCP is to adjust all
the call sites to initialize the high order DSCP bits and remove their
masking along the path to the core where the field is matched on.
However, making this change alone will result in a behavior change. For
example, a forwarded IPv4 packet with a DS field of 0xfc will no longer
match a FIB rule that was configured with 'tos 0x1c'.
This behavior change can be avoided by masking the upper three DSCP bits
in 'flowi4_tos' before comparing it against the TOS selectors in FIB
rules and routes.
Implement the above by adding a new function that checks whether a given
DSCP value matches the one specified in the IPv4 flow information
structure and invoke it from the three places that currently match on
'flowi4_tos'.
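
A minimal sketch of such a masked comparison, assuming the usual values 0x1e for the TOS mask applied by RT_TOS() and 0xfc for the six DSCP bits of the DS field. The names `DEMO_TOS_MASK`, `DEMO_DSCP_MASK` and `demo_dscp_masked_match()` are hypothetical and do not claim to be the function added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TOS_MASK	0x1e	/* assumed mask applied by RT_TOS() */
#define DEMO_DSCP_MASK	0xfc	/* six DSCP bits, ECN bits excluded */

/*
 * 'selector' is the value configured in a rule or route (no ECN bits),
 * 'flow_tos' is the DS field carried in the flow information.
 * The upper three DSCP bits of the flow are masked before comparing.
 */
static bool demo_dscp_masked_match(uint8_t selector, uint8_t flow_tos)
{
	assert((selector & ~DEMO_DSCP_MASK) == 0);

	return selector == (flow_tos & DEMO_TOS_MASK & DEMO_DSCP_MASK);
}
```

With this masking, a flow carrying a DS field of 0xfc still matches a selector of 0x1c, which is the behavior the text above wants to preserve.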
Use RT_TOS() for the masking of 'flowi4_tos' instead of IPTOS_RT_MASK
since the latter is not uAPI and we should be able to remove it at some
point.
Include <linux/ip.h> in <linux/in_route.h> since the former defines
IPTOS_TOS_MASK which is used in the definition of RT_TOS() in
<linux/in_route.h>.
No regressions in FIB tests:
# ./fib_tests.sh
[...]
Tests passed: 218
Tests failed: 0
And FIB rule tests:
# ./fib_rule_tests.sh
[...]
Tests passed: 116
Tests failed: 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As part of its functionality, the nftables FIB expression module
performs a FIB lookup, but unlike other users of the FIB lookup API, it
does so without masking the upper DSCP bits. In particular, this differs
from the equivalent iptables match ("rpfilter") that does mask the upper
DSCP bits before the FIB lookup.
Align the module to other users of the FIB lookup API and mask the upper
DSCP bits using IPTOS_RT_MASK before the lookup.
No regressions in nft_fib.sh:
# ./nft_fib.sh
PASS: fib expression did not cause unwanted packet drops
PASS: fib expression did drop packets for 1.1.1.1
PASS: fib expression did drop packets for 1c3::c01d
PASS: fib expression forward check with policy based routing
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
lookup according to user provided parameters and communicate the result
back to user space.
However, unlike other users of the FIB lookup API, the upper DSCP bits
and the ECN bits of the DS field are not masked, which can result in the
wrong result being returned.
Solve this by masking the upper DSCP bits and the ECN bits using
IPTOS_RT_MASK.
The structure that communicates the request and the response is not
exported to user space, so it is unlikely that this netlink family is
actually in use [1].
[1] https://lore.kernel.org/netdev/ZpqpB8vJU%2FQ6LSqa@debian/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
According to erratum ERR051608 of the I.MX93, LPSPI TCR[PRESCALE]
can only be configured to 0 or 1; other values are not valid
and will cause LPSPI to not work.
Add the prescale limitation for LPSPI in I.MX93. Other platforms
are not affected.
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Link: https://patch.msgid.link/20240820070658.672127-1-carlos.song@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If the "test->highmem = alloc_pages()" allocation fails then calling
__free_pages(test->highmem) will result in a NULL dereference. Also
change the error code to -ENOMEM instead of returning success.
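
The corrected error path can be sketched in userspace like this. `demo_alloc()`, `demo_free()` and `demo_run()` are hypothetical helpers; `demo_free()` asserts on NULL to imitate a free routine that must not be handed a NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_ENOMEM 12

struct demo_test {
	void *buffer;
	void *highmem;
};

/* Allocation helper with a switch to simulate failure. */
static void *demo_alloc(int fail)
{
	return fail ? NULL : malloc(64);
}

/* Imitates a free routine that would crash on NULL. */
static void demo_free(void *p)
{
	assert(p != NULL);
	free(p);
}

static int demo_run(int fail_buffer, int fail_highmem)
{
	struct demo_test test = { NULL, NULL };
	int ret = -DEMO_ENOMEM;

	test.buffer = demo_alloc(fail_buffer);
	if (!test.buffer)
		goto out;

	test.highmem = demo_alloc(fail_highmem);
	if (!test.highmem)
		goto free_buffer;	/* skip freeing the failed allocation */

	ret = 0;			/* the actual test would run here */

	demo_free(test.highmem);
free_buffer:
	demo_free(test.buffer);
out:
	return ret;
}
```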
Fixes: 2661081f5ab9 ("mmc_test: highmem tests")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/8c90be28-67b4-4b0d-a105-034dc72a0b31@stanley.mountain
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
iounmap() on x86 occasionally fails to unmap because the provided valid
ioremap address is not below high_memory. It turned out that this
happens due to KASLR.
KASLR uses the full address space between PAGE_OFFSET and vaddr_end to
randomize the starting points of the direct map, vmalloc and vmemmap
regions. It thereby limits the size of the direct map by using the
installed memory size plus an extra configurable margin for hot-plug
memory. This limitation is done to gain more randomization space
because otherwise only the holes between the direct map, vmalloc,
vmemmap and vaddr_end would be usable for randomizing.
The limited direct map size is not exposed to the rest of the kernel, so
the memory hot-plug and resource management related code paths still
operate under the assumption that the available address space can be
determined with MAX_PHYSMEM_BITS.
request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1
downwards. That means the first allocation happens past the end of the
direct map and if unlucky this address is in the vmalloc space, which
causes high_memory to become greater than VMALLOC_START and consequently
causes iounmap() to fail for valid ioremap addresses.
MAX_PHYSMEM_BITS cannot be changed for that because the randomization
does not align with address bit boundaries and there are other places
which actually require to know the maximum number of address bits. All
remaining usage sites of MAX_PHYSMEM_BITS have been analyzed and found
to be correct.
Cure this by exposing the end of the direct map via PHYSMEM_END and use
that for the memory hot-plug and resource management related places
instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END
maps to a variable which is initialized by the KASLR initialization and
otherwise it is based on MAX_PHYSMEM_BITS as before.
To prevent future hiccups add a check into add_pages() to catch callers
trying to add memory above PHYSMEM_END.
Fixes: 0483e1fa6e09 ("x86/mm: Implement ASLR for kernel memory regions")
Reported-by: Max Ramanouski <max8rr8@gmail.com>
Reported-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Max Ramanouski <max8rr8@gmail.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/87ed6soy3z.ffs@tglx
|
|
Fix a race condition where the clock provider comes up after mmc is
probed, which causes mmc to fail without retrying.
When the clk source returns the DEFER error, pass it on up the chain.
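
A small sketch of that propagation for an optional clock. The function `demo_init_optional_clk()` and its return convention are assumptions made for illustration; only the idea of passing the deferral upward comes from the text.

```c
#include <assert.h>

#define DEMO_ENOENT		2
#define DEMO_EPROBE_DEFER	517

/*
 * 'clk_err' is the result of looking up an optional clock:
 * 0 if found, a negative errno otherwise.
 * Returns 1 if the clock can be used, 0 if it is absent and skipped,
 * or -DEMO_EPROBE_DEFER so that the caller can retry the probe later.
 */
static int demo_init_optional_clk(int clk_err)
{
	assert(clk_err <= 0);

	/* The provider is not ready yet: do not swallow this error. */
	if (clk_err == -DEMO_EPROBE_DEFER)
		return clk_err;

	/* Any other failure: the clock is optional, carry on without it. */
	if (clk_err)
		return 0;

	return 1;
}
```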
Fixes: f90a0612f0e1 ("mmc: dw_mmc: lookup for optional biu and ciu clocks")
Signed-off-by: Ben Whitten <ben.whitten@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240811212212.123255-1-ben.whitten@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
GRO code checks for matching layer 2 headers to see whether a packet
belongs to the same flow. Because the ip6 tunnel sets dev->hard_header_len,
this check fails in cases where it shouldn't. To fix this, don't
set hard_header_len, but use needed_headroom like ipv4/ip_tunnel.c
does.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://patch.msgid.link/20240815151419.109864-1-tbogendoerfer@suse.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When only the last resource is invalid, tpmi_sst_dev_add() returns an
error even if there are other valid resources before it. This function
should return an error only when there are no valid resources.
Here tpmi_sst_dev_add() is returning "ret" variable. But this "ret"
variable contains the failure status of last call to sst_main(), which
failed for the invalid resource. But there may be other valid resources
before the last entry.
To address this, do not update "ret" variable for sst_main() return
status.
If there are no valid resources, it is already checked for by !inst
below the loop and -ENODEV is returned.
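
The resulting control flow can be sketched as follows, with hypothetical names: `status[]` stands in for the per-resource result of sst_main(), and `demo_add_instances()` is not the driver function itself.

```c
#include <assert.h>

#define DEMO_ENODEV 19

/*
 * 'status[i]' is 0 for a valid resource and non-zero for an invalid one.
 * The per-resource status is deliberately not stored in a function-wide
 * 'ret', so a trailing invalid entry cannot turn success into failure.
 */
static int demo_add_instances(const int *status, int count)
{
	int inst = 0;

	assert(count >= 0);

	for (int i = 0; i < count; i++) {
		if (status[i])
			continue;	/* skip the invalid resource */
		inst++;
	}

	if (!inst)
		return -DEMO_ENODEV;	/* no valid resource at all */

	return 0;
}
```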
Fixes: 9d1d36268f3d ("platform/x86: ISST: Support partitioned systems")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org # 6.10+
Link: https://lore.kernel.org/r/20240816163626.415762-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Revert commit 4c905f6740a3 ("netfilter: nf_tables: initialize registers in
nft_do_chain()").
The previous patch makes sure that loads from uninitialized registers are
detected from the control plane. In this case the rule blob auto-zeroes
registers. Thus the explicit zeroing is not needed anymore.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Reject rules where a load occurs from a register that has not seen a store
early in the same rule.
Commit 4c905f6740a3 ("netfilter: nf_tables: initialize registers in
nft_do_chain()")
had to add an unconditional memset to the nftables register space to avoid
leaking stack information to userspace.
This memset shows up in benchmarks. After this change, this commit can
be reverted again.
Note that this breaks userspace compatibility, because theoretically
you can do
rule 1: reg2 := meta load iif, reg2 == 1 jump ...
rule 2: reg2 == 2 jump ... // read access with no store in this rule
... after this change this is rejected.
Neither nftables nor iptables-nft generate such rules, each rule is
always standalone.
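
A sketch of such a per-rule check, assuming a bitmask with one bit per register that is reset for every rule. The types and names (`struct demo_expr`, `DEMO_NR_REGS`, `demo_rule_valid()`) are hypothetical and much simpler than the real expression handling.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_REGS 20	/* hypothetical number of registers */

enum demo_op { DEMO_STORE, DEMO_LOAD };

struct demo_expr {
	enum demo_op op;
	unsigned int reg;
};

/*
 * Return true if every load in the rule reads a register that an
 * earlier expression of the same rule has stored to.
 */
static bool demo_rule_valid(const struct demo_expr *exprs, int count)
{
	unsigned long reg_inited = 0;	/* starts empty for each rule */

	for (int i = 0; i < count; i++) {
		unsigned long bit;

		assert(exprs[i].reg < DEMO_NR_REGS);
		bit = 1UL << exprs[i].reg;

		if (exprs[i].op == DEMO_STORE)
			reg_inited |= bit;
		else if (!(reg_inited & bit))
			return false;	/* load without a prior store */
	}

	return true;
}
```

Applied to the example above, rule 1 (store to reg2, then compare reg2) passes, while rule 2 (compare reg2 only) is rejected.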
This results in a small increase of nft_ctx structure by sizeof(long).
To cope with hypothetical rulesets like the example above one could emit
on-demand "reg[x] = 0" store when generating the datapath blob in
nf_tables_commit_chain_prepare().
A patch that does this is linked to below.
For now, let's disable this. In nf_tables, a rule is the smallest
unit that can be replaced from userspace, i.e. a hypothetical ruleset
that relies on earlier initialisations of registers can't be changed
at will as register usage would need to be coordinated.
Link: https://lore.kernel.org/netfilter-devel/20240627135330.17039-4-fw@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Mechanical transformation, no logical changes intended.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nft_counter_reset() resets the counter by subtracting the previously
retrieved value from the counter. This is a write operation on the
counter and as such it must be performed within a write sequence of
nft_counter_seq to serialize against possible readers.
Update the packets/bytes within the write sequence of nft_counter_seq.
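
A single-threaded illustration of the sequence-counter protocol, under stated assumptions: there are no memory barriers or per-CPU data here, and all `demo_*` names are hypothetical, so this shows the bookkeeping only and is not a usable lock.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_counter {
	unsigned int seq;	/* even: stable, odd: write in progress */
	unsigned long packets;
	unsigned long bytes;
};

/* Enter the write section; a second concurrent writer would trip the assert. */
static void demo_write_begin(struct demo_counter *c)
{
	c->seq++;
	assert(c->seq & 1);
}

static void demo_write_end(struct demo_counter *c)
{
	c->seq++;
	assert(!(c->seq & 1));
}

/* A reader must retry if it started mid-write or a write happened since. */
static bool demo_read_retry(const struct demo_counter *c, unsigned int start)
{
	return (start & 1) || c->seq != start;
}

/* Subtract the previously retrieved values inside the write section. */
static void demo_counter_reset(struct demo_counter *c,
			       unsigned long packets, unsigned long bytes)
{
	demo_write_begin(c);
	c->packets -= packets;
	c->bytes -= bytes;
	demo_write_end(c);
}
```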
Fixes: d84701ecbcd6a ("netfilter: nft_counter: rework atomic dump and reset")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The sequence counter nft_counter_seq is a per-CPU counter. There is no
lock associated with it. nft_counter_do_eval() is using the same counter
and disables BH, which suggests that it can be invoked from a softirq.
This in turn means that nft_counter_offload_stats(), which disables only
preemption, can be interrupted by nft_counter_do_eval(), leading to two
writers for one seqcount_t.
This can lead to losing stats or reading statistics while they are
updated.
Disable BH during stats update in nft_counter_offload_stats() to ensure
one writer at a time.
Fixes: b72920f6e4a9d ("netfilter: nftables: counter hardware offload support")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Convert arm32 to use the arch_cpu_is_hotpluggable() helper rather than
arch_register_cpu().
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Currently, almost all architectures have switched to GENERIC_CPU_DEVICES,
except for arm32. Also switch over to GENERIC_CPU_DEVICES, and provide an
arch_register_cpu() that populates the hotpluggable flag for arm32.
The struct cpu in struct cpuinfo_arm is never used directly, remove
it to use the one GENERIC_CPU_DEVICES provides.
This also has the effect of moving the registration of CPUs from subsys to
driver core initialisation, prior to any initcalls running.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Floating point instructions in userspace can crash some arm kernels
built with clang/LLD 17.0.6:
BUG: unsupported FP instruction in kernel mode
FPEXC == 0xc0000780
Internal error: Oops - undefined instruction: 0 [#1] ARM
CPU: 0 PID: 196 Comm: vfp-reproducer Not tainted 6.10.0 #1
Hardware name: BCM2835
PC is at vfp_support_entry+0xc8/0x2cc
LR is at do_undefinstr+0xa8/0x250
pc : [<c0101d50>] lr : [<c010a80c>] psr: a0000013
sp : dc8d1f68 ip : 60000013 fp : bedea19c
r10: ec532b17 r9 : 00000010 r8 : 0044766c
r7 : c0000780 r6 : ec532b17 r5 : c1c13800 r4 : dc8d1fb0
r3 : c10072c4 r2 : c0101c88 r1 : ec532b17 r0 : 0044766c
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 00c5387d Table: 0251c008 DAC: 00000051
Register r0 information: non-paged memory
Register r1 information: vmalloc memory
Register r2 information: non-slab/vmalloc memory
Register r3 information: non-slab/vmalloc memory
Register r4 information: 2-page vmalloc region
Register r5 information: slab kmalloc-cg-2k
Register r6 information: vmalloc memory
Register r7 information: non-slab/vmalloc memory
Register r8 information: non-paged memory
Register r9 information: zero-size pointer
Register r10 information: vmalloc memory
Register r11 information: non-paged memory
Register r12 information: non-paged memory
Process vfp-reproducer (pid: 196, stack limit = 0x61aaaf8b)
Stack: (0xdc8d1f68 to 0xdc8d2000)
1f60: 0000081f b6f69300 0000000f c10073f4 c10072c4 dc8d1fb0
1f80: ec532b17 0c532b17 0044766c b6f9ccd8 00000000 c010a80c 00447670 60000010
1fa0: ffffffff c1c13800 00c5387d c0100f10 b6f68af8 00448fc0 00000000 bedea188
1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c
1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff 00000000 00000000
Call trace:
[<c0101d50>] (vfp_support_entry) from [<c010a80c>] (do_undefinstr+0xa8/0x250)
[<c010a80c>] (do_undefinstr) from [<c0100f10>] (__und_usr+0x70/0x80)
Exception stack(0xdc8d1fb0 to 0xdc8d1ff8)
1fa0: b6f68af8 00448fc0 00000000 bedea188
1fc0: bedea314 00000001 00448ebc b6f9d000 00447608 b6f9ccd8 00000000 bedea19c
1fe0: bede9198 bedea188 b6e1061c 0044766c 60000010 ffffffff
Code: 0a000061 e3877202 e594003c e3a09010 (eef16a10)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception in interrupt
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
This is a minimal userspace reproducer on a Raspberry Pi Zero W:
#include <stdio.h>
#include <math.h>
int main(void)
{
	double v = 1.0;
	printf("%f\n", NAN + *(volatile double *)&v);
	return 0;
}
Another way to consistently trigger the oops is:
calvin@raspberry-pi-zero-w ~$ python -c "import json"
The bug reproduces only when the kernel is built with DYNAMIC_DEBUG=n,
because the pr_debug() calls act as barriers even when not activated.
This is the output from the same kernel source built with the same
compiler and DYNAMIC_DEBUG=y, where the userspace reproducer works as
expected:
VFP: bounce: trigger ec532b17 fpexc c0000780
VFP: emulate: INST=0xee377b06 SCR=0x00000000
VFP: bounce: trigger eef1fa10 fpexc c0000780
VFP: emulate: INST=0xeeb40b40 SCR=0x00000000
VFP: raising exceptions 30000000
calvin@raspberry-pi-zero-w ~$ ./vfp-reproducer
nan
Crudely grepping for vmsr/vmrs instructions in the otherwise nearly
identical text for vfp_support_entry() makes the problem obvious:
vmlinux.llvm.good [0xc0101cb8] <+48>: vmrs r7, fpexc
vmlinux.llvm.good [0xc0101cd8] <+80>: vmsr fpexc, r0
vmlinux.llvm.good [0xc0101d20] <+152>: vmsr fpexc, r7
vmlinux.llvm.good [0xc0101d38] <+176>: vmrs r4, fpexc
vmlinux.llvm.good [0xc0101d6c] <+228>: vmrs r0, fpscr
vmlinux.llvm.good [0xc0101dc4] <+316>: vmsr fpexc, r0
vmlinux.llvm.good [0xc0101dc8] <+320>: vmrs r0, fpsid
vmlinux.llvm.good [0xc0101dcc] <+324>: vmrs r6, fpscr
vmlinux.llvm.good [0xc0101e10] <+392>: vmrs r10, fpinst
vmlinux.llvm.good [0xc0101eb8] <+560>: vmrs r10, fpinst2
vmlinux.llvm.bad [0xc0101cb8] <+48>: vmrs r7, fpexc
vmlinux.llvm.bad [0xc0101cd8] <+80>: vmsr fpexc, r0
vmlinux.llvm.bad [0xc0101d20] <+152>: vmsr fpexc, r7
vmlinux.llvm.bad [0xc0101d30] <+168>: vmrs r0, fpscr
vmlinux.llvm.bad [0xc0101d50] <+200>: vmrs r6, fpscr <== BOOM!
vmlinux.llvm.bad [0xc0101d6c] <+228>: vmsr fpexc, r0
vmlinux.llvm.bad [0xc0101d70] <+232>: vmrs r0, fpsid
vmlinux.llvm.bad [0xc0101da4] <+284>: vmrs r10, fpinst
vmlinux.llvm.bad [0xc0101df8] <+368>: vmrs r4, fpexc
vmlinux.llvm.bad [0xc0101e5c] <+468>: vmrs r10, fpinst2
I think LLVM's reordering is valid as the code is currently written: the
compiler doesn't know the instructions have side effects in hardware.
Fix by using "asm volatile" in fmxr() and fmrx(), so they cannot be
reordered with respect to each other. The original compiler now produces
working kernels on my hardware with DYNAMIC_DEBUG=n.
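The role of the volatile qualifier can be sketched in portable C. This is only an illustration under stated assumptions: sim_fpexc, sim_fmrx(), sim_fmxr() and enable_then_read() are hypothetical stand-ins for a VFP register and the kernel's fmrx()/fmxr() accessors, and the asm template is left empty so the example builds with GCC or Clang on any architecture. It does not reproduce the oops; it only shows the accessor shape that the fix relies on.

```c
#include <stdint.h>

/* Hypothetical stand-in for a VFP system register such as FPEXC. */
static uint32_t sim_fpexc;

/* Stand-in for fmrx(): "read the register" through a volatile asm. */
static inline uint32_t sim_fmrx(void)
{
    uint32_t val;

    /*
     * The template is empty, so the output is simply the input
     * (operand 0 is tied to sim_fpexc). Because of "volatile", the
     * compiler may not delete this statement or hoist it out of loops.
     */
    __asm__ volatile("" : "=r" (val) : "0" (sim_fpexc));
    return val;
}

/* Stand-in for fmxr(): "write the register" through a volatile asm. */
static inline void sim_fmxr(uint32_t val)
{
    __asm__ volatile("" : "=r" (sim_fpexc) : "0" (val));
}

/* Enable-then-read sequence, loosely mirroring a vmsr followed by a vmrs. */
static inline uint32_t enable_then_read(uint32_t enable_bit)
{
    sim_fmxr(sim_fmrx() | enable_bit);  /* the "enable" write comes first */
    return sim_fmrx();                  /* the read must observe that write */
}
```

In this toy version the ordering is also enforced by an ordinary data dependency on sim_fpexc. On real hardware no such dependency is visible to the compiler, which is why the accessors themselves need to be volatile.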
This is the relevant piece of the diff of the vfp_support_entry() text,
from the original oopsing kernel to a working kernel with this patch:
vmrs r0, fpscr
tst r0, #4096
bne 0xc0101d48
tst r0, #458752
beq 0xc0101ecc
orr r7, r7, #536870912
ldr r0, [r4, #0x3c]
mov r9, #16
-vmrs r6, fpscr
orr r9, r9, #251658240
add r0, r0, #4
str r0, [r4, #0x3c]
mvn r0, #159
sub r0, r0, #-1207959552
and r0, r7, r0
vmsr fpexc, r0
vmrs r0, fpsid
+vmrs r6, fpscr
and r0, r0, #983040
cmp r0, #65536
bne 0xc0101d88
Fixes: 4708fb041346 ("ARM: vfp: Reimplement VFP exception entry in C code")
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
The code in early_paging_init is directly masking off bits
8, 9, 10 and 11 to temporarily disable caching of the translation
tables. There is some explanation in the comment, but use some
defines instead of magic numbers so it becomes more evident
what is going on.
Change the type of the register to u32 since these are indeed
unsigned 32bit registers, and use a temporary variable instead
of baking too much into the inline assembly call to increase
readability.
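A minimal sketch of the idea, not the actual patch: the macro names and the helper below are assumptions chosen for illustration, with bits 8-9 and 10-11 treated as two 2-bit cacheability fields as the text above describes. It shows named masks replacing the magic number and a u32 temporary holding the modified value.

```c
#include <stdint.h>

/* Hypothetical names for the two 2-bit fields covering bits 8..11. */
#define TTBCR_IRGN0_MASK  (UINT32_C(3) << 8)   /* bits 8 and 9 */
#define TTBCR_ORGN0_MASK  (UINT32_C(3) << 10)  /* bits 10 and 11 */

/* Return the register value with translation table caching disabled. */
uint32_t ttbcr_disable_table_caching(uint32_t ttbcr)
{
    uint32_t tmp = ttbcr;

    /* Equivalent to "ttbcr & ~0xf00", but self-describing. */
    tmp &= ~(TTBCR_IRGN0_MASK | TTBCR_ORGN0_MASK);
    return tmp;
}
```

Computing the value in a temporary first keeps the inline assembly that writes the register down to a single operand.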
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Wen Gu says:
====================
net/smc: introduce ringbufs usage statistics
Currently, we have histograms that show the sizes of ringbufs that were
ever used by SMC connections. However, they are always incremental and,
since SMC allows the reuse of ringbufs, we cannot know the actual amount
of ringbufs being allocated or actively used.
So this patch set introduces statistics for the amount of ringbufs that
are actually allocated by a link group and actively used by connections
of a certain net namespace, so that we can react based on this memory
usage information, e.g. active fallback to TCP.
With appropriate adaptations of smc-tools, we can obtain this ringbuf
usage information:
$ smcr -d linkgroup
LG-ID : 00000500
LG-Role : SERV
LG-Type : ASYML
VLAN : 0
PNET-ID :
Version : 1
Conns : 0
Sndbuf : 12910592 B <-
RMB : 12910592 B <-
or
$ smcr -d stats
[...]
RX Stats
Data transmitted (Bytes) 869225943 (869.2M)
Total requests 18494479
Buffer usage (Bytes) 12910592 (12.31M) <-
[...]
TX Stats
Data transmitted (Bytes) 12760884405 (12.76G)
Total requests 36988338
Buffer usage (Bytes) 12910592 (12.31M) <-
[...]
[...]
Change log:
v3->v2
- use new helper nla_put_uint() instead of nla_put_u64_64bit().
v2->v1
https://lore.kernel.org/r/20240807075939.57882-1-guwen@linux.alibaba.com/
- remove inline keyword in .c files.
- use local variable in macros to avoid potential side effects.
v1
https://lore.kernel.org/r/20240805090551.80786-1-guwen@linux.alibaba.com/
====================
Link: https://patch.msgid.link/20240814130827.73321-1-guwen@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The buffer size histograms in smc_stats, namely rx/tx_rmbsize, record
the sizes of ringbufs for all connections that have ever appeared in
the net namespace. They are incremental and we cannot know the actual
ringbufs usage from these. So introduce statistics for the current
ringbufs usage of existing smc connections in the net namespace into
smc_stats. They are incremented when a new connection uses a ringbuf
and decremented when the ringbuf is no longer used.
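The difference between an incremental histogram and a current-usage counter can be sketched in userspace C. Everything named here (struct ringbuf_usage and the three helpers) is hypothetical and only illustrates the increment-on-use, decrement-on-release accounting described above; the kernel code uses its own types and helpers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-namespace usage counters, in bytes. */
struct ringbuf_usage {
    atomic_uint_fast64_t rx_bytes;
    atomic_uint_fast64_t tx_bytes;
};

/* Called when a connection starts using a ringbuf of the given size. */
void ringbuf_usage_inc(atomic_uint_fast64_t *counter, uint64_t size)
{
    atomic_fetch_add(counter, size);
}

/* Called when the ringbuf is no longer used by the connection. */
void ringbuf_usage_dec(atomic_uint_fast64_t *counter, uint64_t size)
{
    atomic_fetch_sub(counter, size);
}

/* Snapshot of the bytes currently in use. */
uint64_t ringbuf_usage_read(atomic_uint_fast64_t *counter)
{
    return atomic_load(counter);
}
```

Unlike a histogram that only ever grows, the value read here drops back when buffers are released, so it reflects what is in use at the moment of the query.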
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently we have the statistics on sndbuf/RMB sizes of all connections
that have ever been on the link group, namely smc_stats_memsize. However
these statistics are incremental and since the ringbufs of link group
are allowed to be reused, we cannot know the actual allocated buffers
through these. So introduce a statistic on the actually allocated
ringbufs of the link group. It is incremented when a new ringbuf is
added into buf_list and decremented when it is deleted from buf_list.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This avoids warning:
[ 0.118053] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
Caused by get_c0_compare_int on secondary CPU.
Also skip saving the IRQ number to struct clock_event_device *cd, as
it's never used by the clockevent core; as per the comments it's only
meant for "non CPU local devices".
Reported-by: Serge Semin <fancer.lancer@gmail.com>
Closes: https://lore.kernel.org/linux-mips/6szkkqxpsw26zajwysdrwplpjvhl5abpnmxgu2xuj3dkzjnvsf@4daqrz4mf44k/
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Add an skb helper function to copy a range of bytes from within
an existing skb_seq_state.
Signed-off-by: Christian Hopps <chopps@labn.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Integrity Policy Enforcement (IPE) is an LSM that provides a
complementary approach to Mandatory Access Control compared to existing
LSMs.
Existing LSMs have centered around the concept that access to a resource
should be controlled by the current user's credentials. IPE's approach
is that access to a resource should be controlled by the system's trust
of a current resource.
The basis of this approach is defining a global policy to specify which
resource can be trusted.
Signed-off-by: Deven Bowers <deven.desai@linux.microsoft.com>
Signed-off-by: Fan Wu <wufan@linux.microsoft.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
We don't have sufficient information to debug:
https://github.com/koverstreet/bcachefs/issues/726
- print out durability of extent ptrs, when non default
- print the number of replicas we need in data_update_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
syzkaller reported UAF in kcm_release(). [0]
The scenario is
1. Thread A builds a skb with MSG_MORE and sets kcm->seq_skb.
2. Thread A resumes building skb from kcm->seq_skb but is blocked
by sk_stream_wait_memory()
3. Thread B calls sendmsg() concurrently, finishes building kcm->seq_skb
and puts the skb to the write queue
4. Thread A faces an error and finally frees skb that is already in the
write queue
5. kcm_release() double-frees the skb in the write queue
When a thread is building a MSG_MORE skb, another thread must not touch it.
Let's add a per-sk mutex and serialise kcm_sendmsg().
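The serialisation idea can be sketched with a userspace analogue. This is a toy model under stated assumptions, not the kcm code: struct toy_sock, toy_sock_init() and toy_sendmsg() are hypothetical, a pthread mutex stands in for the per-sk mutex, and a byte buffer stands in for kcm->seq_skb. The point is that the whole build-or-resume sequence, including the error path, runs under one lock, so a second sender cannot queue a partial message that the first sender later frees.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX 64

/* Hypothetical toy socket: one partial message plus the last queued one. */
struct toy_sock {
    pthread_mutex_t tx_lock;    /* serialises toy_sendmsg() callers */
    char partial[MSG_MAX];      /* plays the role of kcm->seq_skb */
    size_t partial_len;
    char queued[MSG_MAX];       /* last completed message */
    size_t queued_len;
    unsigned int nr_queued;
};

void toy_sock_init(struct toy_sock *sk)
{
    memset(sk, 0, sizeof(*sk));
    pthread_mutex_init(&sk->tx_lock, NULL);
}

/* Append data to the partial message; queue it once 'more' is false. */
int toy_sendmsg(struct toy_sock *sk, const char *data, size_t len, bool more)
{
    int ret = 0;

    pthread_mutex_lock(&sk->tx_lock);
    if (len > MSG_MAX - sk->partial_len) {
        /* Error path: drop the partial message while still holding the lock. */
        sk->partial_len = 0;
        ret = -1;
        goto out;
    }
    memcpy(sk->partial + sk->partial_len, data, len);
    sk->partial_len += len;
    if (!more) {
        /* Message complete: move it to the "write queue". */
        memcpy(sk->queued, sk->partial, sk->partial_len);
        sk->queued_len = sk->partial_len;
        sk->partial_len = 0;
        sk->nr_queued++;
    }
out:
    pthread_mutex_unlock(&sk->tx_lock);
    return ret;
}
```

A mutex rather than a spinlock fits this pattern because the sender may sleep while waiting for memory, as in step 2 of the scenario.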
[0]:
BUG: KASAN: slab-use-after-free in __skb_unlink include/linux/skbuff.h:2366 [inline]
BUG: KASAN: slab-use-after-free in __skb_dequeue include/linux/skbuff.h:2385 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
BUG: KASAN: slab-use-after-free in __skb_queue_purge include/linux/skbuff.h:3181 [inline]
BUG: KASAN: slab-use-after-free in kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
Read of size 8 at addr ffff0000ced0fc80 by task syz-executor329/6167
CPU: 1 PID: 6167 Comm: syz-executor329 Tainted: G B 6.8.0-rc5-syzkaller-g9abbc24128bc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:291
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:298
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd0/0x124 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x178/0x518 mm/kasan/report.c:488
kasan_report+0xd8/0x138 mm/kasan/report.c:601
__asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
__skb_unlink include/linux/skbuff.h:2366 [inline]
__skb_dequeue include/linux/skbuff.h:2385 [inline]
__skb_queue_purge_reason include/linux/skbuff.h:3175 [inline]
__skb_queue_purge include/linux/skbuff.h:3181 [inline]
kcm_release+0x170/0x4c8 net/kcm/kcmsock.c:1691
__sock_release net/socket.c:659 [inline]
sock_close+0xa4/0x1e8 net/socket.c:1421
__fput+0x30c/0x738 fs/file_table.c:376
____fput+0x20/0x30 fs/file_table.c:404
task_work_run+0x230/0x2e0 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x618/0x1f64 kernel/exit.c:871
do_group_exit+0x194/0x22c kernel/exit.c:1020
get_signal+0x1500/0x15ec kernel/signal.c:2893
do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Allocated by task 6166:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x40/0x78 mm/kasan/common.c:68
kasan_save_alloc_info+0x70/0x84 mm/kasan/generic.c:626
unpoison_slab_object mm/kasan/common.c:314 [inline]
__kasan_slab_alloc+0x74/0x8c mm/kasan/common.c:340
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc_node+0x204/0x4c0 mm/slub.c:3903
__alloc_skb+0x19c/0x3d8 net/core/skbuff.c:641
alloc_skb include/linux/skbuff.h:1296 [inline]
kcm_sendmsg+0x1d3c/0x2124 net/kcm/kcmsock.c:783
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
sock_sendmsg+0x220/0x2c0 net/socket.c:768
splice_to_socket+0x7cc/0xd58 fs/splice.c:889
do_splice_from fs/splice.c:941 [inline]
direct_splice_actor+0xec/0x1d8 fs/splice.c:1164
splice_direct_to_actor+0x438/0xa0c fs/splice.c:1108
do_splice_direct_actor fs/splice.c:1207 [inline]
do_splice_direct+0x1e4/0x304 fs/splice.c:1233
do_sendfile+0x460/0xb3c fs/read_write.c:1295
__do_sys_sendfile64 fs/read_write.c:1362 [inline]
__se_sys_sendfile64 fs/read_write.c:1348 [inline]
__arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1348
__invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
Freed by task 6167:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x40/0x78 mm/kasan/common.c:68
kasan_save_free_info+0x5c/0x74 mm/kasan/generic.c:640
poison_slab_object+0x124/0x18c mm/kasan/common.c:241
__kasan_slab_free+0x3c/0x78 mm/kasan/common.c:257
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kmem_cache_free+0x15c/0x3d4 mm/slub.c:4363
kfree_skbmem+0x10c/0x19c
__kfree_skb net/core/skbuff.c:1109 [inline]
kfree_skb_reason+0x240/0x6f4 net/core/skbuff.c:1144
kfree_skb include/linux/skbuff.h:1244 [inline]
kcm_release+0x104/0x4c8 net/kcm/kcmsock.c:1685
__sock_release net/socket.c:659 [inline]
sock_close+0xa4/0x1e8 net/socket.c:1421
__fput+0x30c/0x738 fs/file_table.c:376
____fput+0x20/0x30 fs/file_table.c:404
task_work_run+0x230/0x2e0 kernel/task_work.c:180
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x618/0x1f64 kernel/exit.c:871
do_group_exit+0x194/0x22c kernel/exit.c:1020
get_signal+0x1500/0x15ec kernel/signal.c:2893
do_signal+0x23c/0x3b44 arch/arm64/kernel/signal.c:1249
do_notify_resume+0x74/0x1f4 arch/arm64/kernel/entry-common.c:148
exit_to_user_mode_prepare arch/arm64/kernel/entry-common.c:169 [inline]
exit_to_user_mode arch/arm64/kernel/entry-common.c:178 [inline]
el0_svc+0xac/0x168 arch/arm64/kernel/entry-common.c:713
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
The buggy address belongs to the object at ffff0000ced0fc80
which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 0 bytes inside of
freed 240-byte region [ffff0000ced0fc80, ffff0000ced0fd70)
The buggy address belongs to the physical page:
page:00000000d35f4ae4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ed0f
flags: 0x5ffc00000000800(slab|node=0|zone=2|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 05ffc00000000800 ffff0000c1cbf640 fffffdffc3423100 dead000000000004
raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff0000ced0fb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff0000ced0fc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff0000ced0fc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff0000ced0fd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
ffff0000ced0fd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b72d86aa5df17ce74c60
Tested-by: syzbot+b72d86aa5df17ce74c60@syzkaller.appspotmail.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240815220437.69511-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The check for '!to' is redundant here, since skb_can_coalesce() already
contains this check.
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1723730983-22912-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit a1ab24e5fc4a ("mptcp: consolidate sockopt synchronization")
removed the implementation but left the declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816100404.879598-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These were never implemented since commit b691b1116e82 ("net/mlx5: Implement
devlink port function cmds to control ipsec_packet").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101550.881844-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are no callers or implementations in tree.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101638.882072-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit f13697cc7a19 ("gve: Switch to config-aware queue allocation")
converted this function to gve_rx_alloc_rings_gqi().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816101906.882743-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the MCTP route input test, we're routing one skb, then (when delivery
is expected) checking the resulting routed skb.
However, we're currently checking the original skb length, rather than
the routed skb. Check the routed skb instead; the original will have
been freed at this point.
Fixes: 8892c0490779 ("mctp: Add route input to socket tests")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/kernel-janitors/4ad204f0-94cf-46c5-bdab-49592addf315@kili.mountain/
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240816-mctp-kunit-skb-fix-v1-1-3c367ac89c27@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the netlink policy to validate IPv6 address length.
The destination address currently has a policy with max len set,
and the source address has no policy validation. In both cases
the code does the real check. With a correct policy, the
check in the code can be removed.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240816212245.467745-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|