Age | Commit message (Collapse) | Author |
|
Both IPsec and kTLS need two functions declared in the lib/crypto.c
file. These functions are advertised through general mlx5.h file and
don't have any protection from attempts to call them without proper
config option.
Instead of creating stubs just for two functions, simply build that *.c
file as part of regular mlx5_eth build and rely on compiler to throw
them away if no callers exist in produced code.
Link: https://lore.kernel.org/r/37f02171da06886c1b403d44dd18b2a56b19219d.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
IPsec is part of ethernet side of mlx5 driver and needs to be placed
in en_accel folder.
Link: https://lore.kernel.org/r/a0ca88f4d9c602c574106c0de0511803e7dcbdff.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
In current code, the CONFIG_MLX5_IPSEC and CONFIG_MLX5_EN_IPSEC are
the same. So remove useless indirection.
Link: https://lore.kernel.org/r/fd14492cbc01a0d51a5bfedde02bcd2154123fde.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Flow steering is a low level internal driver API, as such it relies on
the callers to check if namespace is supported and not rely on some
compilation flag.
Link: https://lore.kernel.org/r/cfb411a8a9ed2a1471810af254bdc0f03469f79c.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Merge two different function to one in order to provide coherent
picture if the device is IPsec capable or not.
Link: https://lore.kernel.org/r/8f10ea06ad19c6f651e9fb33921009658f01e1d5.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The mlx5_is_ipsec_device() check was to distinguish ConnectX device
related ops from FPGA, so post removing FPGA IPsec code this check
can be removed as no other device implements it.
It is safe to do it as there is already embedded check of IPsec device
in mlx5_accel_ipsec_device_caps().
Link: https://lore.kernel.org/r/e45362abfcabe18e8af20ec8d1acdc99355978f3.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The IPsec won't be initialized at all if device doesn't support IPsec
offload. It means that we can combine the ipsec.c and ipsec_offload.c
files to one file. Such change will allow us to remove ipsec_ops
indirection.
Link: https://lore.kernel.org/r/d0ac1fb7b14c10ae20a21ae17a393ee860c72ac3.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The removal of mlx5 flow steering logic, left the kernel without any RDMA
drivers that implements flow action callbacks supplied by RDMA/core. Any
user access to them caused to EOPNOTSUPP error, which can be achieved by
simply removing ioctl implementation.
Link: https://lore.kernel.org/r/a638e376314a2eb1c66f597c0bbeeab2e5de7faf.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The mlx5 flow steering crypto API was intended to be used in FPGA
devices, which is not supported for years already. The removal of
mlx5 crypto FPGA code together with inability to configure encryption
keys makes the low steering API completely unusable.
So delete the code, so any ESP flow steering requests will fail with
not supported error, as it is happening now anyway as no device support
this type of API.
Link: https://lore.kernel.org/r/634a5face7734381463d809bfb89850f6998deac.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The IPSEC_REQUIRED_METADATA capability bit is never set, and can be
safely removed from the flow action flags.
Link: https://lore.kernel.org/r/697cd60bd5c9b6a004c449c1a41c2798fac844ff.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Delete the statistics that is not used anymore.
Link: https://lore.kernel.org/r/3f194752881e095910c887dd5cede1dcba6acaf3.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Only FPGA needed this NO_TRAILER flag, so remove this assignment.
Link: https://lore.kernel.org/r/636d75421e1ca4254a062537eea001ab0e50e19b.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
The IDA halloc variable is not needed and can be removed.
Link: https://lore.kernel.org/r/cbecfbe01621e1b8bde746aa7f6c08497e656a25.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Remove specific to FPGS IPsec metadata handling logic which is not
required for mlx5 NICs devices.
Link: https://lore.kernel.org/r/fe67a1de4fc6032a940e18c8a6461a1ccf902fc4.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Mellanox INNOVA IPsec cards are EOL in Nov, 2019 [1]. As such, the code
is unmaintained, untested and not in-use by any upstream/distro oriented
customers. In order to reduce code complexity, drop the kernel code.
[1] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf
Link: https://lore.kernel.org/r/2afe88ec5020a491079eacf6fe3c89b64d65195c.1649232994.git.leonro@nvidia.com
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Pull block fixes from Jens Axboe:
"Nothing major in here, just a few small fixes:
- Small series of neglected drbd patches (Christoph, Lv, Xiaomeng)
- Remove dead variable in cdrom (Enze)"
* tag 'block-5.18-2022-04-08' of git://git.kernel.dk/linux-block:
drbd: set QUEUE_FLAG_STABLE_WRITES
drbd: fix an invalid memory access caused by incorrect use of list iterator
drbd: Fix five use after free bugs in get_initial_state
cdrom: remove unused variable
|
|
Pull io_uring fixes from Jens Axboe:
"A bit bigger than usual post merge window, largely due to a revert and
a fix of at what point files are assigned for requests.
The latter fixing a linked request use case where a dependent link can
rely on what file is assigned consistently.
Summary:
- 32-bit compat fix for IORING_REGISTER_IOWQ_AFF (Eugene)
- File assignment fixes (me)
- Revert of the NAPI poll addition from this merge window. The author
isn't available right now to engage on this, so let's revert it and
we can retry for the 5.19 release (me, Jakub)
- Fix a timeout removal race (me)
- File update and SCM fixes (Pavel)"
* tag 'io_uring-5.18-2022-04-08' of git://git.kernel.dk/linux-block:
io_uring: fix race between timeout flush and removal
io_uring: use nospec annotation for more indexes
io_uring: zero tag on rsrc removal
io_uring: don't touch scm_fp_list after queueing skb
io_uring: nospec index for tags on files update
io_uring: implement compat handling for IORING_REGISTER_IOWQ_AFF
Revert "io_uring: Add support for napi_busy_poll"
io_uring: drop the old style inflight file tracking
io_uring: defer file assignment
io_uring: propagate issue_flags state down to file assignment
io_uring: move read/write file prep state into actual opcode handler
io_uring: defer splice/tee file validity check until command issue
io_uring: don't check req->file in io_fsync_prep()
|
|
Pull rdma fixes from Jason Gunthorpe:
"Several bug fixes for old bugs:
- Welcome Leon as co-maintainer for RDMA so we are back to having two
people
- Some corner cases are fixed in mlx5's MR code
- Long standing CM bug where a DREQ at the wrong time can result in a
long timeout
- Missing locking and refcounting in hf1"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hfi1: Fix use-after-free bug for mm struct
IB/rdmavt: add lock to call to rvt_error_qp to prevent a race condition
IB/cm: Cancel mad on the DREQ event when the state is MRA_REP_RCVD
RDMA/mlx5: Add a missing update of cache->last_add
RDMA/mlx5: Don't remove cache MRs when a delay is needed
MAINTAINERS: Update qib and hfi1 related drivers
MAINTAINERS: Add Leon Romanovsky to RDMA maintainers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These revert a problematic commit from the 5.17 development cycle and
finalize the elimination of acpi_bus_get_device() that mostly took
place during the recent merge window.
Specifics:
- Revert an ACPI processor driver change related to cache
invalidation in acpi_idle_play_dead() that clearly was a mistake
and introduced user-visible regressions (Akihiko Odaki).
- Replace the last instance of acpi_bus_get_device() added during the
recent merge window and drop the function to prevent more users of
it from being added (Rafael Wysocki)"
* tag 'acpi-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: bus: Eliminate acpi_bus_get_device()
Revert "ACPI: processor: idle: Only flush cache on entering C3"
|
|
vcpu_fp uses the riscv_isa_extension mechanism which gets
defined in hwcap.h but doesn't include that head file.
While it seems to work in most cases, in certain conditions
this can lead to build failures like
../arch/riscv/kvm/vcpu_fp.c: In function ‘kvm_riscv_vcpu_fp_reset’:
../arch/riscv/kvm/vcpu_fp.c:22:13: error: implicit declaration of function ‘riscv_isa_extension_available’ [-Werror=implicit-function-declaration]
22 | if (riscv_isa_extension_available(&isa, f) ||
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../arch/riscv/kvm/vcpu_fp.c:22:49: error: ‘f’ undeclared (first use in this function)
22 | if (riscv_isa_extension_available(&isa, f) ||
Fix this by simply including the necessary header.
Fixes: 0a86512dc113 ("RISC-V: KVM: Factor-out FP virtualization into separate
sources")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The guest_hang() function is used as the default exception handler
for various KVM selftests applications by setting it's address in
the vstvec CSR. The vstvec CSR requires exception handler base address
to be at least 4-byte aligned so this patch fixes alignment of the
guest_hang() function.
Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V
64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Supporting hardware updates of PTE A and D bits is optional for any
RISC-V implementation so current software strategy is to always set
these bits in both G-stage (hypervisor) and VS-stage (guest kernel).
If PTE A and D bits are not set by software (hypervisor or guest)
then RISC-V implementations not supporting hardware updates of these
bits will cause traps even for perfectly valid PTEs.
Based on above explanation, the VS-stage page table created by various
KVM selftest applications is not correct because PTE A and D bits are
not set. This patch fixes VS-stage page table programming of PTE A and
D bits for KVM selftests.
Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V
64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
We might have RISC-V systems (such as QEMU) where VMID is not part
of the TLB entry tag so these systems will have to flush all TLB
entries upon any change in hgatp.VMID.
Currently, we zero-out hgatp CSR in kvm_arch_vcpu_put() and we
re-program hgatp CSR in kvm_arch_vcpu_load(). For above described
systems, this will flush all TLB entries whenever VCPU exits to
user-space hence reducing performance.
This patch fixes above described performance issue by not clearing
hgatp CSR in kvm_arch_vcpu_put().
Fixes: 34bde9d8b9e6 ("RISC-V: KVM: Implement VCPU world-switch")
Cc: stable@vger.kernel.org
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
UBSAN warnings are observed on atlantic driver:
[ 294.432996] UBSAN: array-index-out-of-bounds in /build/linux-Qow4fL/linux-5.15.0/drivers/net/ethernet/aquantia/atlantic/aq_nic.c:484:48
[ 294.433695] index 8 is out of range for type 'aq_vec_s *[8]'
The ring is dereferenced right before breaking out the loop, to prevent
that from happening, only use the index in the loop to fix the issue.
BugLink: https://bugs.launchpad.net/bugs/1958770
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20220408022204.16815-1-kai.heng.feng@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The DSA master might not have been probed yet in which case the probe of
the felix switch fails with -EPROBE_DEFER:
[ 4.435305] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517
It is not an error. Use dev_err_probe() to demote this particular error
to a debug message.
Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220408101521.281886-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The fsl,scu.txt dt-binding documentation explicitly mentions
that the compatible string should be either "fsl,imx8qm-clock"
or "fsl,imx8qxp-clock", followed by "fsl,scu-clk". Also, i.MX8qm
SCU clocks and i.MX8qxp SCU clocks are really not the same, so
we have to set the compatible property according to SoC name.
Let's correct the i.MX8qm clock controller's compatible property
from
"fsl,imx8qxp-clk", "fsl,scu-clk"
to
"fsl,imx8qm-clk", "fsl,scu-clk" .
Fixes: f2180be18a63 ("arm64: dts: imx: add imx8qm common dts file")
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
When the driver fails to probe, we will get the following splat:
[ 19.311970] ------------[ cut here ]------------
[ 19.312566] WARNING: CPU: 3 PID: 375 at drivers/regulator/core.c:2257 _regulator_put+0x3ec/0x4e0
[ 19.317591] RIP: 0010:_regulator_put+0x3ec/0x4e0
[ 19.328831] Call Trace:
[ 19.329112] <TASK>
[ 19.329369] regulator_bulk_free+0x82/0xe0
[ 19.329860] devres_release_group+0x319/0x3d0
[ 19.330357] i2c_device_probe+0x766/0x940
Fix this by adding a callback that will deal with the disabling when the
driver fails to probe.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Link: https://lore.kernel.org/r/20220409022629.3493557-1-zheyuma97@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Split the smb3_add_credits tracepoint to make it more obvious when looking
at the logs which line corresponds to what credit change. Also add a
tracepoint for credit overflow when it's being added back.
Note that it might be better to add another field to the tracepoint for
the information rather than splitting it. It would also be useful to store
the MID potentially, though that isn't available when the credits are first
obtained.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: linux-cifs@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit fix from Shuah Khan:
"A single documentation fix to incorrect and outdated usage
information"
* tag 'linux-kselftest-kunit-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: kunit: fix path to .kunitconfig in start.rst
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Build and run-times fixes to tests:
- header dependencies
- missing tear-downs to release allocated resources in assert paths
- missing error messages when build fails
- coccicheck and unused variable warnings"
* tag 'linux-kselftest-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/harness: Pass variant to teardown
selftests/harness: Run TEARDOWN for ASSERT failures
selftests: fix an unused variable warning in pidfd selftest
selftests: fix header dependency for pid_namespace selftests
selftests: x86: add 32bit build warnings for SUSE
selftests/proc: fix array_size.cocci warning
selftests/vDSO: fix array_size.cocci warning
|
|
Merge fixes from Andrew Morton:
"9 patches.
Subsystems affected by this patch series: mm (migration, highmem,
sparsemem, mremap, mempolicy, and memcg), lz4, mailmap, and
MAINTAINERS"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add Tom as clang reviewer
mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"
mailmap: update Vasily Averin's email address
mm/mempolicy: fix mpol_new leak in shared_policy_replace
mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
lz4: fix LZ4_decompress_safe_partial read out of bound
highmem: fix checks in __kmap_local_sched_{in,out}
mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.
|
|
I have been helping with build breaks and other clang things and would
like to help with the reviews.
Link: https://lkml.kernel.org/r/20220407175715.3378998-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
has subtle races which are proving ugly to fix. Revert the original
optimization. If quantitative testing indicates that we have a
significant problem here then other implementations can be looked at.
Fixes: 405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I'm moving to a @linux.dev account. Map my old addresses.
Link: https://lkml.kernel.org/r/737c7c2b-cdab-63ee-be90-cb33316c9657@linux.dev
Signed-off-by: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If mpol_new is allocated but not used in restart loop, mpol_new will be
freed via mpol_put before returning to the caller. But refcnt is not
initialized yet, so mpol_put could not do the right things and might
leak the unused mpol_new. This would happen if mempolicy was updated on
the shared shmem file while the sp->lock has been dropped during the
memory allocation.
This issue could be triggered easily with the below code snippet if
there are many processes doing the below work at the same time:
shmid = shmget((key_t)5566, 1024 * PAGE_SIZE, 0666|IPC_CREAT);
shm = shmat(shmid, 0, 0);
loop many times {
mbind(shm, 1024 * PAGE_SIZE, MPOL_LOCAL, mask, maxnode, 0);
mbind(shm + 128 * PAGE_SIZE, 128 * PAGE_SIZE, MPOL_DEFAULT, mask,
maxnode, 0);
}
Link: https://lkml.kernel.org/r/20220329111416.27954-1-linmiaohe@huawei.com
Fixes: 42288fe366c4 ("mm: mempolicy: Convert shared_policy mutex to spinlock")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org> [3.8]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If an mremap() syscall with old_size=0 ends up in move_page_tables(), it
will call invalidate_range_start()/invalidate_range_end() unnecessarily,
i.e. with an empty range.
This causes a WARN in KVM's mmu_notifier. In the past, empty ranges
have been diagnosed to be off-by-one bugs, hence the WARNing. Given the
low (so far) number of unique reports, the benefits of detecting more
buggy callers seem to outweigh the cost of having to fix cases such as
this one, where userspace is doing something silly. In this particular
case, an early return from move_page_tables() is enough to fix the
issue.
Link: https://lkml.kernel.org/r/20220329173155.172439-1-pbonzini@redhat.com
Reported-by: syzbot+6bde52d89cfdf9f61425@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The gcc 12 compiler reports a "'mem_section' will never be NULL" warning
on the following code:
static inline struct mem_section *__nr_to_section(unsigned long nr)
{
#ifdef CONFIG_SPARSEMEM_EXTREME
if (!mem_section)
return NULL;
#endif
if (!mem_section[SECTION_NR_TO_ROOT(nr)])
return NULL;
:
It happens with CONFIG_SPARSEMEM_EXTREME off. The mem_section definition
is
#ifdef CONFIG_SPARSEMEM_EXTREME
extern struct mem_section **mem_section;
#else
extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
#endif
In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static
2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]"
doesn't make sense.
Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]"
check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an
explicit NR_SECTION_ROOTS check to make sure that there is no
out-of-bound array access.
Link: https://lkml.kernel.org/r/20220331180246.2746210-1-longman@redhat.com
Fixes: 3e347261a80b ("sparsemem extreme implementation")
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: Justin Forbes <jforbes@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When partialDecoding, it is EOF if we've either filled the output buffer
or can't proceed with reading an offset for following match.
In some extreme corner cases when compressed data is suitably corrupted,
UAF will occur. As reported by KASAN [1], LZ4_decompress_safe_partial
may lead to read out of bound problem during decoding. lz4 upstream has
fixed it [2] and this issue has been disscussed here [3] before.
current decompression routine was ported from lz4 v1.8.3, bumping
lib/lz4 to v1.9.+ is certainly a huge work to be done later, so, we'd
better fix it first.
[1] https://lore.kernel.org/all/000000000000830d1205cf7f0477@google.com/
[2] https://github.com/lz4/lz4/commit/c5d6f8a8be3927c0bec91bcc58667a6cfad244ad#
[3] https://lore.kernel.org/all/CC666AE8-4CA4-4951-B6FB-A2EFDE3AC03B@fb.com/
Link: https://lkml.kernel.org/r/20211111105048.2006070-1-guoxuenan@huawei.com
Reported-by: syzbot+63d688f1d899c588fb71@syzkaller.appspotmail.com
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Nick Terrell <terrelln@fb.com>
Acked-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Yann Collet <cyan@fb.com>
Cc: Chengyang Fan <cy.fan@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When CONFIG_DEBUG_KMAP_LOCAL is enabled __kmap_local_sched_{in,out} check
that even slots in the tsk->kmap_ctrl.pteval are unmapped. The slots are
initialized with 0 value, but the check is done with pte_none. 0 pte
however does not necessarily mean that pte_none will return true. e.g.
on xtensa it returns false, resulting in the following runtime warnings:
WARNING: CPU: 0 PID: 101 at mm/highmem.c:627 __kmap_local_sched_out+0x51/0x108
CPU: 0 PID: 101 Comm: touch Not tainted 5.17.0-rc7-00010-gd3a1cdde80d2-dirty #13
Call Trace:
dump_stack+0xc/0x40
__warn+0x8f/0x174
warn_slowpath_fmt+0x48/0xac
__kmap_local_sched_out+0x51/0x108
__schedule+0x71a/0x9c4
preempt_schedule_irq+0xa0/0xe0
common_exception_return+0x5c/0x93
do_wp_page+0x30e/0x330
handle_mm_fault+0xa70/0xc3c
do_page_fault+0x1d8/0x3c4
common_exception+0x7f/0x7f
WARNING: CPU: 0 PID: 101 at mm/highmem.c:664 __kmap_local_sched_in+0x50/0xe0
CPU: 0 PID: 101 Comm: touch Tainted: G W 5.17.0-rc7-00010-gd3a1cdde80d2-dirty #13
Call Trace:
dump_stack+0xc/0x40
__warn+0x8f/0x174
warn_slowpath_fmt+0x48/0xac
__kmap_local_sched_in+0x50/0xe0
finish_task_switch$isra$0+0x1ce/0x2f8
__schedule+0x86e/0x9c4
preempt_schedule_irq+0xa0/0xe0
common_exception_return+0x5c/0x93
do_wp_page+0x30e/0x330
handle_mm_fault+0xa70/0xc3c
do_page_fault+0x1d8/0x3c4
common_exception+0x7f/0x7f
Fix it by replacing !pte_none(pteval) with pte_val(pteval) != 0.
Link: https://lkml.kernel.org/r/20220403235159.3498065-1-jcmvbkbc@gmail.com
Fixes: 5fbda3ecd14a ("sched: highmem: Store local kmaps in task struct")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix a VM_BUG_ON_FOLIO(folio_nr_pages(old) != nr_pages) crash.
With folios support, it is possible to have other than HPAGE_PMD_ORDER
THPs, in the form of folios, in the system. Use thp_order() to correctly
determine the source page order during migration.
Link: https://lkml.kernel.org/r/20220404165325.1883267-1-zi.yan@sent.com
Link: https://lore.kernel.org/linux-mm/20220404132908.GA785673@u2004/
Fixes: d68eccad3706 ("mm/filemap: Allow large folios to be added to the page cache")
Reported-by: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 01491a756578 ("fscache, cachefiles: Disable configuration") added
the FSCACHE_OLD_API configuration when rewritten. Now, it's not used any
more. Remove it.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-March/006647.html # v1
|
|
We already have the wrapper function to set cache state.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006648.html # v1
|
|
fscache_cookies_seq_ops is only used in proc.c that is compiled under
enabled CONFIG_PROC_FS, so move related code under this config. The
same case exsits in internal.h.
Also, make fscache_lru_cookie_timeout static due to no user outside
of cookie.c.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006649.html # v1
|
|
The cookie is not used at all, remove it and update the usage in io.c
and afs/write.c (which is the only user outside of fscache currently)
at the same time.
[DH: Amended the documentation also]
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006659.html
|
|
There's no fscache_are_objects_withdrawn() helper at all to test if
cookie withdrawal is completed currently. The cache backend is using
fscache_wait_for_objects() to wait all objects to be withdrawn.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006705.html # v1
|
|
1. cache backend is using fscache_relinquish_cache() rather than
fscache_relinquish_cookie() to reset the cache cookie.
2. No fscache_cache_relinquish() helper currently, it should be
fscache_relinquish_cache().
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006703.html # v1
Link: https://listman.redhat.com/archives/linux-cachefs/2022-April/006704.html # v2
|
|
Use the actual length of volume coherency data when setting the
xattr to avoid the following KASAN report.
BUG: KASAN: slab-out-of-bounds in cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles]
Write of size 4 at addr ffff888101e02af4 by task kworker/6:0/1347
CPU: 6 PID: 1347 Comm: kworker/6:0 Kdump: loaded Not tainted 5.18.0-rc1-nfs-fscache-netfs+ #13
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
Workqueue: events fscache_create_volume_work [fscache]
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x5a
print_report.cold+0x5e/0x5db
? __lock_text_start+0x8/0x8
? cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles]
kasan_report+0xab/0x120
? cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles]
kasan_check_range+0xf5/0x1d0
memcpy+0x39/0x60
cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles]
cachefiles_acquire_volume+0x2be/0x500 [cachefiles]
? __cachefiles_free_volume+0x90/0x90 [cachefiles]
fscache_create_volume_work+0x68/0x160 [fscache]
process_one_work+0x3b7/0x6a0
worker_thread+0x2c4/0x650
? process_one_work+0x6a0/0x6a0
kthread+0x16c/0x1a0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Allocated by task 1347:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
cachefiles_set_volume_xattr+0x76/0x350 [cachefiles]
cachefiles_acquire_volume+0x2be/0x500 [cachefiles]
fscache_create_volume_work+0x68/0x160 [fscache]
process_one_work+0x3b7/0x6a0
worker_thread+0x2c4/0x650
kthread+0x16c/0x1a0
ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff888101e02af0
which belongs to the cache kmalloc-8 of size 8
The buggy address is located 4 bytes inside of
8-byte region [ffff888101e02af0, ffff888101e02af8)
The buggy address belongs to the physical page:
page:00000000a2292d70 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x101e02
flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0000200 0000000000000000 dead000000000001 ffff888100042280
raw: 0000000000000000 0000000080660066 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888101e02980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc
ffff888101e02a00: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00
>ffff888101e02a80: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 04 fc
^
ffff888101e02b00: fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc
ffff888101e02b80: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc
==================================================================
Fixes: 413a4a6b0b55 "cachefiles: Fix volume coherency attribute"
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/20220405134649.6579-1-dwysocha@redhat.com/ # v1
Link: https://lore.kernel.org/r/20220405142810.8208-1-dwysocha@redhat.com/ # Incorrect v2
|
|
Unmark inode in use if error encountered. If the in-use flag leakage
occurs in cachefiles_open_file(), Cachefiles will complain "Inode
already in use" when later another cookie with the same index key is
looked up.
If the in-use flag leakage occurs in cachefiles_create_tmpfile(), though
the "Inode already in use" warning won't be triggered, fix the leakage
anyway.
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://listman.redhat.com/archives/linux-cachefs/2022-March/006615.html # v1
Link: https://listman.redhat.com/archives/linux-cachefs/2022-March/006618.html # v2
|
|
Currently, when inserting a new filter that needs to sit at the head
of chain 0, it will first update the heads pointer on all devices using
the (shared) block, and only then complete the initialization of the new
element so that it has a "next" element.
This can lead to a situation that the chain 0 head is propagated to
another CPU before the "next" initialization is done. When this race
condition is triggered, packets being matched on that CPU will simply
miss all other filters, and will flow through the stack as if there were
no other filters installed. If the system is using OVS + TC, such
packets will get handled by vswitchd via upcall, which results in much
higher latency and reordering. For other applications it may result in
packet drops.
This is reproducible with a tc only setup, but it varies from system to
system. It could be reproduced with a shared block amongst 10 veth
tunnels, and an ingress filter mirroring packets to another veth.
That's because using the last added veth tunnel to the shared block to
do the actual traffic, it makes the race window bigger and easier to
trigger.
The fix is rather simple, to just initialize the next pointer of the new
filter instance (tp) before propagating the head change.
The fixes tag is pointing to the original code though this issue should
only be observed when using it unlocked.
Fixes: 2190d1d0944f ("net: sched: introduce helpers to work with filter chains")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/b97d5f4eaffeeb9d058155bcab63347527261abf.1649341369.git.marcelo.leitner@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Yi Chen reported an unexpected sctp connection abort, and it occurred when
COOKIE_ECHO is bundled with DATA Fragment by SCTP HW GSO. As the IP header
is included in chunk->head_skb instead of chunk->skb, it failed to check
IP header version in security_sctp_assoc_request().
According to Ondrej, SELinux only looks at IP header (address and IPsec
options) and XFRM state data, and these are all included in head_skb for
SCTP HW GSO packets. So fix it by using head_skb when calling
security_sctp_assoc_request() in processing COOKIE_ECHO.
v1->v2:
- As Ondrej noticed, chunk->head_skb should also be used for
security_sctp_assoc_established() in sctp_sf_do_5_1E_ca().
Fixes: e215dab1c490 ("security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce")
Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/71becb489e51284edf0c11fc15246f4ed4cef5b6.1649337862.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|