Age | Commit message (Collapse) | Author |
|
If a previous ufence is not signalled, vm_bind will return -EBUSY.
Delaying the modification of ufence->signalled can cause issues if the
UMD reuses the same ufence so update ufence->signalled before waking up
waiters.
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3233
Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114150537.4161573-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 553a5d14fcd927194c409b10faced6a6dbc678d1)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
SBI_POR_DOING can be cleared after recovery is completed, so that
changes made before recovery can be persistent, and subsequent
errors can be recorded into cp/sb.
Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Fsync data recovery attempts to check and fix write pointer consistency
of cursegs and all other zones. If the write pointers of cursegs are
unaligned, cursegs are changed to new sections.
If recovery fails, zone write pointers are still checked and fixed,
but the latest checkpoint cannot be written back. Additionally, retry-
mount skips recovery and rolls back to reuse the old cursegs whose
zones are already finished. This can lead to unaligned write later.
This patch addresses the issue by leaving writer pointers untouched if
recovery fails. When retry-mount is performed, cursegs and other zones
are checked and fixed after skipping recovery.
Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
The unusable cap value must be adjusted before checking whether
checkpoint=disable is feasible.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
dd if=/dev/zero of=file bs=4k count=5
xfs_io file -c "fiemap -v 2 16384"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 139272..139303 32 0x1000
1: [32..39]: 139304..139311 8 0x1001
xfs_io file -c "fiemap -v 0 16384"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..31]: 139272..139303 32 0x1000
xfs_io file -c "fiemap -v 0 16385"
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..39]: 139272..139311 40 0x1001
There are two problems:
- continuous extent is split to two
- FIEMAP_EXTENT_LAST is missing in last extent
The root cause is: if upper boundary of inquiry crosses extent,
f2fs_map_blocks() will truncate length of returned extent to
F2FS_BYTES_TO_BLK(len), and also, it will stop to query latter
extent or hole to make sure current extent is last or not.
In order to fix this issue, once we found an extent locates
in the end of inquiry range by f2fs_map_blocks(), we need to
expand inquiry range to requiry.
Cc: stable@vger.kernel.org
Fixes: 7f63eb77af7b ("f2fs: report unwritten area in f2fs_fiemap")
Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If user give a file size as "length" parameter for fiemap
operations, but if this size is non-block size aligned,
it will show 2 segments fiemap results even this whole file
is contiguous on disk, such as the following results:
./f2fs_io fiemap 0 19034 ylog/analyzer.py
Fiemap: offset = 0 len = 19034
logical addr. physical addr. length flags
0 0000000000000000 0000000020baa000 0000000000004000 00001000
1 0000000000004000 0000000020bae000 0000000000001000 00001001
after this patch:
./f2fs_io fiemap 0 19034 ylog/analyzer.py
Fiemap: offset = 0 len = 19034
logical addr. physical addr. length flags
0 0000000000000000 00000000315f3000 0000000000005000 00001001
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
f2fs doesn't support different blksize in one instance, so
bytes_to_blks() and blks_to_bytes() are equal to F2FS_BYTES_TO_BLK
and F2FS_BLK_TO_BYTES, let's use F2FS_BYTES_TO_BLK/F2FS_BLK_TO_BYTES
instead for cleanup.
Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
It missed to cast variable to unsigned long long type before
bit shift, which will cause overflow, fix it.
Fixes: f7ef9b83b583 ("f2fs: introduce macros to convert bytes and blocks in f2fs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
strcpy is deprecated. Kernel docs recommend replacing strcpy with
strscpy. The function strcpy() return value isn't used so there
shouldn't be an issue replacing with the safer alternative strscpy.
Signed-off-by: Daniel Yang <danielyangkang@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This reverts commit 54f43a10fa257ad4af02a1d157fefef6ebcfa7dc.
The above commit broke the lazytime mount, given
mount("/dev/vdb", "/mnt/test", "f2fs", 0, "lazytime");
CC: stable@vger.kernel.org # 6.11+
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Add BPF uprobe session support (Jiri Olsa)
- Optimize uprobe performance (Andrii Nakryiko)
- Add bpf_fastcall support to helpers and kfuncs (Eduard Zingerman)
- Avoid calling free_htab_elem() under hash map bucket lock (Hou Tao)
- Prevent tailcall infinite loop caused by freplace (Leon Hwang)
- Mark raw_tracepoint arguments as nullable (Kumar Kartikeya Dwivedi)
- Introduce uptr support in the task local storage map (Martin KaFai
Lau)
- Stringify errno log messages in libbpf (Mykyta Yatsenko)
- Add kmem_cache BPF iterator for perf's lock profiling (Namhyung Kim)
- Support BPF objects of either endianness in libbpf (Tony Ambardar)
- Add ksym to struct_ops trampoline to fix stack trace (Xu Kuohai)
- Introduce private stack for eligible BPF programs (Yonghong Song)
- Migrate samples/bpf tests to selftests/bpf test_progs (Daniel T. Lee)
- Migrate test_sock to selftests/bpf test_progs (Jordan Rife)
* tag 'bpf-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (152 commits)
libbpf: Change hash_combine parameters from long to unsigned long
selftests/bpf: Fix build error with llvm 19
libbpf: Fix memory leak in bpf_program__attach_uprobe_multi
bpf: use common instruction history across all states
bpf: Add necessary migrate_disable to range_tree.
bpf: Do not alloc arena on unsupported arches
selftests/bpf: Set test path for token/obj_priv_implicit_token_envvar
selftests/bpf: Add a test for arena range tree algorithm
bpf: Introduce range_tree data structure and use it in bpf arena
samples/bpf: Remove unused variable in xdp2skb_meta_kern.c
samples/bpf: Remove unused variables in tc_l2_redirect_kern.c
bpftool: Cast variable `var` to long long
bpf, x86: Propagate tailcall info only for subprogs
bpf: Add kernel symbol for struct_ops trampoline
bpf: Use function pointers count as struct_ops links count
bpf: Remove unused member rcu from bpf_struct_ops_map
selftests/bpf: Add struct_ops prog private stack tests
bpf: Support private stack for struct_ops progs
selftests/bpf: Add tracing prog private stack tests
bpf, x86: Support private stack in jit
...
|
|
SPI-NAND changes:
A load of fixes to Winbond manufacturer driver have been done, plus a
structure constification.
Raw NAND changes:
The GPMI driver has been improved on the power management side.
The Davinci driver has been cleaned up.
A leak in the Atmel driver plus some typos in the core have been fixed.
|
|
Rename pwrctrl functions and structures from "pwrctl" to "pwrctrl" to match
the similar file renames.
Link: https://lore.kernel.org/r/20241115214428.2061153-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
|
|
To slightly reduce confusion between "pwrctl" (the power controller and
power sequencing framework) and "bwctrl" (the bandwidth controller),
rename "pwrctl" to "pwrctrl" so they use the same "ctrl" suffix.
Rename drivers/pci/pwrctl/ to drivers/pci/pwrctrl/, including the related
MAINTAINERS, include file (include/linux/pci-pwrctl.h), Makefile, and
Kconfig changes.
This is the minimal rename of files only. A subsequent commit will rename
functions and data structures.
Link: https://lore.kernel.org/r/20241115214428.2061153-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
|
|
pwrctl parent
There is no need to iterate over all children of the pwrctl device parent
to remove the pwrctl device. Since the pwrctl device associated with the
PCI device can be found using of_find_device_by_node() API, use it directly
instead.
Any pwrctl devices lying around without getting associated with the PCI
devices will be removed once their parent device gets removed.
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-5-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
As per the kernel device driver model, a pwrctl device is the supplier for
the PCI device, but the device link that enforces the supplier-consumer
relationship was previously created by the pwrctl driver. Therefore, the
driver model didn't prevent probing PCI client drivers before probing the
corresponding pwrctl drivers. This may lead to a race condition if the PCI
device was already powered on by the bootloader (before the pwrctl driver).
If the bootloader did not power on the PCI device, this wouldn't create any
problem as the pwrctl driver will be the one powering on the device, so the
PCI client driver always gets probed afterward. But if the device was
already powered on, then the device will be seen by the PCI core and the
PCI client driver may get probed before its pwrctl driver. This creates a
race condition as the pwrctl driver may change the device power state while
the device is being accessed by the client driver.
One such issue was already reported on the Qcom X13s platform with the WLAN
device and fixed with a hack in the WCN pwrseq driver by a9aaf1ff88a8
("power: sequencing: request the WLAN enable GPIO as-is").
A cleaner way to fix the above mentioned race condition is to ensure that
the pwrctl drivers are always probed before the client drivers.
If the PCI device is associated with a pwrctl platform device with a power
supply, add a device link between the PCI device and the pwrctl device
before device_attach() in pci_bus_add_device().
Note that there is no need to explicitly remove the device link as that
will be taken care of by the driver core when the PCI device gets removed.
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Fixes: 8fb18619d910 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-3-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: squash fix from
https://lore.kernel.org/r/20241120062459.6371-1-manivannan.sadhasivam@linaro.org
for SPARCv9 issue reported by Jonathan Currier <dullfire@yahoo.com>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: wrap code to 80 columns]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org # Depends on power supply check
|
|
Currently, pwrctl devices are created if the corresponding PCI nodes are
defined in devicetree. But this is not correct, because not all PCI nodes
require pwrctl support. Pwrctl comes into the picture only when the device
requires kernel to manage its power state. This can be determined using the
power supply properties present in the devicetree node of the device.
Add of_pci_supply_present() to check whether the devicetree contains at
least one power supply property for a device. If one is present, create a
pwrctl device for that PCI node.
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 8fb18619d910 ("PCI/pwrctl: Create platform devices for child OF nodes of the port node")
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-2-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[bhelgaas: rename of_pci_is_supply_present() to of_pci_supply_present() for
readability]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: stable+noautosel@kernel.org # Depends on of_platform_device_create() rework
|
|
The of_platform_populate() API creates platform devices by descending
through the children of the parent node. But it provides no control over
the child nodes, which makes it difficult to add checks for the child
nodes in the future.
Use of_platform_device_create() and for_each_child_of_node_scoped() to make
it possible to add checks for each node before creating the platform
device.
Link: https://lore.kernel.org/r/20241025-pci-pwrctl-rework-v2-1-568756156cbe@linaro.org
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Tested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
We should only consider max_uA constraints if they are explicitly defined.
In cases where it is not set, we should assume the regulator has no current
limit.
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20241121-feature_poe_port_prio-v3-2-83299fa6967c@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
A previous commit added file support for nop, but it only initializes
nop->fd if IORING_NOP_FIXED_FILE is set. That check should be
IORING_NOP_FILE. Fix up the condition in nop preparation, and initialize
it to a sane value even if we're not going to be directly using it.
While in there, do the same thing for the nop->buffer field.
Reported-by: syzbot+9a8500a45c2cabdf9577@syzkaller.appspotmail.com
Fixes: a85f31052bce ("io_uring/nop: add support for testing registered files and buffers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead of eagerly running all available local tw, limit the amount of
local tw done to the max of IO_LOCAL_TW_DEFAULT_MAX (20) or wait_nr. The
value of 20 is chosen as a reasonable heuristic to allow enough work
batching but also keep latency down.
Add a retry_llist that maintains a list of local tw that couldn't be
done in time. No synchronisation is needed since it is only modified
within the task context.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20241120221452.3762588-3-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation for adding a new llist of tw to retry due to hitting the
tw limit, add a helper io_local_work_pending(). This function returns
true if there is any local tw pending. For now it only checks
ctx->work_llist.
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20241120221452.3762588-2-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In the mca_fe_hw_params(), the variable tdm_slot_width is of
type unsigned int, so the output should be %u
Signed-off-by: liujing <liujing@cmss.chinamobile.com>
Link: https://patch.msgid.link/20241113015758.5441-1-liujing@cmss.chinamobile.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
devm_kasprintf() can return a NULL pointer on failure,but this
returned value in imx_audmix_probe() is not checked.
Add NULL check in imx_audmix_probe(), to handle kernel NULL
pointer dereference error.
Fixes: 05d996e11348 ("ASoC: imx-audmix: Split capture device for audmix")
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Link: https://patch.msgid.link/20241118084553.4195-1-hanchunchao@inspur.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The words delay is computed by multiplying two unsigned ints
and by adding up the result to a u64 variable.
The multiplication, however, is performed with 32bit math
thus losing data when the actual result is larger than UINT32_MAX.
Fix the operation by casting the first operand to u64, thus forcing
the multiplication to be performed with 64bit math.
This fixes 1 OVERFLOW_BEFORE_WIDEN issue reported by Coverity
Report: CID 1601859: Integer handling issues (OVERFLOW_BEFORE_WIDEN)
Cc: Mark Brown <broonie@kernel.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: imx@lists.linux.dev
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Antonio Quartulli <antonio@mandelbit.com>
Link: https://patch.msgid.link/20241115220202.31086-1-antonio@mandelbit.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Large amount of mount hangs observed during hotplugging of 9pfs devices. The
9pfs Xen driver attempts to initialize itself more than once, causing the
frontend and backend to disagree: the backend listens on a channel that the
frontend does not send on, resulting in stalled processing.
Only allow initialization of 9p frontend once.
Fixes: c15fe55d14b3b ("9p/xen: fix connection sequence")
Signed-off-by: Alex Zenla <alex@edera.dev>
Signed-off-by: Alexander Merritt <alexander@edera.dev>
Signed-off-by: Ariadne Conill <ariadne@ariadne.space>
Reviewed-by: Juergen Gross <jgross@suse.com>
Message-ID: <20241119211633.38321-1-alexander@edera.dev>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
|
|
Update the get-reg-list test to test the Svade and Svadu Extensions are
available for guest OS.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-6-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
We extend the KVM ISA extension ONE_REG interface to allow VMM tools to
detect and enable Svade and Svadu extensions for Guest/VM. Since the
henvcfg.ADUE is read-only zero if the menvcfg.ADUE is zero, the Svadu
extension is available for Guest/VM and the Svade extension is allowed
to disabledonly when arch_has_hw_pte_young() is true.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-4-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Add entries for the Svade and Svadu extensions to the riscv,isa-extensions
property.
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-3-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Svade and Svadu extensions represent two schemes for managing the PTE A/D
bits. When the PTE A/D bits need to be set, Svade extension intdicates
that a related page fault will be raised. In contrast, the Svadu extension
supports hardware updating of PTE A/D bits. Since the Svade extension is
mandatory and the Svadu extension is optional in RVA23 profile, by default
the M-mode firmware will enable the Svadu extension in the menvcfg CSR
when only Svadu is present in DT.
This patch detects Svade and Svadu extensions from DT and adds
arch_has_hw_pte_young() to enable optimization in MGLRU and
__wp_page_copy_user() when we have the PTE A/D bits hardware updating
support.
Co-developed-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-2-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The stack depot filters out everything outside of the top interrupt
context as an uninteresting or irrelevant part of the stack traces. This
helps with stack trace de-duplication, avoiding an explosion of saved
stack traces that share the same IRQ context code path but originate
from different randomly interrupted points, eventually exhausting the
stack depot.
Filtering uses in_irqentry_text() to identify functions within the
.irqentry.text and .softirqentry.text sections, which then become the
last stack trace entries being saved.
While __do_softirq() is placed into the .softirqentry.text section by
common code, populating .irqentry.text is architecture-specific.
Currently, the .irqentry.text section on s390 is empty, which prevents
stack depot filtering and de-duplication and could result in warnings
like:
Stack depot reached limit capacity
WARNING: CPU: 0 PID: 286113 at lib/stackdepot.c:252 depot_alloc_stack+0x39a/0x3c8
with PREEMPT and KASAN enabled.
Fix this by moving the IO/EXT interrupt handlers from .kprobes.text into
the .irqentry.text section and updating the kprobes blacklist to include
the .irqentry.text section.
This is done only for asynchronous interrupts and explicitly not for
program checks, which are synchronous and where the context beyond the
program check is important to preserve. Despite machine checks being
somewhat in between, they are extremely rare, and preserving context
when possible is also of value.
SVCs and Restart Interrupts are not relevant, one being always at the
boundary to user space and the other being a one-time thing.
IRQ entries filtering is also optionally used in ftrace function graph,
where the same logic applies.
Cc: stable@vger.kernel.org # 5.15+
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Free_sampling_buffer() releases the Sampling Data Buffers (SDBs)
and Sampling Data Buffer Table (SDBTs) allocated at event
initialization. Both buffers are of PAGE_SIZE bytes. Each SDBT
consists of 512 entries. The first 511 entries point to SDBs
the last entry points to a successor SDBT. The last SDBT in
the list points to the origin of all SDBTs. SDBTs do not
contain holes, that is an entry always points to a SDB.
If less than 511 SDBs have been allocation, the last entry
points to the origin SDBT.
Simplify the release of the SDBs and SDBTs, walk along the
SDBT chain, release SDBs and SDBTs and stop when reaching
the origin again.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
If the page has been exported, do not re-import it. Imports should
only be triggered by the guest. The guest will import the page
automatically when it will need it again, there is no advantage in
importing it manually.
Moreover, vfio_pin_pages() will take an extra reference on the page and
thus will cause the import to always fail. The extra reference would be
dropped only after pointlessly trying to import the page.
Fixes: f88fb1335733 ("s390/vfio-ap: make sure nib is shared")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20241115135611.87836-1-imbrenda@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
arch_stack_walk_user_common() contains a return statement instead of a
break statement in case store_ip() fails while trying to store a callchain
entry of a user space process.
This may lead to a missing pagefault_enable() call.
If this happens any subsequent page fault of the process won't be resolved
by the page fault handler and this in turn will lead to the process being
killed.
Use a break instead of a return statement to fix this.
Fixes: ebd912ff9919 ("s390/stacktrace: Merge perf_callchain_user() and arch_stack_walk_user()")
Cc: stable@vger.kernel.org
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Since commit b3e5dc45fd1e ("s390/mm: fix local TLB flushing vs.
detach of an mm address space") __tlb_flush_mm() does not flush
local CPU, but the comment suggests it does, which is misleading.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
The s390dbf format handler rely on being passed an output buffer sized
such that their output will always fit and then use plain sprintf() to
write to it. While only supplied data from other kernel components this
still potentially allows buffer overwrite if callers are not careful.
Instead just pass in the size of the output buffer and use scnprintf()
instead of sprintf() and strscpy() instead of strcpy(). The latter also
allows us to get rid of a separate strlen() call.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
A driver might allow the mmap access before initializing its
runtime->dma_area properly. Add a proper NULL check before passing to
virt_to_page() for avoiding a panic.
Reported-by: syzbot+4bf62a7b1d0f4fdb7ae2@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20241120141104.7060-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Update the hyperlink as it currently redirects to a generic site
instead of the actual specification.
Signed-off-by: Hridesh MG <hridesh699@gmail.com>
Link: https://patch.msgid.link/20241120155553.21099-1-hridesh699@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.13, part #2
- Constrain invalidations from GICR_INVLPIR to only affect the LPI
INTID space
- Set of robustness improvements to the management of vgic irqs and GIC
ITS table entries
- Fix compilation issue w/ CONFIG_CC_OPTIMIZE_FOR_SIZE=y where
set_sysreg_masks() wasn't getting inlined, breaking check for a
constant sysreg index
- Correct KVM's vPMU overflow condition to match the architecture for
hyp and non-hyp counters
|
|
Fix the inverted check for security_sb_show_options().
Link: https://lore.kernel.org/r/c8eaa647-5d67-49b6-9401-705afcb7e4d7@stanley.mountain
Link: https://lore.kernel.org/r/20241120-verehren-rhabarber-83a11b297bcc@brauner
Fixes: aefff51e1c29 ("statmount: retrieve security mount options")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: stable@vger.kernel.org # mainline only
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Move common code from opt_array/opt_sec_array to helper. This helper
does more than just unescape options, so rename to
statmount_opt_process().
Handle corner case of just a single character in options.
Rename some local variables to better describe their function.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/r/20241120142732.55210-1-mszeredi@redhat.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Brian Foster <bfoster@redhat.com> says:
Here's v4 of the zero range flush improvements. No real major changes
here, mostly minor whitespace, naming issues, etc.
* patches from https://lore.kernel.org/r/20241115200155.593665-1-bfoster@redhat.com:
iomap: elide flush from partial eof zero range
iomap: lift zeroed mapping handling into iomap_zero_range()
iomap: reset per-iter state on non-error iter advances
Link: https://lore.kernel.org/r/20241115200155.593665-1-bfoster@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Remove duplicate included header file linux/uio.h
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Link: https://lore.kernel.org/r/20240628062329.321162-2-thorsten.blum@toblux.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap zero range flushes pagecache in certain situations to
determine which parts of the range might require zeroing if dirty
data is present in pagecache. The kernel robot recently reported a
regression associated with this flushing in the following stress-ng
workload on XFS:
stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --metamix 64
This workload involves repeated small, strided, extending writes. On
XFS, this produces a pattern of post-eof speculative preallocation,
conversion of preallocation from delalloc to unwritten, dirtying
pagecache over newly unwritten blocks, and then rinse and repeat
from the new EOF. This leads to repetitive flushing of the EOF folio
via the zero range call XFS uses for writes that start beyond
current EOF.
To mitigate this problem, special case EOF block zeroing to prefer
zeroing the folio over a flush when the EOF folio is already dirty.
To do this, split out and open code handling of an unaligned start
offset. This brings most of the performance back by avoiding flushes
on zero range calls via write and truncate extension operations. The
flush doesn't occur in these situations because the entire range is
post-eof and therefore the folio that overlaps EOF is the only one
in the range.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-4-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
In preparation for special handling of subranges, lift the zeroed
mapping logic from the iterator into the caller. Since this puts the
pagecache dirty check and flushing in the same place, streamline the
comments a bit as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-3-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap_iter_advance() zeroes the processed and mapping fields on
every non-error iteration except for the last expected iteration
(i.e. return 0 expected to terminate the iteration loop). This
appears to be circumstantial as nothing currently relies on these
fields after the final iteration.
Therefore to better faciliate iomap_iter reuse in subsequent
patches, update iomap_iter_advance() to always reset per-iteration
state on successful completion.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115200155.593665-2-bfoster@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
iomap_zero_range() uses buffered writes for manual zeroing, no
longer updates i_size for such writes, but is still explicitly
called for post-eof ranges. The historical use case for this is
zeroing post-eof speculative preallocation on extending writes from
XFS. However, XFS also recently changed to convert all post-eof
delalloc mappings to unwritten in the iomap_begin() handler, which
means it now never expects manual zeroing of post-eof mappings. In
other words, all post-eof mappings should be reported as holes or
unwritten.
This is a subtle dependency that can be hard to detect if violated
because associated codepaths are likely to update i_size after folio
locks are dropped, but before writeback happens to occur. For
example, if XFS reverts back to some form of manual zeroing of
post-eof blocks on write extension, writeback of those zeroed folios
will now race with the presumed i_size update from the subsequent
buffered write.
Since iomap_zero_range() can't correctly zero post-eof mappings
beyond EOF without updating i_size, warn if this ever occurs. This
serves as minimal indication that if this use case is reintroduced
by a filesystem, iomap_zero_range() might need to reconsider i_size
updates for write extending use cases.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Link: https://lore.kernel.org/r/20241115145931.535207-1-bfoster@redhat.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Setting GPIO direction = high, sometimes results in GPIO value = 0.
If a GPIO is pulled high, the following construction results in the
value being 0 when the desired value is 1:
$ echo "high" > /sys/class/gpio/gpio336/direction
$ cat /sys/class/gpio/gpio336/value
0
Before the GPIO direction is changed from an input to an output,
exar_set_value() is called with value = 1, but since the GPIO is an
input when exar_set_value() is called, _regmap_update_bits() reads a 1
due to an external pull-up. regmap_set_bits() sets force_write =
false, so the value (1) is not written. When the direction is then
changed, the GPIO becomes an output with the value of 0 (the hardware
default).
regmap_write_bits() sets force_write = true, so the value is always
written by exar_set_value() and an external pull-up doesn't affect the
outcome of setting direction = high.
The same can happen when a GPIO is pulled low, but the scenario is a
little more complicated.
$ echo high > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
1
$ echo in > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
0
$ echo low > /sys/class/gpio/gpio351/direction
$ cat /sys/class/gpio/gpio351/value
1
Fixes: 36fb7218e878 ("gpio: exar: switch to using regmap")
Co-developed-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Matthew McClain <mmcclain@noprivs.com>
Signed-off-by: Sai Kumar Cholleti <skmr537@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241105071523.2372032-1-skmr537@gmail.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
During conversion driver to modern APIs the base field initial value
of the GPIO chip was moved from -1 to 0, which triggers a warning.
Add missed base initialisation as it was in the original code.
Initialise the GPIO chip label correctly as it was done by
of_mm_gpiochip_add_data() before the below mentioned change.
Fixes: 50dded8d9d62 ("gpio: altera: Drop legacy-of-mm-gpiochip.h header")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241118095402.516989-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Initialise the GPIO chip label correctly as it was done by
of_mm_gpiochip_add_data() before the below mentioned change.
Fixes: cf8f4462e5fa ("gpio: zevio: drop of_gpio.h header")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241118092729.516736-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|