|
When a driver tries to send an I2C message while the adapter is
suspended, this typically fails with:
i2c-sh_mobile e60b0000.i2c: Transfer request timed out
Avoid accessing the adapter while it is suspended by marking it
suspended during suspend. This allows the I2C core to catch this, and
print a warning:
WARNING: CPU: 1 PID: 13 at drivers/i2c/i2c-core.h:54
__i2c_transfer+0x4a4/0x4e4
i2c i2c-6: Transfer while suspended
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
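The idea of marking the adapter suspended so that the core can reject transfers can be sketched in plain userspace C. This is only a model under stated assumptions: struct model_adapter, the model_* functions and the -ESHUTDOWN return value are hypothetical stand-ins chosen for illustration, not the kernel's I2C API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an I2C adapter (not the kernel struct). */
struct model_adapter {
    bool suspended;   /* set by the bus driver's suspend callback */
    int hw_accesses;  /* counts how often the "hardware" was touched */
};

/* Called from the bus driver's suspend path. */
static void model_mark_suspended(struct model_adapter *adap)
{
    assert(adap != NULL);
    adap->suspended = true;
}

/* Called from the bus driver's resume path. */
static void model_mark_resumed(struct model_adapter *adap)
{
    assert(adap != NULL);
    adap->suspended = false;
}

/* Core-side entry point: refuse the transfer before touching hardware. */
static int model_transfer(struct model_adapter *adap)
{
    assert(adap != NULL);
    if (adap->suspended)
        return -ESHUTDOWN;  /* assumption: error code picked for this model */
    adap->hw_accesses++;
    return 0;
}
```

In this model a transfer issued between model_mark_suspended() and model_mark_resumed() fails immediately and leaves hw_accesses unchanged, instead of timing out on hardware that is powered down.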
|
|
Add S500 variant to the list of devices supported by the Actions Semi
Owl I2C driver.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Convert the Actions Semi Owl I2C DT binding to a YAML schema to
enable DT validation.
Additionally, add a new compatible string corresponding to the I2C
controller found in the S500 variant of the Actions Semi Owl SoC
family.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-genpd
Remaining genpd changes for omaps for v5.11
This series contains the remaining genpd changes for omap4/5,
and dra7 to add the power domain and reset control data to
omap-prm driver. We also update several devices to probe without
platform data to get us closer to booting omap4/5, and dra7
without platform data.
There is also a build fix for the earlier am437x series that
I should have applied into a separate branch on top of the
am437x breaking commit. It ended here as I was originally
planning to send out a single pull request for all the genpd
changes, but then decided to break it down to smaller chunks.
It's all really a larger single git branch though, so this
should be OK and I really did not want to start reorganizing
the branch after testing it and having it sit in Linux next.
The changes done here are:
- Clock driver needs idlest check dropped for IVA for omap4
and dra7
- Add remaining power domain and reset control data to
omap-prm driver for omap4/5 and dra7
- Add device tree data for remaining power domains and
reset control for omap4/5 and dra7 dts files
- Update dss, dsp, iva and gpmc dts files to use genpd
and to drop the remaining platform data
- Update dss for omap5 to use genpd
- Update dra7 iva to use genpd and to drop the remaining
platform data
* tag 'omap-for-v5.11/genpd-rest-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Fix am4 only build after genpd changes
ARM: dts: Configure power domain for omap5 dss
ARM: dts: omap5: add remaining PRM instances
soc: ti: omap-prm: omap5: add genpd support for remaining PRM instances
ARM: OMAP2+: Drop legacy platform data for dra7 gpmc
ARM: dts: Configure interconnect target module for dra7 iva
ARM: dts: dra7: add remaining PRM instances
soc: ti: omap-prm: dra7: add genpd support for remaining PRM instances
clk: ti: dra7: Drop idlest polling from IVA clkctrl clocks
ARM: OMAP2+: Drop legacy platform data for omap4 gpmc
ARM: OMAP2+: Drop legacy platform data for omap4 iva
ARM: dts: Configure power domain for omap4 dsp
ARM: dts: Configure power domain for omap4 dss
ARM: dts: omap4: add remaining PRM instances
soc: ti: omap-prm: omap4: add genpd support for remaining PRM instances
clk: ti: omap4: Drop idlest polling from IVA clkctrl clocks
Link: https://lore.kernel.org/r/pull-1606806458-694517@atomide.com-4
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-genpd
Update am437x to boot without platform data
Similar to am335x, we can now update am437x dts files to boot
with genpd and simple-pm-bus, and drop the related platform data.
To do that, we need to do the following changes for am437x:
- Update the clock driver to keep the l3_main clock always on for
suspend and resume to work
- Add power domain and reset controller data to omap-prm driver
- Configure interconnect clocks for system timers as those are
now managed separately by the drivers/clocksource drivers
- Update control module, wkup_m3, emif, ocmcram, mpuss and l3_noc
for device tree data and drop the legacy platform data
- Update the interconnect instances to boot with genpd and
simple-pm-bus
- Drop the remaining platform data for am437x
* tag 'omap-for-v5.11/genpd-am437x-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Drop legacy remaining legacy platform data for am4
ARM: dts: Use simple-pm-bus for genpd for am4 l3
ARM: dts: Move am4 l3 noc to a separate node
ARM: dts: Use simple-pm-bus for genpd for am4 l4_per
ARM: dts: Use simple-pm-bus for genpd for am4 l4_fast
ARM: dts: Use simple-pm-bus for genpd for am4 l4_wkup
ARM: OMAP2+: Drop legacy platform data for am4 mpuss
ARM: OMAP2+: Drop legacy platform data for am4 ocmcram
ARM: OMAP2+: Drop legacy platform data for am4 emif
ARM: OMAP2+: Drop legacy platform data for am4 wkup_m3
ARM: dts: Configure interconnect target module for am4 wkup_m3
ARM: dts: Configure RTC powerdomain for am4
ARM: OMAP2+: Drop legacy platform data for am4 control module
ARM: dts: Configure also interconnect clocks for am4 system timer
ARM: dts: am43xx: add remaining PRM instances
soc: ti: omap-prm: am4: add genpd support for remaining PRM instances
clk: ti: am437x: Keep am4 l3 main clock always on for genpd
Link: https://lore.kernel.org/r/pull-1606806458-694517@atomide.com-3
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-genpd
Update am335x to boot without platform data
With the driver updates done for genpd support, we can now update
am335x dts files to boot with genpd and simple-pm-bus, and drop
the related platform data.
To do that, we need to do the following changes for am335x:
- Add the remaining power domain and reset controller instances
- Configure interconnect clocks for system timers as those are
now managed separately by the drivers/clocksource drivers
- Update control module, RTC, gpmc, debugss, emif, ocmcram,
instr, and mpuss for device tree data and drop the legacy
platform data
- Update the interconnect instances to boot with genpd and
simple-pm-bus
- Drop the remaining platform data for am335x
- Add kconfig option for OMAP_HWMOD to build it only for the
SoCs that need it
* tag 'omap-for-v5.11/genpd-am335x-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: Build hwmod related code as needed
ARM: OMAP2+: Drop legacy remaining legacy platform data for am3
ARM: dts: Use simple-pm-bus for genpd for am3 l3
ARM: dts: Use simple-pm-bus for genpd for am3 l4_per
ARM: dts: Use simple-pm-bus for genpd for am3 l4_fast
ARM: dts: Use simple-pm-bus for genpd for am3 l4_wkup
ARM: OMAP2+: Drop legacy platform data for am3 mpuss
ARM: OMAP2+: Drop legacy platform data for am3 instr
ARM: OMAP2+: Drop legacy platform data for am3 ocmcram
ARM: OMAP2+: Drop legacy platform data for am3 emif
ARM: OMAP2+: Drop legacy platform data for am3 debugss
ARM: OMAP2+: Drop legacy platform data for am3 and am4 gpmc
ARM: OMAP2+: Drop legacy platform data for am3 wkup_m3
ARM: dts: Configure interconnect target module for am3 wkup_m3
ARM: dts: Configure RTC powerdomain for am3
ARM: OMAP2+: Drop legacy platform data for am3 control module
ARM: dts: Configure also interconnect clocks for am4 system timer
ARM: dts: am33xx: add remaining PRM instances
Link: https://lore.kernel.org/r/pull-1606806458-694517@atomide.com-2
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/omap-genpd
Driver changes for omaps for genpd for v5.11 merge window
This series of changes allows booting am335x with genpd and
device tree data without the legacy platform data. Also at
least am437x can be booted with genpd with power domain and
dts data. The SoC specific dts changes will be a separate
pull request.
We need the following driver changes merged before the dts
changes can be done:
- platform code needs a few improvements to probe l4_wkup first
for clocks, and to bail out when there is no platform data
- ti-sysc driver needs a non-urgent fix for asserting rstctrl
reset only after disabling the clocks, to probe modules with
no known control registers, and added quirk handling for gpmc
devices
- omap-prm driver needs a non-urgent fix for reset status bit,
support added for pm_clk, and then we add the rest of am335x
power domain data
- clock driver for am335x needs to keep l3_main clock enabled
with genpd for suspend and resume to work
- wkup_m3 remoteproc driver needs support added for reset
control if available instead of the legacy pdata callbacks
- pm33xx driver needs PM runtime support added for genpd
The am335x specific driver changes for the clock, wkup_m3,
pm33xx and remoteproc drivers are quite trivial and have not
caused merge conflicts in Linux next. I did not get acks for
these changes except from Santosh, but had already pushed out
the branch at that point. So I've added the related
driver maintainers to Cc.
* tag 'omap-for-v5.11/genpd-drivers-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
remoteproc/wkup_m3: Use reset control driver if available
soc: ti: pm33xx: Enable basic PM runtime support for genpd
soc: ti: omap-prm: am3: add genpd support for remaining PRM instances
soc: ti: omap-prm: Add pm_clk for genpd
clk: ti: am33xx: Keep am3 l3 main clock always on for genpd
bus: ti-sysc: Implement GPMC debug quirk to drop platform data
bus: ti-sysc: Support modules without control registers
ARM: OMAP2+: Probe PRCM first to probe l4_wkup with simple-pm-bus
ARM: OMAP2+: Check for inited flag
bus: ti-sysc: Assert reset only after disabling clocks
soc: ti: omap-prm: Do not check rstst bit on deassert if already deasserted
bus: ti-sysc: Fix bogus resetdone warning on enable for cpsw
bus: ti-sysc: Fix reset status check for modules with quirks
ARM: OMAP2+: Fix missing select PM_GENERIC_DOMAINS_OF
ARM: OMAP2+: Fix location for select PM_GENERIC_DOMAINS
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Not resetting the SMT siblings might leave them in an unpredictable
state. One of the observed problems was that the CPU timer wasn't
reset and therefore large system time values were accounted during
CPU bringup.
Cc: <stable@kernel.org> # 4.0
Fixes: 10ad34bc76dfb ("s390: add SMT support")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Currently only idle_task_exit() explicitly switches (switch_mm) to
init_mm. This causes the kernel asce to be loaded into cr7 and
therefore it would be used for potential user space accesses.
This is currently not a problem since idle_task_exit() is nearly the last
thing a CPU executes before it is taken down. However, things might
change, so make sure that the invalid asce is always used
for cr7 when active_mm is init_mm.
This makes sure that all potential user space accesses will fail,
instead of accessing kernel address space.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
When a machine check interrupt is triggered during idle, the code
is using the async timer/clock for idle time calculation. It should use
the machine check enter timer/clock which is passed to the macro.
Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org> # 5.8
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
During removal of the critical section cleanup the calculation
of mt_cycles during idle was removed. This causes invalid
accounting on systems with SMT enabled.
Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org> # 5.8
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
More and more functionality from the early boot phase gets carried over
to the decompressor. With this the complexity of the code and thus the
chance to introduce bugs increases. In order to be able to debug these
early boot bugs, distributions have to package the decompressor's
vmlinux together with the other debuginfos. However, for that the
distributions require the vmlinux to contain a build-id.
By default the section containing the build-id is placed first in the
section table. So make sure to move it behind the .text section,
otherwise the image would be unbootable.
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
diag308 subcode 0 performs a clear reset, which includes the reset of
all registers in the system. While this is the preferred behavior when
loading a normal kernel via kexec, it prevents the crash kernel from storing
the register values in the dump. To prevent this, use subcode 1 when
loading a crash kernel instead.
Fixes: ee337f5469fd ("s390/kexec_file: Add crash support to image loader")
Cc: <stable@vger.kernel.org> # 4.17
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Reported-by: Xiaoying Yan <yiyan@redhat.com>
Tested-by: Lianbo Jiang <lijiang@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
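The subcode choice described above boils down to a single decision, sketched here as a userspace model. The enum and function names are hypothetical; only the numeric values 0 and 1 come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the values 0 and 1 are taken from the text above. */
enum model_subcode {
    MODEL_SUBCODE_CLEAR = 0,      /* clear reset: all registers are reset */
    MODEL_SUBCODE_KEEP_REGS = 1,  /* register values stay available for the dump */
};

/* Pick the reset subcode depending on which kind of kernel is being loaded. */
static enum model_subcode model_pick_subcode(bool loading_crash_kernel)
{
    return loading_crash_kernel ? MODEL_SUBCODE_KEEP_REGS : MODEL_SUBCODE_CLEAR;
}
```

A normal kexec load would call model_pick_subcode(false) and get 0, while the crash kernel path would call model_pick_subcode(true) and get 1.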
|
|
Use of the sch->dev reference after the put_device() call could trigger
a use-after-free bug.
Fix this by simply adjusting the position of put_device().
Fixes: 37db8985b211 ("s390/cio: add basic protected virtualization support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>
[vneethv@linux.ibm.com: Slight modification in the commit-message]
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
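The ordering rule behind this fix (finish using the object, then drop the reference) can be illustrated with a small reference-counting model. All names here are hypothetical, and the model only sets a flag where a real release callback would free the memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a device structure. */
struct model_dev {
    int refcount;
    bool released;  /* set when the last reference is dropped */
    int id;
};

/* Drop one reference; a real release callback would free the memory here. */
static void model_put(struct model_dev *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->released = true;
}

/* Read everything that is needed from the object first, then drop the ref. */
static int model_finish(struct model_dev *d)
{
    int id = d->id;  /* last use of the object */

    model_put(d);    /* nothing may touch *d after this if it was the last ref */
    return id;
}
```

The buggy ordering would call model_put() first and read d->id afterwards, which in real code reads freed memory once the count reaches zero.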
|
|
The stm32 timers example name should match the pattern timer@. Also,
the example is based on stm32mp1 timer 2, so the identifier should be
'1' instead of '0' (e.g. timer 1).
Fixes: bfbcbf88f9db ("dt-bindings: timer: Convert stm32 timer bindings to json-schema")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com>
Link: https://lore.kernel.org/r/1606913114-25693-1-git-send-email-fabrice.gasnier@foss.st.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The vrf_add_mac_header_if_unset() is defined within a conditional
compilation block which depends on the CONFIG_IPV6 macro.
However, vrf_add_mac_header_if_unset() also needs to be called by IPv4
related code, and when CONFIG_IPV6 is not set, this function is missing.
As a consequence, the build process stops, reporting the error:
ERROR: implicit declaration of function 'vrf_add_mac_header_if_unset'
The problem is solved by *only* moving functions
vrf_add_mac_header_if_unset() and vrf_prepare_mac_header() out of the
conditional block.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 0489390882202 ("vrf: add mac header for tunneled packets when sniffer is attached")
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20201208175210.8906-1-andrea.mayer@uniroma2.it
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
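The build failure and its fix follow a common preprocessor pattern, sketched below with hypothetical function names and with CONFIG_IPV6 deliberately left undefined. The shared helper sits outside the conditional block, so the IPv4 path still compiles.

```c
#include <assert.h>

/* CONFIG_IPV6 is intentionally not defined: this models an IPv4-only build. */

/* Shared helper, kept outside the conditional block so both paths can use it.
 * Returns 1 if a header had to be added, 0 if one was already present. */
static int add_mac_header_if_unset(int has_mac_header)
{
    assert(has_mac_header == 0 || has_mac_header == 1);
    return has_mac_header ? 0 : 1;
}

#ifdef CONFIG_IPV6
/* IPv6-only code stays inside the conditional block. */
static int ip6_out(int has_mac_header)
{
    return add_mac_header_if_unset(has_mac_header);
}
#endif

/* The IPv4 path is always built and needs the helper as well. */
static int ip4_out(int has_mac_header)
{
    return add_mac_header_if_unset(has_mac_header);
}
```

Moving add_mac_header_if_unset() inside the #ifdef block would reproduce the implicit-declaration error for ip4_out() in this configuration.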
|
|
If cm_create_timewait_info() fails, the timewait_info pointer will contain
an error value and will be used in cm_remove_remote() later.
general protection fault, probably for non-canonical address 0xdffffc0000000024: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127]
CPU: 2 PID: 12446 Comm: syz-executor.3 Not tainted 5.10.0-rc5-5d4c0742a60e #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:cm_remove_remote.isra.0+0x24/0x170 drivers/infiniband/core/cm.c:978
Code: 84 00 00 00 00 00 41 54 55 53 48 89 fb 48 8d ab 2d 01 00 00 e8 7d bf 4b fe 48 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 48 89 ea 83 e2 07 38 d0 7f 08 84 c0 0f 85 fc 00 00 00
RSP: 0018:ffff888013127918 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: ffffc9000a18b000
RDX: 0000000000000024 RSI: ffffffff82edc573 RDI: fffffffffffffff4
RBP: 0000000000000121 R08: 0000000000000001 R09: ffffed1002624f1d
R10: 0000000000000003 R11: ffffed1002624f1c R12: ffff888107760c70
R13: ffff888107760c40 R14: fffffffffffffff4 R15: ffff888107760c9c
FS: 00007fe1ffcc1700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2ff21000 CR3: 000000010f504001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
cm_destroy_id+0x189/0x15b0 drivers/infiniband/core/cm.c:1155
cma_connect_ib drivers/infiniband/core/cma.c:4029 [inline]
rdma_connect_locked+0x1100/0x17c0 drivers/infiniband/core/cma.c:4107
rdma_connect+0x2a/0x40 drivers/infiniband/core/cma.c:4140
ucma_connect+0x277/0x340 drivers/infiniband/core/ucma.c:1069
ucma_write+0x236/0x2f0 drivers/infiniband/core/ucma.c:1724
vfs_write+0x220/0x830 fs/read_write.c:603
ksys_write+0x1df/0x240 fs/read_write.c:658
do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20201204064205.145795-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reported-by: Amit Matityahu <mitm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
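The failure mode above (an error value stored in a pointer and later dereferenced during teardown) can be sketched in userspace. The model_* helpers below are hypothetical re-implementations of the error-pointer idiom, and the shape of the fix (check the result, reset the field to NULL, return the error) is an assumption about how such a bug is typically handled, not a copy of the actual patch. Note that the fffffffffffffff4 value in RBX and RDI of the trace is -12, which is -ENOMEM on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, and decode it again. */
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static long model_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* True if the pointer value lies in the top MODEL_MAX_ERRNO addresses. */
static int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_conn { void *timewait_info; };

/* Hypothetical allocator that can be told to fail with -ENOMEM. */
static void *model_create_timewait(int fail)
{
    static int slot;
    return fail ? model_err_ptr(-ENOMEM) : (void *)&slot;
}

static int model_setup(struct model_conn *c, int fail)
{
    assert(c != NULL);
    c->timewait_info = model_create_timewait(fail);
    if (model_is_err(c->timewait_info)) {
        int ret = (int)model_ptr_err(c->timewait_info);

        c->timewait_info = NULL;  /* teardown must never see the encoded error */
        return ret;
    }
    return 0;
}

/* Teardown only acts on a real pointer; returns 1 if it cleaned something up. */
static int model_teardown(struct model_conn *c)
{
    if (!c->timewait_info)
        return 0;
    c->timewait_info = NULL;
    return 1;
}
```

Without the reset to NULL in model_setup(), model_teardown() would treat the encoded error as a valid pointer, which corresponds to the general protection fault shown in the trace.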
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/drivers
Samsung SoC drivers changes for v5.11, part two
1. Mark PM functions of newly added clkout module as unused to silence
!CONFIG_PM warnings.
2. Initialize ChipID driver later - in arch initcall.
* tag 'samsung-drivers-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
clk: samsung: mark PM functions as __maybe_unused
soc: samsung: exynos-chipid: initialize later - with arch_initcall
soc: samsung: exynos-chipid: order list of SoCs by name
Link: https://lore.kernel.org/r/20201207074528.4475-1-krzk@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
into arm/drivers
arm64: soc: ZynqMP SoC changes for v5.11 v2
- Small alignments in Xilinx Firmware driver
- Exposing syscon interface for VCU driver
* tag 'zynqmp-soc-for-v5.11-v2' of https://github.com/Xilinx/linux-xlnx:
firmware: xilinx: Properly align function parameter
firmware: xilinx: Add a blank line after function declaration
firmware: xilinx: Remove additional newline
firmware: xilinx: Fix kernel-doc warnings
firmware: xlnx-zynqmp: fix compilation warning
soc: xilinx: vcu: add missing register NUM_CORE
soc: xilinx: vcu: use vcu-settings syscon registers
dt-bindings: soc: xlnx: extract xlnx, vcu-settings to separate binding
soc: xilinx: vcu: drop useless success message
Link: https://lore.kernel.org/r/71d38756-4456-29fc-26a3-341e1d09aafe@monstr.eu
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
In commit a2d375eda771 ("dyndbg: refine export, rename to
dynamic_debug_exec_queries()"), a string is copied before checking it
isn't NULL. Fix this, report a usage/interface error, and return the
proper error code.
Fixes: a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()")
Cc: stable@vger.kernel.org
--
-v2 drop comment tweak, improve commit message
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20201209183625.2432329-1-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
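The check-before-copy ordering described here can be sketched as a userspace function. model_exec_queries() is a hypothetical stand-in, not the kernel function, and returning the copied length on success is an assumption made only to keep the example testable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an exported "run this query string" function.
 * Returns the length of the processed query, or a negative errno value. */
static int model_exec_queries(const char *query)
{
    char *copy;
    size_t len;

    if (!query)
        return -EINVAL;  /* reject NULL before any copy is attempted */

    len = strlen(query);
    copy = malloc(len + 1);
    if (!copy)
        return -ENOMEM;
    memcpy(copy, query, len + 1);  /* private, modifiable copy for parsing */

    /* ... a real implementation would parse and apply the copy here ... */

    free(copy);
    return (int)len;
}
```

Copying first and checking afterwards would dereference the NULL pointer inside the copy itself, so the check has to come before the first use of the argument.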
|
|
Checking !list_empty(&ctx->cq_overflow_list) around noflush in
io_cqring_events() is racy, because if it fails but a request overflowed
just after that, io_cqring_overflow_flush() will still be called.
Remove the second check. It shouldn't be a problem for performance,
because there is a cq_check_overflow bit check just above.
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
It's not safe to call io_cqring_overflow_flush() for IOPOLL mode without
holding uring_lock, because it does synchronisation differently. Make
sure we have it.
As for io_ring_exit_work(), we don't even need it there because
io_ring_ctx_wait_and_kill() already sets the force flag, causing all overflowed
requests to be dropped.
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
IOPOLL allows buffer remove/provide requests, but they don't
synchronise by the rules of IOPOLL, namely they have to hold uring_lock.
Cc: <stable@vger.kernel.org> # 5.7+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Abaci Fuzz reported a double-free or invalid-free BUG in io_commit_cqring():
[ 95.504842] BUG: KASAN: double-free or invalid-free in io_commit_cqring+0x3ec/0x8e0
[ 95.505921]
[ 95.506225] CPU: 0 PID: 4037 Comm: io_wqe_worker-0 Tainted: G B W 5.10.0-rc5+ #1
[ 95.507434] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 95.508248] Call Trace:
[ 95.508683] dump_stack+0x107/0x163
[ 95.509323] ? io_commit_cqring+0x3ec/0x8e0
[ 95.509982] print_address_description.constprop.0+0x3e/0x60
[ 95.510814] ? vprintk_func+0x98/0x140
[ 95.511399] ? io_commit_cqring+0x3ec/0x8e0
[ 95.512036] ? io_commit_cqring+0x3ec/0x8e0
[ 95.512733] kasan_report_invalid_free+0x51/0x80
[ 95.513431] ? io_commit_cqring+0x3ec/0x8e0
[ 95.514047] __kasan_slab_free+0x141/0x160
[ 95.514699] kfree+0xd1/0x390
[ 95.515182] io_commit_cqring+0x3ec/0x8e0
[ 95.515799] __io_req_complete.part.0+0x64/0x90
[ 95.516483] io_wq_submit_work+0x1fa/0x260
[ 95.517117] io_worker_handle_work+0xeac/0x1c00
[ 95.517828] io_wqe_worker+0xc94/0x11a0
[ 95.518438] ? io_worker_handle_work+0x1c00/0x1c00
[ 95.519151] ? __kthread_parkme+0x11d/0x1d0
[ 95.519806] ? io_worker_handle_work+0x1c00/0x1c00
[ 95.520512] ? io_worker_handle_work+0x1c00/0x1c00
[ 95.521211] kthread+0x396/0x470
[ 95.521727] ? _raw_spin_unlock_irq+0x24/0x30
[ 95.522380] ? kthread_mod_delayed_work+0x180/0x180
[ 95.523108] ret_from_fork+0x22/0x30
[ 95.523684]
[ 95.523985] Allocated by task 4035:
[ 95.524543] kasan_save_stack+0x1b/0x40
[ 95.525136] __kasan_kmalloc.constprop.0+0xc2/0xd0
[ 95.525882] kmem_cache_alloc_trace+0x17b/0x310
[ 95.533930] io_queue_sqe+0x225/0xcb0
[ 95.534505] io_submit_sqes+0x1768/0x25f0
[ 95.535164] __x64_sys_io_uring_enter+0x89e/0xd10
[ 95.535900] do_syscall_64+0x33/0x40
[ 95.536465] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 95.537199]
[ 95.537505] Freed by task 4035:
[ 95.538003] kasan_save_stack+0x1b/0x40
[ 95.538599] kasan_set_track+0x1c/0x30
[ 95.539177] kasan_set_free_info+0x1b/0x30
[ 95.539798] __kasan_slab_free+0x112/0x160
[ 95.540427] kfree+0xd1/0x390
[ 95.540910] io_commit_cqring+0x3ec/0x8e0
[ 95.541516] io_iopoll_complete+0x914/0x1390
[ 95.542150] io_do_iopoll+0x580/0x700
[ 95.542724] io_iopoll_try_reap_events.part.0+0x108/0x200
[ 95.543512] io_ring_ctx_wait_and_kill+0x118/0x340
[ 95.544206] io_uring_release+0x43/0x50
[ 95.544791] __fput+0x28d/0x940
[ 95.545291] task_work_run+0xea/0x1b0
[ 95.545873] do_exit+0xb6a/0x2c60
[ 95.546400] do_group_exit+0x12a/0x320
[ 95.546967] __x64_sys_exit_group+0x3f/0x50
[ 95.547605] do_syscall_64+0x33/0x40
[ 95.548155] entry_SYSCALL_64_after_hwframe+0x44/0xa9
The reason is that once we get a non-EAGAIN error in io_wq_submit_work(),
we'll complete the req by calling io_req_complete(), which holds completion_lock
to call io_commit_cqring(). For polled io, however, io_iopoll_complete() doesn't
hold completion_lock to call io_commit_cqring(), so there may be concurrent
access to ctx->defer_list and a double free may happen.
To fix this bug, we always let io_iopoll_complete() complete polled io.
Cc: <stable@vger.kernel.org> # 5.5+
Reported-by: Abaci Fuzz <abaci@linux.alibaba.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Support timeout updates through IORING_OP_TIMEOUT_REMOVE with passed in
IORING_TIMEOUT_UPDATE. Updates don't support offset timeout mode.
The original timeout.off will be ignored as well.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: remove now unused 'ret' variable]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add io_timeout_extract() helper, which searches and disarms timeouts,
but doesn't complete them. No functional changes.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring_cancel_files()'s task check condition mistakenly got flipped.
1. There can't be a request in the inflight list without
IO_WQ_WORK_FILES, kill this check to keep the whole condition simpler.
2. Also, don't call the function for files==NULL to not do such a check,
all that stuff is already handled well by its counterpart,
__io_uring_cancel_task_requests().
With that just flip the task check.
Also, it iowq-cancels all requests of the current task there, don't forget to
set the right ->files into struct io_task_cancel.
Fixes: c1973b38bf639 ("io_uring: cancel only requests of current task")
Reported-by: syzbot+c0d52d0b3c0c3ffb9525@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_file_data_ref_zero() can be invoked from soft-irq from the RCU core,
hence we need to ensure that the file_data lock is bottom half safe. Use
the _bh() variants when grabbing this lock.
Reported-by: syzbot+1f4ba1e5520762c523c6@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_req_init() doesn't decrement state->ios_left if a request doesn't
need ->file, it just returns before that on if(!needs_file). That's
not really a problem but may cause overhead for an additional fput().
Also inline and kill io_req_set_file() as it's of no use anymore.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Keep submit state invariant of whether there are file refs left based on
state->nr_refs instead of (state->file==NULL), and always check against
the first one. It's easier to track and allows to remove 1 if. It also
automatically leaves struct submit_state in a consistent state after
io_submit_state_end(), that's not used yet but nice.
btw rename has_refs to file_refs for more clarity.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
syzbot reports the following issue:
INFO: task syz-executor.2:12399 can't die for more than 143 seconds.
task:syz-executor.2 state:D stack:28744 pid:12399 ppid: 8504 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:3773 [inline]
__schedule+0x893/0x2170 kernel/sched/core.c:4522
schedule+0xcf/0x270 kernel/sched/core.c:4600
schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1847
do_wait_for_common kernel/sched/completion.c:85 [inline]
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
kthread_stop+0x17a/0x720 kernel/kthread.c:596
io_put_sq_data fs/io_uring.c:7193 [inline]
io_sq_thread_stop+0x452/0x570 fs/io_uring.c:7290
io_finish_async fs/io_uring.c:7297 [inline]
io_sq_offload_create fs/io_uring.c:8015 [inline]
io_uring_create fs/io_uring.c:9433 [inline]
io_uring_setup+0x19b7/0x3730 fs/io_uring.c:9507
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45deb9
Code: Unable to access opcode bytes at RIP 0x45de8f.
RSP: 002b:00007f174e51ac78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
RAX: ffffffffffffffda RBX: 0000000000008640 RCX: 000000000045deb9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 00000000000050e5
RBP: 000000000118bf58 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
R13: 00007ffed9ca723f R14: 00007f174e51b9c0 R15: 000000000118bf2c
INFO: task syz-executor.2:12399 blocked for more than 143 seconds.
Not tainted 5.10.0-rc3-next-20201110-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
We don't have a reproducer yet, but it seems that there is a
race in the current code:
=> io_put_sq_data
ctx_list is empty now. |
==> kthread_park(sqd->thread); |
| T1: sq thread is parked now.
==> kthread_stop(sqd->thread); |
KTHREAD_SHOULD_STOP is set now.|
===> kthread_unpark(k); |
| T2: sq thread is now unparked, run again.
|
| T3: sq thread is now preempted out.
|
===> wake_up_process(k); |
|
| T4: Since sqd ctx_list is empty, needs_sched will be true,
| then sq thread sets task state to TASK_INTERRUPTIBLE,
| and schedule, now sq thread will never be woken up.
===> wait_for_completion |
I have artificially used mdelay() to simulate the above race and get the same
stack as in this syzbot report, but to be honest, I'm not sure this
race is what triggers the syzbot report.
To fix this possible race, when the sq thread is unparked, we need to check
whether the sq thread has been stopped.
Reported-by: syzbot+03beeb595f074db9cfd1@syzkaller.appspotmail.com
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Double fixed files for splice/tee are done in a nasty way: it takes 2
ref_node refs, and during the second time it blindly overrides
req->fixed_file_refs hoping that it hasn't changed. That works because
all that is done under iouring_lock in a single go but is error-prone.
Bind everything explicitly to a single ref_node and take only one ref,
with current ref_node ordering it's guaranteed to keep all files valid
while the request is inflight.
That's mainly a cleanup + preparation for generic resource handling,
but also saves pcpu_ref get/put for splice/tee with 2 fixed files.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As tasks now cancel only their own requests, and inflight_wait is awaited
only in io_uring_cancel_files(), which should be called with ->in_idle
set, instead of keeping a separate inflight_wait use tctx->wait.
That will add some spurious wakeups but is actually safer with respect to
not hanging the task.
e.g.
task1 | IRQ
| *start* io_complete_rw_common(link)
| link: req1 -> req2 -> req3(with files)
*cancel_files() |
io_wq_cancel(), etc. |
| put_req(link), adds to io-wq req2
schedule() |
So, task1 will never try to cancel req2 or req3. If req2 is
long-standing (e.g. read(empty_pipe)), this may hang.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We don't even allow msg_control that is not plain data, which is disallowed in
__sys_{send,recv}msg_sock(). So there is no need for fs for IORING_OP_SENDMSG and
IORING_OP_RECVMSG. fs->lock is not as contended as before, but
there are cases where it can be, e.g. IOSQE_ASYNC.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If IORING_SETUP_SQPOLL is enabled, sqes are either handled in sq thread
task context or in io worker task context. If the current task context is the sq
thread, we don't need to check whether we should wake up the sq thread.
io_iopoll_req_issued() calls wq_has_sleeper(), which has an smp_mb() memory
barrier. Before this patch, perf shows obvious overhead:
Samples: 481K of event 'cycles', Event count (approx.): 299807382878
Overhead Comma Shared Object Symbol
3.69% :9630 [kernel.vmlinux] [k] io_issue_sqe
With this patch, perf shows:
Samples: 482K of event 'cycles', Event count (approx.): 299929547283
Overhead Comma Shared Object Symbol
0.70% :4015 [kernel.vmlinux] [k] io_issue_sqe
It shows some obvious improvements.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
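The check described in this commit can be illustrated with a minimal user-space sketch. Everything here (struct sketch_ctx, its fields, sketch_need_wake_check()) is hypothetical and only models the decision; it is not the kernel code, which works with task contexts and wq_has_sleeper().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-ring context; not a kernel struct. */
struct sketch_ctx {
	bool sqpoll;      /* ring was set up with IORING_SETUP_SQPOLL */
	int sq_thread_id; /* id of the sq poll thread (assumed field) */
};

/*
 * Return true if the caller still has to pay for the "is the sq thread
 * sleeping?" check (the one that implies a memory barrier).
 */
static bool sketch_need_wake_check(const struct sketch_ctx *ctx, int current_id)
{
	assert(ctx != NULL);

	if (!ctx->sqpoll)
		return false;	/* no sq thread to wake at all */
	if (current_id == ctx->sq_thread_id)
		return false;	/* the sq thread is running, so it is not asleep */
	return true;		/* e.g. an io worker: must check */
}
```

With a ctx whose sq_thread_id is 7, a call made with current_id 7 skips the check, while a call from any other context (say 8) still performs it.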
|
|
Both IOPOLL and sqe handling need to acquire uring_lock. Combine
them so that we only need to acquire uring_lock once.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
A static checker reports the warning below:
fs/io_uring.c:6939 io_sq_thread()
error: uninitialized symbol 'timeout'.
This is a false positive, but let's just initialize 'timeout' to make
sure we don't trip over this.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are some issues with the current io_sq_thread() implementation:
1. The prepare_to_wait() usage in __io_sq_thread() is weird. If
multiple ctxs share the same poll thread, one ctx will put the poll thread
in TASK_INTERRUPTIBLE, but if other ctxs have work to do, we don't
need to change the task's state at all. I think we should do it only if
none of the ctxs has work to do.
2. We use a round-robin strategy to make multiple ctxs share the same
poll thread, but there are various conditions in __io_sq_thread(), which
seem complicated and may affect the round-robin strategy.
To address the above issues, I take the actions below:
1. If multiple ctxs share the same poll thread, call prepare_to_wait()
and schedule() to put the poll thread to sleep only if none of the ctxs
has work to do.
2. To make the round-robin strategy more straightforward, I simplify
__io_sq_thread() a bit: it just does io poll and sqe submission work once
and does not check various conditions.
3. When multiple ctxs share the same poll thread, we choose the biggest
sq_thread_idle among these ctxs as the timeout condition, and update
it when a ctx is added or removed.
4. There is no need to check EBUSY specially: if io_submit_sqes() returns
EBUSY, IORING_SQ_CQ_OVERFLOW should be set, and the helper in liburing
should be aware of the cq overflow and enter the kernel to flush work.
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
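Actions 1 and 3 above can be sketched in plain C. This is a simplified user-space model under stated assumptions: struct sketch_ctx and both helpers are hypothetical names, and the real code walks a ctx list inside the kernel rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-ring state; only the fields the sketch needs. */
struct sketch_ctx {
	unsigned int sq_thread_idle; /* idle timeout requested by this ring */
	bool has_work;               /* ring has sqes or polled IO pending */
};

/* Action 3: the shared poll thread uses the biggest idle value. */
static unsigned int sketch_shared_idle(const struct sketch_ctx *ctxs, size_t n)
{
	unsigned int idle = 0;

	assert(ctxs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (ctxs[i].sq_thread_idle > idle)
			idle = ctxs[i].sq_thread_idle;
	return idle;
}

/* Action 1: the poll thread may sleep only if no ring has work. */
static bool sketch_may_sleep(const struct sketch_ctx *ctxs, size_t n)
{
	assert(ctxs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (ctxs[i].has_work)
			return false;
	return true;
}
```

The shared idle value would be recomputed whenever a ring joins or leaves the poll thread, which is why the sketch takes the full set of rings as input.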
|
|
Instead of iterating over each request and cancelling it individually in
io_uring_cancel_files(), try to cancel all matching requests and use
->inflight_list only to check if there is anything left.
In many cases it should be faster, and we can reuse a lot of code from
task cancellation.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Make io_poll_remove_all() and io_kill_timeouts() match against files
as well. A preparation patch, effectively unused for now.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring_cancel_files() guarantees to cancel all matching requests,
so it's not necessary to do that in a loop. Move it up in the callchain
into io_uring_cancel_task_requests().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_uring_cancel_files() cancels all requests that match files regardless
of task. There is no real need for that; cancel only requests of the
specified task. That also handles the SQPOLL case as it already changes
task to it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add io_match_task() that matches both task and files.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
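A matcher of this kind can be sketched as follows. This is only an illustration under assumptions: struct sketch_req, its fields, and the convention that a NULL argument means "match any" are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request: remembers the submitting task and its files. */
struct sketch_req {
	const void *task;
	const void *files;
};

/*
 * Match a request against a task and a files table.
 * Assumption of this sketch: NULL for either argument means "any".
 */
static bool sketch_match_task(const struct sketch_req *req,
			      const void *task, const void *files)
{
	assert(req != NULL);

	if (task && req->task != task)
		return false;
	if (files && req->files != files)
		return false;
	return true;
}
```

Cancellation helpers can then share one predicate instead of open-coding separate task and files comparisons.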
|
|
If IORING_SETUP_SQPOLL is set all requests belong to the corresponding
SQPOLL task, so skip task checking in that case and always match.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Inline io_import_iovec() and leave only its former __io_import_iovec()
renamed to the original name. That makes it more obvious what is reused in
io_read/write().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_size and iov_count in io_read() and io_write() hold the same value,
kill the latter.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This is the only code that relies on import_iovec() returning
iter.count on success.
This allows a better interface to import_iovec().
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
SQPOLL task may find sqo_task->files == NULL and
__io_sq_thread_acquire_files() would leave it unset, so following
fget_many() and others try to dereference NULL and fault. Propagate
an error if files are missing.
[ 118.962785] BUG: kernel NULL pointer dereference, address:
0000000000000020
[ 118.963812] #PF: supervisor read access in kernel mode
[ 118.964534] #PF: error_code(0x0000) - not-present page
[ 118.969029] RIP: 0010:__fget_files+0xb/0x80
[ 119.005409] Call Trace:
[ 119.005651] fget_many+0x2b/0x30
[ 119.005964] io_file_get+0xcf/0x180
[ 119.006315] io_submit_sqes+0x3a4/0x950
[ 119.007481] io_sq_thread+0x1de/0x6a0
[ 119.007828] kthread+0x114/0x150
[ 119.008963] ret_from_fork+0x22/0x30
Reported-by: Josef Grieb <josef.grieb@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, users who want to be woken up when waiting for events must
submit a timeout command first. That is not safe for applications that
split SQ and CQ handling between two threads, such as mysql. Users have to
synchronize the two threads explicitly to protect the SQ, and that hurts
performance.
This patch adds timeout support to the existing io_uring_enter(). To
avoid overloading arguments, it introduces a new parameter structure
which contains sigmask and timeout.
I have tested the workloads with one thread submitting nop requests
while the other reaps cqes with a timeout. It is 1.8~2x faster
when the iodepth is 16.
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
[axboe: various cleanups/fixes, and name change to SIG_IS_DATA]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
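A parameter structure that bundles sigmask and timeout might look like the sketch below. The struct name, field names, and layout are assumptions made for illustration; the commit message only states that the structure contains sigmask and timeout, so consult the uapi header for the real definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical argument block passed by pointer instead of a bare sigmask.
 * User pointers are carried as 64-bit integers so the layout is the same
 * for 32-bit and 64-bit callers (an assumption of this sketch).
 */
struct sketch_getevents_arg {
	uint64_t sigmask;    /* user pointer to the signal mask, 0 if none */
	uint32_t sigmask_sz; /* size of the signal mask in bytes */
	uint32_t pad;        /* keeps the following field 8-byte aligned */
	uint64_t ts;         /* user pointer to a timespec, 0 for no timeout */
};

/* Fill the block from ordinary pointers. */
static struct sketch_getevents_arg
sketch_make_arg(const void *sigmask, uint32_t sigmask_sz, const void *ts)
{
	struct sketch_getevents_arg arg = { 0 };

	assert(sigmask != NULL || sigmask_sz == 0);
	arg.sigmask = (uint64_t)(uintptr_t)sigmask;
	arg.sigmask_sz = sigmask_sz;
	arg.ts = (uint64_t)(uintptr_t)ts;
	return arg;
}
```

Passing one structure keeps the syscall argument count unchanged, and the waiting thread no longer has to touch the SQ to arm a timeout.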
|
|
We unconditionally call blk_start_plug() when starting the IO
submission, but we only really should do that if we have more than 1
request to submit AND we're potentially dealing with block based storage
underneath. For any other type of request, it's just a waste of time to
do so.
Add a ->plug bit to io_op_def and set it for read/write requests. We
could make this more precise and check the file itself as well, but it
doesn't matter that much and would quickly become more expensive.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
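The plugging decision described above can be modelled with a small table-driven sketch. The opcode names, struct sketch_op_def, and sketch_should_plug() are hypothetical; only the idea of a per-opcode plug bit combined with a "more than 1 request" test comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcodes for the sketch. */
enum {
	SKETCH_OP_NOP,
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_LAST,
};

/* Per-opcode properties; plug marks ops that may hit block storage. */
struct sketch_op_def {
	unsigned int plug : 1;
};

static const struct sketch_op_def sketch_op_defs[SKETCH_OP_LAST] = {
	[SKETCH_OP_READ]  = { .plug = 1 },
	[SKETCH_OP_WRITE] = { .plug = 1 },
};

/* Plug only for batches of more than one request of a plug-worthy op. */
static bool sketch_should_plug(unsigned int nr_to_submit, int opcode)
{
	assert(opcode >= 0 && opcode < SKETCH_OP_LAST);
	return nr_to_submit > 1 && sketch_op_defs[opcode].plug;
}
```

The table lookup is cheap, which fits the commit's point that inspecting the file itself would be more precise but also more expensive.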
|