Age | Commit message (Collapse) | Author |
|
* clk-imx6ul:
clk: imx6ul: fix periph clk2 clock mux selection
|
|
'clk-davinci' and 'clk-meson' into clk-next
* clk-davinci-psc-da830:
clk: davinci: psc-da830: fix USB0 48MHz PHY clock registration
* clk-renesas:
clk: renesas: cpg-mssr: Add support for R-Car E3
clk: renesas: Add r8a77990 CPG Core Clock Definitions
clk: renesas: rcar-gen2: Centralize quirks handling
clk: renesas: r8a77980: Correct parent clock of PCIEC0
clk: renesas: r8a7794: Fix LB clock divider
clk: renesas: r8a7792: Fix LB clock divider
clk: renesas: r8a7791/r8a7793: Fix LB clock divider
clk: renesas: r8a7745: Fix LB clock divider
clk: renesas: r8a7743: Fix LB clock divider
clk: renesas: cpg-mssr: Add r8a77470 support
clk: renesas: Add r8a77470 CPG Core Clock Definitions
clk: renesas: r8a77965: Add MSIOF controller clocks
* clk-at91-recalc:
clk: at91: PLL recalc_rate() now using cached MUL and DIV values
* clk-davinci:
clk: davinci: Fix link errors when not all SoCs are enabled
clk: davinci: psc: allow for dev == NULL
clk: davinci: da850-pll: change PLL0 to CLK_OF_DECLARE
clk: davinci: pll: allow dev == NULL
clk: davinci: psc-dm365: fix few clocks
clk: davinci: pll-dm646x: keep PLL2 SYSCLK1 always enabled
clk: davinci: psc-dm355: fix ASP0/1 clkdev lookups
clk: davinci: pll-dm355: fix SYSCLKn parent names
clk: davinci: pll-dm355: drop pll2_sysclk2
* clk-meson:
clk: meson: axg: let mpll clocks round closest
clk: meson: mpll: add round closest support
clk: meson: meson8b: mark fclk_div2 gate clocks as CLK_IS_CRITICAL
clk: meson: use SPDX license identifiers consistently
clk: meson: drop CLK_SET_RATE_PARENT flag
clk: meson-axg: Add AO Clock and Reset controller driver
clk: meson: aoclk: refactor common code into dedicated file
clk: meson: migrate to devm_of_clk_add_hw_provider API
clk: meson: gxbb: add the video decoder clocks
clk: meson: meson8b: add support for the NAND clocks
dt-bindings: clock: reset: Add AXG AO Clock and Reset Bindings
dt-bindings: clock: axg-aoclkc: New binding for Meson-AXG SoC
clk: meson: gxbb: expose VDEC_1 and VDEC_HEVC clocks
dt-bindings: clock: meson8b: export the NAND clock
|
|
* clk-qcom-8996-halt:
clk: qcom: gcc-msm8996: Disable halt check on UFS clocks
clk: msm8996-gcc: Mark halt check as no-op for USB/PCIE pipe_clk
|
|
All the managed resources would be freed by the time release function
is invoked. Handling such memory in qcom_smd_edge_release() would do
bad things.
Found this issue while testing Audio usecase where the dsp is started up
and shutdown in a loop.
This patch fixes this issue by using simple kzalloc for allocating
channel->name and channel which is then freed in qcom_smd_edge_release().
Without this patch restarting a remoteproc would crash the system.
Fixes: 53e2822e56c7 ("rpmsg: Introduce Qualcomm SMD backend")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
* clk-qcom-sdm845:
clk: qcom: Export clk_fabia_pll_configure()
clk: qcom: Add video clock controller driver for SDM845
dt-bindings: clock: Introduce QCOM Video clock bindings
clk: qcom: Add Global Clock controller (GCC) driver for SDM845
clk: qcom: Add DT bindings for SDM845 gcc clock controller
clk: qcom: Configure the RCGs to a safe source as needed
clk: qcom: Add support for BRANCH_HALT_SKIP flag for branch clocks
clk: qcom: Simplify gdsc status checking logic
clk: qcom: gdsc: Add support to poll CFG register to check GDSC state
clk: qcom: gdsc: Add support to poll for higher timeout value
clk: qcom: gdsc: Add support to reset AON and block reset logic
clk: qcom: Add support for controlling Fabia PLL
clk: qcom: Clear hardware clock control bit of RCG
Also fixup the Kconfig mess where SDM845 GCC has msm8998 in the
description and also the video Kconfig says things slightly differently
from the GCC one so just make it the same.
|
|
Pull documentation updates from Jonathan Corbet:
"There's been a fair amount of work in the docs tree this time around,
including:
- Extensive RST conversions and organizational work in the
memory-management docs thanks to Mike Rapoport.
- An update of Documentation/features from Andrea Parri and a script
to keep it updated.
- Various LICENSES updates from Thomas, along with a script to check
SPDX tags.
- Work to fix dangling references to documentation files; this
involved a fair number of one-liner comment changes outside of
Documentation/
... and the usual list of documentation improvements, typo fixes, etc"
* tag 'docs-4.18' of git://git.lwn.net/linux: (103 commits)
Documentation: document hung_task_panic kernel parameter
docs/admin-guide/mm: add high level concepts overview
docs/vm: move ksm and transhuge from "user" to "internals" section.
docs: Use the kerneldoc comments for memalloc_no*()
doc: document scope NOFS, NOIO APIs
docs: update kernel versions and dates in tables
docs/vm: transhuge: split userspace bits to admin-guide/mm/transhuge
docs/vm: transhuge: minor updates
docs/vm: transhuge: change sections order
Documentation: arm: clean up Marvell Berlin family info
Documentation: gpio: driver: Fix a typo and some odd grammar
docs: ranoops.rst: fix location of ramoops.txt
scripts/documentation-file-ref-check: rewrite it in perl with auto-fix mode
docs: uio-howto.rst: use a code block to solve a warning
mm, THP, doc: Add document for thp_swpout/thp_swpout_fallback
w1: w1_io.c: fix a kernel-doc warning
Documentation/process/posting: wrap text at 80 cols
docs: admin-guide: add cgroup-v2 documentation
Revert "Documentation/features/vm: Remove arch support status file for 'pte_special'"
Documentation: refcount-vs-atomic: Update reference to LKMM doc.
...
|
|
Improve the safety of the code and ensure the array cannot be indexed
out of bounds when picking the CPU for a given SDMA engine.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
The following code fails to allocate a buffer for the
tail address that the hardware DMAs into when the user
context DMA_RTAIL is set.
if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) {
rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent(
&dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail,
gfp_flags);
if (!rcd->rcvhdrtail_kvaddr)
goto bail_free;
rcd->rcvhdrqtailaddr_dma = dma_hdrqtail;
}
So the rcvhdrtail_kvaddr would then be NULL.
The mmap logic fails to check for a NULL rcvhdrtail_kvaddr.
The fix is to test for both user and kernel DMA_TAIL options
during the allocation as well as testing for a NULL
rcvhdrtail_kvaddr during the mmap processing.
Additionally, all downstream testing of the capmask for DMA_RTAIL
have been eliminated in favor of testing rcvhdrtail_kvaddr.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
'clk-bcm-stingray' into clk-next
* clk-match-string:
clk: use match_string() helper
clk: bcm2835: use match_string() helper
* clk-ingenic:
clk: ingenic: jz4770: Add 150us delay after enabling VPU clock
clk: ingenic: jz4770: Enable power of AHB1 bus after ungating VPU clock
clk: ingenic: jz4770: Modify C1CLK clock to disable CPU clock stop on idle
clk: ingenic: jz4770: Change OTG from custom to standard gated clock
clk: ingenic: Support specifying "wait for clock stable" delay
clk: ingenic: Add support for clocks whose gate bit is inverted
* clk-si544-round-fix:
clk-si544: Properly round requested frequency to nearest match
* clk-bcm-stingray:
clk: bcm: Update and add Stingray clock entries
dt-bindings: clk: Update Stingray binding doc
|
|
and 'clk-debugfs-simple' into clk-next
* clk-imx7d:
clk: imx7d: reset parent for mipi csi root
clk: imx7d: fix mipi dphy div parent
* clk-hisi-stub:
clk/driver/hisi: Consolidate the Kconfig for the CLOCK_STUB
* clk-mvebu:
clk: mvebu: use correct bit for 98DX3236 NAND
* clk-imx6-epit:
clk: imx6: add EPIT clock support
* clk-debugfs-simple:
clk: Return void from debug_init op
clk: remove clk_debugfs_add_file()
clk: tegra: no need to check return value of debugfs_create functions
clk: davinci: no need to check return value of debugfs_create functions
clk: bcm2835: no need to check return value of debugfs_create functions
clk: no need to check return value of debugfs_create functions
|
|
* clk-imx6sx:
clk: imx6sl: correct ocram_podf clock type
clk: imx6sx: disable unnecessary clocks during clock initialization
clk: imx6sx: add missing lvds2 clock to the clock tree
* clk-imx7d-enet:
ARM: dts: imx7: correct enet ipg clock
clk: imx7d: correct enet clock CCGR registers
clk: imx7d: correct enet phy ref clock gates
* clk-aspeed-24:
clk: aspeed: Add 24MHz fixed clock
|
|
and 'clk-qcom-mmagic' into clk-next
* clk-allwinner:
clk: sunxi-ng: r40: export a regmap to access the GMAC register
clk: sunxi-ng: r40: rewrite init code to a platform driver
clk: sunxi-ng: add support for H6 PRCM CCU
* clk-rockchip:
clk: rockchip: remove deprecated gate-clk code and dt-binding
clk: rockchip: use match_string() helper
* clk-tegra:
clk: tegra: Add quirk for getting CDEV1/2 clocks on Tegra20
clk: tegra20: Correct parents of CDEV1/2 clocks
clk: tegra20: Add DEV1/DEV2 OSC dividers
* clk-berlin:
clk: berlin: switch to SPDX license identifier
* clk-qcom-mmagic:
clk: qcom: mmcc-msm8996: leave all mmagic gdscs and clocks always enabled
clk: qcom: Register the gdscs before the clocks
clk: qcom: gdsc: Add support for ALWAYS_ON gdscs
|
|
'clk-mtk-mali' and 'clk-imx6ul-ccosr' into clk-next
* clk-hisi-usb:
clk: hisilicon: add missing usb3 clocks for Hi3798CV200 SoC
* clk-silent-bulk:
clk: bulk: silently error out on EPROBE_DEFER
* clk-mtk-hdmi:
clk: mediatek: correct the clocks for MT2701 HDMI PHY module
* clk-mtk-mali:
clk: mediatek: add g3dsys support for MT2701 and MT7623
dt-bindings: reset: mediatek: add entry for Mali-450 node to refer
dt-bindings: clock: mediatek: add entry for Mali-450 node to refer
dt-bindings: clock: mediatek: add g3dsys bindings
* clk-imx6ul-ccosr:
clk: imx: Add new clo01 and clo2 controlled by CCOSR
|
|
'clk-stratix10' and 'clk-aspeed' into clk-next
* clk-stm32mp1:
clk: stm32mp1: Fix a memory leak in 'clk_stm32_register_gate_ops()'
clk: stm32mp1: Add CLK_IGNORE_UNUSED to ck_sys_dbg clock
clk: stm32mp1: remove ck_apb_dbg clock
clk: stm32mp1: set stgen_k clock as critical
clk: stm32mp1: add missing tzc2 clock
clk: stm32mp1: fix SAI3 & SAI4 clocks
clk: stm32mp1: remove unused dfsdm_src[] const
clk: stm32mp1: add missing static
* clk-samsung:
clk: samsung: simplify getting .drvdata
* clk-uniphier-mpeg:
clk: uniphier: add LD11/LD20 stream demux system clock
* clk-stratix10:
clk: socfpga: stratix10: suppress unbinding platform's clock driver
clk: socfpga: stratix10: use platform driver APIs
* clk-aspeed:
clk:aspeed: Fix reset bits for PCI/VGA and PECI
clk: aspeed: Support second reset register
|
|
'clk-qcom-rcg-fix' into clk-next
* clk-qcom-rpmh:
dt-bindings: clock: Introduce QCOM RPMh clock bindings
* clk-npcm7xx:
clk: npcm7xx: fix return value check in npcm7xx_clk_init()
clk: npcm7xx: add clock controller
dt-binding: clk: npcm750: Add binding for Nuvoton NPCM7XX Clock
* clk-of-parent-count:
pinctrl: sunxi: Use of_clk_get_parent_count() instead of open coding
soc/tegra: pmc: Use of_clk_get_parent_count() instead of open coding
soc: rockchip: power-domain: Use of_clk_get_parent_count() instead of open coding
ARM: timer-sp: Use of_clk_get_parent_count() instead of open coding
clk: Extract OF clock helpers in <linux/of_clk.h>
* clk-qcom-rcg-fix:
clk: qcom: Base rcg parent rate off plan frequency
|
|
* clk-actions:
clk: actions: Add S900 SoC clock support
clk: actions: Add pll clock support
clk: actions: Add composite clock support
clk: actions: Add fixed factor clock support
clk: actions: Add factor clock support
clk: actions: Add divider clock support
clk: actions: Add mux clock support
clk: actions: Add gate clock support
clk: actions: Add common clock driver support
dt-bindings: clock: Add Actions S900 clock bindings
|
|
into clk-next
* clk-warn:
clk: Print the clock name and warning cause
* clk-core:
clk: Remove clk_init_cb typedef
* clk-spear:
clk: spear: fix WDT clock definition on SPEAr600
* clk-qcom-msm8998:
clk: qcom: Add MSM8998 Global Clock Control (GCC) driver
|
|
Sergei Shtylyov says:
====================
sh_eth: fix & clean up sh_eth_soft_swap()
Here's a set of 3 patches against DaveM's 'net-next.git' repo. First one fixes an
old buffer endiannes issue (luckily, the ARM SoCs are smart enough to not actually
care) plus couple clean ups around sh_eth_soft_swap()...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When initializing 'maxp' in sh_eth_soft_swap(), the buffer length needs
to be rounded up -- that's just asking for DIV_ROUND_UP()!
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sh_eth_tsu_soft_swap() is called twice by the driver, remove *inline* and
move that function from the header to the driver itself to let gcc decide
whether to expand it inline or not...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Browsing thru the driver disassembly, I noticed that ARM gcc generated
no code whatsoever for sh_eth_soft_swap() while building a little-endian
kernel -- apparently __LITTLE_ENDIAN__ was not being #define'd, however
it got implicitly #define'd when building with the SH gcc (I could only
find the explicit #define __LITTLE_ENDIAN that was #include'd when building
a little-endian kernel). Luckily, the Ether controller only doing big-
endian DMA is encountered on the early SH771x SoCs only and all ARM SoCs
implement EDMR.DE and thus set 'sh_eth_cpu_data::hw_swap'. But anyway, we
need to fix the #ifdef inside sh_eth_soft_swap() to something that would
work on all architectures...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the client holds a delegation, then ensure we filter out attempts
to invalidate the size, owner, group owner, or mode unless we made the
change, in which case, check that NFS_INO_REVAL_FORCED is set by the
caller.
Always filter out attempts to invalidate the change attribute and
size, since we are authoritative for those.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we hold a delegation, we should not need to call
nfs_check_inode_attributes() since we already know which attributes
are valid, and which ones may still need revalidation. The state
of the NFS_INO_REVAL_FORCED flag is therefore irrelevant.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Make sure that the client completely ignores change attribute and size
changes on the server when it holds a delegation.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Don't mark attributes as invalid just because they have changed. Instead,
for the purposes of adjusting the attribute cache timeout, keep a
separate variable that tracks whether or not a change occurred.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Always try to set the attributes, even if we don't have a valid struct
nfs_fattr.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If there are attributes that are still invalid when we set a delegation,
then we need to set the NFS_INO_REVAL_FORCED flag.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
If we hold a delegation, we don't need to care about whether or not
the inode attributes are up to date. We know we can cache the results
of this call regardless.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
We already earlier discouraged people from using this interface in
commit 88796e7e5c45 ("sched/swait: Document it clearly that the swait
facilities are special and shouldn't be used"), but I just got a pull
request with a new broken user.
So make the comment *really* clear.
The swait interfaces are bad, and should not be used unless you have
some *very* strong reasons that include tons of hard performance numbers
on just why you want to use them, and you show that you actually
understand that they aren't at all like the normal wait/wakeup
interfaces.
So far, every single user has been suspect. The main user is KVM, which
is completely pointless (there is only ever one waiter, which avoids the
interface subtleties, but also means that having a queue instead of a
pointer is counter-productive and certainly not an "optimization").
So make the comments much stronger.
Not that anybody likely reads them anyway, but there's always some
slight hope that it will cause somebody to think twice.
I'd like to remove this interface entirely, but there is the theoretical
possibility that it's actually the right thing to use in some situation,
most likely some deep RT use.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is a problem if we are going to unmap a rbd device and the
watch_dwork is going to queue delayed work for watch:
unmap Thread watch Thread timer
do_rbd_remove
cancel_tasks_sync(rbd_dev)
queue_delayed_work for watch
destroy_workqueue(rbd_dev->task_wq)
drain_workqueue(wq)
destroy other resources in wq
call_timer_fn
__queue_work()
Then the delayed work escape the cancel_tasks_sync() and
destroy_workqueue() and we will get an user-after-free call trace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
Modules linked in:
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 4.17.0-rc6+ #13
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__queue_work+0x6a/0x3b0
RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
RAX: ffff9427deca8400 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI: 0000000000000000
RBP: ffff942783e39e00 R08: ffff9427deca8400 R09: ffff9427df1c3f00
R10: 0000000000000004 R11: 0000000000000005 R12: ffff9427cfb85970
R13: 0000000000002000 R14: 000000000001eca0 R15: 0000000000000007
FS: 0000000000000000(0000) GS:ffff9427df1c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004c900a005 CR4: 00000000000206e0
Call Trace:
<IRQ>
? __queue_work+0x3b0/0x3b0
call_timer_fn+0x2d/0x130
run_timer_softirq+0x16e/0x430
? tick_sched_timer+0x37/0x70
__do_softirq+0xd2/0x280
irq_exit+0xd5/0xe0
smp_apic_timer_interrupt+0x6c/0x130
apic_timer_interrupt+0xf/0x20
[ Move rbd_dev->watch_dwork cancellation so that rbd_reregister_watch()
either bails out early because the watch is UNREGISTERED at that point
or just gets cancelled. ]
Cc: stable@vger.kernel.org
Fixes: 99d1694310df ("rbd: retry watch re-registration periodically")
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Based on code, default value of rsize/wsize is 16 MB.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
In current ceph_show_options(), there is no item for showing 'ino32',
so add showing mount option 'ino32' if the value is different with
default.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The check (intval < PAGE_SIZE) will involve type cast, so even when
specifying negative value to rsize/wsize/readdir_max_bytes, it will
pass the validation check successfully.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
On currently logic:
when I specify rasize=0~1 then it will be 4096.
when I specify rasize=2~4097 then it will be 8192.
Make it the same as rsize & wsize.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
KASAN found an UAF in ceph_statfs. This was a one-off bug but looking at
the code it looks like the monmap access needs to be protected as it can
be modified while we're accessing it. Fix this by protecting the access
with the monc->mutex.
BUG: KASAN: use-after-free in ceph_statfs+0x21d/0x2c0
Read of size 8 at addr ffff88006844f2e0 by task trinity-c5/304
CPU: 0 PID: 304 Comm: trinity-c5 Not tainted 4.17.0-rc6+ #172
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0xa5/0x11b
? show_regs_print_info+0x5/0x5
? kmsg_dump_rewind+0x118/0x118
? ceph_statfs+0x21d/0x2c0
print_address_description+0x73/0x2b0
? ceph_statfs+0x21d/0x2c0
kasan_report+0x243/0x360
ceph_statfs+0x21d/0x2c0
? ceph_umount_begin+0x80/0x80
? kmem_cache_alloc+0xdf/0x1a0
statfs_by_dentry+0x79/0xb0
vfs_statfs+0x28/0x110
user_statfs+0x8c/0xe0
? vfs_statfs+0x110/0x110
? __fdget_raw+0x10/0x10
__se_sys_statfs+0x5d/0xa0
? user_statfs+0xe0/0xe0
? mutex_unlock+0x1d/0x40
? __x64_sys_statfs+0x20/0x30
do_syscall_64+0xee/0x290
? syscall_return_slowpath+0x1c0/0x1c0
? page_fault+0x1e/0x30
? syscall_return_slowpath+0x13c/0x1c0
? prepare_exit_to_usermode+0xdb/0x140
? syscall_trace_enter+0x330/0x330
? __put_user_4+0x1c/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Allocated by task 130:
__kmalloc+0x124/0x210
ceph_monmap_decode+0x1c1/0x400
dispatch+0x113/0xd20
ceph_con_workfn+0xa7e/0x44e0
process_one_work+0x5f0/0xa30
worker_thread+0x184/0xa70
kthread+0x1a0/0x1c0
ret_from_fork+0x35/0x40
Freed by task 130:
kfree+0xb8/0x210
dispatch+0x15a/0xd20
ceph_con_workfn+0xa7e/0x44e0
process_one_work+0x5f0/0xa30
worker_thread+0x184/0xa70
kthread+0x1a0/0x1c0
ret_from_fork+0x35/0x40
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
inode info from non-auth can be stale.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
calc_target() isn't supposed to fail with anything but POOL_DNE, in
which case we report that the pool doesn't exist and fail the request
with -ENOENT. Doing this for -ENOMEM is at the very least confusing
and also harmful -- as the preceding requests complete, a short-lived
locator string allocation is likely to succeed after a wait.
(We used to call ceph_object_locator_to_pg() for a pi lookup. In
theory that could fail with -ENOENT, hence the "ret != -ENOENT" warning
being removed.)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The intent behind making it a per-request setting was that it would be
set for writes, but not for reads. As it is, the flag is set for all
fs/ceph requests except for pool perm check stat request (technically
a read).
ceph_osdc_abort_on_full() skips reads since the previous commit and
I don't see a use case for marking individual requests.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
Don't consider reads for aborting and use ->base_oloc instead of
->target_oloc, as done in __submit_request().
Strictly speaking, we shouldn't be aborting FULL_TRY/FULL_FORCE writes
either. But, there is an inconsistency in FULL_TRY/FULL_FORCE handling
on the OSD side [1], so given that neither of these is used in the
kernel client, leave it for when the OSD behaviour is sorted out.
[1] http://tracker.ceph.com/issues/24339
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
Sending map check after complete_request() was called is not only
useless, but can lead to a use-after-free as req->r_kref decrement in
__complete_request() races with map check code.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
The "FULL or reached pool quota" warning is there to explain paused
requests. No need to emit it if pausing isn't going to occur.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
Scanning the trees just to see if there is anything to abort is
unnecessary -- all that is needed here is to update the epoch barrier
first, before we start aborting. Simplify and do the update inside the
loop before calling abort_request() for the first time.
The switch to for_each_request() also fixes a bug: homeless requests
weren't even considered for aborting.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
In the common case, req->r_callback is called by handle_reply() on the
ceph-msgr worker thread without any locks. If handle_reply() fails, it
is called with both osd->lock and osdc->lock. In the map check case,
it is called with just osdc->lock but held for write. Finally, if the
request is aborted because of -ENOSPC or by ceph_osdc_abort_requests(),
it is called directly on the submitter's thread, again with both locks.
req->r_callback on the submitter's thread is relatively new (introduced
in 4.12) and ripe for deadlocks -- e.g. writeback worker thread waiting
on itself:
inode_wait_for_writeback+0x26/0x40
evict+0xb5/0x1a0
iput+0x1d2/0x220
ceph_put_wrbuffer_cap_refs+0xe0/0x2c0 [ceph]
writepages_finish+0x2d3/0x410 [ceph]
__complete_request+0x26/0x60 [libceph]
complete_request+0x2e/0x70 [libceph]
__submit_request+0x256/0x330 [libceph]
submit_request+0x2b/0x30 [libceph]
ceph_osdc_start_request+0x25/0x40 [libceph]
ceph_writepages_start+0xdfe/0x1320 [ceph]
do_writepages+0x1f/0x70
__writeback_single_inode+0x45/0x330
writeback_sb_inodes+0x26a/0x600
__writeback_inodes_wb+0x92/0xc0
wb_writeback+0x274/0x330
wb_workfn+0x2d5/0x3b0
Defer __complete_request() to a workqueue in all failure cases so it's
never on the same thread as ceph_osdc_start_request() and always called
with no locks held.
Link: http://tracker.ceph.com/issues/23978
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
Move req->r_completion wake up and req->r_kref decrement into
__complete_request().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
|
|
destroy_workqueue() drains the workqueue before proceeding with
destruction.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Pending works hold inode references, which cause "Busy inodes after
unmount" warning.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This avoid force umount waiting on page writeback:
io_schedule+0xd/0x30
wait_on_page_bit_common+0xc6/0x130
__filemap_fdatawait_range+0xbd/0x100
filemap_fdatawait_keep_errors+0x15/0x40
sync_inodes_sb+0x1cf/0x240
sync_filesystem+0x52/0x90
generic_shutdown_super+0x1d/0x110
ceph_kill_sb+0x28/0x80 [ceph]
deactivate_locked_super+0x35/0x60
cleanup_mnt+0x36/0x70
task_work_run+0x79/0xa0
exit_to_usermode_loop+0x62/0x70
do_syscall_64+0xdb/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
0xffffffffffffffff
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This will be used by the filesystem for "umount -f".
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Currently, calling stat on a cephfs directory returns 1 for st_nlink.
This behaviour has recently changed in the fuse client, as some
applications seem to expect this value to be either 0 (if it's
unlinked) or 2 + number of subdirectories. This behaviour was changed
in the fuse client with commit 67c7e4619188 ("client: use common
interp of st_nlink for dirs").
This patch modifies the kernel client to have a similar behaviour.
Link: https://tracker.ceph.com/issues/23873
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|