Age | Commit message (Collapse) | Author |
|
Do not set any special register usage options, use the default which
is exactly what we should use for userspace code.
Make sure we remove the gcc plugin options from the 64-bit build.
The 32-bit cflags got it right already.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Not in vclock_gettime.c itself.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the TICK_PRIV_BIT was set, we would not be able to read the tick
register in user space, which is where this code runs.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One interesting thing we need to do is stop using
__builtin_return_address() in get_vvar_data().
Simply read the %pc register instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current VDSO patch mechanism has several problems:
1) It assumes how gcc will emit a function, with a register
window, an initial save instruction and then immediately
the %tick read when compiling vread_tick().
There is no such guarantees, code generation could change
at any time, gcc could put a nop between the save and
the %tick read, etc.
So this is extremely fragile and would fail some day.
2) It disallows us to properly inline vread_tick() into the callers
and thus get the best possible code sequences.
So fix this to patch properly, with location based annotations.
We have to be careful because we cannot do it the way we do
patches elsewhere in the kernel. Those use a sequence like:
1:
insn
.section .whatever_patch, "ax"
.word 1b
replacement_insn
.previous
This is a dynamic shared object, so that .word cannot be resolved at
build time, and thus cannot be used to execute the patches when the
kernel initializes the images.
Even trying to use label difference equations doesn't work in the
above kind of scheme:
1:
insn
.section .whatever_patch, "ax"
.word . - 1b
replacement_insn
.previous
The assembler complains that it cannot resolve that computation.
The issue is that this is contained in an executable section.
Borrow the sequence used by x86 alternatives, which is:
1:
insn
.pushsection .whatever_patch, "a"
.word . - 1b, . - 1f
.popsection
.pushsection .whatever_patch_replacements, "ax"
1:
replacement_insn
.previous
This works, allows us to inline vread_tick() as much as we like, and
can be used for arbitrary kinds of VDSO patching in the future.
Also, reverse the condition for patching. Most systems are %stick
based, so if we only patch on %tick systems the patching code will
get little or no testing.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v5.0/v4.20
As ever there's a lot of small and driver specific changes going on
here, but we do also have some relatively large changes in the core
thanks to the hard work of Charles and Morimoto-san:
- More component transitions from Morimoto-san, I think we're about
finished with this. Thanks for all the hard work!
- Morimoto-san also added a bunch of for_each_foo macros
- A bunch of cleanups and fixes for DAPM from Charles.
- MCLK support for several different devices, including CS42L51, STM32
SAI, and MAX98373.
- Support for Allwinner A64 CODEC analog, Intel boards with DA7219 and
MAX98927, Meson AXG PDM inputs, Nuvoton NAU8822, Renesas R8A7744 and
TI PCM3060.
|
|
Prepare input updates for 4.20 merge window.
|
|
We were iterating a block group's free space cache rbtree without locking
first the lock that protects it (the free_space_ctl->free_space_offset
rbtree is protected by the free_space_ctl->tree_lock spinlock).
KASAN reported an use-after-free problem when iterating such a rbtree due
to a concurrent rbtree delete:
[ 9520.359168] ==================================================================
[ 9520.359656] BUG: KASAN: use-after-free in rb_next+0x13/0x90
[ 9520.359949] Read of size 8 at addr ffff8800b7ada500 by task btrfs-transacti/1721
[ 9520.360357]
[ 9520.360530] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G L 4.19.0-rc8-nbor #555
[ 9520.360990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 9520.362682] Call Trace:
[ 9520.362887] dump_stack+0xa4/0xf5
[ 9520.363146] print_address_description+0x78/0x280
[ 9520.363412] kasan_report+0x263/0x390
[ 9520.363650] ? rb_next+0x13/0x90
[ 9520.363873] __asan_load8+0x54/0x90
[ 9520.364102] rb_next+0x13/0x90
[ 9520.364380] btrfs_dump_free_space+0x146/0x160 [btrfs]
[ 9520.364697] dump_space_info+0x2cd/0x310 [btrfs]
[ 9520.364997] btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
[ 9520.365310] __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
[ 9520.365646] ? btrfs_update_time+0x180/0x180 [btrfs]
[ 9520.365923] ? _raw_spin_unlock+0x27/0x40
[ 9520.366204] ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
[ 9520.366549] btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
[ 9520.366880] cache_save_setup+0x42e/0x580 [btrfs]
[ 9520.367220] ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
[ 9520.367518] ? lock_downgrade+0x2f0/0x2f0
[ 9520.367799] ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
[ 9520.368104] ? kasan_check_read+0x11/0x20
[ 9520.368349] ? do_raw_spin_unlock+0xa8/0x140
[ 9520.368638] btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
[ 9520.368978] ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
[ 9520.369282] ? do_raw_spin_unlock+0xa8/0x140
[ 9520.369534] ? _raw_spin_unlock+0x27/0x40
[ 9520.369811] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
[ 9520.370137] commit_cowonly_roots+0x4b9/0x610 [btrfs]
[ 9520.370560] ? commit_fs_roots+0x350/0x350 [btrfs]
[ 9520.370926] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
[ 9520.371285] btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
[ 9520.371612] ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
[ 9520.371943] ? start_transaction+0x168/0x6c0 [btrfs]
[ 9520.372257] transaction_kthread+0x21c/0x240 [btrfs]
[ 9520.372537] kthread+0x1d2/0x1f0
[ 9520.372793] ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
[ 9520.373090] ? kthread_park+0xb0/0xb0
[ 9520.373329] ret_from_fork+0x3a/0x50
[ 9520.373567]
[ 9520.373738] Allocated by task 1804:
[ 9520.373974] kasan_kmalloc+0xff/0x180
[ 9520.374208] kasan_slab_alloc+0x11/0x20
[ 9520.374447] kmem_cache_alloc+0xfc/0x2d0
[ 9520.374731] __btrfs_add_free_space+0x40/0x580 [btrfs]
[ 9520.375044] unpin_extent_range+0x4f7/0x7a0 [btrfs]
[ 9520.375383] btrfs_finish_extent_commit+0x15f/0x4d0 [btrfs]
[ 9520.375707] btrfs_commit_transaction+0xb06/0x10e0 [btrfs]
[ 9520.376027] btrfs_alloc_data_chunk_ondemand+0x237/0x5c0 [btrfs]
[ 9520.376365] btrfs_check_data_free_space+0x81/0xd0 [btrfs]
[ 9520.376689] btrfs_delalloc_reserve_space+0x25/0x80 [btrfs]
[ 9520.377018] btrfs_direct_IO+0x42e/0x6d0 [btrfs]
[ 9520.377284] generic_file_direct_write+0x11e/0x220
[ 9520.377587] btrfs_file_write_iter+0x472/0xac0 [btrfs]
[ 9520.377875] aio_write+0x25c/0x360
[ 9520.378106] io_submit_one+0xaa0/0xdc0
[ 9520.378343] __se_sys_io_submit+0xfa/0x2f0
[ 9520.378589] __x64_sys_io_submit+0x43/0x50
[ 9520.378840] do_syscall_64+0x7d/0x240
[ 9520.379081] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 9520.379387]
[ 9520.379557] Freed by task 1802:
[ 9520.379782] __kasan_slab_free+0x173/0x260
[ 9520.380028] kasan_slab_free+0xe/0x10
[ 9520.380262] kmem_cache_free+0xc1/0x2c0
[ 9520.380544] btrfs_find_space_for_alloc+0x4cd/0x4e0 [btrfs]
[ 9520.380866] find_free_extent+0xa99/0x17e0 [btrfs]
[ 9520.381166] btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
[ 9520.381474] btrfs_get_blocks_direct+0x60b/0xbd0 [btrfs]
[ 9520.381761] __blockdev_direct_IO+0x10ee/0x58a1
[ 9520.382059] btrfs_direct_IO+0x25a/0x6d0 [btrfs]
[ 9520.382321] generic_file_direct_write+0x11e/0x220
[ 9520.382623] btrfs_file_write_iter+0x472/0xac0 [btrfs]
[ 9520.382904] aio_write+0x25c/0x360
[ 9520.383172] io_submit_one+0xaa0/0xdc0
[ 9520.383416] __se_sys_io_submit+0xfa/0x2f0
[ 9520.383678] __x64_sys_io_submit+0x43/0x50
[ 9520.383927] do_syscall_64+0x7d/0x240
[ 9520.384165] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 9520.384439]
[ 9520.384610] The buggy address belongs to the object at ffff8800b7ada500
which belongs to the cache btrfs_free_space of size 72
[ 9520.385175] The buggy address is located 0 bytes inside of
72-byte region [ffff8800b7ada500, ffff8800b7ada548)
[ 9520.385691] The buggy address belongs to the page:
[ 9520.385957] page:ffffea0002deb680 count:1 mapcount:0 mapping:ffff880108a1d700 index:0x0 compound_mapcount: 0
[ 9520.388030] flags: 0x8100(slab|head)
[ 9520.388281] raw: 0000000000008100 ffffea0002deb608 ffffea0002728808 ffff880108a1d700
[ 9520.388722] raw: 0000000000000000 0000000000130013 00000001ffffffff 0000000000000000
[ 9520.389169] page dumped because: kasan: bad access detected
[ 9520.389473]
[ 9520.389658] Memory state around the buggy address:
[ 9520.389943] ffff8800b7ada400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 9520.390368] ffff8800b7ada480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 9520.390796] >ffff8800b7ada500: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
[ 9520.391223] ^
[ 9520.391461] ffff8800b7ada580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 9520.391885] ffff8800b7ada600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 9520.392313] ==================================================================
[ 9520.392772] BTRFS critical (device vdc): entry offset 2258497536, bytes 131072, bitmap no
[ 9520.393247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000011
[ 9520.393705] PGD 800000010dbab067 P4D 800000010dbab067 PUD 107551067 PMD 0
[ 9520.394059] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 9520.394378] CPU: 4 PID: 1721 Comm: btrfs-transacti Tainted: G B L 4.19.0-rc8-nbor #555
[ 9520.394858] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 9520.395350] RIP: 0010:rb_next+0x3c/0x90
[ 9520.396461] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
[ 9520.396762] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
[ 9520.397115] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
[ 9520.397468] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
[ 9520.397821] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
[ 9520.398188] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
[ 9520.398555] FS: 0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
[ 9520.399007] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9520.399335] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
[ 9520.399679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9520.400023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9520.400400] Call Trace:
[ 9520.400648] btrfs_dump_free_space+0x146/0x160 [btrfs]
[ 9520.400974] dump_space_info+0x2cd/0x310 [btrfs]
[ 9520.401287] btrfs_reserve_extent+0x1ee/0x1f0 [btrfs]
[ 9520.401609] __btrfs_prealloc_file_range+0x1cc/0x620 [btrfs]
[ 9520.401952] ? btrfs_update_time+0x180/0x180 [btrfs]
[ 9520.402232] ? _raw_spin_unlock+0x27/0x40
[ 9520.402522] ? btrfs_alloc_data_chunk_ondemand+0x2c0/0x5c0 [btrfs]
[ 9520.402882] btrfs_prealloc_file_range_trans+0x23/0x30 [btrfs]
[ 9520.403261] cache_save_setup+0x42e/0x580 [btrfs]
[ 9520.403570] ? btrfs_check_data_free_space+0xd0/0xd0 [btrfs]
[ 9520.403871] ? lock_downgrade+0x2f0/0x2f0
[ 9520.404161] ? btrfs_write_dirty_block_groups+0x11f/0x6e0 [btrfs]
[ 9520.404481] ? kasan_check_read+0x11/0x20
[ 9520.404732] ? do_raw_spin_unlock+0xa8/0x140
[ 9520.405026] btrfs_write_dirty_block_groups+0x2af/0x6e0 [btrfs]
[ 9520.405375] ? btrfs_start_dirty_block_groups+0x870/0x870 [btrfs]
[ 9520.405694] ? do_raw_spin_unlock+0xa8/0x140
[ 9520.405958] ? _raw_spin_unlock+0x27/0x40
[ 9520.406243] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
[ 9520.406574] commit_cowonly_roots+0x4b9/0x610 [btrfs]
[ 9520.406899] ? commit_fs_roots+0x350/0x350 [btrfs]
[ 9520.407253] ? btrfs_run_delayed_refs+0x1b8/0x230 [btrfs]
[ 9520.407589] btrfs_commit_transaction+0x5e5/0x10e0 [btrfs]
[ 9520.407925] ? btrfs_apply_pending_changes+0x90/0x90 [btrfs]
[ 9520.408262] ? start_transaction+0x168/0x6c0 [btrfs]
[ 9520.408582] transaction_kthread+0x21c/0x240 [btrfs]
[ 9520.408870] kthread+0x1d2/0x1f0
[ 9520.409138] ? btrfs_cleanup_transaction+0xb50/0xb50 [btrfs]
[ 9520.409440] ? kthread_park+0xb0/0xb0
[ 9520.409682] ret_from_fork+0x3a/0x50
[ 9520.410508] Dumping ftrace buffer:
[ 9520.410764] (ftrace buffer empty)
[ 9520.411007] CR2: 0000000000000011
[ 9520.411297] ---[ end trace 01a0863445cf360a ]---
[ 9520.411568] RIP: 0010:rb_next+0x3c/0x90
[ 9520.412644] RSP: 0018:ffff8801074ff780 EFLAGS: 00010292
[ 9520.412932] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff81b5ac4c
[ 9520.413274] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000011
[ 9520.413616] RBP: ffff8801074ff7a0 R08: ffffed0021d64ccc R09: ffffed0021d64ccc
[ 9520.414007] R10: 0000000000000001 R11: ffffed0021d64ccb R12: ffff8800b91e0000
[ 9520.414349] R13: ffff8800a3ceba48 R14: ffff8800b627bf80 R15: 0000000000020000
[ 9520.416074] FS: 0000000000000000(0000) GS:ffff88010eb00000(0000) knlGS:0000000000000000
[ 9520.416536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9520.416848] CR2: 0000000000000011 CR3: 0000000106b52000 CR4: 00000000000006a0
[ 9520.418477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9520.418846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9520.419204] Kernel panic - not syncing: Fatal exception
[ 9520.419666] Dumping ftrace buffer:
[ 9520.419930] (ftrace buffer empty)
[ 9520.420168] Kernel Offset: disabled
[ 9520.420406] ---[ end Kernel panic - not syncing: Fatal exception ]---
Fix this by acquiring the respective lock before iterating the rbtree.
Reported-by: Nikolay Borisov <nborisov@suse.com>
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Clang warns:
drivers/rtc/rtc-s35390a.c:124:27: warning: implicit conversion from
'int' to 'char' changes value from 192 to -64 [-Wconstant-conversion]
buf = S35390A_FLAG_RESET | S35390A_FLAG_24H;
~ ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
1 warning generated.
Update buf to be an unsigned 8-bit integer, which matches the buf member
in struct i2c_msg.
https://github.com/ClangBuiltLinux/linux/issues/145
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
Pull dma mapping updates from Christoph Hellwig:
"First batch of dma-mapping changes for 4.20.
There will be a second PR as some big changes were only applied just
before the end of the merge window, and I want to give them a few more
days in linux-next.
Summary:
- mostly more consolidation of the direct mapping code, including
converting over hexagon, and merging the coherent and non-coherent
code into a single dma_map_ops instance (me)
- cleanups for the dma_configure/dma_unconfigure callchains (me)
- better handling of dma_masks in odd setups (me, Alexander Duyck)
- better debugging of passing vmalloc address to the DMA API (Stephen
Boyd)
- CMA command line parsing fix (He Zhe)"
* tag 'dma-mapping-4.20' of git://git.infradead.org/users/hch/dma-mapping: (27 commits)
dma-direct: respect DMA_ATTR_NO_WARN
dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN
dma-direct: document the zone selection logic
dma-debug: Check for drivers mapping invalid addresses in dma_map_single()
dma-direct: fix return value of dma_direct_supported
dma-mapping: move dma_default_get_required_mask under ifdef
dma-direct: always allow dma mask <= physiscal memory size
dma-direct: implement complete bus_dma_mask handling
dma-direct: refine dma_direct_alloc zone selection
dma-direct: add an explicit dma_direct_get_required_mask
dma-mapping: make the get_required_mask method available unconditionally
unicore32: remove swiotlb support
Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"
dma-mapping: support non-coherent devices in dma_common_get_sgtable
dma-mapping: consolidate the dma mmap implementations
dma-mapping: merge direct and noncoherent ops
dma-mapping: move the dma_coherent flag to struct device
MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT
dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
dma-mapping: fix panic caused by passing empty cma command line argument
...
|
|
Pull libata updates from Jens Axboe:
"Here are the libata changes queued up for 4.20:
- %pOFn device_node.name conversion (Rob Herring)
- Use LBAM/LBAH password defines instead of hardcoding (Linus
Walleij)
- Series adding support for the allwinner R40 AHCI controller
(Corentin Labbe)
- Disable ALPM for Ampere Computing device (Suman Tripathi)
- ahci bcrm fixes (Florian Fainelli)
- Redundant Kconfig defaults (Bartlomiej Zolnierkiewicz)
- Code cleanups (Nathan Chancellor)"
* tag 'for-4.20/libata-20181021' of git://git.kernel.dk/linux-block:
ata: remove redundant 'default n' from Kconfig
ata: ep93xx: Use proper enums for directions
ata: ahci_brcm: Allow using driver or DSL SoCs
ata: ahci_brcm: Match BCM63138 compatible strings
ata: ahci_brcm: Allow optional reset controller to be used
dt-bindings: ata: Document BCM63138 compatible string
pata_atiixp: Remove unnecessary parentheses
ata: Disable AHCI ALPM feature for Ampere Computing eMAG SATA
dt-bindings: ata: update ahci_sunxi bindings
ata: ahci_sunxi: add support for r40
dt-bindings: ata: ahci-platform: document phy-supply
ata: ahci_platform: add support for PHY controller regulator
dt-bindings: ata: ahci-platform: document ahci-supply
ata: ahci_platform: add support for AHCI controller regulator
dt-bindings: ata: ahci-platform: fix indentation of target-supply
libata: Use SMART LBAM/LBAH password defines
ata: ahci: Convert to using %pOFn instead of device_node.name
|
|
Pull block layer updates from Jens Axboe:
"This is the main pull request for block changes for 4.20. This
contains:
- Series enabling runtime PM for blk-mq (Bart).
- Two pull requests from Christoph for NVMe, with items such as;
- Better AEN tracking
- Multipath improvements
- RDMA fixes
- Rework of FC for target removal
- Fixes for issues identified by static checkers
- Fabric cleanups, as prep for TCP transport
- Various cleanups and bug fixes
- Block merging cleanups (Christoph)
- Conversion of drivers to generic DMA mapping API (Christoph)
- Series fixing ref count issues with blkcg (Dennis)
- Series improving BFQ heuristics (Paolo, et al)
- Series improving heuristics for the Kyber IO scheduler (Omar)
- Removal of dangerous bio_rewind_iter() API (Ming)
- Apply single queue IPI redirection logic to blk-mq (Ming)
- Set of fixes and improvements for bcache (Coly et al)
- Series closing a hotplug race with sysfs group attributes (Hannes)
- Set of patches for lightnvm:
- pblk trace support (Hans)
- SPDX license header update (Javier)
- Tons of refactoring patches to cleanly abstract the 1.2 and 2.0
specs behind a common core interface. (Javier, Matias)
- Enable pblk to use a common interface to retrieve chunk metadata
(Matias)
- Bug fixes (Various)
- Set of fixes and updates to the blk IO latency target (Josef)
- blk-mq queue number updates fixes (Jianchao)
- Convert a bunch of drivers from the old legacy IO interface to
blk-mq. This will conclude with the removal of the legacy IO
interface itself in 4.21, with the rest of the drivers (me, Omar)
- Removal of the DAC960 driver. The SCSI tree will introduce two
replacement drivers for this (Hannes)"
* tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block: (204 commits)
block: setup bounce bio_sets properly
blkcg: reassociate bios when make_request() is called recursively
blkcg: fix edge case for blk_get_rl() under memory pressure
nvme-fabrics: move controller options matching to fabrics
nvme-rdma: always have a valid trsvcid
mtip32xx: fully switch to the generic DMA API
rsxx: switch to the generic DMA API
umem: switch to the generic DMA API
sx8: switch to the generic DMA API
sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE code
skd: switch to the generic DMA API
ubd: remove use of blk_rq_map_sg
nvme-pci: remove duplicate check
drivers/block: Remove DAC960 driver
nvme-pci: fix hot removal during error handling
nvmet-fcloop: suppress a compiler warning
nvme-core: make implicit seed truncation explicit
nvmet-fc: fix kernel-doc headers
nvme-fc: rework the request initialization code
nvme-fc: introduce struct nvme_fcp_op_w_sgl
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Apart from some new arm64 features and clean-ups, this also contains
the core mmu_gather changes for tracking the levels of the page table
being cleared and a minor update to the generic
compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ.
Summary:
- Core mmu_gather changes which allow tracking the levels of
page-table being cleared together with the arm64 low-level flushing
routines
- Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
mitigate Spectre-v4 dynamically without trapping to EL3 firmware
- Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
- Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
- Support for Common Not Private (CnP) translations allowing threads
of the same CPU to share the TLB entries
- Accelerated crc32 routines
- Move swapper_pg_dir to the rodata section
- Trap WFI instruction executed in user space
- ARM erratum 1188874 workaround (arch_timer)
- Miscellaneous fixes and clean-ups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
arm64: cpufeature: Fix handling of CTR_EL0.IDC field
arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
Documentation/arm64: HugeTLB page implementation
arm64: mm: Use __pa_symbol() for set_swapper_pgd()
arm64: Add silicon-errata.txt entry for ARM erratum 1188873
Revert "arm64: uaccess: implement unsafe accessors"
arm64: mm: Drop the unused cpu parameter
MAINTAINERS: fix bad sdei paths
arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines
arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
arm64: xen: Use existing helper to check interrupt status
arm64: Use daifflag_restore after bp_hardening
arm64: daifflags: Use irqflags functions for daifflags
arm64: arch_timer: avoid unused function warning
arm64: Trap WFI executed in userspace
arm64: docs: Document SSBS HWCAP
arm64: docs: Fix typos in ELF hwcaps
arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe
...
|
|
flush_pool is leaked when flush bio size is zero
Fixes: 5a409b4f56d5 ("MD: fix lock contention for flush bios")
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
I noticed kmemleak report memory leak when run create/stop
md in a loop, backtrace:
[<000000001ca975e7>] mempool_create_node+0x86/0xd0
[<0000000095576bcd>] md_run+0x1057/0x1410 [md_mod]
[<000000007b45c5fc>] do_md_run+0x15/0x130 [md_mod]
[<000000001ede9ec0>] md_ioctl+0x1f49/0x25d0 [md_mod]
[<000000004142cacf>] blkdev_ioctl+0x680/0xd00
The root cause is we alloc mddev->flush_pool and
mddev->flush_bio_pool in md_run, but from do_md_stop
will not call into md_stop but __md_stop, move the
mempool_destroy to __md_stop fixes the problem for me.
The bug was introduced in 5a409b4f56d5, the fixes should go to
4.18+
Fixes: 5a409b4f56d5 ("MD: fix lock contention for flush bios")
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Commit 51ed73eb998a1c79a2b0e9bed68f75a8a2c93b9b ("rtc: ds1340: Add support
for trickle charger.") breaks ds1339 wakealarm support by limiting
accessible registers. Fix this.
Fixes: 51ed73eb998a ("rtc: ds1340: Add support for trickle charger.")
Cc: stable@vger.kernel.org
Signed-off-by: Soeren Moch <smoch@web.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
This dead branch was missed during review. It only makes memory_entry()
more inefficient due to needless call to is_power_of_2(), etc.
Reported-by: shenghui <shhuiw@foxmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
When ramoops reserved a memory region in the kernel, it had an unhelpful
label of "persistent_memory". When reading /proc/iomem, it would be
repeated many times, did not hint that it was ramoops in particular,
and didn't clarify very much about what each was used for:
400000000-407ffffff : Persistent Memory (legacy)
400000000-400000fff : persistent_memory
400001000-400001fff : persistent_memory
...
4000ff000-4000fffff : persistent_memory
Instead, this adds meaningful labels for how the various regions are
being used:
400000000-407ffffff : Persistent Memory (legacy)
400000000-400000fff : ramoops:dump(0/252)
400001000-400001fff : ramoops:dump(1/252)
...
4000fc000-4000fcfff : ramoops:dump(252/252)
4000fd000-4000fdfff : ramoops:console
4000fe000-4000fe3ff : ramoops:ftrace(0/3)
4000fe400-4000fe7ff : ramoops:ftrace(1/3)
4000fe800-4000febff : ramoops:ftrace(2/3)
4000fec00-4000fefff : ramoops:ftrace(3/3)
4000ff000-4000fffff : ramoops:pmsg
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
|
|
This refactors compression initialization slightly to better handle
getting potentially called twice (via early pstore_register() calls
and later pstore_init()) and improves the comments and reporting to be
more verbose.
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
|
|
ramoops's call of pstore_register() was recently moved to run during
late_initcall() because the crypto backend may not have been ready during
postcore_initcall(). This meant early-boot crash dumps were not getting
caught by pstore any more.
Instead, lets allow calls to pstore_register() earlier, and once crypto
is ready we can initialize the compression.
Reported-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Fixes: cb3bee0369bc ("pstore: Use crypto compress API")
[kees: trivial rebase]
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
|
|
In preparation for having additional actions during init/exit, this moves
the init/exit into platform.c, centralizing the logic to make call outs
to the fs init/exit.
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
|
|
Commit f42b0e18f2e5 ("of: add node name compare helper functions")
failed to add the module exports to of_node_name_eq() and
of_node_name_prefix(). Add them now.
Fixes: f42b0e18f2e5 ("of: add node name compare helper functions")
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Add a new mount option 'nocopyfrom' that will prevent the usage of the
RADOS 'copy-from' operation in cephfs. This could be useful, for example,
for an administrator to temporarily mitigate any possible bugs in the
'copy-from' implementation.
Currently, only copy_file_range uses this RADOS operation. Setting this
mount option will result in this syscall reverting to the default VFS
implementation, i.e. to perform the copies locally instead of doing remote
object copies.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This commit implements support for the copy_file_range syscall in cephfs.
It is implemented using the RADOS 'copy-from' operation, which allows to
do a remote object copy, without the need to download/upload data from/to
the OSDs.
Some manual copy may however be required if the source/destination file
offsets aren't object aligned or if the copy length is smaller than the
object size.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Add support for performing remote object copies using the 'copy-from'
operation.
[ Add COPY_FROM to get_num_data_items(). ]
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
ceph_try_get_caps currently calls try_get_cap_refs with the nonblock
parameter always set to 'true'. This change adds a new parameter that
allows to set it's value. This will be useful for a follow-up patch that
will need to get two sets of capabilities for two different inodes without
risking a deadlock.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
setup_request_data() adds message data items to both request and reply
messages, but only checks request num_data_items before proceeding with
the loop. This is wrong because if an op doesn't have any request data
items but has a reply data item (e.g. read), a duplicate data item gets
added to the message on every resend attempt.
This went unnoticed for years but now that message data items are
preallocated, it promptly crashes in ceph_msg_data_add(). Amend the
signature to make it clear that setup_request_data() operates on both
request and reply messages. Also, remove data_len assert -- we have
another one in prepare_write_message().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Currently message data items are allocated with ceph_msg_data_create()
in setup_request_data() inside send_request(). send_request() has never
been allowed to fail, so each allocation is followed by a BUG_ON:
data = ceph_msg_data_create(...);
BUG_ON(!data);
It's been this way since support for multiple message data items was
added in commit 6644ed7b7e04 ("libceph: make message data be a pointer")
in 3.10.
There is no reason to delay the allocation of message data items until
the last possible moment and we certainly don't need a linked list of
them as they are only ever appended to the end and never erased. Make
ceph_msg_new2() take max_data_items and adapt the rest of the code.
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The current requirement is that ceph_osdc_alloc_messages() should be
called after oid and oloc are known. In preparation for preallocating
message data items, move ceph_osdc_alloc_messages() further down, so
that it is called when OSD op codes are known.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
ceph_osdc_alloc_messages() call will be moved out of
alloc_linger_request() in the next commit, which means that
ceph_osdc_watch() will need to call ceph_osdc_alloc_messages()
twice. Add a helper for that.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Register lingers directly in linger_submit(). This avoids allocating
memory for notify pagelist while holding osdc->lock and simplifies both
callers of linger_submit().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
ceph_msgpool_get() can fall back to ceph_msg_new() when it is asked for
a message whose front portion is larger than pool->front_len. However
the caller always passes 0, effectively disabling that code path. The
allocation goes to the message pool and returns a message with a front
that is smaller than requested, setting us up for a crash.
One example of this is a directory with a large number of snapshots.
If its snap context doesn't fit, we oops in encode_request_partial().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Two OSD op slots are allocated, but only one is ever used.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Any uninitialized or unknown ops will be caught by the default clause
anyway.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
__cap_delay_requeue could be invoked through ceph_check_caps when there
exists caps that needs to be sent and are delayed by "i_hold_caps_min"
or "i_hold_caps_max". If __cap_delay_requeue sets timeout unconditionally,
there could be a chance that some "wanted" caps can not be release for a
long since their timeouts are reset every time they get delayed.
Fixes: http://tracker.ceph.com/issues/36369
Signed-off-by: Xuehan Xu <xuxuehan@360.cn>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Because send_mds_reconnect() wants to send a message with a pagelist
and pass the ownership to the messenger, ceph_msg_data_add_pagelist()
consumes a ref which is then put in ceph_msg_data_destroy(). This
makes managing pagelists in the OSD client (where they are wrapped in
ceph_osd_data) unnecessarily hard because the handoff only happens in
ceph_osdc_start_request() instead of when the pagelist is passed to
ceph_osd_data_pagelist_init(). I counted several memory leaks on
various error paths.
Fix up ceph_msg_data_add_pagelist() and carry a pagelist ref in
ceph_osd_data.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
struct ceph_pagelist cannot be embedded into anything else because it
has its own refcount. Merge allocation and initialization together.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
If the read is large enough, we end up spinning in the messenger:
libceph: osd0 192.168.122.1:6801 io error
libceph: osd0 192.168.122.1:6801 io error
libceph: osd0 192.168.122.1:6801 io error
This is a receive side limit, so only reads were affected.
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Current implementation of cephfs fallocate isn't correct as it doesn't
really reserve the space in the cluster, which means that a subsequent
call to a write may actually fail due to lack of space. In fact, it is
currently possible to fallocate an amount space that is larger than the
free space in the cluster. It has behaved this way since the initial
commit ad7a60de882a ("ceph: punch hole support").
Since there's no easy solution to fix this at the moment, this patch
simply removes support for all fallocate operations but
FALLOC_FL_PUNCH_HOLE (which implies FALLOC_FL_KEEP_SIZE).
Link: https://tracker.ceph.com/issues/36317
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Avoid allocating memory for the entire user request: striped_read()
does a synchronous OSD request per object, so it doesn't need more than
object size worth of pages at a time.
[ Preserve the comment, changelog. ]
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
d_lookup()/d_alloc() require parent inode locked. Parent inode is
not locked if request is aborted.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
This reverts commit 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3.
splice_dentry() is used by three places. For two places, req->r_dentry
is passed to splice_dentry(). In the case of error, req->r_dentry does
not get updated. So splice_dentry() should not drop reference.
Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Do the snap check first in ceph_set_acl(), so we can avoid
unnecessary operations when the inode has snap.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Add __init/__exit annotation to init/cleanup helpers
which are only called once in the module.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
__cap_delay_requeue() only requeue inode which does not
have CEPH_I_FLUSH flag, so avoid reset cap hold timeout
for that inode.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Put syscon device node when it is not needed anymore.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
The init of the pkey module currently fails if the pckmo instruction
or the subfunctions are not available. However, customers may
restrict their LPAR to switch off exactly these functions and work
with secure key only. So it is a valid case to have the pkey module
active and use it for secure key to protected key transfer only.
This patch moves the pckmo subfunction check from the pkey module init
function into the internal function where the pckmo instruction is
called. So now only on invocation of the pckmo instruction the check
for the required subfunction is done. If not available EOPNOTSUPP is
returned to the caller.
The check for having the pckmo instruction available is still done
during module init. This instruction came in with MSA 3 together with
the basic set of kmc instructions needed to work with protected keys.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|