summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-06-16io_uring: do not use prio task_work_add in uring_cmdDylan Yudaken
io_req_task_prio_work_add has a strict assumption that it will only be used with io_req_task_complete. There is a codepath that assumes this is the case and will not even call the completion function if it is hit. For uring_cmd with an arbitrary completion function change the call to the correct non-priority version. Fixes: ee692a21e9bf8 ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Dylan Yudaken <dylany@fb.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20220616135011.441980-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16ext4, doc: remove unnecessary escapingWang Jianjian
Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com> Link: https://lore.kernel.org/r/20220520022255.2120576-2-wangjianjian3@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16ext4: fix incorrect comment in ext4_bio_write_page()Wang Jianjian
Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com> Link: https://lore.kernel.org/r/20220520022255.2120576-1-wangjianjian3@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16mtd: rawnand: gpmi: Fix setting busy timeout settingSascha Hauer
The DEVICE_BUSY_TIMEOUT value is described in the Reference Manual as: | Timeout waiting for NAND Ready/Busy or ATA IRQ. Used in WAIT_FOR_READY | mode. This value is the number of GPMI_CLK cycles multiplied by 4096. So instead of multiplying the value in cycles with 4096, we have to divide it by that value. Use DIV_ROUND_UP to make sure we are on the safe side, especially when the calculated value in cycles is smaller than 4096 as typically the case. This bug likely never triggered because any timeout != 0 usually will do. In my case the busy timeout in cycles was originally calculated as 2408, which multiplied with 4096 is 0x968000. The lower 16 bits were taken for the 16 bit wide register field, so the register value was 0x8000. With 2970bf5a32f0 ("mtd: rawnand: gpmi: fix controller timings setting") however the value in cycles became 2384, which multiplied with 4096 is 0x950000. The lower 16 bit are 0x0 now resulting in an intermediate timeout when reading from NAND. Fixes: b1206122069aa ("mtd: rawnand: gpmi: use core timings instead of an empirical derivation") Cc: stable@vger.kernel.org Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220614083138.3455683-1-s.hauer@pengutronix.de
2022-06-16fs: fix jbd2_journal_try_to_free_buffers() kernel-doc commentYang Li
Add the description of @folio and remove @page in function kernel-doc comment to remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. fs/jbd2/transaction.c:2149: warning: Function parameter or member 'folio' not described in 'jbd2_journal_try_to_free_buffers' fs/jbd2/transaction.c:2149: warning: Excess function parameter 'page' description in 'jbd2_journal_try_to_free_buffers' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220512075432.31763-1-yang.lee@linux.alibaba.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-06-16io_uring: commit non-pollable provided mapped buffers upfrontJens Axboe
For recv/recvmsg, IO either completes immediately or gets queued for a retry. This isn't the case for read/readv, if eg a normal file or a block device is used. Here, an operation can get queued with the block layer. If this happens, ring mapped buffers must get committed immediately to avoid that the next read can consume the same buffer. Check if we're dealing with pollable file, when getting a new ring mapped provided buffer. If it's not, commit it immediately rather than wait post issue. If we don't wait, we can race with completions coming in, or just plain buffer reuse by committing after a retry where others could have grabbed the same buffer. Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers") Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-16drm/vc4: Warn if some v3d code is run on BCM2711Maxime Ripard
The BCM2711 has a separate driver for the v3d, and thus we can't call into any of the driver entrypoints that rely on the v3d being there. Let's add a bunch of checks and complain loudly if that ever happen. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-15-maxime@cerno.tech
2022-06-16drm/vc4: crtc: Fix out of order frames during asynchronous page flipsMaxime Ripard
When doing an asynchronous page flip (PAGE_FLIP ioctl with the DRM_MODE_PAGE_FLIP_ASYNC flag set), the current code waits for the possible GPU buffer being rendered through a call to vc4_queue_seqno_cb(). On the BCM2835-37, the GPU driver is part of the vc4 driver and that function is defined in vc4_gem.c to wait for the buffer to be rendered, and once it's done, call a callback. However, on the BCM2711 used on the RaspberryPi4, the GPU driver is separate (v3d) and that function won't do anything. This was working because we were going into a path, due to uninitialized variables, that was always scheduling the callback. However, we were never actually waiting for the buffer to be rendered which was resulting in frames being displayed out of order. The generic API to signal those kind of completion in the kernel are the DMA fences, and fortunately the v3d drivers supports them and signal when its job is done. That API also provides an equivalent function that allows to have a callback being executed when the fence is signalled as done. Let's change our driver a bit to rely on the previous function for the older SoCs, and on DMA fences for the BCM2711. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Melissa Wen <mwen@igalia.com> Link: https://lore.kernel.org/r/20220610115149.964394-14-maxime@cerno.tech
2022-06-16drm/vc4: crtc: Don't call into BO Handling on Async Page-Flips on BCM2711Maxime Ripard
The BCM2711 doesn't have a v3d GPU so we don't want to call into its BO management code. Let's create an asynchronous page-flip handler for the BCM2711 that just calls into the common code. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-13-maxime@cerno.tech
2022-06-16drm/vc4: crtc: Move the BO Handling out of Common Page-Flip HandlerMaxime Ripard
The function vc4_async_page_flip() handles asynchronous page-flips in the vc4 driver. However, it mixes some generic code with code that should only be run on older generations that have the GPU handled by the vc4 driver. Let's split the generic part out of vc4_async_page_flip() and into a common function that we be reusable by an handler made for the BCM2711. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-12-maxime@cerno.tech
2022-06-16drm/vc4: crtc: Move the BO handling out of common page-flip callbackMaxime Ripard
We'll soon introduce another completion callback source that won't need to use the BO reference counting, so let's move it around to create a function we will be able to share between both callbacks. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-11-maxime@cerno.tech
2022-06-16drm/vc4: crtc: Use an union to store the page flip callbackMaxime Ripard
We'll need to extend the vc4_async_flip_state structure to rely on another callback implementation, so let's move the current one into a union. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-10-maxime@cerno.tech
2022-06-16drm/vc4: drv: Skip BO Backend Initialization on BCM2711Maxime Ripard
On the BCM2711, we currently call the vc4_bo_cache_init() and vc4_gem_init() functions. These functions initialize the BO and GEM backends. However, this code was initially created to accomodate the requirements of the GPU on the older SoCs, while the BCM2711 has a separate driver for it. So let's just skip these calls when we're on a newer hardware. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-9-maxime@cerno.tech
2022-06-16drm/vc4: plane: Register a different drm_plane_helper_funcs on BCM2711Maxime Ripard
On the BCM2711, our current definition of drm_plane_helper_funcs uses the custom vc4_prepare_fb() and vc4_cleanup_fb(). Those functions rely on the buffer allocation path that was relying on the GPU, and is no longer relevant. Let's create another drm_plane_helper_funcs structure that we will register on the BCM2711. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-8-maxime@cerno.tech
2022-06-16drm/vc4: kms: Register a different drm_mode_config_funcs on BCM2711Maxime Ripard
On the BCM2711, our current definition of drm_mode_config_funcs uses the custom vc4_fb_create(). However, that function relies on the buffer allocation path that was relying on the GPU, and is no longer relevant. Let's create another drm_mode_config_funcs structure that we will register on the BCM2711. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-7-maxime@cerno.tech
2022-06-16drm/vc4: drv: Register a different driver on BCM2711Maxime Ripard
Prior to the BCM2711/RaspberryPi4, the GPU was a part of the display components of the SoC. It was thus a part of the vc4 driver. However, with the BCM2711, it got split out and thus the v3d driver was created. The vc4 driver now only handles the display part. We didn't properly split out the code when doing the BCM2711 support though, and most of the code around buffer allocations is still involved, even though it doesn't have the backing hardware anymore. Let's start the split out by creating a new drm_driver that only reports and uses what we support on the BCM2711. The ioctl were properly filtered already, but we were still exposing a .gem_create_object hook, as well as having an .open and .postclose hooks which are only relevant on older generations. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-6-maxime@cerno.tech
2022-06-16drm/vc4: bo: Split out Dumb buffers fixupMaxime Ripard
The vc4_bo_dumb_create() both fixes up the allocation arguments to match the hardware constraints and actually performs the allocation. Since we're going to introduce a new function that uses a different allocator, let's split the arguments fixup to a separate function we will be able to reuse. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-5-maxime@cerno.tech
2022-06-16drm/vc4: bo: Rename vc4_dumb_createMaxime Ripard
We're going to add a new variant of the dumb BO allocation function, so let's rename vc4_dumb_create() to something a bit more specific. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-4-maxime@cerno.tech
2022-06-16drm/vc4: Consolidate Hardware Revision CheckMaxime Ripard
A new generation of controller has been introduced with the BCM2711/RaspberryPi4. This generation needs a bunch of quirks, and over time we've piled on a number of checks in most parts of the drivers. All these checks are performed several times, and are not always consistent. Let's create a single, global, variable to hold it and use it everywhere. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-3-maxime@cerno.tech
2022-06-16drm/vc4: plane: Prevent async update if we don't have a dlistMaxime Ripard
The vc4 planes are setup in hardware by creating a hardware descriptor in a dedicated RAM. As part of the process to setup a plane in KMS, we thus need to allocate some part of that dedicated RAM to store our descriptor there. The async update path will just reuse the descriptor already allocated for that plane and will modify it directly in RAM to match whatever has been asked for. In order to do that, it will compare the descriptor for the old plane state and the new plane state, will make sure they fit in the same size, and check that only the position or buffer address have changed. Reviewed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220610115149.964394-2-maxime@cerno.tech
2022-06-16init: Initialize noop_backing_dev_info earlyJan Kara
noop_backing_dev_info is used by superblocks of various pseudofilesystems such as kdevtmpfs. After commit 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") this broke because __mark_inode_dirty() started to access more fields from noop_backing_dev_info and this led to crashes inside locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty(). Fix the problem by initializing noop_backing_dev_info before the filesystems get mounted. Fixes: 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-16ext2: fix fs corruption when trying to remove a non-empty directory with IO ↵Ye Bin
error We got issue as follows: [home]# mount /dev/sdd test [home]# cd test [test]# ls dir1 lost+found [test]# rmdir dir1 ext2_empty_dir: inject fault [test]# ls lost+found [test]# cd .. [home]# umount test [home]# fsck.ext2 -fn /dev/sdd e2fsck 1.42.9 (28-Dec-2013) Pass 1: Checking inodes, blocks, and sizes Inode 4065, i_size is 0, should be 1024. Fix? no Pass 2: Checking directory structure Pass 3: Checking directory connectivity Unconnected directory inode 4065 (/???) Connect to /lost+found? no '..' in ... (4065) is / (2), should be <The NULL inode> (0). Fix? no Pass 4: Checking reference counts Inode 2 ref count is 3, should be 4. Fix? no Inode 4065 ref count is 2, should be 3. Fix? no Pass 5: Checking group summary information /dev/sdd: ********** WARNING: Filesystem still has errors ********** /dev/sdd: 14/128016 files (0.0% non-contiguous), 18477/512000 blocks Reason is same with commit 7aab5c84a0f6. We can't assume directory is empty when read directory entry failed. Link: https://lore.kernel.org/r/20220615090010.1544152-1-yebin10@huawei.com Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-16drm/sun4i: Fix crash during suspend after component bind failureSamuel Holland
If the component driver fails to bind, or is unbound, the driver data for the top-level platform device points to a freed drm_device. If the system is then suspended, the driver passes this dangling pointer to drm_mode_config_helper_suspend(), which crashes. Fix this by only setting the driver data while the platform driver holds a reference to the drm_device. Fixes: 624b4b48d9d8 ("drm: sun4i: Add support for suspending the display driver") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220615054254.16352-1-samuel@sholland.org
2022-06-16drm/sun4i: dw-hdmi: Fix ddc-en GPIO consumer conflictSamuel Holland
commit 6de79dd3a920 ("drm/bridge: display-connector: add ddc-en gpio support") added a consumer for this GPIO in the HDMI connector device. This new consumer conflicts with the pre-existing GPIO consumer in the sun8i HDMI controller driver, which prevents the driver from probing: [ 4.983358] display-connector connector: GPIO lookup for consumer ddc-en [ 4.983364] display-connector connector: using device tree for GPIO lookup [ 4.983392] gpio-226 (ddc-en): gpiod_request: status -16 [ 4.983399] sun8i-dw-hdmi 6000000.hdmi: Couldn't get ddc-en gpio [ 4.983618] sun4i-drm display-engine: failed to bind 6000000.hdmi (ops sun8i_dw_hdmi_ops [sun8i_drm_hdmi]): -16 [ 4.984082] sun4i-drm display-engine: Couldn't bind all pipelines components [ 4.984171] sun4i-drm display-engine: adev bind failed: -16 [ 4.984179] sun8i-dw-hdmi: probe of 6000000.hdmi failed with error -16 Both drivers have the same behavior: they leave the GPIO active for the life of the device. Let's take advantage of the new implementation, and drop the now-obsolete code from the HDMI controller driver. Fixes: 6de79dd3a920 ("drm/bridge: display-connector: add ddc-en gpio support") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20220614073100.11550-1-samuel@sholland.org
2022-06-15xfs: preserve DIFLAG2_NREXT64 when setting other inode attributesDarrick J. Wong
It is vitally important that we preserve the state of the NREXT64 inode flag when we're changing the other flags2 fields. Fixes: 9b7d16e34bbe ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Chandan Babu R <chandan.babu@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15xfs: fix variable state usageDarrick J. Wong
The variable @args is fed to a tracepoint, and that's the only place it's used. This is fine for the kernel, but for userspace, tracepoints are #define'd out of existence, which results in this warning on gcc 11.2: xfs_attr.c: In function ‘xfs_attr_node_try_addname’: xfs_attr.c:1440:42: warning: unused variable ‘args’ [-Wunused-variable] 1440 | struct xfs_da_args *args = attr->xattri_da_args; | ^~~~ Clean this up. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15xfs: fix TOCTOU race involving the new logged xattrs control knobDarrick J. Wong
I found a race involving the larp control knob, aka the debugging knob that lets developers enable logging of extended attribute updates: Thread 1 Thread 2 echo 0 > /sys/fs/xfs/debug/larp setxattr(REPLACE) xfs_has_larp (returns false) xfs_attr_set echo 1 > /sys/fs/xfs/debug/larp xfs_attr_defer_replace xfs_attr_init_replace_state xfs_has_larp (returns true) xfs_attr_init_remove_state <oops, wrong DAS state!> This isn't a particularly severe problem right now because xattr logging is only enabled when CONFIG_XFS_DEBUG=y, and developers *should* know what they're doing. However, the eventual intent is that callers should be able to ask for the assistance of the log in persisting xattr updates. This capability might not be required for /all/ callers, which means that dynamic control must work correctly. Once an xattr update has decided whether or not to use logged xattrs, it needs to stay in that mode until the end of the operation regardless of what subsequent parallel operations might do. Therefore, it is an error to continue sampling xfs_globals.larp once xfs_attr_change has made a decision about larp, and it was not correct for me to have told Allison that ->create_intent functions can sample the global log incompat feature bitfield to decide to elide a log item. Instead, create a new op flag for the xfs_da_args structure, and convert all other callers of xfs_has_larp and xfs_sb_version_haslogxattrs within the attr update state machine to look for the operations flag. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2022-06-15selinux: free contexts previously transferred in selinux_add_opt()Christian Göttsche
`selinux_add_opt()` stopped taking ownership of the passed context since commit 70f4169ab421 ("selinux: parse contexts for mount options early"). unreferenced object 0xffff888114dfd140 (size 64): comm "mount", pid 15182, jiffies 4295687028 (age 796.340s) hex dump (first 32 bytes): 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f system_u:object_ 72 3a 74 65 73 74 5f 66 69 6c 65 73 79 73 74 65 r:test_filesyste backtrace: [<ffffffffa07dbef4>] kmemdup_nul+0x24/0x80 [<ffffffffa0d34253>] selinux_sb_eat_lsm_opts+0x293/0x560 [<ffffffffa0d13f08>] security_sb_eat_lsm_opts+0x58/0x80 [<ffffffffa0af1eb2>] generic_parse_monolithic+0x82/0x180 [<ffffffffa0a9c1a5>] do_new_mount+0x1f5/0x550 [<ffffffffa0a9eccb>] path_mount+0x2ab/0x1570 [<ffffffffa0aa019e>] __x64_sys_mount+0x20e/0x280 [<ffffffffa1f47124>] do_syscall_64+0x34/0x80 [<ffffffffa200007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff888108e71640 (size 64): comm "fsmount", pid 7607, jiffies 4295044974 (age 1601.016s) hex dump (first 32 bytes): 73 79 73 74 65 6d 5f 75 3a 6f 62 6a 65 63 74 5f system_u:object_ 72 3a 74 65 73 74 5f 66 69 6c 65 73 79 73 74 65 r:test_filesyste backtrace: [<ffffffff861dc2b1>] memdup_user+0x21/0x90 [<ffffffff861dc367>] strndup_user+0x47/0xa0 [<ffffffff864f6965>] __do_sys_fsconfig+0x485/0x9f0 [<ffffffff87940124>] do_syscall_64+0x34/0x80 [<ffffffff87a0007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Cc: stable@vger.kernel.org Fixes: 70f4169ab421 ("selinux: parse contexts for mount options early") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-06-15audit: free module nameChristian Göttsche
Reset the type of the record last as the helper `audit_free_module()` depends on it. unreferenced object 0xffff888153b707f0 (size 16): comm "modprobe", pid 1319, jiffies 4295110033 (age 1083.016s) hex dump (first 16 bytes): 62 69 6e 66 6d 74 5f 6d 69 73 63 00 6b 6b 6b a5 binfmt_misc.kkk. backtrace: [<ffffffffa07dbf9b>] kstrdup+0x2b/0x50 [<ffffffffa04b0a9d>] __audit_log_kern_module+0x4d/0xf0 [<ffffffffa03b6664>] load_module+0x9d4/0x2e10 [<ffffffffa03b8f44>] __do_sys_finit_module+0x114/0x1b0 [<ffffffffa1f47124>] do_syscall_64+0x34/0x80 [<ffffffffa200007e>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Cc: stable@vger.kernel.org Fixes: 12c5e81d3fd0 ("audit: prepare audit_context for use in calling contexts beyond syscalls") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-06-15Merge tag 'hardening-v5.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Correctly handle vm_map areas in hardened usercopy (Matthew Wilcox) - Adjust CFI RCU usage to avoid boot splats with cpuidle (Sami Tolvanen) * tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usercopy: Make usercopy resilient against ridiculously large copies usercopy: Cast pointer to an integer once usercopy: Handle vm_map_ram() areas cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle
2022-06-15drm/msm/gem: Drop early returns in close/purge vmaRob Clark
Keep the warn, but drop the early return. If we do manage to hit this sort of issue, skipping the cleanup just makes things worse (dangling drm_mm_nodes when the msm_gem_vma is freed, etc). Whereas the worst that happens if we tear down a mapping the GPU is accessing is that we get GPU iova faults, but otherwise the world keeps spinning. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Reported-by: Steev Klimaszewski <steev@kali.org> Patchwork: https://patchwork.freedesktop.org/patch/489115/ Link: https://lore.kernel.org/r/20220610172055.2337977-1-robdclark@gmail.com
2022-06-15drm/msm/gem: Separate object and vma unpinRob Clark
Previously the BO_PINNED state in the submit was tracking two related but different things: (1) that the buffer object was pinned, and (2) that the vma (mapping within a set of pagetables) was pinned. But with fenced vma unpin (needed so that userspace couldn't race with retire path for releasing a vma) these two were decoupled. The fact that the BO_PINNED flag was already cleared meant that we leaked the bo pin count which should have been dropped when the submit was retired. So split this state into BO_OBJ_PINNED and BO_VMA_PINNED, so they can be dropped independently. Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin") Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/487559/ Link: https://lore.kernel.org/r/20220527172341.2151005-1-robdclark@gmail.com
2022-06-15printk: Wait for the global console lock when the system is going downPetr Mladek
There are reports that the console kthreads block the global console lock when the system is going down, for example, reboot, panic. First part of the solution was to block kthreads in these problematic system states so they stopped handling newly added messages. Second part of the solution is to wait when for the kthreads when they are actively printing. It solves the problem when a message was printed before the system entered the problematic state and the kthreads managed to step in. A busy waiting has to be used because panic() can be called in any context and in an unknown state of the scheduler. There must be a timeout because the kthread might get stuck or sleeping and never release the lock. The timeout 10s is an arbitrary value inspired by the softlockup timeout. Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1 Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com Signed-off-by: Petr Mladek <pmladek@suse.com> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20220615162805.27962-3-pmladek@suse.com
2022-06-15printk: Block console kthreads when direct printing will be requiredPetr Mladek
There are known situations when the console kthreads are not reliable or does not work in principle, for example, early boot, panic, shutdown. For these situations there is the direct (legacy) mode when printk() tries to get console_lock() and flush the messages directly. It works very well during the early boot when the console kthreads are not available at all. It gets more complicated in the other situations when console kthreads might be actively printing and block console_trylock() in printk(). The same problem is in the legacy code as well. Any console_lock() owner could block console_trylock() in printk(). It is solved by a trick that the current console_lock() owner is responsible for printing all pending messages. It is actually the reason why there is the risk of softlockups and why the console kthreads were introduced. The console kthreads use the same approach. They are responsible for printing the messages by definition. So that they handle the messages anytime when they are awake and see new ones. The global console_lock is available when there is nothing to do. It should work well when the problematic context is correctly detected and printk() switches to the direct mode. But it seems that it is not enough in practice. There are reports that the messages are not printed during panic() or shutdown() even though printk() tries to use the direct mode here. The problem seems to be that console kthreads become active in these situation as well. They steel the job before other CPUs are stopped. Then they are stopped in the middle of the job and block the global console_lock. First part of the solution is to block console kthreads when the system is in a problematic state and requires the direct printk() mode. Link: https://lore.kernel.org/r/20220610205038.GA3050413@paulmck-ThinkPad-P17-Gen-1 Link: https://lore.kernel.org/r/CAMdYzYpF4FNTBPZsEFeWRuEwSies36QM_As8osPWZSr2q-viEA@mail.gmail.com Suggested-by: John Ogness <john.ogness@linutronix.de> Tested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220615162805.27962-2-pmladek@suse.com
2022-06-15Merge tag 'tpmdd-next-v5.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "Two fixes for this merge window" * tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST build certs/blacklist_hashes.c: fix const confusion in certs blacklist
2022-06-15NFSv4: Add FMODE_CAN_ODIRECT after successful open of a NFS4.x fileDave Wysochanski
Commit a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag") added the FMODE_CAN_ODIRECT flag for NFSv3 but neglected to add it for NFSv4.x. This causes direct io on NFSv4.x to fail open with EINVAL: mount -o vers=4.2 127.0.0.1:/export /mnt/nfs4 dd if=/dev/zero of=/mnt/nfs4/file.bin bs=128k count=1 oflag=direct dd: failed to open '/mnt/nfs4/file.bin': Invalid argument dd of=/dev/null if=/mnt/nfs4/file.bin bs=128k count=1 iflag=direct dd: failed to open '/mnt/dir1/file1.bin': Invalid argument Fixes: a2ad63daa88b ("VFS: add FMODE_CAN_ODIRECT file flag") Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-06-15certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST buildMasahiro Yamada
Commit addf466389d9 ("certs: Check that builtin blacklist hashes are valid") was applied 8 months after the submission. In the meantime, the base code had been removed by commit b8c96a6b466c ("certs: simplify $(srctree)/ handling and remove config_filename macro"). Fix the Makefile. Create a local copy of $(CONFIG_SYSTEM_BLACKLIST_HASH_LIST). It is included from certs/blacklist_hashes.c and also works as a timestamp. Send error messages from check-blacklist-hashes.awk to stderr instead of stdout. Fixes: addf466389d9 ("certs: Check that builtin blacklist hashes are valid") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-06-15certs/blacklist_hashes.c: fix const confusion in certs blacklistMasahiro Yamada
This file fails to compile as follows: CC certs/blacklist_hashes.o certs/blacklist_hashes.c:4:1: error: ignoring attribute ‘section (".init.data")’ because it conflicts with previous ‘section (".init.rodata")’ [-Werror=attributes] 4 | const char __initdata *const blacklist_hashes[] = { | ^~~~~ In file included from certs/blacklist_hashes.c:2: certs/blacklist.h:5:38: note: previous declaration here 5 | extern const char __initconst *const blacklist_hashes[]; | ^~~~~~~~~~~~~~~~ Apply the same fix as commit 2be04df5668d ("certs/blacklist_nohashes.c: fix const confusion in certs blacklist"). Fixes: 734114f8782f ("KEYS: Add a system blacklist keyring") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-06-15x86/Hyper-V: Add SEV negotiate protocol support in Isolation VMTianyu Lan
Hyper-V Isolation VM current code uses sev_es_ghcb_hv_call() to read/write MSR via GHCB page and depends on the sev code. This may cause regression when sev code changes interface design. The latest SEV-ES code requires to negotiate GHCB version before reading/writing MSR via GHCB page and sev_es_ghcb_hv_call() doesn't work for Hyper-V Isolation VM. Add Hyper-V ghcb related implementation to decouple SEV and Hyper-V code. Negotiate GHCB version in the hyperv_init() and use the version to communicate with Hyper-V in the ghcb hv call function. Fixes: 2ea29c5abbc2 ("x86/sev: Save the negotiated GHCB version") Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-06-15x86/tdx: Clarify RIP adjustments in #VE handlerKirill A. Shutemov
After successful #VE handling, tdx_handle_virt_exception() has to move RIP to the next instruction. The handler needs to know the length of the instruction. If the #VE happened due to instruction execution, the GET_VEINFO TDX module call provides info on the instruction in R10, including its length. For #VE due to EPT violation, the info in R10 is not populand and the kernel must decode the instruction manually to find out its length. Restructure the code to make it explicit that the instruction length depends on the type of #VE. Make individual #VE handlers return the instruction length on success or -errno on failure. [ dhansen: fix up changelog and comments ] Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220614120135.14812-3-kirill.shutemov@linux.intel.com
2022-06-15Merge branch 'md-fixes' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19 Pull MD fixes from Song. * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid5-ppl: Fix argument order in bio_alloc_bioset() Revert "md: don't unregister sync_thread with reconfig_mutex held"
2022-06-15x86/tdx: Fix early #VE handlingKirill A. Shutemov
tdx_early_handle_ve() does not increment RIP after successfully handling the exception. That leads to infinite loop of exceptions. Move RIP when exceptions are successfully handled. [ dhansen: make problem statement more clear ] Fixes: 32e72854fa5f ("x86/tdx: Port I/O: Add early boot support") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lkml.kernel.org/r/20220614120135.14812-2-kirill.shutemov@linux.intel.com
2022-06-15md/raid5-ppl: Fix argument order in bio_alloc_bioset()Logan Gunthorpe
bio_alloc_bioset() takes a block device, number of vectors, the OP flags, the GFP mask and the bio set. However when the prototype was changed, the callisite in ppl_do_flush() had the OP flags and the GFP flags reversed. This introduced some sparse error: drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3 (different base types) drivers/md/raid5-ppl.c:632:57: expected unsigned int opf drivers/md/raid5-ppl.c:632:57: got restricted gfp_t [usertype] drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4 (different base types) drivers/md/raid5-ppl.c:633:61: expected restricted gfp_t [usertype] gfp_mask drivers/md/raid5-ppl.c:633:61: got unsigned long long The sparse error introduction may not have been reported correctly by 0day due to other work that was cleaning up other sparse errors in this area. Fixes: 609be1066731 ("block: pass a block_device and opf to bio_alloc_bioset") Cc: stable@vger.kernel.org # 5.18+ Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org>
2022-06-15bpf: Limit maximum modifier chain length in btf_check_type_tagsKumar Kartikeya Dwivedi
On processing a module BTF of module built for an older kernel, we might sometimes find that some type points to itself forming a loop. If such a type is a modifier, btf_check_type_tags's while loop following modifier chain will be caught in an infinite loop. Fix this by defining a maximum chain length and bailing out if we spin any longer than that. Fixes: eb596b090558 ("bpf: Ensure type tags precede modifiers in BTF") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220615042151.2266537-1-memxor@gmail.com
2022-06-15Revert "md: don't unregister sync_thread with reconfig_mutex held"Guoqing Jiang
The 07reshape5intr test is broke because of below path. md_reap_sync_thread -> mddev_unlock -> md_unregister_thread(&mddev->sync_thread) And md_check_recovery is triggered by, mddev_unlock -> md_wakeup_thread(mddev->thread) then mddev->reshape_position is set to MaxSector in raid5_finish_reshape since MD_RECOVERY_INTR is cleared in md_check_recovery, which means feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's reshape_position can't be updated accordingly. Fixes: 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held") Reported-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org>
2022-06-15mmc: mediatek: wait dma stop bit reset to 0Mengqi Zhang
MediaTek IP requires that after dma stop, it need to wait this dma stop bit auto-reset to 0. When bus is in high loading state, it will take a while for the dma stop complete. If there is no waiting operation here, when program runs to clear fifo and reset, bus will hang. In addition, there should be no return in msdc_data_xfer_next() if there is data need be transferred, because no matter what error occurs here, it should continue to excute to the following mmc_request_done. Otherwise the core layer may wait complete forever. Signed-off-by: Mengqi Zhang <mengqi.zhang@mediatek.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220609112239.18911-1-mengqi.zhang@mediatek.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-06-15Merge tag 'fs.fixes.v5.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull vfs idmapping fix from Christian Brauner: "This fixes an issue where we fail to change the group of a file when the caller owns the file and is a member of the group to change to. This is only relevant on idmapped mounts. There's a detailed description in the commit message and regression tests have been added to xfstests" * tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: fs: account for group membership
2022-06-15dm: fix race in dm_start_io_acctBenjamin Marzinski
After commit 82f6cdcc3676c ("dm: switch dm_io booleans over to proper flags") dm_start_io_acct stopped atomically checking and setting was_accounted, which turned into the DM_IO_ACCOUNTED flag. This opened the possibility for a race where IO accounting is started twice for duplicate bios. To remove the race, check the flag while holding the io->lock. Fixes: 82f6cdcc3676c ("dm: switch dm_io booleans over to proper flags") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-06-15Merge tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme into block-5.19Jens Axboe
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.19 - quirks, quirks, quirks to work around buggy consumer grade devices (Keith Bush, Ning Wang, Stefan Reiter, Rasheed Hsueh) - better kernel messages for devices that need quirking (Keith Bush) - make a kernel message more useful (Thomas Weißschuh)" * tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme: nvme-pci: disable write zeros support on UMIC and Samsung SSDs nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs nvme-pci: sk hynix p31 has bogus namespace ids nvme-pci: smi has bogus namespace ids nvme-pci: phison e12 has bogus namespace ids nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50 nvme-pci: add trouble shooting steps for timeouts nvme: add bug report info for global duplicate id nvme: add device name to warning in uuid_show()
2022-06-15arm64: ftrace: remove redundant labelMark Rutland
Since commit: c4a0ebf87cebbfa2 ("arm64/ftrace: Make function graph use ftrace directly") The 'ftrace_common_return' label has been unused. Remove it. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Will Deacon <will@kernel.org> Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220614080944.1349146-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>