summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-01riscv: evaluate put_user() arg before enabling user accessBen Dooks
The <asm/uaccess.h> header has a problem with put_user(a, ptr) if the 'a' is not a simple variable, such as a function. This can lead to the compiler producing code as so: 1: enable_user_access() 2: evaluate 'a' into register 'r' 3: put 'r' to 'ptr' 4: disable_user_acess() The issue is that 'a' is now being evaluated with the user memory protections disabled. So we try and force the evaulation by assigning 'x' to __val at the start, and hoping the compiler barriers in enable_user_access() do the job of ordering step 2 before step 1. This has shown up in a bug where 'a' sleeps and thus schedules out and loses the SR_SUM flag. This isn't sufficient to fully fix, but should reduce the window of opportunity. The first instance of this we found is in scheudle_tail() where the code does: $ less -N kernel/sched/core.c 4263 if (current->set_child_tid) 4264 put_user(task_pid_vnr(current), current->set_child_tid); Here, the task_pid_vnr(current) is called within the block that has enabled the user memory access. This can be made worse with KASAN which makes task_pid_vnr() a rather large call with plenty of opportunity to sleep. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com Suggested-by: Arnd Bergman <arnd@arndb.de> -- Changes since v1: - fixed formatting and updated the patch description with more info Changes since v2: - fixed commenting on __put_user() (schwab@linux-m68k.org) Change since v3: - fixed RFC in patch title. Should be ready to merge. Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-01riscv: Drop const annotation for spKefeng Wang
The const annotation should not be used for 'sp', or it will become read only and lead to bad stack output. Fixes: dec822771b01 ("riscv: stacktrace: Move register keyword to beginning of declaration") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-04-01scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUsCan Guo
In __ufshcd_issue_tm_cmd(), it is not correct to use hba->nutrs + req->tag as the Task Tag in a TMR UPIU. Directly use req->tag as the Task Tag. Fixes: e293313262d3 ("scsi: ufs: Fix broken task management command implementation") Link: https://lore.kernel.org/r/1617262750-4864-3-git-send-email-cang@codeaurora.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01scsi: ufs: core: Fix task management request completion timeoutCan Guo
ufshcd_tmc_handler() calls blk_mq_tagset_busy_iter(fn = ufshcd_compl_tm()), but since blk_mq_tagset_busy_iter() only iterates over all reserved tags and requests which are not in IDLE state, ufshcd_compl_tm() never gets a chance to run. Thus, TMR always ends up with completion timeout. Fix it by calling blk_mq_start_request() in __ufshcd_issue_tm_cmd(). Link: https://lore.kernel.org/r/1617262750-4864-2-git-send-email-cang@codeaurora.org Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01scsi: hpsa: Add an assert to prevent __packed reintroductionSergei Trofimovich
Link: https://lore.kernel.org/r/20210330071958.3788214-3-slyfox@gentoo.org Fixes: f749d8b7a989 ("scsi: hpsa: Correct dev cmds outstanding for retried cmds") CC: linux-ia64@vger.kernel.org CC: storagedev@microchip.com CC: linux-scsi@vger.kernel.org CC: Joe Szczypek <jszczype@redhat.com> CC: Scott Benesh <scott.benesh@microchip.com> CC: Scott Teel <scott.teel@microchip.com> CC: Tomas Henzl <thenzl@redhat.com> CC: "Martin K. Petersen" <martin.petersen@oracle.com> CC: Don Brace <don.brace@microchip.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Suggested-by: Don Brace <don.brace@microchip.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01scsi: hpsa: Fix boot on ia64 (atomic_t alignment)Sergei Trofimovich
Boot failure was observed on an HP rx3600 ia64 machine with RAID bus controller: Hewlett-Packard Company Smart Array P600: kernel unaligned access to 0xe000000105dd8b95, ip=0xa000000100b87551 kernel unaligned access to 0xe000000105dd8e95, ip=0xa000000100b87551 hpsa 0000:14:01.0: Controller reports max supported commands of 0 Using 16 instead. Ensure that firmware is up to date. swapper/0[1]: error during unaligned kernel access The unaligned access comes from 'struct CommandList' that happens to be packed. Commit f749d8b7a989 ("scsi: hpsa: Correct dev cmds outstanding for retried cmds") introduced unexpected padding and unaligned atomic_t from natural alignment to something else. This change removes packing annotation from a struct not intended to be sent to controller as is. This restores natural `atomic_t` alignment. The change was tested on the same rx3600 machine. Link: https://lore.kernel.org/r/20210330071958.3788214-2-slyfox@gentoo.org Fixes: f749d8b7a989 ("scsi: hpsa: Correct dev cmds outstanding for retried cmds") CC: linux-ia64@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: storagedev@microchip.com CC: linux-scsi@vger.kernel.org CC: Joe Szczypek <jszczype@redhat.com> CC: Scott Benesh <scott.benesh@microchip.com> CC: Scott Teel <scott.teel@microchip.com> CC: Tomas Henzl <thenzl@redhat.com> CC: "Martin K. Petersen" <martin.petersen@oracle.com> CC: Don Brace <don.brace@microchip.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Suggested-by: Don Brace <don.brace@microchip.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01scsi: hpsa: Use __packed on individual structs, not header-wideSergei Trofimovich
The hpsa driver uses data structures which contain a combination of driver internals and commands sent directly to the hardware. To manage alignment for the hardware portions the driver used #pragma pack(1). Commit f749d8b7a989 ("scsi: hpsa: Correct dev cmds outstanding for retried cmds") switched an existing variable from int to bool. Due to the pragma an atomic_t in the same data structure ended up being misaligned and broke boot on ia64. Add __packed to every struct and union in the header file. Subsequent commits will address the actual atomic_t misalignment regression. The commit is a no-op at least on ia64: $ diff -u <(objdump -d -r old.o) <(objdump -d -r new.o) Link: https://lore.kernel.org/r/20210330071958.3788214-1-slyfox@gentoo.org Fixes: f749d8b7a989 ("scsi: hpsa: Correct dev cmds outstanding for retried cmds") CC: linux-ia64@vger.kernel.org CC: storagedev@microchip.com CC: linux-scsi@vger.kernel.org CC: Joe Szczypek <jszczype@redhat.com> CC: Scott Benesh <scott.benesh@microchip.com> CC: Scott Teel <scott.teel@microchip.com> CC: Tomas Henzl <thenzl@redhat.com> CC: "Martin K. Petersen" <martin.petersen@oracle.com> CC: Don Brace <don.brace@microchip.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Suggested-by: Don Brace <don.brace@microchip.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01Merge tag 'lto-v5.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull LTO fix from Kees Cook: "It seems that there is a bug in ld.bfd when doing module section merging. As explicit merging is only needed for LTO, the work-around is to only do it under LTO, leaving the original section layout choices alone under normal builds: - Only perform explicit module section merges under LTO (Sean Christopherson)" * tag 'lto-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kbuild: lto: Merge module sections if and only if CONFIG_LTO_CLANG is enabled
2021-04-01Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-04-01 This series contains updates to i40e driver only. Arkadiusz fixes warnings for inconsistent indentation. Magnus fixes an issue on xsk receive where single packets over time are batched rather than received immediately. Eryk corrects warnings and reporting of veb-stats. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01Merge branch 'mptcp-deadlock'David S. Miller
Paolo Abeni says: ==================== mptcp: mptcp: fix deadlock in mptcp{,6}_release syzkaller has reported a few deadlock triggered by mptcp{,6}_release. These patches address the issue in the easy way - blocking the relevant, multicast related, sockopt options on MPTCP sockets. Note that later on net-next we are going to revert patch 1/2, as a part of a larger MPTCP sockopt implementation refactor ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01mptcp: revert "mptcp: provide subflow aware release function"Paolo Abeni
This change reverts commit ad98dd37051e ("mptcp: provide subflow aware release function"). The latter introduced a deadlock spotted by syzkaller and is not needed anymore after the previous commit. Fixes: ad98dd37051e ("mptcp: provide subflow aware release function") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01mptcp: forbit mcast-related sockopt on MPTCP socketsPaolo Abeni
Unrolling mcast state at msk dismantel time is bug prone, as syzkaller reported: ====================================================== WARNING: possible circular locking dependency detected 5.11.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor905/8822 is trying to acquire lock: ffffffff8d678fe8 (rtnl_mutex){+.+.}-{3:3}, at: ipv6_sock_mc_close+0xd7/0x110 net/ipv6/mcast.c:323 but task is already holding lock: ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1600 [inline] ffff888024390120 (sk_lock-AF_INET6){+.+.}-{0:0}, at: mptcp6_release+0x57/0x130 net/mptcp/protocol.c:3507 which lock already depends on the new lock. Instead we can simply forbit any mcast-related setsockopt Fixes: 717e79c867ca5 ("mptcp: Add setsockopt()/getsockopt() socket operations") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01drivers: net: fix memory leak in peak_usb_create_devPavel Skripkin
syzbot reported memory leak in peak_usb. The problem was in case of failure after calling ->dev_init()[2] in peak_usb_create_dev()[1]. The data allocated int dev_init() wasn't freed, so simple ->dev_free() call fix this problem. backtrace: [<0000000079d6542a>] kmalloc include/linux/slab.h:552 [inline] [<0000000079d6542a>] kzalloc include/linux/slab.h:682 [inline] [<0000000079d6542a>] pcan_usb_fd_init+0x156/0x210 drivers/net/can/usb/peak_usb/pcan_usb_fd.c:868 [2] [<00000000c09f9057>] peak_usb_create_dev drivers/net/can/usb/peak_usb/pcan_usb_core.c:851 [inline] [1] [<00000000c09f9057>] peak_usb_probe+0x389/0x490 drivers/net/can/usb/peak_usb/pcan_usb_core.c:949 Reported-by: syzbot+91adee8d9ebb9193d22d@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01net: udp: Add support for getsockopt(..., ..., UDP_GRO, ..., ...);Norman Maurer
Support for UDP_GRO was added in the past but the implementation for getsockopt was missed which did lead to an error when we tried to retrieve the setting for UDP_GRO. This patch adds the missing switch case for UDP_GRO Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Norman Maurer <norman_maurer@apple.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01drivers: net: fix memory leak in atusb_probePavel Skripkin
syzbot reported memory leak in atusb_probe()[1]. The problem was in atusb_alloc_urbs(). Since urb is anchored, we need to release the reference to correctly free the urb backtrace: [<ffffffff82ba0466>] kmalloc include/linux/slab.h:559 [inline] [<ffffffff82ba0466>] usb_alloc_urb+0x66/0xe0 drivers/usb/core/urb.c:74 [<ffffffff82ad3888>] atusb_alloc_urbs drivers/net/ieee802154/atusb.c:362 [inline][2] [<ffffffff82ad3888>] atusb_probe+0x158/0x820 drivers/net/ieee802154/atusb.c:1038 [1] Reported-by: syzbot+28a246747e0a465127f3@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01Merge branch 'AF_XDP Socket Creation Fixes'Alexei Starovoitov
Ciara Loftus says: ==================== This series fixes some issues around socket creation for AF_XDP. Patch 1 fixes a potential NULL pointer dereference in xsk_socket__create_shared. Patch 2 ensures that the umem passed to xsk_socket__create(_shared) remains unchanged in event of failure. Patch 3 makes it possible for xsk_socket__create(_shared) to succeed even if the rx and tx XDP rings have already been set up by introducing a new fields to struct xsk_umem which represent the ring setup status for the xsk which shares the fd with the umem. v3->v4: * Reduced nesting in xsk_put_ctx as suggested by Alexei. * Use bools instead of a u8 and flags to represent the ring setup status as suggested by Björn. v2->v3: * Instead of ignoring the return values of the setsockopt calls, introduce a new flag to determine whether or not to call them based on the ring setup status as suggested by Alexei. v1->v2: * Simplified restoring the _save pointers as suggested by Magnus. * Fixed the condition which determines whether to unmap umem rings when socket create fails. ==================== Acked-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-04-01libbpf: Only create rx and tx XDP rings when necessaryCiara Loftus
Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com
2021-04-01libbpf: Restore umem state after socket create failureCiara Loftus
If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain in tact even in the event of failure. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com
2021-04-01libbpf: Ensure umem pointer is non-NULL before dereferencingCiara Loftus
Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com
2021-04-01bpf: program: Refuse non-O_RDWR flags in BPF_OBJ_GETLorenz Bauer
As for bpf_link, refuse creating a non-O_RDWR fd. Since program fds currently don't allow modifications this is a precaution, not a straight up bug fix. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210326160501.46234-2-lmb@cloudflare.com
2021-04-01bpf: link: Refuse non-O_RDWR flags in BPF_OBJ_GETLorenz Bauer
Invoking BPF_OBJ_GET on a pinned bpf_link checks the path access permissions based on file_flags, but the returned fd ignores flags. This means that any user can acquire a "read-write" fd for a pinned link with mode 0664 by invoking BPF_OBJ_GET with BPF_F_RDONLY in file_flags. The fd can be used to invoke BPF_LINK_DETACH, etc. Fix this by refusing non-O_RDWR flags in BPF_OBJ_GET. This works because OBJ_GET by default returns a read write mapping and libbpf doesn't expose a way to override this behaviour for programs and links. Fixes: 70ed506c3bbc ("bpf: Introduce pinnable bpf_link abstraction") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210326160501.46234-1-lmb@cloudflare.com
2021-04-01drm/msm: Fix removal of valid error case when checking speed_binJohn Stultz
Commit 7bf168c8fe8c ("drm/msm: Fix speed-bin support not to access outside valid memory"), reworked the nvmem reading of "speed_bin", but in doing so dropped handling of the -ENOENT case which was previously documented as "fine". That change resulted in the db845c board display to fail to start, with the following error: adreno 5000000.gpu: [drm:a6xx_gpu_init] *ERROR* failed to read speed-bin (-2). Some OPPs may not be supported by hardware Thus, this patch simply re-adds the ENOENT handling so the lack of the speed_bin entry isn't fatal for display, and gets things working on db845c. Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Eric Anholt <eric@anholt.net> Cc: Douglas Anderson <dianders@chromium.org> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: YongQin Liu <yongqin.liu@linaro.org> Reported-by: YongQin Liu <yongqin.liu@linaro.org> Fixes: 7bf168c8fe8c ("drm/msm: Fix speed-bin support not to access outside valid memory") Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Message-Id: <20210330013408.2532048-1-john.stultz@linaro.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-04-01drm/msm: Set drvdata to NULL when msm_drm_init() failsStephen Boyd
We should set the platform device's driver data to NULL here so that code doesn't assume the struct drm_device pointer is valid when it could have been destroyed. The lifetime of this pointer is managed by a kref but when msm_drm_init() fails we call drm_dev_put() on the pointer which will free the pointer's memory. This driver uses the component model, so there's sort of two "probes" in this file, one for the platform device i.e. msm_pdev_probe() and one for the component i.e. msm_drm_bind(). The msm_drm_bind() code is using the platform device's driver data to store struct drm_device so the two functions are intertwined. This relationship becomes a problem for msm_pdev_shutdown() when it tests the NULL-ness of the pointer to see if it should call drm_atomic_helper_shutdown(). The NULL test is a proxy check for if the pointer has been freed by kref_put(). If the drm_device has been destroyed, then we shouldn't call the shutdown helper, and we know that is the case if msm_drm_init() failed, therefore set the driver data to NULL so that this pointer liveness is tracked properly. Fixes: 9d5cbf5fe46e ("drm/msm: add shutdown support for display platform_driver") Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Fabio Estevam <festevam@gmail.com> Cc: Krishna Manikandan <mkrishn@codeaurora.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Message-Id: <20210325212822.3663144-1-swboyd@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-04-01kbuild: lto: Merge module sections if and only if CONFIG_LTO_CLANG is enabledSean Christopherson
Merge module sections only when using Clang LTO. With ld.bfd, merging sections does not appear to update the symbol tables for the module, e.g. 'readelf -s' shows the value that a symbol would have had, if sections were not merged. ld.lld does not show this problem. The stale symbol table breaks gdb's function disassembler, and presumably other things, e.g. gdb -batch -ex "file arch/x86/kvm/kvm.ko" -ex "disassemble kvm_init" reads the wrong bytes and dumps garbage. Fixes: dd2776222abb ("kbuild: lto: merge module sections") Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210322234438.502582-1-seanjc@google.com
2021-04-01bpf: Refcount task stack in bpf_get_task_stackDave Marchevsky
On x86 the struct pt_regs * grabbed by task_pt_regs() points to an offset of task->stack. The pt_regs are later dereferenced in __bpf_get_stack (e.g. by user_mode() check). This can cause a fault if the task in question exits while bpf_get_task_stack is executing, as warned by task_stack_page's comment: * When accessing the stack of a non-current task that might exit, use * try_get_task_stack() instead. task_stack_page will return a pointer * that could get freed out from under you. Taking the comment's advice and using try_get_task_stack() and put_task_stack() to hold task->stack refcount, or bail early if it's already 0. Incrementing stack_refcount will ensure the task's stack sticks around while we're using its data. I noticed this bug while testing a bpf task iter similar to bpf_iter_task_stack in selftests, except mine grabbed user stack, and getting intermittent crashes, which resulted in dumps like: BUG: unable to handle page fault for address: 0000000000003fe0 \#PF: supervisor read access in kernel mode \#PF: error_code(0x0000) - not-present page RIP: 0010:__bpf_get_stack+0xd0/0x230 <snip...> Call Trace: bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec bpf_iter_run_prog+0x24/0x81 __task_seq_show+0x58/0x80 bpf_seq_read+0xf7/0x3d0 vfs_read+0x91/0x140 ksys_read+0x59/0xd0 do_syscall_64+0x48/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210401000747.3648767-1-davemarchevsky@fb.com
2021-04-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "It's a bit larger than I (and probably you) would like by the time we get to -rc6, but perhaps not entirely unexpected since the changes in the last merge window were larger than usual. x86: - Fixes for missing TLB flushes with TDP MMU - Fixes for race conditions in nested SVM - Fixes for lockdep splat with Xen emulation - Fix for kvmclock underflow - Fix srcdir != builddir builds - Other small cleanups ARM: - Fix GICv3 MMIO compatibility probing - Prevent guests from using the ARMv8.4 self-hosted tracing extension" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0) KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update() KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken KVM: x86: reduce pvclock_gtod_sync_lock critical sections KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit KVM: SVM: load control fields from VMCB12 before checking them KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap KVM: make: Fix out-of-source module builds selftests: kvm: make hardware_disable_test less verbose KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE KVM: x86: remove unused declaration of kvm_write_tsc() KVM: clean up the unused argument tools/kvm_stat: Add restart delay KVM: arm64: Fix CPU interface MMIO compatibility detection KVM: arm64: Disable guest access to trace filter controls KVM: arm64: Hide system instruction access to Trace registers
2021-04-01Merge tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Things have settled down in time for Easter, a random smattering of small fixes across a few drivers. I'm guessing though there might be some i915 and misc fixes out there I haven't gotten yet, but since today is a public holiday here, I'm sending this early so I can have the day off, I'll see if more requests come in and decide what to do with them later. amdgpu: - Polaris idle power fix - VM fix - Vangogh S3 fix - Fixes for non-4K page sizes amdkfd: - dqm fence memory corruption fix tegra: - lockdep warning fix - runtine PM reference fix - display controller fix - PLL Fix imx: - memory leak in error path fix - LDB driver channel registration fix - oob array warning in LDB driver exynos - unused header file removal" * tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: check alignment on CPU page for bo map drm/amdgpu: Set a suitable dev_info.gart_page_size drm/amdgpu/vangogh: don't check for dpm in is_dpm_running when in suspend drm/amdkfd: dqm fence memory corruption drm/tegra: sor: Grab runtime PM reference across reset drm/tegra: dc: Restore coupling of display controllers gpu: host1x: Use different lock classes for each client drm/tegra: dc: Don't set PLL clock to 0Hz drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings() drm/amd/pm: no need to force MCLK to highest when no display connected drm/exynos/decon5433: Remove the unused include statements drm/imx: imx-ldb: fix out of bounds array access warning drm/imx: imx-ldb: Register LDB channel1 when it is the only channel to be used drm/imx: fix memory leak when fails to init
2021-04-02Merge tag 'imx-drm-fixes-2021-04-01' of ↵Dave Airlie
git://git.pengutronix.de/git/pza/linux into drm-fixes drm/imx: imx-drm-core and imx-ldb fixes Fix a memory leak in an error path during DRM device initialization, fix the LDB driver to register channel 1 even if channel 0 is unused, and fix an out of bounds array access warning in the LDB driver. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210401092235.GA13586@pengutronix.de
2021-04-02Merge tag 'drm/tegra/for-5.12-rc6' of ↵Dave Airlie
ssh://git.freedesktop.org/git/tegra/linux into drm-fixes drm/tegra: Fixes for v5.12-rc6 This contains a couple of fixes for various issues such as lockdep warnings, runtime PM references, coupled display controllers and misconfigured PLLs. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thierry Reding <thierry.reding@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210401163352.3348296-1-thierry.reding@gmail.com
2021-04-01RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session filesMd Haris Iqbal
KASAN detected the following BUG: BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10 Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012 Call Trace: <IRQ> dump_stack+0x96/0xe0 print_address_description.constprop.4+0x1f/0x300 ? irq_work_claim+0x2e/0x50 __kasan_report.cold.8+0x78/0x92 ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] kasan_report+0x10/0x20 rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client] rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client] ? lockdep_hardirqs_on+0x1a8/0x290 ? process_io_rsp+0xb0/0xb0 [rtrs_client] ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib] ? add_interrupt_randomness+0x1a2/0x340 __ib_process_cq+0x97/0x100 [ib_core] ib_poll_handler+0x41/0xb0 [ib_core] irq_poll_softirq+0xe0/0x260 __do_softirq+0x127/0x672 irq_exit+0xd1/0xe0 do_IRQ+0xa3/0x1d0 common_interrupt+0xf/0xf </IRQ> RIP: 0010:cpuidle_enter_state+0xea/0x780 Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00 RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414 RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0 ? lockdep_hardirqs_on+0x1a8/0x290 ? cpuidle_enter_state+0xe5/0x780 cpuidle_enter+0x3c/0x60 do_idle+0x2fb/0x390 ? arch_cpu_idle_exit+0x40/0x40 ? schedule+0x94/0x120 cpu_startup_entry+0x19/0x1b start_kernel+0x5da/0x61b ? thread_stack_cache_init+0x6/0x6 ? load_ucode_amd_bsp+0x6f/0xc4 ? init_amd_microcode+0xa6/0xa6 ? x86_family+0x5/0x20 ? load_ucode_bsp+0x182/0x1fd secondary_startup_64+0xa4/0xb0 Allocated by task 5730: save_stack+0x19/0x80 __kasan_kmalloc.constprop.9+0xc1/0xd0 kmem_cache_alloc_trace+0x15b/0x350 alloc_sess+0xf4/0x570 [rtrs_client] rtrs_clt_open+0x3b4/0x780 [rtrs_client] find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client] rnbd_clt_map_device+0xd7/0xf50 [rnbd_client] rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client] kernfs_fop_write+0x141/0x240 vfs_write+0xf3/0x280 ksys_write+0xba/0x150 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5822: save_stack+0x19/0x80 __kasan_slab_free+0x125/0x170 kfree+0xe7/0x3f0 kobject_put+0xd3/0x240 rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client] rtrs_clt_close+0x3c/0x80 [rtrs_client] close_rtrs+0x45/0x80 [rnbd_client] rnbd_client_exit+0x10f/0x2bd [rnbd_client] __x64_sys_delete_module+0x27b/0x340 do_syscall_64+0x68/0x270 entry_SYSCALL_64_after_hwframe+0x49/0xbe When rtrs_clt_close is triggered, it iterates over all the present rtrs_clt_sess and triggers close on them. However, the call to rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This is incorrect since during the initialization phase we allocate rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it. If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it may so happen that an inflight IO completion would trigger the function rtrs_clt_rdma_done, which would lead to the above UAF case. Hence close the rtrs_clt_con connections first, and then trigger the destruction of session files. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01tracing: Fix stack trace event sizeSteven Rostedt (VMware)
Commit cbc3b92ce037 fixed an issue to modify the macros of the stack trace event so that user space could parse it properly. Originally the stack trace format to user space showed that the called stack was a dynamic array. But it is not actually a dynamic array, in the way that other dynamic event arrays worked, and this broke user space parsing for it. The update was to make the array look to have 8 entries in it. Helper functions were added to make it parse it correctly, as the stack was dynamic, but was determined by the size of the event stored. Although this fixed user space on how it read the event, it changed the internal structure used for the stack trace event. It changed the array size from [0] to [8] (added 8 entries). This increased the size of the stack trace event by 8 words. The size reserved on the ring buffer was the size of the stack trace event plus the number of stack entries found in the stack trace. That commit caused the amount to be 8 more than what was needed because it did not expect the caller field to have any size. This produced 8 entries of garbage (and reading random data) from the stack trace event: <idle>-0 [002] d... 1976396.837549: <stack trace> => trace_event_raw_event_sched_switch => __traceiter_sched_switch => __schedule => schedule_idle => do_idle => cpu_startup_entry => secondary_startup_64_no_verify => 0xc8c5e150ffff93de => 0xffff93de => 0 => 0 => 0xc8c5e17800000000 => 0x1f30affff93de => 0x00000004 => 0x200000000 Instead, subtract the size of the caller field from the size of the event to make sure that only the amount needed to store the stack trace is reserved. Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/ Cc: stable@vger.kernel.org Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly") Reported-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01Merge tag 'sound-5.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Things seem calming down, only usual device-specific fixes for HD-audio and USB-audio at this time" * tag 'sound-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8 ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks ALSA: hda: Re-add dropped snd_poewr_change_state() calls ALSA: usb-audio: Apply sample rate quirk to Logitech Connect ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO
2021-04-01Merge tag 'tomoyo-pr-20210401' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1Linus Torvalds
Pull tomory fix from Tetsuo Handa: "An update on 'tomoyo: recognize kernel threads correctly' from Jens Axboe to not special case PF_IO_WORKER for PF_KTHREAD" * tag 'tomoyo-pr-20210401' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1: tomoyo: don't special case PF_IO_WORKER for PF_KTHREAD
2021-04-01Merge tag 'xarray-5.12' of git://git.infradead.org/users/willy/xarrayLinus Torvalds
Pull XArray fixes from Matthew Wilcox: "My apologies for the lateness of this. I had a bug reported in the test suite, and when I started working on it, I realised I had two fixes sitting in the xarray tree since last November. Anyway, everything here is fixes, apart from adding xa_limit_16b. The test suite passes. Summary: - Fix a bug when splitting to a non-zero order - Documentation fix - Add a predefined 16-bit allocation limit - Various test suite fixes" * tag 'xarray-5.12' of git://git.infradead.org/users/willy/xarray: idr test suite: Improve reporting from idr_find_test_1 idr test suite: Create anchor before launching throbber idr test suite: Take RCU read lock in idr_find_test_1 radix tree test suite: Register the main thread with the RCU library radix tree test suite: Fix compilation XArray: Add xa_limit_16b XArray: Fix splitting to non-zero orders XArray: Fix split documentation
2021-04-01i40e: Fix display statistics for veb_tcEryk Rybak
If veb-stats was enabled, the ethtool stats triggered a warning due to invalid size: 'unexpected stat size for veb.tc_%u_tx_packets'. This was due to an incorrect structure definition for the statistics. Structures and functions have been improved in line with requirements for the presentation of statistics, in particular for the functions: 'i40e_add_ethtool_stats' and 'i40e_add_stat_strings'. Fixes: 1510ae0be2a4 ("i40e: convert VEB TC stats to use an i40e_stats array") Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01i40e: fix receiving of single packets in xsk zero-copy modeMagnus Karlsson
Fix so that single packets are received immediately instead of in batches of 8. If you sent 1 pps to a system, you received 8 packets every 8 seconds instead of 1 packet every second. The problem behind this was that the work_done reporting from the Tx part of the driver was broken. The work_done reporting in i40e controls not only the reporting back to the napi logic but also the setting of the interrupt throttling logic. When Tx or Rx reports that it has more to do, interrupts are throttled or coalesced and when they both report that they are done, interrupts are armed right away. If the wrong work_done value is returned, the logic will start to throttle interrupts in a situation where it should have just enabled them. This leads to the undesired batching behavior seen in user-space. Fix this by returning the correct boolean value from the Tx xsk zero-copy path. Return true if there is nothing to do or if we got fewer packets to process than we asked for. Return false if we got as many packets as the budget since there might be more packets we can process. Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01i40e: Fix inconsistent indentingArkadiusz Kubalewski
Fixed new static analysis findings: "warn: inconsistent indenting" - introduced lately, reported with lkp and smatch build. Fixes: 4b208eaa8078 ("i40e: Add init and default config of software based DCB") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01io_uring: fix EIOCBQUEUED iter revertPavel Begunkov
iov_iter_revert() is done in completion handlers that happensf before read/write returns -EIOCBQUEUED, no need to repeat reverting afterwards. Moreover, even though it may appear being just a no-op, it's actually races with 1) user forging a new iovec of a different size 2) reissue, that is done via io-wq continues completely asynchronously. Fixes: 3e6a0d3c7571c ("io_uring: fix -EAGAIN retry with IOPOLL") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01io_uring/io-wq: protect against sprintf overflowPavel Begunkov
task_pid may be large enough to not fit into the left space of TASK_COMM_LEN-sized buffers and overflow in sprintf. We not so care about uniqueness, so replace it with safer snprintf(). Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1702c6145d7e1c46fbc382f28334c02e1a3d3994.1617267273.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01io_uring: don't mark S_ISBLK async work as unboundedJens Axboe
S_ISBLK is marked as unbounded work for async preparation, because it doesn't match S_ISREG. That is incorrect, as any read/write to a block device is also a bounded operation. Fix it up and ensure that S_ISBLK isn't marked unbounded. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01ARM: mvebu: avoid clang -Wtautological-constant warningArnd Bergmann
Clang warns about the comparison when using a 32-bit phys_addr_t: drivers/bus/mvebu-mbus.c:621:17: error: result of comparison of constant 4294967296 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (reg_start >= 0x100000000ULL) Add a cast to shut up the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20210323131952.2835509-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01ARM: pxa: mainstone: avoid -Woverride-init warningArnd Bergmann
The default initializer at the start of the array causes a warning when building with W=1: In file included from arch/arm/mach-pxa/mainstone.c:47: arch/arm/mach-pxa/mainstone.h:124:33: error: initialized field overwritten [-Werror=override-init] 124 | #define MAINSTONE_IRQ(x) (MAINSTONE_NR_IRQS + (x)) | ^ arch/arm/mach-pxa/mainstone.h:133:33: note: in expansion of macro 'MAINSTONE_IRQ' 133 | #define MAINSTONE_S0_CD_IRQ MAINSTONE_IRQ(9) | ^~~~~~~~~~~~~ arch/arm/mach-pxa/mainstone.c:506:15: note: in expansion of macro 'MAINSTONE_S0_CD_IRQ' 506 | [5] = MAINSTONE_S0_CD_IRQ, | ^~~~~~~~~~~~~~~~~~~ Rework the initializer to list each element explicitly and only once. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210323130849.2362001-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01ARM: omap1: fix building with clang IASArnd Bergmann
The clang integrated assembler fails to build one file with a complex asm instruction: arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: error: invalid instruction, any one of the following would fix this: mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit ^ arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: note: instruction requires: armv6t2 mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit ^ arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: note: instruction requires: thumb2 mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit ^ The problem is that 'NR_IRQS_LEGACY' is not defined here. Apparently gas does not care because we first add and then subtract this number, leading to the immediate value to be the same regardless of the specific definition of NR_IRQS_LEGACY. Neither the way that 'gas' just silently builds this file, nor the way that clang IAS makes nonsensical suggestions for how to fix it is great. Fortunately there is an easy fix, which is to #include the header that contains the definition. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Lindgren <tony@atomide.com> Link: https://lore.kernel.org/r/20210308153430.2530616-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01soc/fsl: qbman: fix conflicting alignment attributesArnd Bergmann
When building with W=1, gcc points out that the __packed attribute on struct qm_eqcr_entry conflicts with the 8-byte alignment attribute on struct qm_fd inside it: drivers/soc/fsl/qbman/qman.c:189:1: error: alignment 1 of 'struct qm_eqcr_entry' is less than 8 [-Werror=packed-not-aligned] I assume that the alignment attribute is the correct one, and that qm_eqcr_entry cannot actually be unaligned in memory, so add the same alignment on the outer struct. Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210323131530.2619900-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01ARM: keystone: fix integer overflow warningArnd Bergmann
clang warns about an impossible condition when building with 32-bit phys_addr_t: arch/arm/mach-keystone/keystone.c:79:16: error: result of comparison of constant 51539607551 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] mem_end > KEYSTONE_HIGH_PHYS_END) { ~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~ arch/arm/mach-keystone/keystone.c:78:16: error: result of comparison of constant 34359738368 with expression of type 'phys_addr_t' (aka 'unsigned int') is always true [-Werror,-Wtautological-constant-out-of-range-compare] if (mem_start < KEYSTONE_HIGH_PHYS_START || ~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~ Change the temporary variable to a fixed-size u64 to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Link: https://lore.kernel.org/r/20210323131814.2751750-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-02powerpc/vdso: Make sure vdso_wrapper.o is rebuilt everytime vdso.so is rebuiltChristophe Leroy
Commit bce74491c300 ("powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o") moved vdso32_wrapper.o and vdso64_wrapper.o out of arch/powerpc/kernel/vdso[32/64]/ and removed the dependencies in the Makefile. This leads to the wrappers not being re-build hence the kernel embedding the old vdso library. Add back missing dependencies to ensure vdso32_wrapper.o and vdso64_wrapper.o are rebuilt when vdso32.so.dbg and vdso64.so.dbg are changed. Fixes: bce74491c300 ("powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8bb015bc98c51d8ced581415b7e3d157e18da7c9.1617181918.git.christophe.leroy@csgroup.eu
2021-04-02powerpc/signal32: Fix Oops on sigreturn with unmapped VDSOChristophe Leroy
PPC32 encounters a KUAP fault when trying to handle a signal with VDSO unmapped. Kernel attempted to read user page (7fc07ec0) - exploit attempt? (uid: 0) BUG: Unable to handle kernel data access on read at 0x7fc07ec0 Faulting instruction address: 0xc00111d4 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=16K PREEMPT CMPC885 CPU: 0 PID: 353 Comm: sigreturn_vdso Not tainted 5.12.0-rc4-s3k-dev-01553-gb30c310ea220 #4814 NIP: c00111d4 LR: c0005a28 CTR: 00000000 REGS: cadb3dd0 TRAP: 0300 Not tainted (5.12.0-rc4-s3k-dev-01553-gb30c310ea220) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48000884 XER: 20000000 DAR: 7fc07ec0 DSISR: 88000000 GPR00: c0007788 cadb3e90 c28d4a40 7fc07ec0 7fc07ed0 000004e0 7fc07ce0 00000000 GPR08: 00000001 00000001 7fc07ec0 00000000 28000282 1001b828 100a0920 00000000 GPR16: 100cac0c 100b0000 105c43a4 105c5685 100d0000 100d0000 100d0000 100b2e9e GPR24: ffffffff 105c43c8 00000000 7fc07ec8 cadb3f40 cadb3ec8 c28d4a40 00000000 NIP [c00111d4] flush_icache_range+0x90/0xb4 LR [c0005a28] handle_signal32+0x1bc/0x1c4 Call Trace: [cadb3e90] [100d0000] 0x100d0000 (unreliable) [cadb3ec0] [c0007788] do_notify_resume+0x260/0x314 [cadb3f20] [c000c764] syscall_exit_prepare+0x120/0x184 [cadb3f30] [c00100b4] ret_from_syscall+0xc/0x28 --- interrupt: c00 at 0xfe807f8 NIP: 0fe807f8 LR: 10001060 CTR: c0139378 REGS: cadb3f40 TRAP: 0c00 Not tainted (5.12.0-rc4-s3k-dev-01553-gb30c310ea220) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000482 XER: 20000000 GPR00: 00000025 7fc081c0 77bb1690 00000000 0000000a 28000482 00000001 0ff03a38 GPR08: 0000d032 00006de5 c28d4a40 00000009 88000482 1001b828 100a0920 00000000 GPR16: 100cac0c 100b0000 105c43a4 105c5685 100d0000 100d0000 100d0000 100b2e9e GPR24: ffffffff 105c43c8 00000000 77ba7628 10002398 10010000 10002124 00024000 NIP [0fe807f8] 0xfe807f8 LR [10001060] 0x10001060 --- interrupt: c00 Instruction dump: 38630010 7c001fac 38630010 4200fff0 7c0004ac 4c00012c 4e800020 7c001fac 2c0a0000 38630010 4082ffcc 4bffffe4 <7c00186c> 2c070000 39430010 4082ff8c ---[ end trace 3973fb72b049cb06 ]--- This is because flush_icache_range() is called on user addresses. The same problem was detected some time ago on PPC64. It was fixed by enabling KUAP in commit 59bee45b9712 ("powerpc/mm: Fix missing KUAP disable in flush_coherent_icache()"). PPC32 doesn't use flush_coherent_icache() and fallbacks on clean_dcache_range() and invalidate_icache_range(). We could fix it similarly by enabling user access in those functions, but this is overkill for just flushing two instructions. The two instructions are 8 bytes aligned, so a single dcbst/icbi is enough to flush them. Do like __patch_instruction() and inline a dcbst followed by an icbi just after the write of the instructions, while user access is still allowed. The isync is not required because rfi will be used to return to user. icbi() is handled as a read so read-write user access is needed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bde9154e5351a5ac7bca3d59cdb5a5e8edacbb79.1617199569.git.christophe.leroy@csgroup.eu
2021-04-02powerpc/ptrace: Don't return error when getting/setting FP regs without ↵Christophe Leroy
CONFIG_PPC_FPU_REGS An #ifdef CONFIG_PPC_FPU_REGS is missing in arch_ptrace() leading to the following Oops because [REGSET_FPR] entry is not initialised in native_regsets[]. [ 41.917608] BUG: Unable to handle kernel instruction fetch [ 41.922849] Faulting instruction address: 0xff8fd228 [ 41.927760] Oops: Kernel access of bad area, sig: 11 [#1] [ 41.933089] BE PAGE_SIZE=4K PREEMPT CMPC885 [ 41.940753] Modules linked in: [ 41.943768] CPU: 0 PID: 366 Comm: gdb Not tainted 5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty #4835 [ 41.952800] NIP: ff8fd228 LR: c004d9e0 CTR: ff8fd228 [ 41.957790] REGS: caae9df0 TRAP: 0400 Not tainted (5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty) [ 41.966741] MSR: 40009032 <EE,ME,IR,DR,RI> CR: 82004248 XER: 20000000 [ 41.973540] [ 41.973540] GPR00: c004d9b4 caae9eb0 c1b64f60 c1b64520 c0713cd4 caae9eb8 c1bacdfc 00000004 [ 41.973540] GPR08: 00000200 ff8fd228 c1bac700 00001032 28004242 1061aaf4 00000001 106d64a0 [ 41.973540] GPR16: 00000000 00000000 7fa0a774 10610000 7fa0aef9 00000000 10610000 7fa0a538 [ 41.973540] GPR24: 7fa0a580 7fa0a570 c1bacc00 c1b64520 c1bacc00 caae9ee8 00000108 c0713cd4 [ 42.009685] NIP [ff8fd228] 0xff8fd228 [ 42.013300] LR [c004d9e0] __regset_get+0x100/0x124 [ 42.018036] Call Trace: [ 42.020443] [caae9eb0] [c004d9b4] __regset_get+0xd4/0x124 (unreliable) [ 42.026899] [caae9ee0] [c004da94] copy_regset_to_user+0x5c/0xb0 [ 42.032751] [caae9f10] [c002f640] sys_ptrace+0xe4/0x588 [ 42.037915] [caae9f30] [c0011010] ret_from_syscall+0x0/0x28 [ 42.043422] --- interrupt: c00 at 0xfd1f8e4 [ 42.047553] NIP: 0fd1f8e4 LR: 1004a688 CTR: 00000000 [ 42.052544] REGS: caae9f40 TRAP: 0c00 Not tainted (5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty) [ 42.061494] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48004442 XER: 00000000 [ 42.068551] [ 42.068551] GPR00: 0000001a 7fa0a040 77dad7e0 0000000e 00000170 00000000 7fa0a078 00000004 [ 42.068551] GPR08: 00000000 108deb88 108dda40 106d6010 44004442 1061aaf4 00000001 106d64a0 [ 42.068551] GPR16: 00000000 00000000 7fa0a774 10610000 7fa0aef9 00000000 10610000 7fa0a538 [ 42.068551] GPR24: 7fa0a580 7fa0a570 1078fe00 1078fd70 1078fd70 00000170 0fdd3244 0000000d [ 42.104696] NIP [0fd1f8e4] 0xfd1f8e4 [ 42.108225] LR [1004a688] 0x1004a688 [ 42.111753] --- interrupt: c00 [ 42.114768] Instruction dump: [ 42.117698] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX [ 42.125443] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX [ 42.133195] ---[ end trace d35616f22ab2100c ]--- Adding the missing #ifdef is not good because gdb doesn't like getting an error when getting registers. Instead, make ptrace return 0s when CONFIG_PPC_FPU_REGS is not set. Fixes: b6254ced4da6 ("powerpc/signal: Don't manage floating point regs when no FPU") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9121a44a2d50ba1af18d8aa5ada06c9a3bea8afd.1617200085.git.christophe.leroy@csgroup.eu
2021-04-01null_blk: fix command timeout completion handlingDamien Le Moal
Memory backed or zoned null block devices may generate actual request timeout errors due to the submission path being blocked on memory allocation or zone locking. Unlike fake timeouts or injected timeouts, the request submission path will call blk_mq_complete_request() or blk_mq_end_request() for these real timeout errors, causing a double completion and use after free situation as the block layer timeout handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in blk_mq_check_expired(). This problem often triggers a NULL pointer dereference such as: BUG: kernel NULL pointer dereference, address: 0000000000000050 RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20 ... Call Trace: dd_finish_request+0x56/0x80 blk_mq_free_request+0x37/0x130 null_handle_cmd+0xbf/0x250 [null_blk] ? null_queue_rq+0x67/0xd0 [null_blk] blk_mq_dispatch_rq_list+0x122/0x850 __blk_mq_do_dispatch_sched+0xbb/0x2c0 __blk_mq_sched_dispatch_requests+0x13d/0x190 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x49/0x90 process_one_work+0x26c/0x580 worker_thread+0x55/0x3c0 ? process_one_work+0x580/0x580 kthread+0x134/0x150 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x1f/0x30 This problem very often triggers when running the full btrfs xfstests on a memory-backed zoned null block device in a VM with limited amount of memory. Avoid this by executing blk_mq_complete_request() in null_timeout_rq() only for commands that are marked for a fake timeout completion using the fake_timeout boolean in struct null_cmd. For timeout errors injected through debugfs, the timeout handler will execute blk_mq_complete_request()i as before. This is safe as the submission path does not execute complete requests in this case. In null_timeout_rq(), also make sure to set the command error field to BLK_STS_TIMEOUT and to propagate this error through to the request completion. Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01idr test suite: Improve reporting from idr_find_test_1Matthew Wilcox (Oracle)
Instead of just reporting an assertion failure, report enough information that we can start diagnosing exactly went wrong. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>