summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-08-13Merge tag 'kvmarm-fixes-5.14-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.14, take #2 - Plug race between enabling MTE and creating vcpus - Fix off-by-one bug when checking whether an address range is RAM
2021-08-13KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PFSean Christopherson
Use vmx_need_pf_intercept() when determining if L0 wants to handle a #PF in L2 or if the VM-Exit should be forwarded to L1. The current logic fails to account for the case where #PF is intercepted to handle guest.MAXPHYADDR < host.MAXPHYADDR and ends up reflecting all #PFs into L1. At best, L1 will complain and inject the #PF back into L2. At worst, L1 will eat the unexpected fault and cause L2 to hang on infinite page faults. Note, while the bug was technically introduced by the commit that added support for the MAXPHYADDR madness, the shame is all on commit a0c134347baf ("KVM: VMX: introduce vmx_need_pf_intercept"). Fixes: 1dbf5d68af6f ("KVM: VMX: Add guest physical address check in EPT violation and misconfig") Cc: stable@vger.kernel.org Cc: Peter Shier <pshier@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210812045615.3167686-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13kvm: vmx: Sync all matching EPTPs when injecting nested EPT faultJunaid Shahid
When a nested EPT violation/misconfig is injected into the guest, the shadow EPT PTEs associated with that address need to be synced. This is done by kvm_inject_emulated_page_fault() before it calls nested_ept_inject_page_fault(). However, that will only sync the shadow EPT PTE associated with the current L1 EPTP. Since the ASID is based on EP4TA rather than the full EPTP, so syncing the current EPTP is not enough. The SPTEs associated with any other L1 EPTPs in the prev_roots cache with the same EP4TA also need to be synced. Signed-off-by: Junaid Shahid <junaids@google.com> Message-Id: <20210806222229.1645356-1-junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13Merge branch 'kvm-vmx-secctl' into kvm-masterPaolo Bonzini
Merge common topic branch for 5.14-rc6 and 5.15 merge window.
2021-08-13KVM: x86: remove dead initializationPaolo Bonzini
hv_vcpu is initialized again a dozen lines below, and at this point vcpu->arch.hyperv is not valid. Remove the initializer. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-13KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernelsSean Christopherson
Remove an ancient restriction that disallowed exposing EFER.NX to the guest if EFER.NX=0 on the host, even if NX is fully supported by the CPU. The motivation of the check, added by commit 2cc51560aed0 ("KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit"), was to rule out the case of host.EFER.NX=0 and guest.EFER.NX=1 so that KVM could run the guest with the host's EFER.NX and thus avoid context switching EFER if the only divergence was the NX bit. Fast forward to today, and KVM has long since stopped running the guest with the host's EFER.NX. Not only does KVM context switch EFER if host.EFER.NX=1 && guest.EFER.NX=0, KVM also forces host.EFER.NX=0 && guest.EFER.NX=1 when using shadow paging (to emulate SMEP). Furthermore, the entire motivation for the restriction was made obsolete over a decade ago when Intel added dedicated host and guest EFER fields in the VMCS (Nehalem timeframe), which reduced the overhead of context switching EFER from 400+ cycles (2 * WRMSR + 1 * RDMSR) to a mere ~2 cycles. In practice, the removed restriction only affects non-PAE 32-bit kernels, as EFER.NX is set during boot if NX is supported and the kernel will use PAE paging (32-bit or 64-bit), regardless of whether or not the kernel will actually use NX itself (mark PTEs non-executable). Alternatively and/or complementarily, startup_32_smp() in head_32.S could be modified to set EFER.NX=1 regardless of paging mode, thus eliminating the scenario where NX is supported but not enabled. However, that runs the risk of breaking non-KVM non-PAE kernels (though the risk is very, very low as there are no known EFER.NX errata), and also eliminates an easy-to-use mechanism for stressing KVM's handling of guest vs. host EFER across nested virtualization transitions. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210805183804.1221554-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-12Merge tag 'net-5.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from netfilter, bpf, can and ieee802154. The size of this is pretty normal, but we got more fixes for 5.14 changes this week than last week. Nothing major but the trend is the opposite of what we like. We'll see how the next week goes.. Current release - regressions: - r8169: fix ASPM-related link-up regressions - bridge: fix flags interpretation for extern learn fdb entries - phy: micrel: fix link detection on ksz87xx switch - Revert "tipc: Return the correct errno code" - ptp: fix possible memory leak caused by invalid cast Current release - new code bugs: - bpf: add missing bpf_read_[un]lock_trace() for syscall program - bpf: fix potentially incorrect results with bpf_get_local_storage() - page_pool: mask the page->signature before the checking, avoid dma mapping leaks - netfilter: nfnetlink_hook: 5 fixes to information in netlink dumps - bnxt_en: fix firmware interface issues with PTP - mlx5: Bridge, fix ageing time Previous releases - regressions: - linkwatch: fix failure to restore device state across suspend/resume - bareudp: fix invalid read beyond skb's linear data Previous releases - always broken: - bpf: fix integer overflow involving bucket_size - ppp: fix issues when desired interface name is specified via netlink - wwan: mhi_wwan_ctrl: fix possible deadlock - dsa: microchip: ksz8795: fix number of VLAN related bugs - dsa: drivers: fix broken backpressure in .port_fdb_dump - dsa: qca: ar9331: make proper initial port defaults Misc: - bpf: add lockdown check for probe_write_user helper - netfilter: conntrack: remove offload_pickup sysctl before 5.14 is out - netfilter: conntrack: collect all entries in one cycle, heuristically slow down garbage collection scans on idle systems to prevent frequent wake ups" * tag 'net-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) vsock/virtio: avoid potential deadlock when vsock device remove wwan: core: Avoid returning NULL from wwan_create_dev() net: dsa: sja1105: unregister the MDIO buses during teardown Revert "tipc: Return the correct errno code" net: mscc: Fix non-GPL export of regmap APIs net: igmp: increase size of mr_ifc_count MAINTAINERS: switch to my OMP email for Renesas Ethernet drivers tcp_bbr: fix u32 wrap bug in round logic if bbr_init() called after 2B packets net: pcs: xpcs: fix error handling on failed to allocate memory net: linkwatch: fix failure to restore device state across suspend/resume net: bridge: fix memleak in br_add_if() net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted by drivers towards the bridge net: bridge: fix flags interpretation for extern learn fdb entries net: dsa: sja1105: fix broken backpressure in .port_fdb_dump net: dsa: lantiq: fix broken backpressure in .port_fdb_dump net: dsa: lan9303: fix broken backpressure in .port_fdb_dump net: dsa: hellcreek: fix broken backpressure in .port_fdb_dump bpf, core: Fix kernel-doc notation net: igmp: fix data-race in igmp_ifc_timer_expire() net: Fix memory leak in ieee802154_raw_deliver ...
2021-08-12Merge tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "A patch to avoid a soft lockup in ceph_check_delayed_caps() from Luis and a reference handling fix from Jeff that should address some memory corruption reports in the snaprealm area. Both marked for stable" * tag 'ceph-for-5.14-rc6' of git://github.com/ceph/ceph-client: ceph: take snap_empty_lock atomically with snaprealm refcount change ceph: reduce contention in ceph_check_delayed_caps()
2021-08-12Merge tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Another week, another set of pretty regular fixes, nothing really stands out too much. amdgpu: - Yellow carp update - RAS EEPROM fixes - BACO/BOCO fixes - Fix a memory leak in an error path - Freesync fix - VCN harvesting fix - Display fixes i915: - GVT fix for Windows VM hang. - Display fix of 12 BPC bits for display 12 and newer. - Don't try to access some media register for fused off domains. - Fix kerneldoc build warnings. mediatek: - Fix dpi bridge bug. - Fix cursor plane no update. meson: - Fix colors when booting with HDR" * tag 'drm-fixes-2021-08-13' of git://anongit.freedesktop.org/drm/drm: drm/doc/rfc: drop lmem uapi section drm/i915: Only access SFC_DONE when media domain is not fused off drm/i915/display: Fix the 12 BPC bits for PIPE_MISC reg drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work drm/amd/display: Remove invalid assert for ODM + MPC case drm/amd/pm: bug fix for the runtime pm BACO drm/amdgpu: handle VCN instances when harvesting (v2) drm/meson: fix colour distortion from HDR set during vendor u-boot drm/i915/gvt: Fix cached atomics setting for Windows VM drm/amdgpu: Add preferred mode in modeset when freesync video mode's enabled. drm/amd/pm: Fix a memory leak in an error handling path in 'vangogh_tables_init()' drm/amdgpu: don't enable baco on boco platforms in runpm drm/amdgpu: set RAS EEPROM address from VBIOS drm/amd/pm: update smu v13.0.1 firmware header drm/mediatek: Fix cursor plane no update drm/mediatek: mtk-dpi: Set out_fmt from config if not the last bridge drm/mediatek: dpi: Fix NULL dereference in mtk_dpi_bridge_atomic_check
2021-08-12ARM: ixp4xx: fix building both pci driversArnd Bergmann
When both the old and the new PCI drivers are enabled in the same kernel, there are a couple of namespace conflicts that cause a build failure: drivers/pci/controller/pci-ixp4xx.c:38: error: "IXP4XX_PCI_CSR" redefined [-Werror] 38 | #define IXP4XX_PCI_CSR 0x1c | In file included from arch/arm/mach-ixp4xx/include/mach/hardware.h:23, from arch/arm/mach-ixp4xx/include/mach/io.h:15, from arch/arm/include/asm/io.h:198, from include/linux/io.h:13, from drivers/pci/controller/pci-ixp4xx.c:20: arch/arm/mach-ixp4xx/include/mach/ixp4xx-regs.h:221: note: this is the location of the previous definition 221 | #define IXP4XX_PCI_CSR(x) ((volatile u32 *)(IXP4XX_PCI_CFG_BASE_VIRT+(x))) | drivers/pci/controller/pci-ixp4xx.c:148:12: error: 'ixp4xx_pci_read' redeclared as different kind of symbol 148 | static int ixp4xx_pci_read(struct ixp4xx_pci *p, u32 addr, u32 cmd, u32 *data) | ^~~~~~~~~~~~~~~ Rename both the ixp4xx_pci_read/ixp4xx_pci_write functions and the IXP4XX_PCI_CSR macro. In each case, I went with the version that has fewer callers to keep the change small. Fixes: f7821b493458 ("PCI: ixp4xx: Add a new driver for IXP4xx") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: soc@kernel.org Link: https://lore.kernel.org/r/20210721151546.2325937-1-arnd@kernel.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-12ARM: configs: Update the nhk8815_defconfigLinus Walleij
The platform lost the framebuffer due to a commit solving a circular dependency in v5.14-rc1, so add it back in by explicitly selecting the framebuffer. Also fix up some Kconfig options that got dropped or moved around while we're at it. Fixes: f611b1e7624c ("drm: Avoid circular dependencies for CONFIG_FB") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210807225518.3607126-1-linus.walleij@linaro.org' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-08-13Merge tag 'drm-misc-fixes-2021-08-12' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * meson: Fix colors when booting with HDR Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YRTb+qUuBYWjJDVg@linux-uq9g.fritz.box
2021-08-12hrtimer: Unbreak hrtimer_force_reprogram()Thomas Gleixner
Since the recent consoliation of reprogramming functions, hrtimer_force_reprogram() is affected by a check whether the new expiry time is past the current expiry time. This breaks the NOHZ logic as that relies on the fact that the tick hrtimer is moved into the future. That means cpu_base->expires_next becomes stale and subsequent reprogramming attempts fail as well until the situation is cleaned up by an hrtimer interrupts. For some yet unknown reason this leads to a complete stall, so for now partially revert the offending commit to a known working state. The root cause for the stall is still investigated and will be fixed in a subsequent commit. Fixes: b14bca97c9f5 ("hrtimer: Consolidate reprogramming code") Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mike Galbraith <efault@gmx.de> Link: https://lore.kernel.org/r/8735recskh.ffs@tglx
2021-08-12hrtimer: Use raw_cpu_ptr() in clock_was_set()Thomas Gleixner
clock_was_set() can be invoked from preemptible context. Use raw_cpu_ptr() to check whether high resolution mode is active or not. It does not matter whether the task migrates after acquiring the pointer. Fixes: e71a4153b7c2 ("hrtimer: Force clock_was_set() handling for the HIGHRES=n, NOHZ=y case") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/875ywacsmb.ffs@tglx
2021-08-13Merge tag 'drm-intel-fixes-2021-08-12' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - GVT fix for Windows VM hang. - Display fix of 12 BPC bits for display 12 and newer. - Don't try to access some media register for fused off domains. - Fix kerneldoc build warnings. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YRU/hnQ1sNr+j37x@intel.com
2021-08-12Merge tag 'ieee802154-for-davem-2021-08-12' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== ieee802154 for net 2021-08-12 Mostly fixes coming from bot reports. Dongliang Mu tackled some syzkaller reports in hwsim again and Takeshi Misawa a memory leak in ieee802154 raw. * tag 'ieee802154-for-davem-2021-08-12' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan: net: Fix memory leak in ieee802154_raw_deliver ieee802154: hwsim: fix GPF in hwsim_new_edge_nl ieee802154: hwsim: fix GPF in hwsim_set_edge_lqi ==================== Link: https://lore.kernel.org/r/20210812183912.1663996-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12x86/resctrl: Fix default monitoring groups reportingBabu Moger
Creating a new sub monitoring group in the root /sys/fs/resctrl leads to getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes on the entire filesystem. Steps to reproduce: 1. mount -t resctrl resctrl /sys/fs/resctrl/ 2. cd /sys/fs/resctrl/ 3. cat mon_data/mon_L3_00/mbm_total_bytes 23189832 4. Create sub monitor group: mkdir mon_groups/test1 5. cat mon_data/mon_L3_00/mbm_total_bytes Unavailable When a new monitoring group is created, a new RMID is assigned to the new group. But the RMID is not active yet. When the events are read on the new RMID, it is expected to report the status as "Unavailable". When the user reads the events on the default monitoring group with multiple subgroups, the events on all subgroups are consolidated together. Currently, if any of the RMID reads report as "Unavailable", then everything will be reported as "Unavailable". Fix the issue by discarding the "Unavailable" reads and reporting all the successful RMID reads. This is not a problem on Intel systems as Intel reports 0 on Inactive RMIDs. Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data") Reported-by: Paweł Szulik <pawel.szulik@intel.com> Signed-off-by: Babu Moger <Babu.Moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311 Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoger-ubuntu
2021-08-12vsock/virtio: avoid potential deadlock when vsock device removeLongpeng(Mike)
There's a potential deadlock case when remove the vsock device or process the RESET event: vsock_for_each_connected_socket: spin_lock_bh(&vsock_table_lock) ----------- (1) ... virtio_vsock_reset_sock: lock_sock(sk) --------------------- (2) ... spin_unlock_bh(&vsock_table_lock) lock_sock() may do initiative schedule when the 'sk' is owned by other thread at the same time, we would receivce a warning message that "scheduling while atomic". Even worse, if the next task (selected by the scheduler) try to release a 'sk', it need to request vsock_table_lock and the deadlock occur, cause the system into softlockup state. Call trace: queued_spin_lock_slowpath vsock_remove_bound vsock_remove_sock virtio_transport_release __vsock_release vsock_release __sock_release sock_close __fput ____fput So we should not require sk_lock in this case, just like the behavior in vhost_vsock or vmci. Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko") Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/r/20210812053056.1699-1-longpeng2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event nameSteven Rostedt (VMware)
The following commands: # echo 'read_max u64 size;' > synthetic_events # echo 'hist:keys=common_pid:count=count:onmax($count).trace(read_max,count)' > events/syscalls/sys_enter_read/trigger Causes: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 4 PID: 1763 Comm: bash Not tainted 5.14.0-rc2-test+ #155 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strcmp+0xc/0x20 Code: 75 f7 31 c0 0f b6 0c 06 88 0c 02 48 83 c0 01 84 c9 75 f1 4c 89 c0 c3 0f 1f 80 00 00 00 00 31 c0 eb 08 48 83 c0 01 84 d2 74 0f <0f> b6 14 07 3a 14 06 74 ef 19 c0 83 c8 01 c3 31 c0 c3 66 90 48 89 RSP: 0018:ffffb5fdc0963ca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffffb3a4e040 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9714c0d0b640 RDI: 0000000000000000 RBP: 0000000000000000 R08: 00000022986b7cde R09: ffffffffb3a4dff8 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9714c50603c8 R13: 0000000000000000 R14: ffff97143fdf9e48 R15: ffff9714c01a2210 FS: 00007f1fa6785740(0000) GS:ffff9714da400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000002d863004 CR4: 00000000001706e0 Call Trace: __find_event_file+0x4e/0x80 action_create+0x6b7/0xeb0 ? kstrdup+0x44/0x60 event_hist_trigger_func+0x1a07/0x2130 trigger_process_regex+0xbd/0x110 event_trigger_write+0x71/0xd0 vfs_write+0xe9/0x310 ksys_write+0x68/0xe0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f1fa6879e87 The problem was the "trace(read_max,count)" where the "count" should be "$count" as "onmax()" only handles variables (although it really should be able to figure out that "count" is a field of sys_enter_read). But there's a path that does not find the variable and ends up passing a NULL for the event, which ends up getting passed to "strcmp()". Add a check for NULL to return and error on the command with: # cat error_log hist:syscalls:sys_enter_read: error: Couldn't create or find variable Command: hist:keys=common_pid:count=count:onmax($count).trace(read_max,count) ^ Link: https://lkml.kernel.org/r/20210808003011.4037f8d0@oasis.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: stable@vger.kernel.org Fixes: 50450603ec9cb tracing: Add 'onmax' hist trigger action support Reviewed-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12init: Suppress wrong warning for bootconfig cmdline parameterMasami Hiramatsu
Since the 'bootconfig' command line parameter is handled before parsing the command line, it doesn't use early_param(). But in this case, kernel shows a wrong warning message about it. [ 0.013714] Kernel command line: ro console=ttyS0 bootconfig console=tty0 [ 0.013741] Unknown command line parameters: bootconfig To suppress this message, add a dummy handler for 'bootconfig'. Link: https://lkml.kernel.org/r/162812945097.77369.1849780946468010448.stgit@devnote2 Fixes: 86d1919a4fb0 ("init: print out unknown kernel parameters") Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12tracing: define needed config DYNAMIC_FTRACE_WITH_ARGSLukas Bulwahn
Commit 2860cd8a2353 ("livepatch: Use the default ftrace_ops instead of REGS when ARGS is available") intends to enable config LIVEPATCH when ftrace with ARGS is available. However, the chain of configs to enable LIVEPATCH is incomplete, as HAVE_DYNAMIC_FTRACE_WITH_ARGS is available, but the definition of DYNAMIC_FTRACE_WITH_ARGS, combining DYNAMIC_FTRACE and HAVE_DYNAMIC_FTRACE_WITH_ARGS, needed to enable LIVEPATCH, is missing in the commit. Fortunately, ./scripts/checkkconfigsymbols.py detects this and warns: DYNAMIC_FTRACE_WITH_ARGS Referencing files: kernel/livepatch/Kconfig So, define the config DYNAMIC_FTRACE_WITH_ARGS analogously to the already existing similar configs, DYNAMIC_FTRACE_WITH_REGS and DYNAMIC_FTRACE_WITH_DIRECT_CALLS, in ./kernel/trace/Kconfig to connect the chain of configs. Link: https://lore.kernel.org/kernel-janitors/CAKXUXMwT2zS9fgyQHKUUiqo8ynZBdx2UEUu1WnV_q0OCmknqhw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20210806195027.16808-1-lukas.bulwahn@gmail.com Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Miroslav Benes <mbenes@suse.cz> Cc: stable@vger.kernel.org Fixes: 2860cd8a2353 ("livepatch: Use the default ftrace_ops instead of REGS when ARGS is available") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12trace/osnoise: Print a stop tracing messageDaniel Bristot de Oliveira
When using osnoise/timerlat with stop tracing, sometimes it is not clear in which CPU the stop condition was hit, mainly when using some extra events. Print a message informing in which CPU the trace stopped, like in the example below: <idle>-0 [006] d.h. 2932.676616: #1672599 context irq timer_latency 34689 ns <idle>-0 [006] dNh. 2932.676618: irq_noise: local_timer:236 start 2932.676615639 duration 2391 ns <idle>-0 [006] dNh. 2932.676620: irq_noise: virtio0-output.0:47 start 2932.676620180 duration 86 ns <idle>-0 [003] d.h. 2932.676621: #1673374 context irq timer_latency 1200 ns <idle>-0 [006] d... 2932.676623: thread_noise: swapper/6:0 start 2932.676615964 duration 4339 ns <idle>-0 [003] dNh. 2932.676623: irq_noise: local_timer:236 start 2932.676620597 duration 1881 ns <idle>-0 [006] d... 2932.676623: sched_switch: prev_comm=swapper/6 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=timerlat/6 next_pid=852 next_prio=4 timerlat/6-852 [006] .... 2932.676623: #1672599 context thread timer_latency 41931 ns <idle>-0 [003] d... 2932.676623: thread_noise: swapper/3:0 start 2932.676620854 duration 880 ns <idle>-0 [003] d... 2932.676624: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=timerlat/3 next_pid=849 next_prio=4 timerlat/6-852 [006] .... 2932.676624: timerlat_main: stop tracing hit on cpu 6 timerlat/3-849 [003] .... 2932.676624: #1673374 context thread timer_latency 4310 ns Link: https://lkml.kernel.org/r/b30a0d7542adba019185f44ee648e60e14923b11.1626598844.git.bristot@kernel.org Cc: Tom Zanussi <zanussi@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12trace/timerlat: Add a header with PREEMPT_RT additional fieldsDaniel Bristot de Oliveira
Some extra flags are printed to the trace header when using the PREEMPT_RT config. The extra flags are: need-resched-lazy, preempt-lazy-depth, and migrate-disable. Without printing these fields, the timerlat specific fields are shifted by three positions, for example: # tracer: timerlat # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # || / # |||| ACTIVATION # TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY # | | | |||| | | | | <idle>-0 [000] d..h... 3279.798871: #1 context irq timer_latency 830 ns <...>-807 [000] ....... 3279.798881: #1 context thread timer_latency 11301 ns Add a new header for timerlat with the missing fields, to be used when the PREEMPT_RT is enabled. Link: https://lkml.kernel.org/r/babb83529a3211bd0805be0b8c21608230202c55.1626598844.git.bristot@kernel.org Cc: Tom Zanussi <zanussi@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12trace/osnoise: Add a header with PREEMPT_RT additional fieldsDaniel Bristot de Oliveira
Some extra flags are printed to the trace header when using the PREEMPT_RT config. The extra flags are: need-resched-lazy, preempt-lazy-depth, and migrate-disable. Without printing these fields, the osnoise specific fields are shifted by three positions, for example: # tracer: osnoise # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth MAX # || / SINGLE Interference counters: # |||| RUNTIME NOISE %% OF CPU NOISE +-----------------------------+ # TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD # | | | |||| | | | | | | | | | | <...>-741 [000] ....... 1105.690909: 1000000 234 99.97660 36 21 0 1001 22 3 <...>-742 [001] ....... 1105.691923: 1000000 281 99.97190 197 7 0 1012 35 14 <...>-743 [002] ....... 1105.691958: 1000000 1324 99.86760 118 11 0 1016 155 143 <...>-744 [003] ....... 1105.691998: 1000000 109 99.98910 21 4 0 1004 33 7 <...>-745 [004] ....... 1105.692015: 1000000 2023 99.79770 97 37 0 1023 52 18 Add a new header for osnoise with the missing fields, to be used when the PREEMPT_RT is enabled. Link: https://lkml.kernel.org/r/1f03289d2a51fde5a58c2e7def063dc630820ad1.1626598844.git.bristot@kernel.org Cc: Tom Zanussi <zanussi@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-12Merge branch 'for-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull ucounts fix from Eric Biederman: "This fixes the ucount sysctls on big endian architectures. The counts were expanded to be longs instead of ints, and the sysctl code was overlooked, so only the low 32bit were being processed. On litte endian just processing the low 32bits is fine, but on 64bit big endian processing just the low 32bits results in the high order bits instead of the low order bits being processed and nothing works proper. This change took a little bit to mature as we have the SYSCTL_ZERO, and SYSCTL_INT_MAX macros that are only usable for sysctls operating on ints, but unfortunately are not obviously broken. Which resulted in the versions of this change working on big endian and not on little endian, because the int SYSCTL_ZERO when extended 64bit wound up being 0x100000000. So we only allowed values greater than 0x100000000 and less than 0faff. Which unfortunately broken everything that tried to set the sysctls. (First reported with the windows subsystem for linux). I have tested this on x86_64 64bit after first reproducing the problems with the earlier version of this change, and then verifying the problems do not exist when we use appropriate long min and max values for extra1 and extra2" * 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ucounts: add missing data type changes
2021-08-12Merge tag 'sound-5.14-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This seems to be a usual bump in the middle, containing lots of pending ASoC fixes: - Yet another PCM mmap regression fix - Fix for ASoC DAPM prefix handling - Various cs42l42 codec fixes - PCM buffer reference fixes in a few ASoC drivers - Fixes for ASoC SOF, AMD, tlv320, WM - HD-audio quirks" * tag 'sound-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (32 commits) ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC ALSA: pcm: Fix mmap breakage without explicit buffer setup ALSA: hda: Add quirk for ASUS Flow x13 ASoC: cs42l42: Fix mono playback ASoC: cs42l42: Constrain sample rate to prevent illegal SCLK ASoC: cs42l42: Fix LRCLK frame start edge ASoC: cs42l42: PLL must be running when changing MCLK_SRC_SEL ASoC: cs42l42: Remove duplicate control for WNF filter frequency ASoC: cs42l42: Fix inversion of ADC Notch Switch control ASoC: SOF: Intel: hda-ipc: fix reply size checking ASoC: SOF: Intel: Kconfig: fix SoundWire dependencies ASoC: amd: Fix reference to PCM buffer address ASoC: nau8824: Fix open coded prefix handling ASoC: kirkwood: Fix reference to PCM buffer address ASoC: uniphier: Fix reference to PCM buffer address ASoC: xilinx: Fix reference to PCM buffer address ASoC: intel: atom: Fix reference to PCM buffer address ASoC: cs42l42: Fix bclk calculation for mono ASoC: cs42l42: Don't allow SND_SOC_DAIFMT_LEFT_J ASoC: cs42l42: Correct definition of ADC Volume control ...
2021-08-12wwan: core: Avoid returning NULL from wwan_create_dev()Andy Shevchenko
Make wwan_create_dev() to return either valid or error pointer, In some cases it may return NULL. Prevent this by converting it to the respective error pointer. Fixes: 9a44c1cc6388 ("net: Add a WWAN subsystem") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/20210811124845.10955-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-12block: pass a gendisk to bdev_resize_partitionChristoph Hellwig
bdev_resize_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: pass a gendisk to bdev_del_partitionChristoph Hellwig
bdev_del_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: pass a gendisk to bdev_add_partitionChristoph Hellwig
bdev_add_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12block: store a gendisk in struct parsed_partitionsChristoph Hellwig
Partition scanning only happens on the whole device, so pass a struct gendisk instead of the whole device block_device to the scanners. This allows to simplify printing the device name in various places as the disk name is available in disk->name. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210810154512.1809898-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12cifs: Call close synchronously during unlink/rename/lease break.Rohith Surabattula
During unlink/rename/lease break, deferred work for close is scheduled immediately but in an asynchronous manner which might lead to race with actual(unlink/rename) commands. This change will schedule close synchronously which will avoid the race conditions with other commands. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: stable@vger.kernel.org # 5.13 Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-12cifs: Handle race conditions during renameRohith Surabattula
When rename is executed on directory which has files for which close is deferred, then rename will fail with EACCES. This patch will try to close all deferred files when EACCES is received and retry rename on a directory. Signed-off-by: Rohith Surabattula <rohiths@microsoft.com> Cc: stable@vger.kernel.org # 5.13 Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-12block: remove GENHD_FL_UPChristoph Hellwig
Just check inode_unhashed on the whole device bdev inode instead, and provide a helper to check for that information. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12bcache: move the del_gendisk call out of bcache_device_freeChristoph Hellwig
Let the callers call del_gendisk so that we can check if add_disk has been called properly for the cached device case instead of relying on the block layer internal GENHD_FL_UP flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210809064028.1198327-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12bcache: add proper error unwinding in bcache_device_initChristoph Hellwig
Except for the IDA none of the allocations in bcache_device_init is unwound on error, fix that. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210809064028.1198327-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12sx8: use the internal state machine to check if del_gendisk needs to be calledChristoph Hellwig
Remove usage of the block layer internal GENHD_FL_UP by just looking at the host state. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12nvme: replace the GENHD_FL_UP check in nvme_mpath_shutdown_diskChristoph Hellwig
Use the nvme-internal NVME_NSHEAD_DISK_LIVE flag instead of abusing the block layer state. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12nvme: remove the GENHD_FL_UP check in nvme_ns_removeChristoph Hellwig
Early probe failure never reaches nvme_ns_remove, so GENHD_FL_UP must be set at this point. Remove the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12mmc: block: cleanup gendisk creationChristoph Hellwig
Restructure mmc_blk_probe to avoid a failure path with a half created disk. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210809064028.1198327-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12mmc: block: let device_add_disk create disk attributesChristoph Hellwig
Pass the attribute group for the attributes on the gendisk to device_add_disk so that they are created atomically with the disk creation. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210809064028.1198327-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-12xen/events: Fix race in set_evtchn_to_irqMaximilian Heyne
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq mapping are lazily allocated in this function. The check whether the row is already present and the row initialization is not synchronized. Two threads can at the same time allocate a new row for evtchn_to_irq and add the irq mapping to the their newly allocated row. One thread will overwrite what the other has set for evtchn_to_irq[row] and therefore the irq mapping is lost. This will trigger a BUG_ON later in bind_evtchn_to_cpu: INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802 INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002) INFO: nvme nvme77: 1/0/0 default/read/poll queues CRIT: kernel BUG at drivers/xen/events/events_base.c:427! WARN: invalid opcode: 0000 [#1] SMP NOPTI WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme] WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0 WARN: Call Trace: WARN: set_affinity_irq+0x121/0x150 WARN: irq_do_set_affinity+0x37/0xe0 WARN: irq_setup_affinity+0xf6/0x170 WARN: irq_startup+0x64/0xe0 WARN: __setup_irq+0x69e/0x740 WARN: ? request_threaded_irq+0xad/0x160 WARN: request_threaded_irq+0xf5/0x160 WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme] WARN: pci_request_irq+0xa9/0xf0 WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130 WARN: queue_request_irq+0x4c/0x70 [nvme] WARN: nvme_reset_work+0x82d/0x1550 [nvme] WARN: ? check_preempt_wakeup+0x14f/0x230 WARN: ? check_preempt_curr+0x29/0x80 WARN: ? nvme_irq_check+0x30/0x30 [nvme] WARN: process_one_work+0x18e/0x3c0 WARN: worker_thread+0x30/0x3a0 WARN: ? process_one_work+0x3c0/0x3c0 WARN: kthread+0x113/0x130 WARN: ? kthread_park+0x90/0x90 WARN: ret_from_fork+0x3a/0x50 This patch sets evtchn_to_irq rows via a cmpxchg operation so that they will be set only once. The row is now cleared before writing it to evtchn_to_irq in order to not create a race once the row is visible for other threads. While at it, do not require the page to be zeroed, because it will be overwritten with -1's in clear_evtchn_to_irq_row anyway. Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Fixes: d0b075ffeede ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated") Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-08-12platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200sHans de Goede
The Asus TP200s / E205SA 360 degree hinges 2-in-1 supports reporting SW_TABLET_MODE info through the ASUS_WMI_DEVID_LID_FLIP WMI device-id. Add a quirk to enable this. BugLink: https://gitlab.freedesktop.org/libinput/libinput/-/issues/639 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210812145513.39117-2-hdegoede@redhat.com
2021-08-12platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a ↵Hans de Goede
module option Unfortunately we have been unable to find a reliable way to detect if and how SW_TABLET_MODE reporting is supported, so we are relying on DMI quirks for this. Add a module-option to specify the SW_TABLET_MODE method so that this can be easily tested without needing to rebuild the kernel. BugLink: https://gitlab.freedesktop.org/libinput/libinput/-/issues/639 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210812145513.39117-1-hdegoede@redhat.com
2021-08-12x86/tools: Fix objdump version check againRandy Dunlap
Skip (omit) any version string info that is parenthesized. Warning: objdump version 15) is older than 2.19 Warning: Skipping posttest. where 'objdump -v' says: GNU objdump (GNU Binutils; SUSE Linux Enterprise 15) 2.35.1.20201123-7.18 Fixes: 8bee738bb1979 ("x86: Fix objdump version check in chkobjdump.awk for different formats.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20210731000146.2720-1-rdunlap@infradead.org
2021-08-12riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUEAlexandre Ghiti
The current comment states that we check if the 64-bit kernel mapping overlaps with the last 4K of the address space that is reserved to error values in create_kernel_page_table, which is not the case since it is done in setup_vm. But anyway, remove the reference to any function and simply note that in 64-bit kernel, the check should be done as soon as the kernel mapping base address is known. Fixes: db6b84a368b4 ("riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE") Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support itChangbin Du
The RISC-V special option '-mno-relax' which to disable linker relaxations is supported by GCC8+. For GCC7 and lower versions do not support this option. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Changbin Du <changbin.du@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12isofs: joliet: Fix iocharset=utf8 mount optionPali Rohár
Currently iocharset=utf8 mount option is broken. To use UTF-8 as iocharset, it is required to use utf8 mount option. Fix iocharset=utf8 mount option to use be equivalent to the utf8 mount option. If UTF-8 as iocharset is used then s_nls_iocharset is set to NULL. So simplify code around, remove s_utf8 field as to distinguish between UTF-8 and non-UTF-8 it is needed just to check if s_nls_iocharset is set to NULL or not. Link: https://lore.kernel.org/r/20210808162453.1653-5-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
2021-08-12udf: Fix iocharset=utf8 mount optionPali Rohár
Currently iocharset=utf8 mount option is broken. To use UTF-8 as iocharset, it is required to use utf8 mount option. Fix iocharset=utf8 mount option to use be equivalent to the utf8 mount option. If UTF-8 as iocharset is used then s_nls_map is set to NULL. So simplify code around, remove UDF_FLAG_NLS_MAP and UDF_FLAG_UTF8 flags as to distinguish between UTF-8 and non-UTF-8 it is needed just to check if s_nls_map set to NULL or not. Link: https://lore.kernel.org/r/20210808162453.1653-4-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
2021-08-12powerpc/xive: Do not skip CPU-less nodes when creating the IPIsCédric Le Goater
On PowerVM, CPU-less nodes can be populated with hot-plugged CPUs at runtime. Today, the IPI is not created for such nodes, and hot-plugged CPUs use a bogus IPI, which leads to soft lockups. We can not directly allocate and request the IPI on demand because bringup_up() is called under the IRQ sparse lock. The alternative is to allocate the IPIs for all possible nodes at startup and to request the mapping on demand when the first CPU of a node is brought up. Fixes: 7dcc37b3eff9 ("powerpc/xive: Map one IPI interrupt per node") Cc: stable@vger.kernel.org # v5.13 Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210807072057.184698-1-clg@kaod.org