Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- mmc_spi: Add SPI IDs to silence warning
- sdhci: Fix ADMA for PAGE_SIZE >= 64KiB
- sdhci-esdhc-imx: Disable broken CMDQ for imx8qm/imx8qxp/imx8mm
* tag 'mmc-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: spi: Add device-tree SPI IDs
mmc: sdhci: Fix ADMA for PAGE_SIZE >= 64KiB
mmc: sdhci-esdhc-imx: disable CMDQ support
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has an interrupt storm fix for the i801, better timeout handling
for the new virtio driver, and some documentation fixes this time"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
docs: i2c: smbus-protocol: mention the repeated start condition
i2c: virtio: disable timeout handling
i2c: i801: Fix interrupt storm from SMB_ALERT signal
i2c: i801: Restore INTREN on unload
dt-bindings: i2c: imx-lpi2c: Fix i.MX 8QM compatible matching
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- Kconfig fix to make it possible to control building of the privcmd
driver
- three fixes for issues identified by the kernel test robot
- a five-patch series to simplify timeout handling for Xen PV driver
initialization
- two patches to fix error paths in xenstore/xenbus driver
initialization
* tag 'for-linus-5.16c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: make HYPERVISOR_set_debugreg() always_inline
xen: make HYPERVISOR_get_debugreg() always_inline
xen: detect uninitialized xenbus in xenbus_init
xen: flag xen_snd_front to be not essential for system boot
xen: flag pvcalls-front to be not essential for system boot
xen: flag hvc_xen to be not essential for system boot
xen: flag xen_drm_front to be not essential for system boot
xen: add "not_essential" flag to struct xenbus_driver
xen/pvh: add missing prototype to header
xen: don't continue xenstore initialization in case of errors
xen/privcmd: make option visible in Kconfig
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Three arm64 fixes.
The main one is a fix to the way in which we evaluate the macro
arguments to our uaccess routines, which we _think_ might be the root
cause behind some unkillable tasks we've seen in the Android arm64 CI
farm (testing is ongoing). In any case, it's worth fixing.
Other than that, we've toned down an over-zealous VM_BUG_ON() and
fixed ftrace stack unwinding in a bunch of cases.
Summary:
- Evaluate uaccess macro arguments outside of the critical section
- Tighten up VM_BUG_ON() in pmd_populate_kernel() to avoid false positive
- Fix ftrace stack unwinding using HAVE_FUNCTION_GRAPH_RET_ADDR_PTR"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: uaccess: avoid blocking within critical sections
arm64: mm: Fix VM_BUG_ON(mm != &init_mm) for trans_pgd
arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
|
|
When CONFIG_COMMON_CLOCK is disabled, the 8996 specific
phy code is left out, which results in a link failure:
ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined reference to `msm_hdmi_phy_8996_cfg'
This was only exposed after it became possible to build
test the driver without the clock interfaces.
Make COMMON_CLK a hard dependency for compile testing,
and simplify it a little based on that.
Fixes: b3ed524f84f5 ("drm/msm: allow compile_test on !ARM")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20211013144308.2248978-1-arnd@kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
|
If writeback isn't configured, then we get the following warning when
compiling zram:
drivers/block/zram/zram_drv.c:1824:45: warning: unused variable 'zram_wb_devops' [-Wunused-const-variable]
Make sure we only define the block_device_operations if that option is
enabled.
Link: https://lore.kernel.org/lkml/202111261614.gCJMqcyh-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We need to call rq_qos_done() regardless of whether or not we're freeing
the request or not, as the reference count doesn't cover the IO completion
tracking.
Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()")
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
DRM_DEBUG_* and DRM_* log calls are deprecated.
Change them to drm_dbg_* / drm_{err,info,...} calls in drm core
files.
To avoid making a very big patch, this change is split in
smaller patches. This one includes drm_a*.c
Signed-off-by: Claudio Suarez <cssk@net-c.es>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/YaC7zXW119tlzfVh@gineta.localdomain
|
|
WARNING: inconsistent lock state
5.16.0-rc2-syzkaller #0 Not tainted
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
ffff888078e11418 (&ctx->timeout_lock
){?.+.}-{2:2}
, at: io_timeout_fn+0x6f/0x360 fs/io_uring.c:5943
{HARDIRQ-ON-W} state was registered at:
[...]
spin_unlock_irq include/linux/spinlock.h:399 [inline]
__io_poll_remove_one fs/io_uring.c:5669 [inline]
__io_poll_remove_one fs/io_uring.c:5654 [inline]
io_poll_remove_one+0x236/0x870 fs/io_uring.c:5680
io_poll_remove_all+0x1af/0x235 fs/io_uring.c:5709
io_ring_ctx_wait_and_kill+0x1cc/0x322 fs/io_uring.c:9534
io_uring_release+0x42/0x46 fs/io_uring.c:9554
__fput+0x286/0x9f0 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0xc14/0x2b40 kernel/exit.c:832
674ee8e1b4a41 ("io_uring: correct link-list traversal locking") fixed a
data race but introduced a possible deadlock and inconsistentcy in irq
states. E.g.
io_poll_remove_all()
spin_lock_irq(timeout_lock)
io_poll_remove_one()
spin_lock/unlock_irq(poll_lock);
spin_unlock_irq(timeout_lock)
Another type of problem is freeing a request while holding
->timeout_lock, which may leads to a deadlock in
io_commit_cqring() -> io_flush_timeouts() and other places.
Having 3 nested locks is also too ugly. Add io_match_task_safe(), which
would briefly take and release timeout_lock for race prevention inside,
so the actuall request cancellation / free / etc. code doesn't have it
taken.
Reported-by: syzbot+ff49a3059d49b0ca0eec@syzkaller.appspotmail.com
Reported-by: syzbot+847f02ec20a6609a328b@syzkaller.appspotmail.com
Reported-by: syzbot+3368aadcd30425ceb53b@syzkaller.appspotmail.com
Reported-by: syzbot+51ce8887cdef77c9ac83@syzkaller.appspotmail.com
Reported-by: syzbot+3cb756a49d2f394a9ee3@syzkaller.appspotmail.com
Fixes: 674ee8e1b4a41 ("io_uring: correct link-list traversal locking")
Cc: stable@kernel.org # 5.15+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/397f7ebf3f4171f1abe41f708ac1ecb5766f0b68.1637937097.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
WARNING: CPU: 1 PID: 20 at fs/io_uring.c:6269 io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
CPU: 1 PID: 20 Comm: kworker/1:0 Not tainted 5.16.0-rc1-syzkaller #0
Workqueue: events io_fallback_req_func
RIP: 0010:io_try_cancel_userdata+0x3c5/0x640 fs/io_uring.c:6269
Call Trace:
<TASK>
io_req_task_link_timeout+0x6b/0x1e0 fs/io_uring.c:6886
io_fallback_req_func+0xf9/0x1ae fs/io_uring.c:1334
process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
kthread+0x405/0x4f0 kernel/kthread.c:327
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
</TASK>
We need original task's context to do cancellations, so if it's dying
and the callback is executed in a fallback mode, fail the cancellation
attempt.
Fixes: 89b263f6d56e6 ("io_uring: run linked timeouts from task_work")
Cc: stable@kernel.org # 5.15+
Reported-by: syzbot+ab0cfe96c2b3cd1c1153@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4c41c5f379c6941ad5a07cd48cb66ed62199cf7e.1637937097.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
[BUG]
Fstests generic/027 is pretty easy to trigger a slow but steady memory
leak if run with "-o compress=lzo" mount option.
Normally one single run of generic/027 is enough to eat up at least 4G ram.
[CAUSE]
In commit d4088803f511 ("btrfs: subpage: make lzo_compress_pages()
compatible") we changed how @page_in is released.
But that refactoring makes @page_in only released after all pages being
compressed.
This leaves error path not releasing @page_in. And by "error path"
things like incompressible data will also be treated as an error
(-E2BIG).
Thus it can cause a memory leak if even nothing wrong happened.
[FIX]
Add check under @out label to release @page_in when needed, so when we
hit any error, the input page is properly released.
Reported-by: Josef Bacik <josef@toxicpanda.com>
Fixes: d4088803f511 ("btrfs: subpage: make lzo_compress_pages() compatible")
Reviewed-and-tested-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
vfs_kernel_mount() modifies the passed in mount options, leaving us with
"huge", instead of "huge=within_size". Normally this shouldn't matter
with the usual module load/unload flow, however with the core_hotunplug
IGT we are hitting the following, when re-probing the memory regions:
i915 0000:00:02.0: [drm] Transparent Hugepage mode 'huge'
tmpfs: Bad value for 'huge'
[drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-22).
References: https://gitlab.freedesktop.org/drm/intel/-/issues/4651
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211126110843.2028582-1-matthew.auld@intel.com
|
|
Move the declaration of temporary arrays to somewhere that won't go out
of scope before the devm_clk_hw_register() call, lest we be at the whim
of the compiler for whether those stack variables get overwritten.
Fixes a crash seen with gcc version 11.2.1 20210728 (Red Hat 11.2.1-1)
Fixes: bdd229ab26be ("ASoC: rt5682s: Add driver for ALC5682I-VS codec")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211118010453.843286-2-robdclark@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Move the declaration of temporary arrays to somewhere that won't go out
of scope before the devm_clk_hw_register() call, lest we be at the whim
of the compiler for whether those stack variables get overwritten.
Fixes a crash seen with gcc version 11.2.1 20210728 (Red Hat 11.2.1-1)
Fixes: edbd24ea1e5c ("ASoC: rt5682: Drop usage of __clk_get_name()")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211118010453.843286-1-robdclark@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver currently subscribes for a late system sleep call.
The initcall_debug log shows that suspend call for ADX device
happens after the parent device (AHUB). This seems to cause
suspend failure on Jetson TX2 platform. Also there is no use
of having late system sleep specifically for ADX device. Fix
the order by using normal system sleep.
Fixes: a99ab6f395a9 ("ASoC: tegra: Add Tegra210 based ADX driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-7-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver currently subscribes for a late system sleep call.
The initcall_debug log shows that suspend call for AMX device
happens after the parent device (AHUB). This seems to cause
suspend failure on Jetson TX2 platform. Also there is no use
of having late system sleep specifically for AMX device. Fix
the order by using normal system sleep.
Fixes: 77f7df346c45 ("ASoC: tegra: Add Tegra210 based AMX driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-6-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver currently subscribes for a late system sleep call.
The initcall_debug log shows that suspend call for Mixer device
happens after the parent device (AHUB). This seems to cause
suspend failure on Jetson TX2 platform. Also there is no use
of having late system sleep specifically for Mixer device. Fix
the order by using normal system sleep.
Fixes: 05bb3d5ec64a ("ASoC: tegra: Add Tegra210 based Mixer driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-5-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver currently subscribes for a late system sleep call.
The initcall_debug log shows that suspend call for MVC device
happens after the parent device (AHUB). This seems to cause
suspend failure on Jetson TX2 platform. Also there is no use
of having late system sleep specifically for MVC device. Fix
the order by using normal system sleep.
Fixes: e539891f9687 ("ASoC: tegra: Add Tegra210 based MVC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-4-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The driver currently subscribes for a late system sleep call.
The initcall_debug log shows that suspend call for SFC device
happens after the parent device (AHUB). This seems to cause
suspend failure on Jetson TX2 platform. Also there is no use
of having late system sleep specifically for SFC device. Fix
the order by using normal system sleep.
Fixes: b2f74ec53a6c ("ASoC: tegra: Add Tegra210 based SFC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-3-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
After successful application of volume/mute settings via mixer control
put calls, the control returns without balancing the runtime PM count.
This makes device to be always runtime active. Fix this by allowing
control to reach pm_runtime_put() call.
Fixes: e539891f9687 ("ASoC: tegra: Add Tegra210 based MVC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1637676459-31191-2-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
INVLPG operates on guest virtual address, which are represented by
vcpu->arch.walk_mmu. In nested virtualization scenarios,
kvm_mmu_invlpg() was using the wrong MMU structure; if L2's invlpg were
emulated by L0 (in practice, it hardly happen) when nested two-dimensional
paging is enabled, the call to ->tlb_flush_gva() would be skipped and
the hardware TLB entry would not be invalidated.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-5-jiangshanlai@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If the is an L1 with nNPT in 32bit, the shadow walk starts with
pae_root.
Fixes: a717a780fc4e ("KVM: x86/mmu: Support shadowing NPT when 5-level paging is enabled in host)
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-2-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test
may hit RLIMIT_NOFILE limits:
# ./kvm_create_max_vcpus
KVM_CAP_MAX_VCPU_ID: 4096
KVM_CAP_MAX_VCPUS: 1024
Testing creating 1024 vCPUs, with IDs 0...1023.
/dev/kvm not available (errno: 24), skipping test
Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE
capability which is generally not needed to run kvm selftests (but without
raising the limit the test is doomed to fail anyway).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
[Skip the test if the hard limit can be raised. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Commit 63f5a1909f9e ("KVM: x86: Alert userspace that KVM_SET_CPUID{,2}
after KVM_RUN is broken") officially deprecated KVM_SET_CPUID{,2} ioctls
after first successful KVM_RUN and promissed to make this sequence forbiden
in 5.16. It's time to fulfil the promise.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
hyperv_features's sole purpose is to test access to various Hyper-V MSRs
and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
is deprecated and soon-to-be forbidden, avoid it by re-creating test VM
for each sub-test.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fully emulate a guest TLB flush on nested VM-Enter which changes vpid12,
i.e. L2's VPID, instead of simply doing INVVPID to flush real hardware's
TLB entries for vpid02. From L1's perspective, changing L2's VPID is
effectively a TLB flush unless "hardware" has previously cached entries
for the new vpid12. Because KVM tracks only a single vpid12, KVM doesn't
know if the new vpid12 has been used in the past and so must treat it as
a brand new, never been used VPID, i.e. must assume that the new vpid12
represents a TLB flush from L1's perspective.
For example, if L1 and L2 share a CR3, the first VM-Enter to L2 (with a
VPID) is effectively a TLB flush as hardware/KVM has never seen vpid12
and thus can't have cached entries in the TLB for vpid12.
Reported-by: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Fixes: 5c614b3583e7 ("KVM: nVMX: nested VPID emulation")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211125014944.536398-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Like KVM_REQ_TLB_FLUSH_CURRENT, the GUEST variant needs to be serviced at
nested transitions, as KVM doesn't track requests for L1 vs L2. E.g. if
there's a pending flush when a nested VM-Exit occurs, then the flush was
requested in the context of L2 and needs to be handled before switching
to L1, otherwise the flush for L2 would effectiely be lost.
Opportunistically add a helper to handle CURRENT and GUEST as a pair, the
logic for when they need to be serviced is identical as both requests are
tied to L1 vs. L2, the only difference is the scope of the flush.
Reported-by: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Fixes: 07ffaf343e34 ("KVM: nVMX: Sync all PGDs on nested transition with shadow paging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211125014944.536398-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Flush the current VPID when handling KVM_REQ_TLB_FLUSH_GUEST instead of
always flushing vpid01. Any TLB flush that is triggered when L2 is
active is scoped to L2's VPID (if it has one), e.g. if L2 toggles CR4.PGE
and L1 doesn't intercept PGE writes, then KVM's emulation of the TLB
flush needs to be applied to L2's VPID.
Reported-by: Lai Jiangshan <jiangshanlai+lkml@gmail.com>
Fixes: 07ffaf343e34 ("KVM: nVMX: Sync all PGDs on nested transition with shadow paging")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211125014944.536398-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The capability, albeit present, was never exposed via KVM_CHECK_EXTENSION.
Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration")
Cc: Peter Gonda <pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Ensure that the ASID are freed promptly, which becomes more important
when more tests are added to this file.
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM leaves the source VM in a dead state,
so migrating back to the original source VM fails the ioctl. Adjust
the test.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Synchronize the two calls to kvm_x86_sync_pir_to_irr. The one
in the reenter-guest fast path invoked the callback unconditionally
even if LAPIC is present but disabled. In this case, there are
no interrupts to deliver, and therefore posted interrupts can
be ignored.
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is not an unrecoverable situation. Users of kvm_read_guest_offset_cached
and kvm_write_guest_offset_cached must expect the read/write to fail, and
therefore it is possible to just return early with an error value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
An uninitialized gfn_to_hva_cache has ghc->len == 0, which causes
the accessors to croak very loudly. While a BUG_ON is definitely
_too_ loud and a bug on its own, there is indeed an issue of using
the caches in such a way that they could not have been initialized,
because ghc->gpa == 0 might match and thus kvm_gfn_to_hva_cache_init
would not be called.
For the vmcs12_cache, the solution is simply to invoke
kvm_gfn_to_hva_cache_init unconditionally: we already know
that the cache does not match the current VMCS pointer.
For the shadow_vmcs12_cache, there is no similar condition
that checks the VMCS link pointer, so invalidate the cache
on VMXON.
Fixes: cee66664dcd6 ("KVM: nVMX: Use a gfn_to_hva_cache for vmptrld")
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reported-by: syzbot+7b7db8bb4db6fd5e157b@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.16, take #2
- Fix constant sign extension affecting TCR_EL2 and preventing
running on ARMv8.7 models due to spurious bits being set
- Fix use of helpers using PSTATE early on exit by always sampling
it as soon as the exit takes place
- Move pkvm's 32bit handling into a common helper
|
|
into HEAD
KVM/riscv fixes for 5.16, take #1
- Fix incorrect KVM_MAX_VCPUS value
- Unmap stage2 mapping when deleting/moving a memslot
(This was due to empty kvm_arch_flush_shadow_memslot())
|
|
The capture code is typically run entirely in the fence signalling
critical path. We're about to add lockdep annotation in an upcoming patch
which reveals a lockdep splat similar to the below one.
Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
(which is the same as GFP_WAIT, but open-coded for clarity) rather than
GFP_KERNEL for memory allocation in the capture path. This has the
potential drawback that capture might fail in situations with memory
pressure.
[ 234.842048] WARNING: possible circular locking dependency detected
[ 234.842050] 5.15.0-rc7+ #20 Tainted: G U W
[ 234.842052] ------------------------------------------------------
[ 234.842054] gem_exec_captur/1180 is trying to acquire lock:
[ 234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
[ 234.842063]
but task is already holding lock:
[ 234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[ 234.842138]
which lock already depends on the new lock.
[ 234.842140]
the existing dependency chain (in reverse order) is:
[ 234.842142]
-> #2 (dma_fence_map){++++}-{0:0}:
[ 234.842145] __dma_fence_might_wait+0x41/0xa0
[ 234.842149] dma_resv_lockdep+0x1dc/0x28f
[ 234.842151] do_one_initcall+0x58/0x2d0
[ 234.842154] kernel_init_freeable+0x273/0x2bf
[ 234.842157] kernel_init+0x16/0x120
[ 234.842160] ret_from_fork+0x1f/0x30
[ 234.842163]
-> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[ 234.842166] fs_reclaim_acquire+0x6d/0xd0
[ 234.842168] __kmalloc_node+0x51/0x3a0
[ 234.842171] alloc_cpumask_var_node+0x1b/0x30
[ 234.842174] native_smp_prepare_cpus+0xc7/0x292
[ 234.842177] kernel_init_freeable+0x160/0x2bf
[ 234.842179] kernel_init+0x16/0x120
[ 234.842181] ret_from_fork+0x1f/0x30
[ 234.842184]
-> #0 (fs_reclaim){+.+.}-{0:0}:
[ 234.842186] __lock_acquire+0x1161/0x1dc0
[ 234.842189] lock_acquire+0xb5/0x2b0
[ 234.842192] fs_reclaim_acquire+0xa1/0xd0
[ 234.842193] __kmalloc+0x4d/0x330
[ 234.842196] i915_vma_coredump_create+0x78/0x5b0 [i915]
[ 234.842253] intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[ 234.842307] __i915_gpu_coredump+0x290/0x5e0 [i915]
[ 234.842365] i915_capture_error_state+0x57/0xa0 [i915]
[ 234.842415] intel_gt_handle_error+0x348/0x3e0 [i915]
[ 234.842462] intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[ 234.842504] simple_attr_write+0xc1/0xe0
[ 234.842507] full_proxy_write+0x53/0x80
[ 234.842509] vfs_write+0xbc/0x350
[ 234.842513] ksys_write+0x58/0xd0
[ 234.842514] do_syscall_64+0x38/0x90
[ 234.842516] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 234.842519]
other info that might help us debug this:
[ 234.842521] Chain exists of:
fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
[ 234.842526] Possible unsafe locking scenario:
[ 234.842528] CPU0 CPU1
[ 234.842529] ---- ----
[ 234.842531] lock(dma_fence_map);
[ 234.842532] lock(mmu_notifier_invalidate_range_start);
[ 234.842535] lock(dma_fence_map);
[ 234.842537] lock(fs_reclaim);
[ 234.842539]
*** DEADLOCK ***
[ 234.842540] 4 locks held by gem_exec_captur/1180:
[ 234.842543] #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[ 234.842547] #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
[ 234.842552] #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
[ 234.842602] #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
[ 234.842656]
stack backtrace:
[ 234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G U W 5.15.0-rc7+ #20
[ 234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[ 234.842664] Call Trace:
[ 234.842666] dump_stack_lvl+0x57/0x72
[ 234.842669] check_noncircular+0xde/0x100
[ 234.842672] ? __lock_acquire+0x3bf/0x1dc0
[ 234.842675] __lock_acquire+0x1161/0x1dc0
[ 234.842678] lock_acquire+0xb5/0x2b0
[ 234.842680] ? __kmalloc+0x4d/0x330
[ 234.842683] ? finish_task_switch.isra.0+0xf2/0x360
[ 234.842686] ? i915_vma_coredump_create+0x78/0x5b0 [i915]
[ 234.842734] fs_reclaim_acquire+0xa1/0xd0
[ 234.842737] ? __kmalloc+0x4d/0x330
[ 234.842739] __kmalloc+0x4d/0x330
[ 234.842742] i915_vma_coredump_create+0x78/0x5b0 [i915]
[ 234.842793] ? capture_vma+0xbe/0x110 [i915]
[ 234.842844] intel_engine_coredump_add_vma+0x36/0xe0 [i915]
[ 234.842892] __i915_gpu_coredump+0x290/0x5e0 [i915]
[ 234.842939] i915_capture_error_state+0x57/0xa0 [i915]
[ 234.842985] intel_gt_handle_error+0x348/0x3e0 [i915]
[ 234.843032] ? __mutex_lock+0x81/0x830
[ 234.843035] ? simple_attr_write+0x3a/0xe0
[ 234.843038] ? __lock_acquire+0x3bf/0x1dc0
[ 234.843041] intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
[ 234.843083] ? _copy_from_user+0x45/0x80
[ 234.843086] simple_attr_write+0xc1/0xe0
[ 234.843089] full_proxy_write+0x53/0x80
[ 234.843091] vfs_write+0xbc/0x350
[ 234.843094] ksys_write+0x58/0xd0
[ 234.843096] do_syscall_64+0x38/0x90
[ 234.843098] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 234.843101] RIP: 0033:0x7fa467480877
[ 234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
[ 234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
[ 234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
[ 234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
[ 234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005
v5:
- Use __GFP_KSWAPD_RECLAIM rather than __GFP_NOWAIT for clarity.
(Daniel Vetter)
v6:
- Include an instance in execlists_capture_work().
- Rework the commit message due to patch reordering.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
|
|
The gpu coredump typically takes place in a dma_fence signalling
critical path, and hence can't use GFP_KERNEL allocations, as that
means we might hit deadlocks under memory pressure. However
changing to __GFP_KSWAPD_RECLAIM which will be done in an upcoming
patch will instead mean a lower chance of the allocation succeeding.
In particular large contigous allocations like the coredump page
vector.
Remove the page vector in favor of a linked list of single pages.
Use the page lru list head as the list link, as the page owner is
allowed to do that.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-2-thomas.hellstrom@linux.intel.com
|
|
Jakub Kicinski says:
====================
tls: splice_read fixes
As I work my way to unlocked and zero-copy TLS Rx the obvious bugs
in the splice_read implementation get harder and harder to ignore.
This is to say the fixes here are discovered by code inspection,
I'm not aware of anyone actually using splice_read.
====================
Link: https://lore.kernel.org/r/20211124232557.2039757-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Previous patch fixes overriding callbacks incorrectly. Triggering
the crash in sendpage_locked would be more spectacular but it's
hard to get to, so take the easier path of proving this is broken
and call getname. We're currently getting IPv4 socket info on an
IPv6 socket.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We replace proto_ops whenever TLS is configured for RX. But our
replacement also overrides sendpage_locked, which will crash
unless TX is also configured. Similarly we plug both of those
in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely
different implementation for TX.
Last but not least we always plug in something based on inet_stream_ops
even though a few of the callbacks differ for IPv6 (getname, release,
bind).
Use a callback building method similar to what we do for struct proto.
Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Fixes: d4ffb02dee2f ("net/tls: enable sk_msg redirect to tls socket egress")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add tests for half-received and peeked records.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
recvmsg() will put peek()ed and partially read records onto the rx_list.
splice_read() needs to consult that list otherwise it may miss data.
Align with recvmsg() and also put partially-read records onto rx_list.
tls_sw_advance_skb() is pretty pointless now and will be removed in
net-next.
Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make sure we correctly reject splicing non-data records.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We don't support splicing control records. TLS 1.3 changes moved
the record type check into the decrypt if(). The skb may already
be decrypted and still be an alert.
Note that decrypt_skb_update() is idempotent and updates ctx->decrypted
so the if() is pointless.
Reorder the check for decryption errors with the content type check
while touching them. This part is not really a bug, because if
decryption failed in TLS 1.3 content type will be DATA, and for
TLS 1.2 it will be correct. Nevertheless its strange to touch output
before checking if the function has failed.
Fixes: fedf201e1296 ("net: tls: Refactor control message handling on recv")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Test broken records.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add helpers for sending and receiving special record types.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have the same code 3 times, about to add a fourth copy.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
More missed changes, the response back to another system sending a
command that had no user to handle it wasn't formatted properly.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|