summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-05-28tools headers UAPI: Sync kvm headers with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: 4af663c2f64a8d25 ("KVM: SEV: Allow per-guest configuration of GHCB protocol version") 4f5defae708992dd ("KVM: SEV: introduce KVM_SEV_INIT2 operation") 26c44aa9e076ed83 ("KVM: SEV: define VM types for SEV and SEV-ES") ac5c48027bacb1b5 ("KVM: SEV: publish supported VMSA features") 651d61bc8b7d8bb6 ("KVM: PPC: Fix documentation for ppc mmu caps") That don't change functionality in tools/perf, as no new ioctl is added for the 'perf trace' scripts to harvest. This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joel Stanley <joel@jms.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Roth <michael.roth@amd.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/lkml/ZlYxAdHjyAkvGtMW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28drm/sti: Allow build with COMPILE_TEST=yVille Syrjälä
Allow sti to be built with COMPILE_TEST=y for greater coverage. Builds fine on x86/x86_64 at least. Cc: Alain Volmat <alain.volmat@foss.st.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240408170426.9285-8-ville.syrjala@linux.intel.com Acked-by: Alain Volmat <alain.volmat@foss.st.com>
2024-05-28drm/sti: Include linux/io.h for devm_ioremap()Ville Syrjälä
Include linux/io.h for devm_ioremap(). When built on x86_64 w/ COMPILE_TEST=y: ../drivers/gpu/drm/sti/sti_dvo.c:531:21: error: implicit declaration of function ‘devm_ioremap’ [-Werror=implicit-function-declaration] 531 | dvo->regs = devm_ioremap(dev, res->start, | ^~~~~~~~~~~~ ../drivers/gpu/drm/sti/sti_dvo.c:531:19: error: assignment to ‘void *’ from ‘int’ makes pointer from integer without a cast [-Werror=int-conversion] 531 | dvo->regs = devm_ioremap(dev, res->start, | ^ Cc: Alain Volmat <alain.volmat@foss.st.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240408170426.9285-7-ville.syrjala@linux.intel.com Acked-by: Alain Volmat <alain.volmat@foss.st.com>
2024-05-28tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes from these csets: 53bc516ade85a764 ("x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place") That patch just move definitions around, so this just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://lore.kernel.org/lkml/ZlYe8jOzd1_DyA7X@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28Merge tag 'tpmdd-next-6.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "This fixes two unaddressed review comments for the HMAC encryption patch set. They are cosmetic but we are better off, if such unnecessary glitches do not exist in the release. The important part is enabling the HMAC encryption by default only on x86-64 because that is the only sufficiently tested arch. Finally, there is a bug fix for SPI transfer buffer allocation, which did not take into account the SPI header size" * tag 'tpmdd-next-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Enable TCG_TPM2_HMAC by default only for X86_64 tpm: Rename TPM2_OA_TMPL to TPM2_OA_NULL_KEY and make it local tpm: Open code tpm_buf_parameters() tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer
2024-05-28Merge tag 'probes-fixes-v6.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - uprobes: prevent mutex_lock() under rcu_read_lock(). Recent changes moved uprobe_cpu_buffer preparation which involves mutex_lock(), under __uprobe_trace_func() which is called inside rcu_read_lock(). Fix it by moving uprobe_cpu_buffer preparation outside of __uprobe_trace_func() - kprobe-events: handle the error case of btf_find_struct_member() * tag 'probes-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: fix error check in parse_btf_field() uprobes: prevent mutex_lock() under rcu_read_lock()
2024-05-28of: of_test: add MODULE_DESCRIPTION()Jeff Johnson
Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240524-md-of-of_test-v1-1-6ebd078d620f@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-05-28nvmet: fix a possible leak when destroy a ctrl during qp establishmentSagi Grimberg
In nvmet_sq_destroy we capture sq->ctrl early and if it is non-NULL we know that a ctrl was allocated (in the admin connect request handler) and we need to release pending AERs, clear ctrl->sqs and sq->ctrl (for nvme-loop primarily), and drop the final reference on the ctrl. However, a small window is possible where nvmet_sq_destroy starts (as a result of the client giving up and disconnecting) concurrently with the nvme admin connect cmd (which may be in an early stage). But *before* kill_and_confirm of sq->ref (i.e. the admin connect managed to get an sq live reference). In this case, sq->ctrl was allocated however after it was captured in a local variable in nvmet_sq_destroy. This prevented the final reference drop on the ctrl. Solve this by re-capturing the sq->ctrl after all inflight request has completed, where for sure sq->ctrl reference is final, and move forward based on that. This issue was observed in an environment with many hosts connecting multiple ctrls simoutanuosly, creating a delay in allocating a ctrl leading up to this race window. Reported-by: Alex Turin <alex@vastdata.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-28nvme: use srcu for iterating namespace listKeith Busch
The nvme pci driver synchronizes with all the namespace queues during a reset to ensure that there's no pending timeout work. Meanwhile the timeout work potentially iterates those same namespaces to freeze their queues. Each of those namespace iterations use the same read lock. If a write lock should somehow get between the synchronize and freeze steps, then forward progress is deadlocked. We had been relying on the nvme controller state machine to ensure the reset work wouldn't conflict with timeout work. That guarantee may be a bit fragile to rely on, so iterate the namespace lists without taking potentially circular locks, as reported by lockdep. Link: https://lore.kernel.org/all/20220930001943.zdbvolc3gkekfmcv@shindev/ Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-28drm/xe: Enable Coarse Power GatingRiana Tauro
Enable power gating for all units and sub-pipes that are disabled by default. v2: change the init function name use symmetric calls for enable/disable pg re-pharase commit message (Rodrigo) modify the sub-pipe power gating condition v3: set hysteresis value for render and media when GuC PC is disabled skip CPG for PVC (Vinay) v4: rebase Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> #v2 Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524070916.143022-3-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-28drm/xe: Standardize power gate registersRiana Tauro
Standardize power gate registers No functional changes v2: change commit message (Rodrigo) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524070916.143022-2-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-28drm/dp: Fix documentation warningMarileneGarcia
It fixes the following warnings when the kernel documentation is generated: ./include/drm/display/drm_dp_helper.h:126: warning: Function parameter or struct member 'mode' not described in 'drm_dp_as_sdp' ./include/drm/display/drm_dp_helper.h:126: warning: Excess struct member 'operation_mode' description in 'drm_dp_as_sdp' Signed-off-by: MarileneGarcia <marilene.agarcia@gmail.com> Fixes: 0bbb8f594e33 ("drm/dp: Add Adaptive Sync SDP logging") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20240405141640.09b0bdbf@canb.auug.org.au Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240519031027.433751-1-marilene.agarcia@gmail.com
2024-05-28bcachefs: Better fsck error message for key versionKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: btree_gc can now handle unknown btreesKent Overstreet
Compatibility fix - we no longer have a separate table for which order gc walks btrees in, and special case the stripes btree directly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: add missing MODULE_DESCRIPTION()Jeff Johnson
Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/bcachefs/mean_and_variance_test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix setting of downgrade recovery passes/errorsKent Overstreet
bch2_check_version_downgrade() was setting c->sb.version, which bch2_sb_set_downgrade() expects to be at the previous version; and it shouldn't even have been set directly because c->sb.version is updated by write_super(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Run check_key_has_snapshot in snapshot_delete_keys()Kent Overstreet
delete_dead_snapshots now runs before the main fsck.c passes which check for keys for invalid snapshots; thus, it needs those checks as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Refactor delete_dead_snapshots()Kent Overstreet
Consolidate per-key work into delete_dead_snapshots_process_key(), so we now walk all keys once, not twice. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix locking assertKent Overstreet
We now track whether a transaction is locked, and verify that we don't have nodes locked when the transaction isn't locked; reorder relocks to not pop the new assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Fix lookup_first_inode() when inode_generations are presentKent Overstreet
This function is used for finding the hash seed (which is the same in all versions of an inode in different snapshots): ff an inode has been deleted in a child snapshot we need to iterate until we find a live version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28bcachefs: Plumb bkey into __btree_err()Kent Overstreet
It can be useful to know the exact byte offset within a btree node where an error occured. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-28tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUsDhananjay Ugwekar
Update cpupower's P-State frequency calculation and reporting with AMD Family 1Ah+ processors, when using the acpi-cpufreq driver. This is due to a change in the PStateDef MSR layout in AMD Family 1Ah+. Tested on 4th and 5th Gen AMD EPYC system Signed-off-by: Ananth Narayan <Ananth.Narayan@amd.com> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-28drm/xe: Don't refer to general LRC initialization as a "wa"Matt Roper
During engine LRC initialization a number of registers need to be programmed as general setup. This programming is not a "workaround" so naming the RTP table as "lrc_was" is misleading; switch to the name "lrc_setup" to more accurately describe what the table is actually for. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524230444.1447797-2-matthew.d.roper@intel.com
2024-05-28btrfs: ensure fast fsync waits for ordered extents after a write failureFilipe Manana
If a write path in COW mode fails, either before submitting a bio for the new extents or an actual IO error happens, we can end up allowing a fast fsync to log file extent items that point to unwritten extents. This is because dropping the extent maps happens when completing ordered extents, at btrfs_finish_one_ordered(), and the completion of an ordered extent is executed in a work queue. This can result in a fast fsync to start logging file extent items based on existing extent maps before the ordered extents complete, therefore resulting in a log that has file extent items that point to unwritten extents, resulting in a corrupt file if a crash happens after and the log tree is replayed the next time the fs is mounted. This can happen for both direct IO writes and buffered writes. For example consider a direct IO write, in COW mode, that fails at btrfs_dio_submit_io() because btrfs_extract_ordered_extent() returned an error: 1) We call btrfs_finish_ordered_extent() with the 'uptodate' parameter set to false, meaning an error happened; 2) That results in marking the ordered extent with the BTRFS_ORDERED_IOERR flag; 3) btrfs_finish_ordered_extent() queues the completion of the ordered extent - so that btrfs_finish_one_ordered() will be executed later in a work queue. That function will drop extent maps in the range when it's executed, since the extent maps point to unwritten locations (signaled by the BTRFS_ORDERED_IOERR flag); 4) After calling btrfs_finish_ordered_extent() we keep going down the write path and unlock the inode; 5) After that a fast fsync starts and locks the inode; 6) Before the work queue executes btrfs_finish_one_ordered(), the fsync task sees the extent maps that point to the unwritten locations and logs file extent items based on them - it does not know they are unwritten, and the fast fsync path does not wait for ordered extents to complete, which is an intentional behaviour in order to reduce latency. For the buffered write case, here's one example: 1) A fast fsync begins, and it starts by flushing delalloc and waiting for the writeback to complete by calling filemap_fdatawait_range(); 2) Flushing the dellaloc created a new extent map X; 3) During the writeback some IO error happened, and at the end io callback (end_bbio_data_write()) we call btrfs_finish_ordered_extent(), which sets the BTRFS_ORDERED_IOERR flag in the ordered extent and queues its completion; 4) After queuing the ordered extent completion, the end io callback clears the writeback flag from all pages (or folios), and from that moment the fast fsync can proceed; 5) The fast fsync proceeds sees extent map X and logs a file extent item based on extent map X, resulting in a log that points to an unwritten data extent - because the ordered extent completion hasn't run yet, it happens only after the logging. To fix this make btrfs_finish_ordered_extent() set the inode flag BTRFS_INODE_NEEDS_FULL_SYNC in case an error happened for a COW write, so that a fast fsync will wait for ordered extent completion. Note that this issues of using extent maps that point to unwritten locations can not happen for reads, because in read paths we start by locking the extent range and wait for any ordered extents in the range to complete before looking for extent maps. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-28tools headers: Update the syscall tables and unistd.h, mostly to support the ↵Arnaldo Carvalho de Melo
new 'mseal' syscall But also to wire up shadow stacks on 32-bit x86, picking up those changes from these csets: ff388fe5c481d39c ("mseal: wire up mseal syscall") 2883f01ec37dd866 ("x86/shstk: Enable shadow stacks for x32") This makes 'perf trace' support it, now its possible, for instance to do: # perf trace -e mseal --max-stack=16 Here is an example with the 'sendmmsg' syscall: root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1 0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1 syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) [0x117ce7] (/usr/lib64/libc.so.6 (deleted)) root@x1:~# To do a system wide tracing of the new 'mseal' syscall with a backtrace of at most 16 entries. This addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H J Lu <hjl.tools@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28drm/xe: Use platform name in xe_assert()Michal Wajdeczko
We can now use more user-friendly platform name instead of previosly used magic platform enumerator value: [ ] xe 0000:00:02.0: [drm] Assertion `false` failed! platform: ALDERLAKE_S ... [ ] xe 0000:03:00.0: [drm] Assertion `false` failed! platform: DG2 ... vs [ ] xe 0000:00:02.0: [drm] Assertion `false` failed! platform: 3 ... [ ] xe 0000:03:00.0: [drm] Assertion `false` failed! platform: 7 ... Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521142257.756-4-michal.wajdeczko@intel.com
2024-05-28drm/xe: Store platform name in xe_device.infoMichal Wajdeczko
We already maintain the platform name as part of the device descriptor, but in xe_device.info we only store platform enum, which is not the best for use in any user-facing messages. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521142257.756-2-michal.wajdeczko@intel.com
2024-05-28net: ti: icssg-prueth: Fix start counter for ft1 filterMD Danish Anwar
The start counter for FT1 filter is wrongly set to 0 in the driver. FT1 is used for source address violation (SAV) check and source address starts at Byte 6 not Byte 0. Fix this by changing start counter to ETH_ALEN in icssg_ft1_set_mac_addr(). Fixes: e9b4ece7d74b ("net: ti: icssg-prueth: Add Firmware config and classification APIs.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20240527063015.263748-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28bcache: code cleanup in __bch_bucket_alloc_set()Coly Li
In __bch_bucket_alloc_set() the lines after lable 'err:' indeed do nothing useful after multiple cache devices are removed from bcache code. This cleanup patch drops the useless code to save a bit CPU cycles. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-4-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28bcache: call force_wake_up_gc() if necessary in check_should_bypass()Coly Li
If there are extreme heavy write I/O continuously hit on relative small cache device (512GB in my testing), it is possible to make counter c->gc_stats.in_use continue to increase and exceed CUTOFF_CACHE_ADD. If 'c->gc_stats.in_use > CUTOFF_CACHE_ADD' happens, all following write requests will bypass the cache device because check_should_bypass() returns 'true'. Because all writes bypass the cache device, counter c->sectors_to_gc has no chance to be negative value, and garbage collection thread won't be waken up even the whole cache becomes clean after writeback accomplished. The aftermath is that all write I/Os go directly into backing device even the cache device is clean. To avoid the above situation, this patch uses a quite conservative way to fix: if 'c->gc_stats.in_use > CUTOFF_CACHE_ADD' happens, only wakes up garbage collection thread when the whole cache device is clean. Before the fix, the writes-always-bypass situation happens after 10+ hours write I/O pressure on 512GB Intel optane memory which acts as cache device. After this fix, such situation doesn't happen after 36+ hours testing. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-3-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28bcache: allow allocator to invalidate bucket in gcDongsheng Yang
Currently, if the gc is running, when the allocator found free_inc is empty, allocator has to wait the gc finish. Before that, the IO is blocked. But actually, there would be some buckets is reclaimable before gc, and gc will never mark this kind of bucket to be unreclaimable. So we can put these buckets into free_inc in gc running to avoid IO being blocked. Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28block: check for max_hw_sectors underflowHannes Reinecke
The logical block size need to be smaller than the max_hw_sector setting, otherwise we can't even transfer a single LBA. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28block: stack max_user_sectorsChristoph Hellwig
The max_user_sectors is one of the three factors determining the actual max_sectors limit for READ/WRITE requests. Because of that it needs to be stacked at least for the device mapper multi-path case where requests are directly inserted on the lower device. For SCSI disks this is important because the sd driver actually sets it's own advisory limit that is lower than max_hw_sectors based on the block limits VPD page. While this is a bit odd an unusual, the same effect can happen if a user or udev script tweaks the value manually. Fixes: 4f563a64732d ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28sd: also set max_user_sectors when setting max_sectorsChristoph Hellwig
sd can set a max_sectors value that is lower than the max_hw_sectors limit based on the block limits VPD page. While this is rather unusual, it used to work until the max_user_sectors field was split out to cleanly deal with conflicting hardware and user limits when the hardware limit changes. Also set max_user_sectors to ensure the limit can properly be stacked. Fixes: 4f563a64732d ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28null_blk: Print correct max open zones limit in null_init_zoned_dev()Damien Le Moal
When changing the maximum number of open zones, print that number instead of the total number of zones. Fixes: dc4d137ee3b7 ("null_blk: add support for max open/active zone limit for zoned devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240528062852.437599-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-28drm/xe: allow unaligned start and size xe_res_cursor parametersAndrzej Hajda
xe_res_cursor code does not depend on the alignment. On the other side unaligned accesses are useful from pread/pwrite point of view. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240418-xe_res_cursor-no-align-v1-1-8df7834266c9@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-28drm/xe: flush gtt before signalling user fence on all enginesAndrzej Hajda
Tests show that user fence signalling requires kind of write barrier, otherwise not all writes performed by the workload will be available to userspace. It is already done for render and compute, we need it also for the rest: video, gsc, copy. v2: added gsc and copy engines, added fixes and r-b tags Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1488 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522-xu_flush_vcs_before_ufence-v2-1-9ac3e9af0323@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-28regulator: rtq2208: Fix invalid memory access when ↵Alina Yu
devm_of_regulator_put_matches is called In this patch, a software bug has been fixed. rtq2208_ldo_match is no longer a local variable. It prevents invalid memory access when devm_of_regulator_put_matches is called. Signed-off-by: Alina Yu <alina_yu@richtek.com> Link: https://msgid.link/r/4ce8c4f16f1cf3aa4e5f36c0694dd3c5ccf3cd1c.1716870419.git.alina_yu@richtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-28ALSA/hda: intel-dsp-config: reduce log verbosityPierre-Louis Bossart
The information on PCI class/subclass was interesting in the Skylake timeframe, since the DSP was only enabled on a limited number of platforms. Now most Intel platforms do enable the DSP, so the information is less interesting to log. When a DSP driver is used, the common helper may be called multiple times due to deferred probes, but there's no reason to print the same information multiple times. Using dev_info_once() covers all the existing usages for internal cards with DSPs. External cards don't rely on DSPs so far. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20240527193808.165652-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-28ALSA: seq: Don't clear bank selection at event -> UMP MIDI2 conversionTakashi Iwai
The current code to convert from a legacy sequencer event to UMP MIDI2 clears the bank selection at each time the program change is submitted. This is confusing and may lead to incorrect bank values tranmitted to the destination in the end. Drop the line to clear the bank info and keep the provided values. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-28ALSA: seq: Fix missing bank setup between MIDI1/MIDI2 UMP conversionTakashi Iwai
When a UMP packet is converted between MIDI1 and MIDI2 protocols, the bank selection may be lost. The conversion from MIDI1 to MIDI2 needs the encoding of the bank into UMP_MSG_STATUS_PROGRAM bits, while the conversion from MIDI2 to MIDI1 needs the extraction from that instead. This patch implements the missing bank selection mechanism in those conversions. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-28tpm: Enable TCG_TPM2_HMAC by default only for X86_64Jarkko Sakkinen
Given the not fully root caused performance issues on non-x86 platforms, enable the feature by default only for x86-64. That is the platform it brings the most value and has gone most of the QA. Can be reconsidered later and can be obviously opt-in enabled too on any arch. Link: https://lore.kernel.org/linux-integrity/bf67346ef623ff3c452c4f968b7d900911e250c3.camel@gmail.com/#t Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-28tpm: Rename TPM2_OA_TMPL to TPM2_OA_NULL_KEY and make it localJarkko Sakkinen
Rename and document TPM2_OA_TMPL, as originally requested in the patch set review, but left unaddressed without any appropriate reasoning. The new name is TPM2_OA_NULL_KEY, has a documentation and is local only to tpm2-sessions.c. Link: https://lore.kernel.org/linux-integrity/ddbeb8111f48a8ddb0b8fca248dff6cc9d7079b2.camel@HansenPartnership.com/ Link: https://lore.kernel.org/linux-integrity/CZCKTWU6ZCC9.2UTEQPEVICYHL@suppilovahvero/ Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-28sock_map: avoid race between sock_map_close and sk_psock_putThadeu Lima de Souza Cascardo
sk_psock_get will return NULL if the refcount of psock has gone to 0, which will happen when the last call of sk_psock_put is done. However, sk_psock_drop may not have finished yet, so the close callback will still point to sock_map_close despite psock being NULL. This can be reproduced with a thread deleting an element from the sock map, while the second one creates a socket, adds it to the map and closes it. That will trigger the WARN_ON_ONCE: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Modules linked in: CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02 RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293 RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000 RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0 RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3 R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840 R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870 FS: 000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0 Call Trace: <TASK> unix_release+0x87/0xc0 net/unix/af_unix.c:1048 __sock_release net/socket.c:659 [inline] sock_close+0xbe/0x240 net/socket.c:1421 __fput+0x42b/0x8a0 fs/file_table.c:422 __do_sys_close fs/open.c:1556 [inline] __se_sys_close fs/open.c:1541 [inline] __x64_sys_close+0x7f/0x110 fs/open.c:1541 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb37d618070 Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070 RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Use sk_psock, which will only check that the pointer is not been set to NULL yet, which should only happen after the callbacks are restored. If, then, a reference can still be gotten, we may call sk_psock_stop and cancel psock->work. As suggested by Paolo Abeni, reorder the condition so the control flow is less convoluted. After that change, the reproducer does not trigger the WARN_ON_ONCE anymore. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reported-by: syzbot+07a2e4a1a57118ef7355@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=07a2e4a1a57118ef7355 Fixes: aadb2bb83ff7 ("sock_map: Fix a potential use-after-free in sock_map_close()") Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself") Cc: stable@vger.kernel.org Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20240524144702.1178377-1-cascardo@igalia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28tpm: Open code tpm_buf_parameters()Jarkko Sakkinen
With only single call site, this makes no sense (slipped out of the radar during the review). Open code and document the action directly to the site, to make it more readable. Fixes: 1b6d7f9eb150 ("tpm: add session encryption protection to tpm2_get_random()") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-28tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer bufferMatthew R. Ochs
The TPM SPI transfer mechanism uses MAX_SPI_FRAMESIZE for computing the maximum transfer length and the size of the transfer buffer. As such, it does not account for the 4 bytes of header that prepends the SPI data frame. This can result in out-of-bounds accesses and was confirmed with KASAN. Introduce SPI_HDRSIZE to account for the header and use to allocate the transfer buffer. Fixes: a86a42ac2bd6 ("tpm_tis_spi: Add hardware wait polling") Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Tested-by: Carol Soto <csoto@nvidia.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-28drm/rockchip: dsi: Add support for RK3128Alex Bee
The DesignWare MIPI DSI controller found RK3128 SoCs supports up to 4 DSI data lanes. Similar to PX30/RK356x/RV1126 it uses an external D-PHY. Signed-off-by: Alex Bee <knaerzche@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240509140653.168591-6-knaerzche@gmail.com
2024-05-28dt-bindings: display: rockchip,dw-mipi-dsi: Document RK3128 DSIAlex Bee
Document the MIPI DSI controller for Rockchip RK3128. The integration is similar to PX30 so it's bindings-constraints can be re-used. Signed-off-by: Alex Bee <knaerzche@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240509140653.168591-2-knaerzche@gmail.com
2024-05-28drm/sun4i: hdmi: Switch to HDMI connectorMaxime Ripard
The new HDMI connector infrastructure allows to remove some boilerplate, especially to generate infoframes. Let's switch to it. Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Sui Jingfeng <sui.jingfeng@linux.dev> Reviewed-by: Andy Yan <andyshrk@163.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240527-kms-hdmi-connector-state-v15-29-c5af16c3aae2@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-05-28drm/rockchip: inno_hdmi: Switch to HDMI connectorMaxime Ripard
The new HDMI connector infrastructure allows to remove some boilerplate, especially to generate infoframes. Let's switch to it. Reviewed-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Andy Yan <andyshrk@163.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240527-kms-hdmi-connector-state-v15-28-c5af16c3aae2@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>